Title

Merging Children's Oncology Group Data with an External Administrative Database Using Indirect Patient Identifiers: A Report from the Children's Oncology Group.

Year of Publication

2015

Article Number

e0143480

Date Published

2015

ISSN Number

1932-6203

Abstract

<p><strong>PURPOSE: </strong>Clinical trials data from National Cancer Institute (NCI)-funded cooperative oncology group trials could be enhanced by merging with external data sources. Merging without direct patient identifiers would provide additional patient privacy protections. We sought to develop and validate a matching algorithm that uses only indirect patient identifiers.</p>

<p><strong>METHODS: </strong>We merged the data from two Phase III Children's Oncology Group (COG) trials for de novo acute myeloid leukemia (AML) with the Pediatric Health Information System (PHIS) administrative database. We developed a stepwise matching algorithm that used indirect identifiers, including treatment site, gender, birth year, birth month, enrollment year, and enrollment month. Results from the stepwise algorithm were compared against a direct merge method that used date of birth, treatment site, and gender. The indirect merge algorithm was developed on AAML0531 and validated on AAML1031.</p>
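
To illustrate the general idea of a stepwise merge on indirect identifiers, here is a minimal Python sketch. It is not the authors' implementation. The abstract names the identifiers but does not state a record schema or which identifiers are relaxed at each step, so the field names, the `STEPS` relaxation order, and the rule of accepting only unambiguous one-to-one matches are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical field names; the abstract lists the identifiers, not a schema.
ALL_KEYS = ("site", "gender", "birth_year", "birth_month",
            "enroll_year", "enroll_month")

# Hypothetical relaxation order, strictest key set first. The abstract does
# not say which identifiers are dropped at each step.
STEPS = (
    ALL_KEYS,
    ("site", "gender", "birth_year", "birth_month", "enroll_year"),
    ("site", "gender", "birth_year", "birth_month"),
)


def stepwise_match(trial_rows, admin_rows, steps=STEPS):
    """Return {trial_id: admin_id}, keeping only unambiguous 1:1 matches."""
    matches = {}
    used_admin = set()
    for keys in steps:
        # Index the still-unmatched records on the current key set.
        trial_index = defaultdict(list)
        admin_index = defaultdict(list)
        for row in trial_rows:
            if row["id"] not in matches:
                trial_index[tuple(row[k] for k in keys)].append(row["id"])
        for row in admin_rows:
            if row["id"] not in used_admin:
                admin_index[tuple(row[k] for k in keys)].append(row["id"])
        # Accept a pair only when exactly one record on each side shares the key.
        for key, trial_ids in trial_index.items():
            admin_ids = admin_index.get(key, [])
            if len(trial_ids) == 1 and len(admin_ids) == 1:
                matches[trial_ids[0]] = admin_ids[0]
                used_admin.add(admin_ids[0])
    return matches


# Toy records, invented for this example only.
trial = [
    {"id": "T1", "site": "A", "gender": "F", "birth_year": 2005,
     "birth_month": 3, "enroll_year": 2008, "enroll_month": 6},
    {"id": "T2", "site": "A", "gender": "M", "birth_year": 2001,
     "birth_month": 11, "enroll_year": 2009, "enroll_month": 1},
    {"id": "T3", "site": "B", "gender": "F", "birth_year": 2003,
     "birth_month": 7, "enroll_year": 2008, "enroll_month": 9},
]
admin = [
    {"id": "P1", "site": "A", "gender": "F", "birth_year": 2005,
     "birth_month": 3, "enroll_year": 2008, "enroll_month": 6},
    {"id": "P2", "site": "A", "gender": "M", "birth_year": 2001,
     "birth_month": 11, "enroll_year": 2009, "enroll_month": 2},
]

result = stepwise_match(trial, admin)
```

In this toy data, `T1` pairs with `P1` on the full key set, `T2` pairs with `P2` only after the month field is relaxed, and `T3` has no counterpart and stays unmatched. Requiring a unique candidate on both sides at every step is one conservative way to limit false matches when the keys are coarse.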

<p><strong>RESULTS: </strong>Of 415 patients enrolled on the AAML0531 trial at PHIS centers, we successfully matched 378 (91.1%) using the indirect stepwise algorithm. Of these 378 indirect matches, 362 (95.7%) were concordant with the direct merge result. When validating the indirect stepwise algorithm on the AAML1031 trial, we successfully matched 157 of 165 patients (95.2%), and 150 (95.5%) of these indirect matches were concordant with the direct merge result.</p>
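
The two kinds of figures reported here, a match rate and a concordance rate, have different denominators. The following Python sketch shows one way they could be computed from two sets of matches. It is an illustration under stated assumptions, not the authors' code: the function names, the dictionary representation of matches, and the choice to count an indirect match with no direct counterpart as discordant are all hypothetical.

```python
def match_rate(matches, n_enrolled):
    """Fraction of enrolled patients for whom the merge produced a match."""
    return len(matches) / n_enrolled


def concordance(indirect, direct):
    """Return (n_concordant, n_indirect).

    An indirect match is concordant when the direct merge links the same
    trial patient to the same administrative record. Assumption: an indirect
    match with no direct counterpart is counted as discordant.
    """
    n_concordant = sum(
        1 for trial_id, admin_id in indirect.items()
        if direct.get(trial_id) == admin_id
    )
    return n_concordant, len(indirect)


# Toy mappings of trial patient ID to administrative record ID, invented here.
indirect = {"T1": "P1", "T2": "P2", "T3": "P9"}
direct = {"T1": "P1", "T2": "P2", "T3": "P3", "T4": "P4"}

rate = match_rate(indirect, n_enrolled=4)          # denominator: enrolled patients
n_concordant, n_indirect = concordance(indirect, direct)  # denominator: indirect matches
```

In the toy data, three of four enrolled patients are matched indirectly, and two of those three indirect matches agree with the direct merge.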

<p><strong>CONCLUSIONS: </strong>These data demonstrate that patients enrolled on COG clinical trials can be successfully merged with PHIS administrative data using a stepwise algorithm based on indirect patient identifiers. The merged data sets can be used as a platform for comparative effectiveness and cost effectiveness studies.</p>

DOI

10.1371/journal.pone.0143480

Alternate Title

PLoS ONE

PMID

26606521
