Electronic Health Record Based Algorithm to Identify Patients with Autism Spectrum Disorder

Bibliographic Details
Published in PLoS ONE Vol. 11; no. 7; p. e0159621
Main Authors Lingren, Todd; Chen, Pei; Bochenek, Joseph; Doshi-Velez, Finale; Manning-Courtney, Patty; Bickel, Julie; Wildenger Welchons, Leah; Reinhold, Judy; Bing, Nicole; Ni, Yizhao; Barbaresi, William; Mentch, Frank; Basford, Melissa; Denny, Joshua; Vazquez, Lyam; Perry, Cassandra; Namjou, Bahram; Qiu, Haijun; Connolly, John; Abrams, Debra; Holm, Ingrid A.; Cobb, Beth A.; Lingren, Nataline; Solti, Imre; Hakonarson, Hakon; Kohane, Isaac S.; Harley, John; Savova, Guergana
Format Journal Article
Language English
Published United States: Public Library of Science (PLoS), 29.07.2016
ISSN 1932-6203
DOI 10.1371/journal.pone.0159621

Summary: Cohort selection is challenging for large-scale electronic health record (EHR) analyses, as International Classification of Diseases, 9th edition (ICD-9) diagnostic codes are notoriously unreliable disease predictors. Our objective was to develop, evaluate, and validate an automated algorithm for determining an Autism Spectrum Disorder (ASD) patient cohort from EHR data. We demonstrate its utility via the largest investigation to date of the co-occurrence patterns of medical comorbidities in ASD. We extracted ICD-9 codes and concepts derived from the clinical notes. A gold standard patient set was labeled by clinicians at Boston Children's Hospital (BCH) (N = 150) and Cincinnati Children's Hospital and Medical Center (CCHMC) (N = 152). Two algorithms were created: (1) a rule-based algorithm implementing the ASD criteria from the Diagnostic and Statistical Manual of Mental Disorders, 4th edition, and (2) a predictive classifier. The positive predictive values (PPV) achieved by these algorithms were compared to an ICD-9 code baseline. We clustered the patients based on grouped ICD-9 codes and evaluated subgroups. The rule-based algorithm produced the best PPV: (a) BCH: 0.885 vs. 0.273 (baseline); (b) CCHMC: 0.840 vs. 0.645 (baseline); (c) combined: 0.864 vs. 0.460 (baseline). A validation at Children's Hospital of Philadelphia yielded a PPV of 0.848. Clustering analyses of comorbidities on the three-site large cohort (N = 20,658 ASD patients) identified psychiatric, developmental, and seizure disorder clusters. In a large cross-institutional cohort, co-occurrence patterns of comorbidities in ASD provide further hypothetical evidence for distinct courses in ASD. The proposed automated algorithms for cohort selection open avenues for other large-scale EHR studies and individualized treatment of ASD.
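Note: The summary compares algorithms by positive predictive value, i.e. the fraction of patients flagged as ASD that the clinician-labeled gold standard confirms, PPV = TP / (TP + FP). The following is a minimal Python sketch of that comparison, not the authors' code. The function name, the variable names, and the eight-patient toy labels are all hypothetical and are not taken from the study data.

```python
from typing import Sequence


def positive_predictive_value(predicted: Sequence[bool], gold: Sequence[bool]) -> float:
    """Return TP / (TP + FP) for boolean case flags against gold-standard labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    true_pos = sum(1 for p, g in zip(predicted, gold) if p and g)
    false_pos = sum(1 for p, g in zip(predicted, gold) if p and not g)
    flagged = true_pos + false_pos
    if flagged == 0:
        raise ValueError("no patients were flagged, so PPV is undefined")
    return true_pos / flagged


# Hypothetical toy labels for eight patients (illustration only, not study data).
gold = [True, True, False, True, False, False, True, False]
# A code-only baseline that flags almost everyone with a matching diagnostic code.
icd9_baseline = [True, True, True, True, True, False, True, True]
# A stricter rule-based selection that flags fewer patients.
rule_based = [True, True, False, True, False, True, False, False]

baseline_ppv = positive_predictive_value(icd9_baseline, gold)
rule_ppv = positive_predictive_value(rule_based, gold)
print(f"baseline PPV:   {baseline_ppv:.3f}")
print(f"rule-based PPV: {rule_ppv:.3f}")
```

In this toy setup the stricter selection flags fewer patients but a larger share of them are confirmed, which is the kind of trade-off a PPV comparison against a code-only baseline is meant to expose. PPV says nothing about cases the algorithm misses, so it is usually read alongside sensitivity.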
Conceived and designed the experiments: TL PC YN IS GS JH. Performed the experiments: TL PC HQ DA FM J. Bochenek. Analyzed the data: TL YN WB IAH IS JD BN NL JC ISK LWW JR NB PMC J. Bochenek J. Bickel. Contributed reagents/materials/analysis tools: IS IAH HH JC FDV CP. Wrote the paper: TL PC FDV PMC J. Bochenek J. Bickel LWW JR NB YN WB FM JD CP BN JC IAH BAC NL IS ISK JH GS. Coordinated gold standard development: TL BC CP JD. Coordinated the algorithm validation and data analysis between sites: TL CP BC JD MB LV. Provided expert guidance for the algorithm development and analysis: PMC WB IAS J. Bickel LWW JR NB. Provided expert guidance on algorithm development and cluster analysis: YN J. Bochenek LDV. Provided expert guidance on genetic analyses and revised the paper: BN.
Competing Interests: The authors of this manuscript have the following competing interests: GS is on the Advisory Board of Wired Informatics, LLC which provides services and products for clinical NLP applications. The other authors have no competing interests relevant to this article to disclose. This does not alter the authors' adherence to PLOS ONE policies on sharing data and materials.