Adapting electronic health records-derived phenotypes to claims data: Lessons learned in using limited clinical data for phenotyping

Bibliographic Details
Published in	Journal of biomedical informatics Vol. 102; p. 103363
Main Authors	Ostropolets, Anna, Reich, Christian, Ryan, Patrick, Shang, Ning, Hripcsak, George, Weng, Chunhua
Format	Journal Article
Language	English
Published	United States Elsevier Inc 01.02.2020
Subjects	Algorithms; Chronic kidney disorder; Data quality; Electronic Health Records; Humans; Medical Informatics; Observational Health Data Sciences and Informatics (OHDSI); Phenotype; Phenotyping; Portability; Predictive Value of Tests; Reproducibility
ISSN	1532-0464, 1532-0480
DOI	10.1016/j.jbi.2019.103363

Summary:	
•Coarse code granularity, erroneous data entry and poor generalizability may influence the performance of phenotyping algorithms.
•Vocabulary-driven methods for concept set creation show advantages in improving phenotyping accuracy.
•The Observational Health Data Sciences and Informatics (OHDSI) OMOP Common Data Model facilitates phenotype generalizability and consistency.
•More data is not necessarily better: performance of a diagnosis-based chronic kidney failure algorithm is not improved by adding other codes indirectly related to chronic kidney disorder.

Algorithms for identifying patients of interest from observational data must address missing and inaccurate data, and should achieve comparable performance on both administrative claims and electronic health records data. However, administrative claims data do not contain the information needed to develop accurate algorithms for disorders that require laboratory results, and this omission can result in insensitive diagnostic code-based algorithms. In this paper, we tested our assertion that the performance of a diagnosis code-based algorithm for chronic kidney disorder (CKD) can be improved by adding other codes indirectly related to CKD (e.g., codes for dialysis, kidney transplant, suspicious kidney disorders). Following the best practices from Observational Health Data Sciences and Informatics (OHDSI), we adapted an electronic health record-based gold standard algorithm for CKD and then created algorithms that can be executed on administrative claims data and account for related data quality issues. We externally validated our algorithms on four electronic health record datasets in the OHDSI network. Compared to the algorithm that uses CKD diagnostic codes only, the algorithms that use additional codes had a slightly higher positive predictive value (47.4% vs. 47.9–48.5%, respectively).
The algorithms adapted from the gold standard algorithm can be used to infer chronic kidney disorder based on administrative claims data. We succeeded in improving the generalizability and consistency of the CKD phenotypes by using data and vocabulary standardized across the OHDSI network, although performance variability across datasets remains. We showed that identifying and addressing coding and data heterogeneity can improve the performance of the algorithms.
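The comparison described in the summary, a diagnosis-code-only algorithm versus one extended with indirectly related codes, each scored by positive predictive value against a gold standard, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the code labels are placeholders rather than real OMOP concept IDs, the five-patient cohort is toy data, and the function names are hypothetical. It is not the authors' implementation and does not reproduce their results.

```python
from dataclasses import dataclass

# Hypothetical concept sets. The labels are placeholders, not real
# OMOP concept IDs or the concept sets used in the paper.
CKD_DIAGNOSIS_CODES = frozenset({"ckd_dx"})
RELATED_CODES = frozenset({"dialysis", "kidney_transplant"})


@dataclass(frozen=True)
class Patient:
    patient_id: int
    codes: frozenset          # codes recorded for the patient
    gold_standard_ckd: bool   # label assigned by the reference algorithm


def is_flagged(patient, concept_set):
    """Flag a patient if any recorded code belongs to the concept set."""
    return bool(patient.codes & concept_set)


def positive_predictive_value(patients, concept_set):
    """PPV = true positives / all patients flagged by the algorithm.

    Returns None when the algorithm flags nobody, since PPV is undefined.
    """
    flagged = [p for p in patients if is_flagged(p, concept_set)]
    if not flagged:
        return None
    true_positives = sum(p.gold_standard_ckd for p in flagged)
    return true_positives / len(flagged)


# Toy cohort, invented purely to exercise the functions above.
cohort = [
    Patient(1, frozenset({"ckd_dx"}), True),
    Patient(2, frozenset({"ckd_dx"}), False),
    Patient(3, frozenset({"dialysis"}), True),
    Patient(4, frozenset({"kidney_transplant"}), True),
    Patient(5, frozenset({"hypertension"}), False),
]

ppv_dx_only = positive_predictive_value(cohort, CKD_DIAGNOSIS_CODES)
ppv_extended = positive_predictive_value(
    cohort, CKD_DIAGNOSIS_CODES | RELATED_CODES
)
```

Keeping the concept set as a parameter means the same scoring function can be run unchanged over each candidate algorithm and each dataset, which is the kind of side-by-side comparison the summary reports.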
Bibliography:	Author Contributions: Anna Ostropolets: Conceptualization, Methodology, Algorithm, Data Collection and Analysis, Writing - Original draft preparation and revision; Christian Reich and Patrick Ryan: Conceptualization, Results Validation, Discussion, Methodology, Writing - Reviewing and Editing; George Hripcsak, Chunhua Weng: Co-Supervision, Conceptualization, Discussion, Supervision, Writing - Reviewing and Editing; Ning Shang: Phenotyping Knowledge Resource Provision, Writing - Reviewing