PheValuator: Development and evaluation of a phenotype algorithm evaluator

Bibliographic Details
Published in	Journal of biomedical informatics Vol. 97; p. 103258
Main Authors	Swerdel, Joel N., Hripcsak, George, Ryan, Patrick B.
Format	Journal Article
Language	English
Published	United States Elsevier Inc 01.09.2019
Subjects	Algorithms; Atrial Fibrillation - diagnosis; Cerebral Infarction - diagnosis; Computational Biology; Current Procedural Terminology; Databases, Factual - statistics & numerical data; Diagnosis, Computer-Assisted - statistics & numerical data; Diagnostic Errors - statistics & numerical data; Diagnostic predictive modeling; Humans; Models, Statistical; Myocardial Infarction - diagnosis; Phenotype; Phenotype algorithms; Predictive Value of Tests; Probability; Renal Insufficiency, Chronic - diagnosis; Sensitivity and Specificity; Validation; eGFR; MDCR; AF; SNOMED; CDM; AUC; PA; ICD-9; NPV; PLP; LASSO; PPV; IRB; CCAE; CPT-4; MDCD; AMI; CKD
ISSN	1532-0464, 1532-0480
DOI	10.1016/j.jbi.2019.103258

Summary:	•Phenotype Algorithms (PAs) are used in research to determine presence of disease.
•Evaluation of PAs for sensitivity/specificity/predictive values is rarely performed.
•PheValuator uses diagnostic predictive modeling to perform PA evaluation.
•The tool provides conservative sensitivity/specificity/predictive estimates for PAs.
•PheValuator shows promise as a tool to assess PA performance characteristics.

The primary approach for defining disease in observational healthcare databases is to construct phenotype algorithms (PAs), rule-based heuristics predicated on the presence, absence, and temporal logic of clinical observations. However, a complete evaluation of PAs, i.e., determining sensitivity, specificity, and positive predictive value (PPV), is rarely performed. In this study, we propose a tool (PheValuator) to efficiently estimate a complete PA evaluation. We used 4 administrative claims datasets: OptumInsight’s de-identified Clinformatics™ Datamart (Eden Prairie, MN); IBM MarketScan Multi-State Medicaid; IBM MarketScan Medicare Supplemental Beneficiaries; and IBM MarketScan Commercial Claims and Encounters from 2000 to 2017. Using PheValuator involves (1) creating a diagnostic predictive model for the phenotype, (2) applying the model to a large set of randomly selected subjects, and (3) comparing each subject’s predicted probability for the phenotype to inclusion/exclusion in PAs. We used the predictions as a ‘probabilistic gold standard’ measure to classify positive/negative cases. We examined 4 phenotypes: myocardial infarction, cerebral infarction, chronic kidney disease, and atrial fibrillation. We examined several PAs for each phenotype including 1-time (1X) occurrence of the diagnosis code in the subject’s record and 1-time occurrence of the diagnosis in an inpatient setting with the diagnosis code as the primary reason for admission (1X-IP-1stPos).
Across phenotypes, the 1X PA showed the highest sensitivity/lowest PPV among all PAs. 1X-IP-1stPos yielded the highest PPV/lowest sensitivity. Specificity was very high across algorithms. We found similar results between algorithms across datasets. PheValuator appears to show promise as a tool to estimate PA performance characteristics.
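
The comparison in step (3) of the summary can be illustrated with a minimal Python sketch. It is not the PheValuator implementation, and the summary does not spell out how the predictions are turned into counts. As an assumption, each subject's predicted probability p is treated as a fractional case: a subject included by the PA adds p to the true positives and 1 - p to the false positives, and an excluded subject adds p to the false negatives and 1 - p to the true negatives. The names `evaluate_pa`, `PAMetrics`, `predicted_probs`, and `in_pa` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PAMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def _ratio(numerator: float, denominator: float) -> float:
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator > 0 else float("nan")


def evaluate_pa(predicted_probs: Sequence[float], in_pa: Sequence[bool]) -> PAMetrics:
    """Compare model probabilities with PA inclusion (hypothetical helper).

    Assumption: probabilities are summed into expected confusion-matrix
    counts rather than thresholded into hard positive/negative labels.
    """
    if len(predicted_probs) != len(in_pa):
        raise ValueError("predicted_probs and in_pa must have the same length")

    tp = fp = fn = tn = 0.0
    for p, included in zip(predicted_probs, in_pa):
        if included:
            tp += p          # expected true positive mass
            fp += 1.0 - p    # expected false positive mass
        else:
            fn += p          # expected false negative mass
            tn += 1.0 - p    # expected true negative mass

    return PAMetrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


if __name__ == "__main__":
    # Made-up probabilities for four subjects; only the first is in the PA.
    metrics = evaluate_pa([0.9, 0.8, 0.1, 0.2], [True, False, False, False])
    print(metrics)
```

Summing probabilities avoids picking a cut-off for the model output; a thresholded variant would instead label each subject as a case or non-case before counting.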