Validation of diagnosis codes to identify hospitalized COVID‐19 patients in health care claims data


Bibliographic Details
Published in	Pharmacoepidemiology and drug safety Vol. 31; no. 4; pp. 476 - 480
Main Authors	Kluberg, Sheryl A., Hou, Laura, Dutcher, Sarah K., Billings, Monisha, Kit, Brian, Toh, Sengwee, Dublin, Sascha, Haynes, Kevin, Kline, Annemarie, Maiyani, Mahesh, Pawloski, Pamala A., Watson, Eric S., Cocoros, Noelle M.
Format	Journal Article
Language	English
Published	Chichester, UK: John Wiley & Sons, Inc; Wiley Subscription Services, Inc; 01.04.2022
Subjects	Algorithms; Codes; Coronaviruses; COVID-19; COVID-19 - diagnosis; COVID-19 - epidemiology; COVID-19 Testing; Databases, Factual; Delivery of Health Care; Diagnosis; Health care; Hospitalization; Humans; ICD‐10‐CM; Integrated delivery systems; International Classification of Diseases; Laboratories; medical claims; Patients; SARS-CoV-2; validation; United States
ISSN	1053-8569, 1099-1557
DOI	10.1002/pds.5401

Summary:	Purpose: Health plan claims may provide complete longitudinal data for timely, real‐world population‐level COVID‐19 assessment. However, these data often lack laboratory results, the standard for COVID‐19 diagnosis. Methods: We assessed the validity of ICD‐10‐CM diagnosis codes for identifying patients hospitalized with COVID‐19 in U.S. claims databases, compared to linked laboratory results, among six Food and Drug Administration Sentinel System data partners (two large national insurers, four integrated delivery systems) from February 20–October 17, 2020. We identified patients hospitalized with COVID‐19 according to five ICD‐10‐CM diagnosis code‐based algorithms, which included combinations of codes U07.1, B97.29, general coronavirus codes, and diagnosis codes for severe symptoms. We calculated the positive predictive value (PPV) and sensitivity of each algorithm relative to laboratory test results. We stratified results by data source type and across three time periods: February 20–March 31 (Time A), April 1–30 (Time B), May 1–October 17 (Time C). Results: The five algorithms identified between 34 806 and 47 293 patients across the study periods; 23% with known laboratory results contributed to PPV calculations. PPVs were high and similar across algorithms. PPV of U07.1 alone was stable around 93% for integrated delivery systems, but declined over time from 93% to 70% among national insurers. Overall PPV of U07.1 across all data partners was 94.1% (95% CI, 92.3%–95.5%) in Time A and 81.2% (95% CI, 80.1%–82.2%) in Time C. Sensitivity was consistent across algorithms and over time, at 94.9% (95% CI, 94.2%–95.5%). Conclusion: Our results support the use of code U07.1 to identify hospitalized COVID‐19 patients in U.S. claims data.
Bibliography:	Funding information: U.S. Food and Drug Administration
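
The summary reports PPV and sensitivity with 95% confidence intervals for each code-based algorithm relative to laboratory results. The following is a minimal sketch of how such measures could be computed from a 2x2 comparison. The counts are hypothetical and are not taken from the study, the function names are illustrative, and the Wilson score interval is an assumption, since the record does not state which interval method the authors used.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumed method: the record does not say how the study's CIs were derived.
    z=1.96 corresponds to a two-sided 95% interval.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def ppv(true_pos, false_pos):
    """Share of code-identified patients who have a positive lab result."""
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos, false_neg):
    """Share of lab-positive patients who are captured by the diagnosis code."""
    return true_pos / (true_pos + false_neg)


if __name__ == "__main__":
    # Hypothetical counts for one algorithm (e.g. U07.1 alone), NOT study data.
    tp, fp, fn = 900, 100, 50

    lo, hi = wilson_interval(tp, tp + fp)
    print(f"PPV = {ppv(tp, fp):.3f} (95% CI {lo:.3f} to {hi:.3f})")

    lo, hi = wilson_interval(tp, tp + fn)
    print(f"Sensitivity = {sensitivity(tp, fn):.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

In this framing, only patients with a known laboratory result can enter the PPV denominator, which is consistent with the summary's note that 23% of identified patients contributed to PPV calculations.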