Development and Evaluation of a Natural Language Processing Annotation Tool to Facilitate Phenotyping of Cognitive Status in Electronic Health Records: Diagnostic Study

Bibliographic Details
Published in	Journal of Medical Internet Research, Vol. 24, no. 8, p. e40384
Main Authors	Noori, Ayush; Magdamo, Colin; Liu, Xiao; Tyagi, Tanish; Li, Zhaozhi; Kondepudi, Akhil; Alabsi, Haitham; Rudmann, Emily; Wilcox, Douglas; Brenner, Laura; Robbins, Gregory K; Moura, Lidia; Zafar, Sahar; Benson, Nicole M; Hsu, John; Dickson, John R; Serrano-Pozo, Alberto; Hyman, Bradley T; Blacker, Deborah; Westover, M Brandon; Mukerji, Shibani S; Das, Sudeshna
Format	Journal Article
Language	English
Published	Canada: Journal of Medical Internet Research; Gunther Eysenbach MD MPH, Associate Professor; JMIR Publications, 30.08.2022
Subjects	Accountable care organizations; Adjudication; Aged; Agreements; Algorithms; Annotations; Beneficiaries; Chart reviews; Cognition; Cognition & reasoning; Cognitive ability; Cognitive impairment; Computational linguistics; Computerized medical records; Coronaviruses; COVID-19; COVID-19 Testing; Data warehouses; Datasets; Dementia; Dementia - diagnosis; Electronic Health Records; Electronic records; Health records; Health services utilization; Health status; Humans; Interrater reliability; Laboratories; Language processing; Medical diagnosis; Medical records; Medicare; Multimedia; Natural language interfaces; Natural Language Processing; Original Paper; Patients; Polymerase chain reaction; Reproducibility of Results; Severe acute respiratory syndrome coronavirus 2; United States; cognitive status; research cohort; cognition; natural language processing; dementia; diagnostic; chart review; electronic health record; health care
ISSN	1438-8871, 1439-4456
DOI	10.2196/40384

More Information
Summary:	Electronic health records (EHRs) with large sample sizes and rich information offer great potential for dementia research, but current methods of phenotyping cognitive status are not scalable. The aim of this study was to evaluate whether natural language processing (NLP)-powered semiautomated annotation can improve the speed and interrater reliability of chart reviews for phenotyping cognitive status. In this diagnostic study, we developed and evaluated a semiautomated NLP-powered annotation tool (NAT) to facilitate phenotyping of cognitive status. Clinical experts adjudicated the cognitive status of 627 patients at Mass General Brigham (MGB) health care, using NAT or traditional chart reviews. Patient charts contained EHR data from two data sets: (1) records from January 1, 2017, to December 31, 2018, for 100 Medicare beneficiaries from the MGB Accountable Care Organization and (2) records from 2 years prior to COVID-19 diagnosis to the date of COVID-19 diagnosis for 527 MGB patients. All EHR data from the relevant period were extracted; diagnosis codes, medications, and laboratory test values were processed and summarized; clinical notes were processed through an NLP pipeline; and a web tool was developed to present an integrated view of all data. Cognitive status was rated as cognitively normal, cognitively impaired, or undetermined. Assessment time and interrater agreement of NAT adjudication compared to manual chart reviews for cognitive status phenotyping were evaluated. NAT adjudication provided higher interrater agreement (Cohen κ=0.89 vs κ=0.80) and a significant speedup (time difference mean 1.4, SD 1.3 minutes; P<.001; ratio median 2.2, min-max 0.4-20) over manual chart reviews. NAT adjudication showed moderate agreement with manual chart reviews (Cohen κ=0.67).
In the cases that exhibited disagreement with manual chart reviews, NAT adjudication was able to produce assessments that had broader clinical consensus due to its integrated view of highlighted relevant information and semiautomated NLP features. NAT adjudication improves the speed and interrater reliability for phenotyping cognitive status compared to manual chart reviews. This study underscores the potential of an NLP-based clinically adjudicated method to build large-scale dementia research cohorts from EHRs.