Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review

Bibliographic Details
Published in	Artificial intelligence in medicine Vol. 146; p. 102701
Main Authors	Sim, Jin-ah; Huang, Xiaolei; Horan, Madeline R.; Stewart, Christopher M.; Robison, Leslie L.; Hudson, Melissa M.; Baker, Justin N.; Huang, I-Chan
Format	Journal Article
Language	English
Published	Netherlands: Elsevier B.V., 01.12.2023
Subjects	Electronic Health Records; Humans; Machine Learning; Natural Language Processing; Patient Outcome Assessment; Patient-reported outcomes; PubMed; Unstructured clinical narrative
ISSN	0933-3657, 1873-2860
DOI	10.1016/j.artmed.2023.102701

Summary:	Natural language processing (NLP) combined with machine learning (ML) techniques are increasingly used to process unstructured/free-text patient-reported outcome (PRO) data available in electronic health records (EHRs). This systematic review summarizes the literature reporting NLP/ML systems/toolkits for analyzing PROs in clinical narratives of EHRs and discusses the future directions for the application of this modality in clinical care. We searched PubMed, Scopus, and Web of Science for studies written in English between 1/1/2000 and 12/31/2020. Seventy-nine studies meeting the eligibility criteria were included. We abstracted and summarized information related to the study purpose, patient population, type/source/amount of unstructured PRO data, linguistic features, and NLP systems/toolkits for processing unstructured PROs in EHRs. Most of the studies used NLP/ML techniques to extract PROs from clinical narratives (n = 74) and mapped the extracted PROs into specific PRO domains for phenotyping or clustering purposes (n = 26). Some studies used NLP/ML to process PROs for predicting disease progression or onset of adverse events (n = 22) or developing/validating NLP/ML pipelines for analyzing unstructured PROs (n = 19). Studies used different linguistic features, including lexical, syntactic, semantic, and contextual features, to process unstructured PROs. Among the 25 NLP systems/toolkits we identified, 15 used rule-based NLP, 6 used hybrid NLP, and 4 used non-neural ML algorithms embedded in NLP. This study supports the potential utility of different NLP/ML techniques in processing unstructured PROs available in EHRs for clinical care. Though using annotation rules for NLP/ML to analyze unstructured PROs is dominant, deploying novel neural ML-based methods is warranted. 
• This systematic review summarizes the NLP/ML applications for analyzing PROs in clinical narratives of EHRs.
• Most studies used NLP/ML techniques to extract unstructured PROs or predict disease progression.
• Studies used different linguistic (i.e., lexical, syntactic, semantic, contextual) features to process the unstructured PROs.
• Using annotation rules to analyze unstructured PROs is dominant, yet deploying novel neural ML-based methods is warranted.
• Applying NLP/ML techniques to analyze unstructured PROs in EHRs can facilitate the clinical decision-making process.
Bibliography:	Author contributorship. Conceptualization: Jin-ah Sim, I-Chan Huang; Data curation: Jin-ah Sim; Funding acquisition: Melissa M. Hudson, Justin N. Baker, I-Chan Huang; Methodology: Jin-ah Sim, Xiaolei Huang, Christopher M. Stewart, I-Chan Huang; Project administration: I-Chan Huang; Resources: I-Chan Huang; Supervision: I-Chan Huang; Visualization: Jin-ah Sim; Writing - original draft preparation: Jin-ah Sim, I-Chan Huang; Writing - review & editing: Xiaolei Huang, Madeline R. Horan, Christopher M. Stewart, Leslie L. Robison, Melissa M. Hudson, Justin N. Baker, I-Chan Huang. All authors have read and agreed to the submitted version of the manuscript.