Natural language processing with machine learning methods to analyze unstructured patient-reported outcomes derived from electronic health records: A systematic review
| Published in | Artificial intelligence in medicine Vol. 146; p. 102701 |
|---|---|
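| Main Authors | Jin-ah Sim, Xiaolei Huang, Madeline R. Horan, Christopher M. Stewart, Leslie L. Robison, Melissa M. Hudson, Justin N. Baker, I-Chan Huang |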
| Format | Journal Article |
| Language | English |
| Published | Netherlands: Elsevier B.V., 01.12.2023 |
| ISSN | 0933-3657, 1873-2860 |
| DOI | 10.1016/j.artmed.2023.102701 |
| Summary: | Natural language processing (NLP) combined with machine learning (ML) techniques is increasingly used to process unstructured/free-text patient-reported outcome (PRO) data available in electronic health records (EHRs). This systematic review summarizes the literature reporting NLP/ML systems/toolkits for analyzing PROs in clinical narratives of EHRs and discusses the future directions for the application of this modality in clinical care.
We searched PubMed, Scopus, and Web of Science for studies written in English between 1/1/2000 and 12/31/2020. Seventy-nine studies meeting the eligibility criteria were included. We abstracted and summarized information related to the study purpose, patient population, type/source/amount of unstructured PRO data, linguistic features, and NLP systems/toolkits for processing unstructured PROs in EHRs.
Most of the studies used NLP/ML techniques to extract PROs from clinical narratives (n = 74) and mapped the extracted PROs into specific PRO domains for phenotyping or clustering purposes (n = 26). Some studies used NLP/ML to process PROs for predicting disease progression or onset of adverse events (n = 22) or developing/validating NLP/ML pipelines for analyzing unstructured PROs (n = 19). Studies used different linguistic features, including lexical, syntactic, semantic, and contextual features, to process unstructured PROs. Among the 25 NLP systems/toolkits we identified, 15 used rule-based NLP, 6 used hybrid NLP, and 4 used non-neural ML algorithms embedded in NLP.
This study supports the potential utility of different NLP/ML techniques in processing unstructured PROs available in EHRs for clinical care. Though using annotation rules for NLP/ML to analyze unstructured PROs is dominant, deploying novel neural ML-based methods is warranted.
• This systematic review summarizes the NLP/ML applications for analyzing PROs in clinical narratives of EHRs.
• Most studies used NLP/ML techniques to extract unstructured PROs or predict disease progression.
• Studies used different linguistic (i.e., lexical, syntactic, semantic, contextual) features to process the unstructured PROs.
• Using annotation rules to analyze unstructured PROs is dominant, yet deploying novel neural ML-based methods is warranted.
• Applying NLP/ML techniques to analyze unstructured PROs in EHRs can facilitate the clinical decision-making process. |
|---|---|
| Bibliography: | Author contributorship: Conceptualization: Jin-ah Sim, I-Chan Huang; Data curation: Jin-ah Sim; Funding acquisition: Melissa M. Hudson, Justin N. Baker, I-Chan Huang; Methodology: Jin-ah Sim, Xiaolei Huang, Christopher M. Stewart, I-Chan Huang; Project administration: I-Chan Huang; Resources: I-Chan Huang; Supervision: I-Chan Huang; Visualization: Jin-ah Sim; Writing - original draft preparation: Jin-ah Sim, I-Chan Huang; Writing - review & editing: Xiaolei Huang, Madeline R. Horan, Christopher M. Stewart, Leslie L. Robison, Melissa M. Hudson, Justin N. Baker, I-Chan Huang. All authors have read and agreed to the submitted version of the manuscript. |
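
The summary reports that rule-based NLP was the most common approach among the identified systems and that studies drew on lexical and contextual features. As a rough illustration only, the sketch below shows what a very small rule-based extractor might look like. The lexicon `PRO_LEXICON`, the cue list `NEGATION_CUES`, the function `extract_pros`, and the three-token negation window are all hypothetical assumptions made for this example; none of them is taken from the review or from any system it covers.

```python
import re

# Hypothetical lexicon mapping PRO domains to surface terms (illustrative only,
# not drawn from the review).
PRO_LEXICON = {
    "pain": ["pain", "ache", "soreness"],
    "fatigue": ["fatigue", "tired", "exhausted"],
}

# Naive negation cues (an assumption); real systems use far richer context rules.
NEGATION_CUES = ("no", "denies", "without", "not")


def extract_pros(note, window=3):
    """Return (domain, term, negated) tuples for lexicon terms found in a note."""
    # Lexical feature: lowercase alphabetic tokens matched against the lexicon.
    tokens = re.findall(r"[a-z]+", note.lower())
    findings = []
    for i, token in enumerate(tokens):
        for domain, terms in PRO_LEXICON.items():
            if token in terms:
                # Contextual feature: a negation cue within the preceding
                # `window` tokens marks the mention as negated.
                preceding = tokens[max(0, i - window):i]
                negated = any(cue in preceding for cue in NEGATION_CUES)
                findings.append((domain, token, negated))
    return findings


example = "Patient reports severe pain in the knee but denies fatigue."
print(extract_pros(example))
```

A fixed token window is the simplest possible stand-in for contextual analysis; it will misfire on longer sentences, which is one reason hybrid and ML-based methods are also discussed in the summary.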