Use of natural language processing in electronic medical records to identify pregnant women with suicidal behavior: towards a solution to the complex classification problem

Bibliographic Details
Published in European journal of epidemiology Vol. 34; no. 2; pp. 153-162
Main Authors Zhong, Qiu-Yue; Mittal, Leena P.; Nathan, Margo D.; Brown, Kara M.; González, Deborah Knudson; Cai, Tianrun; Finan, Sean; Gelaye, Bizu; Avillach, Paul; Smoller, Jordan W.; Karlson, Elizabeth W.; Cai, Tianxi; Williams, Michelle A.
Format Journal Article
Language English
Published Dordrecht Springer Science + Business Media 01.02.2019
Springer Netherlands
Springer Nature B.V
ISSN 0393-2990
1573-7284
DOI 10.1007/s10654-018-0470-0

Summary: We developed algorithms to identify pregnant women with suicidal behavior using information extracted from clinical notes by natural language processing (NLP) in electronic medical records. Using both codified data and NLP applied to unstructured clinical notes, we first screened pregnant women in Partners HealthCare for suicidal behavior. Psychiatrists manually reviewed clinical charts to identify relevant features for suicidal behavior and to obtain gold-standard labels. Using the adaptive elastic net, we developed algorithms to classify suicidal behavior. We then validated the algorithms in an independent validation dataset. From 275,843 women with codes related to pregnancy or delivery, 9,331 women screened positive for suicidal behavior by either codified data (N = 196) or NLP (N = 9,145). Using expert-curated features, our algorithm achieved an area under the curve of 0.83. By setting a positive predictive value comparable to that of diagnostic codes related to suicidal behavior (0.71), we obtained a sensitivity of 0.34, specificity of 0.96, and negative predictive value of 0.83. The algorithm identified 1,423 pregnant women with suicidal behavior among the 9,331 women who screened positive. Mining unstructured clinical notes using NLP resulted in an 11-fold increase in the number of pregnant women identified with suicidal behavior, compared with reliance on diagnostic codes alone.