Extracting Diagnoses and Investigation Results from Unstructured Text in Electronic Health Records by Semi-Supervised Machine Learning
Electronic health records are invaluable for medical research, but much of the information is recorded as unstructured free text, which is time-consuming to review manually. The aim was to develop an algorithm to identify relevant free texts automatically based on labelled examples. We developed a novel machine...
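The full abstract in this record evaluates each classifier by precision (positive predictive value) and recall (sensitivity) against manual review. As a minimal sketch of how those two measures are defined, the snippet below computes them from confusion counts. The function name `precision_recall` and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from confusion counts.

    tp: texts flagged by the classifier and confirmed by manual review
    fp: texts flagged by the classifier but rejected by manual review
    fn: relevant texts that the classifier missed
    """
    # Precision (positive predictive value): share of flagged texts that are relevant.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall (sensitivity): share of relevant texts that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for illustration only; not data from the paper.
p, r = precision_recall(tp=8, fp=2, fn=4)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.80 recall=0.67
```

A classifier can trade one measure for the other, which is why the abstract reports both figures for every algorithm rather than a single accuracy value.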
| Published in | PLoS ONE, Vol. 7, no. 1, p. e30412 |
|---|---|
| Main Authors | Wang, Zhuoran; Shah, Anoop D.; Tate, A. Rosemary; Denaxas, Spiros; Shawe-Taylor, John; Hemingway, Harry |
| Format | Journal Article |
| Language | English |
| Published | United States: Public Library of Science (PLoS), 19.01.2012 |
| ISSN | 1932-6203 |
| DOI | 10.1371/journal.pone.0030412 |
| Abstract | Background: Electronic health records are invaluable for medical research, but much of the information is recorded as unstructured free text, which is time-consuming to review manually. Aim: To develop an algorithm to identify relevant free texts automatically based on labelled examples. Methods: We developed a novel machine learning algorithm, the 'Semi-supervised Set Covering Machine' (S3CM), and tested its ability to detect the presence of coronary angiogram results and ovarian cancer diagnoses in free text in the General Practice Research Database. For training the algorithm, we used texts classified as positive and negative according to their associated Read diagnostic codes, rather than by manual annotation. We evaluated the precision (positive predictive value) and recall (sensitivity) of S3CM in classifying unlabelled texts against the gold standard of manual review. We compared the performance of S3CM with the Transductive Support Vector Machine (TSVM), the original fully-supervised Set Covering Machine (SCM) and our 'Freetext Matching Algorithm' natural language processor. Results: Only 60% of texts with Read codes for angiogram actually contained angiogram results. However, the S3CM algorithm achieved 87% recall with 64% precision on detecting coronary angiogram results, outperforming the fully-supervised SCM (recall 78%, precision 60%) and TSVM (recall 2%, precision 3%). For ovarian cancer diagnoses, S3CM had higher recall than the other algorithms tested (86%). The Freetext Matching Algorithm had better precision than S3CM (85% versus 74%) but lower recall (62%). Conclusions: Our novel S3CM machine learning algorithm effectively detected free texts in primary care records associated with angiogram results and ovarian cancer diagnoses, after training on pre-classified test sets. It should be easy to adapt to other disease areas as it does not rely on linguistic rules, but needs further testing in other electronic health record datasets. |
| Audience | Academic |
| Author | Hemingway, Harry; Shawe-Taylor, John; Shah, Anoop D.; Tate, A. Rosemary; Wang, Zhuoran; Denaxas, Spiros |
| AuthorAffiliation | 1 Department of Computer Science, University College London, London, United Kingdom; 2 School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, United Kingdom; 3 Clinical Epidemiology Group, Department of Epidemiology and Public Health, University College London, London, United Kingdom; 4 Department of Informatics, University of Sussex, Brighton, United Kingdom; Dana-Farber Cancer Institute, United States of America |
| Author_xml | 1. Wang, Zhuoran; 2. Shah, Anoop D.; 3. Tate, A. Rosemary; 4. Denaxas, Spiros; 5. Shawe-Taylor, John; 6. Hemingway, Harry |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/22276193 (View this record in MEDLINE/PubMed) |
| ContentType | Journal Article |
| Copyright | COPYRIGHT 2012 Public Library of Science. 2012 Wang et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License: https://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| Discipline | Sciences (General); Public Health; Computer Science; Medicine; Biology; Mathematics |
| DocumentTitleAlternate | Machine Learning to Extract Information from Text |
| EISSN | 1932-6203 |
| ExternalDocumentID | 1323078583 oai_doaj_org_article_d17e9e6b6da44b479e895c355a40dd42 10.1371/journal.pone.0030412 PMC3261909 2935099161 A477066162 22276193 10_1371_journal_pone_0030412 |
| Genre | Research Support, Non-U.S. Gov't; Journal Article |
| GrantInformation_xml | Wellcome Trust (grant 093830); Wellcome Trust (grant 0938/30/Z/10/Z); Department of Health (grant RP-PG-0407-10314); Wellcome Trust (grant 086091/Z/08/Z) |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Language | English |
| License | This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. CC BY (Creative Commons Attribution License). |
| Notes | Conceived and designed the experiments: ZW. Performed the experiments: ZW. Analyzed the data: ZW ADS ART. Wrote the paper: ADS. Reviewed and contributed to the manuscript: ZW ART SD JST HH. Obtained anonymised free text from GPRD for testing the algorithm: SD ART. Study supervision: JST HH. |
| OpenAccessLink | http://journals.scholarsportal.info/openUrl.xqy?doi=10.1371/journal.pone.0030412 |
| PMID | 22276193 |
| PageCount | e30412 |
| PublicationCentury | 2000 |
| PublicationDate | 2012-01-19 |
| PublicationDateYYYYMMDD | 2012-01-19 |
| PublicationDecade | 2010 |
| PublicationPlace | United States |
| PublicationPlace_xml | United States; San Francisco; San Francisco, USA |
| PublicationTitle | PloS one |
| PublicationTitleAlternate | PLoS One |
| PublicationYear | 2012 |
| Publisher | Public Library of Science Public Library of Science (PLoS) |
| Snippet | Electronic health records are invaluable for medical research, but much of the information is recorded as unstructured free text which is time-consuming to... |
| StartPage | e30412 |
| SubjectTerms | Active learning; Algorithms; Angiography; Annotations; Artificial Intelligence; Biology; Cancer; Cancer research; Codes; Computer Science; Confidentiality; Data mining; Diagnostic systems; Electronic Health Records; Electronic medical records; Electronic records; Epidemiology; Evaluation; Family medicine; Female; Health care; Health informatics; Health services; Humans; International conferences; Language; Learning algorithms; Machine learning; Male; Matching; Mathematics; Medical diagnosis; Medical records; Medical research; Medicine; Microprocessors; Morphology; Natural language processing; Ovarian cancer; Ovarian carcinoma; Ovarian Neoplasms - diagnosis; Patients; Primary care; Public health; R&D; Recall; Research & development; Semi-supervised learning; Sensitivity analysis; Studies; Test sets; Texts; Training; Unstructured data |
| Title | Extracting Diagnoses and Investigation Results from Unstructured Text in Electronic Health Records by Semi-Supervised Machine Learning |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/22276193 https://www.proquest.com/docview/1323078583 https://www.proquest.com/docview/918037056 https://pubmed.ncbi.nlm.nih.gov/PMC3261909 https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0030412&type=printable https://doaj.org/article/d17e9e6b6da44b479e895c355a40dd42 http://dx.doi.org/10.1371/journal.pone.0030412 |
| UnpaywallVersion | publishedVersion |
| Volume | 7 |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Extracting+Diagnoses+and+Investigation+Results+from+Unstructured+Text+in+Electronic+Health+Records+by+Semi-Supervised+Machine+Learning&rft.jtitle=PloS+one&rft.au=Shawe-Taylor%2C+John&rft.au=Wang%2C+Zhuoran&rft.au=Hemingway%2C+Harry&rft.au=Shah%2C+Anoop+D&rft.date=2012-01-19&rft.pub=Public+Library+of+Science&rft.issn=1932-6203&rft.eissn=1932-6203&rft.volume=7&rft.issue=1&rft.spage=e30412&rft_id=info:doi/10.1371%2Fjournal.pone.0030412&rft.externalDBID=n%2Fa&rft.externalDocID=A477066162 |