High-throughput phenotyping with electronic medical record data using a common semi-supervised approach (PheCAP)
| Published in | Nature protocols Vol. 14; no. 12; pp. 3426 - 3444 |
|---|---|
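| Main Authors | Zhang, Yichi; Cai, Tianrun; Yu, Sheng; Cho, Kelly; Hong, Chuan; Sun, Jiehuan; Huang, Jie; Ho, Yuk-Lam; Ananthakrishnan, Ashwin N.; Xia, Zongqi; Shaw, Stanley Y.; Gainer, Vivian; Castro, Victor; Link, Nicholas; Honerlaw, Jacqueline; Huang, Sicong; Gagnon, David; Karlson, Elizabeth W.; Plenge, Robert M.; Szolovits, Peter; Savova, Guergana; Churchill, Susanne; O’Donnell, Christopher; Murphy, Shawn N.; Gaziano, J. Michael; Kohane, Isaac; Cai, Tianxi; Liao, Katherine P. |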
| Format | Journal Article |
| Language | English |
| Published | London: Nature Publishing Group UK, 01.12.2019; Nature Publishing Group |
| ISSN | 1754-2189 (print); 1750-2799 (electronic) |
| DOI | 10.1038/s41596-019-0227-6 |
| Abstract | Phenotypes are the foundation for clinical and genetic studies of disease risk and outcomes. The growth of biobanks linked to electronic medical record (EMR) data has both facilitated and increased the demand for efficient, accurate, and robust approaches for phenotyping millions of patients. Challenges to phenotyping with EMR data include variation in the accuracy of codes, as well as the high level of manual input required to identify features for the algorithm and to obtain gold standard labels. To address these challenges, we developed PheCAP, a high-throughput semi-supervised phenotyping pipeline. PheCAP begins with data from the EMR, including structured data and information extracted from the narrative notes using natural language processing (NLP). The standardized steps integrate automated procedures, which reduce the level of manual input, and machine learning approaches for algorithm training. PheCAP itself can be executed in 1–2 d if all data are available; however, the timing is largely dependent on the chart review stage, which typically requires at least 2 weeks. The final products of PheCAP include a phenotype algorithm, the probability of the phenotype for all patients, and a phenotype classification (yes or no). PheCAP takes structured data and narrative notes from electronic medical records and enables patients with a particular clinical phenotype to be identified. |
|---|---|
| Audience | Academic |
| Author | Huang, Sicong; Zhang, Yichi; Xia, Zongqi; Gainer, Vivian; Liao, Katherine P.; Castro, Victor; Plenge, Robert M.; Churchill, Susanne; Link, Nicholas; Sun, Jiehuan; Hong, Chuan; Szolovits, Peter; Cai, Tianxi; Karlson, Elizabeth W.; Kohane, Isaac; Gagnon, David; Ho, Yuk-Lam; Savova, Guergana; Shaw, Stanley Y.; Cho, Kelly; Huang, Jie; Ananthakrishnan, Ashwin N.; Gaziano, J. Michael; Murphy, Shawn N.; O’Donnell, Christopher; Yu, Sheng; Cai, Tianrun; Honerlaw, Jacqueline |
| AuthorAffiliation_xml | 1. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; 2. Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital, Boston, MA, USA; 3. Center for Statistical Science, Tsinghua University, Beijing, China; 4. Department of Industrial Engineering, Tsinghua University, Beijing, China; 5. Division of Data Sciences, VA Boston Healthcare System, Boston, MA; 6. Department of Gastroenterology, Massachusetts General Hospital, Boston, MA; 7. Department of Neurology, University of Pittsburgh, Pittsburgh, PA; 8. Division of Cardiovascular Medicine, Brigham and Women’s Hospital, Boston, MA; 9. Research Information Science and Computing, Partners Healthcare, Boston, MA; 10. Department of Biostatistics, Boston University, Boston, MA, USA; 11. Inflammation & Immunology Thematic Center of Excellence (TCoE) Unit, Celgene, Cambridge, MA (contribution to study prior to current affiliation); 12. Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA; 13. Computational Health Informatics Program, Children’s Hospital, Boston, MA; 14. Department of Biomedical Informatics, Harvard Medical School, Boston, MA; 15. Division of Cardiology, VA Boston Healthcare System, Boston, MA; 16. Department of Neurology, Massachusetts General Hospital, Boston, MA; 17. Division of Aging, Brigham and Women’s Hospital, Boston, MA |
| Author_xml | 1. Zhang, Yichi: Department of Biostatistics, Harvard T.H. Chan School of Public Health; 2. Cai, Tianrun: Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital; 3. Yu, Sheng: Center for Statistical Science, Tsinghua University / Department of Industrial Engineering, Tsinghua University; 4. Cho, Kelly: Division of Data Sciences, VA Boston Healthcare System / Division of Aging, Brigham and Women’s Hospital; 5. Hong, Chuan: Department of Biostatistics, Harvard T.H. Chan School of Public Health; 6. Sun, Jiehuan: Department of Biostatistics, Harvard T.H. Chan School of Public Health; 7. Huang, Jie: Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital; 8. Ho, Yuk-Lam (ORCID 0000-0003-3305-3830): Division of Data Sciences, VA Boston Healthcare System; 9. Ananthakrishnan, Ashwin N.: Department of Gastroenterology, Massachusetts General Hospital; 10. Xia, Zongqi (ORCID 0000-0003-1500-2589): Department of Neurology, University of Pittsburgh; 11. Shaw, Stanley Y.: Division of Cardiovascular Medicine, Brigham and Women’s Hospital; 12. Gainer, Vivian: Research Information Science and Computing, Partners Healthcare; 13. Castro, Victor: Research Information Science and Computing, Partners Healthcare; 14. Link, Nicholas: Division of Data Sciences, VA Boston Healthcare System; 15. Honerlaw, Jacqueline: Division of Data Sciences, VA Boston Healthcare System; 16. Huang, Sicong: Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital; 17. Gagnon, David: Division of Data Sciences, VA Boston Healthcare System / Department of Biostatistics, Boston University; 18. Karlson, Elizabeth W.: Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital; 19. Plenge, Robert M.: Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital; 20. Szolovits, Peter: Department of Electrical Engineering and Computer Science, MIT; 21. Savova, Guergana: Computational Health Informatics Program, Boston Children’s Hospital; 22. Churchill, Susanne: Department of Biomedical Informatics, Harvard Medical School; 23. O’Donnell, Christopher: Division of Data Sciences, VA Boston Healthcare System / Division of Cardiology, VA Boston Healthcare System; 24. Murphy, Shawn N.: Research Information Science and Computing, Partners Healthcare / Department of Biomedical Informatics, Harvard Medical School / Department of Neurology, Massachusetts General Hospital; 25. Gaziano, J. Michael: Division of Data Sciences, VA Boston Healthcare System / Division of Aging, Brigham and Women’s Hospital; 26. Kohane, Isaac: Department of Biomedical Informatics, Harvard Medical School; 27. Cai, Tianxi: Department of Biostatistics, Harvard T.H. Chan School of Public Health / Department of Biomedical Informatics, Harvard Medical School; 28. Liao, Katherine P. (ORCID 0000-0002-4797-3200; email kliao@bwh.harvard.edu): Division of Rheumatology, Immunology, and Allergy, Brigham and Women’s Hospital / Division of Data Sciences, VA Boston Healthcare System / Department of Biomedical Informatics, Harvard Medical School |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/31748751 (MEDLINE/PubMed record) |
| ContentType | Journal Article |
| Copyright | The Author(s), under exclusive licence to Springer Nature Limited 2019; COPYRIGHT 2019 Nature Publishing Group; Copyright Nature Publishing Group Dec 2019 |
| Discipline | Biology |
| EISSN | 1750-2799 |
| EndPage | 3444 |
| ExternalDocumentID | oai:dash.harvard.edu:1/42083016 PMC7323894 A606960849 31748751 10_1038_s41596_019_0227_6 |
| Genre | Research Support, U.S. Gov't, Non-P.H.S.; Research Support, Non-U.S. Gov't; Journal Article; Research Support, N.I.H., Extramural |
| GeographicLocations | United States |
| GrantInformation_xml | Pfizer (Pfizer Inc.), funder ID https://doi.org/10.13039/100004319; U.S. Department of Health & Human Services, National Institutes of Health (NIH), grants P30 AR072577, U54 LM008748, NINDS098023, R01 HG009174, T32 AR007530, funder ID https://doi.org/10.13039/100000002; Harold and DuVal Bowen Fund; Office of Research and Development (VHA Office of Research and Development), grant I01-CX001025, funder ID https://doi.org/10.13039/100006379; NIAMS NIH HHS, grants P30 AR072577, T32 AR007530; NHGRI NIH HHS, grant R01 HG009174; CSRD VA, grant I01 CX001025; NLM NIH HHS, grant U54 LM008748; NINDS NIH HHS, grant R01 NS098023 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 12 |
| License | cc-by |
| Notes | AUTHOR CONTRIBUTIONS: YZ, TC1, SY, CH, JS, JH, ANA, ZX, SYS, VG, VC, NL, DWK, RMP, PS, GS, SC, SNM, IK, TC2, and KPL contributed to the development of the pipeline; YZ, TC1, SY, CH, JS, NL, and TC2 contributed to the development of the R package and software used in this protocol; YZ, TC, KC, CH, JS, JH, HL, ANA, ZX, SYS, VG, VC, NL, JH, SH, DG, PS, GS, SC, CO, SNM, JMG, IK, TC, and KPL contributed to the validation of and enhancements to the pipeline; YZ, TC1, SY, CH, JS, VG, VC, GS, TC2, and KPL drafted the manuscript; all authors contributed to revisions and proofreading of the manuscript. contributed equally to the work |
| ORCID | 0000-0002-4797-3200 0000-0003-1500-2589 0000-0003-3305-3830 |
| OpenAccessLink | https://proxy.k.utb.cz/login?url=http://nrs.harvard.edu/urn-3:HUL.InstRepos:42083016 |
| PMID | 31748751 |
| PQID | 2319190907 |
| PQPubID | 536306 |
| PageCount | 19 |
| PublicationCentury | 2000 |
| PublicationDate | 2019-12-01 |
| PublicationDateYYYYMMDD | 2019-12-01 |
| PublicationDate_xml | – month: 12 year: 2019 text: 2019-12-01 day: 01 |
| PublicationDecade | 2010 |
| PublicationPlace | London |
| PublicationPlace_xml | – name: London – name: England |
| PublicationSubtitle | Recipes for Researchers |
| PublicationTitle | Nature protocols |
| PublicationTitleAbbrev | Nat Protoc |
| PublicationTitleAlternate | Nat Protoc |
| PublicationYear | 2019 |
| Publisher | Nature Publishing Group UK Nature Publishing Group |
| Publisher_xml | – name: Nature Publishing Group UK – name: Nature Publishing Group |
| SourceID | unpaywall pubmedcentral proquest gale pubmed crossref springer |
| SourceType | Open Access Repository Aggregation Database Index Database Enrichment Source Publisher |
| StartPage | 3426 |
| SubjectTerms | 631/1647/48; 631/1647/794; 692/53; 692/699; Algorithms; Analysis; Analytical Chemistry; Biological Techniques; Biomedical and Life Sciences; Computational Biology/Bioinformatics; Data Analysis; Data Interpretation, Statistical; Electronic health records; Electronic Health Records - statistics & numerical data; Electronic medical records; Electronic records; Genotype & phenotype; Health risks; High-throughput screening (Biochemical assaying); High-Throughput Screening Assays - methods; Humans; Learning algorithms; Life Sciences; Machine Learning; Medical records; Methods; Microarrays; Natural Language Processing; Organic Chemistry; Patients; Phenotype; Phenotypes; Phenotyping; Protocol |
| Title | High-throughput phenotyping with electronic medical record data using a common semi-supervised approach (PheCAP) |
| URI | https://link.springer.com/article/10.1038/s41596-019-0227-6 https://www.ncbi.nlm.nih.gov/pubmed/31748751 https://www.proquest.com/docview/2319190907 https://www.proquest.com/docview/2316783592 https://pubmed.ncbi.nlm.nih.gov/PMC7323894 http://nrs.harvard.edu/urn-3:HUL.InstRepos:42083016 |
| UnpaywallVersion | submittedVersion |
| Volume | 14 |