Systematic tissue annotations of genomics samples by modeling unstructured metadata

Bibliographic Details
Published in Nature communications Vol. 13; no. 1; pp. 6736 - 13
Main Authors Hawkins, Nathaniel T., Maldaver, Marc, Yannakopoulos, Anna, Guare, Lindsay A., Krishnan, Arjun
Format Journal Article
Language English
Published London Nature Publishing Group UK 08.11.2022
Nature Publishing Group
Nature Portfolio
ISSN 2041-1723
DOI 10.1038/s41467-022-34435-x

Abstract There are currently >1.3 million human -omics samples that are publicly available. This valuable resource remains acutely underused because discovering particular samples from this ever-growing data collection remains a significant challenge. The major impediment is that sample attributes are routinely described using varied terminologies written in unstructured natural language. We propose a natural-language-processing-based machine learning approach (NLP-ML) to infer tissue and cell-type annotations for genomics samples based only on their free-text metadata. NLP-ML works by creating numerical representations of sample descriptions and using these representations as features in a supervised learning classifier that predicts tissue/cell-type terms. Our approach significantly outperforms an advanced graph-based reasoning annotation method (MetaSRA) and a baseline exact string matching method (TAGGER). Model similarities between related tissues demonstrate that NLP-ML models capture biologically meaningful signals in text. Additionally, these models correctly classify tissue-associated biological processes and diseases based on their text descriptions alone. NLP-ML models are nearly as accurate as models based on gene-expression profiles in predicting sample tissue annotations but have the distinct capability to classify samples irrespective of the genomics experiment type based on their text metadata. Python NLP-ML prediction code and trained tissue models are available at https://github.com/krishnanlab/txt2onto. The 1+ million publicly available human -omics samples currently remain acutely underused. Here the authors present an approach combining natural language processing and machine learning to infer the source tissue of public genomics samples based on their plain text descriptions, making these samples easy to discover and reuse.
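The pipeline the abstract describes (turn each sample description into a numerical representation, then feed that representation to a supervised classifier for one tissue term) can be illustrated with a small sketch. This is not the authors' implementation and does not use the txt2onto code: the representation here is a plain normalized bag-of-words vector rather than whatever text embedding the paper uses, the classifier is a hand-written logistic regression, and every function name and sample description below is made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a free-text description and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(descriptions):
    """Assign a feature index to every token seen in the training descriptions."""
    vocab = {}
    for text in descriptions:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab


def featurize(text, vocab):
    """Numerical representation of a description: L2-normalized token counts.

    Assumption: a bag-of-words stand-in for the paper's text representation.
    Tokens outside the training vocabulary are ignored.
    """
    counts = Counter(tok for tok in tokenize(text) if tok in vocab)
    vec = [0.0] * len(vocab)
    for tok, n in counts.items():
        vec[vocab[tok]] = float(n)
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def train_logreg(X, y, epochs=500, lr=0.5):
    """Fit a binary logistic regression with batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(text, vocab, w, b):
    """Probability that a description belongs to the positive tissue class."""
    x = featurize(text, vocab)
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical training descriptions; label 1 means "liver", 0 means "not liver".
train_texts = [
    "hepatocytes isolated from adult liver biopsy",
    "liver tissue from hepatocellular carcinoma patient",
    "primary hepatocyte culture liver donor",
    "whole blood from healthy donor",
    "cortical neurons from brain tissue",
    "skeletal muscle biopsy from adult",
]
train_labels = [1, 1, 1, 0, 0, 0]

vocab = build_vocab(train_texts)
X = [featurize(t, vocab) for t in train_texts]
w, b = train_logreg(X, train_labels)

liver_score = predict_proba("RNA-seq of human liver sample", vocab, w, b)
blood_score = predict_proba("peripheral blood mononuclear cells", vocab, w, b)
print(f"liver-like description: {liver_score:.2f}")
print(f"blood-like description: {blood_score:.2f}")
```

In this toy setup one such model would be trained per tissue or cell-type term, and a new sample would be scored against each of them. Because the input is only text, the same model applies regardless of the experiment type that produced the sample, which is the property the abstract highlights.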
ArticleNumber 6736
Author Yannakopoulos, Anna
Hawkins, Nathaniel T.
Maldaver, Marc
Guare, Lindsay A.
Krishnan, Arjun
Author_xml – sequence: 1
  givenname: Nathaniel T.
  surname: Hawkins
  fullname: Hawkins, Nathaniel T.
  organization: Department of Computational Mathematics, Science and Engineering, Michigan State University
– sequence: 2
  givenname: Marc
  orcidid: 0000-0001-9689-2768
  surname: Maldaver
  fullname: Maldaver, Marc
  organization: Department of Computational Mathematics, Science and Engineering, Michigan State University
– sequence: 3
  givenname: Anna
  surname: Yannakopoulos
  fullname: Yannakopoulos, Anna
  organization: Department of Computational Mathematics, Science and Engineering, Michigan State University
– sequence: 4
  givenname: Lindsay A.
  orcidid: 0000-0001-6988-5319
  surname: Guare
  fullname: Guare, Lindsay A.
  organization: Department of Computational Mathematics, Science and Engineering, Michigan State University, Department of Biochemistry and Molecular Biology, Michigan State University, Department of Microbiology and Molecular Genetics, Michigan State University
– sequence: 5
  givenname: Arjun
  orcidid: 0000-0002-7980-4110
  surname: Krishnan
  fullname: Krishnan, Arjun
  email: arjun.krishnan@cuanschutz.edu
  organization: Department of Computational Mathematics, Science and Engineering, Michigan State University, Department of Biochemistry and Molecular Biology, Michigan State University, Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus
BackLink https://www.ncbi.nlm.nih.gov/pubmed/36347858 (View this record in MEDLINE/PubMed)
CitedBy_id crossref_primary_10_3389_frai_2024_1366273
crossref_primary_10_1093_bib_bbae652
crossref_primary_10_1016_j_chip_2025_100135
crossref_primary_10_3390_metabo13080941
Cites_doi 10.1038/sdata.2017.125
10.1136/jamia.2009.002733
10.3390/healthcare8020120
10.1093/bioinformatics/btn520
10.1093/biostatistics/kxp059
10.1007/s12551-018-0490-8
10.1093/bioinformatics/btx334
10.1093/nar/gni179
10.1186/gb-2005-6-2-r21
10.1093/nar/gks1193
10.1186/gb-2012-13-1-r5
10.1073/pnas.2001238117
10.1038/sdata.2016.18
10.1186/s12859-020-03694-0
10.1038/ncomms12846
10.1093/bioinformatics/btg405
10.1093/nar/gku1057
10.1093/nar/gkaa1062
10.1038/s41576-020-0257-5
10.1371/journal.pbio.1002195
10.1186/s12859-017-1888-1
10.1038/s41467-019-11461-w
10.1093/bioinformatics/btt529
10.3389/fgene.2020.610798
10.1038/ng1201-365
10.1016/j.jbi.2017.06.017
10.1093/bioinformatics/btaa034
10.1093/nar/gky1061
10.1093/nar/gky102
10.1016/j.cels.2018.12.010
10.1186/1471-2105-10-S2-S1
10.1093/database/bax083
10.1093/database/baab021
10.1186/s13059-021-02332-z
10.1101/078469
10.1038/s41597-019-0258-4
10.1142/9789812776136_0056
10.1093/nar/gky1106
10.5281/zenodo.7232237
10.3389/fgene.2019.00126
10.1093/database/baw080
ContentType Journal Article
Copyright The Author(s) 2022
2022. The Author(s).
The Author(s) 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1038/s41467-022-34435-x
Discipline Biology
EISSN 2041-1723
EndPage 13
ExternalDocumentID oai_doaj_org_article_0d50d62fa1c2425baa816ca3e7125c98
10.1038/s41467-022-34435-x
PMC9643451
36347858
10_1038_s41467_022_34435_x
Genre Research Support, U.S. Gov't, Non-P.H.S
Journal Article
Research Support, N.I.H., Extramural
GrantInformation_xml – fundername: Michigan State University (Michigan State University Spartans)
  funderid: https://doi.org/10.13039/100007709
– fundername: U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences (NIGMS)
  grantid: R35 GM128765
  funderid: https://doi.org/10.13039/100000057
– fundername: NSF | BIO | Division of Biological Infrastructure (DBI)
  grantid: 2045651
  funderid: https://doi.org/10.13039/100000153
– fundername: NIGMS NIH HHS
  grantid: R35 GM128765
ISSN 2041-1723
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License 2022. The Author(s).
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
cc-by
ORCID 0000-0001-9689-2768
0000-0001-6988-5319
0000-0002-7980-4110
OpenAccessLink http://journals.scholarsportal.info/openUrl.xqy?doi=10.1038/s41467-022-34435-x
PMID 36347858
PageCount 13
PublicationCentury 2000
PublicationDate 2022-11-08
PublicationDateYYYYMMDD 2022-11-08
PublicationDate_xml – month: 11
  year: 2022
  text: 2022-11-08
  day: 08
PublicationDecade 2020
PublicationPlace London
PublicationPlace_xml – name: London
– name: England
PublicationTitle Nature communications
PublicationTitleAbbrev Nat Commun
PublicationTitleAlternate Nat Commun
PublicationYear 2022
Publisher Nature Publishing Group UK
Nature Publishing Group
Nature Portfolio
Publisher_xml – name: Nature Publishing Group UK
– name: Nature Publishing Group
– name: Nature Portfolio
References_xml – volume: 4
  year: 2017
  ident: CR11
  article-title: Precision annotation of digital samples in NCBI’s gene expression omnibus
  publication-title: Sci. Data
  doi: 10.1038/sdata.2017.125
– volume: 17
  start-page: 229
  year: 2010
  end-page: 236
  ident: CR19
  article-title: An overview of metamap: historical perspective and recent advances
  publication-title: J. Am. Med. Inform. Assoc.
  doi: 10.1136/jamia.2009.002733
– volume: 8
  start-page: 120
  year: 2020
  ident: CR38
  article-title: Integrated natural language processing and machine learning models for standardizing radiotherapy structure names
  publication-title: Healthcare
  doi: 10.3390/healthcare8020120
– ident: CR49
– volume: 24
  start-page: 2798
  year: 2008
  end-page: 2800
  ident: CR45
  article-title: GEOmetadb: powerful alternative search engine for the gene expression omnibus
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btn520
– volume: 11
  start-page: 242
  year: 2010
  end-page: 253
  ident: CR46
  article-title: Frozen robust multiarray analysis (fRMA)
  publication-title: Biostatistics
  doi: 10.1093/biostatistics/kxp059
– volume: 11
  start-page: 103
  year: 2019
  end-page: 110
  ident: CR16
  article-title: Mining data and metadata from the gene expression omnibus
  publication-title: Biophys. Rev.
  doi: 10.1007/s12551-018-0490-8
– volume: 33
  start-page: 2914
  year: 2017
  end-page: 2923
  ident: CR24
  article-title: MetaSRA: normalized human sample-specific metadata for the Sequence Read Archive
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btx334
– ident: CR35
– ident: CR29
– volume: 33
  start-page: 175
  year: 2005
  ident: CR48
  article-title: Evolving gene/transcript definitions significantly alter the interpretation of GeneChip Data
  publication-title: Nucleic Acids Res.
  doi: 10.1093/nar/gni179
– ident: CR25
– ident: CR42
– volume: 6
  year: 2005
  ident: CR41
  article-title: An ontology for cell types
  publication-title: Genome Biol.
  doi: 10.1186/gb-2005-6-2-r21
– volume: 41
  start-page: D991
  year: 2013
  end-page: D995
  ident: CR3
  article-title: NCBI GEO: archive for functional genomics data sets-update
  publication-title: Nucleic Acids Res.
  doi: 10.1093/nar/gks1193
– volume: 13
  year: 2012
  ident: CR8
  article-title: Uberon, an integrative multi-species anatomy ontology
  publication-title: Genome Biol.
  doi: 10.1186/gb-2012-13-1-r5
– volume: 117
  start-page: 30266
  year: 2020
  end-page: 30275
  ident: CR40
  article-title: Methods for correcting inference based on outcomes predicted by machine learning
  publication-title: Proc. Natl Acad. Sci. USA
  doi: 10.1073/pnas.2001238117
– volume: 3
  year: 2016
  ident: CR39
  article-title: The FAIR guiding principles for scientific data management and stewardship
  publication-title: Sci. Data
  doi: 10.1038/sdata.2016.18
– volume: 21
  year: 2020
  ident: CR9
  article-title: METAGENOTE: a simplified web platform for metadata annotation of genomic samples and streamlined submission to NCBI’s sequence read archive
  publication-title: BMC Bioinforma.
  doi: 10.1186/s12859-020-03694-0
– ident: CR32
– ident: CR36
– volume: 7
  year: 2016
  ident: CR10
  article-title: Extraction and analysis of signatures from the Gene Expression Omnibus by the crowd
  publication-title: Nat. Commun.
  doi: 10.1038/ncomms12846
– ident: CR26
– volume: 20
  start-page: 307
  year: 2004
  end-page: 315
  ident: CR47
  article-title: Affy—Analysis of Affymetrix GeneChip Data at the Probe Level
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btg405
– volume: 43
  start-page: D1113
  year: 2015
  end-page: D1116
  ident: CR1
  article-title: ArrayExpress update-simplifying data submissions
  publication-title: Nucleic Acids Res.
  doi: 10.1093/nar/gku1057
– ident: CR18
– ident: CR43
– ident: CR37
– volume: 49
  start-page: 1502
  year: 2021
  end-page: 1506
  ident: CR2
  article-title: From ArrayExpress to BioStudies
  publication-title: Nucleic Acids Res.
  doi: 10.1093/nar/gkaa1062
– ident: CR6
– volume: 21
  start-page: 615
  year: 2020
  end-page: 629
  ident: CR34
  article-title: Responsible practical genomic data sharing that accelerates research
  publication-title: Nat. Rev. Genet.
  doi: 10.1038/s41576-020-0257-5
– volume: 13
  start-page: 1002195
  year: 2015
  ident: CR15
  article-title: Big data: astronomical or genomical?
  publication-title: PLoS Biol.
  doi: 10.1371/journal.pbio.1002195
– ident: CR27
– volume: 18
  year: 2017
  ident: CR22
  article-title: ALE: automated label extraction from GEO metadata
  publication-title: BMC Bioinforma.
  doi: 10.1186/s12859-017-1888-1
– volume: 10
  year: 2019
  ident: CR14
  article-title: Quantifying the impact of public omics data
  publication-title: Nat. Commun.
  doi: 10.1038/s41467-019-11461-w
– ident: CR44
– volume: 29
  start-page: 3036
  year: 2013
  end-page: 3044
  ident: CR28
  article-title: Ontology-aware classification of tissue and cell-type signals in gene expression profiles across platforms and technologies
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btt529
– ident: CR17
– volume: 18
  start-page: 403
  year: 2017
  end-page: 412
  ident: CR21
  article-title: Ontology-based annotations and semantic relations in large-scale (epi)genomics data
  publication-title: Brief. Bioinforma.
– ident: CR13
– volume: 11
  start-page: 1598
  year: 2020
  ident: CR12
  article-title: State of the field in multi-omics research: from computational needs to data mining and sharing
  publication-title: Front. Genet.
  doi: 10.3389/fgene.2020.610798
– volume: 29
  start-page: 365
  year: 2001
  end-page: 371
  ident: CR4
  article-title: Minimum information about a microarray experiment (MIAME)-toward standards for microarray data
  publication-title: Nat. Genet.
  doi: 10.1038/ng1201-365
– volume: 72
  start-page: 132
  year: 2017
  end-page: 139
  ident: CR23
  article-title: Predicting biomedical metadata in CEDAR: A study of Gene Expression Omnibus (GEO)
  publication-title: J. Biomed. Inf.
  doi: 10.1016/j.jbi.2017.06.017
– volume: 36
  start-page: 2821
  year: 2020
  end-page: 2828
  ident: CR30
  article-title: Differential network analysis of multiple human tissue interactomes highlights tissue-selective processes and genetic disorder genes
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btaa034
– ident: CR7
– volume: 47
  start-page: D1172
  year: 2019
  end-page: D1178
  ident: CR5
  article-title: BioSamples database: an updated sample metadata hub
  publication-title: Nucleic Acids Res.
  doi: 10.1093/nar/gky1061
– volume: 46
  start-page: e54
  year: 2018
  ident: CR31
  article-title: Improving the value of public RNA-seq expression data by phenotype prediction
  publication-title: Nucleic Acids Res.
  doi: 10.1093/nar/gky102
– ident: CR20
– volume: 8
  start-page: 152
  year: 2019
  end-page: 162.e6
  ident: CR33
  article-title: A computational framework for genome-wide characterization of the human disease landscape
  publication-title: Cell Syst.
  doi: 10.1016/j.cels.2018.12.010
Snippet There are currently >1.3 million human –omics samples that are publicly available. This valuable resource remains acutely underused because discovering...
The 1+ million publicly-available human –omics samples currently remain acutely underused. Here the authors present an approach combining natural language...
SourceID doaj
unpaywall
pubmedcentral
proquest
pubmed
crossref
springer
SourceType Open Website
Open Access Repository
Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage 6736
SubjectTerms 38
45
631/114/1305
631/114/2164
631/114/2401
631/1647/514
Annotations
Biological activity
Classification
Data collection
Descriptions
Gene expression
Genomics
Humanities and Social Sciences
Humans
Language
Learning algorithms
Machine Learning
Metadata
multidisciplinary
Natural language
Natural Language Processing
Representations
Science
Science (multidisciplinary)
Signal processing
String matching
Tissues
Unstructured data
Title Systematic tissue annotations of genomics samples by modeling unstructured metadata
URI https://link.springer.com/article/10.1038/s41467-022-34435-x
https://www.ncbi.nlm.nih.gov/pubmed/36347858
https://www.proquest.com/docview/2733868689
https://www.proquest.com/docview/2734617138
https://pubmed.ncbi.nlm.nih.gov/PMC9643451
https://www.nature.com/articles/s41467-022-34435-x.pdf
https://doaj.org/article/0d50d62fa1c2425baa816ca3e7125c98
Volume 13