Enabling Scientific Reproducibility through FAIR Data Management: An ontology-driven deep learning approach in the NeuroBridge Project
Published in | AMIA ... Annual Symposium proceedings Vol. 2022; p. 1135 |
---|---|
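Main Authors | Xiaochen Wang, Yue Wang, José-Luis Ambite, Abhishek Appaji, Howard Lander, Stephen M Moore, Arcot K Rajasekar, Jessica A Turner, Matthew D Turner, Lei Wang, Satya S Sahoo |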
Format | Journal Article |
Language | English |
Published | United States 2022 |
Subjects | Algorithms; Deep Learning; Humans; PubMed; Reproducibility of Results; Search Engine |
ISSN | 1942-597X 1559-4076 |
Abstract | Scientific reproducibility that effectively leverages existing study data is critical to the advancement of research in many disciplines, including neuroscience, which uses imaging and electrophysiology modalities as primary endpoints or key dependencies in studies. We are developing an integrated search platform called NeuroBridge to enable researchers to search for relevant study datasets that can be used to test a hypothesis or replicate a published finding without having to perform a difficult search from scratch, including contacting individual study authors and locating the site to download the data. In this paper, we describe the development of a metadata ontology based on the World Wide Web Consortium (W3C) PROV specifications to create a corpus of semantically annotated published papers. This annotated corpus was used in a deep learning model to support automated identification of candidate datasets related to neurocognitive assessment of subjects with drug abuse or schizophrenia using neuroimaging. We built on our previous work in the Provenance for Clinical and Health Research (ProvCaRe) project to model metadata information in the NeuroBridge ontology and used this ontology to annotate 51 articles using a Web-based tool called Inception. The Bidirectional Encoder Representations from Transformers (BERT) neural network model, which was trained using the annotated corpus, was used to classify and rank papers relevant to five research hypotheses, and the results were evaluated independently by three users for accuracy and recall. Our combined use of the NeuroBridge ontology together with the deep learning model outperforms the existing PubMed Central (PMC) search engine and shows considerable trainability and transparency compared with typical free-text search. An initial version of the NeuroBridge portal is available at: https://neurobridges.org/. |
---|---|
Author | Moore, Stephen M Wang, Yue Appaji, Abhishek Turner, Matthew D Wang, Xiaochen Rajasekar, Arcot K Lander, Howard Ambite, José-Luis Wang, Lei Sahoo, Satya S Turner, Jessica A |
Author_xml | – sequence: 1 givenname: Xiaochen surname: Wang fullname: Wang, Xiaochen organization: Pennsylvania State University, State College, PA, USA – sequence: 2 givenname: Yue surname: Wang fullname: Wang, Yue organization: University of North Carolina at Chapel Hill, Chapel Hill, NC, USA – sequence: 3 givenname: José-Luis surname: Ambite fullname: Ambite, José-Luis organization: University of Southern California, Los Angeles, CA, USA – sequence: 4 givenname: Abhishek surname: Appaji fullname: Appaji, Abhishek organization: B.M.S. College of Engineering, Bangalore, India – sequence: 5 givenname: Howard surname: Lander fullname: Lander, Howard organization: Renaissance Computing Institute, Chapel Hill, NC, USA – sequence: 6 givenname: Stephen M surname: Moore fullname: Moore, Stephen M organization: Washington University in St. Louis, St. Louis, MO, USA – sequence: 7 givenname: Arcot K surname: Rajasekar fullname: Rajasekar, Arcot K organization: Renaissance Computing Institute, Chapel Hill, NC, USA – sequence: 8 givenname: Jessica A surname: Turner fullname: Turner, Jessica A organization: Georgia State University, Atlanta, GA, USA – sequence: 9 givenname: Matthew D surname: Turner fullname: Turner, Matthew D organization: Georgia State University, Atlanta, GA, USA – sequence: 10 givenname: Lei surname: Wang fullname: Wang, Lei organization: Ohio State University, Columbus, OH, USA – sequence: 11 givenname: Satya S surname: Sahoo fullname: Sahoo, Satya S organization: Case Western Reserve University, Cleveland, OH, USA |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/37128458$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
Copyright | 2022 AMIA - All rights reserved. |
Discipline | Medicine |
EISSN | 1559-4076 |
ExternalDocumentID | 37128458 |
Genre | Journal Article |
ISSN | 1942-597X |
IsPeerReviewed | true |
IsScholarly | true |
Language | English |
License | 2022 AMIA - All rights reserved. |
PMID | 37128458 |
PublicationCentury | 2000 |
PublicationDate | 2022-00-00 20220101 |
PublicationDateYYYYMMDD | 2022-01-01 |
PublicationDate_xml | – year: 2022 text: 2022-00-00 |
PublicationDecade | 2020 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States |
PublicationTitle | AMIA ... Annual Symposium proceedings |
PublicationTitleAlternate | AMIA Annu Symp Proc |
PublicationYear | 2022 |
StartPage | 1135 |
SubjectTerms | Algorithms Deep Learning Humans PubMed Reproducibility of Results Search Engine |
Title | Enabling Scientific Reproducibility through FAIR Data Management: An ontology-driven deep learning approach in the NeuroBridge Project |
URI | https://www.ncbi.nlm.nih.gov/pubmed/37128458 https://www.proquest.com/docview/2808586768 |
Volume | 2022 |