Enabling Scientific Reproducibility through FAIR Data Management: An ontology-driven deep learning approach in the NeuroBridge Project

Bibliographic Details
Published in AMIA ... Annual Symposium proceedings Vol. 2022; p. 1135
Main Authors Wang, Xiaochen; Wang, Yue; Ambite, José-Luis; Appaji, Abhishek; Lander, Howard; Moore, Stephen M; Rajasekar, Arcot K; Turner, Jessica A; Turner, Matthew D; Wang, Lei; Sahoo, Satya S
Format Journal Article
Language English
Published United States 2022
ISSN 1942-597X
EISSN 1559-4076

Abstract Scientific reproducibility that effectively leverages existing study data is critical to the advancement of research in many disciplines, including neuroscience, where imaging and electrophysiology modalities serve as primary endpoints or key dependencies in studies. We are developing an integrated search platform called NeuroBridge that enables researchers to find relevant study datasets for testing a hypothesis or replicating a published finding without having to perform a difficult search from scratch, which includes contacting individual study authors and locating the site from which to download the data. In this paper, we describe the development of a metadata ontology based on the World Wide Web Consortium (W3C) PROV specifications, which we used to create a corpus of semantically annotated published papers. This annotated corpus was used to train a deep learning model that supports automated identification of candidate datasets related to neurocognitive assessment of subjects with drug abuse or schizophrenia using neuroimaging. We built on our previous work in the Provenance for Clinical and Health Research (ProvCaRe) project to model metadata information in the NeuroBridge ontology and used this ontology to annotate 51 articles using a Web-based tool called Inception. A Bidirectional Encoder Representations from Transformers (BERT) neural network model, trained on the annotated corpus, was used to classify and rank papers relevant to five research hypotheses, and the results were evaluated independently by three users for accuracy and recall. Our combined use of the NeuroBridge ontology and the deep learning model outperforms the existing PubMed Central (PMC) search engine and shows considerable trainability and transparency compared with typical free-text search. An initial version of the NeuroBridge portal is available at: https://neurobridges.org/.
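The abstract describes ranking papers against a research hypothesis using ontology annotations. The sketch below is only a simplified illustration of that idea and is not the BERT model from the paper: it assumes each paper has already been annotated with a set of ontology concepts and ranks papers by Jaccard overlap with the concepts in a query. The class `AnnotatedPaper`, the function names, the paper identifiers, and the concept labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedPaper:
    """A paper together with the ontology concepts annotated in its text."""
    paper_id: str        # hypothetical identifier
    concepts: frozenset  # hypothetical ontology concept labels


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_papers(query_concepts, papers):
    """Return (paper_id, score) pairs sorted by descending concept overlap."""
    query = frozenset(query_concepts)
    scored = [(p.paper_id, jaccard(query, p.concepts)) for p in papers]
    # Sort by score descending, then by id so that ties are deterministic.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Toy corpus with made-up annotations (not data from the paper).
papers = [
    AnnotatedPaper("paper-A", frozenset({"Schizophrenia", "FunctionalMRI", "WorkingMemoryTask"})),
    AnnotatedPaper("paper-B", frozenset({"SubstanceUseDisorder", "StructuralMRI"})),
    AnnotatedPaper("paper-C", frozenset({"Schizophrenia", "StructuralMRI"})),
]

# A hypothesis expressed as the ontology concepts it involves.
ranking = rank_papers({"Schizophrenia", "StructuralMRI"}, papers)
for paper_id, score in ranking:
    print(f"{paper_id}\t{score:.2f}")
```

Set overlap is used here only because it makes the role of the annotations easy to see; a learned model such as the one described in the abstract would replace the `jaccard` scoring step.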
Author affiliations
1. Wang, Xiaochen: Pennsylvania State University, State College, PA, USA
2. Wang, Yue: University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
3. Ambite, José-Luis: University of Southern California, Los Angeles, CA, USA
4. Appaji, Abhishek: B.M.S. College of Engineering, Bangalore, India
5. Lander, Howard: Renaissance Computing Institute, Chapel Hill, NC, USA
6. Moore, Stephen M: Washington University in St. Louis, St. Louis, MO, USA
7. Rajasekar, Arcot K: Renaissance Computing Institute, Chapel Hill, NC, USA
8. Turner, Jessica A: Georgia State University, Atlanta, GA, USA
9. Turner, Matthew D: Georgia State University, Atlanta, GA, USA
10. Wang, Lei: Ohio State University, Columbus, OH, USA
11. Sahoo, Satya S: Case Western Reserve University, Cleveland, OH, USA
BackLink https://www.ncbi.nlm.nih.gov/pubmed/37128458 (MEDLINE/PubMed record)
ContentType Journal Article
Copyright 2022 AMIA - All rights reserved.
Discipline Medicine
EISSN 1559-4076
ExternalDocumentID 37128458
Genre Journal Article
ISSN 1942-597X
IsPeerReviewed true
IsScholarly true
Language English
PMID 37128458
PublicationDate 2022-00-00
20220101
PublicationDateYYYYMMDD 2022-01-01
PublicationDate_xml – year: 2022
  text: 2022-00-00
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle AMIA ... Annual Symposium proceedings
PublicationTitleAlternate AMIA Annu Symp Proc
PublicationYear 2022
StartPage 1135
SubjectTerms Algorithms
Deep Learning
Humans
PubMed
Reproducibility of Results
Search Engine
URI https://www.ncbi.nlm.nih.gov/pubmed/37128458
https://www.proquest.com/docview/2808586768
Volume 2022