Mining Semantic Relations in Data References to Understand the Roles of Research Data in Academic Literature
| Published in | IEEE/ACM Joint Conference on Digital Libraries (Online) pp. 215 - 227 |
|---|---|
| Main Authors | Fan, Lizhou; Lafia, Sara; Wofford, Morgan; Thomer, Andrea; Yakel, Elizabeth; Hemphill, Libby |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01.06.2023 |
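| Subjects | Bibliographies; Data analysis; information extraction; knowledge discovery; Libraries; Particle measurements; research data management; semantic triples; Semantics; Social sciences; Technological innovation; text mining |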
| Online Access | https://ieeexplore.ieee.org/document/10266038 |
| ISSN | 2575-8152 |
| DOI | 10.1109/JCDL57899.2023.00039 |
| Abstract | Research data serves important roles in scientific discovery and academic innovation. To appropriately assign credit for data work and to measure the value of research data, it is essential to articulate how data are actually used in research. We leveraged a combination of computational methods and human analysis to characterize different types of data use by mining semantic relations from the phrases where data are referenced in academic literature. In particular, we investigated references to data in the bibliography of a large social science data archive, the Inter-university Consortium for Political and Social Research (ICPSR). After retrieving and extracting semantic relations as subject-relation-object triples, we used rule-based methods to classify them. We then annotated samples from 11 frequent classes of data reference triples and found that they vary primarily along two dimensions of data use: proximity and function. Proximity describes the distance between the author and the data they reference (e.g., direct or indirect engagement). Function describes the role that data plays in each reference (e.g., describing interaction or providing context). These semantic relationships between authors and data reveal the ways data are used in scientific publications. Evidence of the variety of ways data are used can help stakeholders in research data curation and stewardship - including data providers, data curators, and data users - recognize the myriad ways that their investments in data sharing are realized. |
|---|---|
| Author | Fan, Lizhou; Wofford, Morgan; Yakel, Elizabeth; Thomer, Andrea; Hemphill, Libby; Lafia, Sara |
| Author_xml | 1. Fan, Lizhou (lizhouf@umich.edu), School of Information, University of Michigan, Ann Arbor, Michigan, USA; 2. Lafia, Sara (slafia@umich.edu), Inter-university Consortium for Political and Social Research, University of Michigan, Ann Arbor, Michigan, USA; 3. Wofford, Morgan (mwofford@umich.edu), School of Information, University of Michigan, Ann Arbor, Michigan, USA; 4. Thomer, Andrea (athomer@arizona.edu), School of Information, University of Michigan, Ann Arbor, Michigan, USA; 5. Yakel, Elizabeth (yakel@umich.edu), School of Information, University of Michigan, Ann Arbor, Michigan, USA; 6. Hemphill, Libby (libbyh@umich.edu), Inter-university Consortium for Political and Social Research, University of Michigan, Ann Arbor, Michigan, USA |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| Discipline | Library & Information Science |
| EISBN | 9798350399318 |
| EISSN | 2575-8152 |
| EndPage | 227 |
| ExternalDocumentID | 10266038 |
| Genre | orig-research |
| GrantInformation_xml | Funder: National Science Foundation; grant IDs: 1930645, 2121789; funder ID: 10.13039/1000000010 |
| IsPeerReviewed | false |
| IsScholarly | false |
| PageCount | 13 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-June |
| PublicationDateYYYYMMDD | 2023-06-01 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE/ACM Joint Conference on Digital Libraries (Online) |
| PublicationTitleAbbrev | JCDL |
| PublicationYear | 2023 |
| Publisher | IEEE |
| StartPage | 215 |
| SubjectTerms | Bibliographies; Data analysis; information extraction; knowledge discovery; Libraries; Particle measurements; research data management; semantic triples; Semantics; Social sciences; Technological innovation; text mining |
| URI | https://ieeexplore.ieee.org/document/10266038 |
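
The abstract in this record mentions classifying subject-relation-object triples with rule-based methods. The record does not give the actual rules or the names of the 11 classes, so the following is only a minimal Python sketch of what a rule-based triple classifier can look like. The `Triple` type, the `RULES` table, the cue words, and the labels (`direct_use`, `indirect_use`, `context`) are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A subject-relation-object triple extracted from a data reference."""
    subject: str
    relation: str
    obj: str


# Hypothetical rules: (label, subject cue words, relation cue words).
# An empty subject cue set means "any subject". These cues are
# illustrative assumptions, not the rules used by the authors.
RULES = [
    ("direct_use", {"we", "i", "authors"}, {"use", "used", "analyze", "analyzed"}),
    ("indirect_use", {"study", "studies", "researchers"}, {"use", "used", "draw", "drew"}),
    ("context", set(), {"include", "includes", "contain", "contains"}),
]


def classify(triple: Triple) -> str:
    """Return the label of the first rule whose cues match the triple."""
    subject_tokens = set(triple.subject.lower().split())
    relation_tokens = set(triple.relation.lower().split())
    for label, subject_cues, relation_cues in RULES:
        subject_ok = not subject_cues or bool(subject_tokens & subject_cues)
        if subject_ok and relation_tokens & relation_cues:
            return label
    return "unclassified"


if __name__ == "__main__":
    examples = [
        Triple("We", "used", "data from the survey"),
        Triple("Previous studies", "drew on", "ICPSR data"),
        Triple("The dataset", "includes", "household records"),
    ]
    for t in examples:
        print(t.subject, "|", t.relation, "|", t.obj, "->", classify(t))
```

Rules are checked in order and the first match wins, which keeps the behavior easy to audit by hand. Under these assumed cues, a first-person subject loosely corresponds to direct engagement with data and a third-party subject to indirect engagement, echoing the proximity dimension described in the abstract.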