LG4AV: Combining Language Models and Graph Neural Networks for Author Verification
Published in | arXiv.org |
---|---|
Main Authors | Stubbemann, Maximilian; Stumme, Gerd |
Format | Paper Journal Article |
Language | English |
Published | Ithaca: Cornell University Library, arXiv.org, 2021-09-03 |
ISSN | 2331-8422 |
DOI | 10.48550/arxiv.2109.01479 |
Abstract | The automatic verification of document authorships is important in various settings. Researchers are, for example, judged and compared by the number and impact of their publications, and public figures are confronted with their posts on social media platforms. Therefore, it is important that authorship information in frequently used web services and platforms is correct. The question of whether a given document was written by a given author is commonly referred to as authorship verification (AV). While AV is a widely investigated problem in general, only a few works consider settings where the documents are short and written in a rather uniform style. This makes most approaches impractical for online databases and knowledge graphs in the scholarly domain. Here, authorships of scientific publications have to be verified, often with just abstracts and titles available. To this end, we present our novel approach LG4AV, which combines language models and graph neural networks for authorship verification. By directly feeding the available texts into a pre-trained transformer architecture, our model does not need any hand-crafted stylometric features, which are not meaningful in scenarios where the writing style is, at least to some extent, standardized. By incorporating a graph neural network structure, our model can benefit from relations between authors that are meaningful with respect to the verification process. For example, scientific authors are more likely to write about topics that are addressed by their co-authors, and Twitter users tend to post about the same subjects as the people they follow. We experimentally evaluate our model and study to what extent the inclusion of co-authorships enhances verification decisions in bibliometric environments. |
Author | Stubbemann, Maximilian; Stumme, Gerd |
BackLink | Published paper (access to full text may be restricted): https://doi.org/10.1007/978-3-031-01333-1_25 ; paper in arXiv: https://doi.org/10.48550/arXiv.2109.01479 |
ContentType | Paper Journal Article |
Copyright | 2021. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
Discipline | Physics |
EISSN | 2331-8422 |
Genre | Working Paper/Pre-Print |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
OpenAccessLink | https://arxiv.org/abs/2109.01479 |
PublicationDate | 2021-09-03 |
PublicationTitle | arXiv.org |
PublicationYear | 2021 |
Publisher | Cornell University Library, arXiv.org |
SubjectTerms | Authorship; Bibliometrics; Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Learning; Documents; Graph neural networks; Knowledge representation; Neural networks; Scientific papers; Verification; Web services |
Title | LG4AV: Combining Language Models and Graph Neural Networks for Author Verification |
URI | https://www.proquest.com/docview/2569482833 https://arxiv.org/abs/2109.01479 |
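
The abstract describes LG4AV only at a high level: the available text is fed into a pre-trained transformer, and a graph neural network structure lets the model use relations between authors such as co-authorship. The Python sketch below is a rough illustration of that general idea and not the authors' implementation. Everything in it is an assumption made for illustration: the function names `embed_text`, `aggregate` and `verification_score` are hypothetical, a token-hashing vector stands in for the transformer encoder, and a single neighbourhood average stands in for the graph neural network layer.

```python
import math


def embed_text(text, dim=8):
    """Toy stand-in for a pre-trained transformer encoder (assumption):
    hash lowercase tokens into a fixed-size, L2-normalised count vector."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Deterministic bucket: sum of character codes modulo the dimension.
        vec[sum(ord(c) for c in token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def aggregate(author, features, coauthors):
    """One mean-aggregation step over the author and their co-authors,
    a simple graph-convolution-style neighbourhood average (assumption)."""
    nodes = [author] + sorted(coauthors.get(author, ()))
    dim = len(features[author])
    return [sum(features[n][i] for n in nodes) / len(nodes) for i in range(dim)]


def verification_score(text, author, features, coauthors):
    """Sigmoid of the dot product between the text embedding and the
    neighbourhood-aggregated author vector; higher means more plausible."""
    t = embed_text(text, dim=len(features[author]))
    a = aggregate(author, features, coauthors)
    return 1.0 / (1.0 + math.exp(-sum(x * y for x, y in zip(t, a))))


# Hypothetical usage: author vectors built from one known text each.
features = {
    "author_a": embed_text("graph neural networks for citation data"),
    "author_b": embed_text("language models for short scientific texts"),
}
coauthors = {"author_a": {"author_b"}, "author_b": {"author_a"}}
score = verification_score("graph neural networks", "author_a", features, coauthors)
```

Because the aggregated author vector mixes in the co-author's vector, a document on a topic that only the co-author has written about can still receive a higher score than it would from the author's own texts alone, which is the intuition the abstract gives for including co-authorships.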