A classification method based on encoder‐decoder structure with paper content

Bibliographic Details
Published in Concurrency and computation Vol. 34; no. 9
Main Authors Yin, Yi; Ouyang, Lin; Wu, Zhixiang; Yin, Shuifang
Format Journal Article
Language English
Published Hoboken, USA John Wiley & Sons, Inc 25.04.2022
Wiley Subscription Services, Inc
ISSN 1532-0626
1532-0634
DOI 10.1002/cpe.5737

Abstract Paper classification methods aim to group papers correctly according to the similarity of their content. Classifying accurately by the content a paper expresses, however, remains a problem that every classification algorithm must face. One current family of deep-learning methods is implemented with an encoder‐decoder structure: the words from a large number of papers are fed into the encoder, a neural network (NN) algorithm processes them, and the degree of similarity between papers is compared to classify them. Such methods consider only the similarity between words. An NN algorithm can process a large amount of word information only once, and it cannot discover the regularities of classification from word information alone; word similarity also differs from content similarity. This paper starts from the content: it extracts label information from each paper and forms the input vector of the encoder‐decoder structure from both labels and words, improving the original encoder‐decoder paper classification method. First, the label information is derived from the content and so reflects the content of the paper. Second, a classification method that combines label information with word information reflects the content more comprehensively. Third, the label information is independent of the word information and is processed by a separate NN algorithm, which makes this part of the content more consistent within the encoder‐decoder structure. Finally, the label information and the word information are each combined with the output values of their respective NN algorithms to classify the content. The effectiveness of the proposed method is evaluated on paper data from Web of Science, and the corresponding experimental results are reported.
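The two-branch idea in the abstract (words and labels encoded separately, then combined for classification) can be illustrated with a toy sketch. This is not the authors' architecture: the dense tanh layers, the random weights, the vocabulary sizes, and the names `encode`, `classify`, and `params` are all assumptions made for illustration, and the final linear-plus-softmax layer is only a stand-in for the decoder side.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Small random weight matrix (a stand-in for trained parameters)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def bag_of_ids(ids, vocab_size):
    """Normalized count vector for a list of integer ids."""
    vec = [0.0] * vocab_size
    for i in ids:
        vec[i] += 1.0
    total = sum(vec) or 1.0
    return [v / total for v in vec]

def encode(vec, weights):
    """One dense layer with tanh: a toy encoder for a single branch."""
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def classify(word_ids, label_ids, params):
    """Encode words and labels separately, concatenate, then score classes."""
    h_words = encode(bag_of_ids(word_ids, params["n_words"]), params["W_words"])
    h_labels = encode(bag_of_ids(label_ids, params["n_labels"]), params["W_labels"])
    combined = h_words + h_labels  # list concatenation of the two branch outputs
    scores = [sum(w * x for w, x in zip(row, combined)) for row in params["W_out"]]
    return softmax(scores)

# Hypothetical sizes: 6 word ids, 3 label ids, 3 output classes.
rng = random.Random(0)
params = {
    "n_words": 6,
    "n_labels": 3,
    "W_words": make_matrix(4, 6, rng),   # word branch: 6 -> 4
    "W_labels": make_matrix(2, 3, rng),  # label branch: 3 -> 2
    "W_out": make_matrix(3, 6, rng),     # combined 4 + 2 = 6 -> 3 classes
}
probs = classify(word_ids=[0, 2, 2, 5], label_ids=[1], params=params)
```

With untrained random weights the resulting `probs` carry no meaning; the sketch only shows the data flow, in which the label branch has its own parameters and is merged with the word branch just before the class scores are computed.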
Author_xml – sequence: 1
  givenname: Yi
  orcidid: 0000-0001-9421-6096
  surname: Yin
  fullname: Yin, Yi
  email: yinyi@wust.edu.cn
  organization: Wuhan University of Computer Science and Technology
– sequence: 2
  givenname: Lin
  surname: Ouyang
  fullname: Ouyang, Lin
  organization: Wuhan University of Computer Science and Technology
– sequence: 3
  givenname: Zhixiang
  surname: Wu
  fullname: Wu, Zhixiang
  organization: Wuhan University of Computer Science and Technology
– sequence: 4
  givenname: Shuifang
  surname: Yin
  fullname: Yin, Shuifang
  organization: Wuhan University of Computer Science and Technology
ContentType Journal Article
Copyright 2020 John Wiley & Sons, Ltd.
2022 John Wiley & Sons, Ltd.
Discipline Computer Science
EISSN 1532-0634
IsPeerReviewed true
IsScholarly true
PageCount 7
SubjectTerms Algorithms
Classification
CNN
Coders
encoder‐decoder
hidden information
Machine learning
Mathematical analysis
Neural networks
Similarity
word embedding
URI https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fcpe.5737
https://www.proquest.com/docview/2641929201