A classification method based on encoder‐decoder structure with paper content
| Published in | Concurrency and computation Vol. 34; no. 9 |
|---|---|
| Main Authors | Yin, Yi; Ouyang, Lin; Wu, Zhixiang; Yin, Shuifang |
| Format | Journal Article |
| Language | English |
| Published | Hoboken, USA; John Wiley & Sons, Inc; 25.04.2022; Wiley Subscription Services, Inc |
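| Subjects | Algorithms; Classification; CNN; Coders; encoder‐decoder; hidden information; Machine learning; Mathematical analysis; Neural networks; Similarity; word embedding |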
| ISSN | 1532-0626 1532-0634 |
| DOI | 10.1002/cpe.5737 |
| Abstract | Paper classification aims to group papers according to the similarity of their content. Classifying accurately by the content a paper expresses, however, remains a problem that every classification algorithm has to face. One existing family of deep-learning methods is built on the encoder‐decoder structure: the words from a large number of papers are fed into the encoder, a neural network (NN) algorithm processes them, and the resulting similarity between papers is used for classification. Such methods consider only the similarity between words. A single NN algorithm processes the large volume of word information in one pass and cannot find the regularities of classification from word information alone, and word similarity differs from content similarity. This paper instead starts from the content: label information is extracted from it, and the input vector of the encoder‐decoder structure is formed from both labels and words, which improves the original encoder‐decoder classification method. First, the label information is derived from the content and therefore reflects it. Second, combining label information with word information reflects the content of a paper more comprehensively. Third, the label information is kept independent of the word information and is processed by a separate NN algorithm, which makes this part of the content more consistent within the encoder‐decoder structure. Finally, the output values that the different NN algorithms produce for the label information and the word information are combined to classify the content. Experiments on paper data from Web of Science demonstrate the effectiveness of the proposed method. |
| Author | Yin, Yi; Yin, Shuifang; Ouyang, Lin; Wu, Zhixiang |
| Author_xml | 1. Yin, Yi (ORCID 0000-0001-9421-6096; yinyi@wust.edu.cn), Wuhan University of Computer Science and Technology; 2. Ouyang, Lin, Wuhan University of Computer Science and Technology; 3. Wu, Zhixiang, Wuhan University of Computer Science and Technology; 4. Yin, Shuifang, Wuhan University of Computer Science and Technology |
| ContentType | Journal Article |
| Copyright | 2020 John Wiley & Sons, Ltd. 2022 John Wiley & Sons, Ltd. |
| DOI | 10.1002/cpe.5737 |
| Discipline | Computer Science |
| EISSN | 1532-0634 |
| ISSN | 1532-0626 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 9 |
| Language | English |
| ORCID | 0000-0001-9421-6096 |
| PageCount | 7 |
| PublicationDate | 25 April 2022 |
| PublicationPlace | Hoboken, USA |
| PublicationTitle | Concurrency and computation |
| PublicationYear | 2022 |
| Publisher | John Wiley & Sons, Inc Wiley Subscription Services, Inc |
| SubjectTerms | Algorithms; Classification; CNN; Coders; encoder‐decoder; hidden information; Machine learning; Mathematical analysis; Neural networks; Similarity; word embedding |
| Title | A classification method based on encoder‐decoder structure with paper content |
| URI | https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fcpe.5737 https://www.proquest.com/docview/2641929201 |
| Volume | 34 |
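
The abstract describes encoding word information and label information in separate NN branches and then combining their outputs to compare papers by similarity. The following is only a minimal sketch of that idea, not the authors' implementation: the vocabularies, the hand-picked weight matrices, the single tanh layer per branch, and the nearest-reference cosine rule are all assumptions made for illustration, and no decoder or training step is shown.

```python
import math

def bag_of_items(items, vocabulary):
    """Count how often each vocabulary entry occurs in `items`."""
    return [float(items.count(entry)) for entry in vocabulary]

def dense_tanh(vector, weights):
    """One dense layer with tanh: out[j] = tanh(sum_i vector[i] * weights[i][j])."""
    width = len(weights[0])
    return [math.tanh(sum(v * row[j] for v, row in zip(vector, weights)))
            for j in range(width)]

def encode_paper(words, labels):
    """Encode words and labels in separate branches, then concatenate the codes."""
    word_code = dense_tanh(bag_of_items(words, WORD_VOCAB), W_WORDS)
    label_code = dense_tanh(bag_of_items(labels, LABEL_VOCAB), W_LABELS)
    return word_code + label_code

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def classify(code, references):
    """Return the class whose reference code is most similar to `code`."""
    return max(references, key=lambda name: cosine(code, references[name]))

# Hypothetical toy vocabularies and fixed (not learned) weights.
WORD_VOCAB = ["network", "image", "protein", "gene"]
LABEL_VOCAB = ["computer-science", "biology"]
W_WORDS = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
W_LABELS = [[1.0, 0.0], [0.0, 1.0]]

references = {
    "cs": encode_paper(["network", "image"], ["computer-science"]),
    "bio": encode_paper(["protein", "gene"], ["biology"]),
}

# The words alone are ambiguous (one word from each field);
# the label branch is what separates the two classes here.
query = encode_paper(["network", "gene"], ["biology"])
predicted = classify(query, references)
```

In this toy setup the word branch gives the query equal similarity to both references, so the label branch decides the outcome, which mirrors the abstract's argument that word similarity alone does not capture content similarity.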