A Content-Based Model for Tag Recommendation in Software Information Sites
Published in | Computer journal Vol. 64; no. 11; pp. 1680 - 1691 |
---|---|
Main Authors | Gharibi, Reza; Safdel, Atefeh; Fakhrahmad, Seyed Mostafa; Sadreddini, Mohammad Hadi |
Format | Journal Article |
Language | English |
Published | Oxford University Press, 01.11.2021 |
ISSN | 0010-4620 1460-2067 |
DOI | 10.1093/comjnl/bxz144 |
Abstract | Developers use software information sites such as Stack Overflow to get and give information on various subjects. These sites allow developers to label content with tags as a short description. Tags, then, are used to describe, categorize and search the posted content. However, tags might be noisy, and postings may become poorly categorized since people tag a posting based on their knowledge of its content and other existing tags. To keep the content well organized, tag recommendation systems can help users by suggesting appropriate tags for their posted content. In this paper, we propose a tag recommendation scheme that uses the textual content of already tagged postings to recommend suitable tags for newly posted content. Our approach combines multi-label classification and textual similarity techniques to improve the performance of tag recommendation. We evaluate the performance of the proposed scheme on 11 software information sites from the Stack Exchange network. The results show a significant improvement over TagCombine, TagMulRec and FastTagRec, which are well-known tag recommendation systems. On average, the proposed model outperforms TagCombine, TagMulRec and FastTagRec by 26.2, 15.9 and 13.8% in terms of Recall@5 and by 16.9, 12.4 and 9.4% in terms of Recall@10, respectively. |
---|---|
Author | Sadreddini, Mohammad Hadi; Safdel, Atefeh; Fakhrahmad, Seyed Mostafa; Gharibi, Reza |
Author_xml | – sequence: 1; givenname: Reza; surname: Gharibi; fullname: Gharibi, Reza; email: gharibi@cse.shirazu.ac.ir; organization: Department of Computer Science and Engineering and IT, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran – sequence: 2; givenname: Atefeh; surname: Safdel; fullname: Safdel, Atefeh; organization: Department of Computer Science and Engineering and IT, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran – sequence: 3; givenname: Seyed Mostafa; surname: Fakhrahmad; fullname: Fakhrahmad, Seyed Mostafa; email: fakhrahmad@shirazu.ac.ir; organization: Department of Computer Science and Engineering and IT, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran – sequence: 4; givenname: Mohammad Hadi; surname: Sadreddini; fullname: Sadreddini, Mohammad Hadi; organization: Department of Computer Science and Engineering and IT, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran |
CitedBy_id | crossref_primary_10_1145_3708532 crossref_primary_10_32604_cmc_2024_050389 |
Cites_doi | 10.1109/ICPC.2015.18 10.1007/978-3-319-25255-1_22 10.1007/s11390-015-1578-2 10.1017/CBO9780511809071 10.1145/1882362.1882435 10.1007/s10515-018-0239-4 10.1109/MSR.2013.6624040 10.1007/BFb0026683 10.1093/comjnl/bxs040 10.1145/1864708.1864741 10.2307/3001968 10.1109/ICSME.2014.51 10.1109/TSE.2010.91 10.1002/smr.1805 10.1002/smr.1706 |
ContentType | Journal Article |
Copyright | The British Computer Society 2019. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com 2019 |
Copyright_xml | – notice: The British Computer Society 2019. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com 2019 |
DBID | AAYXX CITATION |
DOI | 10.1093/comjnl/bxz144 |
DatabaseName | CrossRef |
DatabaseTitle | CrossRef |
DatabaseTitleList | CrossRef |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Computer Science |
EISSN | 1460-2067 |
EndPage | 1691 |
ExternalDocumentID | 10_1093_comjnl_bxz144 10.1093/comjnl/bxz144 |
ID | FETCH-LOGICAL-c270t-cb5b9e3b5536272293e9f5340efc1ba5cfe6c0ba9e681f5fbbb104bd857a3c033 |
ISSN | 0010-4620 |
IngestDate | Wed Oct 01 03:33:45 EDT 2025; Thu Apr 24 22:53:37 EDT 2025; Wed Aug 28 03:17:35 EDT 2024 |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 11 |
Keywords | textual analysis; recommender system; tag recommendation; software information site |
Language | English |
License | This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) |
LinkModel | OpenURL |
MergedId | FETCHMERGED-LOGICAL-c270t-cb5b9e3b5536272293e9f5340efc1ba5cfe6c0ba9e681f5fbbb104bd857a3c033 |
PageCount | 12 |
ParticipantIDs | crossref_citationtrail_10_1093_comjnl_bxz144 crossref_primary_10_1093_comjnl_bxz144 oup_primary_10_1093_comjnl_bxz144 |
ProviderPackageCode | CITATION AAYXX |
PublicationCentury | 2000 |
PublicationDate | 2021-11-01 |
PublicationDateYYYYMMDD | 2021-11-01 |
PublicationDate_xml | – month: 11 year: 2021 text: 2021-11-01 day: 01 |
PublicationDecade | 2020 |
PublicationTitle | Computer journal |
PublicationYear | 2021 |
Publisher | Oxford University Press |
Publisher_xml | – name: Oxford University Press |
SSID | ssj0002096 |
Score | 2.3118107 |
SourceID | crossref oup |
SourceType | Enrichment Source; Index Database; Publisher |
StartPage | 1680 |
Title | A Content-Based Model for Tag Recommendation in Software Information Sites |
Volume | 64 |
hasFullText | 1 |
inHoldings | 1 |
journalDatabaseRights | – providerCode: PRVEBS databaseName: Inspec with Full Text customDbUrl: eissn: 1460-2067 dateEnd: 20241003 omitProxy: false ssIdentifier: ssj0002096 issn: 0010-4620 databaseCode: ADMLS dateStart: 19960101 isFulltext: true titleUrlDefault: https://www.ebsco.com/products/research-databases/inspec-full-text providerName: EBSCOhost |
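
The abstract in this record reports its results as Recall@5 and Recall@10 but does not define the metric. As a minimal sketch, the snippet below assumes the plain definition: for each post, the fraction of its true tags that appear among the top-k recommended tags, averaged over all posts. The function names and the toy data are hypothetical and are not taken from the paper, and the paper may normalize the metric differently.

```python
from typing import Sequence, Set


def recall_at_k(ranked_tags: Sequence[str], true_tags: Set[str], k: int) -> float:
    """Fraction of a post's true tags found in the top-k recommendations.

    Assumption: plain recall; the paper's exact normalization is not given
    in this record.
    """
    if not true_tags:
        raise ValueError("true_tags must not be empty")
    top_k = set(ranked_tags[:k])
    return len(top_k & true_tags) / len(true_tags)


def mean_recall_at_k(
    recommendations: Sequence[Sequence[str]],
    ground_truth: Sequence[Set[str]],
    k: int,
) -> float:
    """Average Recall@k over all posts in the evaluation set."""
    scores = [recall_at_k(r, t, k) for r, t in zip(recommendations, ground_truth)]
    return sum(scores) / len(scores)


# Hypothetical toy data: ranked recommendations and true tags for two posts.
recommended = [
    ["python", "pandas", "numpy", "csv", "dataframe"],
    ["java", "spring", "maven", "hibernate", "jpa"],
]
actual = [{"python", "pandas", "matplotlib"}, {"java", "jpa"}]

print(mean_recall_at_k(recommended, actual, k=5))
```

In the toy data the first post recovers two of its three true tags and the second recovers both, so the mean is the average of 2/3 and 1. Averages computed this way for each system are the quantities behind a comparison such as the one the abstract makes against TagCombine, TagMulRec and FastTagRec.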