A Content-Based Model for Tag Recommendation in Software Information Sites

Bibliographic Details
Published in Computer journal Vol. 64; no. 11; pp. 1680 - 1691
Main Authors Gharibi, Reza, Safdel, Atefeh, Fakhrahmad, Seyed Mostafa, Sadreddini, Mohammad Hadi
Format Journal Article
Language English
Published Oxford University Press 01.11.2021
ISSN 0010-4620
EISSN 1460-2067
DOI 10.1093/comjnl/bxz144

Abstract Developers use software information sites such as Stack Overflow to get and give information on various subjects. These sites allow developers to label content with tags as a short description. Tags, then, are used to describe, categorize and search the posted content. However, tags might be noisy, and postings may become poorly categorized since people tag a posting based on their knowledge of its content and other existing tags. To keep the content well organized, tag recommendation systems can help users by suggesting appropriate tags for their posted content. In this paper, we propose a tag recommendation scheme that uses the textual content of already tagged postings to recommend suitable tags for newly posted content. Our approach combines multi-label classification and textual similarity techniques to improve the performance of tag recommendation. We evaluate the performance of the proposed scheme on 11 software information sites from the Stack Exchange network. The results show a significant improvement over TagCombine, TagMulRec and FastTagRec, which are well-known tag recommendation systems. On average, the proposed model outperforms TagCombine, TagMulRec and FastTagRec by 26.2, 15.9 and 13.8% in terms of Recall@5 and by 16.9, 12.4 and 9.4% in terms of Recall@10, respectively.
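The Recall@5 and Recall@10 figures in the abstract can be made concrete with a small sketch. The abstract does not spell out the formula, so the code below assumes the definition commonly used in tag recommendation work: for each posting, the fraction of its true tags that appear among the top k recommended tags, averaged over all postings. The function name `recall_at_k` and the sample tags are hypothetical and serve only as illustration.

```python
from typing import Sequence, Set


def recall_at_k(recommended: Sequence[Sequence[str]],
                actual: Sequence[Set[str]],
                k: int) -> float:
    """Mean over postings of |top-k recommended tags & true tags| / |true tags|.

    Assumed definition; the paper itself may use a variant.
    """
    if len(recommended) != len(actual):
        raise ValueError("recommended and actual must have the same length")
    scores = []
    for ranked, truth in zip(recommended, actual):
        if not truth:
            continue  # postings without any true tag are skipped
        hits = len(set(ranked[:k]) & truth)
        scores.append(hits / len(truth))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical ranked recommendations and true tags for two postings.
recommended = [
    ["python", "pandas", "numpy", "csv", "dataframe"],
    ["java", "spring", "maven", "hibernate", "json"],
]
actual = [{"python", "pandas", "matplotlib"}, {"java", "json"}]

print(recall_at_k(recommended, actual, k=5))
print(recall_at_k(recommended, actual, k=2))
```

Under this assumed definition, a larger k can only keep or raise the score for a fixed ranking, which is why Recall@10 values are typically at least as high as Recall@5 values for the same system.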
Author_xml – sequence: 1
  givenname: Reza
  surname: Gharibi
  fullname: Gharibi, Reza
  email: gharibi@cse.shirazu.ac.ir
  organization: Department of Computer Science and Engineering and IT, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
– sequence: 2
  givenname: Atefeh
  surname: Safdel
  fullname: Safdel, Atefeh
  organization: Department of Computer Science and Engineering and IT, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
– sequence: 3
  givenname: Seyed Mostafa
  surname: Fakhrahmad
  fullname: Fakhrahmad, Seyed Mostafa
  email: fakhrahmad@shirazu.ac.ir
  organization: Department of Computer Science and Engineering and IT, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
– sequence: 4
  givenname: Mohammad Hadi
  surname: Sadreddini
  fullname: Sadreddini, Mohammad Hadi
  organization: Department of Computer Science and Engineering and IT, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
CitedBy_id crossref_primary_10_1145_3708532
crossref_primary_10_32604_cmc_2024_050389
ContentType Journal Article
Copyright The British Computer Society 2019. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com 2019
Copyright_xml – notice: The British Computer Society 2019. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com 2019
Discipline Computer Science
EISSN 1460-2067
EndPage 1691
ISSN 0010-4620
IsPeerReviewed true
IsScholarly true
Issue 11
Keywords textual analysis
recommender system
tag recommendation
software information site
Language English
License This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
PageCount 12
PublicationCentury 2000
PublicationDate 2021-11-01
PublicationDateYYYYMMDD 2021-11-01
PublicationDate_xml – month: 11
  year: 2021
  text: 2021-11-01
  day: 01
PublicationDecade 2020
PublicationTitle Computer journal
PublicationYear 2021
Publisher Oxford University Press
Publisher_xml – name: Oxford University Press
StartPage 1680
Title A Content-Based Model for Tag Recommendation in Software Information Sites
Volume 64