Python code smells detection using conventional machine learning models

Bibliographic Details
Published in PeerJ. Computer science Vol. 9; p. e1370
Main Authors Sandouka, Rana; Aljamaan, Hamoud
Format Journal Article
Language English
Published United States PeerJ. Ltd 29.05.2023
PeerJ Inc
ISSN 2376-5992
DOI 10.7717/peerj-cs.1370

Abstract Code smells are poor design or implementation choices that hinder code maintenance and reduce software quality, so detecting them is an important part of software development. Recent studies have applied machine learning algorithms to code smell detection, but most of them relied on code smell datasets built from Java code. This article proposes a Python code smell dataset covering the Large Class and Long Method code smells. The dataset contains 1,000 samples for each code smell, with 18 features extracted from the source code. We also investigated the detection performance of six machine learning models as baselines for Python code smell detection, evaluating them with Accuracy and the Matthews correlation coefficient (MCC). The results indicate that the Random Forest ensemble performed best on Python Large Class detection, with the highest MCC of 0.77, while the decision tree performed best on Python Long Method detection, with the highest MCC of 0.89.
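The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes a binary smelly / non-smelly classification summarized by a confusion matrix; the function names and the counts are hypothetical and are not taken from the article, and the convention of returning 0.0 when the MCC denominator is zero is a common choice rather than something the article specifies.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of samples classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when any marginal sum is zero (assumed convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not results from the article.
    tp, tn, fp, fn = 45, 40, 10, 5
    print(f"Accuracy: {accuracy(tp, tn, fp, fn):.2f}")
    print(f"MCC:      {mcc(tp, tn, fp, fn):.2f}")
```

Unlike Accuracy, MCC uses all four cells of the confusion matrix, so a model that only predicts the majority class scores 0 rather than a misleadingly high value.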
ArticleNumber e1370
Audience Academic
Author Aljamaan, Hamoud
Sandouka, Rana
Author_xml – sequence: 1
  givenname: Rana
  surname: Sandouka
  fullname: Sandouka, Rana
  organization: Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
– sequence: 2
  givenname: Hamoud
  surname: Aljamaan
  fullname: Aljamaan, Hamoud
  organization: Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
BackLink https://www.ncbi.nlm.nih.gov/pubmed/37346528$$D View this record in MEDLINE/PubMed
CitedBy_id crossref_primary_10_1016_j_knosys_2024_111390
crossref_primary_10_1038_s41598_023_43380_8
crossref_primary_10_1007_s10664_024_10445_9
crossref_primary_10_1007_s10515_024_00429_w
crossref_primary_10_1007_s42979_025_03680_4
crossref_primary_10_7717_peerj_cs_2254
crossref_primary_10_3846_ntcs_2024_21305
ContentType Journal Article
Copyright 2023 Sandouka and Aljamaan.
COPYRIGHT 2023 PeerJ. Ltd.
DatabaseName CrossRef
PubMed
Gale In Context: Science
MEDLINE - Academic
PubMed Central (Full Participant titles)
Unpaywall for CDI: Periodical Content
Unpaywall
Openly Available Collection - DOAJ
Discipline Computer Science
EISSN 2376-5992
ExternalDocumentID oai_doaj_org_article_918c787e9a4d407ba38fafaca4018ddb
10.7717/peerj-cs.1370
PMC10280480
A751028973
37346528
10_7717_peerj_cs_1370
Genre Journal Article
GrantInformation_xml – fundername: King Fahd University of Petroleum and Minerals (KFUPM)
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords Large class
Code smell
Detection
Long method
Machine learning
Python
Language English
License https://creativecommons.org/licenses/by/4.0
2023 Sandouka and Aljamaan.
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
cc-by
OpenAccessLink https://doaj.org/article/918c787e9a4d407ba38fafaca4018ddb
PMID 37346528
PublicationCentury 2000
PublicationDate 2023-05-29
PublicationDateYYYYMMDD 2023-05-29
PublicationDate_xml – month: 05
  year: 2023
  text: 2023-05-29
  day: 29
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: San Diego, USA
PublicationTitle PeerJ. Computer science
PublicationTitleAlternate PeerJ Comput Sci
PublicationYear 2023
Publisher PeerJ. Ltd
PeerJ Inc
Publisher_xml – name: PeerJ. Ltd
– name: PeerJ Inc
StartPage e1370
SubjectTerms Algorithms
Analysis
Artificial Intelligence
Code smell
Data mining
Data Mining and Machine Learning
Detection
Java (Computer program language)
Large class
Long method
Machine learning
Python
Software Engineering
URI https://www.ncbi.nlm.nih.gov/pubmed/37346528
https://www.proquest.com/docview/2828755655
https://pubmed.ncbi.nlm.nih.gov/PMC10280480
https://doi.org/10.7717/peerj-cs.1370
https://doaj.org/article/918c787e9a4d407ba38fafaca4018ddb
UnpaywallVersion publishedVersion
Volume 9