Dynamic stacking ensemble for cross-language code smell detection

Bibliographic Details
Published in	PeerJ. Computer science Vol. 10; p. e2254
Main Author	Aljamaan, Hamoud
Format	Journal Article
Language	English
Published	United States PeerJ. Ltd 15.08.2024 PeerJ Inc
Subjects	Analysis; Artificial Intelligence; Code smell; Data Mining and Machine Learning; Detection; Dynamic ensemble; Ensemble learning; Machine learning; Neural Networks; Programming Languages; Software Engineering; Stacking ensemble
ISSN	2376-5992
DOI	10.7717/peerj-cs.2254

Summary:	Code smells are poor design and implementation choices by software engineers that can degrade overall software quality. Code smell detection using machine learning has become a popular research area, with the aim of building effective models that can detect different code smells in multiple programming languages. However, the process of building such models has not yet stabilized, and most existing research focuses on Java code smell detection. The main objective of this article is to propose dynamic ensembles built with two strategies, namely greedy search and backward elimination, which can accurately detect code smells in two programming languages (i.e., Java and Python) and which are less complex than full stacking ensembles. The detection performance of dynamic ensembles was investigated for four Java and two Python code smells. The greedy search and backward elimination strategies yielded different lists of base models for building dynamic ensembles. Compared with full stacking ensembles, dynamic ensembles yielded less complex models for most of the investigated Java and Python code smells, with the backward elimination strategy resulting in less complex models. Dynamic ensembles performed comparably to full stacking ensembles, with no significant loss in detection performance. The article concludes that dynamic stacking ensembles facilitated effective and stable detection of Java and Python code smells across all base models, with less complexity than full stacking ensembles.
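
The summary names two selection strategies, greedy search and backward elimination, without procedural detail. The following is a minimal Python sketch of what such strategies could look like when choosing base models for a stacking ensemble. It is not the article's implementation: the function names, the `score` callback, the model labels, and the toy numbers are all assumptions made for illustration. In practice `score` would be something like the cross-validated detection performance of a stacking ensemble built from the given base models.

```python
from typing import Callable, FrozenSet, List, Sequence

# Hypothetical callback: maps a set of base-model names to a validation score.
ScoreFn = Callable[[FrozenSet[str]], float]


def greedy_search(candidates: Sequence[str], score: ScoreFn) -> List[str]:
    """Forward selection: start empty and keep adding the base model that
    raises the score the most; stop when no addition improves it."""
    selected: List[str] = []
    remaining = list(candidates)
    best = float("-inf")
    while remaining:
        scored = [(score(frozenset(selected + [m])), m) for m in remaining]
        top_score, top_model = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no candidate improves the ensemble
        selected.append(top_model)
        remaining.remove(top_model)
        best = top_score
    return selected


def backward_elimination(
    candidates: Sequence[str], score: ScoreFn, tolerance: float = 0.0
) -> List[str]:
    """Start from the full ensemble and keep dropping the base model whose
    removal costs the least, while the score stays within `tolerance`."""
    selected = list(candidates)
    best = score(frozenset(selected))
    while len(selected) > 1:
        scored = [
            (score(frozenset(x for x in selected if x != m)), m) for m in selected
        ]
        top_score, to_drop = max(scored, key=lambda pair: pair[0])
        if top_score < best - tolerance:
            break  # every removal hurts more than the tolerance allows
        selected.remove(to_drop)
        best = max(best, top_score)
    return selected


# Toy stand-in for a real evaluation (assumed values, integer points to
# avoid floating-point ties): "knn" and "lr" contribute nothing.
CONTRIBUTION = {"dt": 70, "svm": 10, "nb": 5}
CANDIDATES = ["dt", "svm", "nb", "knn", "lr"]


def toy_score(models: FrozenSet[str]) -> float:
    return sum(CONTRIBUTION.get(m, 0) for m in models)


greedy_models = greedy_search(CANDIDATES, toy_score)
backward_models = backward_elimination(CANDIDATES, toy_score)
```

In this toy setting both strategies discard the two non-contributing models. With a real, noisy scoring function the two strategies can end up with different model lists, which is consistent with what the summary reports.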