Optimizing Software Defect Classification with Locally Linear Embedding Techniques

Bibliographic Details
Published in Proceedings. Annual Reliability and Maintainability Symposium, pp. 1-7
Main Authors Salboukh, Fatemeh; Saki, Hesam; Fiondella, Lance
Format Conference Proceeding
Language English
Published IEEE 27.01.2025
ISSN 2577-0993
DOI 10.1109/RAMS48127.2025.10935043

Summary: Software defect classification is crucial for enhancing the quality and reliability of software. This research explores integrating Locally Linear Embedding (LLE) into the preprocessing stage of classification models to improve accuracy. LLE is particularly effective because it transforms complex, nonlinear data into a simpler, linear format. This transformation reduces the complexity of the data, makes analysis more straightforward, and can improve the predictive accuracy of classifiers. For this study, we used the KC2 dataset from the PROMISE repository and applied several feature selection techniques, including Correlation-Based Feature Selection, Gain Ratio, Sequential Forward Selection, and K-nearest neighbors. After the most informative features were selected, LLE was applied to linearize the data, with the aim of improving classifier performance by supplying input that is easier to interpret and analyze. The impact of LLE was evaluated by comparing the results of five classifiers (Random Forest, Gradient Boosting, Support Vector Machines, Non-Linear Support Vector Machines, and Naive Bayes) on both the original and the LLE-processed datasets. Our results demonstrated an average improvement of 4.5% in accuracy and 4% in F1-score across the classifiers on the LLE-processed data, confirming the effectiveness of transforming nonlinear data into a linear space for enhancing software defect classification.
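
The workflow described in the summary (feature selection, then LLE, then a comparison of classifiers with and without the embedding) could be sketched roughly as follows. This is an illustrative sketch using scikit-learn, not the authors' implementation. Everything specific is an assumption: the data is synthetic rather than KC2, forward selection scored by a k-nearest-neighbors model stands in for the paper's feature selection step, and the neighbor count, number of embedding dimensions, number of selected features, and classifier settings are arbitrary placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in data (assumption: this is NOT the KC2 dataset).
X, y = make_classification(n_samples=300, n_features=20, n_informative=6,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Assumed feature selection step: forward selection scored by a k-NN model.
# The number of features kept (8) is a placeholder, not a value from the paper.
selector = SequentialFeatureSelector(
    KNeighborsClassifier(n_neighbors=5),
    n_features_to_select=8, direction="forward", cv=3)
X_train_sel = selector.fit(X_train, y_train).transform(X_train)
X_test_sel = selector.transform(X_test)


def make_classifiers():
    """Fresh instances of the five classifier families named in the summary."""
    return {
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
        "Gradient Boosting": GradientBoostingClassifier(random_state=0),
        "SVM (linear kernel)": SVC(kernel="linear"),
        "SVM (RBF kernel)": SVC(kernel="rbf"),
        "Naive Bayes": GaussianNB(),
    }


def evaluate(use_lle):
    """Return {classifier name: (accuracy, F1)} on the held-out split."""
    scores = {}
    for name, clf in make_classifiers().items():
        steps = [StandardScaler()]
        if use_lle:
            # Hyperparameters are illustrative only.
            steps.append(LocallyLinearEmbedding(
                n_neighbors=10, n_components=3, eigen_solver="dense"))
        model = make_pipeline(*steps, clf)
        model.fit(X_train_sel, y_train)
        pred = model.predict(X_test_sel)
        scores[name] = (accuracy_score(y_test, pred),
                        f1_score(y_test, pred, zero_division=0))
    return scores


baseline = evaluate(use_lle=False)
with_lle = evaluate(use_lle=True)
for name in baseline:
    acc0, f0 = baseline[name]
    acc1, f1 = with_lle[name]
    print(f"{name}: accuracy {acc0:.3f} -> {acc1:.3f}, F1 {f0:.3f} -> {f1:.3f}")
```

Placing the embedding inside the pipeline means LLE is fitted on the training split only and then applied to the test split through its out-of-sample `transform`. Scores on this synthetic data say nothing about the improvements reported in the summary; the sketch only shows the shape of the comparison.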