A Comparative Analysis of Metaheuristic Feature Selection Methods in Software Vulnerability Prediction

Bibliographic Details
Published in: E-informatica: Software Engineering Journal, Vol. 19, No. 1
Main Authors: Deepali Bassi, Hardeep Singh
Format: Journal Article
Language: English
Published: Wroclaw University of Science and Technology, 01.01.2025
ISSN: 1897-7979, 2084-4840
DOI: 10.37190/e-inf250103

Summary: Background: Early identification of software vulnerabilities is an essential step in achieving software security. In the era of artificial intelligence, software vulnerability prediction models (VPMs) are built using machine learning and deep learning approaches, and their effectiveness helps improve software quality. Handling imbalanced datasets and reducing dimensionality are important factors that affect the performance of VPMs. Aim: The current study applies novel metaheuristic approaches to feature subset selection. Method: This paper performs a comparative analysis of forty-eight combinations of eight machine learning techniques and six metaheuristic feature selection methods on four public datasets. Results: The experimental results reveal that VPM performance improves after applying the feature selection methods, for both metrics-based and text-mining-based datasets. Additionally, the study applies the Wilcoxon signed-rank test to the results of the metrics-based and text-features-based VPMs to evaluate which outperforms the other. Furthermore, it identifies the best-performing feature selection algorithm for each dataset based on AUC. Finally, this paper outperforms the benchmark studies in terms of F1-score. Conclusion: The results show that GWO (Grey Wolf Optimizer) performed satisfactorily on all the datasets.
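
As a rough illustration of the kind of pipeline the abstract describes (not the authors' code), the sketch below performs wrapper feature selection with a binary Grey Wolf Optimizer, scores candidate feature subsets by cross-validated AUC using a random forest classifier, and then compares per-fold AUCs with and without selection via the Wilcoxon signed-rank test. The synthetic dataset, classifier choice, and GWO parameters are illustrative assumptions only.

# Sketch: binary GWO wrapper feature selection scored by cross-validated AUC,
# followed by a Wilcoxon signed-rank comparison of per-fold AUCs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
# Synthetic stand-in for a vulnerability dataset (assumption, not the paper's data).
X, y = make_classification(n_samples=300, n_features=30, n_informative=8, random_state=0)

def auc_of(mask):
    # Fitness = mean cross-validated AUC of the classifier on the selected features.
    if mask.sum() == 0:
        return 0.0
    clf = RandomForestClassifier(n_estimators=30, random_state=0)
    return cross_val_score(clf, X[:, mask], y, cv=5, scoring="roc_auc").mean()

def binary_gwo(n_features, n_wolves=6, n_iter=10):
    # Wolf positions stay continuous in [0, 1]; a 0.5 threshold yields the feature mask.
    pos = rng.random((n_wolves, n_features))
    fit = np.array([auc_of(p > 0.5) for p in pos])
    for t in range(n_iter):
        a = 2 - 2 * t / n_iter                               # exploration coefficient decays to 0
        alpha, beta, delta = pos[np.argsort(fit)[::-1][:3]]  # three best wolves lead the pack
        for i in range(n_wolves):
            new = np.zeros(n_features)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(n_features), rng.random(n_features)
                A, C = 2 * a * r1 - a, 2 * r2
                new += leader - A * np.abs(C * leader - pos[i])
            pos[i] = np.clip(new / 3, 0, 1)                  # average of the three leader-guided moves
            fit[i] = auc_of(pos[i] > 0.5)
    best = pos[np.argmax(fit)] > 0.5
    return best, fit.max()

mask, best_auc = binary_gwo(X.shape[1])
print(f"selected {mask.sum()} / {X.shape[1]} features, AUC = {best_auc:.3f}")

# Paired comparison of per-fold AUCs with vs. without feature selection.
clf = RandomForestClassifier(n_estimators=30, random_state=0)
auc_all = cross_val_score(clf, X, y, cv=10, scoring="roc_auc")
auc_sel = cross_val_score(clf, X[:, mask], y, cv=10, scoring="roc_auc")
print("Wilcoxon signed-rank p-value:", wilcoxon(auc_sel, auc_all).pvalue)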