Assessing the impact of tuning parameter in instance selection based bug resolution classification
| Published in | Information and software technology Vol. 188; p. 107874 |
|---|---|
| Main Authors | , , , |
| Format | Journal Article |
| Language | English |
| Published | Elsevier B.V, 01.12.2025 |
| ISSN | 0950-5849 | 
| DOI | 10.1016/j.infsof.2025.107874 | 
| Summary: | Software maintenance is time-consuming and requires significant effort for bug resolution and various types of software enhancement. Estimating software maintenance effort is challenging for open source software (OSS), which lacks historical data about direct effort expressed in terms of man-days, compared to proprietary software for which such effort data is available. Therefore, maintenance effort in the OSS context can only be estimated indirectly through other features, such as OSS bug reports, and other approaches, such as bug resolution prediction models using a number of machine learning (ML) techniques. Although these bug reports are at times large, they need to be preprocessed before they can be used. In this context, instance selection (IS) has been presented in the literature as a way of reducing the size of datasets by selecting a subset of instances. Additionally, ML techniques often require fine-tuning of numerous parameters to achieve optimal predictions. This is typically done using tuning parameter (TP) methods.
The empirical study reported here investigated the impact of TP methods together with instance selection algorithms (ISAs) on the performance of bug resolution prediction ML classifiers on five datasets: Eclipse JDT, Eclipse Platform, KDE, LibreOffice, and Apache.
To this end, a set of 480 ML classifiers was built using 60 datasets: the five original ones, 15 datasets reduced using the Edited Nearest Neighbor (ENN), Repeated Edited Nearest Neighbor (RENN), and all-k Nearest Neighbor (AllkNN) single ISAs, and 40 datasets reduced using the Bagging, Random Feature Subsets, and Voting ensemble ISAs, together with four ML techniques (k Nearest Neighbor (kNN), Support Vector Machine (SVM), Voted Perceptron (VP), and Random Tree (RT)) under Grid Search (GS) and Default Parameter (DP) configurations. The classifiers were evaluated using the Accuracy, Precision, and Recall performance criteria with ten-fold cross-validation. These classifiers were then compared to determine how parameter tuning and IS can enhance bug resolution prediction performance.
The findings revealed that (1) using GS with single ISAs enhanced the performance of the built ML classifiers, (2) using GS with homogeneous and heterogeneous ensemble ISAs enhanced the performance of the built ML classifiers, and (3) associating GS and SVM with RENN (either used as a single ISA or implemented as a base algorithm for ensemble ISAs) gave the best performance.
|
|---|---|
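The instance selection step summarized above can be illustrated with a minimal sketch of ENN and RENN. This is not the paper's implementation (the record does not name the tooling used); it is an illustrative Python/scikit-learn version under the standard definitions: ENN discards any instance whose label disagrees with the majority label of its k nearest neighbors, and RENN repeats ENN until no further instance is removed. The synthetic dataset and k=3 are assumptions for demonstration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

def enn_select(X, y, k=3):
    """Edited Nearest Neighbor: keep only instances whose label agrees
    with the majority label of their k nearest neighbors."""
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X, y)
    # On the training set each point is its own nearest neighbor,
    # so request k+1 neighbors and drop the first column.
    _, idx = knn.kneighbors(X, n_neighbors=k + 1)
    neigh_labels = y[idx[:, 1:]]
    # Majority vote among the k neighbors of each instance.
    votes = np.array([np.bincount(row, minlength=y.max() + 1).argmax()
                      for row in neigh_labels])
    keep = votes == y
    return X[keep], y[keep]

def renn_select(X, y, k=3):
    """Repeated ENN: apply ENN until no instance is removed."""
    while True:
        X_new, y_new = enn_select(X, y, k)
        if len(y_new) == len(y):
            return X_new, y_new
        X, y = X_new, y_new

# Illustrative synthetic data standing in for a bug-report dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_red, y_red = renn_select(X, y, k=3)
print(len(X), "->", len(X_red))
```

AllkNN, the third single ISA named in the record, follows the same pattern but reapplies the edit for every neighborhood size from 1 up to k.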
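The tuning-and-evaluation protocol (Grid Search versus default parameters, scored on Accuracy, Precision, and Recall with ten-fold cross-validation) can likewise be sketched. The parameter grid, the SVM settings, and the synthetic data below are assumptions for illustration; the study's actual search spaces are not given in this record.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_validate
from sklearn.svm import SVC

# Illustrative stand-in for one of the (possibly IS-reduced) datasets.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical grid; a GS configuration searches it with 10-fold CV,
# while a DP configuration would simply use SVC() as-is.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
gs = GridSearchCV(SVC(), param_grid, cv=10, scoring="accuracy")
gs.fit(X, y)

# Score the tuned model on the three criteria used in the study,
# again under ten-fold cross-validation.
scores = cross_validate(SVC(**gs.best_params_), X, y, cv=10,
                        scoring=["accuracy", "precision", "recall"])
print(gs.best_params_)
print(scores["test_accuracy"].mean())
```

Comparing these cross-validated scores against those of the default-parameter model on the original versus IS-reduced datasets mirrors the comparison the study reports.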