Feature selection using binary horse herd optimization algorithm with lightGBA ensemble classification in microarray data

Bibliographic Details
Published in	Knowledge-based systems Vol. 312; p. 113168
Main Authors	Preyanka Lakshme, R.S., Ganesh Kumar, S.
Format	Journal Article
Language	English
Published	Elsevier B.V. 15.03.2025
Subjects	Binary horse herd optimization algorithm; Light gradient boosting; Multi-objective; Pearson correlation; Weighted entropy variance
ISSN	0950-7051
DOI	10.1016/j.knosys.2025.113168

Summary:	Data analysis presents significant challenges due to its high dimensionality, imbalanced distribution, and complexity. Traditional feature selection methods often fall short of addressing these challenges effectively. In response, this research proposes a novel hybrid methodology that integrates multi-filtering techniques with the Multi-Objective Binary Horse Herd Optimization (MOBHHO) algorithm to tackle gene selection and ensemble classification in microarray data. The study begins by identifying the limitations of existing methods, emphasizing the need for a comprehensive approach that combines the strengths of multi-filtering and metaheuristic optimization. Leveraging various filtering methods, including Information Gain, entropy, Pearson correlation, mutual information, mean absolute deviation, and weighted entropy variance, the proposed methodology aims to mitigate biases and enhance the robustness of feature selection. Subsequently, the MOBHHO wrapper method performs multi-objective optimization, minimizing the number of selected features while maximizing prediction criteria. Finally, the ensemble prediction model LightGBA capitalizes on the diverse solutions obtained from MOBHHO, striking a balance between feature count and prediction accuracy. The proposed method was evaluated on multiple high-dimensional microarray datasets, including Small Round Blue Cell Tumors (SRBCT), Prostate tumors, Lung cancer, Leukemia, Colon tumor, diffuse large B-cell lymphoma (DLBCL), Lymphoma, ALL-AML-4C, ALL-AML-3C, and MLL, to assess its effectiveness in feature selection and classification accuracy. The experimental outcomes demonstrate the efficacy of the proposed methodology, showing improved prediction accuracy and feature subset selection across diverse datasets.
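Illustration:	The summary describes two ideas that a short sketch may make more concrete: combining several filter scores into one feature ranking, and treating "fewer features" and "lower error" as two objectives at once. The Python below is only a minimal sketch under stated assumptions, not the authors' implementation. It uses just two of the named filters (Pearson correlation and mean absolute deviation), combines them by average rank (an assumed aggregation rule), and keeps the non-dominated candidates among given (feature count, error) pairs. The horse herd search itself and the LightGBA ensemble are not shown, and all function names are hypothetical.

```python
import math
from statistics import mean


def pearson_abs(xs, ys):
    """Absolute Pearson correlation between one feature column and the labels."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column has no defined correlation; score it as 0.
    return abs(cov / (sx * sy)) if sx > 0 and sy > 0 else 0.0


def mean_abs_dev(xs, _ys=None):
    """Mean absolute deviation of one feature column (labels are ignored)."""
    m = mean(xs)
    return mean(abs(x - m) for x in xs)


def ranks(scores):
    """Map scores to ranks, where rank 0 is the highest score."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    result = [0] * len(scores)
    for position, j in enumerate(order):
        result[j] = position
    return result


def multi_filter_top_k(X, y, k):
    """Return indices of the k features with the best average rank.

    Averaging ranks across filters is an assumption made for this sketch.
    """
    columns = list(zip(*X))  # rows of samples -> columns of features
    filters = [pearson_abs, mean_abs_dev]
    rank_lists = [ranks([f(col, y) for col in columns]) for f in filters]
    avg_rank = [mean(rl[j] for rl in rank_lists) for j in range(len(columns))]
    return sorted(range(len(columns)), key=lambda j: avg_rank[j])[:k]


def dominates(a, b):
    """True if a is no worse than b on every objective and better on one.

    Each candidate is (n_selected_features, error); both are minimized.
    """
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


if __name__ == "__main__":
    # Toy data: 4 samples, 3 features; feature 1 is constant and uninformative.
    X = [[1, 5, 3], [2, 5, 1], [9, 5, 2], [10, 5, 4]]
    y = [0, 0, 1, 1]
    print(multi_filter_top_k(X, y, 2))
    print(pareto_front([(5, 0.10), (3, 0.12), (5, 0.15), (8, 0.10)]))
```

In a wrapper setting, each (feature count, error) pair would come from training a classifier on one candidate feature subset; here the pairs are supplied by hand to keep the sketch self-contained.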