Centroid mutation-based Search and Rescue optimization algorithm for feature selection and classification

Bibliographic Details
Published in: Expert systems with applications, Vol. 191, p. 116235
Main Authors: Houssein, Essam H.; Saber, Eman; Ali, Abdelmgeid A.; Wazery, Yaser M.
Format: Journal Article
Language: English
Published: New York, Elsevier Ltd, 01.04.2022
Elsevier BV
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2021.116235

Summary: Technological innovation in many fields generates massive amounts of data. Medical data sets often have very high dimensionality and limited sample sizes, which makes classifying them difficult. This research presents a novel optimization approach for better feature selection in medical data classification: a centroid mutation-based Search and Rescue optimization algorithm (cmSAR) paired with a k-Nearest Neighbor (kNN) classifier for disease classification. In feature selection, cmSAR searches for the optimal group of features that show strong separability between two classes, while addressing premature convergence and improving the local search ability of the SAR algorithm. Fuzzy logic, an extension of multi-valued logic, is used to generate a fuzzy set, to which a centroid mutation operator is applied. The statistical results of cmSAR were either identical or superior to those of well-known metaheuristic algorithms, including the Slime Mould Algorithm (SMA), Particle Swarm Optimization (PSO) algorithm, Sine Cosine Algorithm (SCA), Moth–Flame Optimization (MFO) algorithm, Whale Optimization Algorithm (WOA), Genetic Algorithm (GA), and the original SAR algorithm, on 15 disease data sets with different feature sizes drawn from UCI. In addition, cmSAR outperformed the other algorithms on the CEC-C06 2019 single-objective benchmark functions as well as on classification performance metrics, with statistical verification by the Friedman test and the Bonferroni–Dunn test. The proposed cmSAR achieved superior performance on all the medical data sets.
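The summary describes wrapper-style feature selection, where an optimizer proposes feature subsets and a kNN classifier scores them. The following is a minimal sketch of such a scoring function, not the authors' implementation. The binary mask encoding, the leave-one-out evaluation, the weight `alpha`, and the toy data are all assumptions made for illustration; the record does not state how cmSAR defines its objective.

```python
import math

def knn_predict(train_X, train_y, query, k=1):
    """Predict by majority vote among the k nearest training rows."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = {}
    for i in order[:k]:
        votes[train_y[i]] = votes.get(train_y[i], 0) + 1
    # Break vote ties deterministically by preferring the smaller label.
    return max(sorted(votes), key=lambda label: votes[label])

def fitness(mask, X, y, k=1, alpha=0.99):
    """Lower is better: weighted leave-one-out kNN error plus feature ratio.

    alpha is an assumed trade-off weight, not a value taken from the paper.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 1.0  # An empty subset cannot classify anything: worst score.
    reduced = [[row[j] for j in cols] for row in X]
    errors = 0
    for i, query in enumerate(reduced):
        # Hold out row i and classify it from all remaining rows.
        train_X = reduced[:i] + reduced[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, query, k) != y[i]:
            errors += 1
    error_rate = errors / len(X)
    return alpha * error_rate + (1 - alpha) * len(cols) / len(mask)

# Made-up toy data: column 0 separates the classes, column 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

for mask in ([1, 0], [0, 1], [1, 1]):
    print(mask, fitness(mask, X, y))
```

On this toy data the mask that keeps only the informative column gets the lowest score, which is the behaviour a metaheuristic such as SAR or cmSAR would exploit when searching over masks.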
• An efficient FSAR algorithm is proposed based on Fuzzy logic mutation.
• CEC-C06 2019 test suite is utilized for verification of FSAR performance.
• FSAR is proposed for biomedical classification tasks.
• FSAR is analyzed using various analysis metrics.
• The performance of the FSAR is better than other competitor algorithms.
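The summary names the Friedman test as the statistical check used to compare algorithms across data sets. As a rough illustration of what that test computes, the sketch below ranks the algorithms on each data set and derives the Friedman chi-square statistic from the mean ranks. The table values are made-up numbers, not results from the paper, and the statistic is computed without a tie correction.

```python
def average_ranks(scores):
    """Rank one row of scores (1 = lowest = best), averaging tied ranks."""
    order = sorted(range(len(scores)), key=lambda j: scores[j])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # Mean of the 1-based positions.
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def friedman(table):
    """Return (mean rank per algorithm, Friedman chi-square statistic).

    table[i][j] is the error of algorithm j on data set i.
    """
    n, k = len(table), len(table[0])
    rank_rows = [average_ranks(row) for row in table]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return mean_ranks, chi2

# Hypothetical error rates: 4 data sets (rows) x 3 algorithms (columns).
errors = [
    [0.10, 0.20, 0.30],
    [0.05, 0.15, 0.25],
    [0.12, 0.11, 0.40],
    [0.08, 0.18, 0.28],
]
mean_ranks, chi2 = friedman(errors)
print(mean_ranks, chi2)
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom; a post-hoc procedure such as Bonferroni–Dunn is then applied to the mean ranks when the null hypothesis is rejected.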