A novel grey wolf optimization algorithm based on geometric transformations for gene selection and cancer classification
| Published in | The Journal of supercomputing Vol. 80; no. 4; pp. 4808 - 4840 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | New York: Springer US, 01.03.2024; Springer Nature B.V |
| Subjects | |
| ISSN | 0920-8542 1573-0484 |
| DOI | 10.1007/s11227-023-05643-z |
| Summary: | Cancer classification based on microarray data plays an important role in cancer diagnosis and detection. Microarray data contains a huge number of genes but only a small number of samples, and it is also nonlinear and noisy, so the dimensionality of the data must be reduced. An effective way of doing so would help biologists and medical research scientists. This paper proposes a new bio-inspired algorithm for gene selection in cancer classification, called the Binary Grey Wolf Optimization Algorithm (BGWOA), which hybridizes the Minimum Redundancy-Maximum Relevance (MRMR) filter with a novel Binary Grey Wolf algorithm. BGWOA has two stages. The first stage applies the MRMR pre-filter to obtain a set of relevant genes, which reduces the dimensionality of the datasets. The second stage applies a new Binary Grey Wolf algorithm that uses direct similarity and the centroid, two notions from geometry, to update the positions of the grey wolves so that they explore and exploit the search space. Candidate solutions are evaluated with a fitness function that depends on the accuracy of an SVM classifier under leave-one-out cross-validation (LOOCV) and on the rate of unselected genes. The primary goal of the second stage is to identify the best subset of relevant genes among those obtained in the first stage. The proposed method was evaluated on eight microarray datasets and compared with other existing algorithms. The experimental results show higher classification accuracy with fewer genes than many recently published algorithms. Specifically, the proposed method achieves 100% classification accuracy on five reference datasets with a number of genes ranging from 12 to 25. The authors conclude that the approach is promising. |
|---|---|
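
The summary describes a fitness function that combines classification accuracy under LOOCV with the rate of unselected genes. The record does not give the formula, so the following is only a minimal sketch under stated assumptions: the weighted-sum form, the weight `alpha`, and the function names are hypothetical, and a 1-nearest-neighbour classifier stands in for the SVM so that the example runs without external libraries.

```python
from typing import List, Optional, Sequence


def loocv_accuracy(X: List[Sequence[float]], y: List[int], mask: Sequence[int]) -> float:
    """Leave-one-out accuracy using only the genes selected by the binary mask.

    Assumption: a 1-nearest-neighbour classifier replaces the SVM named in the
    summary, purely to keep this sketch dependency-free.
    """
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty gene subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        best_dist: float = float("inf")
        best_label: Optional[int] = None
        for k in range(len(X)):
            if k == i:
                continue  # hold out sample i
            dist = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        correct += int(best_label == y[i])
    return correct / len(X)


def fitness(X: List[Sequence[float]], y: List[int], mask: Sequence[int],
            alpha: float = 0.9) -> float:
    """Hypothetical weighted sum of LOOCV accuracy and the rate of unselected genes."""
    accuracy = loocv_accuracy(X, y, mask)
    unselected_rate = 1 - sum(mask) / len(mask)
    return alpha * accuracy + (1 - alpha) * unselected_rate


# Toy data (invented for illustration): gene 0 separates the classes, gene 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 4.9], [1.1, 1.1]]
y = [0, 0, 1, 1]

for mask in ([1, 0], [0, 1], [1, 1]):
    print(mask, fitness(X, y, mask))
```

On this toy data, the mask that keeps only the informative gene scores highest, which matches the stated aim of higher accuracy with fewer genes. In a real setting the stand-in classifier would be replaced by an SVM, and `alpha` would need tuning.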