Cell separation algorithm with enhanced search behaviour in miRNA feature selection for cancer diagnosis
| Published in | Information systems (Oxford) Vol. 104; p. 101906 |
|---|---|
| Main Authors | , |
| Format | Journal Article |
| Language | English |
| Published | Oxford: Elsevier Ltd, 01.02.2022; Elsevier Science Ltd |
| Subjects | |
| ISSN | 0306-4379 1873-6076 |
| DOI | 10.1016/j.is.2021.101906 |
| Summary: | Feature (biomarker) selection from microRNA data is an important step in cancer classification. This paper improves the cell separation algorithm (CSA), a meta-heuristic that imitates cell separation by differential centrifugation. The improvement is an enhanced search behaviour embedded in the movement of the algorithm's virtual cells. It contributes to an effective trade-off between global exploration and local exploitation of the search space, which is the key factor in improving the performance of any meta-heuristic. To perform feature selection, different levels and periods are used automatically during the search to handle the number of features in the selected subset, so that the dimension of the data does not drive up computational time and effort. The improved CSA (I-CSA) was first examined on 25 test functions. Next, six benchmark classification problems, comprising four biological and two other datasets, were used to test it. Furthermore, an experiment on feature selection from microRNA data was conducted. The classification accuracy of the features selected by I-CSA is compared with the accuracy of 22 well-known classifiers on a real-world dataset that contains genomic information of 8,129 patients for 29 different types of cancer, with 1,046 gene expressions. The classification accuracy for each cancer type is also ranked against the results of 77 classifiers reported in previous works. The proposed approach achieved 100% accuracy in 25 out of 29 classes. In seven of the 29 classes, it reached 100% accuracy where no classifier in other studies had. Highlights: • Feature selection from miRNA data using an enhanced CSA. • A new search behaviour for CSA is proposed. • Different levels and periods are set to handle the number of features. • Comparison of classification accuracy with 22 well-known classifiers. • Accuracy of 100% achieved in 25 out of 29 classes of genomic data. |
|---|---|