Cell separation algorithm with enhanced search behaviour in miRNA feature selection for cancer diagnosis

Bibliographic Details
Published in: Information systems (Oxford), Vol. 104, p. 101906
Main Authors: Jaddi, Najmeh Sadat; Saniee Abadeh, Mohammad
Format: Journal Article
Language: English
Published: Oxford, Elsevier Ltd, 01.02.2022
Elsevier Science Ltd
ISSN: 0306-4379, 1873-6076
DOI: 10.1016/j.is.2021.101906

Summary: Feature (biomarker) selection of microRNA data is an important step in cancer classification. In this paper, the cell separation meta-heuristic algorithm (CSA), which imitates cell separation by differential centrifugation, is improved. An enhanced search behaviour is embedded in the movement of the virtual cells in this algorithm. This enhancement contributes to an effective trade-off between global exploration and local exploitation of the search space, which is the key factor in improving the performance of any meta-heuristic. To perform feature selection, different levels and periods that handle the number of features in the selected subset are applied automatically during the search process. This is done to avoid the effect of data dimensionality on computational time and effort. To examine the improved CSA (I-CSA), 25 test functions were used first. Next, six benchmark classification problems, comprising four biological and two other datasets, were employed to test the I-CSA. Furthermore, an experiment on feature selection from microRNA data was conducted. The classification accuracy of the features selected by applying I-CSA to microRNA data is compared with the accuracy of 22 well-known classifiers on a real-world dataset. This dataset contains genomic information of 8,129 patients for 29 different types of cancer with 1,046 gene expressions. The classification accuracy for each cancer type is also ranked against the results of 77 classifiers reported in previous works. The proposed approach achieved 100% accuracy in 25 out of 29 classes. In seven of the 29 classes, this 100% accuracy had not been reached by any classifier in other studies.
• Feature selection from miRNA data using an enhanced CSA.
• A new search behaviour for CSA is proposed.
• Setting the different levels and periods to handle the number of features.
• Comparison of classification accuracy with 22 well-known classifiers.
• Achieving accuracy of 100% in 25 out of 29 classes of genomic data.
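
The summary describes wrapper-style feature selection, in which a meta-heuristic proposes feature subsets and each subset is scored by the classification accuracy it yields. The sketch below illustrates only that general idea and is not the I-CSA method from the paper. The leave-one-out 1-nearest-neighbour scorer, the `size_penalty` term, the bit-flip search, and the toy data are all assumptions introduced for illustration; the record does not describe the paper's cell movement, levels, or periods in enough detail to reproduce them.

```python
import random

def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the features where mask[j] is True."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = None, None
        for k, xk in enumerate(X):
            if k == i:
                continue  # leave the query sample out
            d = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if best_dist is None or d < best_dist:
                best_dist, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def fitness(X, y, mask, size_penalty=0.01):
    """Accuracy minus a small penalty per selected feature (assumed form)."""
    return loo_1nn_accuracy(X, y, mask) - size_penalty * sum(mask)

def bit_flip_search(X, y, n_iter=200, seed=0):
    """Generic bit-flip hill climb: a stand-in for the meta-heuristic, NOT CSA."""
    rng = random.Random(seed)
    n = len(X[0])
    mask = [rng.random() < 0.5 for _ in range(n)]
    best = fitness(X, y, mask)
    for _ in range(n_iter):
        j = rng.randrange(n)
        mask[j] = not mask[j]          # propose toggling one feature
        score = fitness(X, y, mask)
        if score >= best:
            best = score               # keep the flip
        else:
            mask[j] = not mask[j]      # revert a worsening flip
    return mask, best

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
selected_mask, score = bit_flip_search(X, y)
print(selected_mask, score)
```

On this toy data the noisy feature hurts the nearest-neighbour classifier, so a subset containing only the informative feature scores highest. A real study would replace the hill climb with the meta-heuristic of interest and the scorer with a properly validated classifier.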