A hybrid feature selection algorithm for microarray data

For each microarray data set, only a small number of genes are beneficial. Due to the high-dimensional problem, gene selection research work remains a challenge. In order to solve the high-dimensional problem, we propose a dimensionality reduction algorithm named K value maximum relevance minimum re...

Full description

Saved in:
Bibliographic Details
Published inThe Journal of supercomputing Vol. 76; no. 5; pp. 3494 - 3526
Main Authors Zheng, Yuefeng, Li, Ying, Wang, Gang, Chen, Yupeng, Xu, Qian, Fan, Jiahao, Cui, Xueting
Format Journal Article
LanguageEnglish
Published New York Springer US 01.05.2020
Springer Nature B.V
Subjects
Online AccessGet full text
ISSN0920-8542
1573-0484
DOI10.1007/s11227-018-2640-y

Cover

More Information
Summary:For each microarray data set, only a small number of genes are beneficial. Due to the high-dimensional problem, gene selection research work remains a challenge. In order to solve the high-dimensional problem, we propose a dimensionality reduction algorithm named K value maximum relevance minimum redundancy improved grey wolf optimizer (KMR 2 IGWO). First, in the processing of KMR 2 , the K genes are selected. Second, the K genes are initialized by two ways according to random selection feature and different proportions of selection feature. Finally, the IGWO algorithm selects the optimal classification accuracy and the optimal combination of gene by adjusting the parameters of fitness function. The algorithm has a significant dimensionality reduction effect and is suitable for high-dimensional data sets. Experimental results show that the proposing KMR 2 IGWO strategy significantly reduces the dimension of microarray data and removes the redundant features. On the 14 microarray data sets, compared with the four algorithms mRMR + PSO, mRMR + GA, mRMR + BA, mRMR + CS, the proposed algorithm has higher performance in classification accuracy and feature subset length. In five data sets, the proposed algorithm average classification accuracy is 100%. On the 14 data sets, the proposed algorithm has a very significant dimensionality reduction effect, and the dimensionality reduction range is between 0.4% and 0.04%.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:0920-8542
1573-0484
DOI:10.1007/s11227-018-2640-y