A novel random forests-based feature selection method for microarray expression data analysis

Bibliographic Details
Published in	International journal of data mining and bioinformatics Vol. 13; no. 1; p. 84
Main Authors	Yao, Dengju; Yang, Jing; Zhan, Xiaojuan; Zhan, Xiaorong; Xie, Zhiqiang
Format	Journal Article
Language	English
Published	Switzerland 2015
Subjects	Animals; Biomarkers, Tumor - biosynthesis; Biomarkers, Tumor - genetics; Databases, Genetic; Gene Expression Profiling - methods; Gene Expression Regulation, Leukemic; Humans; Leukemia - genetics; Leukemia - metabolism; Oligonucleotide Array Sequence Analysis; Support Vector Machine
ISSN	1748-5673
DOI	10.1504/IJDMB.2015.070852

Summary:	High-dimensional data with many redundant features have created an urgent need for feature selection in bioinformatics research. This paper proposes a novel random forests-based feature selection method that stratifies the feature space and combines generalised sequence backward searching with generalised sequence forward searching strategies. A random forest variable importance score is used to rank features, and different classifiers serve as the feature subset evaluation function. The method is evaluated on five microarray expression datasets (leukaemia, prostate, breast, nervous and DLBCL), on which the SVM classifier achieves average accuracies of 100%, 95.24%, 85%, 91.67%, and 91.67%, respectively. The results show that the proposed method can not only improve classification accuracy but also greatly reduce the computation time of the feature selection process.
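
The summary describes ranking features by an importance score and then searching over subsets with a classifier as the evaluation function. The sketch below illustrates only the general shape of a ranked, generalised backward search; it is not the authors' algorithm. The function names, the `step` parameter, the hard-coded importance scores, and the toy `evaluate` function (a stand-in for cross-validated classifier accuracy) are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def rank_features(importance: Dict[str, float]) -> List[str]:
    """Order feature names from most to least important."""
    return sorted(importance, key=importance.get, reverse=True)


def backward_search(
    ranked: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
    step: int = 2,
) -> Tuple[List[str], float]:
    """Drop the `step` lowest-ranked features per round while the score does not fall."""
    current = list(ranked)
    best = evaluate(current)
    while len(current) > step:
        candidate = current[:-step]  # remove the `step` least important features
        score = evaluate(candidate)
        if score < best:
            break  # removal hurt the score, so keep the current subset
        current, best = candidate, score
    return current, best


# Hypothetical importance scores (placeholders, not results from the paper).
importance = {"g1": 0.40, "g2": 0.05, "g3": 0.30, "g4": 0.02, "g5": 0.01, "g6": 0.03}

# Toy evaluation function: rewards covering the informative genes and
# lightly penalises subset size. A real one would return classifier accuracy.
informative = {"g1", "g3"}


def evaluate(subset: Sequence[str]) -> float:
    return len(informative & set(subset)) / len(informative) - 0.01 * len(subset)


selected, score = backward_search(rank_features(importance), evaluate, step=2)
print(selected)
```

Removing several features per round, rather than one, is what keeps the number of classifier evaluations low on data with thousands of genes; a forward pass that re-adds previously discarded features would follow the same pattern in reverse.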