Two hybrid wrapper-filter feature selection algorithms applied to high-dimensional microarray experiments

Bibliographic Details
Published in Applied Soft Computing, Vol. 38, pp. 922-932
Main Authors Apolloni, Javier, Leguizamón, Guillermo, Alba, Enrique
Format Journal Article
Language English
Published Elsevier B.V. 01.01.2016
ISSN 1568-4946
1872-9681
DOI 10.1016/j.asoc.2015.10.037

Summary:
• We propose two new, simple, and efficient hybrid feature selection techniques.
• We use a feature-based ranking to initialize the Binary Differential Evolution.
• We also propose a new fitness function influenced by the features in the population.
• Several statistical tests show the robustness and effectiveness of the proposals.
• The original feature set is reduced in size by more than 99%.

Microarray experiments generally deal with complex, high-dimensional samples, and the number of samples is much smaller than the number of dimensions. Both issues can be alleviated by using a feature selection (FS) method. In this paper, two new, simple, and efficient hybrid FS algorithms, called BDE-XRank and BDE-XRankf, are presented. Both algorithms combine a wrapper FS method based on a Binary Differential Evolution (BDE) algorithm with a rank-based filter FS method. In addition, they generate the initial population from solutions that involve only a small number of features. Some initial solutions are built using only the features that the filter method ranks as most relevant, and the remaining ones include only random features (to promote diversity). BDE-XRankf additionally incorporates a new fitness function in which the score of a solution is influenced by the frequency of its features in the current population. The robustness of BDE-XRank and BDE-XRankf is shown by using four Machine Learning (ML) algorithms (NB, SVM, C4.5, and kNN). Six well-known high-dimensional microarray data sets are used to carry out an extensive experimental study based on statistical tests. This experimental analysis shows the robustness of both proposals as well as their ability to obtain highly accurate solutions in the early stages of the BDE evolutionary process. Finally, BDE-XRank and BDE-XRankf are also compared against the results of nine state-of-the-art algorithms to highlight their competitiveness and their ability to reduce the original feature set size by more than 99%.
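
The initialization idea described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`init_population`, `feature_frequency`), the parameters (`n_ranked`, `subset_size`), and the choice of drawing ranked individuals from a pool of the top `2 * subset_size` features are all assumptions made for illustration. The summary does not state how the fitness function of BDE-XRankf uses feature frequencies, so the sketch only computes those frequencies and stops there.

```python
import random


def init_population(scores, pop_size, n_ranked, subset_size, seed=0):
    """Build a binary population of sparse feature masks.

    The first `n_ranked` individuals pick features only from the top of the
    filter ranking; the rest pick features uniformly at random (diversity).
    All parameter names and the pool size are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_features = len(scores)
    # Rank feature indices from most to least relevant by filter score.
    ranking = sorted(range(n_features), key=lambda i: scores[i], reverse=True)
    # Assumed pool of "most relevant" features (twice the subset size).
    top_pool = ranking[: 2 * subset_size]

    population = []
    for k in range(pop_size):
        pool = top_pool if k < n_ranked else range(n_features)
        chosen = rng.sample(pool, subset_size)
        individual = [0] * n_features
        for i in chosen:
            individual[i] = 1  # 1 means the feature is selected
        population.append(individual)
    return population


def feature_frequency(population):
    """Fraction of individuals in the population that select each feature."""
    size = len(population)
    return [sum(column) / size for column in zip(*population)]


# Hypothetical filter scores for 50 features (higher means more relevant).
scores = [float(i) for i in range(50)]
population = init_population(scores, pop_size=10, n_ranked=4, subset_size=3)
frequencies = feature_frequency(population)
```

Each individual selects only `subset_size` features, which mirrors the stated goal of starting the search from very small feature subsets. A frequency-aware fitness function would combine `frequencies` with classifier accuracy in some way that the summary does not specify.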