A problem-specific non-dominated sorting genetic algorithm for supervised feature selection
| Published in | Information sciences Vol. 547; pp. 841 - 859 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published | Elsevier Inc, 08.02.2021 |
| Subjects | |
| ISSN | 0020-0255 1872-6291 |
| DOI | 10.1016/j.ins.2020.08.083 |
| Summary | • Supervised feature selection of high-dimensional data is formulated as an MOP. • We developed a problem-specific non-dominated sorting genetic algorithm to solve the MOP. • We made a systematic comparison between our method and some state-of-the-art FS approaches. Feature selection (FS), which plays an important role in classification tasks, has recently been studied as a multi-objective optimization problem (MOP). In this paper, we consider minimizing three objectives of FS and propose a problem-specific non-dominated sorting genetic algorithm (PS-NSGA). In PS-NSGA, an accuracy-preferred domination operator is applied, which makes individuals with higher classification accuracy more likely to survive in the population. A quick bit mutation is also used, which overcomes the limitation of traditional bit string mutation and increases efficiency. In addition, a mutation-retry operator and a combination operator are designed to make the algorithm converge faster and to better solutions. Finally, a solution selection strategy is developed to determine the most suitable feature subset from the obtained Pareto solutions. Experimental results on 15 real-world high-dimensional datasets demonstrate that the proposed algorithm achieves competitive classification accuracy while obtaining smaller feature subsets than some state-of-the-art evolutionary and traditional FS algorithms. |
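
The summary refers to non-dominated sorting over three minimized objectives. As a rough illustration only, the sketch below shows standard Pareto dominance and the extraction of a non-dominated set in Python. It is not the paper's accuracy-preferred domination operator, whose definition is not given in this record. The three objective values per candidate are hypothetical placeholders (for example an error rate, a fraction of selected features, and an unspecified third objective), and the function names are illustrative.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Standard Pareto dominance for minimization.

    `a` dominates `b` if it is no worse in every objective
    and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical objective vectors, all to be minimized. The meaning of each
# position is an assumption made for this example, not taken from the paper.
candidates = [
    (0.10, 0.30, 0.5),
    (0.12, 0.20, 0.5),
    (0.10, 0.35, 0.6),  # no better than the first candidate in any objective
    (0.20, 0.10, 0.4),
]

front = non_dominated(candidates)
print(front)
```

A full NSGA-style sort would repeat this step on the remaining candidates to build successive fronts; the sketch stops at the first front to keep the idea visible.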