A Multiobjective Genetic Programming-Based Ensemble for Simultaneous Feature Selection and Classification
We present an integrated algorithm for simultaneous feature selection (FS) and designing of diverse classifiers using a steady state multiobjective genetic programming (GP), which minimizes three objectives: 1) false positives (FPs); 2) false negatives (FNs); and 3) the number of leaf nodes in the t...
        Saved in:
      
    
          | Published in | IEEE transactions on cybernetics Vol. 46; no. 2; pp. 499 - 510 | 
|---|---|
| Main Authors | , | 
| Format | Journal Article | 
| Language | English | 
| Published | 
        United States
          IEEE
    
        01.02.2016
     The Institute of Electrical and Electronics Engineers, Inc. (IEEE)  | 
| Subjects | |
| Online Access | Get full text | 
| ISSN | 2168-2267 2168-2275  | 
| DOI | 10.1109/TCYB.2015.2404806 | 
Cover
| Summary: | We present an integrated algorithm for simultaneous feature selection (FS) and designing of diverse classifiers using a steady state multiobjective genetic programming (GP), which minimizes three objectives: 1) false positives (FPs); 2) false negatives (FNs); and 3) the number of leaf nodes in the tree. Our method divides a c-class problem into c binary classification problems. It evolves c sets of genetic programs to create c ensembles. During mutation operation, our method exploits the fitness as well as unfitness of features, which dynamically change with generations with a view to using a set of highly relevant features with low redundancy. The classifiers of ith class determine the net belongingness of an unknown data point to the ith class using a weighted voting scheme, which makes use of the FP and FN mistakes made on the training data. We test our method on eight microarray and 11 text data sets with diverse number of classes (from 2 to 44), large number of features (from 2000 to 49151), and high feature-to-sample ratio (from 1.03 to 273.1). We compare our method with a bi-objective GP scheme that does not use any FS and rule size reduction strategy. It depicts the effectiveness of the proposed FS and rule size reduction schemes. Furthermore, we compare our method with four classification methods in conjunction with six features selection algorithms and full feature set. Our scheme performs the best for 380 out of 474 combinations of data sets, algorithm and FS method. | 
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23  | 
| ISSN: | 2168-2267 2168-2275  | 
| DOI: | 10.1109/TCYB.2015.2404806 |