Evolutionary Design of Decision-Tree Algorithms Tailored to Microarray Gene Expression Data Sets

Decision-tree induction algorithms are widely used in machine learning applications in which the goal is to extract knowledge from data and present it in a graphically intuitive way. The most successful strategy for inducing decision trees is the greedy top-down recursive approach, which has been co...

Full description

Saved in:
Bibliographic Details
Published inIEEE transactions on evolutionary computation Vol. 18; no. 6; pp. 873 - 892
Main Authors Barros, Rodrigo C., Basgalupp, Marcio P., Freitas, Alex A., de Carvalho, Andre C. P. L. F.
Format Journal Article
LanguageEnglish
Published New York IEEE 01.12.2014
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects
Online AccessGet full text
ISSN1089-778X
1941-0026
1941-0026
DOI10.1109/TEVC.2013.2291813

Cover

More Information
Summary:Decision-tree induction algorithms are widely used in machine learning applications in which the goal is to extract knowledge from data and present it in a graphically intuitive way. The most successful strategy for inducing decision trees is the greedy top-down recursive approach, which has been continuously improved by researchers over the past 40 years. In this paper, we propose a paradigm shift in the research of decision trees: instead of proposing a new manually designed method for inducing decision trees, we propose automatically designing decision-tree induction algorithms tailored to a specific type of classification data set (or application domain). Following recent breakthroughs in the automatic design of machine learning algorithms, we propose a hyper-heuristic evolutionary algorithm called hyper-heuristic evolutionary algorithm for designing decision-tree algorithms (HEAD-DT) that evolves design components of top-down decision-tree induction algorithms. By the end of the evolution, we expect HEAD-DT to generate a new and possibly better decision-tree algorithm for a given application domain. We perform extensive experiments in 35 real-world microarray gene expression data sets to assess the performance of HEAD-DT, and compare it with very well known decision-tree algorithms such as C4.5, CART, and REPTree. Results show that HEAD-DT is capable of generating algorithms that significantly outperform the baseline manually designed decision-tree algorithms regarding predictive accuracy and F-measure.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:1089-778X
1941-0026
1941-0026
DOI:10.1109/TEVC.2013.2291813