F-PENN — Forest path encoding for neural networks
| Published in | Information Fusion, Vol. 75, pp. 186-196 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Elsevier B.V., 01.11.2021 |
| ISSN | 1566-2535, 1872-6305 |
| DOI | 10.1016/j.inffus.2021.06.005 |
Summary:

Deep neural nets (DNNs) mostly tend to outperform other machine learning (ML) approaches when the training data is abundant, high-dimensional, sparse, or consists of raw data (e.g., pixels). For datasets with other characteristics – for example, dense tabular numerical data – algorithms such as Gradient Boosting Machines and Random Forest often achieve comparable or better performance at a fraction of the time and resources. These differences suggest that combining these approaches has the potential to yield superior performance. Existing attempts to combine DNNs with other ML approaches, which usually consist of feeding the output of the latter into the former, often do not produce positive results. We argue that this lack of improvement stems from the fact that the final classifications fail to provide the DNN with an understanding of the other algorithms' decision-making process (i.e., its "logic"). In this study we present F-PENN, a novel approach for combining decision forests and DNNs. Instead of providing the final output of the forest (or its trees) to the DNN, we provide the paths traveled by each sample. This information, when fed to the neural net, yields significant improvement in performance. We demonstrate the effectiveness of our approach by conducting extensive evaluation on 56 datasets and comparing F-PENN to four leading baselines: DNNs, Gradient Boosted Decision Trees (GBDT), Random Forest and DeepFM. We show that F-PENN outperforms the baselines in 69%–89% of datasets and achieves an overall average error reduction of 16%–26%.

Highlights:

- We present a novel method for combining ensemble models and neural networks.
- We encode each sample's trajectory through the ensemble to enrich the sample representation.
- The encoding method used is Word2Vec, which creates de-facto data augmentation.
- We evaluate our model on 56 open-source datasets.
- We demonstrate superior performance against several baselines.
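To make the path-encoding idea concrete, the following is a minimal sketch of one possible reading of the abstract: extract the node sequence each sample traverses in every tree of a forest, treat those sequences as "sentences" for Word2Vec, and pool the resulting node vectors into a dense representation that a DNN could consume. The tokenization scheme, vector size, pooling, and the use of scikit-learn and gensim are all illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch of forest path encoding (not the paper's exact method).
# Assumes scikit-learn and gensim 4.x are installed.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from gensim.models import Word2Vec

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

def sample_paths(forest, X):
    """Collect, per sample, the node IDs visited in every tree.

    Node IDs are prefixed with the tree index so tokens are unique per tree.
    """
    paths = [[] for _ in range(X.shape[0])]
    for t_idx, tree in enumerate(forest.estimators_):
        indicator = tree.decision_path(X)  # sparse (n_samples, n_nodes)
        for i in range(X.shape[0]):
            nodes = indicator.indices[indicator.indptr[i]:indicator.indptr[i + 1]]
            paths[i].extend(f"t{t_idx}_n{n}" for n in nodes)
    return paths

paths = sample_paths(forest, X)

# Treat each sample's path as a sentence and learn node embeddings,
# mirroring the Word2Vec encoding mentioned in the highlights.
w2v = Word2Vec(sentences=paths, vector_size=32, window=5, min_count=1, seed=0)

# One possible dense representation: the mean of a sample's node vectors,
# which could then be concatenated with the raw features and fed to a DNN.
embedded = np.array([w2v.wv[p].mean(axis=0) for p in paths])
print(embedded.shape)  # (n_samples, 32)
```

Mean-pooling is only one option; how the paper actually fuses the path embeddings with the network's input is not specified in this record.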