F-PENN — Forest path encoding for neural networks
| Published in | Information Fusion, Vol. 75, pp. 186-196 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | Elsevier B.V., 01.11.2021 |
| ISSN | 1566-2535, 1872-6305 |
| DOI | 10.1016/j.inffus.2021.06.005 |
Summary:

Deep neural nets (DNNs) mostly tend to outperform other machine learning (ML) approaches when the training data is abundant, high-dimensional, sparse, or consists of raw data (e.g., pixels). For datasets with other characteristics – for example, dense tabular numerical data – algorithms such as Gradient Boosting Machines and Random Forest often achieve comparable or better performance at a fraction of the time and resources. These differences suggest that combining these approaches has the potential to yield superior performance. Existing attempts to combine DNNs with other ML approaches, which usually consist of feeding the output of the latter into the former, often do not produce positive results. We argue that this lack of improvement stems from the fact that the final classifications fail to provide the DNN with an understanding of the other algorithms' decision-making process (i.e., its "logic"). In this study we present F-PENN, a novel approach for combining decision forests and DNNs. Instead of providing the final output of the forest (or its trees) to the DNN, we provide the paths traveled by each sample. This information, when fed to the neural net, yields significant improvement in performance. We demonstrate the effectiveness of our approach by conducting extensive evaluation on 56 datasets and comparing F-PENN to four leading baselines: DNNs, Gradient Boosted Decision Trees (GBDT), Random Forest and DeepFM. We show that F-PENN outperforms the baselines in 69%–89% of datasets and achieves an overall average error reduction of 16%–26%.

Highlights:

- We present a novel method for combining ensemble models and neural networks.
- We encode each sample's trajectory through the ensemble to enrich the sample representation.
- The encoding method used is Word2Vec, which creates de-facto data augmentation.
- We evaluate our model on 56 open-source datasets.
- We demonstrate superior performance against several baselines.
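To make the path-encoding idea concrete, the following is a minimal sketch of one possible reading of the abstract: extract the node sequence each sample traverses in every tree of a forest, treat those sequences as "sentences" for Word2Vec, and pool the resulting node vectors into a dense representation that a DNN could consume. The tokenization scheme, vector size, pooling, and the use of scikit-learn and gensim are all illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch of forest path encoding (not the paper's exact method).
# Assumes scikit-learn and gensim 4.x are installed.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from gensim.models import Word2Vec

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

def sample_paths(forest, X):
    """Collect, per sample, the node IDs visited in every tree.

    Node IDs are prefixed with the tree index so tokens are unique per tree.
    """
    paths = [[] for _ in range(X.shape[0])]
    for t_idx, tree in enumerate(forest.estimators_):
        indicator = tree.decision_path(X)  # sparse (n_samples, n_nodes)
        for i in range(X.shape[0]):
            nodes = indicator.indices[indicator.indptr[i]:indicator.indptr[i + 1]]
            paths[i].extend(f"t{t_idx}_n{n}" for n in nodes)
    return paths

paths = sample_paths(forest, X)

# Treat each sample's path as a sentence and learn node embeddings,
# mirroring the Word2Vec encoding mentioned in the highlights.
w2v = Word2Vec(sentences=paths, vector_size=32, window=5, min_count=1, seed=0)

# One possible dense representation: the mean of a sample's node vectors,
# which could then be concatenated with the raw features and fed to a DNN.
embedded = np.array([w2v.wv[p].mean(axis=0) for p in paths])
print(embedded.shape)  # (n_samples, 32)
```

Mean-pooling is only one option; how the paper actually fuses the path embeddings with the network's input is not specified in this record.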