Improving Naive Bayes for Regression with Optimized Artificial Surrogate Data
| Published in | Applied artificial intelligence Vol. 34; no. 6; pp. 484 - 514 |
|---|---|
| Main Authors | , |
| Format | Journal Article |
| Language | English |
| Published | Philadelphia, Taylor & Francis, 11.05.2020; Taylor & Francis Ltd, Taylor & Francis Group |
| ISSN | 0883-9514, 1087-6545 |
| DOI | 10.1080/08839514.2020.1726615 |
| Summary: | Can we evolve better training data for machine learning algorithms? To investigate this question we use population-based optimization algorithms to generate artificial surrogate training data for naive Bayes for regression. We demonstrate that the generalization performance of naive Bayes for regression models is enhanced by training them on the artificial data as opposed to the real data. These results are important for two reasons. Firstly, naive Bayes models are simple and interpretable but frequently underperform compared to more complex "black box" models, and therefore new methods of enhancing accuracy are called for. Secondly, the idea of using the real training data indirectly in the construction of the artificial training data, as opposed to directly for model training, is a novel twist on the usual machine learning paradigm. |
|---|---|
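
The idea in the summary, evolving an artificial training set so that a model trained on it fits the real data well, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a simplified kernel-based naive Bayes regressor that predicts the posterior mean over a grid of target values, a basic (mu + lambda) evolutionary loop with Gaussian mutation standing in for the population-based optimizers used in the article, and RMSE on the real training data as the fitness. All function names, parameters, and default values (`nb_predict`, `evolve_surrogate`, `h`, `sigma`, and so on) are hypothetical.

```python
import math
import random


def gauss(u, h):
    """Gaussian kernel with bandwidth h (normalising constants cancel later)."""
    return math.exp(-0.5 * (u / h) ** 2)


def nb_predict(train, x, h=0.5, grid_size=25):
    """Simplified naive Bayes for regression (an assumption of this sketch).

    Uses p(y | x) proportional to p(y) * prod_i p(x_i | y), with kernel
    density estimates, and returns the posterior mean over a grid of y values.
    """
    ys = [y for _, y in train]
    lo, hi = min(ys) - h, max(ys) + h
    grid = [lo + (hi - lo) * k / (grid_size - 1) for k in range(grid_size)]
    weights = []
    for g in grid:
        ky = [gauss(g - y, h) for y in ys]   # kernel weight of each training target
        p_y = sum(ky) + 1e-300               # guard against division by zero
        w = p_y
        for i, xi in enumerate(x):
            joint = sum(k * gauss(xi - xs[i], h) for k, (xs, _) in zip(ky, train))
            w *= joint / p_y                 # estimate of p(x_i | y)
        weights.append(w)
    total = sum(weights)
    if total <= 0.0:                         # x is far from every training point
        return sum(ys) / len(ys)
    return sum(w * g for w, g in zip(weights, grid)) / total


def rmse(train, data):
    """Error on `data` of the model defined by the training set `train`."""
    err = [(nb_predict(train, x) - y) ** 2 for x, y in data]
    return math.sqrt(sum(err) / len(err))


def mutate(individual, rng, sigma=0.3):
    """Perturb every artificial point with Gaussian noise."""
    return [([xi + rng.gauss(0, sigma) for xi in x], y + rng.gauss(0, sigma))
            for x, y in individual]


def evolve_surrogate(real, n_points=4, pop_size=6, generations=15, seed=0):
    """Evolve an artificial data set; the real data is used only for fitness."""
    rng = random.Random(seed)
    pop = [[rng.choice(real) for _ in range(n_points)] for _ in range(pop_size)]
    scored = [(rmse(ind, real), ind) for ind in pop]
    history = []
    for _ in range(generations):
        children = [mutate(rng.choice(scored)[1], rng) for _ in range(pop_size)]
        scored += [(rmse(child, real), child) for child in children]
        # (mu + lambda) selection: keep the best of parents and children.
        scored = sorted(scored, key=lambda s: s[0])[:pop_size]
        history.append(scored[0][0])
    return scored[0][1], history


if __name__ == "__main__":
    # Toy one-feature data, purely illustrative.
    real = [([x / 2.0], math.sin(x / 2.0)) for x in range(12)]
    best, history = evolve_surrogate(real)
    print("artificial points:", len(best), "final training RMSE:", history[-1])
```

Because parents survive selection, the best fitness in `history` never gets worse from one generation to the next. The sketch only measures error on the real training data; judging generalization, as the article does, would require evaluating the final model on held-out data.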