SBNNR: Small-Size Bat-Optimized KNN Regression

Bibliographic Details
Published in: Future Internet, Vol. 16, No. 11, p. 422
Main Authors: Seyghaly, Rasool; Garcia, Jordi; Masip-Bruin, Xavi; Kuljanin, Jovana
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 01.11.2024
ISSN: 1999-5903
DOI: 10.3390/fi16110422

More Information
Summary: Small datasets are frequent in some scientific fields. Such datasets usually arise from the difficulty or cost of producing laboratory and experimental data. At the same time, researchers are interested in applying machine learning methods to data at this scale, which can lead to low-performance, overfitting models. It therefore appears necessary to develop methods tailored to this type of data. In this research, we provide a new and innovative framework for regression problems with a small sample size. The base of our proposed method is the K-nearest neighbors (KNN) algorithm. For feature selection, instance selection, and hyperparameter tuning, we use the bat optimization algorithm (BA). Generative Adversarial Networks (GANs) are employed to generate synthetic data, effectively addressing the challenges associated with data sparsity. Concurrently, Deep Neural Networks (DNNs), as a deep learning approach, are utilized for feature extraction from both synthetic and real datasets. This hybrid framework integrates KNN, DNN, and GAN as foundational components and is optimized in multiple aspects (features, instances, and hyperparameters) using BA. The outcomes exhibit an enhancement of up to 5% in the coefficient of determination (R² score) using the proposed method compared to the standard KNN method optimized through grid search.
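The core idea in the abstract is KNN regression whose choices (feature subset, k) are searched by a metaheuristic and scored with R². As a rough illustration only, not the authors' implementation, the sketch below uses plain NumPy, a toy synthetic dataset, and a random search standing in for the bat algorithm; all names, data, and parameter ranges here are hypothetical.

```python
import numpy as np

def knn_regress(X_train, y_train, X_test, k):
    # Predict each test point as the mean target of its k nearest training neighbors.
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)
        nearest = np.argsort(dists)[:k]
        preds.append(y_train[nearest].mean())
    return np.array(preds)

def r2_score(y_true, y_pred):
    # Coefficient of determination: 1 - residual SS / total SS.
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))                       # tiny dataset, 3 features
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=40)  # only feature 0 is informative
X_tr, X_te, y_tr, y_te = X[:30], X[30:], y[:30], y[30:]

# Random search as a stand-in for BA: jointly pick k and a binary feature mask.
best_score, best_k, best_mask = -np.inf, None, None
for _ in range(50):
    k = int(rng.integers(1, 10))
    mask = rng.random(3) < 0.5
    if not mask.any():
        continue  # skip candidates that select no features
    score = r2_score(y_te, knn_regress(X_tr[:, mask], y_tr, X_te[:, mask], k))
    if score > best_score:
        best_score, best_k, best_mask = score, k, mask
```

A real bat-algorithm search would update a population of candidate solutions via frequency, loudness, and pulse-rate rules rather than sampling independently, and the paper additionally optimizes instance selection and augments the data with a GAN; this sketch only shows the KNN + R² evaluation loop those components plug into.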