Development and evaluation of a novel framework to enhance k-NN algorithm’s accuracy in data sparsity contexts

Bibliographic Details
Published in Scientific Reports Vol. 14; no. 1; pp. 25036 - 13
Main Authors Giannopoulos, Panagiotis G., Dasaklis, Thomas K., Rachaniotis, Nikolaos
Format Journal Article
Language English
Published London Nature Publishing Group UK 23.10.2024
Nature Publishing Group
Nature Portfolio
ISSN 2045-2322
DOI 10.1038/s41598-024-76909-6

Summary: This paper presents a novel framework for implementing the k-NN algorithm, designed to enhance its accuracy in contexts with sparse data. The framework addresses limitations in the algorithm’s training process by optimizing data structures. It employs composite datasets generated from the initial data using a data-driven fuzzy Analytic Hierarchy Process weighting scheme. This approach is designed to enhance the informational content of the initial datasets, thus reducing entropy and implementation uncertainty. The framework was evaluated using 75 publicly available datasets and 3 generated datasets, demonstrating significant accuracy improvements across various values of the k parameter. The findings were generalized using non-parametric hypothesis tests, while their sensitivity was assessed by applying different distance metrics. By enhancing informational content, the composite data structures contribute to both accuracy improvements and scalability, particularly in data-sparse contexts. This relationship underscores the critical role of entropy in enhancing the performance of explainable machine learning algorithms, providing a valuable and interpretable tool for transforming data structures in sparse data environments.
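For readers unfamiliar with the evaluation setup the summary mentions (accuracy measured across several k values and distance metrics), the following is a minimal sketch of a plain k-NN classifier in Python. It is not the paper's framework: the composite datasets and the fuzzy Analytic Hierarchy Process weighting are not reproduced here, and the toy data, function names, and the choice of Euclidean and Manhattan distances are assumptions made only for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, metric=euclidean):
    """Return the majority label among the k training points nearest to query."""
    # Rank training indices by distance to the query and keep the k closest.
    nearest = sorted(range(len(train_X)),
                     key=lambda i: metric(train_X[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k, metric=euclidean):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, q, k, metric) == y
               for q, y in zip(test_X, test_y))
    return hits / len(test_X)


# Hypothetical toy data: two well-separated classes in two dimensions.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
           (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["a", "a", "a", "b", "b", "b"]
test_X = [(0.05, 0.1), (1.0, 0.95)]
test_y = ["a", "b"]

# Sweep k and the distance metric, as a sensitivity check would.
for metric in (euclidean, manhattan):
    for k in (1, 3, 5):
        score = accuracy(train_X, train_y, test_X, test_y, k, metric)
        print(metric.__name__, k, score)
```

In a setup like the one the summary describes, the same sweep would presumably be run once on the initial dataset and once on the transformed dataset, so that the two accuracy tables can be compared with a non-parametric test.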