SMOTE-MRS: A Novel SMOTE-Multiresolution Sampling Technique for Imbalanced Distribution to Improve Prediction of Anemia

Bibliographic Details
Published in: IEEE Access, Vol. 12, pp. 154675-154699
Main Authors: Chaerul Ekty Saputra, Dimas; Sunat, Khamron; Ratnaningsih, Tri
Format: Journal Article
Language: English
Published: Piscataway, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2024
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3482968

Summary: Anemia is a widespread global health problem with a substantial effect on particularly susceptible groups. This work aims to improve the diagnosis of anemia, and ultimately healthcare outcomes, specifically in Indonesia, by resolving class imbalance in the dataset with a hybrid machine learning model called SMOTE-MRS, which combines SMOTE, K-Means clustering, and random oversampling (ROS). K-Means clustering first divides the dataset into K groups. SMOTE then generates synthetic instances of the underrepresented class inside every cluster. Random oversampling replicates minority-class instances to balance the dataset and improve the training of machine learning models. The study evaluated Random Forest (RF), Naïve Bayes (NB), and Support Vector Machine (SVM) models for predicting anemia using SMOTE-MRS. SMOTE-MRS achieved an Accuracy of 0.973, Recall of 0.990, Precision of 0.968, F1-Score of 0.979, and AUC of 0.994, indicating that it handles class imbalance effectively and improves prediction accuracy. It also performed strongly on a range of datasets, including anemia, diabetes, breast cancer, and kidney failure, which suggests robustness and versatility across predictive modeling scenarios. SMOTE-MRS outperformed standard approaches such as SMOTE, SMOTE-ENC, and ROS in addressing class imbalance. Future studies should prioritize optimizing SMOTE-MRS to minimize small performance declines, compare it with other approaches, and validate its adaptability across domains. The authors also recommend integrating SMOTE-MRS into deep learning models and real-time applications.
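
The three steps named in the summary (cluster with K-Means, apply SMOTE inside each cluster, then replicate minority instances by random oversampling) can be sketched roughly as follows. This is only an illustration and not the authors' implementation: the summary does not say how many synthetic samples each cluster receives or how the work is split between SMOTE and ROS, so the proportional per-cluster allocation, the use of ROS as a final top-up, the binary-label setting, and all function and parameter names (`cluster_smote_ros`, `k_neighbors`, and so on) are assumptions made for this sketch.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, iters=20):
    """Tiny Lloyd's K-Means; returns one cluster index per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def smote_in_cluster(minority, n_new, rng, k_neighbors=3):
    """SMOTE-style interpolation between a minority point and a near neighbour."""
    synthetic = []
    if len(minority) < 2:  # interpolation needs at least two points
        return synthetic
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: dist2(base, p))[:k_neighbors]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def cluster_smote_ros(X, y, minority_label=1, k=2, seed=0):
    """Balance a binary dataset: K-Means, per-cluster SMOTE, then ROS top-up.

    ASSUMPTION: allocation rule and step ordering are illustrative only.
    """
    rng = random.Random(seed)
    minority = [x for x, t in zip(X, y) if t == minority_label]
    if not minority:
        raise ValueError("no minority-class samples to oversample")
    deficit = len(X) - 2 * len(minority)  # majority count minus minority count
    new_points = []
    if deficit > 0:
        labels = kmeans(X, k, rng)
        for c in range(k):
            in_cluster = [x for x, t, lab in zip(X, y, labels)
                          if lab == c and t == minority_label]
            # Hypothetical rule: share proportional to the cluster's minority size.
            share = deficit * len(in_cluster) // len(minority)
            new_points += smote_in_cluster(in_cluster, share, rng)
        # Random oversampling: duplicate minority rows until classes match.
        while len(new_points) < deficit:
            new_points.append(list(rng.choice(minority)))
    return X + new_points, y + [minority_label] * len(new_points)
```

Calling `cluster_smote_ros(X, y)` on lists of feature vectors and 0/1 labels returns the original rows followed by the added minority rows. In practice, resampling of this kind would normally be applied to the training split only, so that synthetic or duplicated rows do not leak into the evaluation data.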