Maximal Information Coefficient-Based Undersampling Method for Highly-Imbalanced Learning

Bibliographic Details
Published in IEEE Access Vol. 13; pp. 4126-4135
Main Author Qin, Haiou
Format Journal Article
Language English
Published IEEE 2025
ISSN 2169-3536
DOI 10.1109/ACCESS.2025.3525475

Summary: Learning from highly imbalanced datasets remains a major challenge in machine learning because models built by general-purpose learning algorithms tend to misclassify samples from the minority class. Undersampling is one family of methods for handling imbalanced learning. In this paper, we propose a new undersampling method based on the maximal information coefficient, comprising two algorithms (MICU-1 and MICU-2), to rebalance the datasets. To evaluate the effectiveness of the method, 20 highly imbalanced datasets are used as benchmarks. Results show that, compared with other undersampling methods, the maximal information coefficient-based undersampling method is competitive in terms of G-mean and F-measure.
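
The summary reports results in terms of G-mean and F-measure. As a reading aid, the sketch below computes both from binary predictions using their standard definitions: G-mean as the geometric mean of sensitivity and specificity, and F-measure as the harmonic mean of precision and recall. It is not taken from the paper; the function names, the choice of label 1 for the minority class, and the toy labels are assumptions made for illustration, and the paper may use a different variant (for example a weighted F-measure).

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the minority class (assumption)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class (F1)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (illustrative only): 2 minority samples, 8 majority samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_majority = [0] * 10                    # never predicts the minority class
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]        # one hit, one miss, one false alarm

accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)
print(accuracy, g_mean(y_true, always_majority), f_measure(y_true, always_majority))
print(g_mean(y_true, mixed), f_measure(y_true, mixed))
```

On the toy data, a classifier that always predicts the majority class reaches 80% accuracy yet scores zero on both metrics, which is why they are commonly preferred over accuracy for imbalanced benchmarks.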