Advanced User Credit Risk Prediction Model Using LightGBM, XGBoost and Tabnet with SMOTEENN

Bibliographic Details
Published in IEEE International Conference on Power, Intelligent Computing and Systems (Online) pp. 876-883
Main Authors Yu, Chang; Jin, Yixin; Xing, Qianwen; Zhang, Ye; Guo, Shaobo; Meng, Shuchen
Format Conference Proceeding
Language English
Published IEEE 26.07.2024
ISSN 2834-8567
DOI 10.1109/ICPICS62053.2024.10796247

Summary: Bank credit risk is a significant challenge in modern financial transactions, and the ability to identify qualified credit card holders among a large number of applicants is crucial to the profitability of a bank's credit card business. In the past, screening applicants often required substantial manual labor and was time-consuming. Although the accuracy and reliability of previously used ML models have improved continuously, major banks in the financial industry continue to pursue more reliable and powerful AI models. In this study, we used a dataset of over 40,000 records provided by a commercial bank. We compared dimensionality reduction techniques such as PCA and t-SNE for processing high-dimensional datasets, and adapted and tuned distributed gradient-boosting models such as LightGBM and XGBoost, as well as deep models such as TabNet. Combining SMOTEENN with these techniques yielded strong results. The experiments showed that LightGBM combined with PCA and SMOTEENN can help banks accurately predict potential high-quality customers, performing better than the other models compared.
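
The summary names SMOTEENN without describing it. As a rough illustration only, and not the authors' implementation, the sketch below shows the general idea in plain Python: SMOTE first adds synthetic minority-class rows by interpolating between neighbouring minority rows, then Edited Nearest Neighbours (ENN) removes rows whose label disagrees with most of their nearest neighbours. All function names, the toy data, the choice k=3, and the majority-vote rule are assumptions made for this example. It also assumes a binary problem with at least two minority rows. In practice a library implementation such as the `SMOTEENN` class in imbalanced-learn would normally be used instead.

```python
import math
import random


def k_nearest(point, points, k, skip_index=None):
    """Indices of the k rows in `points` closest to `point` (Euclidean)."""
    order = sorted(
        (i for i in range(len(points)) if i != skip_index),
        key=lambda i: math.dist(point, points[i]),
    )
    return order[:k]


def smote(X, y, minority=1, k=3, seed=0):
    """Add synthetic minority rows until both classes have the same size.

    Each synthetic row lies on the segment between a minority row and one
    of its k nearest minority neighbours. Assumes two classes.
    """
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_new = (len(X) - len(minority_rows)) - len(minority_rows)
    synthetic = []
    for _ in range(max(n_new, 0)):
        i = rng.randrange(len(minority_rows))
        j = rng.choice(k_nearest(minority_rows[i], minority_rows, k, skip_index=i))
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            [a + t * (b - a) for a, b in zip(minority_rows[i], minority_rows[j])]
        )
    return X + synthetic, y + [minority] * len(synthetic)


def edited_nearest_neighbours(X, y, k=3):
    """Keep only rows whose label matches most of their k nearest neighbours."""
    keep = []
    for i, (x, label) in enumerate(zip(X, y)):
        votes = [y[j] for j in k_nearest(x, X, k, skip_index=i)]
        if votes.count(label) * 2 > len(votes):
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, minority=1, k=3, seed=0):
    """Oversample with SMOTE, then clean the result with ENN."""
    X_over, y_over = smote(X, y, minority, k, seed)
    return edited_nearest_neighbours(X_over, y_over, k)


# Toy data (hypothetical): a majority cluster near the origin, a small
# minority cluster near (5, 5), and one majority-labelled row sitting
# inside the minority cluster.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0, 0.5], [0.5, 0], [1, 0.5],
     [5.5, 5.5],
     [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

X_res, y_res = smote_enn(X, y)
print(y_res.count(0), y_res.count(1))
```

On this toy data the minority class is oversampled from 3 to 9 rows, and the cleaning step drops the majority-labelled row at (5.5, 5.5) because all of its nearest neighbours belong to the minority class. The resampled rows would then be passed to a classifier; the summary reports LightGBM with PCA as the best-performing combination.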