An ensemble learning approach for anomaly detection in credit card data with imbalanced and overlapped classes


Bibliographic Details
Published in Journal of information security and applications Vol. 78; p. 103618
Main Authors Islam, Md Amirul; Uddin, Md Ashraf; Aryal, Sunil; Stea, Giovanni
Format Journal Article
Language English
Published Elsevier Ltd 01.11.2023
ISSN 2214-2126; 2214-2134
DOI 10.1016/j.jisa.2023.103618

Summary: Electronic payment methods have become increasingly popular for business transactions, both online and in person, across the globe. Anomalies such as online fraud and default payments, which can result in substantial financial losses, have become more common as the use of credit cards in online purchases has increased. To address this issue, researchers have explored various machine learning models and their ensemble techniques for detecting anomalies in credit card transaction data. However, detecting anomalies in this data can be challenging because the class samples overlap and the class distribution is imbalanced. As a result, the detection rate for anomalies in the minority class is relatively low, and general learning algorithms can be biased towards the majority class. In this paper, we propose a model called Credit Card Anomaly Detection (CCAD) that leverages the base learners paradigm and meta-learning ensemble techniques to improve the detection rate of credit card anomalies. We use four outlier detection algorithms as base learners and the XGBoost algorithm as the meta-learner in the proposed stacked ensemble approach to detect anomalies in credit card transactions. We apply a stratified sampling technique and a k-fold cross-validation process to address the issues of data imbalance and overfitting. In addition, the discordance rate is calculated to improve the accuracy of the ensemble. The proposed model is trained and tested on two datasets: CCF (Credit Card Fraud) and CCDP (Credit Card Default Payment). Experimental results demonstrate that our approach outperforms existing approaches, particularly in detecting anomalies among the minority class instances of these datasets.
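
The stratified k-fold step mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `stratified_kfold`, the round-robin assignment, and the toy labels are all assumptions made for illustration, and only the Python standard library is used. The idea shown is that each fold keeps the same share of the rare anomaly class as the full dataset, so every validation fold contains minority samples to evaluate on.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios mirror the full data.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into k folds
    # so that every fold receives its proportional share of each class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the held-out set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            j for f, fold in enumerate(folds) if f != i for j in fold
        )
        yield train_idx, test_idx


# Made-up labels: 1 = anomaly (minority class), 0 = normal (majority class).
labels = [1] * 10 + [0] * 990
splits = list(stratified_kfold(labels, k=5))

for fold_no, (train_idx, test_idx) in enumerate(splits):
    n_anomalies = sum(labels[i] for i in test_idx)
    print(f"fold {fold_no}: {len(test_idx)} test rows, {n_anomalies} anomalies")
```

With these toy labels, each of the five folds holds 200 test rows, 2 of which are anomalies, matching the 1% anomaly rate of the whole set. In a stacked ensemble, splits like these would typically be used to produce out-of-fold base-learner scores for training the meta-learner, though the summary does not spell out that detail.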