TOMEK-LINK AND SMOTE COMBINATION TO OVERCOME CLASS IMBALANCE IN CREDIT CARD FRAUD
DOI:
https://doi.org/10.31294/larik.v2i2.1789
Keywords:
Imbalanced Class, Oversampling, Tomek-Link, SMOTE, C5.0
Abstract
Online trading, or e-commerce, continues to grow, and with it credit card fraud (carding) has become one of the most common crimes. Roughly 1,000 of every one million transactions are fraudulent (about 0.1%), so datasets of credit card fraud risk are heavily imbalanced. In cases such as credit card transactions, the minority class is more important to identify than the majority class. To deal with class imbalance in a credit card fraud risk dataset, this study proposes a data-level resampling method that combines Tomek-Link and SMOTE, used with the C5.0 classification model. The aim is to improve the AUC of the C5.0 classification model. The results show that the proposed method increased the AUC by 0.134 compared with the same model trained without resampling.
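As a rough illustration of the two resampling steps named in the abstract, the following is a minimal sketch in plain Python, not the study's implementation. The function names, the toy data, the parameter `k`, and the order of the steps (Tomek-link removal first, then SMOTE) are all assumptions made for this example; the C5.0 model and the AUC evaluation are not shown.

```python
import math
import random

def nearest(i, points):
    """Index of the nearest neighbour of points[i], excluding i itself."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: math.dist(points[i], points[j]))

def remove_tomek_links(X, y, majority=0):
    """Drop the majority-class member of every Tomek link."""
    nn = [nearest(i, X) for i in range(len(X))]
    drop = set()
    for i, j in enumerate(nn):
        # Tomek link: mutual nearest neighbours with different labels.
        if nn[j] == i and y[i] != y[j]:
            drop.add(i if y[i] == majority else j)
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]

def smote(X, y, minority=1, k=3, seed=0):
    """Add synthetic minority points until both classes have equal size."""
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_new = (len(X) - len(minority_pts)) - len(minority_pts)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_pts)
        # Up to k nearest minority neighbours of the chosen base point.
        neighbours = sorted((p for p in minority_pts if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return list(X) + synthetic, list(y) + [minority] * len(synthetic)

# Toy data (hypothetical): label 0 = legitimate, label 1 = fraud.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.4),
     (1.5, 0.6), (2.0, 2.0), (2.2, 2.0), (5.0, 5.0), (5.0, 6.0)]
y = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

X_clean, y_clean = remove_tomek_links(X, y)  # removes the borderline majority point
X_bal, y_bal = smote(X_clean, y_clean)       # oversamples the minority class
print(y_bal.count(0), y_bal.count(1))
```

In this sketch the majority point at (2.0, 2.0) forms a Tomek link with the minority point at (2.2, 2.0) and is removed, after which SMOTE interpolates between minority points until the classes are balanced. The resampled data would then be passed to the classifier.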