Performance Comparison of the Logistic Regression, Random Forest, and Naïve Bayes Algorithms in Credit Risk Classification
Keywords:
classification, comparison, credit risk, supervised learning
Abstract
Credit risk classification is a critical process in the financial industry for identifying customers likely to default. This study compares the performance of three supervised learning algorithms: Logistic Regression, Random Forest, and Naïve Bayes. The experiments use the German Credit dataset, which contains 1,000 customer records. The research stages are data preprocessing, data normalization, splitting the data into training and test sets at an 80:20 ratio, modeling in KNIME Analytics Platform 5.9.0, and evaluation using Accuracy, Precision, Recall, F1-Score, the Confusion Matrix, and Cohen's Kappa. The evaluation focuses on each model's ability to detect the "bad" class, which represents high-risk customers. Random Forest delivered the best performance, with an accuracy of 81.50%, precision of 0.94, recall of 0.82, F1-Score of 0.87, and Cohen's Kappa of 0.51. Logistic Regression achieved an accuracy of 76.10% and Naïve Bayes 75.50%. These findings indicate that Random Forest is more effective at identifying high-risk customers than the other two algorithms. The contribution of this research lies in its risk-based evaluation approach, which uses detection of the "bad" class as the basis for recommending a credit risk classification model.
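The evaluation metrics named in the abstract can all be derived from a binary confusion matrix in which "bad" is treated as the positive class. The following is a minimal Python sketch of that arithmetic, not the authors' KNIME workflow. The class name `BinaryConfusion`, the function `evaluate`, and the counts in the usage lines are hypothetical and chosen only for illustration; they are not results from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryConfusion:
    """Confusion-matrix counts with "bad" as the positive class (hypothetical helper)."""
    tp: int  # bad customers predicted bad
    fn: int  # bad customers predicted good
    fp: int  # good customers predicted bad
    tn: int  # good customers predicted good


def _ratio(num: float, den: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def evaluate(cm: BinaryConfusion) -> dict:
    """Compute accuracy, precision, recall, F1, and Cohen's kappa for the positive class."""
    n = cm.tp + cm.fn + cm.fp + cm.tn
    accuracy = _ratio(cm.tp + cm.tn, n)
    precision = _ratio(cm.tp, cm.tp + cm.fp)
    recall = _ratio(cm.tp, cm.tp + cm.fn)
    f1 = _ratio(2 * precision * recall, precision + recall)

    # Agreement expected by chance: for each class, multiply the share of
    # actual labels by the share of predicted labels, then sum over classes.
    p_bad = _ratio(cm.tp + cm.fn, n) * _ratio(cm.tp + cm.fp, n)
    p_good = _ratio(cm.fp + cm.tn, n) * _ratio(cm.fn + cm.tn, n)
    p_expected = p_bad + p_good

    # Kappa rescales accuracy so that chance-level agreement maps to 0.
    kappa = _ratio(accuracy - p_expected, 1 - p_expected)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Illustrative counts for a 200-record test set (20% of 1,000 records).
# These numbers are made up for the example.
example = BinaryConfusion(tp=40, fn=20, fp=10, tn=130)
metrics = evaluate(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, accuracy is 0.85 while Cohen's Kappa is 0.625. The gap arises because the "good" class dominates the test set, so part of the raw agreement would be expected by chance. This is one reason to report Kappa and the per-class metrics alongside accuracy on an imbalanced dataset.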
License
Copyright (c) 2026 Dewi Ayu Nur Wulandari, Omar Pahlevi, Yopi Handrianto, Ichsan Ramdhani

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish manuscripts through IJCIT agree to the following provisions:
1. The copyright holder of the article is the author.
2. The author grants IJCIT the right of first publication of the scientific article. At the same time, the author licenses the article under the Creative Commons Attribution License, which permits other parties to distribute it.
3. Non-exclusive distribution: the author may enter into separate arrangements for distributing the published article (for example, depositing it in an institutional library or publishing it as a book), provided that the author is credited and IJCIT is acknowledged as the first publisher.
4. Authors may post their articles online (for example, in a repository or on the website of their organization or institution) before and during the manuscript submission process, as this can encourage productive exchange and citation of the work.
5. Manuscripts and related materials published through this Journal are distributed under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA).



