Improved C45 performance with gain ratio for credit approval dataset

Ivandari Ivandari, M Adib Al Karomi, Much. Rifqi Maulana

Abstract


People's shopping behavior changed considerably after the COVID-19 pandemic, with many people switching to online marketplaces for buying and selling. Payment in a marketplace is relatively easy, especially with a credit card. Financial providers need to handle the resulting increase in demand for credit carefully in order to minimize bad loans, and the most effective way to do so is to be more selective when choosing credit customers. Data mining studies historical data to extract knowledge that can inform future decisions, and classifying bad-credit customers is one of its common applications. One algorithm that performs well on credit approval datasets is C4.5. It is widely used because it outputs a decision tree that is easy for humans to interpret. The number of attributes in the data can affect an algorithm's performance. Feature selection reduces the attribute set in order to improve data quality and classification performance. Gain ratio, an extension of information gain, is a feature selection measure widely used by researchers. This study classifies credit approval data with C4.5 and uses gain ratio to select features. With gain ratio feature selection, the accuracy of the C4.5 classifier increased from 94.12% to 95.29%.
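To make the feature selection step concrete, the following is a minimal sketch of how gain ratio can be computed for categorical attributes and used to rank them. It is not the authors' implementation: the function names (`entropy`, `gain_ratio`, `rank_attributes`), the attribute names, and the four-row toy dataset are all hypothetical and chosen only for illustration. Gain ratio is taken here in its usual form, information gain divided by split information (the entropy of the attribute's own value distribution).

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute with respect to the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


def rank_attributes(rows, labels, names):
    """Return (attribute name, gain ratio) pairs, highest score first."""
    scores = {
        name: gain_ratio([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data, not taken from the credit approval dataset.
names = ["employed", "has_phone"]
rows = [
    ("yes", "yes"),
    ("yes", "no"),
    ("no", "yes"),
    ("no", "no"),
]
labels = ["approve", "approve", "reject", "reject"]

ranking = rank_attributes(rows, labels, names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy data, `employed` separates the two classes perfectly while `has_phone` carries no information about the class, so `employed` ranks first. In a feature selection setting, attributes with low scores would be dropped before training the classifier; the cutoff is a design choice that this sketch does not fix.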


Keywords


Decision tree; information gain ratio; accuracy






DOI: http://dx.doi.org/10.32497/jaict.v7i2.3978



ISSN: 2541-6340
Online ISSN: 2541-6359


This work is licensed under a Creative Commons Attribution 4.0 International License.