Credit Score Prediction Using CART Algorithm in Python
Publisher
Vysoká škola báňská – Technická univerzita Ostrava
Abstract
The objective of this thesis is to create a credit scoring model that a bank can use to predict the creditworthiness of new customers. We consider the Home Equity dataset, which contains baseline and loan performance data for over 5,900 home equity loans. Several machine learning algorithms can be applied to improve the accuracy of scoring on large datasets. In this study, we adopted the CART algorithm to build a model that predicts whether an applicant will be able to pay back the loan. Machine learning techniques such as CART allow us to develop more precise models for evaluating the creditworthiness of loan applicants. The models were developed in Python. Accurate credit scoring is critical to the lending process and helps to minimize the risks associated with granting credit. We developed four models to predict the credit score: (1) a model trained on the data without balancing, (2) a balanced data model, (3) a random under-sampling model, and (4) a random over-sampling model. The application part of the thesis explains the process, including hyperparameter tuning and the selection of important features. We evaluated the performance of the models using several metrics, including accuracy, precision, recall, and F1-score. After developing and evaluating all four models, we found that the balanced data model provides the highest accuracy in predicting credit scores. The model produces predictions both in binary form and as credit scores: for binary prediction, the accuracy is 83% on the test set and 87% on the training set, and for credit score prediction the accuracy is 92%. There is still room for improvement, which matters because the accuracy of the credit score is important. If the accuracy of the binary prediction can be improved, the accuracy of the credit scores can ultimately be improved too.
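The building blocks named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the thesis's implementation: the function names (`gini`, `best_split`, `random_resample`, `binary_metrics`) and the toy data are hypothetical, chosen only for illustration. It shows the Gini impurity split search that CART uses for classification, random under- and over-sampling to equalise class counts, and the four evaluation metrics listed above.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (CART's classification criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Find the threshold on one numeric feature that minimises weighted Gini impurity.

    Returns (threshold, impurity); threshold is None if no split improves on the parent.
    """
    n = len(labels)
    best = (None, gini(labels))
    for t in sorted(set(values))[:-1]:  # the largest value cannot separate anything
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


def random_resample(rows, labels, mode, seed=0):
    """Equalise class counts by random under-sampling ('under') or over-sampling ('over')."""
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    sizes = [len(group) for group in by_class.values()]
    target = min(sizes) if mode == "under" else max(sizes)
    pairs = []
    for y, group in by_class.items():
        if len(group) >= target:
            picked = rng.sample(group, target)  # draw without replacement
        else:
            picked = group + rng.choices(group, k=target - len(group))  # duplicate at random
        pairs.extend((row, y) for row in picked)
    rng.shuffle(pairs)
    return [row for row, _ in pairs], [y for _, y in pairs]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for a binary prediction."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# Invented toy data: one numeric feature and an imbalanced label (1 = loan not repaid).
feature = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
target = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

print("best split:", best_split(feature, target))
print("under-sampled counts:", Counter(random_resample(feature, target, "under")[1]))
print("over-sampled counts:", Counter(random_resample(feature, target, "over")[1]))
print("metrics:", binary_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]))
```

A full CART model applies `best_split` recursively over all features until a stopping rule is met. In practice a library implementation such as scikit-learn's `DecisionTreeClassifier`, which is documented as an optimised version of CART, would normally replace the hand-written split search; whether the thesis used that class is not stated in the abstract.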
Subject(s)
Credit scoring, predict, Creditworthiness, Random over sampling, Random under sampling, Balanced data, Unbalanced data, CART, Decision tree, Customers, Bank, Loan, Accuracy, evaluation