Publication: Variable Importance Analysis in Default Prediction using Machine Learning Techniques
Date
2018
Publisher
SciTePress
Abstract
In this study, different data mining techniques were applied to a finance credit data set from a financial institution to provide an automated and objective profitability measurement. A two-step methodology was used: first determining the variables to be included in the model, and then deciding on the model to classify a potential credit application as bad credit (default) or good credit (not default). The phrases bad credit and good credit are used as class labels because this is how they are used in financial sector jargon in Turkey. For this two-step procedure, variable selection algorithms such as Random Forest and Boruta and machine learning algorithms such as Logistic Regression, Random Forest, and Artificial Neural Network were tried. At the end of the feature selection phase, the CRA and III variables were determined to be the most important variables. Moreover, occupation and product number were also predictor variables. In the classification phase, the Neural Network model was the best model, with higher accuracy and low average square error; the Random Forest model also performed better than the Logistic Regression model. © 2020 Elsevier B.V., All rights reserved.
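The abstract compares models on accuracy and average square error. As a rough illustration only, the sketch below computes both measures for binary default predictions. It assumes that average square error means the mean squared difference between the predicted default probability and the 0/1 class label, and that a 0.5 cut-off turns probabilities into class predictions; neither detail is stated in the abstract. The function name `evaluate` and all labels and scores are hypothetical and are not data from the study.

```python
from typing import Sequence, Tuple


def evaluate(
    y_true: Sequence[int],
    p_default: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the study
) -> Tuple[float, float]:
    """Return (accuracy, average squared error) for binary default predictions.

    y_true: 1 = bad credit (default), 0 = good credit (not default).
    p_default: predicted probability of default for each application.
    """
    if len(y_true) != len(p_default) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Accuracy: share of applications whose thresholded prediction matches the label.
    correct = sum(int(p >= threshold) == y for y, p in zip(y_true, p_default))
    # Average squared error: mean squared gap between the probability and the 0/1 label.
    ase = sum((p - y) ** 2 for y, p in zip(y_true, p_default)) / n
    return correct / n, ase


# Hypothetical labels and scores for two made-up models (not results from the study).
labels = [1, 0, 0, 1, 0]
model_a = [0.9, 0.2, 0.1, 0.7, 0.4]
model_b = [0.6, 0.4, 0.6, 0.4, 0.3]

acc_a, ase_a = evaluate(labels, model_a)
acc_b, ase_b = evaluate(labels, model_b)
print(f"model A: accuracy={acc_a:.3f}, average squared error={ase_a:.3f}")
print(f"model B: accuracy={acc_b:.3f}, average squared error={ase_b:.3f}")
```

Under these assumptions, a model is preferred when its accuracy is higher and its average squared error is lower, which is the sense in which the abstract ranks the Neural Network above the other classifiers.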
