
Credit risk scoring based on machine learning models: a comparative assessment

DataCite Commons, updated 2023-01-20, indexed 2025-04-16
Download link:
http://doi.nrct.go.th/?page=resolve_doi&resolve_doi=10.14457/TU.the.2022.59
Description:
Commercial banks primarily generate revenue from secured and unsecured loans. However, despite the use of risk assessment tools, many borrowers are unable to repay their financial obligations for various reasons, which can harm the profitability and reputation of financial institutions. Credit approval processes should therefore be designed carefully. Credit risk scoring was developed to assess customers before loans are approved, in order to improve the performance of credit assessment. This research aims to develop a credit risk scoring model using machine learning models with feature selection techniques. The dataset used in this research is the Home Credit Default Risk dataset from Kaggle. LightGBM was compared to Decision Tree and Random Forest, which were enhanced with feature engineering and statistical methods. Feature selection was performed to identify the optimal number of features. Model performance was evaluated using the AUC score. Feature selection reduced the number of features from 624 to 75, and the results showed that this reduction did not affect model performance while it improved training time. LightGBM had the best AUC score, at 78.00%, and Decision Tree had the largest reduction in computation time, at 68.22%.
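The abstract reports model quality as an AUC score. As a rough illustration of what that metric measures, the sketch below computes the area under the ROC curve from labels and predicted scores using the rank-sum (Mann-Whitney U) identity. The function name `auc_score` and the toy labels and scores are assumptions made for this example; they are not taken from the thesis or from the Home Credit Default Risk dataset.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 class labels (1 = default, by assumption here).
    scores: iterable of predicted scores, higher meaning more likely positive.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    # U statistic of the positives, normalised by the number of pos/neg pairs.
    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy example (illustrative values only).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_score(toy_labels, toy_scores)
```

An AUC of 1.0 means every positive case is scored above every negative one, and 0.5 corresponds to random ranking, so the reported 78.00% sits between those two reference points.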
Provider:
Thammasat University
Created:
2023-01-20