Replication Data for: Alternative Datasets for Credit Scoring of Thin File Consumers

Source: DataONE · Updated 2026-01-29 · Indexed 2026-02-07

Download link:

https://search.dataone.org/view/sha256:ace825f228759a23198955dea634cda726c7495a98d8ed9bb85330a3fadc72c1

Description:

Credit scoring is essential in financial services because it allows institutions to assess consumers' creditworthiness. Traditional credit scoring models rely heavily on extensive transaction history, which poses a significant challenge for thin-file consumers, that is, individuals with limited credit history. This review explores and evaluates alternative datasets that can be used to improve credit scoring for thin-file consumers. Moving beyond traditional transaction profiles, alternative datasets such as social media data, web browsing behaviour, digital footprints, and telecom data offer new dimensions for assessing consumer credit risk. The review also compares the effectiveness of machine learning algorithms, including support vector machines (SVM), neural networks, decision trees, random forests, and hybrid models, in leveraging these datasets for credit scoring, and highlights the strengths and limitations of each approach. All of these algorithms have shown varying degrees of success with alternative datasets. Hybrid models, which combine multiple machine learning techniques, are particularly effective at leveraging diverse data sources to provide a robust credit risk assessment. The findings indicate that integrating multiple alternative data sources with advanced machine learning algorithms can significantly improve the accuracy and reliability of credit risk assessments. The review underscores the potential of alternative datasets to revolutionise credit scoring for thin-file consumers: by incorporating new data dimensions and advanced machine learning algorithms, researchers can assess credit risk more accurately. Future research may continue to refine these models and explore new alternative datasets to enhance machine-learning-based credit scoring.

Created:

2026-02-01
