The summary of the literature review.

Source: Figshare. Updated 2023-12-08; indexed 2026-04-28.

Download link:

https://figshare.com/articles/dataset/The_summary_of_the_literature_review_/24775915

Description:

In recent years, as the financial system has matured and the banking industry has grown rapidly, competition among banks has intensified. At the same time, advances in information and Internet technology have diversified customers’ choice of financial products and weakened their dependence on and loyalty to banking institutions, making customer churn an increasingly prominent problem for commercial banks. Predicting customer behavior and retaining existing customers has become a major challenge for banks. This study therefore takes a bank’s business data from the Kaggle platform as its research object, compares multiple sampling methods for balancing the data, builds a customer churn prediction model with GA-XGBoost, and conducts an interpretability analysis of the GA-XGBoost model to provide decision support and suggestions for preventing customer churn in the banking industry.

The results show that: (1) SMOTEENN is more effective than SMOTE and ADASYN at handling the imbalance in the banking data. (2) XGBoost optimized with a genetic algorithm reaches F1 and AUC values of 90% and 99%, respectively, outperforming the six other machine learning models compared; the GA-XGBoost classifier was identified as the best solution for the customer churn problem. (3) Shapley values are used to explain how each feature affects the model results and to analyze the features with a high impact on the model prediction, such as the total number of transactions in the past year, the amount of transactions in the past year, the number of products owned by customers, and the total sales balance.

The contribution of this paper is mainly in two aspects: (1) building on accurate identification of churned customers, the study extracts useful information from the black-box model, which can serve as a reference for commercial banks seeking to improve service quality and retain customers; (2) it can serve as a reference for customer churn early-warning models in other related industries, and can help the banking industry maintain customer stability, maintain market position, and reduce corporate losses.
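
To make the Shapley-value step in result (3) more concrete, the following is a minimal sketch of how a Shapley attribution can be computed exactly for a very small model. It is not the study's code: the study applies Shapley values to the GA-XGBoost model, whereas this sketch uses a hypothetical hand-written scoring function (`toy_churn_score`), hypothetical feature names, and made-up instance and baseline values, and it brute-forces every feature ordering, which is only feasible for a handful of features.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every possible ordering in which features are switched from the
    baseline value to the instance value."""
    features = list(instance)
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)           # all features start at baseline
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # reveal this feature's real value
            score = predict(current)
            totals[name] += score - previous
            previous = score
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_churn_score(x):
    """Hypothetical stand-in for a trained model (NOT the GA-XGBoost model):
    fewer transactions and fewer products push the churn score up."""
    return (1.0
            - 0.01 * x["transaction_count"]
            - 0.0001 * x["transaction_amount"]
            - 0.05 * x["product_count"])


# Made-up customer and made-up "average customer" baseline.
instance = {"transaction_count": 40, "transaction_amount": 2000, "product_count": 2}
baseline = {"transaction_count": 60, "transaction_amount": 4000, "product_count": 4}

values = shapley_values(toy_churn_score, instance, baseline)
for name, value in values.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the score for the customer and the score for the baseline, which is the property that makes Shapley values useful for ranking feature impact. For a real tree ensemble, a dedicated library would normally replace this brute-force loop, since the number of orderings grows factorially with the number of features.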

Created:

2023-12-08
