ZZHHJ/bank_churners
Source: Hugging Face. Updated 2026-04-20, indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/ZZHHJ/bank_churners
---
license: cc-by-4.0
---
# Bank Churners Analysis
# Part 1 - Dataset Overview
## Dataset Description
The dataset is titled “Credit Card Customers” (Bank Churners), obtained from Kaggle. It contains detailed demographic, financial, and behavioral information about 10,127 credit card users of a retail banking institution, recorded across 23 features (columns), along with an indicator of whether each customer has churned.
Each record represents a single customer account, describing:
### Demographic Attributes
Age, Gender, Marital Status, Education Level, Income Category, Number of Dependents
### Account & Credit Characteristics
Card Category, Credit Limit, Revolving Balance, Average Open To Buy (available credit)
### Behavioral Indicators
Months on Book (tenure), Total Relationship Count (products held), Months Inactive, Contacts Count (last 12 months), Total Transaction Amount & Count (yearly), Change in Spending (Q4 → Q1), Change in Transaction Frequency, Credit Utilization Ratio
## Objective of the Analysis
The main goal of this analysis is to investigate customer behavior within a retail banking environment and identify the factors that drive customer retention versus attrition. We aim to uncover the demographic, financial, and behavioral patterns that distinguish customers who remain active from those at risk of churning. By examining account usage, spending intensity, product engagement, and credit behavior, the analysis seeks to surface actionable insights that can help financial institutions improve customer loyalty, design targeted retention strategies, and optimize overall customer lifecycle management.
Customer churn is a major concern for banks because losing clients directly affects revenue, stability, and long-term growth. Understanding who is likely to leave and why is essential for preventing financial loss and strengthening customer relationships. This dataset provides realistic behavioral and financial signals that allow us to explore the underlying causes of attrition, making the analysis both meaningful and highly relevant to real-world banking operations.
## Target Variable
The target variable in this analysis is Attrition_Flag, which indicates whether a customer is an “Existing Customer” or an “Attrited Customer”. This variable represents customer churn, and the goal of the analysis is to explore which demographic, financial, and behavioral factors are associated with a higher likelihood of attrition.
# Part 2 - Exploratory Data Analysis
## Data Cleaning
The dataset was cleaned to ensure consistency and analytical readiness.
`CLIENTNUM` was converted to an object identifier, and implicit missing values (“Unknown”, “N/A”) were standardized to `NaN` while keeping the affected rows to preserve potentially meaningful behavioral patterns.
Zero values were reviewed and confirmed to represent valid customer behavior.
Duplicate checks verified all records were unique, and categorical fields showed no formatting inconsistencies. Numerical sanity checks found no unrealistic values.
The `Attrition_Flag` column was encoded into a binary variable for easier analysis, and two model-generated columns were removed to avoid leaking predictive information. Finally, income ranges were converted into approximate numeric values to support statistical exploration.
The resulting dataset is clean, consistent, and ready for analysis.
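The cleaning steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the authors' actual notebook: the column names follow the public Kaggle file, while the `Churn` and `Income_Approx` column names, the `Naive_Bayes_Classifier` prefix used to find the model-generated columns, and the income midpoints are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Assumed midpoints for the income labels; the text only says
# "approximate numeric values", so these numbers are illustrative.
INCOME_MIDPOINTS = {
    "Less than $40K": 20_000,
    "$40K - $60K": 50_000,
    "$60K - $80K": 70_000,
    "$80K - $120K": 100_000,
    "$120K +": 140_000,
}


def clean_bank_churners(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the described cleaning steps to a copy of the raw data."""
    out = df.copy()

    # Treat the customer ID as a label, not a number.
    out["CLIENTNUM"] = out["CLIENTNUM"].astype(str)

    # Standardize implicit missing markers to NaN but keep the rows.
    out = out.replace(["Unknown", "N/A"], np.nan)

    # Binary target (hypothetical column name): 1 = attrited, 0 = existing.
    out["Churn"] = (out["Attrition_Flag"] == "Attrited Customer").astype(int)

    # Drop model-generated columns (assumed to share this name prefix).
    leak_cols = [c for c in out.columns if c.startswith("Naive_Bayes_Classifier")]
    out = out.drop(columns=leak_cols)

    # Approximate numeric income; missing or unmapped labels stay NaN.
    out["Income_Approx"] = out["Income_Category"].map(INCOME_MIDPOINTS)
    return out
```

Keeping the original `Attrition_Flag` and `Income_Category` columns next to the derived ones makes it easy to check the encoding against the raw labels.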
## Outlier Detection & Handling
<img src="https://cdn-uploads.huggingface.co/production/uploads/6904aebe18a1ba17d9435d1e/4MsZNWhqnFrpJd61hSCAd.png" width="650">
<img src="https://cdn-uploads.huggingface.co/production/uploads/6904aebe18a1ba17d9435d1e/V1DEpDvv_7gEtzVZ7NwOY.png" width="650">
Outlier analysis was conducted on key numerical features (transaction count and transaction amount). While several high-value observations appeared, they represent genuine high-spending customers rather than data errors. Because these values naturally occur in real banking environments, where a small segment of customers often shows significantly higher activity, we chose to retain them.
Although these customers were not analyzed as a dedicated subgroup later in the project, keeping them in the dataset preserves the full behavioral spectrum of the customer base and prevents introducing bias by artificially removing legitimate activity levels.
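The text does not state which outlier rule was used. As one common choice, the sketch below assumes the 1.5 × IQR rule and flags values instead of dropping them, which matches the decision to retain high-activity customers. The function name and the sample values are hypothetical.

```python
import pandas as pd


def iqr_outlier_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return True where a value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# Made-up amounts for illustration; on the real data this would be
# applied to Total_Trans_Amt and Total_Trans_Ct.
amounts = pd.Series([1000, 1200, 1100, 1300, 1250, 15000], name="Total_Trans_Amt")
flagged = iqr_outlier_mask(amounts)
```

Only the last value is flagged here, and the series itself is left unchanged, so flagged customers remain available for later analysis.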
## Statistics - Attrition Flag (Target Variable)
The customer base shows an average age of 46, with most customers having 2-3 dependents and holding 3-4 banking products. Activity levels indicate moderate engagement: a median of 67 yearly transactions and around $3,900 in annual spending, with men spending slightly more than women. Churners represent 16% of the population and typically leave after about 36 months, mirroring the average tenure. These statistics provide a clear baseline overview of the customer population before deeper behavioral analysis.
## Visualizations
### Average Transaction Amount by Gender and Age Group
<img src="https://cdn-uploads.huggingface.co/production/uploads/6904aebe18a1ba17d9435d1e/_jczENczf1bCa3UHjIpXj.png" width="700">
The chart shows that average transaction amount generally declines with age. Spending peaks in the early 30s and decreases gradually across older age groups, with a sharper drop after age 60. Both genders follow the same trend, with men consistently spending slightly more than women in every age segment.
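A table like the one behind this chart could be built along these lines. The 10-year age bands are an assumption, since the chart's actual bins are not stated, and the column names follow the public Kaggle file.

```python
import pandas as pd


def mean_spend_by_age_and_gender(df: pd.DataFrame) -> pd.DataFrame:
    """Average yearly transaction amount per age band (rows) and gender (columns)."""
    # Hypothetical 10-year bands; left edge included, right edge excluded.
    bins = [20, 30, 40, 50, 60, 80]
    labels = ["20-29", "30-39", "40-49", "50-59", "60+"]
    age_band = pd.cut(df["Customer_Age"], bins=bins, labels=labels, right=False)
    return (
        df.assign(Age_Band=age_band)
        .groupby(["Age_Band", "Gender"], observed=True)["Total_Trans_Amt"]
        .mean()
        .unstack("Gender")
    )
```

The result has one row per observed age band and one column per gender, which can be passed directly to a grouped bar or line plot.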
### Product Holding Distribution Across Customer Tenure
<img src="https://cdn-uploads.huggingface.co/production/uploads/6904aebe18a1ba17d9435d1e/yjxFAhWIQjFqt8kyxwuIj.png" width="700">
Most customers hold exactly 3 products regardless of tenure, and while longer-tenured customers tend to have slightly more products, the change is gradual rather than dramatic. This indicates stable customer behavior over time with limited upsell expansion.
## Research
### How does a change in spending between Q4 and Q1 predict customer attrition?
<img src="https://cdn-uploads.huggingface.co/production/uploads/6904aebe18a1ba17d9435d1e/JxVzwwJmKhNAotfzsn7Ok.png" width="700">
Although the spending change chart shows only a small difference in attrition rates (about 2%), the consistent direction suggests that some customer subgroups may react more strongly to financial changes than others. This leads to the next question: whether demographic factors such as the number of dependents are associated with different churn patterns. In other words, do customers with more dependents tend to stay longer or churn more often?
### How does the number of dependents affect customer attrition?
<img src="https://cdn-uploads.huggingface.co/production/uploads/6904aebe18a1ba17d9435d1e/XS4cXk1Y4Mz51AhOpgwbx.png" width="700">
Attrition rates vary slightly by number of dependents, with a mild peak among customers with 3-4 dependents, but the pattern is not stable enough to claim a meaningful relationship between family size and churn. This suggests that dependents may contribute to financial pressure for some customers, but they do not consistently explain attrition behavior.
This weak signal prompts a deeper question: is churn less about family structure and more about customer frustration or reduced engagement?
### Do customers who contact the bank more frequently show higher churn?
<img src="https://cdn-uploads.huggingface.co/production/uploads/6904aebe18a1ba17d9435d1e/6S_dT2LdBzZ1ORqwInaIH.png" width="700">
The bubble chart shows a clear pattern: the more often customers contact the bank, the higher their likelihood of churn. Attrition rises from almost zero among customers with no contacts to full churn among those with six contacts, suggesting persistent issues or dissatisfaction.
Given this strong link, we expanded the analysis to examine whether broader engagement factors such as product ownership also help explain churn across the full customer base.
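The per-group attrition rates used in these comparisons can be computed with a small helper such as the one below. It is a sketch under the assumption that the raw `Attrition_Flag` labels are still present. The `customers` count is included because a bubble chart typically needs a group size, and the helper name is hypothetical.

```python
import pandas as pd


def attrition_by(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Attrition rate and group size for each value of `column`."""
    churned = df["Attrition_Flag"].eq("Attrited Customer")
    return (
        df.assign(churned=churned)
        .groupby(column)["churned"]
        .agg(attrition_rate="mean", customers="size")
    )
```

Calling `attrition_by(df, "Contacts_Count_12_mon")` gives the contact-frequency view. The same call with `"Total_Relationship_Count"` or `"Months_Inactive_12_mon"` gives the product and inactivity views.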
### Does the depth of the customer's relationship with the bank reduce the likelihood of attrition?
<img src="https://cdn-uploads.huggingface.co/production/uploads/6904aebe18a1ba17d9435d1e/p0Wrua7WfmAMuji5x1IaV.png" width="700">
The chart shows a strong inverse link between product ownership and churn: customers with only 1-2 products have the highest attrition rates, while those with 5-6 products churn far less. This indicates that deeper relationships help protect against churn. This insight naturally leads to the next question: does a decline in day-to-day activity, rather than product count, also signal an increased risk of leaving?
### How does customer inactivity influence attrition?
<img src="https://cdn-uploads.huggingface.co/production/uploads/6904aebe18a1ba17d9435d1e/jqqoI1GUZ0ln2tSHeavHj.png" width="700">
The analysis identifies a clear risk pattern: churn likelihood rises sharply between 1 and 4 months of inactivity, peaking at month 4. This makes month 1 the critical point for early intervention, before risk accelerates. After month 4, attrition decreases, suggesting that long-term inactive customers are less likely to leave. The bank should therefore focus its retention efforts on months 1-4, where timely outreach and targeted support can significantly reduce churn.
# Conclusions
<img src="https://cdn-uploads.huggingface.co/production/uploads/6904aebe18a1ba17d9435d1e/4m3LVtMzaBfn4HbKnGqCt.png" width="700">
The summary profile clearly highlights the behavioral divide between customers who stay and those who churn. Existing customers consistently exhibit higher spending, more frequent transactions, and broader product ownership, indicating strong engagement and stable relationships with the bank. In contrast, attrited customers show more inactive months and significantly higher contact frequency, signaling dissatisfaction or unresolved issues that accumulate over time. When combined with earlier findings such as the effects of declining spending, rising frustration-driven contacts, limited product engagement, and prolonged inactivity, it becomes evident that churn is not a sudden event but the end result of a gradual breakdown in the customer-bank relationship.
Overall, the patterns from our analysis show that churn is strongly connected to early signs of customer frustration, a weakening relationship with the bank, and a gradual loss of trust, long before the customer decides to leave.
# Strategic Recommendations for Reducing Customer Churn
### Early Intervention After 1 Month of Inactivity
Automate outreach when a customer becomes inactive for one month and offer small incentives to re-engage before churn risk peaks in months 3-4.
### Fast Track Support for High-Contact Customers
Flag customers with 4+ yearly contacts and route them to priority support to resolve recurring issues quickly and prevent frustration-driven churn.
### Strengthen Engagement for Customers With 1-2 Products
Target this high-risk segment with simple cross-sell offers (card, savings, digital tools) to increase product ownership and stabilize retention.
### Build a Proactive Churn Risk Monitoring System
Create a churn risk score that tracks key signals (inactivity, spending drops, high contact frequency) and triggers early retention actions automatically.
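One simple way to prototype such a score is to count how many warning signals a customer currently shows. The sketch below is only an illustration: the thresholds are taken from the recommendations above, the equal weighting is a placeholder, and reading a `Total_Amt_Chng_Q4_Q1` value below 1 as a spending drop is an assumption about how that ratio is defined.

```python
import pandas as pd


def churn_risk_score(df: pd.DataFrame) -> pd.Series:
    """Count how many warning signals each customer shows (0 to 4)."""
    signals = pd.DataFrame({
        # Inactive for at least one month in the last 12.
        "inactive": df["Months_Inactive_12_mon"] >= 1,
        # Four or more contacts in the last 12 months.
        "high_contact": df["Contacts_Count_12_mon"] >= 4,
        # Only one or two products held.
        "few_products": df["Total_Relationship_Count"] <= 2,
        # Spending ratio below 1 (assumed to indicate a drop).
        "spend_drop": df["Total_Amt_Chng_Q4_Q1"] < 1.0,
    })
    return signals.sum(axis=1).rename("risk_score")
```

A production version would more likely fit weights from the data, for example with a logistic regression, and validate the score on held-out customers before triggering any retention action.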
# Presentation
The video is longer than the recommended length because I wanted to present the analysis clearly and avoid skipping important steps.
I felt this was the best way to show the full process in a coherent and understandable way.
**https://drive.google.com/file/d/1yGLYvIfas9NsG_5ufhmTFEw-JB8PWUdK/view?usp=sharing**
Provider:
ZZHHJ



