yoavschaffer127/yoav.credit.cards
Source: Hugging Face. Updated 2025-11-19; indexed 2025-12-20.
Download link: https://hf-mirror.com/datasets/yoavschaffer127/yoav.credit.cards
# Credit Card Fraud Detection – EDA Project
## 1. Dataset Description
The dataset used in this project is the **Credit Card Fraud Detection Dataset** from Kaggle,
published by the ULB Machine Learning Group.
It contains 284,807 transactions and 30 numerical features: 28 PCA-transformed components (`V1`–`V28`) plus the untransformed `Time` and `Amount` columns,
along with the target variable `Class` (1 = fraud, 0 = non-fraud).
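As a quick sanity check on the schema described above, the file can be read and its columns split into PCA components and the remaining features. This is a minimal standard-library sketch, not the notebook's own code; the helper names `read_transactions` and `split_columns` are hypothetical, and the notebook itself may well use pandas instead.

```python
import csv


def read_transactions(path):
    """Read the CSV into a list of dicts: float features, integer Class label."""
    with open(path, newline="") as f:
        return [
            {k: (int(v) if k == "Class" else float(v)) for k, v in row.items()}
            for row in csv.DictReader(f)
        ]


def split_columns(rows):
    """Split feature names into PCA components (V<number>) and everything else.

    The target column `Class` is excluded from both lists.
    """
    features = [c for c in rows[0] if c != "Class"]
    pca = [c for c in features if c.startswith("V") and c[1:].isdigit()]
    other = [c for c in features if c not in pca]
    return pca, other
```

Calling `read_transactions("creditcard.csv")` and then `split_columns(...)` on the result should show the 28 `V` columns in the first list and `Time` and `Amount` in the second, assuming the file matches the description above.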
## 2. Goal of the Analysis
The goal of the EDA is to explore patterns, anomalies, correlations, outliers,
and statistical properties to better understand fraudulent transactions.
## 3. Research Questions
- Do transaction amounts differ between fraudulent and non-fraudulent transactions?
- Which features are most correlated with the target variable (`Class`)?
- What is the class distribution in the dataset?
- Are there significant outliers in key features?
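Two of these questions (class distribution and amount differences) reduce to simple grouped summaries. The sketch below illustrates one way to compute them with the standard library; the function names and the toy values are made up for illustration and are not taken from the dataset or the notebook.

```python
from collections import Counter
from statistics import mean, median


def class_shares(labels):
    """Return the fraction of rows belonging to each class label."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def amount_summary_by_class(amounts, labels):
    """Summarise transaction amounts separately for each class label."""
    groups = {}
    for amount, label in zip(amounts, labels):
        groups.setdefault(label, []).append(amount)
    return {
        label: {
            "n": len(values),
            "mean": mean(values),
            "median": median(values),
            "max": max(values),
        }
        for label, values in groups.items()
    }


# Toy data only: three non-fraud rows (0) and one fraud row (1).
toy_labels = [0, 0, 0, 1]
toy_amounts = [10.0, 20.0, 30.0, 100.0]
shares = class_shares(toy_labels)
summary = amount_summary_by_class(toy_amounts, toy_labels)
```

On the real data, `labels` would be the `Class` column and `amounts` the `Amount` column. Comparing medians as well as means is a deliberate choice here, since amount distributions are typically right-skewed and the mean alone can be misleading.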
## 4. Key Visualizations
Included in the notebook:
- Histograms for transaction amounts and PCA features
- Boxplots to detect outliers
- Correlation heatmap
- Bar chart for class imbalance
## 5. Insights & Findings
- The dataset is extremely imbalanced: fraud cases make up less than 0.2% of transactions (492 of 284,807, about 0.17%).
- The distribution of `Amount` differs between fraudulent and non-fraudulent transactions.
- Several PCA features correlate with the `Class` label.
- Outliers in `Amount` are important indicators of unusual behavior.
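The correlation and outlier findings can be checked with two small helpers. The sketch below assumes Pearson correlation against the binary `Class` label and the common 1.5 × IQR rule for outliers; the source does not say which methods the notebook uses, so both are assumptions, and the function names are hypothetical.

```python
from math import sqrt
from statistics import quantiles


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def rank_by_correlation(columns, labels):
    """Rank features by absolute correlation with the label, strongest first.

    `columns` maps a feature name to its list of values.
    """
    scores = {name: pearson(values, labels) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]
```

Ranking by absolute value keeps strongly negative correlations near the top, which matters when the sign of a PCA component carries no intrinsic meaning. With a fraud share this small, correlation coefficients against `Class` tend to be modest in size, so the ranking is usually more informative than the raw values.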
## 6. Notebook and Files
Both the notebook (`.ipynb`) and the original dataset (`creditcard.csv`)
are included in this repository.
## 7. Video Presentation
A short video (2–3 minutes) explaining the results and insights will be linked here once uploaded.
Provider: yoavschaffer127



