
yoavschaffer127/yoav.credit.cards

Hugging Face | Updated 2025-11-19 | Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/yoavschaffer127/yoav.credit.cards
Description:
# Credit Card Fraud Detection – EDA Project

## 1. Dataset Description

The dataset used in this project is the **Credit Card Fraud Detection Dataset** from Kaggle, published by the ULB Machine Learning Group. It contains 284,807 transactions and 30 numerical features (28 PCA-transformed components, `V1` to `V28`, plus the untransformed `Time` and `Amount`), along with the target variable `Class` (1 = fraud, 0 = non-fraud).

## 2. Goal of the Analysis

The goal of the EDA is to explore patterns, anomalies, correlations, outliers, and statistical properties of the data to better understand fraudulent transactions.

## 3. Research Questions

- Do transaction amounts differ between fraudulent and non-fraudulent transactions?
- Which features are most correlated with the target variable (`Class`)?
- What is the class distribution in the dataset?
- Are there significant outliers in key features?

## 4. Key Visualizations

Included in the notebook:

- Histograms for transaction amounts and PCA features
- Boxplots to detect outliers
- Correlation heatmap
- Bar chart for class imbalance

## 5. Insights & Findings

- The dataset is extremely imbalanced (fraud cases < 0.2%).
- The distribution of transaction amounts differs between fraudulent and non-fraudulent transactions.
- Several PCA features correlate with the fraud label.
- Outliers in `Amount` are important indicators of unusual behavior.

## 6. Notebook and Files

Both the notebook (`.ipynb`) and the original dataset (`creditcard.csv`) are included in this repository.

## 7. Video Presentation

A short video (2–3 minutes) explaining the results and insights will be added here.

(Insert link here once uploaded)
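
The research questions in section 3 can be explored with a few small helpers. The following is a minimal sketch, not the notebook's actual code: it uses only the Python standard library, all function names (`load_rows`, `class_distribution`, `amount_summary`, `correlations_with_target`, `iqr_outliers`) are hypothetical, and it assumes every column of `creditcard.csv` is numeric and that the file contains `Amount` and `Class` columns.

```python
import csv
import math
from collections import Counter
from statistics import mean, median, quantiles


def load_rows(path):
    """Read a CSV into a list of dicts, converting every value to float."""
    with open(path, newline="") as f:
        return [{k: float(v) for k, v in row.items()} for row in csv.DictReader(f)]


def class_distribution(rows, target="Class"):
    """Return {label: fraction of rows} for the target column."""
    counts = Counter(int(r[target]) for r in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def amount_summary(rows, target="Class", column="Amount"):
    """Return {label: (mean, median)} of `column` for each class."""
    groups = {}
    for r in rows:
        groups.setdefault(int(r[target]), []).append(r[column])
    return {label: (mean(v), median(v)) for label, v in groups.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def correlations_with_target(rows, target="Class"):
    """Rank features by absolute Pearson correlation with the target."""
    ys = [r[target] for r in rows]
    features = [k for k in rows[0] if k != target]
    corr = {k: pearson([r[k] for r in rows], ys) for k in features}
    return sorted(corr.items(), key=lambda kv: abs(kv[1]), reverse=True)


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (the boxplot rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]
```

For example, `class_distribution(load_rows("creditcard.csv"))` would report the share of each class, and `iqr_outliers([r["Amount"] for r in rows])` would list the amounts a boxplot would flag. Plotting libraries may compute quartiles slightly differently from `statistics.quantiles`, so the flagged points can differ at the margins.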

Provider:
yoavschaffer127