creditcard Dataset

Name: creditcard Dataset
Creator: figshare
Published: 2025-06-09 17:33:27
License: No description available

DataCite Commons: updated 2025-06-09, indexed 2025-09-08

Download link:

https://figshare.com/articles/dataset/creditcard_Dataset/29270873/1

Resource overview:

Title: Credit Card Transactions Dataset for Fraud Detection (Used in: A Hybrid Anomaly Detection Framework Combining Supervised and Unsupervised Learning)

Description: This dataset, commonly known as creditcard.csv, contains anonymized credit card transactions made by European cardholders in September 2013. It includes 284,807 transactions, with 492 labeled as fraudulent. Due to confidentiality constraints, features have been transformed using PCA, except for 'Time' and 'Amount'.

This dataset was used in the research article titled "A Hybrid Anomaly Detection Framework Combining Supervised and Unsupervised Learning for Credit Card Fraud Detection". The study proposes an ensemble model integrating techniques such as Autoencoders, Isolation Forest, Local Outlier Factor, and supervised classifiers including XGBoost and Random Forest, aiming to improve the detection of rare fraudulent patterns while maintaining efficiency and scalability.

Key Features:
- 30 numerical input features (V1–V28, Time, Amount)
- Class label indicating fraud (1) or normal (0)
- Imbalanced class distribution typical in real-world fraud detection

Use Case: Ideal for benchmarking and evaluating anomaly detection and classification algorithms in highly imbalanced data scenarios.

Source: Originally published by the Machine Learning Group at Université Libre de Bruxelles. https://www.kaggle.com/mlg-ulb/creditcardfraud

License: This dataset is distributed for academic and research purposes only. Please cite the original source when using the dataset.
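To make the class imbalance mentioned in the description concrete, the following is a minimal sketch of how the label distribution might be checked. It assumes the file has a header row containing a `Class` column (1 for fraud, 0 for normal), as the description states. The helper names `class_counts` and `fraud_rate` are hypothetical, and the inline sample rows are synthetic placeholders, not records from the dataset.

```python
import csv
import io
from collections import Counter


def class_counts(csv_file):
    """Count rows per value of the 'Class' column (1 = fraud, 0 = normal)."""
    reader = csv.DictReader(csv_file)
    return Counter(int(row["Class"]) for row in reader)


def fraud_rate(counts):
    """Return the fraction of rows labeled as fraud (Class == 1)."""
    total = sum(counts.values())
    return counts[1] / total if total else 0.0


# Synthetic stand-in for a few rows of creditcard.csv (values are made up,
# and only one of the V1-V28 columns is included to keep the sample short).
sample = io.StringIO(
    "Time,V1,Amount,Class\n"
    "0,-1.36,149.62,0\n"
    "1,1.19,2.69,0\n"
    "2,-2.31,0.00,1\n"
    "3,-0.97,123.50,0\n"
)
sample_counts = class_counts(sample)
print("sample counts:", dict(sample_counts))

# Counts taken from the description: 284,807 transactions, 492 fraudulent.
stated = Counter({0: 284807 - 492, 1: 492})
print(f"stated fraud rate: {fraud_rate(stated):.4%}")
```

With a local copy of the file, the same helper could be applied as `with open("creditcard.csv", newline="") as f: counts = class_counts(f)`, where the path is an assumption about where the download was saved. Given how small the fraud share is, plain accuracy is a weak yardstick on this data; precision, recall, or area under the precision-recall curve are usually more informative.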

Provider: figshare
Created: 2025-06-09
