
Anomaly and Fraud Detection Model Datasets for Few-Shot, Zero-Shot, and Imbalanced-Sample Settings

Indexed by the National Basic Science Data Center (国家基础学科公共科学数据中心) on 2026-01-30
Download link:
https://nbsdc.cn/general/dataDetail?id=67fb6371195d26544804477b&type=1
Resource description:
The resource contains three data files, each a graph-structured dataset that supports node classification, fraud detection, and anomaly analysis.

Amazon-Fraud is a multi-relational graph dataset built on the Amazon reviews dataset. It targets financial risk-control and anti-fraud research and is mainly used to evaluate graph-based node classification, fraud detection, and anomaly detection models. For the few-shot anomaly and fraud detection task, a node screening strategy is applied to the Amazon data: random sampling combined with heuristic rules that select high-risk accounts. This reduces the data scale and yields a dataset suited to few-shot settings.

YelpCHI is built on Yelp reviews and records spam and genuine reviews of hotels and restaurants. For the zero-shot anomaly and fraud detection task, all labels are removed from YelpCHI, so that the dataset represents a fully unlabeled setting.

The Elliptic dataset (also referred to as EllipticBitcoin) consists of Bitcoin transaction records labeled as illicit or licit. To address class imbalance, random sampling, including both undersampling and oversampling, is applied to adjust the label ratio to 1:10, making the dataset suitable for anomaly and fraud detection under imbalanced-sample conditions.
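The three preparation steps described above (few-shot node selection, label removal, and resampling to a 1:10 label ratio) can be illustrated with a minimal sketch. This is not the providers' actual pipeline: the function names, the per-class sample size `k`, the use of `None` to mark unlabeled nodes, and the reading of 1:10 as minority-to-majority are all assumptions made for illustration. The heuristic rules for high-risk accounts are not specified in the description, so only the random-sampling part is shown.

```python
import random

def select_few_shot(labels, k=5, seed=0):
    """Randomly pick up to k labeled node indices per class (assumed few-shot setup)."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        if y is not None:  # None marks an unlabeled node (assumption)
            by_class.setdefault(y, []).append(i)
    picked = []
    for idx in by_class.values():
        picked.extend(rng.sample(idx, min(k, len(idx))))
    return sorted(picked)

def strip_labels(labels):
    """Drop every label so the graph can be used in a fully unlabeled setting."""
    return [None] * len(labels)

def rebalance_to_ratio(labels, minority=1, ratio=10, seed=0):
    """Return node indices with a minority:majority label ratio of 1:ratio.

    The majority class is undersampled when it is too large and oversampled
    (drawn with replacement) when it is too small. Unlabeled nodes are ignored.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels)
                    if y is not None and y != minority]
    if not minority_idx or not majority_idx:
        raise ValueError("both classes must be present")
    target = ratio * len(minority_idx)
    if len(majority_idx) >= target:
        chosen = rng.sample(majority_idx, target)        # undersampling
    else:
        extra = rng.choices(majority_idx, k=target - len(majority_idx))
        chosen = majority_idx + extra                    # oversampling
    return sorted(minority_idx + chosen)

# Toy labels: 1 = fraudulent/illicit, 0 = normal/licit, None = unlabeled.
toy_labels = [1] * 5 + [0] * 200 + [None] * 20
balanced = rebalance_to_ratio(toy_labels)
few_shot = select_few_shot(toy_labels, k=3)
unlabeled = strip_labels(toy_labels)
```

The functions return node indices rather than copies of the graph, so the same selection can be applied to node features, labels, and edge lists in whichever graph library is used to load the files.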
Provided by:
Shanghai University
Collected summary
Dataset introduction
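Background and challenges
Background overview
The resource contains three graph-structured data files, targeting anomaly and fraud detection under few-shot, zero-shot, and imbalanced-sample conditions respectively: the Amazon-Fraud dataset serves financial risk-control research, the YelpCHI dataset has its labels removed to model an unlabeled environment, and the Elliptic dataset has its label ratio adjusted to address class imbalance. The data support node classification, fraud detection, and anomaly analysis, and are suited to evaluating the corresponding detection models.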
The summary above was collected and generated by 遇见数据集.