Elliptic
Source: Alibaba Cloud Tianchi (阿里云天池). Updated 2026-05-16; indexed 2024-03-07.
Download link:
https://tianchi.aliyun.com/dataset/110892
Description:
The Elliptic dataset maps Bitcoin transactions to real entities belonging to licit categories (exchanges, wallet providers, miners, licit services, etc.) or illicit ones (scams, malware, terrorist organizations, ransomware, Ponzi schemes, etc.). The task is to classify the nodes of the graph as illicit or licit.
This anonymized dataset is a transaction graph collected from the Bitcoin blockchain. A node in the graph represents a single Bitcoin transaction, and an edge represents a flow of Bitcoin from one transaction to another. Each node has 166 features and is labeled according to the kind of entity that created it: licit, illicit, or unknown.
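As a rough illustration of this structure, the sketch below stores a transaction graph as forward and backward adjacency lists alongside a label map. The transaction ids, labels, and the `build_graph` helper are hypothetical, and treating each edge as a directed `(source, destination)` pair is an assumption rather than something the description states.

```python
from collections import defaultdict


def build_graph(edges):
    """Build forward and backward adjacency lists from (src, dst) pairs."""
    successors = defaultdict(list)    # tx -> transactions its Bitcoin flows into
    predecessors = defaultdict(list)  # tx -> transactions whose Bitcoin flows into it
    for src, dst in edges:
        successors[src].append(dst)
        predecessors[dst].append(src)
    return successors, predecessors


# Toy data: hypothetical transaction ids and labels, not taken from the dataset.
edges = [("tx_a", "tx_b"), ("tx_a", "tx_c"), ("tx_b", "tx_c")]
labels = {"tx_a": "licit", "tx_b": "unknown", "tx_c": "illicit"}

successors, predecessors = build_graph(edges)

# Only nodes with a known class can be used as supervised training targets.
labeled = {tx: y for tx, y in labels.items() if y != "unknown"}
```

Keeping both directions is convenient because the aggregated features described later refer to neighbors one hop backward and one hop forward.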
### Nodes and Edges
The graph comprises 203,769 nodes and 234,355 edges. About 2% of the nodes (4,545) are labeled class 1 (illicit) and about 21% (42,019) are labeled class 2 (licit). The remaining 157,205 nodes (about 77%) carry no licit or illicit label.
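The class proportions follow directly from the counts quoted above. The short sketch below reproduces the arithmetic and also reports the illicit share among labeled nodes only, which is often the more relevant figure for supervised training. The `percent` helper is an illustrative name.

```python
TOTAL_NODES = 203_769
ILLICIT = 4_545    # class 1
LICIT = 42_019     # class 2

unknown = TOTAL_NODES - ILLICIT - LICIT  # nodes with no licit/illicit label
labeled = ILLICIT + LICIT                # nodes usable as supervised targets


def percent(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


print(f"unknown nodes: {unknown} ({percent(unknown, TOTAL_NODES):.1f}% of all nodes)")
print(f"illicit share of all nodes: {percent(ILLICIT, TOTAL_NODES):.1f}%")
print(f"illicit share of labeled nodes: {percent(ILLICIT, labeled):.1f}%")
```

Illicit transactions make up about 2% of all nodes but close to 10% of the labeled ones, so the classification task is imbalanced either way.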
### Features
Each node has 166 features. Due to intellectual property restrictions, we are unable to provide an exact description of every feature in the dataset. Each node is associated with a time step, a measure of when the transaction was broadcast to the Bitcoin network. Time steps run from 1 to 49 and are evenly spaced at intervals of about two weeks. Each time step contains a single connected component of transactions that appeared on the blockchain within less than three hours of one another; no edges connect nodes in different time steps.
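Because no edge crosses a time-step boundary, the graph can be partitioned by time without cutting any edges. The sketch below shows one way to do a temporal train/test split and to check the no-crossing property. The helper names, the toy ids, and the cutoff value are illustrative assumptions, not part of the dataset description.

```python
def split_by_time_step(time_steps, last_train_step):
    """Return (train_indices, test_indices), splitting on the time step."""
    train, test = [], []
    for idx, step in enumerate(time_steps):
        (train if step <= last_train_step else test).append(idx)
    return train, test


def cross_step_edges(edges, step_of):
    """Return the edges whose endpoints lie in different time steps."""
    return [(src, dst) for src, dst in edges if step_of[src] != step_of[dst]]


# Toy data: hypothetical transaction ids spread over two time steps.
step_of = {"tx_a": 1, "tx_b": 1, "tx_c": 2, "tx_d": 2}
edges = [("tx_a", "tx_b"), ("tx_c", "tx_d")]

node_ids = list(step_of)
time_steps = [step_of[node] for node in node_ids]

# The cutoff is an arbitrary illustrative choice; on the full data it would be
# some value between 1 and 49.
train_idx, test_idx = split_by_time_step(time_steps, last_train_step=1)

# According to the description, this list should be empty for the real graph.
violations = cross_step_edges(edges, step_of)
```

Splitting on the time step rather than at random keeps later transactions out of the training set, which is closer to how such a classifier would be used in practice.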
The first 94 features encode local transactional information, including the aforementioned time step, the number of inputs and outputs, transaction fees, output volumes, and aggregate metrics such as the average BTC amount received (spent) by inputs or outputs, and the average number of incoming (outgoing) transactions linked to each input or output. The remaining 72 features are aggregate features, which are computed by aggregating transactional information over one hop backward or forward from a central node, providing the same set of metrics (number of inputs/outputs, transaction fees, etc.) as the local features.
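The 94/72 layout can be expressed as a simple slice of each node's feature vector. The sketch below assumes the 166 features are stored in the order described, with the time step as the very first local feature. That position, the `split_features` helper, and the synthetic row are assumptions for illustration.

```python
N_FEATURES = 166
N_LOCAL = 94                          # local features, including the time step
N_AGGREGATED = N_FEATURES - N_LOCAL   # 72 one-hop aggregated features


def split_features(row):
    """Split one node's feature vector into (local, aggregated) parts."""
    if len(row) != N_FEATURES:
        raise ValueError(f"expected {N_FEATURES} features, got {len(row)}")
    return row[:N_LOCAL], row[N_LOCAL:]


# Synthetic row: feature i simply holds the value i, so slices are easy to check.
row = [float(i) for i in range(N_FEATURES)]
local, aggregated = split_features(row)

# Assumption: the time step is the first local feature.
time_step = local[0]
```

Separating the two groups makes it easy to compare a model trained on local features alone with one that also sees the neighborhood aggregates.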
Provider:
Alibaba Cloud Tianchi (阿里云天池)
Created:
2021-09-24
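Summary
Background overview
The Elliptic dataset is a transaction graph built from the Bitcoin blockchain and is used to classify transactions as created by licit or illicit entities. It contains 203,769 nodes and 234,355 edges. Each node has 166 features, including a time step and transaction information, and carries one of three labels: licit, illicit, or unknown. The dataset is suited to graph neural network and fraud detection tasks and is intended to help identify illicit activity in cryptocurrencies.
The summary above was collected and generated by 遇见数据集.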



