alimamaTech/TRACE
Hugging Face · Updated 2026-01-30 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/alimamaTech/TRACE
Resource description:
---
license: apache-2.0
task_categories:
- tabular-classification
size_categories:
- 100M<n<1B
---
We introduce TRACE, the first benchmark dataset for post-click GMV prediction with delayed feedback.
📁 Data structure includes:
1. **Features**:
* **User/Item/Contextual Attributes**: 22 such attributes are provided (e.g., `feature_0` to `feature_21`).
* **Click Timestamp**: `click_time`.
2. **Labels and Sequence Information**:
* **Gross Merchandise Volume (GMV) Sequence**: The sequence of transaction amounts for all purchases associated with a click. This is represented by `dirpay_amt` for the current purchase and `prev_dirpay_amt` for previous purchases.
* **Purchase Count Sequence**: The sequence indicating the order of each purchase within the attribution window for a given click. `count` represents the current purchase's order, and `total_counts` indicates the final number of purchases.
* **Purchase Timestamp Sequence**: The sequence of timestamps for each purchase. `pay_time` typically represents the timestamp of the current purchase, and `prev_pay_time` contains timestamps of previous purchases.
* **Repurchase Indicator (`multi_tag`)**: A binary label indicating whether the GMV generated by a click comes from a single purchase (0) or from repeated purchases (1). It is derived from the total number of purchases.
* **Final Ground-Truth GMV Label**: The total GMV accumulated by the end of the attribution window, computed as the sum of `dirpay_amt` over all purchases attributed to the click.
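
The relationship between the fields above can be illustrated with a minimal Python sketch. It rests on several assumptions that the card does not confirm: that there is one row per purchase, that `prev_dirpay_amt` and `prev_pay_time` are stored as lists, that `count` is 1-based, and that timestamps are plain integers. The helper names (`make_purchase_rows`, `gmv_so_far`, `final_gmv`, `observed_gmv`) are hypothetical and not part of the dataset, and the 22 `feature_*` columns are omitted for brevity.

```python
from typing import Dict, List


def make_purchase_rows(click_time: int, amounts: List[float],
                       pay_times: List[int]) -> List[Dict]:
    """Build one hypothetical row per purchase attributed to a single click."""
    total = len(amounts)
    rows = []
    for i, (amt, ts) in enumerate(zip(amounts, pay_times)):
        rows.append({
            "click_time": click_time,
            "dirpay_amt": amt,                 # amount of the current purchase
            "prev_dirpay_amt": amounts[:i],    # earlier amounts (assumed list layout)
            "pay_time": ts,                    # timestamp of the current purchase
            "prev_pay_time": pay_times[:i],    # earlier timestamps (assumed list layout)
            "count": i + 1,                    # order of this purchase (assumed 1-based)
            "total_counts": total,             # final number of purchases
            "multi_tag": 1 if total > 1 else 0,  # repurchase indicator
        })
    return rows


def gmv_so_far(row: Dict) -> float:
    """GMV accumulated up to and including the current purchase."""
    return sum(row["prev_dirpay_amt"]) + row["dirpay_amt"]


def final_gmv(rows: List[Dict]) -> float:
    """Final ground-truth GMV: sum of dirpay_amt over all purchases of a click."""
    return sum(r["dirpay_amt"] for r in rows)


def observed_gmv(rows: List[Dict], observe_time: int) -> float:
    """GMV visible at observe_time; later purchases are still delayed feedback."""
    return sum(r["dirpay_amt"] for r in rows if r["pay_time"] <= observe_time)


# A click at t=1000 followed by three purchases.
rows = make_purchase_rows(1000, [30.0, 12.5, 7.5], [1100, 1500, 2400])
single = make_purchase_rows(1000, [20.0], [1200])

print(gmv_so_far(rows[1]))       # 42.5
print(observed_gmv(rows, 1600))  # 42.5
print(final_gmv(rows))           # 50.0
print(rows[0]["multi_tag"], single[0]["multi_tag"])  # 1 0
```

In this toy example the GMV observed at time 1600 (42.5) is lower than the final label (50.0), which is the gap a delayed-feedback model has to account for.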
Provider:
alimamaTech



