
alimamaTech/TRACE

Source: Hugging Face. Updated 2026-01-30. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/alimamaTech/TRACE
Description:
---
license: apache-2.0
task_categories:
- tabular-classification
size_categories:
- 100M<n<1B
---

We introduce TRACE, the first benchmark dataset for post-click GMV prediction with delayed feedback.

📁 Data structure includes:

1. **Features**:
    * **User/Item/Contextual Attributes**: 22 such attributes are provided (e.g., `feature_0` to `feature_21`).
    * **Click Timestamp**: `click_time`.
2. **Labels and Sequence Information**:
    * **Gross Merchandise Volume (GMV) Sequence**: The sequence of transaction amounts for all purchases associated with a click. This is represented by `dirpay_amt` for the current purchase and `prev_dirpay_amt` for previous purchases.
    * **Purchase Count Sequence**: The sequence indicating the order of each purchase within the attribution window for a given click. `count` represents the current purchase's order, and `total_counts` indicates the final number of purchases.
    * **Purchase Timestamp Sequence**: The sequence of timestamps for each purchase. `pay_time` typically represents the timestamp of the current purchase, and `prev_pay_time` contains timestamps of previous purchases.
    * **Repurchase Indicator (`multi_tag`)**: A binary label (0 or 1) indicating whether the GMV generated by a click is from a single purchase (0) or a repurchase (1). This is derived from the number of purchases.
    * **Final Ground-Truth GMV Label**: The total GMV accumulated by the end of the attribution window. This is calculated as the sum of `dirpay_amt`.
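
The relationship between the per-purchase fields and the per-click labels described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the dataset authors' pipeline: it assumes one row per purchase, a hypothetical `click_id` key that groups the purchases of a click (the card does not name such a column), numeric `dirpay_amt` values, and that `multi_tag` is 1 whenever a click has more than one purchase.

```python
from collections import defaultdict


def final_labels(rows, key="click_id"):
    """Aggregate per-purchase rows into per-click labels.

    `key` is a hypothetical grouping column; the dataset card does not
    specify how purchases are linked back to a click.
    """
    gmv = defaultdict(float)   # running sum of dirpay_amt per click
    n = defaultdict(int)       # number of purchases per click
    for row in rows:
        gmv[row[key]] += row["dirpay_amt"]
        n[row[key]] += 1
    return {
        k: {
            "final_gmv": gmv[k],          # sum of dirpay_amt within the window
            "total_counts": n[k],         # final number of purchases
            "multi_tag": int(n[k] > 1),   # assumed rule: 1 means repurchase
        }
        for k in gmv
    }


# Toy rows (made-up values): click "a" has two purchases, click "b" has one.
rows = [
    {"click_id": "a", "count": 1, "dirpay_amt": 10.0},
    {"click_id": "a", "count": 2, "dirpay_amt": 5.5},
    {"click_id": "b", "count": 1, "dirpay_amt": 20.0},
]
labels = final_labels(rows)
print(labels)
```

In this toy input, click `a` accumulates both purchase amounts and is tagged as a repurchase, while click `b` stays a single-purchase click. The real files may already carry `total_counts` and `multi_tag` as columns, in which case a function like this is only useful as a consistency check.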

Provider:
alimamaTech