AE-110k
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/cubenlp/acl19_scaling_up_open_tagging/blob/master/publish_data.txt
Description:
This dataset is drawn from the Sports & Entertainment category of Alibaba's AliExpress platform and consists of tuples of product title, attribute, and corresponding value. Instances containing null values were removed during preprocessing to ensure data quality, leaving 39,505 products with 2,045 unique attributes and 10,977 unique attribute values. The target task is product attribute and value identification (PAVI).
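The (title, attribute, value) tuple structure described above can be illustrated with a small sketch that tallies products, unique attributes, and unique values. This is only a sketch under stated assumptions: the listing does not specify the file layout, so the one-tuple-per-line format, the `ASSUMED_DELIMITER` value, the helper name `summarize`, and the choice to count products by distinct title are all hypothetical and should be checked against the actual download.

```python
# Assumption: one (title, attribute, value) tuple per line.
# The field separator is NOT stated in this listing; "\u0001" is a placeholder
# that must be verified against the real file before use.
ASSUMED_DELIMITER = "\u0001"


def summarize(lines, delimiter=ASSUMED_DELIMITER):
    """Count distinct titles, attributes, and values in an iterable of lines.

    Lines that do not split into exactly three fields, or that contain an
    empty field, are skipped (mirroring the null filtering described above).
    """
    titles, attributes, values = set(), set(), set()
    for line in lines:
        parts = line.rstrip("\n").split(delimiter)
        if len(parts) != 3:
            continue  # malformed line
        title, attribute, value = (p.strip() for p in parts)
        if not title or not attribute or not value:
            continue  # treat empty fields as null and drop the instance
        titles.add(title)
        attributes.add(attribute)
        values.add(value)
    return {
        # Assumption: each distinct title corresponds to one product.
        "products": len(titles),
        "attributes": len(attributes),
        "values": len(values),
    }


if __name__ == "__main__":
    # Tiny made-up sample, not real data from the dataset.
    sample = [
        "Title A\u0001Brand\u0001Nike\n",
        "Title A\u0001Color\u0001Red\n",
        "Title B\u0001Brand\u0001Nike\n",
    ]
    print(summarize(sample))
```

To run it on the real file, pass an open file handle (for example `summarize(open("publish_data.txt", encoding="utf-8"))`) once the delimiter has been confirmed; the resulting counts may differ from the figures quoted above if the published file is organized differently from what this sketch assumes.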
Provider:
AliExpress



