
AE-110k

Indexed from arXiv on 2025-09-30
Download link:
https://github.com/cubenlp/acl19_scaling_up_open_tagging/blob/master/publish_data.txt
Description:
This dataset is sourced from the Sports & Entertainment category of Alibaba's AliExpress platform and contains tuples of product title, attribute, and corresponding value. To ensure data quality, instances containing null values were removed during preprocessing, leaving 39,505 products with 2,045 unique attributes and 10,977 unique attribute values. The target task is product attribute and value identification (PAVI).
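The preprocessing described above (dropping tuples with null values, then counting unique attributes and values) can be sketched roughly as follows. This is a minimal illustration and not the dataset authors' script: the tab separator, the set of tokens treated as null, the use of distinct titles as a proxy for products, and the names `Record`, `parse_records`, and `summarize` are all assumptions, and the sample lines are invented for the demo. Check the actual delimiter in `publish_data.txt` before applying it to the real file.

```python
from typing import Dict, Iterable, List, NamedTuple


class Record(NamedTuple):
    """One (title, attribute, value) tuple."""
    title: str
    attribute: str
    value: str


# Assumption: which tokens count as "null" is not stated in the description.
NULL_TOKENS = {"", "null", "none"}


def parse_records(lines: Iterable[str], sep: str = "\t") -> List[Record]:
    """Parse delimited lines and drop tuples that contain a null field.

    `sep` is an assumed delimiter; pass the one the real file uses.
    """
    records = []
    for line in lines:
        parts = [p.strip() for p in line.rstrip("\n").split(sep)]
        if len(parts) != 3:
            continue  # skip malformed lines
        if any(p.lower() in NULL_TOKENS for p in parts):
            continue  # skip tuples with a null field
        records.append(Record(*parts))
    return records


def summarize(records: Iterable[Record]) -> Dict[str, int]:
    """Count distinct titles (proxy for products), attributes, and values."""
    records = list(records)
    return {
        "products": len({r.title for r in records}),
        "attributes": len({r.attribute for r in records}),
        "values": len({r.value for r in records}),
    }


# Invented sample lines, for illustration only.
sample = [
    "Carbon Fishing Rod 2.1m\tMaterial\tCarbon",
    "Carbon Fishing Rod 2.1m\tLength\t2.1m",
    "Yoga Mat 6mm Non-slip\tThickness\t6mm",
    "Yoga Mat 6mm Non-slip\tColor\tNULL",
]
records = parse_records(sample)
stats = summarize(records)
print(stats)
```

On the real file, the same two functions would be called on the opened file object, for example `parse_records(open(path, encoding="utf-8"), sep=...)`, with the separator set to whatever the file actually uses.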
Provider:
AliExpress