
DMASTE

Source: arXiv. Updated 2023-05-27. Indexed 2024-06-21.
Download link:
https://github.com/NJUNLP/DMASTE
Resource description:
The DMASTE dataset was created by the State Key Laboratory for Novel Software Technology at Nanjing University for the Aspect Sentiment Triplet Extraction (ASTE) task. It contains 7,524 manually annotated reviews spanning eight domains, including electronics, fashion, beauty, and home, and is intended to reflect real-world scenarios. Its key characteristics are: varied review lengths, from 1 to 250 words; varied expression, with a wide range of vocabulary and syntactic structures; varied aspect types, both explicit and implicit; and broad domain coverage, which supports research on single-source and multi-source domain adaptation. To build the dataset, the authors selected the four most popular domains from the Amazon dataset and randomly sampled reviews for annotation, to keep the data diverse and realistic. DMASTE is meant to capture the complexity and challenges of ASTE and to give researchers a benchmark closer to real-world data.
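To make the task concrete, here is a minimal Python sketch of what an ASTE triplet (aspect, opinion, sentiment) could look like, including an implicit aspect. It is an illustration only: the `Triplet` class, the `explicit_triplets` helper, the three-way sentiment labels, and the sample review are all assumptions made for this example and do not describe the actual file format or annotation scheme in the DMASTE repository.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Triplet:
    """Hypothetical in-memory form of one ASTE annotation."""
    aspect: Optional[str]  # None marks an implicit aspect (not stated in the text)
    opinion: str           # opinion expression copied from the review
    sentiment: str         # assumed label set: "positive", "negative", "neutral"


def explicit_triplets(triplets: List[Triplet]) -> List[Triplet]:
    """Keep only the triplets whose aspect is stated in the review."""
    return [t for t in triplets if t.aspect is not None]


# Invented review, not taken from DMASTE.
review = "The battery life is great, but it's too heavy."
triplets = [
    Triplet("battery life", "great", "positive"),
    Triplet(None, "too heavy", "negative"),  # implicit aspect: the product's weight
]

for t in triplets:
    print(t.aspect or "<implicit>", "|", t.opinion, "|", t.sentiment)
```

Representing an implicit aspect as `None` is just one possible convention; check the repository linked above for the format the dataset actually uses.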
Provider:
State Key Laboratory for Novel Software Technology, Nanjing University
Created:
2023-05-27