Automatically Annotated Dataset
Source: arXiv | Updated: 2017-09-15 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/1709.05094v1
Description:
This automatically annotated dataset was created by the Artificial Intelligence and Machine Learning Group at Swisscom for research on unsupervised aspect term extraction. It contains approximately 3,000 sentences extracted from English reviews, drawn mainly from publicly available Amazon and Yelp review data. The raw opinion texts were labeled by an automated, unsupervised procedure rather than by human annotators, and the labels are encoded in IOB format, with the labeling designed to favor high precision. The dataset is intended mainly for Aspect-Based Sentiment Analysis (ABSA), where it addresses two limitations of traditional supervised datasets: their small size and the high cost of manual annotation.
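To illustrate what IOB-encoded aspect terms look like in practice, the following is a minimal Python sketch that recovers aspect terms from a tagged sentence. The tag names (`B`, `I`, `O`), the function name `extract_aspect_terms`, and the sample sentence are assumptions made for illustration; they are not taken from the dataset files, whose exact layout may differ.

```python
def extract_aspect_terms(tokens, tags):
    """Collect aspect terms from parallel lists of tokens and IOB tags.

    Assumed tag scheme (hypothetical, for illustration only):
      "B" = first token of an aspect term
      "I" = continuation of the current aspect term
      "O" = token outside any aspect term
    """
    terms, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new term starts; flush the one in progress, if any.
            if current:
                terms.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            # Extend the term that is currently open.
            current.append(token)
        else:
            # "O" (or a stray "I" with no open term) closes any open term.
            if current:
                terms.append(" ".join(current))
            current = []
    if current:
        terms.append(" ".join(current))
    return terms


# Illustrative sentence, not an actual record from the dataset.
tokens = ["The", "battery", "life", "is", "great", "but", "the", "screen", "is", "dim"]
tags = ["O", "B", "I", "O", "O", "O", "O", "B", "O", "O"]
print(extract_aspect_terms(tokens, tags))  # ['battery life', 'screen']
```

Treating a stray `I` tag as `O` is one possible design choice; a stricter reader could raise an error instead.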
Provided by:
Artificial Intelligence and Machine Learning Group, Swisscom
Created:
2017-09-15



