
CroSentiNews 2.0

Source: arXiv. Updated: 2023-05-14. Indexed: 2024-08-06.
Download link:
http://arxiv.org/abs/2305.08187v1
Description:
CroSentiNews 2.0 is a sentence-level sentiment dataset for the Croatian news domain, created by the School of Humanities and Social Sciences. It contains 14.5K sentences annotated with 5 class labels for sentiment analysis. The text comes from the website of 24sata, a major Croatian media organization, and covers topics such as automotive news, health, cooking, and lifestyle advice. Sentences were first pre-annotated with an existing sentiment classifier and then labeled by undergraduate students who are native speakers of Croatian. The dataset is intended mainly for developing sentiment classifiers for low-resource languages such as Croatian.
Provider:
School of Humanities and Social Sciences
Created:
2023-05-14