tartuNLP/Estonian_Subjectivity
Source: Hugging Face. Updated: 2026-01-14. Indexed: 2026-01-03.
Download link:
https://hf-mirror.com/datasets/tartuNLP/Estonian_Subjectivity
Description:
The Estonian Subjectivity Dataset is grounded in a theoretical approach and consists of 1000 texts randomly selected from the Estonian National Corpus (2023): 300 journalistic texts (150 news articles and 150 opinion pieces) from the Feeds subcorpus and 700 web texts from the full corpus. Four annotators rated the subjectivity of each text on a sliding scale from 0 (objective) to 100 (subjective) and recorded their confidence in each rating. In addition, 250 texts were selected for reannotation to verify annotation consistency. The dataset columns cover text ID, text content, category, mean human score, individual annotator scores and certainties, GPT scores and explanations, and text length information.
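The relationship between the individual annotator scores and the mean human score can be illustrated with a minimal sketch. It assumes the mean human score is the arithmetic mean of the four annotator ratings, which the description suggests but does not state outright. The column names (`text_id`, `annotator_1` to `annotator_4`) and the two rows are hypothetical, invented for illustration, and are not taken from the dataset.

```python
from statistics import mean

# Hypothetical column names; the actual names in the dataset may differ.
ANNOTATOR_COLUMNS = ["annotator_1", "annotator_2", "annotator_3", "annotator_4"]

# Invented rows for illustration only; these are not values from the dataset.
toy_rows = [
    {"text_id": "a", "category": "news",
     "annotator_1": 10, "annotator_2": 20, "annotator_3": 0, "annotator_4": 30},
    {"text_id": "b", "category": "opinion",
     "annotator_1": 80, "annotator_2": 90, "annotator_3": 70, "annotator_4": 100},
]


def mean_human_score(row):
    """Average the four ratings, each on the 0 (objective) to 100 (subjective) scale."""
    scores = [row[col] for col in ANNOTATOR_COLUMNS]
    if any(not 0 <= s <= 100 for s in scores):
        raise ValueError("subjectivity scores must lie in [0, 100]")
    return mean(scores)


# Map each text ID to its mean score across the four annotators.
scores_by_id = {row["text_id"]: mean_human_score(row) for row in toy_rows}
```

With the real data, the same function could be applied to each row once the actual column names are substituted in.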
Provider:
tartuNLP



