dsfsi/setswana-sentiment
Source: Hugging Face. Updated 2026-04-23; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/dsfsi/setswana-sentiment
Description:
DSFSI Setswana Sentiment is a sentiment analysis dataset of 3,555 Setswana (ISO 639-3: `tsn`) tweets, each annotated by three native-speaker annotators. It is released with full per-annotation timestamps, language-identification metadata, and per-annotator labels, so that it supports both downstream modelling and annotation-quality research. The dataset provides training (2,762 examples), validation (346 examples), and test (346 examples) splits, as well as a full configuration containing all 3,555 examples. The labels are Positive, Negative, Neutral, Mixed, and Indeterminate. The dataset is intended for training and evaluating sentiment classifiers for Setswana and related Bantu languages, and for research on annotator disagreement and annotation-quality monitoring.
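Because each tweet carries three per-annotator labels, a common first step in disagreement research is to collapse them into a single label and flag the tweets where annotators differed. The following is a minimal sketch of majority voting, not the dataset authors' method. The function name `majority_label` and the example rows are hypothetical, and the actual column names and label encoding in the dataset are not stated here, so they would need to be checked against the dataset card.

```python
from collections import Counter

# Label set as listed in the dataset description.
LABELS = {"Positive", "Negative", "Neutral", "Mixed", "Indeterminate"}


def majority_label(annotations):
    """Return (label, unanimous) for one tweet.

    `label` is the label chosen by more annotators than any other,
    or None when the top count is tied. `unanimous` is True only
    when every annotator chose the same label.
    """
    if not annotations:
        raise ValueError("at least one annotation is required")
    unknown = set(annotations) - LABELS
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")

    counts = Counter(annotations).most_common()
    top_label, top_count = counts[0]
    # A tie for first place means there is no majority label.
    if len(counts) > 1 and counts[1][1] == top_count:
        return None, False
    return top_label, top_count == len(annotations)


# Hypothetical rows: three annotator labels per tweet (not real data).
rows = [
    ["Positive", "Positive", "Positive"],
    ["Negative", "Neutral", "Negative"],
    ["Positive", "Negative", "Mixed"],
]
results = [majority_label(row) for row in rows]
```

Returning `None` on a three-way split, rather than picking a label arbitrarily, keeps fully contested tweets visible for separate analysis.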
Provider:
dsfsi



