Korean HateSpeech Dataset
OpenDataLab · Updated 2026-05-24 · Indexed 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/Korean_HateSpeech_Dataset
Description:
The dataset contains a total of 9,381 human-labeled comments, split into 7,896 training samples, 471 validation samples, and 974 test samples. (To allow fair comparison of prediction models, the test set labels are not released. Models can be evaluated through Kaggle submissions, as described later in this document.) Because hate speech is closely related to bias, each comment is annotated on two aspects: the presence of social bias and the presence of hate speech.
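As a quick sanity check on the split sizes quoted above, the following sketch sums the three splits and prints each split's share. The variable names are illustrative and are not part of the dataset's tooling. Note that the three quoted splits sum to 9,341, which is 40 fewer than the stated total of 9,381; the listing does not explain the difference.

```python
# Split sizes and total exactly as quoted in the description above.
splits = {"train": 7896, "validation": 471, "test": 974}
stated_total = 9381

# Sum of the quoted splits and each split's share of that sum.
split_sum = sum(splits.values())
shares = {name: size / split_sum for name, size in splits.items()}

for name, size in splits.items():
    print(f"{name:<10} {size:>5}  {shares[name]:.1%}")

print(f"sum of splits: {split_sum}")
print(f"stated total:  {stated_total}")
print(f"difference:    {stated_total - split_sum}")
```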
Provider:
OpenDataLab
Created:
2022-05-23
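Summary
Dataset Introduction

Background and Challenges
Background Overview
The Korean HateSpeech Dataset is a Korean-language hate speech dataset released by Seoul National University in 2020. It contains 9,381 human-labeled comments, split into training, validation, and test sets. Each comment is annotated on two aspects, social bias and hate speech, to support research on prediction models for both.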



