BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection

Name: BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection
Creator: Seoul National University
Published: 2020-05-26 11:34:01
License: No description available

Source: arXiv. Updated: 2020-05-26. Indexed: 2024-06-21.

Download link:

https://github.com/kocohub/korean-hate-speech

Overview:

BEEP! (Korean Corpus of Online News Comments for Toxic Speech Detection) is the first Korean dataset for toxic speech detection, created by the Graduate School of Data Science at Seoul National University. The corpus contains 9.4K comments collected from Korean online entertainment news platforms, each annotated for two aspects: social bias and hate speech. It was built to address social problems that stem from online anonymity, in particular bias and hate speech directed at public figures. By providing benchmark models and hosting a competition on Kaggle, the corpus aims to advance research on toxic speech detection and help reduce cyberbullying.

Provider:

Seoul National University

Created:

2020-05-26
