Social Media Implicit Hate Speech Dataset

Name: Social Media Implicit Hate Speech Dataset
Creator: Science Data Bank
Published: 2026-01-14 00:48:20
License: No description available

DataCite Commons. Updated: 2026-01-14. Indexed: 2026-05-05.

Download link:

https://www.scidb.cn/detail?dataSetId=28c5672acb0f4db7abc34a4fa7d9d956

Description:

This dataset is designed to evaluate a model's ability to detect disguised hate speech. It is derived from the TOXICLOAKCN dataset, which is itself constructed by applying perturbations (homophone substitution, emoji replacement, and Pinyin augmentation) to original sentences from datasets such as TOXICN. These perturbations simulate common strategies that social media users employ to evade content moderation. In this work, we randomly sampled 1,000 instances from TOXICLOAKCN to assess how well large language models generalize to identifying implicit hate speech.
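
As a rough illustration of the kind of character-level perturbation and random sampling described above, the following Python sketch may help. It is not the procedure used to build TOXICLOAKCN or this dataset: the lookup tables, the function names `perturb` and `sample_instances`, and the seed value are all hypothetical, and the toy entries are deliberately benign.

```python
import random

# Hypothetical toy lookup tables. The actual substitution lexicons are not
# given in this description; these entries are illustrative only.
HOMOPHONES = {"你": "尼", "好": "郝"}   # similar-sounding characters
EMOJI = {"狗": "🐶", "猪": "🐷"}        # character replaced by an emoji
PINYIN = {"笨": "ben", "蛋": "dan"}     # character replaced by its Pinyin


def perturb(text, table):
    """Replace every character that has an entry in `table`; keep the rest."""
    return "".join(table.get(ch, ch) for ch in text)


def sample_instances(instances, k=1000, seed=0):
    """Draw k instances uniformly without replacement.

    The seed is an assumption made for reproducibility; the description
    does not say which seed, if any, was used.
    """
    rng = random.Random(seed)
    return rng.sample(instances, k)
```

For example, `perturb("你好", HOMOPHONES)` swaps both characters for their similar-sounding counterparts, and `sample_instances(pool)` returns 1,000 distinct items from a list `pool` that holds at least that many.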

Provider:

Science Data Bank

Created:

2026-01-14
