Hate Speech and Offensive Language

Name: Hate Speech and Offensive Language
Creator: Davidson et al.
License: Not specified

Source: arXiv (indexed 2025-09-30)

Download link:

https://github.com/t-davidson/hate-speech-and-offensive-language

Description:

This dataset is a collection of tweets labeled for hate speech detection. The hateful content covers racism, sexism, homophobia, and other offensive language. The class distribution is highly imbalanced: of the 24,783 samples in total, 1,430 are hate speech and 23,353 are non-hate speech, a ratio of roughly 1:16.
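The class counts quoted above can be checked with a little arithmetic, and the same counts suggest how much extra weight the minority class might need during training. The following is a minimal sketch, not the dataset authors' method: the label names `hate` and `non_hate` and both helper functions are hypothetical, and the inverse-frequency weighting shown is one common heuristic for imbalanced classification rather than anything prescribed by this page.

```python
from typing import Dict

# Class counts as stated in the dataset description.
# The label names are illustrative assumptions, not the dataset's own column values.
counts: Dict[str, int] = {"hate": 1_430, "non_hate": 23_353}


def imbalance_ratio(minority: int, majority: int) -> float:
    """Return how many majority samples exist per minority sample."""
    return majority / minority


def inverse_frequency_weights(class_counts: Dict[str, int]) -> Dict[str, float]:
    """Weight each class by total / (n_classes * class_count).

    With these weights every class contributes the same total weight,
    so a weighted loss is not dominated by the majority class.
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: total / (n_classes * n) for label, n in class_counts.items()}


total = sum(counts.values())
ratio = imbalance_ratio(counts["hate"], counts["non_hate"])
weights = inverse_frequency_weights(counts)

print(f"total samples: {total}")
print(f"non-hate samples per hate sample: {ratio:.1f}")
for label, weight in weights.items():
    print(f"weight[{label}] = {weight:.3f}")
```

The resulting weights could be passed to a classifier's per-class weight option where one exists. Whether reweighting, resampling, or threshold tuning works best for this data is an empirical question that this sketch does not settle.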

Provided by:

Davidson et al.
