
Hate Speech and Offensive Language

Source: arXiv; indexed 2025-09-30
Download link:
https://github.com/t-davidson/hate-speech-and-offensive-language
Description:
This dataset is a collection of tweets, some of which contain hate speech, including racism, sexism, homophobia, and offensive language. The class distribution is highly imbalanced: of 24,783 samples in total, 1,430 are hate speech and 23,353 are non-hate speech, a ratio of roughly 1:16. The associated task is hate speech detection.
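Because the imbalance affects how a detector should be trained and evaluated, it can help to make the arithmetic explicit. The following is a minimal sketch, not part of the dataset's own tooling: it takes the two counts stated in the description and derives the imbalance ratio, the hate speech share, and a pair of inverse-frequency class weights. The function name `balanced_class_weights` and the labels `hate` and `non_hate` are hypothetical, and the weighting formula `total / (n_classes * class_count)` is one common heuristic, assumed here for illustration only.

```python
# Sample counts as stated in the dataset description.
HATE = 1_430
NON_HATE = 23_353
TOTAL = HATE + NON_HATE


def balanced_class_weights(counts):
    """Return inverse-frequency weights: total / (n_classes * class_count).

    This is a common heuristic for imbalanced classification; it is an
    assumption of this sketch, not something the dataset prescribes.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Hypothetical label names chosen for this example.
counts = {"hate": HATE, "non_hate": NON_HATE}
weights = balanced_class_weights(counts)

# Number of non-hate samples per hate speech sample.
imbalance_ratio = NON_HATE / HATE
# Fraction of all samples that are hate speech.
hate_share = HATE / TOTAL

print(f"total samples:   {TOTAL}")
print(f"imbalance ratio: 1:{imbalance_ratio:.1f}")
print(f"hate share:      {hate_share:.2%}")
for label, weight in weights.items():
    print(f"weight[{label}] = {weight:.3f}")
```

With weights of this kind, each class contributes equally to a weighted loss in aggregate, which is one way to keep a classifier from defaulting to the majority class. For the same reason, plain accuracy is a weak metric here, and per-class precision and recall are usually more informative.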
Provided by:
Davidson et al.