
Zeerak Waseem (ZW)

Indexed from arXiv on 2025-09-30
Download link:
https://github.com/ZeerakW/hatespeech
Description:
The ZW dataset contains 16,907 annotated instances in three classes: racism, sexism, and neither. The data was collected by manually searching for slurs and derogatory terms commonly directed at religious, sexual, gender, and ethnic minorities. The authors annotated the data, and external evaluators reviewed the annotations to reduce annotation bias. The dataset is intended for hate speech detection.
Provided by:
Zeerak Waseem & Dirk Hovy