Zeerak Waseem (ZW)
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/ZeerakW/hatespeech
Resource description:
The ZW dataset contains 16,907 annotated instances in three classes: racism, sexism, and neither. It was collected through manual searches for common slurs and derogatory terms targeting religious, sexual, gender, and ethnic minorities. The authors annotated the data, and external evaluators reviewed the annotations to reduce bias. The dataset is intended for hate speech detection.
Provided by:
Zeerak Waseem & Dirk Hovy



