Hate Speech Dataset from a White Supremacy Forum

Name: Hate Speech Dataset from a White Supremacy Forum
Creator: Vicomtech, Donostia/San Sebastián, Spain
Published: 2018-09-12 21:51:02
License: No description available

Source: arXiv; updated 2018-09-12; indexed 2024-06-21

Download link:

https://github.com/aitor-garcia-p/hate-speech-dataset

Description:

This study introduces the first hate speech dataset extracted from the white supremacist forum Stormfront. It contains 10,568 sentences, each manually annotated as containing hate speech or not. The content touches on sensitive topics including race, gender, and religion, and the dataset is intended to support the automated identification and analysis of online hate speech. To keep annotations accurate and consistent, the research team developed a dedicated annotation tool. Application areas include, but are not limited to, cybersecurity, sociological research, and policy-making.
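As a minimal sketch of how sentence-level binary annotations like these might be summarized, the Python snippet below tallies labels from a CSV file. The file layout, the column names `file_id` and `label`, and the label values `hate` and `noHate` are assumptions made for illustration, and the three sample rows are invented placeholders, not dataset content; check the repository for the actual format.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file, label_column="label"):
    """Tally how many rows carry each annotation label.

    `label_column` is an assumed column name, not confirmed by the source.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# Hypothetical three-row sample in the assumed layout (not real dataset rows).
sample = io.StringIO(
    "file_id,label\n"
    "example_1,noHate\n"
    "example_2,hate\n"
    "example_3,noHate\n"
)

counts = count_labels(sample)
print(counts)
```

To apply the same function to a file on disk, pass an open file handle, for example `open(path, newline="", encoding="utf-8")`, in place of the in-memory sample.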

Provider:

Vicomtech, Donostia/San Sebastián, Spain

Created:

2018-09-12
