piuba-bigdata/contextualized_hate_speech_raw

Name: piuba-bigdata/contextualized_hate_speech_raw
Creator: piuba-bigdata
Published: 2024-03-26 20:12:23
License: not specified

Source: Hugging Face. Updated 2024-03-26; indexed 2024-06-11.

Download link:

https://hf-mirror.com/datasets/piuba-bigdata/contextualized_hate_speech_raw

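Overview:

The dataset, named contextualized_hate_speech, consists mainly of Twitter comments posted in reply to five Argentinean news outlets (Clarín, Infobae, La Nación, Perfil, and Crónica) during the COVID-19 pandemic. Each comment is annotated for the presence of hate speech, broken down into eight characteristics: against women, racism, class hatred, against LGBTQ+ individuals, against physical appearance, against people with disabilities, against criminals, and for political reasons. All data is in Spanish. The dataset also includes an extra label, CALLS, which indicates whether a comment calls for (possibly violent) action. Each comment comes with the list of annotators who first marked it as HATEFUL and then selected the relevant categories (one or more).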

Provider:

piuba-bigdata

Original Information Summary

Contextualized Hate Speech Dataset Summary

Basic Information

Language: Spanish
Pretty Name: contextualized_hate_speech
Task Categories: text-classification
Tags: hate_speech
Size Categories: 10K<n<100K

Dataset Description

Repository: https://github.com/finiteautomata/contextualized-hatespeech-classification
Paper: "Assessing the impact of contextual information in hate speech detection", by Juan Manuel Pérez et al.
Point of Contact: jmperez (at) dc uba ar

Dataset Content

Source: Tweets posted in response to news articles from five Argentinean news outlets (Clarín, Infobae, La Nación, Perfil, Crónica) during the COVID-19 pandemic.
Annotations: Each comment is annotated for the presence of hate speech and, when hateful, for the characteristics it targets. There are eight: women, race, social class, LGBTQ+ individuals, physical appearance, people with disabilities, criminals, and political ideology.

Labels

Each comment is labeled with the following variables:

Label	Description
HATEFUL	Contains hate speech (HS)?
CALLS	If it is hateful, is this message calling to (possibly violent) action?
WOMEN	Is this against women?
LGBTI	Is this against LGBTI people?
RACISM	Is this a racist message?
CLASS	Is this a classist message?
POLITICS	Is this HS due to political ideology?
DISABLED	Is this HS against disabled people?
APPEARANCE	Is this HS against people due to their appearance? (e.g. fatshaming)
CRIMINAL	Is this HS against criminals or people in conflict with law?
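
The label variables above can be handled as a fixed-order multi-label vector. The following is a minimal sketch, not the authors' code. It assumes each comment has already been reduced to a plain dict mapping label names to 0/1 (a hypothetical record format), and it treats the category labels and CALLS as meaningful only when HATEFUL is set, which follows the "if it is hateful" wording of the table.

```python
# Label names in the order they appear in the table above.
LABELS = [
    "HATEFUL", "CALLS", "WOMEN", "LGBTI", "RACISM",
    "CLASS", "POLITICS", "DISABLED", "APPEARANCE", "CRIMINAL",
]


def to_multi_hot(record):
    """Turn a {label: 0/1} mapping into a fixed-order 0/1 vector.

    Assumption: `record` is a plain dict with binary values, and a
    missing label counts as 0. When HATEFUL is 0, every other label is
    forced to 0, because the categories only qualify hateful comments.
    """
    hateful = int(bool(record.get("HATEFUL", 0)))
    vector = [hateful]
    for name in LABELS[1:]:
        value = int(bool(record.get(name, 0)))
        vector.append(value if hateful else 0)
    return vector


# Hypothetical record: a hateful comment tagged as racist and classist.
example = {"HATEFUL": 1, "RACISM": 1, "CLASS": 1}
print(to_multi_hot(example))
```

Keeping the order in a single `LABELS` list means the same vector layout can be reused for training targets and for decoding predictions.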

Additional Information

Extra Label: CALLS indicates whether a hateful comment calls for (possibly violent) action.
Annotators: Each comment carries the list of annotators who first marked it as HATEFUL and then selected one or more categories.
Aggregated Version: Available at https://huggingface.co/datasets/piuba-bigdata/contextualized_hate_speech/
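
As an illustration of how per-annotator marks might be collapsed into binary labels, here is a hedged sketch. It assumes each label field of a raw row holds a list of annotator identifiers (an assumption about the raw layout, not confirmed by this page), and the thresholds `min_hateful_votes` and `min_category_votes` are hypothetical parameters, not the rule used to build the aggregated version linked above.

```python
LABELS = [
    "HATEFUL", "CALLS", "WOMEN", "LGBTI", "RACISM",
    "CLASS", "POLITICS", "DISABLED", "APPEARANCE", "CRIMINAL",
]


def aggregate(raw_row, min_hateful_votes=2, min_category_votes=1):
    """Collapse per-annotator marks into one 0/1 value per label.

    Assumptions (illustrative only):
    - raw_row maps each label name to a list of annotator ids;
    - a comment is hateful when enough annotators marked HATEFUL;
    - a category counts only if the comment is hateful and enough of
      the annotators who marked HATEFUL also selected that category.
    """
    hateful_voters = set(raw_row.get("HATEFUL", []))
    is_hateful = len(hateful_voters) >= min_hateful_votes
    result = {"HATEFUL": int(is_hateful)}
    for name in LABELS[1:]:
        voters = set(raw_row.get(name, [])) & hateful_voters
        result[name] = int(is_hateful and len(voters) >= min_category_votes)
    return result


# Hypothetical raw row with made-up annotator ids.
row = {"HATEFUL": ["ann_a", "ann_b"], "RACISM": ["ann_a"], "CALLS": []}
print(aggregate(row))
```

Raising `min_category_votes` trades recall for agreement: a category is then kept only when several annotators independently chose it.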

Citation Information

@article{perez2022contextual,
  author  = {Pérez, Juan Manuel and Luque, Franco M. and Zayat, Demian and Kondratzky, Martín and Moro, Agustín and Serrati, Pablo Santiago and Zajac, Joaquín and Miguel, Paula and Debandi, Natalia and Gravano, Agustín and Cotik, Viviana},
  journal = {IEEE Access},
  title   = {Assessing the Impact of Contextual Information in Hate Speech Detection},
  year    = {2023},
  volume  = {11},
  number  = {},
  pages   = {30575-30590},
  doi     = {10.1109/ACCESS.2023.3258973}
}
