
piuba-bigdata/contextualized_hate_speech_raw

Source: Hugging Face | Updated: 2024-03-26 | Indexed: 2024-06-11
Download link:
https://hf-mirror.com/datasets/piuba-bigdata/contextualized_hate_speech_raw
Resource summary:
The dataset, named contextualized_hate_speech, consists mainly of Twitter comments posted in response to five Argentinean news outlets (Clarín, Infobae, La Nación, Perfil, and Crónica) during the COVID-19 pandemic. The comments are annotated for the presence of hate speech, broken down into eight characteristics: against women, racism, class hatred, against LGBTQ+ individuals, against physical appearance, against people with disabilities, against criminals, and for political reasons. All data is in Spanish. The dataset also includes an additional label, CALLS, indicating whether a comment calls for (possibly violent) action. For each comment, a group of annotators first marked whether it was HATEFUL and then selected the relevant categories (one or more).
Provider:
piuba-bigdata
Original Information Summary

Contextualized Hate Speech Dataset Summary

Basic Information

  • Language: Spanish
  • Pretty Name: contextualized_hate_speech
  • Task Categories: text-classification
  • Tags: hate_speech
  • Size Categories: 10K<n<100K

Dataset Description

Dataset Content

  • Source: Tweets posted in response to news articles from five Argentinean news outlets (Clarín, Infobae, La Nación, Perfil, Crónica) during the COVID-19 pandemic.
  • Annotations: Comments are annotated for the presence of hate speech across eight characteristics: against women, racist content, class hatred, against LGBTQ+ individuals, against physical appearance, against people with disabilities, against criminals, and for political reasons.

Labels

Each comment is labeled with the following variables:

Label       Description
HATEFUL     Contains hate speech (HS)?
CALLS       If it is hateful, is this message calling to (possibly violent) action?
WOMEN       Is this against women?
LGBTI       Is this against LGBTI people?
RACISM      Is this a racist message?
CLASS       Is this a classist message?
POLITICS    Is this HS due to political ideology?
DISABLED    Is this HS against disabled people?
APPEARANCE  Is this HS against people due to their appearance? (e.g. fatshaming)
CRIMINAL    Is this HS against criminals or people in conflict with law?
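
The labeling scheme above implies a dependency: CALLS and the eight category labels only make sense when HATEFUL is set, and a hateful annotation names one or more categories. The following Python sketch illustrates that dependency and one possible way to combine several annotators' labels. It makes several assumptions: each annotation is represented as a plain dict of booleans keyed by the label names listed above (the actual column layout of the raw dataset may differ), and the strict majority vote in `aggregate` is a hypothetical rule chosen for illustration, not necessarily the procedure used by the dataset authors.

```python
from collections import Counter
from typing import Dict, List

# Label names as listed in the Labels section.
CATEGORIES = [
    "WOMEN", "LGBTI", "RACISM", "CLASS",
    "POLITICS", "DISABLED", "APPEARANCE", "CRIMINAL",
]
ALL_LABELS = ["HATEFUL", "CALLS"] + CATEGORIES

# Assumed representation: one dict of booleans per annotator per comment.
Annotation = Dict[str, bool]


def is_consistent(annotation: Annotation) -> bool:
    """Check that CALLS and category labels appear only on hateful annotations,
    and that a hateful annotation names at least one category."""
    hateful = annotation.get("HATEFUL", False)
    has_category = any(annotation.get(c, False) for c in CATEGORIES)
    if not hateful:
        # A non-hateful annotation should carry no dependent labels.
        return not has_category and not annotation.get("CALLS", False)
    return has_category


def aggregate(annotations: List[Annotation]) -> Annotation:
    """Combine annotators by strict majority vote per label (hypothetical rule)."""
    votes = Counter()
    for ann in annotations:
        for label in ALL_LABELS:
            if ann.get(label, False):
                votes[label] += 1
    n = len(annotations)
    return {label: votes[label] * 2 > n for label in ALL_LABELS}
```

For example, with three annotators of whom two mark a comment as HATEFUL and WOMEN, `aggregate` would set both labels to true, while a CALLS label chosen by only one annotator would not pass the majority threshold.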

Additional Information

Citation Information

@article{perez2022contextual,
  author  = {Pérez, Juan Manuel and Luque, Franco M. and Zayat, Demian and Kondratzky, Martín and Moro, Agustín and Serrati, Pablo Santiago and Zajac, Joaquín and Miguel, Paula and Debandi, Natalia and Gravano, Agustín and Cotik, Viviana},
  journal = {IEEE Access},
  title   = {Assessing the Impact of Contextual Information in Hate Speech Detection},
  year    = {2023},
  volume  = {11},
  number  = {},
  pages   = {30575-30590},
  doi     = {10.1109/ACCESS.2023.3258973}
}
