piuba-bigdata/contextualized_hate_speech_raw
Source: Hugging Face. Updated 2024-03-26; indexed 2024-06-11.
Download link:
https://hf-mirror.com/datasets/piuba-bigdata/contextualized_hate_speech_raw
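Summary:
The dataset, named contextualized_hate_speech, mainly contains Twitter comments on posts from five Argentinean news outlets (Clarín, Infobae, La Nación, Perfil, and Crónica) during the COVID-19 pandemic. The comments are annotated for the presence of hate speech, broken down into eight characteristics: against women, racism, class hatred, against LGBTQ+ individuals, against physical appearance, against people with disabilities, against criminals, and for political reasons. All data is in Spanish. The dataset also includes an additional label, CALLS, which indicates whether a comment calls for (possibly violent) action. For each comment there is a list of annotators who first marked it as HATEFUL and then selected the relevant categories (one or more).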
Provider:
piuba-bigdata
Original Information Summary
Contextualized Hate Speech Dataset Summary
Basic Information
- Language: Spanish
- Pretty Name: contextualized_hate_speech
- Task Categories: text-classification
- Tags: hate_speech
- Size Categories: 10K<n<100K
Dataset Description
- Repository: https://github.com/finiteautomata/contextualized-hatespeech-classification
- Paper: "Assessing the impact of contextual information in hate speech detection", by Juan Manuel Pérez et al.
- Point of Contact: jmperez (at) dc uba ar
Dataset Content
- Source: Tweets posted in response to news articles from five Argentinean news outlets (Clarín, Infobae, La Nación, Perfil, Crónica) during the COVID-19 pandemic.
- Annotations: Comments are annotated for the presence of hate speech across eight characteristics: against women, racist content, class hatred, against LGBTQ+ individuals, against physical appearance, against people with disabilities, against criminals, and for political reasons.
Labels
Each comment is labeled with the following variables:
| Label | Description |
|---|---|
| HATEFUL | Contains hate speech (HS)? |
| CALLS | If it is hateful, is this message calling to (possibly violent) action? |
| WOMEN | Is this against women? |
| LGBTI | Is this against LGBTI people? |
| RACISM | Is this a racist message? |
| CLASS | Is this a classist message? |
| POLITICS | Is this HS due to political ideology? |
| DISABLED | Is this HS against disabled people? |
| APPEARANCE | Is this HS against people due to their appearance? (e.g. fatshaming) |
| CRIMINAL | Is this HS against criminals or people in conflict with law? |
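The table implies a small dependency between labels: CALLS is only asked "if it is hateful", and the categories describe the target of hate speech. A minimal sketch of a consistency check follows, assuming one comment's labels are available as a dictionary of booleans keyed by the label names above. The function `check_labels` and the dictionary layout are hypothetical and are not part of the dataset or its repository.

```python
from typing import Dict, List

# Category labels from the table above; HATEFUL and CALLS are handled separately.
CATEGORIES = ("WOMEN", "LGBTI", "RACISM", "CLASS",
              "POLITICS", "DISABLED", "APPEARANCE", "CRIMINAL")


def check_labels(labels: Dict[str, bool]) -> List[str]:
    """Return a list of consistency problems for one comment (hypothetical helper).

    Assumption: `labels` maps label names to booleans; missing keys count as False.
    """
    problems: List[str] = []
    hateful = labels.get("HATEFUL", False)
    active = [name for name in CATEGORIES if labels.get(name, False)]

    # Categories and CALLS only make sense for comments marked HATEFUL.
    if not hateful and active:
        problems.append("categories set without HATEFUL: " + ", ".join(active))
    if not hateful and labels.get("CALLS", False):
        problems.append("CALLS set without HATEFUL")
    # Annotators select one or more categories after marking HATEFUL.
    if hateful and not active:
        problems.append("HATEFUL set without any category")
    return problems


print(check_labels({"HATEFUL": True, "RACISM": True, "CALLS": True}))
print(check_labels({"HATEFUL": False, "WOMEN": True}))
```

Whether the last rule holds after aggregating several annotators is not stated in this summary, so treat a "HATEFUL set without any category" message as something to inspect, not as an error.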
Additional Information
- Extra Label: CALLS represents whether a comment is a call to violent action or not.
- Annotators: For each comment, a list of annotators who marked the comment first as HATEFUL, and then the selected categories (one or more).
- Aggregated Version: Available at https://huggingface.co/datasets/piuba-bigdata/contextualized_hate_speech/
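To illustrate how per-annotator marks could be turned into one label per comment, here is a minimal sketch. It assumes each label field holds the list of annotator identifiers who selected it, and it uses a vote threshold (`min_votes`) that is purely an illustrative assumption. The rule actually used to build the aggregated version is not described in this summary, and the function name `aggregate` and the annotator IDs are hypothetical.

```python
from typing import Dict, List

# Label names from the Labels table; HATEFUL comes first because the
# other labels are only meaningful for hateful comments.
LABELS = ("HATEFUL", "CALLS", "WOMEN", "LGBTI", "RACISM", "CLASS",
          "POLITICS", "DISABLED", "APPEARANCE", "CRIMINAL")


def aggregate(raw: Dict[str, List[str]], min_votes: int = 2) -> Dict[str, bool]:
    """Collapse per-annotator marks into booleans (hypothetical helper).

    Assumption: raw[label] is the list of annotators who selected that label.
    Assumption: a label is kept when at least `min_votes` distinct annotators
    selected it; this threshold is illustrative, not the dataset's documented rule.
    """
    hateful = len(set(raw.get("HATEFUL", []))) >= min_votes
    result = {"HATEFUL": hateful}
    for label in LABELS[1:]:
        votes = len(set(raw.get(label, [])))
        # CALLS and the categories are only kept for hateful comments.
        result[label] = hateful and votes >= min_votes
    return result


# Made-up example: three annotators, two of whom agree on RACISM.
example = {
    "HATEFUL": ["ann_1", "ann_2", "ann_3"],
    "RACISM": ["ann_1", "ann_3"],
    "CALLS": ["ann_2"],
}
result = aggregate(example)
print({label: value for label, value in result.items() if value})
```

Counting distinct annotators with `set` guards against a duplicated identifier inflating the vote. Raising or lowering `min_votes` trades precision against recall in the resulting labels.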
Citation Information
@article{perez2022contextual,
  author  = {Pérez, Juan Manuel and Luque, Franco M. and Zayat, Demian and Kondratzky, Martín and Moro, Agustín and Serrati, Pablo Santiago and Zajac, Joaquín and Miguel, Paula and Debandi, Natalia and Gravano, Agustín and Cotik, Viviana},
  journal = {IEEE Access},
  title   = {Assessing the Impact of Contextual Information in Hate Speech Detection},
  year    = {2023},
  volume  = {11},
  number  = {},
  pages   = {30575-30590},
  doi     = {10.1109/ACCESS.2023.3258973}
}



