
Prop2Hate-Meme

ModelScope community · Updated 2025-12-05 · Indexed 2025-06-21
Download link:
https://modelscope.cn/datasets/QCRI/Prop2Hate-Meme
Resource description:
# Prop2Hate-Meme

This repository presents the first *Arabic* **Prop2Hate-Meme** dataset, which explores the intersection of propaganda and hate in memes using a multi-agent LLM-based framework. We extend an existing propagandistic meme dataset by annotating it with fine- and coarse-grained hate speech labels, and provide baseline experiments to support future research.

![License](https://img.shields.io/badge/license-CC--BY--NC--SA-blue) [![Paper](https://img.shields.io/badge/Paper-Download%20PDF-green)](https://arxiv.org/pdf/2409.07246)

**Table of contents:**

* [Dataset](#dataset)
* [Licensing](#licensing)
* [Citation](#citation)

---

## Dataset

We adopted the ArMeme dataset for both fine- and coarse-grained hatefulness categorization. We preserved the original train, development, and test splits. While ArMeme was initially annotated with four labels, for this study we retained only the memes labeled as `propaganda` and `not_propaganda`. These were subsequently re-annotated with hatefulness categories. The data distribution is provided below.
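As a quick sanity check on the split statistics reported in this card, the sketch below re-enters the coarse-grained counts by hand, verifies that both label sets cover the same number of memes in each split, and computes the share of hateful and propagandistic memes. The dictionary layout and the helper names (`SPLIT_COUNTS`, `label_shares`, `summarize`) are illustrative assumptions; they are not part of the released dataset or the authors' scripts.

```python
# Counts copied by hand from the dataset statistics in this card.
# The structure below is a hypothetical convenience layout, not a dataset API.
SPLIT_COUNTS = {
    "train": {
        "prop_label": {"propaganda": 603, "not_propaganda": 1540},
        "hate_label": {"not-hateful": 1930, "hateful": 213},
    },
    "dev": {
        "prop_label": {"propaganda": 88, "not_propaganda": 224},
        "hate_label": {"not-hateful": 281, "hateful": 31},
    },
    "dev_test": {
        "prop_label": {"propaganda": 170, "not_propaganda": 436},
        "hate_label": {"not-hateful": 452, "hateful": 154},
    },
}


def label_shares(counts):
    """Return each label's fraction of the total count."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def summarize(split_counts):
    """Compute the size and class shares of every split."""
    summary = {}
    for split, labels in split_counts.items():
        sizes = {name: sum(c.values()) for name, c in labels.items()}
        # Both label sets annotate the same memes, so their totals must agree.
        if len(set(sizes.values())) != 1:
            raise ValueError(f"inconsistent totals in {split}: {sizes}")
        summary[split] = {
            "size": sizes["prop_label"],
            "hateful_share": label_shares(labels["hate_label"])["hateful"],
            "propaganda_share": label_shares(labels["prop_label"])["propaganda"],
        }
    return summary


if __name__ == "__main__":
    for split, row in summarize(SPLIT_COUNTS).items():
        print(
            f"{split}: n={row['size']}, "
            f"hateful={row['hateful_share']:.1%}, "
            f"propaganda={row['propaganda_share']:.1%}"
        )
```

On the reported counts, the hateful class makes up roughly 10% of the train and dev splits but about 25% of `dev_test`, which is worth keeping in mind when comparing scores across splits.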
---

### 📊 Dataset Statistics

#### 🏋️‍♂️ **Train Split**

**`prop_label`**

* `propaganda`: **603**
* `not_propaganda`: **1540**

**`hate_label`**

* `not-hateful`: **1930**
* `hateful`: **213**

**`hate_fine_grained_label`**

* `sarcasm`: **105**
* `humor`: **1815**
* `inciting violence`: **13**
* `mocking`: **133**
* `other`: **10**
* `exclusion`: **6**
* `dehumanizing`: **12**
* `contempt`: **38**
* `inferiority`: **4**
* `slurs`: **7**

---

#### 🧪 **Dev Split**

**`prop_label`**

* `not_propaganda`: **224**
* `propaganda`: **88**

**`hate_label`**

* `not-hateful`: **281**
* `hateful`: **31**

**`hate_fine_grained_label`**

* `humor`: **260**
* `sarcasm`: **19**
* `mocking`: **19**
* `contempt`: **7**
* `other`: **1**
* `dehumanizing`: **2**
* `inferiority`: **1**
* `slurs`: **1**
* `inciting violence`: **2**

---

#### 🧾 **Dev-Test Split (`dev_test`)**

**`prop_label`**

* `not_propaganda`: **436**
* `propaganda`: **170**

**`hate_label`**

* `not-hateful`: **452**
* `hateful`: **154**

**`hate_fine_grained_label`**

* `humor`: **334**
* `sarcasm`: **118**
* `inciting violence`: **12**
* `slurs`: **29**
* `other`: **20**
* `mocking`: **49**
* `contempt`: **25**
* `inferiority`: **14**
* `dehumanizing`: **2**
* `exclusion`: **3**

---

## Experimental Scripts

The experimental scripts are available at [https://github.com/firojalam/propaganda-and-hateful-memes.git](https://github.com/firojalam/propaganda-and-hateful-memes.git).

## Licensing

This dataset is licensed under CC BY-NC-SA 4.0. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-sa/4.0/

## Citation

If you use our dataset in a scientific publication, we would appreciate it if you cited the following papers:

[![Paper](https://img.shields.io/badge/Paper-Download%20PDF-green)](https://arxiv.org/pdf/2409.07246)

```
@inproceedings{alam2024propaganda,
  title={Propaganda to Hate: A Multimodal Analysis of Arabic Memes with Multi-agent LLMs},
  author={Alam, Firoj and Biswas, Md Rafiul and Shah, Uzair and Zaghouani, Wajdi and Mikros, Georgios},
  booktitle={International Conference on Web Information Systems Engineering},
  pages={380--390},
  year={2024},
  organization={Springer}
}

@inproceedings{alam2024armeme,
  title={{ArMeme}: Propagandistic Content in Arabic Memes},
  author={Alam, Firoj and Hasnat, Abul and Ahmed, Fatema and Hasan, Md Arid and Hasanain, Maram},
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2024},
  address={Miami, Florida},
  month={November 12--16},
  publisher={Association for Computational Linguistics},
}
```

Provider:
maas
Created:
2025-06-17