PrideMM

Source: arXiv, indexed 2025-09-30
Download link:
https://github.com/siddhantbikram/memeclip
Overview:
PrideMM is a curated benchmark for evaluating multimodal models on social media memes. It contains over 7,000 meme samples collected from platforms such as Reddit and Twitter, each consisting of an image and a text caption. Every sample is annotated for harmfulness and offensiveness, which frames the core task, hate detection in memes, as a challenging binary classification problem. Many of the memes rely on sarcasm, satire, or cultural connotations, so a model must reason jointly over the visual and textual content to predict correctly.
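The sample layout and binary task described above might be sketched in Python as follows. This is only an illustration: the `MemeSample` class, its field names, and the `binary_metrics` helper are hypothetical and are not taken from the dataset's actual files or from the linked repository, and the label encoding (1 = hateful, 0 = not hateful) is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class MemeSample:
    """One meme: an image plus its caption and a binary label.

    Field names are hypothetical; the real dataset may use a different schema.
    """
    image_path: str  # path to the meme image on disk
    caption: str     # text associated with the meme
    label: int       # assumed encoding: 1 = hateful, 0 = not hateful


def binary_metrics(gold: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for the positive class (1)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    correct = sum(1 for g, p in zip(gold, pred) if g == p)

    # Guard every division so empty or one-sided inputs return 0.0.
    accuracy = correct / len(gold) if gold else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Placeholder samples for illustration only, not real dataset entries.
    samples = [
        MemeSample("images/0001.png", "placeholder caption A", 1),
        MemeSample("images/0002.png", "placeholder caption B", 0),
    ]
    gold = [s.label for s in samples]
    pred = [1, 1]  # stand-in for a model's predictions
    print(binary_metrics(gold, pred))
```

In practice the `pred` list would come from a multimodal classifier that sees both `image_path` and `caption`; the metric code itself does not depend on how the predictions are produced.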