nils-herrmann/hateful_memes_fine_grained

Name: nils-herrmann/hateful_memes_fine_grained
Creator: nils-herrmann
Published: 2026-04-08 12:29:26
License: MIT

Source: Hugging Face. Updated 2026-04-08; indexed 2026-04-12.

Download link:

https://hf-mirror.com/datasets/nils-herrmann/hateful_memes_fine_grained


Resource description:

---
license: mit
configs:
- config_name: aggregated
  data_files:
  - split: train
    path: aggregated/train-*
  - split: validation
    path: aggregated/validation-*
  - split: test
    path: aggregated/test-*
- config_name: annotations
  data_files:
  - split: train
    path: annotations/train-*
dataset_info:
- config_name: aggregated
  features:
  - name: id
    dtype: int64
  - name: img
    dtype: string
  - name: original_split
    dtype: string
  - name: label_hateful
    dtype: int64
  - name: label_incivility
    dtype: int64
  - name: label_intolerance
    dtype: int64
  splits:
  - name: train
    num_bytes: 84506
    num_examples: 1420
  - name: validation
    num_bytes: 12080
    num_examples: 203
  - name: test
    num_bytes: 24220
    num_examples: 407
  download_size: 38974
  dataset_size: 120806
- config_name: annotations
  features:
  - name: id
    dtype: int64
  - name: annotator
    dtype: string
  - name: label_hateful
    dtype: int64
  - name: label_incivility
    dtype: string
  - name: label_intolerance
    dtype: string
  splits:
  - name: train
    num_bytes: 234632
    num_examples: 6090
  download_size: 27211
  dataset_size: 234632
language:
- en
size_categories:
- 1K<n<10K
---

# Hateful Memes Fine-Grained Dataset

This dataset is a fine-grained extension of the widely used Hateful Memes dataset, designed to enable more nuanced analysis of harmful multimodal content. While the original dataset focuses on binary hatefulness classification, this extension adds annotation dimensions that capture incivility and intolerance at a more granular level.

The dataset consists of a subset of 2,030 memes, each annotated independently by three annotators. Annotations are provided both at the individual annotator level and as aggregated majority labels after binarization.

The goal of this dataset is to disentangle different aspects of harmful content, in particular separating tone (incivility) from content (intolerance), and to support research in content moderation, multimodal understanding, and responsible AI.

- **Curated by:** Nils A. Herrmann
- **Language(s) (NLP):** English
- **License:** MIT

## Dataset Sources

- **Repository:** [Github Repo](https://github.com/nils-herrmann/beyond_hate)
- **Pre-Print:** [Arxiv](https://arxiv.org/abs/2603.22985)
- **Paper:** TBA

## Uses

This dataset is intended for:

- Training and evaluating multimodal classification models
- Studying fine-grained harmful content detection, beyond binary hatefulness
- Analyzing the distinction between incivility (tone) and intolerance (content)
- Evaluating bias and fairness in content moderation systems
- Research on annotation disagreement and uncertainty modeling

**Out-of-Scope Use:** This dataset should not be used for fully automated moderation systems without human oversight.

## Dataset Structure

The dataset consists of two main components:

### 1. Annotation-Level Data (`annotations`)

- Contains **individual annotations** from each annotator
- Total entries: **6,090 rows (2,030 samples × 3 annotators)**
- Each row corresponds to one annotator's labels for a meme

**Fields:**

- `id`: meme identifier
- `annotator`: annotator identifier
- `label_hateful`: original binary label
- `label_incivility`: multi-class label (comma separated)
- `label_intolerance`: multi-class label (comma separated)

### 2. Aggregated Data (`aggregated`)

- Contains **majority-vote labels** after binarization
- Total entries: **2,030 rows (one per meme)**, split into 1,420 train, 203 validation, and 407 test examples

**Fields:**

- `id`: meme identifier
- `img`: image file name
- `original_split`: corresponding split in the original dataset
- `label_hateful`: original binary label
- `label_incivility`: binary label after majority vote
- `label_intolerance`: binary label after majority vote

## Dataset Creation

Existing multimodal hate detection datasets primarily focus on **binary labels**, which obscure important distinctions in harmful content. This dataset was created to:

- Capture **different dimensions of harmfulness**
- Enable **more interpretable model behavior**
- Support research on **annotation ambiguity and disagreement**
- Provide a testbed for **fine-grained moderation strategies**

### Source Data

The dataset builds on the Hateful Memes dataset, which consists of image-text pairs designed to require multimodal understanding.

### Data Collection and Processing

- A subset of **2,030 memes** was selected
- Each meme was annotated independently by three annotators
- Annotation included:
  - Binary hatefulness
  - Incivility categories (tone)
  - Intolerance categories (content)
- Aggregated labels were computed via:
  - Label binarization
  - Majority voting

### Who are the source data producers?

The original memes were created by researchers at Meta AI as part of the Hateful Memes benchmark. The dataset consists of synthetic and semi-synthetic meme-style image-text combinations.

### Annotations

#### Annotation process

- **3 annotators per meme**
- Annotation conducted in two stages:
  1. Initial annotation
  2. Review/disagreement resolution (when applicable)

#### Who are the annotators?

- **2 expert annotators**
  - Background in social science
  - Experience in communication science research
- **1 trained non-expert annotator**
  - Background in computer science
  - Received task-specific training

## Bias, Risks, and Limitations

- **Subjectivity in annotations**
- **Limited dataset size (2,030 samples)**
- **Annotator bias due to background differences**
- **Cultural bias in interpretation of harmful content**
- **Synthetic nature of memes may limit real-world generalization**

## Citation

Herrmann, N. A., Eder, T., He, J., & Groh, G. (2026). Beyond Hate: Differentiating Uncivil and Intolerant Speech in Multimodal Content Moderation (arXiv:2603.22985). arXiv. https://doi.org/10.48550/arXiv.2603.22985
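
## Example: Binarization and Majority Vote (illustrative sketch)

The card says aggregated labels are computed by label binarization followed by majority voting, but it does not spell out the binarization rule or list the category names. The sketch below shows one plausible reading and should not be taken as the authors' implementation. It assumes that a comma-separated label string counts as positive when it names at least one category other than a "none" marker. The marker `"none"`, the category names `insult` and `vulgarity`, and the annotator ids are all hypothetical and used only for illustration.

```python
from collections import defaultdict

# Hypothetical markers for "no category applies". The dataset card does not
# document the real category names or how an empty label is encoded.
NEGATIVE_MARKERS = {"", "none"}


def parse_labels(raw):
    """Split a comma-separated label string into a list of category names."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def binarize(raw):
    """Return 1 if any category other than a negative marker is present (assumed rule)."""
    return int(any(label.lower() not in NEGATIVE_MARKERS for label in parse_labels(raw)))


def majority_vote(votes):
    """Return 1 if strictly more than half of the binary votes are 1, else 0."""
    return int(sum(votes) * 2 > len(votes))


def aggregate(rows, field):
    """Group annotation rows by meme id and majority-vote the binarized field."""
    votes_by_id = defaultdict(list)
    for row in rows:
        votes_by_id[row["id"]].append(binarize(row[field]))
    return {meme_id: majority_vote(votes) for meme_id, votes in votes_by_id.items()}


# Made-up rows shaped like the `annotations` config (values are illustrative only).
rows = [
    {"id": 1, "annotator": "a1", "label_incivility": "insult, vulgarity"},
    {"id": 1, "annotator": "a2", "label_incivility": "insult"},
    {"id": 1, "annotator": "a3", "label_incivility": "none"},
    {"id": 2, "annotator": "a1", "label_incivility": "none"},
    {"id": 2, "annotator": "a2", "label_incivility": "vulgarity"},
    {"id": 2, "annotator": "a3", "label_incivility": ""},
]

aggregated = aggregate(rows, "label_incivility")
print(aggregated)  # {1: 1, 2: 0}
```

With three annotators per meme there are no ties, so a strict majority is well defined. To run the same logic on the real data, the rows could presumably be obtained with the `datasets` library via `load_dataset("nils-herrmann/hateful_memes_fine_grained", "annotations")`; whether the output then matches the published `aggregated` config depends on the actual binarization rule, which is an open assumption here.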

