herbwood27/Remem

Name: herbwood27/Remem
Creator: herbwood27
Published: 2026-04-07 17:34:26
License: No description available

Hugging Face · Updated 2026-04-07 · Indexed 2026-04-12

Download link:

https://hf-mirror.com/datasets/herbwood27/Remem

Resource description:

---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: keywords
    dtype: string
  - name: question_type
    dtype: string
  - name: qa_category
    dtype: string
  - name: attribute
    dtype: string
  - name: cloze_prompt
    dtype: string
  - name: image_path
    dtype: string
  splits:
  - name: finetune
    num_bytes: 4644281416
    num_examples: 2000
  - name: forget1
    num_bytes: 225638598
    num_examples: 100
  - name: forget2
    num_bytes: 455653748
    num_examples: 200
  - name: forget3
    num_bytes: 694681426
    num_examples: 300
  - name: forget4
    num_bytes: 927627782
    num_examples: 400
  - name: retain
    num_bytes: 1306241740
    num_examples: 560
  - name: test
    num_bytes: 1284353584
    num_examples: 560
  download_size: 9527522858
  dataset_size: 9538478294
configs:
- config_name: default
  data_files:
  - split: finetune
    path: data/finetune-*
  - split: forget1
    path: data/forget1-*
  - split: forget2
    path: data/forget2-*
  - split: forget3
    path: data/forget3-*
  - split: forget4
    path: data/forget4-*
  - split: retain
    path: data/retain-*
  - split: test
    path: data/test-*
---

# Before Forgetting, Learn to Remember: Revisiting Foundational Learning Failures in LVLM Unlearning Benchmarks

## Abstract

While Large Vision-Language Models (LVLMs) offer powerful capabilities, they pose privacy risks by unintentionally memorizing sensitive personal information. Current unlearning benchmarks attempt to mitigate this using fictitious identities but overlook a critical stage 1 failure: models fail to effectively memorize target information initially, rendering subsequent unlearning evaluations unreliable. Diagnosing under-memorization and the multi-hop curse as root causes, we introduce **ReMem, a Reliable Multi-hop and Multi-image Memorization Benchmark**. ReMem ensures robust foundational learning through principled data scaling, reasoning-aware QA pairs, and diverse visual contexts.
Additionally, we propose a novel Exposure metric to quantify the depth of information erasure from the model's internal probability distribution. Extensive experiments demonstrate that ReMem provides a rigorous and trustworthy framework for diagnosing both learning and unlearning behaviors in LVLMs.

## Quick Access

- **Hugging Face Dataset**: [herbwood27/Remem](https://huggingface.co/datasets/herbwood27/Remem)
- **Paper**: Accepted to **Findings of ACL 2026** (Link coming soon)

## Dataset Structure

### Available Splits

This dataset consists of 7 splits designed to evaluate both the memorization (Stage 1) and the unlearning (Stage 2) phases:

| Split | Examples | Description |
| :--- | :--- | :--- |
| **finetune** | 2,000 | The full set for foundational memorization of target identities. |
| **forget1** | 100 | Target subset for unlearning (5% of the `finetune` set). |
| **forget2** | 200 | Target subset for unlearning (10% of the `finetune` set). |
| **forget3** | 300 | Target subset for unlearning (15% of the `finetune` set). |
| **forget4** | 400 | Target subset for unlearning (20% of the `finetune` set). |
| **retain** | 560 | Evaluation set for utility preservation of non-target information. |
| **test** | 560 | Held-out set for assessing unlearning robustness across diverse contexts. |

### Data Fields

- `image`: The personal identification image (PIL.Image).
- `question`: Reasoning-aware VQA question regarding sensitive info.
- `answer`: Ground-truth answer containing personal information.
- `keywords`: Key entity or value for Exact Match (EM) evaluation.
- `question_type`: Reasoning depth (e.g., `1-hop`).
- `qa_category`: Category of the information (e.g., `personal_information`).
- `attribute`: Specific PII type (e.g., `email`, `date_of_birth`).
- `cloze_prompt`: Prompt for measuring internal probability/exposure.
- `image_path`: Original image file mapping.
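
The `keywords` field is described above as the target for Exact Match (EM) evaluation. The sketch below shows one way such a score could be computed from model outputs. The normalization (lowercasing, collapsed whitespace) and the substring-style match are assumptions made for illustration and may differ from the protocol used in the paper; the helper names `normalize`, `keyword_exact_match`, and `em_score` are hypothetical.

```python
from typing import Mapping, Sequence


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (assumed normalization, not from the card)."""
    return " ".join(text.lower().split())


def keyword_exact_match(prediction: str, keywords: str) -> bool:
    """Return True if the normalized keyword string appears in the prediction.

    Assumption: `keywords` holds a single entity or value per record,
    as suggested by its `string` dtype in the dataset card.
    """
    key = normalize(keywords)
    # An empty keyword never counts as a hit.
    return bool(key) and key in normalize(prediction)


def em_score(predictions: Sequence[str], records: Sequence[Mapping[str, str]]) -> float:
    """Fraction of records whose keyword is found in the matching prediction."""
    if len(predictions) != len(records):
        raise ValueError("predictions and records must have the same length")
    if not records:
        return 0.0
    hits = sum(
        keyword_exact_match(pred, rec["keywords"])
        for pred, rec in zip(predictions, records)
    )
    return hits / len(records)
```

After unlearning, a lower `em_score` on a forget split alongside a stable score on `retain` would be the expected pattern, though how the authors aggregate these numbers is not specified in this card.
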

## Quick Start

```python
from datasets import load_dataset

# Load the full memorization set
ds_full = load_dataset("herbwood27/Remem", split="finetune")

# Load a specific forget split for unlearning
ds_forget = load_dataset("herbwood27/Remem", split="forget1")
```
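
As a sanity check after loading, the split sizes can be compared against the counts listed in the split table. The following is a minimal sketch: `EXPECTED_SPLITS` is copied from this card, while the helper names `forget_percentages` and `find_size_mismatches` are hypothetical and not part of the dataset or the `datasets` library.

```python
from typing import Dict, Mapping, Optional, Sized, Tuple

# Example counts copied from the split table in this card.
EXPECTED_SPLITS: Dict[str, int] = {
    "finetune": 2000,
    "forget1": 100,
    "forget2": 200,
    "forget3": 300,
    "forget4": 400,
    "retain": 560,
    "test": 560,
}


def forget_percentages(sizes: Mapping[str, int]) -> Dict[str, float]:
    """Size of each forget split as a percentage of the finetune split."""
    total = sizes["finetune"]
    return {
        name: 100 * count / total
        for name, count in sizes.items()
        if name.startswith("forget")
    }


def find_size_mismatches(
    splits: Mapping[str, Sized],
    expected: Mapping[str, int] = EXPECTED_SPLITS,
) -> Dict[str, Tuple[int, Optional[int]]]:
    """Map split name to (expected, actual) for every split that differs.

    A missing split is reported with an actual size of None.
    """
    mismatches: Dict[str, Tuple[int, Optional[int]]] = {}
    for name, want in expected.items():
        got = len(splits[name]) if name in splits else None
        if got != want:
            mismatches[name] = (want, got)
    return mismatches


print(forget_percentages(EXPECTED_SPLITS))
# {'forget1': 5.0, 'forget2': 10.0, 'forget3': 15.0, 'forget4': 20.0}
```

Calling `find_size_mismatches(load_dataset("herbwood27/Remem"))` should return an empty dict if the downloaded data matches the card. Note that loading every split fetches roughly 9.5 GB according to the `download_size` in the metadata.
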

Provider:

herbwood27
