herbwood27/Remem
Source: Hugging Face. Updated 2026-04-07; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/herbwood27/Remem
Description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: keywords
    dtype: string
  - name: question_type
    dtype: string
  - name: qa_category
    dtype: string
  - name: attribute
    dtype: string
  - name: cloze_prompt
    dtype: string
  - name: image_path
    dtype: string
  splits:
  - name: finetune
    num_bytes: 4644281416
    num_examples: 2000
  - name: forget1
    num_bytes: 225638598
    num_examples: 100
  - name: forget2
    num_bytes: 455653748
    num_examples: 200
  - name: forget3
    num_bytes: 694681426
    num_examples: 300
  - name: forget4
    num_bytes: 927627782
    num_examples: 400
  - name: retain
    num_bytes: 1306241740
    num_examples: 560
  - name: test
    num_bytes: 1284353584
    num_examples: 560
  download_size: 9527522858
  dataset_size: 9538478294
configs:
- config_name: default
  data_files:
  - split: finetune
    path: data/finetune-*
  - split: forget1
    path: data/forget1-*
  - split: forget2
    path: data/forget2-*
  - split: forget3
    path: data/forget3-*
  - split: forget4
    path: data/forget4-*
  - split: retain
    path: data/retain-*
  - split: test
    path: data/test-*
---
# Before Forgetting, Learn to Remember: Revisiting Foundational Learning Failures in LVLM Unlearning Benchmarks
## Abstract
While Large Vision-Language Models (LVLMs) offer powerful capabilities, they pose privacy risks by unintentionally memorizing sensitive personal information. Current unlearning benchmarks attempt to mitigate this using fictitious identities but overlook a critical Stage 1 failure: models fail to memorize the target information effectively in the first place, rendering subsequent unlearning evaluations unreliable. Diagnosing under-memorization and the multi-hop curse as root causes, we introduce **ReMem, a Reliable Multi-hop and Multi-image Memorization Benchmark**. ReMem ensures robust foundational learning through principled data scaling, reasoning-aware QA pairs, and diverse visual contexts. Additionally, we propose a novel Exposure metric to quantify the depth of information erasure from the model's internal probability distribution. Extensive experiments demonstrate that ReMem provides a rigorous and trustworthy framework for diagnosing both learning and unlearning behaviors in LVLMs.
## Quick Access
- **Hugging Face Dataset**: [herbwood27/Remem](https://huggingface.co/datasets/herbwood27/Remem)
- **Paper**: Accepted to **Findings of ACL 2026** (Link coming soon)
## Dataset Structure
### Available Splits
This dataset consists of 7 splits designed to evaluate both the memorization (Stage 1) and the unlearning (Stage 2) phases:
| Split | Examples | Description |
| :--- | :--- | :--- |
| **finetune** | 2,000 | The full set for foundational memorization of target identities. |
| **forget1** | 100 | Target subset for unlearning (5% of the `finetune` split). |
| **forget2** | 200 | Target subset for unlearning (10% of the `finetune` split). |
| **forget3** | 300 | Target subset for unlearning (15% of the `finetune` split). |
| **forget4** | 400 | Target subset for unlearning (20% of the `finetune` split). |
| **retain** | 560 | Evaluation set for utility preservation of non-target information. |
| **test** | 560 | Held-out set for assessing unlearning robustness across diverse contexts. |
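The forget percentages follow directly from the split sizes, measured against the 2,000-example `finetune` split. A minimal sketch that recomputes them from the example counts in the table; `SPLIT_SIZES` and `forget_fraction` are illustrative names, not part of the dataset or any library:

```python
# Example counts copied from the split table; names here are illustrative.
SPLIT_SIZES = {
    "finetune": 2000,
    "forget1": 100,
    "forget2": 200,
    "forget3": 300,
    "forget4": 400,
    "retain": 560,
    "test": 560,
}


def forget_fraction(split: str, base: str = "finetune") -> float:
    """Return the size of `split` as a fraction of the `base` split."""
    return SPLIT_SIZES[split] / SPLIT_SIZES[base]


# Print each forget split's share of the finetune split.
for name in ("forget1", "forget2", "forget3", "forget4"):
    print(f"{name}: {forget_fraction(name):.0%} of finetune")
```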
### Data Fields
- `image`: The personal identification image (PIL.Image).
- `question`: Reasoning-aware VQA question about sensitive information.
- `answer`: Ground-truth answer containing personal information.
- `keywords`: Key entity or value for Exact Match (EM) evaluation.
- `question_type`: Reasoning depth (e.g., `1-hop`).
- `qa_category`: Category of the information (e.g., `personal_information`).
- `attribute`: Specific PII type (e.g., `email`, `date_of_birth`).
- `cloze_prompt`: Prompt for measuring internal probability/exposure.
- `image_path`: Original image file mapping.
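As an illustration of how the `keywords` field might be used for scoring, the sketch below checks whether a model's prediction contains the record's keyword. This card does not spell out the exact matching rule, so the normalization (lowercasing, whitespace collapsing) and the substring test are assumptions, as is treating `keywords` as a single entity per record. The function names are hypothetical.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def keyword_match(prediction: str, keywords: str) -> bool:
    """True if the normalized keyword string occurs in the prediction.

    Assumption: `keywords` holds a single entity or value per record.
    An empty keyword never matches.
    """
    key = normalize(keywords)
    return bool(key) and key in normalize(prediction)


def keyword_accuracy(predictions: list[str], records: list[dict]) -> float:
    """Fraction of predictions that contain their record's keyword."""
    if not records:
        return 0.0
    hits = sum(
        keyword_match(pred, rec["keywords"])
        for pred, rec in zip(predictions, records)
    )
    return hits / len(records)
```

Because this is a containment test rather than a strict string equality, it is more lenient than a literal exact match; tighten `keyword_match` if the evaluation protocol requires it.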
## Quick Start
```python
from datasets import load_dataset
# Load the full memorization set
ds_full = load_dataset("herbwood27/Remem", split="finetune")
# Load a specific forget split for unlearning
ds_forget = load_dataset("herbwood27/Remem", split="forget1")
```
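An unlearning run typically needs one forget split together with the `retain` split. The sketch below wraps that pairing in two helpers; `forget_split_name` and `load_unlearning_splits` are hypothetical names introduced here, and the use of `streaming=True` (a standard `load_dataset` option) is a suggestion to avoid downloading the full dataset of roughly 9.5 GB up front.

```python
def forget_split_name(level: int) -> str:
    """Map a forget level (1-4) to its split name, e.g. 2 -> "forget2"."""
    if level not in (1, 2, 3, 4):
        raise ValueError(f"forget level must be 1-4, got {level}")
    return f"forget{level}"


def load_unlearning_splits(level: int, streaming: bool = True):
    """Load one forget split plus the retain split.

    With streaming=True, records are fetched lazily instead of
    downloading every file before the first example is available.
    """
    # Imported lazily so the helper above works without the library.
    from datasets import load_dataset

    repo = "herbwood27/Remem"
    forget = load_dataset(repo, split=forget_split_name(level), streaming=streaming)
    retain = load_dataset(repo, split="retain", streaming=streaming)
    return forget, retain
```

Calling `forget, retain = load_unlearning_splits(1)` and then `next(iter(forget))` should yield a single record as a dictionary with the fields listed under Data Fields.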
Provider:
herbwood27



