FudanCVL/NEST_250323_250411
Source: Hugging Face | Updated: 2026-04-14 | Indexed: 2026-05-10
Download link:
https://hf-mirror.com/datasets/FudanCVL/NEST_250323_250411
Resource overview:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: images
    list: image
  - name: annotations
    list: image
  splits:
  - name: train
    num_bytes: 881762328
    num_examples: 1507
  download_size: 1756619925
  dataset_size: 881762328
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: cc-by-nc-4.0
task_categories:
- visual-question-answering
- image-segmentation
language:
- en
size_categories:
- 1K<n<10K
---
# NEST_250323_250411
**NEST (Novel Emerging Segmentation Task)** is a benchmark dataset introduced in the CVPR 2026 Findings paper [ROSE: Retrieval-Oriented Segmentation Enhancement](https://henghuiding.com/ROSE/). It targets the segmentation of (i) novel entities, which multimodal large language models (MLLMs) fail to recognize because they are absent from the training data, and (ii) emerging entities, which exist within the model's knowledge but require up-to-date external information to be recognized accurately.
## Dataset Description
NEST targets two categories of challenging entities that MLLM-based segmentation models struggle with:
- **Novel entities**: objects entirely absent from MLLMs' training data (e.g., newly released products)
- **Emerging entities**: objects within the model's prior knowledge but requiring up-to-date context for accurate segmentation (e.g., current officeholders, recent event participants)
This dataset contains **1,548 image-question-answer-mask samples** built from news articles published between **March 23, 2025 and April 11, 2025**, covering diverse domains including economics, technology, politics, entertainment, sports, and society.
Each sample includes:
- A **natural language question** about a named entity depicted in the image
- The **ground-truth answer** (entity name)
- **One or more images** containing the target entity alongside other entities
- **Segmentation mask annotations** for the target entity
## Dataset Statistics
| Statistic | Value |
|---|---|
| Total QA pairs | 1,548 |
| Average entities per image | ~2.7 |
| Average questions per image | ~1.6 |
| Date range | Mar 23 – Apr 11, 2025 |
| Domains | Economics, Technology, Politics, Entertainment, Sports, Society |
| Entity types | People, Products |
| Image format | JPEG (images), PNG (annotations/masks) |
## Usage
```python
from datasets import load_dataset
ds = load_dataset("FudanCVL/NEST_250323_250411")
# Access a sample
sample = ds["train"][0]
print(sample["question"]) # Question about the target entity
print(sample["answer"]) # Ground-truth entity name
print(sample["images"]) # List of PIL images (multi-entity scenes)
print(sample["annotations"]) # List of segmentation mask PIL images
```
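The metadata above lists a download size of roughly 1.76 GB, so it can be convenient to look at a few records before fetching everything. The following is a minimal sketch, assuming the `streaming=True` mode of `datasets.load_dataset` and the repository ID `FudanCVL/NEST_250323_250411` from this page. The helper names `describe_sample` and `preview` are hypothetical and not part of the dataset or the ROSE code.

```python
from itertools import islice


def describe_sample(sample):
    """Return a small text-and-shape summary of one record."""
    images = sample["images"]
    masks = sample["annotations"]
    return {
        "id": sample["id"],
        "question": sample["question"],
        "answer": sample["answer"],
        "num_images": len(images),
        "num_annotations": len(masks),
        # PIL images expose .size as (width, height).
        "image_sizes": [img.size for img in images],
    }


def preview(repo_id="FudanCVL/NEST_250323_250411", limit=3):
    """Stream the first `limit` training records and summarize them."""
    # Imported lazily so describe_sample works without `datasets` installed.
    from datasets import load_dataset

    stream = load_dataset(repo_id, split="train", streaming=True)
    return [describe_sample(s) for s in islice(stream, limit)]
```

Calling `preview()` needs network access and the `datasets` package; `describe_sample` also works on records obtained from the non-streaming `ds["train"]` shown above.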
## Data Fields
| Field | Type | Description |
|---|---|---|
| `id` | string | Unique identifier for each sample |
| `question` | string | Natural language question about a novel or emerging named entity |
| `answer` | string | Ground-truth entity name |
| `images` | list of images | Scene images containing the target and other entities |
| `annotations` | list of images | Binary segmentation masks for the target entity |
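Because `annotations` holds binary masks, a predicted mask can be compared against the ground truth with intersection over union (IoU). The sketch below is an illustration only: this card does not state which metric the ROSE paper reports, and the helpers `to_binary_mask` and `mask_iou` are hypothetical names. It assumes foreground pixels are nonzero and that masks are single-channel or RGB without an alpha channel.

```python
import numpy as np


def to_binary_mask(mask, threshold=0):
    """Convert a PIL image or array-like mask to a 2-D boolean array."""
    arr = np.asarray(mask)
    if arr.ndim == 3:
        # Assumption: no alpha channel; a pixel is foreground if any channel is set.
        arr = arr.max(axis=-1)
    return arr > threshold


def mask_iou(pred, target):
    """Intersection over union of two masks with identical height and width."""
    pred_b = to_binary_mask(pred)
    target_b = to_binary_mask(target)
    if pred_b.shape != target_b.shape:
        raise ValueError(f"shape mismatch: {pred_b.shape} vs {target_b.shape}")
    union = np.logical_or(pred_b, target_b).sum()
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0
    intersection = np.logical_and(pred_b, target_b).sum()
    return float(intersection / union)
```

For a loaded sample, `mask_iou(predicted_mask, sample["annotations"][0])` would score a prediction against the first ground-truth mask, provided the prediction has the same resolution.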
## Automated Data Collection & Annotation
To reflect the dynamic nature of the NEST task, where evaluation data must be continuously refreshed to prevent leakage into future model training, this dataset is collected and annotated via a fully automated pipeline that requires no human intervention. For the complete implementation and usage instructions, please refer to:
👉 **[https://github.com/FudanCVL/ROSE](https://github.com/FudanCVL/ROSE)**
The pipeline continuously retrieves up-to-date image–news pairs from the web, constructs VQA samples, and generates segmentation mask annotations, enabling scalable and timely evaluation of models' novel emerging segmentation capabilities.
## Citation
If you use this dataset, please cite:
```bibtex
@inproceedings{tang2026rose,
title={{ROSE}: Retrieval-Oriented Segmentation Enhancement},
author={Tang, Song and Jie, Guangquan and Ding, Henghui and Jiang, Yu-Gang},
booktitle={CVPR 2026 Findings},
year={2026}
}
```
Provider:
FudanCVL



