
FudanCVL/NEST_250323_250411

Source: Hugging Face · Updated 2026-04-14 · Indexed 2026-05-10
Download link:
https://hf-mirror.com/datasets/FudanCVL/NEST_250323_250411
Description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: images
    list: image
  - name: annotations
    list: image
  splits:
  - name: train
    num_bytes: 881762328
    num_examples: 1507
  download_size: 1756619925
  dataset_size: 881762328
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: cc-by-nc-4.0
task_categories:
- visual-question-answering
- image-segmentation
language:
- en
size_categories:
- 1K<n<10K
---

# NEST_250323_250411

**NEST (Novel Emerging Segmentation Task)** is a benchmark dataset for segmenting (i) novel entities that MLLMs fail to recognize because they are absent from the training data, and (ii) emerging entities that exist within the model's knowledge but demand up-to-date external information for accurate recognition. It was introduced in the CVPR 2026 Findings paper [ROSE: Retrieval-Oriented Segmentation Enhancement](https://henghuiding.com/ROSE/).

## Dataset Description

NEST targets two categories of challenging entities that MLLM-based segmentation models struggle with:

- **Novel entities**: objects entirely absent from MLLMs' training data (e.g., newly released products)
- **Emerging entities**: objects within the model's prior knowledge but requiring up-to-date context for accurate segmentation (e.g., current officeholders, recent event participants)

This dataset contains **1,548 image-question-answer-mask samples** built from news articles published between **March 23, 2025 and April 11, 2025**, covering diverse domains including economics, technology, politics, entertainment, sports, and society.
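The collection window also appears to be encoded in the dataset name. The sketch below assumes the suffix follows a `_YYMMDD_YYMMDD` convention, which is inferred from the name and the dates above rather than documented anywhere in the card. The helper names and the training cutoff date are hypothetical and serve only to illustrate checking whether a window postdates a model's training data.

```python
from datetime import date, datetime


def parse_window(dataset_name):
    """Parse a trailing `_YYMMDD_YYMMDD` suffix into (start, end) dates.

    Assumption: the naming convention is inferred, not documented.
    """
    start_raw, end_raw = dataset_name.rsplit("_", 2)[-2:]
    start = datetime.strptime(start_raw, "%y%m%d").date()
    end = datetime.strptime(end_raw, "%y%m%d").date()
    if end < start:
        raise ValueError("window end precedes window start")
    return start, end


def postdates_cutoff(window_start, training_cutoff):
    """Return True if the whole window lies after the given cutoff date."""
    return window_start > training_cutoff


start, end = parse_window("FudanCVL/NEST_250323_250411")
window_days = (end - start).days + 1  # inclusive length of the window

# Hypothetical cutoff, for illustration only; not a claim about any real model.
hypothetical_cutoff = date(2024, 12, 31)
print(start, end, window_days, postdates_cutoff(start, hypothetical_cutoff))
```

A check like this only compares dates. It says nothing about whether a specific entity was actually seen during training.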
Each sample includes:

- A **natural language question** about a named entity depicted in the image
- The **ground-truth answer** (entity name)
- **One or more images** containing the target entity alongside other entities
- **Segmentation mask annotations** for the target entity

## Dataset Statistics

| Statistic | Value |
|---|---|
| Total QA pairs | 1,548 |
| Average entities per image | ~2.7 |
| Average questions per image | ~1.6 |
| Date range | Mar 23 – Apr 11, 2025 |
| Domains | Economics, Technology, Politics, Entertainment, Sports, Society |
| Entity types | People, Products |
| Image format | JPEG (images), PNG (annotations/masks) |

## Usage

```python
from datasets import load_dataset

ds = load_dataset("FudanCVL/NEST_250323_250411")

# Access a sample
sample = ds["train"][0]
print(sample["question"])     # Question about the target entity
print(sample["answer"])       # Ground-truth entity name
print(sample["images"])       # List of PIL images (multi-entity scenes)
print(sample["annotations"])  # List of segmentation mask PIL images
```

## Data Fields

| Field | Type | Description |
|---|---|---|
| `id` | string | Unique identifier for each sample |
| `question` | string | Natural language question about a novel or emerging named entity |
| `answer` | string | Ground-truth entity name |
| `images` | list of images | Scene images containing the target and other entities |
| `annotations` | list of images | Binary segmentation masks for the target entity |

## Automated Data Collection & Annotation

To reflect the dynamic nature of the NEST task, where evaluation data must be continuously refreshed to prevent leakage into future model training, this dataset is collected and annotated via a fully automated pipeline requiring no human intervention.
For the complete implementation and usage instructions, please refer to:

👉 **[https://github.com/FudanCVL/ROSE](https://github.com/FudanCVL/ROSE)**

The pipeline continuously retrieves up-to-date image–news pairs from the web, constructs VQA samples, and generates segmentation mask annotations, enabling scalable and timely evaluation of models' novel emerging segmentation capabilities.

## Citation

If you use this dataset, please cite:

```bibtex
@inproceedings{tang2026rose,
  title={{ROSE}: Retrieval-Oriented Segmentation Enhancement},
  author={Tang, Song and Jie, Guangquan and Ding, Henghui and Jiang, Yu-Gang},
  booktitle={CVPR 2026 Findings},
  year={2026}
}
```
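Returning to the `annotations` field listed under Data Fields: masks are delivered as images, so they usually need converting to boolean arrays before scoring. The following is a minimal sketch, not the evaluation protocol of the ROSE paper. It assumes that non-zero pixels mark the target entity, that masks are single-channel or RGB(A), and that intersection-over-union is an acceptable metric. The helper names `to_binary_mask` and `iou` are hypothetical.

```python
import numpy as np


def to_binary_mask(mask):
    """Convert a mask (PIL image or array) to a 2-D boolean array.

    Assumption: any non-zero pixel is foreground. For multi-channel
    input only the first three channels are inspected, so an opaque
    alpha channel does not turn the whole image into foreground.
    """
    arr = np.asarray(mask)
    if arr.ndim == 3:
        arr = arr[..., :3].max(axis=-1)
    return arr > 0


def iou(pred, target):
    """Intersection-over-union of two boolean masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as a perfect match
    return float(np.logical_and(pred, target).sum() / union)


# Synthetic stand-ins for sample["annotations"][0] and a model prediction.
gt = np.zeros((4, 4), dtype=np.uint8)
gt[1:3, 1:3] = 255                 # 4 foreground pixels
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True              # 6 foreground pixels, 4 overlapping
score = iou(pred, to_binary_mask(gt))
print(round(score, 3))
```

With real data, `to_binary_mask(sample["annotations"][0])` should work directly, since `np.asarray` accepts PIL images. Whether each mask aligns one-to-one with an entry in `images` is not stated in the card and should be verified before pairing them.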
Provider:
FudanCVL