changdae/vittle-llavabench-coco-visual-perturbed
Source: Hugging Face. Updated: 2026-04-10. Indexed: 2026-04-12.
Download link:
https://hf-mirror.com/datasets/changdae/vittle-llavabench-coco-visual-perturbed
Description:
---
license: mit
task_categories:
- visual-question-answering
tags:
- robustness
- LLaVA-Bench
- COCO
- perturbation
- vittle
pretty_name: "Vittle - Visually Perturbed LLaVA-Bench-COCO"
size_categories:
- n<1K
---
# Vittle - Visually Perturbed LLaVA-Bench-COCO
This dataset provides **visually perturbed** variants of the [LLaVA-Bench (COCO)](https://arxiv.org/abs/2304.08485) open-ended VQA benchmark.
It is released as part of the [Vittle (Visual Instruction Bottleneck Tuning)](https://arxiv.org/abs/2505.13946) project (NeurIPS 2025).
## Overview
- **Questions**: 90 open-ended questions (conversation, detail, complex); the question text is clean and only the images are perturbed
- **Images**: 30 unique COCO val2014 images, each with 9 visual perturbation variants (severity level 3)
- **Total image files**: 270 (30 images × 9 perturbations)
## Visual Perturbations
All perturbations are at severity level 3, generated following [MM-Robustness](https://github.com/Jielin-Qiu/MM_Robustness):
| Perturbation | Folder |
|---|---|
| Gaussian Noise | `images/COCO_IP_gaussian_noise_3/` |
| Shot Noise | `images/COCO_IP_shot_noise_3/` |
| Speckle Noise | `images/COCO_IP_speckle_noise_3/` |
| Fog | `images/COCO_IP_fog_3/` |
| Contrast | `images/COCO_IP_contrast_3/` |
| Brightness | `images/COCO_IP_brightness_3/` |
| Defocus Blur | `images/COCO_IP_defocus_blur_3/` |
| Zoom Blur | `images/COCO_IP_zoom_blur_3/` |
| Frost | `images/COCO_IP_frost_3/` |
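The folder names in the table follow a regular pattern, `images/COCO_IP_<perturbation>_<severity>/`. A minimal sketch of building them programmatically is given below. The `PERTURBATIONS` tuple and the `perturbation_folder` helper are illustrative names, not part of the dataset.

```python
from pathlib import PurePosixPath

# Perturbation identifiers exactly as they appear in the folder names above.
PERTURBATIONS = (
    "gaussian_noise", "shot_noise", "speckle_noise",
    "fog", "contrast", "brightness",
    "defocus_blur", "zoom_blur", "frost",
)


def perturbation_folder(name: str, severity: int = 3) -> PurePosixPath:
    """Return the relative folder for a perturbation (hypothetical helper)."""
    if name not in PERTURBATIONS:
        raise ValueError(f"unknown perturbation: {name!r}")
    # Only severity 3 is listed for this dataset; other severities would
    # produce folder names that are not in the table.
    return PurePosixPath("images") / f"COCO_IP_{name}_{severity}"


if __name__ == "__main__":
    for name in PERTURBATIONS:
        print(perturbation_folder(name))
```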
## File Structure
```
.
├── README.md
├── qa90_questions.jsonl # 90 questions (clean text)
└── images/
    ├── COCO_IP_gaussian_noise_3/ # 30 images
    ├── COCO_IP_shot_noise_3/
    ├── COCO_IP_speckle_noise_3/
    ├── COCO_IP_fog_3/
    ├── COCO_IP_contrast_3/
    ├── COCO_IP_brightness_3/
    ├── COCO_IP_defocus_blur_3/
    ├── COCO_IP_zoom_blur_3/
    └── COCO_IP_frost_3/
```
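After downloading, the layout above can be sanity-checked against the stated counts (9 folders, 30 images each, 270 files in total). The sketch below assumes the images use a `.jpg` extension, as in the question file's `image` field. The function names `count_images` and `check_layout` are hypothetical.

```python
from pathlib import Path
from typing import Dict


def count_images(root: str) -> Dict[str, int]:
    """Map each folder under <root>/images to its number of .jpg files."""
    images_dir = Path(root) / "images"
    return {
        folder.name: sum(1 for p in folder.iterdir() if p.suffix.lower() == ".jpg")
        for folder in sorted(images_dir.iterdir())
        if folder.is_dir()
    }


def check_layout(root: str, per_folder: int = 30, folders: int = 9) -> bool:
    """Return True if the local copy has the expected folder and file counts."""
    counts = count_images(root)
    return len(counts) == folders and all(n == per_folder for n in counts.values())


if __name__ == "__main__":
    # "." is a placeholder for the directory holding the downloaded dataset.
    print(count_images("."))
    print("layout OK:", check_layout("."))
```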
## Question Format (JSONL)
```json
{"question_id": 1, "image": "COCO_val2014_000000367571.jpg", "text": "What are the colors of the bus in the image?", "category": "conv"}
```
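A record in this format can be paired with its perturbed images roughly as follows. This is a sketch under one assumption: that the file name inside each perturbation folder matches the record's `image` field, which this card does not state explicitly. `load_questions` and `perturbed_paths` are illustrative helpers.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List


def load_questions(jsonl_text: str) -> List[dict]:
    """Parse JSONL content: one JSON object per non-empty line."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def perturbed_paths(question: dict, root: str, perturbations: Iterable[str]) -> Dict[str, Path]:
    """Map each perturbation name to the expected image path for a question."""
    # ASSUMPTION: perturbed files keep the original COCO file name.
    return {
        name: Path(root) / "images" / f"COCO_IP_{name}_3" / question["image"]
        for name in perturbations
    }


if __name__ == "__main__":
    sample = (
        '{"question_id": 1, "image": "COCO_val2014_000000367571.jpg", '
        '"text": "What are the colors of the bus in the image?", "category": "conv"}'
    )
    question = load_questions(sample)[0]
    for name, path in perturbed_paths(question, ".", ["fog", "frost"]).items():
        print(name, path.as_posix())
```

In practice the text passed to `load_questions` would come from reading `qa90_questions.jsonl`. Evaluating every question under every perturbation would give 90 × 9 = 810 question-image pairs.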
## Citation
```bibtex
@inproceedings{oh2025vittle,
title={Visual Instruction Bottleneck Tuning},
author={Oh, Changdae and Li, Jiatong and Im, Shawn and Li, Yixuan},
booktitle={Advances in Neural Information Processing Systems},
year={2025}
}
```
## License
MIT
Provider:
changdae



