
changdae/vittle-llavabench-coco-visual-perturbed

Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/changdae/vittle-llavabench-coco-visual-perturbed
Description:
---
license: mit
task_categories:
- visual-question-answering
tags:
- robustness
- LLaVA-Bench
- COCO
- perturbation
- vittle
pretty_name: "Vittle - Visually Perturbed LLaVA-Bench-COCO"
size_categories:
- n<1K
---

# Vittle - Visually Perturbed LLaVA-Bench-COCO

This dataset provides **visually perturbed** variants of the [LLaVA-Bench (COCO)](https://arxiv.org/abs/2304.08485) open-ended VQA benchmark. It is released as part of the [Vittle (Visual Instruction Bottleneck Tuning)](https://arxiv.org/abs/2505.13946) project (NeurIPS 2025).

## Overview

- **Questions**: 90 open-ended questions (conversation, detail, complex), with clean text and perturbed images
- **Images**: 30 unique COCO val2014 images, each with 9 visual perturbation variants (severity level 3)
- **Total image files**: 270 (30 images x 9 perturbations)

## Visual Perturbations

All perturbations are at severity level 3, generated following [MM-Robustness](https://github.com/Jielin-Qiu/MM_Robustness):

| Perturbation | Folder |
|---|---|
| Gaussian Noise | `images/COCO_IP_gaussian_noise_3/` |
| Shot Noise | `images/COCO_IP_shot_noise_3/` |
| Speckle Noise | `images/COCO_IP_speckle_noise_3/` |
| Fog | `images/COCO_IP_fog_3/` |
| Contrast | `images/COCO_IP_contrast_3/` |
| Brightness | `images/COCO_IP_brightness_3/` |
| Defocus Blur | `images/COCO_IP_defocus_blur_3/` |
| Zoom Blur | `images/COCO_IP_zoom_blur_3/` |
| Frost | `images/COCO_IP_frost_3/` |

## File Structure

```
.
├── README.md
├── qa90_questions.jsonl            # 90 questions (clean text)
└── images/
    ├── COCO_IP_gaussian_noise_3/   # 30 images
    ├── COCO_IP_shot_noise_3/
    ├── COCO_IP_speckle_noise_3/
    ├── COCO_IP_fog_3/
    ├── COCO_IP_contrast_3/
    ├── COCO_IP_brightness_3/
    ├── COCO_IP_defocus_blur_3/
    ├── COCO_IP_zoom_blur_3/
    └── COCO_IP_frost_3/
```

## Question Format (JSONL)

```json
{"question_id": 1, "image": "COCO_val2014_000000367571.jpg", "text": "What are the colors of the bus in the image?", "category": "conv"}
```

## Citation

```bibtex
@inproceedings{oh2025vittle,
  title={Visual Instruction Bottleneck Tuning},
  author={Oh, Changdae and Li, Jiatong and Im, Shawn and Li, Yixuan},
  booktitle={Advances in Neural Information Processing Systems},
  year={2025}
}
```

## License

MIT
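
## Example: Pairing Questions with Perturbed Images

The question file and the perturbation folders described above can be joined with a few lines of Python. The following is a minimal sketch, not part of the official release: the helper names `parse_questions` and `image_paths` are hypothetical, and it assumes that each perturbation folder stores its images under the same filename as the `image` field of the question record, which the README does not state explicitly.

```python
import json
from pathlib import Path

# Perturbation names taken from the folder list in this README.
PERTURBATIONS = [
    "gaussian_noise", "shot_noise", "speckle_noise",
    "fog", "contrast", "brightness",
    "defocus_blur", "zoom_blur", "frost",
]
SEVERITY = 3  # every folder in this release uses severity level 3


def parse_questions(lines):
    """Parse JSONL lines into dicts, skipping blank lines (hypothetical helper)."""
    return [json.loads(line) for line in lines if line.strip()]


def image_paths(question, root="."):
    """Map each perturbation name to the expected image path for one question.

    Assumption: filenames inside each folder match question["image"].
    """
    return {
        name: Path(root) / "images" / f"COCO_IP_{name}_{SEVERITY}" / question["image"]
        for name in PERTURBATIONS
    }


# The sample record shown under "Question Format (JSONL)".
sample = [
    '{"question_id": 1, "image": "COCO_val2014_000000367571.jpg", '
    '"text": "What are the colors of the bus in the image?", "category": "conv"}'
]
questions = parse_questions(sample)
paths = image_paths(questions[0])

for name, path in paths.items():
    print(name, path.as_posix())
```

To run this against the full dataset, pass the lines of `qa90_questions.jsonl` to `parse_questions` and set `root` to the local directory where the repository was downloaded. Building paths from the folder naming pattern keeps the nine variants of each question aligned, so a model can be evaluated per perturbation type.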
Provider:
changdae