
changdae/vittle-llavabench-coco-textual-perturbed

Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/changdae/vittle-llavabench-coco-textual-perturbed
Description:
---
license: mit
task_categories:
- visual-question-answering
tags:
- robustness
- LLaVA-Bench
- COCO
- perturbation
- vittle
- text-perturbation
pretty_name: "Vittle - Textually Perturbed LLaVA-Bench-COCO"
size_categories:
- n<1K
---

# Vittle - Textually Perturbed LLaVA-Bench-COCO

This dataset provides **textually perturbed** variants of the [LLaVA-Bench (COCO)](https://arxiv.org/abs/2304.08485) open-ended VQA benchmark. It is released as part of the [Vittle (Visual Instruction Bottleneck Tuning)](https://arxiv.org/abs/2505.13946) project (NeurIPS 2025).

## Overview

- **Questions**: 90 base questions × 9 textual perturbation variants = 810 perturbed questions. Clean images are used.
- **Images**: 30 unique COCO val2014 images (clean, unperturbed)

## Textual Perturbations

Char/word-level perturbations were generated following [MM-Robustness](https://github.com/Jielin-Qiu/MM_Robustness); sentence-level perturbations (translation) were generated with GPT-4o.

### Char/Word-level Perturbations

| Perturbation | File | Description |
|---|---|---|
| Random Delete | `qa90_questions_rd_7.jsonl` | Random character deletion (severity 7) |
| Random Swap | `qa90_questions_rs_4.jsonl` | Random character swap (severity 4) |
| Random Insert | `qa90_questions_ri_4.jsonl` | Random character insertion (severity 4) |
| Keyboard Aug | `qa90_questions_KeyboardAug_3.jsonl` | Keyboard-based typo augmentation (severity 3) |
| Char Delete | `qa90_questions_RandomCharAug_delete_3.jsonl` | Random character deletion augmentation (severity 3) |
| Char Insert | `qa90_questions_RandomCharAug_insert_3.jsonl` | Random character insertion augmentation (severity 3) |

### Sentence-level Perturbations (Translation)

| Perturbation | File | Description |
|---|---|---|
| Hindi | `qa90_questions_Hindi.jsonl` | GPT-4o translation to Hindi |
| Greek | `qa90_questions_Greek.jsonl` | GPT-4o translation to Greek |
| Arabic | `qa90_questions_Arabic.jsonl` | GPT-4o translation to Arabic |

## File Structure

```
.
├── README.md
├── qa90_questions.jsonl                 # 90 original (clean) questions
├── questions_perturbed/
│   ├── qa90_questions_rd_7.jsonl
│   ├── qa90_questions_rs_4.jsonl
│   ├── qa90_questions_ri_4.jsonl
│   ├── qa90_questions_KeyboardAug_3.jsonl
│   ├── qa90_questions_RandomCharAug_delete_3.jsonl
│   ├── qa90_questions_RandomCharAug_insert_3.jsonl
│   ├── qa90_questions_Hindi.jsonl
│   ├── qa90_questions_Greek.jsonl
│   └── qa90_questions_Arabic.jsonl
└── images/
    └── val2014/                         # 30 clean COCO images
```

## Citation

```bibtex
@inproceedings{oh2025visual,
  title={Visual Instruction Bottleneck Tuning},
  author={Changdae Oh and Jiatong Li and Shawn Im and Sharon Li},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
  year={2025},
  url={https://openreview.net/forum?id=yzHiEmLSk8}
}
```

## License

MIT
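
## Loading the question files (sketch)

The file layout described in this card can be read with the Python standard library alone. The following is a minimal sketch, not an official loader: it assumes the repository has already been downloaded to a local directory, and the helper names (`perturbed_path`, `read_jsonl`, `load_all`) are hypothetical. The card does not document the fields inside each JSONL record, so the code treats every line as an opaque JSON object and assumes no key names.

```python
import json
from pathlib import Path

# File-name suffixes copied from the perturbation tables in this card.
PERTURBATIONS = [
    "rd_7", "rs_4", "ri_4",
    "KeyboardAug_3", "RandomCharAug_delete_3", "RandomCharAug_insert_3",
    "Hindi", "Greek", "Arabic",
]


def perturbed_path(root, name):
    """Path of one perturbed question file under a local copy of the repo."""
    return Path(root) / "questions_perturbed" / f"qa90_questions_{name}.jsonl"


def read_jsonl(path):
    """Parse a JSON Lines file into a list of objects, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def load_all(root):
    """Load the clean questions plus every perturbed variant present on disk."""
    root = Path(root)
    data = {"clean": read_jsonl(root / "qa90_questions.jsonl")}
    for name in PERTURBATIONS:
        path = perturbed_path(root, name)
        if path.exists():  # tolerate a partial download
            data[name] = read_jsonl(path)
    return data
```

Calling `load_all("path/to/local/copy")` returns a dictionary keyed by `"clean"` and by perturbation suffix. Given the counts in the Overview, each variant would be expected to hold 90 records; checking `len(records)` per key is a quick sanity test after download.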
Provided by:
changdae