II-Bench
Source: ModelScope community (updated 2025-12-02, indexed 2024-08-31)
Download link: https://modelscope.cn/datasets/m-a-p/II-Bench
Resource description:
# II-Bench
[**🌐 Homepage**](https://ii-bench.github.io/) | [**🤗 Paper**](https://huggingface.co/papers/2406.05862) | [**📖 arXiv**](https://arxiv.org/abs/2406.05862) | [**🤗 Dataset**](https://huggingface.co/datasets/m-a-p/II-Bench) | [**GitHub**](https://github.com/II-Bench/II-Bench)
<div style="text-align: center;">
<img src="intr.png" width="40%">
</div>
## Introduction
**II-Bench** comprises 1,222 images, each accompanied by 1 to 3 multiple-choice questions, totaling 1,434 questions. II-Bench encompasses images from six distinct domains: Life, Art, Society, Psychology, Environment and Others. It also features a diverse array of image types, including Illustrations, Memes, Posters, Multi-panel Comics, Single-panel Comics, Logos and Paintings. The detailed statistical information can be found in the image below.
<div style="text-align: center;">
<img src="II-bench-type.jpg" width="80%">
</div>
## Example
Here are some examples of II-Bench:
<div style="text-align: center;">
<img src="II-bench-sample.jpg" width="80%">
</div>
## 🏆 Mini-Leaderboard
| Open-source Models | Score |
|---------------------------|-------|
| InstructBLIP-T5-XL | 47.3 |
| BLIP-2 FLAN-T5-XL | 52.8 |
| mPLUG-Owl2 | 53.2 |
| Qwen-VL-Chat | 53.4 |
| InstructBLIP-T5-XXL | 56.7 |
| Mantis-8B-siglip-Llama3 | 57.5 |
| BLIP-2 FLAN-T5-XXL | 57.8 |
| DeepSeek-VL-Chat-7B | 60.3 |
| Yi-VL-6B-Chat | 61.3 |
| InternLM-XComposer2-VL | 62.1 |
| InternVL-Chat-1.5 | 66.3 |
| Idefics2-8B | 67.7 |
| Yi-VL-34B-Chat | 67.9 |
| MiniCPM-Llama3-2.5 | 69.4 |
| CogVLM2-Llama3-Chat | 70.3 |
| LLaVA-1.6-34B |**73.8**|
| **Closed-source Models** |**Score**|
| GPT-4V | 65.9 |
| GPT-4o | 72.6 |
| Gemini-1.5 Pro | 73.9 |
| Qwen-VL-MAX | 74.8 |
| Claude 3.5 Sonnet |**80.9**|
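
The card does not state how the scores above are computed. Assuming they are accuracy percentages over the multiple-choice questions, a minimal sketch of such a metric might look like the following. The function name `accuracy_percent` and the toy option letters are hypothetical and are not taken from the II-Bench evaluation code.

```python
from typing import Sequence


def accuracy_percent(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the percentage of questions answered correctly, rounded to one decimal."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("at least one question is required")
    # Compare option letters case-insensitively, ignoring surrounding whitespace.
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return round(100.0 * correct / len(answers), 1)


# Hypothetical toy data: gold and predicted option letters for five questions.
toy_answers = ["A", "C", "B", "D", "A"]
toy_predictions = ["A", "c", "B", "B", "A"]
print(accuracy_percent(toy_predictions, toy_answers))  # prints 80.0
```

Under this assumption, a score of 80.9 would correspond to roughly 1,160 of the 1,434 questions answered correctly.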
## Disclaimers
The guidelines for the annotators emphasized strict compliance with copyright and licensing rules from the initial data source, specifically avoiding materials from websites that forbid copying and redistribution.
Should you encounter any data samples potentially breaching the copyright or licensing regulations of any site, we encourage you to [contact](#contact) us. Upon verification, such samples will be promptly removed.
## Contact
- Ziqiang Liu: zq.liu4@siat.ac.cn
- Feiteng Fang: feitengfang@mail.ustc.edu.cn
- Xi Feng: fengxi@ustc.edu
- Xinrun Du: duxinrun2000@gmail.com
- Chenhao Zhang: ch_zhang@hust.edu.cn
- Ge Zhang: gezhang@umich.edu
- Shiwen Ni: sw.ni@siat.ac.cn
## Citation
**BibTeX:**
```bibtex
@misc{liu2024iibench,
title={II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models},
author={Ziqiang Liu and Feiteng Fang and Xi Feng and Xinrun Du and Chenhao Zhang and Zekun Wang and Yuelin Bai and Qixuan Zhao and Liyang Fan and Chengguang Gan and Hongquan Lin and Jiaming Li and Yuansheng Ni and Haihong Wu and Yaswanth Narsupalli and Zhigang Zheng and Chengming Li and Xiping Hu and Ruifeng Xu and Xiaojun Chen and Min Yang and Jiaheng Liu and Ruibo Liu and Wenhao Huang and Ge Zhang and Shiwen Ni},
year={2024},
eprint={2406.05862},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
Provider: maas
Created: 2024-07-02



