MMBench

Indexed from arXiv on 2025-09-30

Download link:

https://opencompass.org.cn/mmbench

Overview:

MMBench is a benchmark that comprehensively evaluates the diverse skills of vision-language models (VLMs). It is also used as a reference benchmark for measuring and comparing the performance of MMICL (Multi-Modal In-Context Learning). The goal of the task is a comprehensive assessment across a wide range of skills.
