MMVP-VLM

Source: arXiv. Indexed 2025-09-30.

Download link:

https://tsb0601.github.io/mmvp_blog/

Overview:

This dataset evaluates CLIP-based models on pairs of images that are visually distinct yet have similar feature embeddings. It describes nine typical scenarios in which CLIP-based models commonly fail. The dataset is moderate in scale, and its task is the quantitative assessment of hallucination in vision-language models.
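To make "similar feature embeddings" concrete, the following is a minimal sketch of how pairs with close embeddings might be flagged using cosine similarity. It is an illustration only, not the dataset authors' procedure: the function names, the toy vectors, and the `0.95` threshold are all assumptions, and the vectors are placeholders rather than real CLIP outputs. Whether a flagged pair is also visually distinct has to be judged separately.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def similar_embedding_pairs(embeddings, threshold=0.95):
    """Return (i, j, similarity) for every pair at or above the threshold.

    The threshold value is a hypothetical choice for illustration.
    """
    pairs = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            sim = cosine_similarity(embeddings[i], embeddings[j])
            if sim >= threshold:
                pairs.append((i, j, sim))
    return pairs


# Toy vectors standing in for image embeddings (not real CLIP outputs).
toy = [[1.0, 0.0, 0.0], [0.99, 0.1, 0.0], [0.0, 1.0, 0.0]]
pairs = similar_embedding_pairs(toy)
for i, j, sim in pairs:
    print(f"images {i} and {j}: cosine similarity {sim:.4f}")
```

In practice the embeddings would come from an image encoder, and the flagged pairs would then be checked by hand for clear visual differences.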
