Sterzhang/vlm2-bench
Source: Hugging Face. Updated: 2025-10-26. Indexed: 2025-08-30.
Download link:
https://hf-mirror.com/datasets/Sterzhang/vlm2-bench
Description:
VLM2-Bench is a comprehensive benchmark that evaluates the ability of vision-language models (VLMs) to visually link matching cues across multi-image sequences and videos. It consists of 9 subtasks with over 3,000 test cases, designed to assess fundamental visual linking capabilities that humans use daily, such as identifying the same person across different photos without prior knowledge of their identity. The dataset is organized into three main categories: General Cue, Object-centric Cue, and Person-centric Cue.
Provider:
Sterzhang



