Visual Spatial Reasoning
Source: OpenXLab (indexed 2026-04-18)
Download link:
https://openxlab.org.cn/datasets/OpenDataLab/Visual_Spatial_Reasoning
Description:
The Visual Spatial Reasoning (VSR) corpus is a collection of caption-image pairs with true/false labels. Each caption describes the spatial relationship between two distinct objects in the corresponding image, and a vision-language model (VLM) must judge whether the caption describes the image correctly (True) or incorrectly (False).
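The task format described above can be sketched in a few lines of Python. This is only an illustration: the field names (`image`, `caption`, `label`), the file names, the captions, and the `always_true` baseline are assumptions made up for the example, not the dataset's documented schema or contents.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class VSRExample:
    """One caption-image pair (field names are hypothetical)."""
    image: str    # assumed: a file name or path identifying the image
    caption: str  # statement about the spatial relation of two objects
    label: bool   # True if the caption describes the image correctly


def accuracy(examples: Iterable[VSRExample],
             predict: Callable[[str, str], bool]) -> float:
    """Fraction of examples where predict(image, caption) equals the label."""
    items = list(examples)
    if not items:
        raise ValueError("need at least one example")
    correct = sum(predict(ex.image, ex.caption) == ex.label for ex in items)
    return correct / len(items)


# Made-up records for illustration only; not taken from the corpus.
toy_examples = [
    VSRExample("img_001.jpg", "The cat is under the table.", True),
    VSRExample("img_002.jpg", "The cup is left of the laptop.", False),
]


def always_true(image: str, caption: str) -> bool:
    """Trivial stand-in for a model: accepts every caption."""
    return True


print(accuracy(toy_examples, always_true))
```

A real evaluation would replace `always_true` with a call into a vision-language model that receives the image and the caption and returns a boolean judgment.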
Provided by:
OpenDataLab
Created:
2023-10-20



