FlagEval/EmbSpatial-Bench
Source: Hugging Face. Updated: 2025-04-21. Indexed: 2025-07-05.
Download link:
https://hf-mirror.com/datasets/FlagEval/EmbSpatial-Bench
Description:
EmbSpatial-Bench is a benchmark for evaluating the embodied spatial understanding of Large Vision-Language Models (LVLMs). The dataset is derived automatically from embodied scenes and covers 6 spatial relationships as seen from an egocentric perspective. It contains 3,640 QA pairs in total, spanning 294 object categories. The dataset has been converted into a more accessible, easier-to-use format. Each sample includes a scene ID, a question ID, the question text, the spatial relationship described, the scene image, the answer options, the index of the correct answer, the list of objects in the image, the bounding box coordinates and names of those objects, and the Base64-encoded image data.
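The per-sample layout described above can be pictured as a small record type. The following is a minimal sketch only: the field names, the bounding-box layout, and the `accuracy` helper are assumptions made for illustration, not the dataset's actual column names or an official evaluation script.

```python
import base64
from dataclasses import dataclass, field


@dataclass
class SpatialSample:
    """One QA pair. All field names here are hypothetical."""
    scene_id: str
    question_id: str
    question: str
    relation: str                 # one of the 6 spatial relationships
    answer_options: list[str]
    answer_index: int             # index of the correct option
    # Each entry is assumed to look like {"name": str, "bbox": [...]}.
    objects: list[dict] = field(default_factory=list)
    image_base64: str = ""

    def image_bytes(self) -> bytes:
        # Decode the Base64 string back into raw image bytes.
        return base64.b64decode(self.image_base64)

    def prompt(self) -> str:
        # Render the question as a lettered multiple-choice prompt.
        letters = "ABCDEFGH"
        lines = [self.question]
        lines += [f"{letters[i]}. {opt}" for i, opt in enumerate(self.answer_options)]
        return "\n".join(lines)


def accuracy(samples, predictions):
    # Fraction of samples whose predicted option index matches answer_index.
    if not samples:
        return 0.0
    correct = sum(1 for s, p in zip(samples, predictions) if s.answer_index == p)
    return correct / len(samples)
```

With a record like this, a model's chosen option index can be compared directly against `answer_index`, and `image_bytes()` yields data that an image library of your choice could open.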
Provider:
FlagEval



