EmbSpatial-Bench
Source: ModelScope community. Updated 2025-12-11; indexed 2025-11-22.
Download link:
https://modelscope.cn/datasets/comefly/EmbSpatial-Bench
Description:
### EmbSpatial-Bench
EmbSpatial-Bench is a benchmark for evaluating the embodied spatial understanding of large vision-language models (LVLMs). It is automatically derived from embodied scenes and covers 6 spatial relationships from an egocentric perspective.
The constructed benchmark comprises a total of 3,640 QA pairs, covering 294 object categories and 6 spatial relationships.
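
Because the benchmark spans several spatial relationships, it can be useful to report accuracy per relationship as well as overall. The following is a minimal sketch of such a breakdown, not the authors' evaluation code. The record fields `relation`, `answer`, and `prediction`, along with the toy records and their labels, are hypothetical and do not reflect the dataset's actual schema.

```python
from collections import defaultdict


def accuracy_by_relation(records):
    """Return {relation: accuracy} plus an "overall" entry.

    Each record is assumed (hypothetically) to be a dict with keys
    "relation", "answer", and "prediction".
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        total[rec["relation"]] += 1
        if rec["prediction"] == rec["answer"]:
            correct[rec["relation"]] += 1

    scores = {rel: correct[rel] / total[rel] for rel in total}
    n = sum(total.values())
    scores["overall"] = sum(correct.values()) / n if n else 0.0
    return scores


# Toy records for illustration only; labels and options are made up.
toy_records = [
    {"relation": "left", "answer": "A", "prediction": "A"},
    {"relation": "left", "answer": "B", "prediction": "C"},
    {"relation": "above", "answer": "D", "prediction": "D"},
]
scores = accuracy_by_relation(toy_records)
```

Grouping by relationship makes it easier to see whether a model's errors are concentrated in a particular kind of spatial relation rather than spread evenly.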
### EmbSpatial-SFT
EmbSpatial-SFT is an instruction-tuning dataset that provides QA data for two tasks: spatial relationship identification and object localization.
The first task uses the same setting as EmbSpatial-Bench, while the second serves as an auxiliary task that strengthens the model’s ability to ground target objects. Object localization can be regarded as a foundational skill for relationship identification.
EmbSpatial-SFT is built solely on the training scenes of MP3D.
More details are available in our paper: https://arxiv.org/abs/2406.05756
Provider:
maas
Created:
2025-11-03



