MP3D-EQA
Source: embodiedqa.org (indexed 2025-03-21)
Download link:
https://embodiedqa.org/
Overview:
MP3D-EQA v1 was created jointly by the Georgia Institute of Technology and Facebook AI Research to advance research on Embodied Question Answering (EmbodiedQA) in photorealistic environments. Built on the Matterport3D dataset, it contains 1,136 question-answer pairs spread across 83 distinct indoor environments. The questions mainly concern object properties such as color and location. They are generated programmatically from the Matterport3D annotations; for questions of the form "What color ...?", object color labels were collected through Amazon Mechanical Turk. Because the tasks are posed inside Matterport3D's 3D reconstructions of real indoor spaces, which are visually realistic and semantically rich, they give embodied agents more challenging navigation and question-answering scenarios. The dataset is intended mainly for developing embodied agents that navigate complex environments and answer questions about them, supporting research in visual navigation, semantic understanding, and multimodal interaction. It serves as a benchmark for evaluating an agent's navigation and question-answering abilities in scenes reconstructed from the real world.
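To make the idea of programmatic question generation concrete, the following is a minimal sketch of template filling over object annotations. It is not the dataset authors' pipeline: the `ObjectAnnotation` record, its field names, the `generate_questions` helper, the wording of the location template, and the rule of skipping object categories that appear more than once in a scene are all assumptions made for illustration. Only the "What color ...?" question form comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical annotation record; the real Matterport3D schema is different.
@dataclass(frozen=True)
class ObjectAnnotation:
    scene_id: str
    room: str
    category: str
    color: Optional[str]  # None when no crowdsourced color label is available


def generate_questions(annotations):
    """Fill simple question templates from object annotations (illustrative)."""
    # Count objects per (scene, category) so ambiguous questions can be skipped.
    counts = {}
    for ann in annotations:
        key = (ann.scene_id, ann.category)
        counts[key] = counts.get(key, 0) + 1

    qa_pairs = []
    for ann in annotations:
        # Assumed filter: keep only categories that are unique within a scene,
        # so that each question has a single well-defined answer.
        if counts[(ann.scene_id, ann.category)] != 1:
            continue
        # Assumed location template.
        qa_pairs.append({
            "scene": ann.scene_id,
            "question": f"What room is the {ann.category} located in?",
            "answer": ann.room,
        })
        # Color questions need a color label, e.g. one collected by crowdsourcing.
        if ann.color is not None:
            qa_pairs.append({
                "scene": ann.scene_id,
                "question": f"What color is the {ann.category}?",
                "answer": ann.color,
            })
    return qa_pairs
```

Calling `generate_questions` on a list of such records returns a list of dictionaries with `scene`, `question`, and `answer` keys. A real pipeline would also need to balance answer distributions and check that the target object is reachable, which this sketch leaves out.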
Provided by:
Georgia Institute of Technology
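Background and Challenges
Background Overview
MP3D-EQA is a dataset for visual navigation and question answering in 3D environments. It supports agents that explore an environment from a first-person view and answer questions about it. The dataset is associated with several CVPR papers and is accompanied by PyTorch code.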