GRiD-A-3D
Source: arXiv | Updated: 2022-07-06 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2207.02624v1
Description:
GRiD-A-3D is a diagnostic visual question answering (VQA) dataset created at the University of Hamburg. It focuses on grounding relative directions through multi-task learning. The dataset comprises 8,000 rendered images of abstract objects, split into training, validation, and test sets, with a total of 432,948 associated questions. The images were generated with Blender, and objects are marked with arrows in six distinct colors so that relative directions are clearly represented. By using abstract objects, GRiD-A-3D removes biases that real-world objects may introduce, which makes it easier to analyze and train VQA models on reasoning tasks that involve relative directions. Its application area is research at the intersection of machine vision and natural language processing, with the goal of solving directional relation reasoning problems in complex spatial tasks.
Provider:
University of Hamburg
Created:
2022-07-06



