RoVI Book

Indexed from arXiv, 2025-09-30

Download link:

https://robotic-visual-instruction.github.io/


Overview:

RoVI Book is a dataset of 15,000 image-text question-answer pairs for robotic manipulation tasks with spatiotemporal constraints. Each pair consists of an initial task observation image annotated with RoVI, a simple default prompt query, and an answer generated by GPT-4o. Each answer contains a RoVI analysis, a task name, fine-grained planning steps, and Python functions. Of the tasks, 64% are single-step and 36% are multi-step. They cover five basic manipulation skills: moving objects, rotating objects, picking up, opening drawers/cabinets, and closing drawers/cabinets. The dataset also includes augmented visual variants and retains the original semantic task names from the Open-X Embodiment dataset.
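As a rough illustration of the description above, the sketch below models one question-answer pair and works out the single-step/multi-step counts implied by the stated percentages. The class name, field names, and example values are hypothetical and chosen only for illustration; they are not the dataset's actual schema or file format, which is not given here.

```python
from dataclasses import dataclass
from typing import List

# The five basic skills listed in the dataset description.
SKILLS = (
    "move object",
    "rotate object",
    "pick up",
    "open drawer/cabinet",
    "close drawer/cabinet",
)


@dataclass
class RoVIBookPair:
    """One image-text QA pair. All field names are hypothetical."""
    image_path: str            # initial observation image annotated with RoVI
    query: str                 # simple default prompt query
    rovi_analysis: str         # GPT-4o answer: analysis of the RoVI annotation
    task_name: str             # GPT-4o answer: task name
    planning_steps: List[str]  # GPT-4o answer: fine-grained planning steps
    python_function: str       # GPT-4o answer: Python function(s) as source text


# Counts implied by the stated split (64% single-step, 36% multi-step).
TOTAL_PAIRS = 15_000
single_step = TOTAL_PAIRS * 64 // 100
multi_step = TOTAL_PAIRS * 36 // 100

# A made-up example record, only to show how the fields fit together.
example = RoVIBookPair(
    image_path="images/000001.png",
    query="Describe the task shown by the visual instruction.",
    rovi_analysis="An arrow points from the cup to the plate.",
    task_name="move the cup onto the plate",
    planning_steps=["grasp the cup", "move above the plate", "release"],
    python_function="def run():\n    pass",
)

print(single_step, multi_step)
```

Under these assumptions the split works out to 9,600 single-step and 5,400 multi-step pairs.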
