Arena-GR1-Manipulation-Task

ModelScope community, updated 2025-12-04, indexed 2025-12-06
Download link:
https://modelscope.cn/datasets/nv-community/Arena-GR1-Manipulation-Task
Resource description:
## Dataset Description:

The Arena-GR1-Manipulation-Task dataset is a multimodal collection of trajectories generated in Isaac Lab. It supports a humanoid (GR1) manipulation task in the IsaacLab-Arena environment. Each entry provides the full context (state, vision, language, action) needed to train and evaluate generalist robot policies on the microwave-opening task.

| Dataset Name          | # Trajectories |
|-----------------------|----------------|
| GR1 Manipulation Task | 50             |

This dataset is suited to behavior cloning, policy learning, and generalist robotic manipulation research. It has been used for post-training the GR00T N1.5 model.

This dataset is ready for commercial use.

## Dataset Owner

NVIDIA Corporation

## Dataset Creation Date:

10/10/2025

## License/Terms of Use:

This dataset is governed by the Creative Commons Attribution 4.0 International License (CC-BY-4.0).

## Intended Usage:

This dataset is intended for:
- Training robot manipulation policies using behavior cloning.
- Research in generalist robotics and task-conditioned agents.
- Sim-to-real / Sim-to-Sim transfer studies.

## Dataset Characterization:

### Data Collection Method
- Automated
- Automatic/Sensors
- Synthetic

Ten human-teleoperated demonstrations were collected in Isaac Lab through a depth camera and keyboard. All 50 demos are generated automatically using MimicGen [1], a synthetic motion trajectory generation framework. Each demo is generated at 50 Hz.
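Because each demo is generated at 50 Hz, consecutive frames are nominally 0.02 s apart. The sketch below works through that arithmetic. It assumes that timestamps start at zero and are evenly spaced, which the dataset card does not state, and the function names are hypothetical.

```python
RATE_HZ = 50  # generation rate stated in the dataset card


def frame_time(frame_index, rate_hz=RATE_HZ):
    """Nominal simulation time in seconds, assuming frame 0 is at t = 0."""
    return frame_index / rate_hz


def nominal_timestamps(num_frames, rate_hz=RATE_HZ):
    """Evenly spaced timestamps for an episode of `num_frames` frames."""
    return [frame_time(i, rate_hz) for i in range(num_frames)]
```

Comparing these nominal values against the recorded `timestamp` field is one way to spot dropped or duplicated frames, provided the zero-start assumption holds for the files in question.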
### Labeling Method

Not Applicable

## Dataset Format:

We provide the following dataset files:
- 10 human-annotated demonstrations in an HDF5 dataset file (`arena_gr1_manipulation_dataset_annotated.hdf5`)
- 50 MimicGen-generated demonstrations in an HDF5 dataset file (`arena_gr1_manipulation_dataset_generated.hdf5`)
- a GR00T-Lerobot formatted dataset converted from the MimicGen-generated HDF5 dataset file (`lerobot`)

Each demo in the GR00T-Lerobot dataset consists of a time-indexed sequence of the following modalities:

### Actions
- `action` (FP64): desired joint positions for all body joints (36 DoF)

### Observations
- `observation.state` (FP64): joint positions for all body joints (54 DoF)

### Task-specific
- `timestamp` (FP64): simulation time in seconds of each recorded data entry.
- `annotation.human.action.task_description` (INT64): index referring to the language instruction recorded in the metadata
- `annotation.human.action.valid` (INT64): index indicating the validity of the annotation recorded in the metadata
- `episode_index` (INT64): index indicating the order of each demo
- `task_index` (INT64): index used in the multi-task data loader. Not applicable to Gr00t-N1 post training, always set to 0.

### Videos
- 512 x 512 RGB videos in mp4 format from a first-person-view camera

In addition, the following metadata files are provided:
- `episodes.jsonl` contains a list of all the episodes in the entire dataset. Each episode contains a list of tasks and the length of the episode.
- `tasks.jsonl` contains a list of all the tasks in the entire dataset.
- `modality.json` contains the modality configuration.
- `info.json` contains the dataset information.
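The field layout and metadata files described above can be sanity-checked with standard-library Python. The sketch below is illustrative and is not the dataset's official loader. The `episode_index`, `tasks`, and `length` keys for `episodes.jsonl` are assumed from the description (a list of tasks plus an episode length), the sample records are made up, and the helper names are hypothetical.

```python
import json

ACTION_DOF = 36  # length of `action`, per the dataset card
STATE_DOF = 54   # length of `observation.state`, per the dataset card


def parse_jsonl(text):
    """Parse JSON Lines text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def total_frames(episodes):
    """Sum the (assumed) per-episode `length` field."""
    return sum(ep["length"] for ep in episodes)


def check_frame(frame):
    """Return a list of problems found in one frame record (empty if none)."""
    problems = []
    if len(frame["action"]) != ACTION_DOF:
        problems.append(f"action has {len(frame['action'])} values, expected {ACTION_DOF}")
    if len(frame["observation.state"]) != STATE_DOF:
        problems.append(
            f"observation.state has {len(frame['observation.state'])} values, "
            f"expected {STATE_DOF}"
        )
    if frame["task_index"] != 0:
        problems.append("task_index is documented as always 0")
    return problems


# Made-up stand-in for the contents of `episodes.jsonl`.
SAMPLE_EPISODES = "\n".join([
    json.dumps({"episode_index": 0, "tasks": ["open the microwave"], "length": 400}),
    json.dumps({"episode_index": 1, "tasks": ["open the microwave"], "length": 350}),
])

episodes = parse_jsonl(SAMPLE_EPISODES)

# Made-up frame records: one matching the documented shapes, one not.
good_frame = {
    "action": [0.0] * ACTION_DOF,
    "observation.state": [0.0] * STATE_DOF,
    "task_index": 0,
}
bad_frame = {
    "action": [0.0] * 10,
    "observation.state": [0.0] * STATE_DOF,
    "task_index": 0,
}
```

With the real files, the same helpers would take the text of `episodes.jsonl` from the `lerobot` directory and frame records loaded from the converted dataset. The key names should be confirmed against the actual metadata before relying on them.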
## Dataset Quantification:

### Record Count

#### GR1 Manipulation Task
- Number of demonstrations/trajectories: 50
- Number of RGB videos: 50

### Total Storage

5.16 GB

## Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

## Reference(s):

[1] Mandlekar, Ajay; Nasiriany, Soroush; Wen, Bowen; Akinola, Iretiayo; Narang, Yashraj; Fan, Linxi; Zhu, Yuke; Fox, Dieter. "MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations." In 7th Annual Conference on Robot Learning, 2023.

Provider:
maas
Created:
2025-12-02