PhysicalAI-Robotics-Manipulation-Augmented

Name: PhysicalAI-Robotics-Manipulation-Augmented
Creator: maas
Published: 2025-12-04 09:19:27
License: No description provided

ModelScope community; updated 2025-12-04; indexed 2025-12-06

Download link:

https://modelscope.cn/datasets/nv-community/PhysicalAI-Robotics-Manipulation-Augmented

Resource description:

## Dataset Description:

This is a fully annotated, synthetically generated dataset consisting of 1,000 demonstrations of a single Franka Panda robot arm performing a fixed-order three-cube stacking task in Isaac Lab. The robot consistently stacks cubes in the order: blue (bottom) → red (middle) → green (top).

The dataset was produced using the following pipeline:

- Collected 10 human teleoperation demonstrations of the stacking task.
- Used Isaac Lab’s **Mimic** tool [1] to simulate 1,000 high-quality trajectories in Isaac Sim.
- Applied the **Cosmos Transfer1** model [2] to augment the RGB visuals from the table camera with photorealistic domain adaptation.

Each demonstration includes synchronized multimodal data:

- RGB videos from both a table-mounted and a wrist-mounted camera.
- Depth, segmentation, and surface normal maps from the table camera.
- Full low-level robot and object states (joints, end-effector, gripper, cube poses).
- Action sequences executed by the robot.

This dataset is ideal for behavior cloning, policy learning, and generalist robotic manipulation research.

This dataset is ready for commercial use.

## Dataset Owner(s):

NVIDIA Corporation

## Dataset Creation Date:

05/14/2025

## License/Terms of Use:

This dataset is governed by the Creative Commons Attribution 4.0 International License (CC-BY-4.0).

## Intended Usage:

This dataset is intended for:

- Training robot manipulation policies using behavior cloning.
- Research in generalist robotics and task-conditioned agents.
- Sim-to-real transfer studies and visual domain adaptation.

## Dataset Characterization:

**Data Collection Method**

* Automated
* Automatic/Sensors
* Synthetic

Ten human teleoperated demonstrations were used to bootstrap a Mimic-based simulation [1] in Isaac Sim. All 1,000 demos were generated automatically and then visually augmented with domain randomization using Cosmos Transfer1 [2].

**Labeling Method**

* Not Applicable

## Dataset Format:

We provide the Mimic generated 1000 demonstrations and the 1000 Cosmos augmented demonstrations in separate HDF5 dataset files (`mimic_dataset_1k.hdf5` and `cosmos_dataset_1k.hdf5` respectively). Each demo in each file consists of a time-indexed sequence of the following modalities:

**Actions**

- 7D vector: 6D relative end-effector motion + 1D gripper action

**Observations**

- Robot states: Joint positions, velocities, and gripper open/close state
- EEF states: End-effector 6-DOF pose
- Cube states: Poses (positions + orientations) for blue, red, and green cubes
- Table camera visuals:
  - 200×200 RGB
  - 200×200 Depth
  - 200×200 Segmentation mask
  - 200×200 Surface normal map
- Wrist camera visuals:
  - 200×200 RGB

The datasets' trajectories can be replayed in simulation using Isaac Lab (refer to [this script](https://github.com/isaac-sim/IsaacLab/blob/main/scripts/tools/replay_demos.py)). The videos can be extracted in MP4 format using [this script](https://huggingface.co/datasets/nvidia/PhysicalAI-Robotics-Manipulation-Augmented/blob/main/hdf5_to_mp4.py).

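
As a rough illustration of the format described above, the sketch below shows one way to inspect an HDF5 file and to split a 7D action into its two parts. It is not taken from the dataset's own tooling. The helper names `split_action` and `list_datasets` are hypothetical, the gripper command being the last element of the action is an assumption (the card gives the 6D + 1D breakdown but not the element order), and the local file path is an assumption. Because the card does not document the internal group names, the sketch walks the whole file with the third-party `h5py` package instead of guessing keys.

```python
from typing import List, Sequence, Tuple


def split_action(action: Sequence[float]) -> Tuple[List[float], float]:
    """Split one 7D action into (6D relative end-effector motion, gripper action).

    Assumption: the gripper action is the last element; the dataset card
    states the 6D + 1D breakdown but not the ordering.
    """
    if len(action) != 7:
        raise ValueError(f"expected a 7D action, got {len(action)} values")
    return list(action[:6]), float(action[6])


def list_datasets(path: str) -> List[Tuple[str, tuple, str]]:
    """Return (name, shape, dtype) for every array stored in an HDF5 file.

    Walking the full hierarchy avoids hard-coding group names, which the
    dataset card does not specify.
    """
    import h5py  # third-party; imported here so split_action works without it

    entries: List[Tuple[str, tuple, str]] = []

    def visit(name, obj):
        # visititems calls this for every group and dataset; keep datasets only.
        if isinstance(obj, h5py.Dataset):
            entries.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return entries


if __name__ == "__main__":
    # Made-up action values, for illustration only.
    eef_motion, gripper = split_action([0.01, 0.0, -0.02, 0.0, 0.0, 0.1, 1.0])
    print(eef_motion, gripper)
    # With a file downloaded locally (the path is an assumption):
    # for name, shape, dtype in list_datasets("mimic_dataset_1k.hdf5"):
    #     print(name, shape, dtype)
```

Listing names, shapes, and dtypes first is a cheap way to confirm the actual layout (for example, that image arrays are 200×200) before writing a loader that depends on specific keys.
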
## Dataset Quantification:

**Record Count**

* `mimic_dataset_1k`
  * Number of demonstrations/trajectories: 1000
  * Number of RGB videos: 2000 (1000 table camera + 1000 wrist camera)
  * Number of depth videos: 1000 (table camera)
  * Number of segmentation videos: 1000 (table camera)
  * Number of normal map videos: 1000 (table camera)
* `cosmos_dataset_1k`
  * Number of demonstrations/trajectories: 1000
  * Number of RGB videos: 2000 (1000 table camera + 1000 wrist camera)
  * Number of depth videos: 1000 (table camera)
  * Number of segmentation videos: 1000 (table camera)
  * Number of normal map videos: 1000 (table camera)

**Total Storage**

* 69.4 GB

## Reference(s):

```
[1] @inproceedings{mandlekar2023mimicgen,
  title={MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations},
  author={Mandlekar, Ajay and Nasiriany, Soroush and Wen, Bowen and Akinola, Iretiayo and Narang, Yashraj and Fan, Linxi and Zhu, Yuke and Fox, Dieter},
  booktitle={7th Annual Conference on Robot Learning},
  year={2023}
}

[2] @misc{nvidia2025cosmostransfer1conditionalworldgeneration,
  title = {Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control},
  author = {NVIDIA and Abu Alhaija, Hassan and Alvarez, Jose and Bala, Maciej and Cai, Tiffany and Cao, Tianshi and Cha, Liz and Chen, Joshua and Chen, Mike and Ferroni, Francesco and Fidler, Sanja and Fox, Dieter and Ge, Yunhao and Gu, Jinwei and Hassani, Ali and Isaev, Michael and Jannaty, Pooya and Lan, Shiyi and Lasser, Tobias and Ling, Huan and Liu, Ming-Yu and Liu, Xian and Lu, Yifan and Luo, Alice and Ma, Qianli and Mao, Hanzi and Ramos, Fabio and Ren, Xuanchi and Shen, Tianchang and Tang, Shitao and Wang, Ting-Chun and Wu, Jay and Xu, Jiashu and Xu, Stella and Xie, Kevin and Ye, Yuchong and Yang, Xiaodong and Zeng, Xiaohui and Zeng, Yu},
  journal = {arXiv preprint arXiv:2503.14492},
  year = {2025},
  url = {https://arxiv.org/abs/2503.14492}
}
```

## Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

Provider:

maas

Created:

2025-08-08
