jnsungp/unitree-g1-robocasa-pick-apple-bowl-depth-1k

Name: jnsungp/unitree-g1-robocasa-pick-apple-bowl-depth-1k
Creator: jnsungp
Published: 2025-11-19 15:03:20
License: CC BY 4.0

Source: Hugging Face | Updated: 2025-11-19 | Indexed: 2025-12-20

Download link:

https://hf-mirror.com/datasets/jnsungp/unitree-g1-robocasa-pick-apple-bowl-depth-1k

Description:

---
license: cc-by-4.0
task_categories:
- robotics
tags:
- unitree-g1
- pick-and-place
- simulation
- curobo
- depth-perception
- rgbd
size_categories:
- 100K<n<1M
language:
- en
pretty_name: Unitree G1 Apple Pick and Place with Depth Dataset
---

# Unitree G1 Apple Pick and Place with Depth Dataset

<div align="center">
<table>
  <tr>
    <td align="center" style="vertical-align: top; padding: 5px;">
      <img src="https://huggingface.co/datasets/jnsungp/unitree-g1-robocasa-pick-apple-bowl-contact-1k/resolve/main/frontview.png" width="500px" alt="Front View"/>
      <br><b>Front View (Global)</b>
    </td>
    <td align="center" style="vertical-align: top; padding: 5px;">
      <img src="https://huggingface.co/datasets/jnsungp/unitree-g1-robocasa-pick-apple-bowl-contact-1k/resolve/main/sideview.png" width="500px" alt="Side View"/>
      <br><b>Side View (Profile)</b>
    </td>
  </tr>
  <tr>
    <td align="center" style="vertical-align: top; padding: 5px;">
      <img src="https://huggingface.co/datasets/jnsungp/unitree-g1-robocasa-pick-apple-bowl-contact-1k/resolve/main/birdview.png" width="500px" alt="Top-Down View"/>
      <br><b>Top-Down View (Bird's Eye)</b>
    </td>
    <td align="center" style="vertical-align: top; padding: 5px;">
      <img src="https://huggingface.co/datasets/jnsungp/unitree-g1-robocasa-pick-apple-bowl-contact-1k/resolve/main/rs_view.png" width="500px" alt="Ego-Centric View"/>
      <br><b>Ego-Centric View (Robot POV)</b>
    </td>
  </tr>
  <tr>
    <td colspan="2" align="center" style="padding-top: 20px;">
      <i>Multi-view perspectives of the Unitree G1 performing the pick-and-place task.</i>
    </td>
  </tr>
</table>
</div>

## Depth Data Visualization

To aid in understanding the raw depth values, here we provide a side-by-side comparison of a **normalized depth image** (for visual clarity) and its corresponding RGB frame.
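
The card does not say how the normalized image was produced beyond the 0-255 grayscale range. One plausible approach, shown here as a sketch rather than the author's actual procedure, is min-max scaling of the metric depth map. The function name `normalize_depth_for_display` and the synthetic input are illustrative assumptions.

```python
import numpy as np


def normalize_depth_for_display(depth: np.ndarray) -> np.ndarray:
    """Min-max scale a float depth map (meters) to uint8 values in [0, 255].

    Hypothetical helper: non-finite pixels are mapped to 0.
    """
    depth = np.asarray(depth, dtype=np.float32)
    finite = np.isfinite(depth)
    if not finite.any():
        return np.zeros(depth.shape, dtype=np.uint8)

    lo = float(depth[finite].min())
    hi = float(depth[finite].max())
    if hi <= lo:
        # Constant depth map: nothing to stretch.
        return np.zeros(depth.shape, dtype=np.uint8)

    scaled = (np.clip(depth, lo, hi) - lo) / (hi - lo)
    scaled = np.where(finite, scaled, 0.0)
    return np.round(scaled * 255.0).astype(np.uint8)


# Synthetic stand-in for a real frame: a ramp over the card's stated typical range.
demo = np.linspace(0.3, 5.0, 256 * 256, dtype=np.float32).reshape(256, 256)
gray = normalize_depth_for_display(demo)
```

The result can be passed to `plt.imshow(gray, cmap="gray")` for viewing. Per-frame min-max scaling is only suitable for display, because it discards the metric scale and makes frames incomparable with one another.
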
<div align="center">
<table>
  <tr>
    <td align="center" style="vertical-align: top; padding: 5px;">
      <img src="https://huggingface.co/datasets/jnsungp/unitree-g1-robocasa-pick-apple-bowl-depth-1k/resolve/main/depth_vis.png" width="500px" alt="Normalized Depth Image"/>
      <br><b>Normalized Depth Image</b><br>(for visualization)
    </td>
    <td align="center" style="vertical-align: top; padding: 5px;">
      <img src="https://huggingface.co/datasets/jnsungp/unitree-g1-robocasa-pick-apple-bowl-depth-1k/resolve/main/ego_vis.png" width="500px" alt="Corresponding RGB Image"/>
      <br><b>Corresponding RGB Image</b><br>(from `rs_view` camera)
    </td>
  </tr>
  <tr>
    <td colspan="2" align="center" style="padding-top: 15px;">
      <i>A sample depth frame (normalized to 0-255 for grayscale visualization) and its synchronized RGB counterpart.</i>
    </td>
  </tr>
</table>
</div>
<br>

## Dataset Description

The **Unitree G1 Apple Pick and Place with Depth Dataset** contains **963 high-quality trajectories** with **per-frame depth images** for RGB-D manipulation research. The robot picks up a red apple and places it into a bowl using bilateral arms and tri-finger hands. All trajectories include synchronized depth measurements from a head-mounted camera.

**Key Features:**
- 963 successful trajectories with depth data
- **256×256 depth images per timestep** (277,592 total frames)
- 28-DOF control: bilateral arms (7+7) + dexterous hands (7+7)
- 256×256 RGB video at 20 FPS (ego view)
- CuRobo motion planning (collision-free trajectories)
- MuJoCo + RoboCasa simulation with realistic depth rendering

**This dataset extends the base dataset with depth perception for RGB-D manipulation and 3D scene understanding research.**

**Note on Data Availability:** To maintain accessibility within storage limits, the depth data hosted here is a subset containing **10 sample episodes**. This allows users to verify the data structure and quality. The full dataset containing depth maps for all 963 trajectories is archived separately.
If you need the complete dataset for training large-scale models, please refer to the **Contact** section below.

## Dataset Owner

**Junsung Park** ([@jnsungp](https://huggingface.co/jnsungp))

## License

Creative Commons Attribution 4.0 International (CC BY 4.0)

## Dataset Format

| Modality | Type | Shape | Description |
|----------|------|-------|-------------|
| **Observation State** | `float32` | `(28,)` | Joint positions (radians) for arms + hands |
| **Observation Depth** | `float32` | `(256, 256)` | **Depth image (meters) from rs_view camera** |
| **Action** | `float32` | `(28,)` | Target joint positions |
| **Video** | RGB | `(256, 256, 3)` | Ego view, 20 FPS, H.264 |
| **Language** | `string` | - | _"Pick up the red apple and place it on the bowl"_ |

### Depth Image Specification

Depth measurements captured from the robot's head-mounted camera:

| Property | Value | Description |
|----------|-------|-------------|
| **Resolution** | 256 × 256 | Width × Height in pixels |
| **Data Type** | `float32` | 32-bit floating point |
| **Units** | Meters (m) | Distance from camera to surface |
| **Camera** | `rs_view` | Head-mounted RGB-D camera |
| **Format** | `.npy` | NumPy binary format |
| **Range** | ~0.3m to 5.0m | Typical depth range in scene |

**Loading Depth Data:**

```python
import numpy as np

# Load single depth frame
depth = np.load("depth/chunk-000/episode_000000/frame_000050.npy")
print(depth.shape)  # (256, 256)
print(f"Min depth: {depth.min():.2f}m, Max depth: {depth.max():.2f}m")
```

**Path Template:**

```
depth/chunk-{episode_chunk:03d}/episode_{episode_index:06d}/frame_{frame_index:06d}.npy
```

### Joint Configuration (28-DOF)

| Body Part | DOF | Description |
|-----------|-----|-------------|
| **Left Arm** | 7 | Shoulder (3) + Elbow (1) + Wrist (3) |
| **Right Arm** | 7 | Shoulder (3) + Elbow (1) + Wrist (3) |
| **Left Hand** | 7 | Index (2) + Middle (2) + Thumb (3) |
| **Right Hand** | 7 | Index (2) + Middle (2) + Thumb (3) |

## Dataset Statistics

- **Trajectories:** 963
- **Total Frames:** 277,592
- **Avg Episode Length:** ~288 frames (~14.4 seconds)
- **Episode Length Range:** 180-400 frames
- **Storage Size:** ~2.5 GB (data + videos + depth)
- **Success Rate:** 100%

## Download

```bash
huggingface-cli download \
  --repo-type dataset jnsungp/unitree-g1-robocasa-pick-apple-bowl-depth-1k \
  --local-dir ./datasets/g1-depth
```

### Using Python

```python
from datasets import load_dataset

dataset = load_dataset("jnsungp/unitree-g1-robocasa-pick-apple-bowl-depth-1k")
```

## Dataset Structure

```
dataset_depth_1k/
├── data/
│   └── chunk-000/
│       ├── episode_000000.parquet
│       ├── episode_000001.parquet
│       └── ...
├── videos/
│   └── chunk-000/
│       └── observation.images.ego_view/
│           ├── episode_000000.mp4
│           ├── episode_000001.mp4
│           └── ...
├── depth/
│   └── chunk-000/
│       ├── episode_000000/
│       │   ├── frame_000000.npy
│       │   ├── frame_000001.npy
│       │   └── ...
│       ├── episode_000001/
│       └── ...
├── meta/
│   ├── info.json          # Dataset metadata
│   ├── stats.json         # Statistics (mean, std, min, max)
│   ├── tasks.jsonl        # Task descriptions
│   └── episodes.jsonl     # Episode information
└── README.md
```

## Loading Data Example

```python
import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Load trajectory data
df = pd.read_parquet("data/chunk-000/episode_000000.parquet")

# Each row stores one 28-D vector, so stack the rows into (N, 28) arrays
observations = np.stack(df["observation.state"].to_numpy())  # (N, 28) - joint positions
actions = np.stack(df["action"].to_numpy())                  # (N, 28) - target positions

# Load depth image
episode_idx = 0
frame_idx = 100
depth = np.load(f"depth/chunk-000/episode_{episode_idx:06d}/frame_{frame_idx:06d}.npy")

# Read the corresponding RGB frame from the video
cap = cv2.VideoCapture("videos/chunk-000/observation.images.ego_view/episode_000000.mp4")
cap.set(cv2.CAP_PROP_POS_FRAMES, frame_idx)
ret, rgb = cap.read()
cap.release()
if not ret:
    raise RuntimeError(f"Could not read frame {frame_idx} from the video")

# Visualize depth next to RGB
plt.figure(figsize=(10, 5))

plt.subplot(1, 2, 1)
plt.imshow(depth, cmap="turbo")
plt.colorbar(label="Depth (m)")
plt.title("Depth Image")

plt.subplot(1, 2, 2)
plt.imshow(cv2.cvtColor(rgb, cv2.COLOR_BGR2RGB))  # OpenCV decodes frames as BGR
plt.title("RGB Image")

plt.show()
```

## Use Cases

### 1. RGB-D Manipulation

Train policies that leverage depth information for:
- Precise 3D localization of objects
- Distance-aware grasping
- Occlusion-robust perception

### 2. 3D Scene Understanding

- Point cloud generation from RGB-D pairs
- 3D object detection and segmentation
- Spatial reasoning for manipulation

### 3. Depth-Aware Policy Learning

- Multi-modal learning (RGB + Depth)
- Improved generalization with geometric cues
- Robustness to lighting variations

### 4. Sim-to-Real Transfer

- Fine-tune models with realistic depth sensing
- Domain adaptation with geometric constraints
- Depth-based safety checks

## Technical Details

**Simulation:**
- Platform: MuJoCo + RoboCasa
- Robot: Unitree G1 (upper body)
- Hands: Dex31 tri-finger hands
- Depth Rendering: MuJoCo native depth rendering

**Motion Planning:**
- CuRobo (GPU-accelerated)
- Collision-free trajectories
- Smooth cubic interpolation

**Depth Sensing:**
- Camera: Head-mounted RGB-D sensor (`rs_view`)
- Resolution: 256×256 pixels
- Format: 32-bit float, meters
- Per-frame depth synchronized with RGB

## Comparison with Base Dataset

| Feature | Base Dataset | **Depth Dataset** |
|---------|--------------|-------------------|
| Trajectories | 957 | **963** |
| Joint State | ✓ (28D) | ✓ (28D) |
| RGB Video | ✓ | ✓ |
| **Depth Images** | ✗ | **✓ (256×256)** |
| Use Case | Vision-based manipulation | **RGB-D 3D manipulation** |

## Citation

```bibtex
@dataset{park2025unitree_g1_depth,
  title={Unitree G1 Apple Pick and Place with Depth Dataset},
  author={Park, Junsung},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/datasets/jnsungp/unitree-g1-robocasa-pick-apple-bowl-depth-1k}
}
```

## Acknowledgments

Built with [CuRobo](https://curobo.org/), [RoboCasa](https://robocasa.ai/), [MuJoCo](https://mujoco.org/), and Unitree G1.
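
Returning to the point cloud generation item under 3D Scene Understanding: the card does not list the `rs_view` camera intrinsics, so the sketch below assumes a pinhole model with a hypothetical 60° vertical field of view and square pixels. It also assumes the stored values are z-depth along the optical axis rather than ray length. Both assumptions should be checked against the simulator's camera configuration before use, and `backproject_depth` is an illustrative name, not part of the dataset tooling.

```python
import numpy as np


def backproject_depth(depth: np.ndarray, fovy_deg: float = 60.0) -> np.ndarray:
    """Back-project a depth map (meters) to an (H*W, 3) camera-frame point cloud.

    Assumptions (not stated in the dataset card): pinhole camera, square pixels,
    principal point at the image center, and fovy_deg as the vertical field of view.
    """
    h, w = depth.shape
    focal = 0.5 * h / np.tan(np.deg2rad(fovy_deg) / 2.0)  # focal length in pixels
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0

    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    x = (u - cx) * z / focal
    y = (v - cy) * z / focal
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)


# Synthetic stand-in for a real frame: a flat wall 2 m in front of the camera.
flat = np.full((256, 256), 2.0, dtype=np.float32)
points = backproject_depth(flat)
```

Colors for each point can be taken from the RGB frame at the same pixel, since the card states that depth and RGB share the `rs_view` camera and the 256×256 resolution.
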

---

**Version:** 1.0 | **Last Updated:** November 19, 2025

## Contact & Full Dataset Access

For questions, issues, or **to request the full depth dataset (963 episodes)**:

- **Email:** night1115@snu.ac.kr
- **Hugging Face:** [@jnsungp](https://huggingface.co/jnsungp)
- **Institution:** Seoul National University

Please include your affiliation when requesting the full dataset.
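
Because the hosted copy includes depth for only a sample of episodes, a loader may want to check which episodes have depth frames on disk before pairing them with the parquet and video files. The following is a minimal sketch built on the path template from the card; the helper names `depth_frame_path` and `episodes_with_depth` and the `./datasets/g1-depth` root are illustrative assumptions.

```python
from pathlib import Path


def depth_frame_path(root, episode_index, frame_index, episode_chunk=0):
    """Build a depth frame path from the card's path template."""
    return Path(root) / (
        f"depth/chunk-{episode_chunk:03d}/"
        f"episode_{episode_index:06d}/frame_{frame_index:06d}.npy"
    )


def episodes_with_depth(root, episode_chunk=0):
    """Return sorted episode indices that have at least one depth frame locally."""
    chunk_dir = Path(root) / f"depth/chunk-{episode_chunk:03d}"
    if not chunk_dir.is_dir():
        return []
    return sorted(
        int(episode_dir.name.split("_")[1])
        for episode_dir in chunk_dir.glob("episode_*")
        if episode_dir.is_dir() and any(episode_dir.glob("frame_*.npy"))
    )


# Example: list the episodes usable for RGB-D training in a local download.
available = episodes_with_depth("./datasets/g1-depth")
print(f"{len(available)} episodes with depth found")
```

Episodes missing from the returned list can still be used for RGB-only training, since the parquet and video files are stored separately from the depth frames.
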

