di-techinnova/so-arm-101-pouring-0.1

Name: di-techinnova/so-arm-101-pouring-0.1
Creator: di-techinnova
Published: 2026-04-17 07:03:06
License: No description provided

Source: Hugging Face. Updated 2026-04-17; indexed 2026-04-26.

Download link:

https://hf-mirror.com/datasets/di-techinnova/so-arm-101-pouring-0.1



Resource description:

---
license: apache-2.0
task_categories:
- robotics
tags:
- LeRobot
configs:
- config_name: default
  data_files: data/*/*.parquet
---

This dataset contains **135 episodes** of multi-task robotic manipulation focused on pouring and grasping activities. It was collected using a **Leader-Follower (Master-Slave)** setup with the **SO-ARM-101** robotic arm (Waveshare/Koch Arm derivative). The dataset is designed to train and evaluate **Vision-Language-Action (VLA)** models (like SmolVLA or X-VLA) on tasks requiring high-precision visual grounding and long-horizon action sequences.

<a class="flex" href="https://huggingface.co/spaces/lerobot/visualize_dataset?path=di-techinnova/so-arm-101-pouring-0.1">
  <img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/badges/resolve/main/visualize-this-dataset-xl.svg"/>
  <img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/badges/resolve/main/visualize-this-dataset-xl-dark.svg"/>
</a>

## Dataset Description

- **Robot Type:** SO-ARM-101 (6-DOF: 5 joints + 1 gripper)
- **Control Frequency:** 15 Hz
- **Total Episodes:** 135
- **Total Frames:** 65,250
- **Visual Modalities:**
  - `camera1`: Wrist-mounted camera (1280x720) for high-precision manipulation and object-centric views.
  - `camera2`: Global/Portal view (640x360) using an Android phone camera for scene context.
- **Environment:** Office meeting table (wood texture) with a high-contrast yellow background.

### Task Instructions

The dataset covers distinct tasks across two main domains (Seeds and Coffee):

1. **Pouring Seeds:** *"Pour sunflower seeds from the orange cup into the clear cup."*
2. **Pouring Coffee (Standard):** *"Pour coffee from the orange cup into the cup with the D sticker."*
3. **Visual Grounding (High Contrast):** *"Pour coffee into the cup with the black-bordered letter D."*
4. **Long-horizon Composition:** *"Pour coffee and then hold the D-marked cup."* (A sequential task requiring a 0.5 s pause between actions.)

...
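The headline figures in the Dataset Description can be cross-checked with a little arithmetic. The sketch below is only a sanity check on the numbers quoted in the card (135 episodes, 65,250 frames, 15 Hz, 0.5 s pause); `summarize` is a hypothetical helper, and the averages it reports say nothing about how episode lengths are actually distributed.

```python
# Figures quoted from the dataset card.
TOTAL_EPISODES = 135
TOTAL_FRAMES = 65_250
CONTROL_HZ = 15
PAUSE_SECONDS = 0.5  # pause mentioned for the long-horizon task


def summarize(total_frames: int, total_episodes: int, hz: float, pause_s: float) -> dict:
    """Derive timing figures implied by the frame count and control rate."""
    total_seconds = total_frames / hz
    mean_frames = total_frames / total_episodes
    return {
        "frame_period_ms": 1000.0 / hz,            # time between consecutive frames
        "total_minutes": total_seconds / 60.0,     # recorded duration overall
        "mean_frames_per_episode": mean_frames,    # average only, not a per-episode value
        "mean_seconds_per_episode": mean_frames / hz,
        "pause_frames": pause_s * hz,              # frames spanned by the pause
    }


stats = summarize(TOTAL_FRAMES, TOTAL_EPISODES, CONTROL_HZ, PAUSE_SECONDS)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under these figures the frame period comes out close to the ~66 ms synchronization window the card mentions, and the 0.5 s pause spans a non-integer number of frames, so any frame-based pause detector would need a tolerance.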
## Dataset Structure

The data follows the **LeRobot v3.0** format, using Parquet files for telemetry and MP4 files for video streams.

### Features

| Feature | Type | Description |
| :--- | :--- | :--- |
| `action` | `float32[6]` | Goal positions for the 6 servos (Shoulder Pan, Lift, Elbow, Wrist Flex, Roll, Gripper). |
| `observation.state` | `float32[6]` | Current proprioceptive state (joint positions in degrees). |
| `observation.images.camera1` | `video` | Wrist camera feed (1280x720 @ 15 fps). |
| `observation.images.camera2` | `video` | Global phone camera feed (640x360 @ 15 fps). |
| `task_index` | `int64` | Index mapping to the language instruction in `meta/tasks.parquet`. |

## Technical Details

### Visual Grounding & Challenges

- **Transparency Mitigation:** To handle the challenges of transparent plastic cups, we utilized "Visual Anchors" including a white sticker with a black-bordered letter "D".
- **Spatial Diversity:** Episodes include variations in cup placement and camera angles to prevent overfitting to fixed coordinates.
- **Temporal Consistency:** Data was collected with careful attention to the 15 Hz rhythm, ensuring actions and images are synchronized within a ~66 ms window.

### Action Space

The action space is continuous, representing the absolute angular positions of the servos. The gripper values typically range between **20-40 degrees** for a firm hold and **60+ degrees** for release.
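The gripper ranges above can be turned into a rough labeling rule for inspecting trajectories. This is a minimal sketch, not part of the dataset tooling: `gripper_phase` is a hypothetical helper, the gripper is assumed to be the last entry of the 6-element action vector (following the servo order in the Features table), and the thresholds are the "typical" values from the card, so real episodes may fall outside them.

```python
from typing import Sequence

# Assumption: the gripper is the last of the six servos, per the listed order.
GRIPPER_INDEX = 5
HOLD_RANGE_DEG = (20.0, 40.0)   # "firm hold" range quoted in the card
RELEASE_MIN_DEG = 60.0          # "release" threshold quoted in the card


def gripper_phase(action: Sequence[float]) -> str:
    """Label one action vector by its gripper angle."""
    angle = float(action[GRIPPER_INDEX])
    if HOLD_RANGE_DEG[0] <= angle <= HOLD_RANGE_DEG[1]:
        return "hold"
    if angle >= RELEASE_MIN_DEG:
        return "release"
    # Angles between the two ranges, or below the hold range, are left unlabeled.
    return "other"


# Made-up action vectors for illustration only; not taken from the dataset.
example_actions = [
    [0.0, 10.0, -15.0, 30.0, 5.0, 25.0],
    [0.0, 10.0, -15.0, 30.0, 5.0, 50.0],
    [0.0, 10.0, -15.0, 30.0, 5.0, 72.0],
]
phases = [gripper_phase(a) for a in example_actions]
print(phases)
```

Keeping an explicit "other" label avoids forcing intermediate angles into either class, which matters because the card describes the ranges as typical rather than strict.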
## How to use

```python
# Import path used by recent LeRobot releases; older releases exposed the
# class under lerobot.common.datasets.lerobot_dataset instead.
from lerobot.datasets.lerobot_dataset import LeRobotDataset

# Load the dataset from the Hugging Face Hub
dataset = LeRobotDataset("di-techinnova/so-arm-101-pouring-0.1")

# Access the first frame (items are indexed per frame, not per episode)
frame = dataset[0]
image = frame["observation.images.camera1"]
state = frame["observation.state"]
action = frame["action"]

# Each item carries its language instruction as a string under "task"
print(f"Instruction: {frame['task']}")
```

## Citation

If you use this dataset in your research, please cite it as:

```bibtex
@misc{di-techinnova/so-arm-101-pouring-0.1,
  author = {Data Impact VN - Technology Innovation Department},
  title = {SO-ARM-101 Pouring Seeds and Coffee Dataset for VLA Training},
  year = {2026},
  publisher = {Hugging Face},
  journal = {Hugging Face Hub},
  howpublished = {\url{https://huggingface.co/datasets/di-techinnova/so-arm-101-pouring-0.1}}
}
```
