di-techinnova/so-arm-101-pouring-0.1
Source: Hugging Face · Updated: 2026-04-17 · Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/di-techinnova/so-arm-101-pouring-0.1
Resource overview:
---
license: apache-2.0
task_categories:
- robotics
tags:
- LeRobot
configs:
- config_name: default
  data_files: data/*/*.parquet
---
This dataset contains **135 episodes** of multi-task robotic manipulation focused on pouring and grasping activities. It was collected using a **Leader-Follower (Master-Slave)** setup with the **SO-ARM-101** robotic arm (Waveshare/Koch Arm derivative).
The dataset is designed to train and evaluate **Vision-Language-Action (VLA)** models (like SmolVLA or X-VLA) on tasks requiring high-precision visual grounding and long-horizon action sequences.
<a class="flex" href="https://huggingface.co/spaces/lerobot/visualize_dataset?path=di-techinnova/so-arm-101-pouring-0.1">
<img class="block dark:hidden" src="https://huggingface.co/datasets/huggingface/badges/resolve/main/visualize-this-dataset-xl.svg"/>
<img class="hidden dark:block" src="https://huggingface.co/datasets/huggingface/badges/resolve/main/visualize-this-dataset-xl-dark.svg"/>
</a>
## Dataset Description
- **Robot Type:** SO-ARM-101 (6-DOF: 5 joints + 1 gripper)
- **Control Frequency:** 15 Hz
- **Total Episodes:** 135
- **Total Frames:** 65,250
- **Visual Modalities:**
  - `camera1`: Wrist-mounted camera (1280x720) for high-precision manipulation and object-centric views.
  - `camera2`: Global/Portal view (640x360) using an Android phone camera for scene context.
- **Environment:** Office meeting table (wood texture) with a high-contrast yellow background.
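The episode and frame counts above imply an average episode length. A quick sanity check, assuming every frame was recorded at the nominal 15 Hz (per-episode lengths are not stated in this card, so only averages are computed):

```python
# Numbers taken from the dataset description above.
CONTROL_HZ = 15
TOTAL_EPISODES = 135
TOTAL_FRAMES = 65_250

# Assumption: frames are spaced exactly 1 / CONTROL_HZ seconds apart.
total_seconds = TOTAL_FRAMES / CONTROL_HZ
avg_frames_per_episode = TOTAL_FRAMES / TOTAL_EPISODES
avg_seconds_per_episode = avg_frames_per_episode / CONTROL_HZ

print(f"Total recording time: {total_seconds / 60:.1f} min")
print(f"Average episode: {avg_frames_per_episode:.0f} frames "
      f"({avg_seconds_per_episode:.1f} s)")
```

Under that assumption the dataset holds roughly 72.5 minutes of teleoperation, with episodes averaging about 32 seconds.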
### Task Instructions
The dataset covers distinct tasks across two main domains (Seeds and Coffee):
1. **Pouring Seeds:** *"Pour sunflower seeds from the orange cup into the clear cup."*
2. **Pouring Coffee (Standard):** *"Pour coffee from the orange cup into the cup with the D sticker."*
3. **Visual Grounding (High Contrast):** *"Pour coffee into the cup with the black-bordered letter D."*
4. **Long-horizon Composition:** *"Pour coffee and then hold the D-marked cup."* (A sequential task requiring a 0.5s pause between actions).
...
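For the long-horizon task, it can be useful to locate the 0.5 s pause in a recorded joint trajectory. The following is a minimal sketch, not part of the dataset tooling: `find_pauses`, the 0.5-degree tolerance, and the plain-list input format are all assumptions made for illustration. It treats a pause as a run of frames in which no joint moves more than the tolerance between consecutive frames.

```python
import math

CONTROL_HZ = 15  # control frequency stated in the dataset description


def find_pauses(states, min_pause_s=0.5, tol_deg=0.5, hz=CONTROL_HZ):
    """Return (first_frame, last_frame) pairs for stationary runs.

    `states` is a sequence of per-frame joint vectors (degrees).
    `tol_deg` is a hypothetical stillness threshold, not a dataset value.
    """
    # 0.5 s at 15 Hz is 7.5 frame steps, rounded up to 8.
    min_steps = math.ceil(min_pause_s * hz)
    pauses, run_start = [], None
    for i in range(1, len(states) + 1):
        # A step is "still" if every joint changed by less than tol_deg.
        still = i < len(states) and all(
            abs(a - b) < tol_deg for a, b in zip(states[i], states[i - 1])
        )
        if still:
            if run_start is None:
                run_start = i - 1
        elif run_start is not None:
            # Frames run_start..i-1 were stationary; keep long-enough runs.
            if (i - 1) - run_start >= min_steps:
                pauses.append((run_start, i - 1))
            run_start = None
    return pauses


# Synthetic example: move, hold for 10 frames, move again.
moving_in = [[2.0 * i] * 6 for i in range(4)]
hold = [[8.0] * 6 for _ in range(10)]
moving_out = [[10.0 + 2.0 * i] * 6 for i in range(3)]
print(find_pauses(moving_in + hold + moving_out))
```

Real servo readings are noisy, so the tolerance would need tuning against actual episodes.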
## Dataset Structure
The data follows the **LeRobot v3.0** format, using Parquet files for telemetry and MP4 files for video streams.
### Features
| Feature | Type | Description |
| :--- | :--- | :--- |
| `action` | `float32[6]` | Goal positions for the 6 servos (Shoulder Pan, Lift, Elbow, Wrist Flex, Roll, Gripper). |
| `observation.state` | `float32[6]` | Current proprioceptive state (joint positions in degrees). |
| `observation.images.camera1` | `video` | Wrist camera feed (1280x720 @ 15fps). |
| `observation.images.camera2` | `video` | Global phone camera feed (640x360 @ 15fps). |
| `task_index` | `int64` | Index mapping to the language instruction in `meta/tasks.parquet`. |
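Because `action` and `observation.state` are both bare 6-element vectors, attaching joint names makes them easier to inspect. The sketch below assumes the vector order matches the order listed in the `action` row of the table; the key names and the helpers `label_joints` and `tracking_error` are hypothetical, chosen only for illustration.

```python
# Assumed order, following the `action` row of the feature table.
JOINT_NAMES = ("shoulder_pan", "lift", "elbow", "wrist_flex", "roll", "gripper")


def label_joints(vector):
    """Map a 6-element joint vector to a {name: value} dict."""
    values = [float(v) for v in vector]
    if len(values) != len(JOINT_NAMES):
        raise ValueError(f"expected {len(JOINT_NAMES)} values, got {len(values)}")
    return dict(zip(JOINT_NAMES, values))


def tracking_error(action, state):
    """Per-joint difference between the goal position and the measured state."""
    goal, current = label_joints(action), label_joints(state)
    return {name: goal[name] - current[name] for name in JOINT_NAMES}


# Made-up values, not taken from the dataset.
example_action = [10.0, -45.0, 90.0, 15.0, 0.0, 30.0]
example_state = [8.0, -44.0, 88.5, 15.0, 0.0, 31.0]
print(tracking_error(example_action, example_state))
```

Both helpers accept any iterable of six numbers, so a tensor converted with `.tolist()` would work as input.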
## Technical Details
### Visual Grounding & Challenges
- **Transparency Mitigation:** To handle the challenges of transparent plastic cups, we utilized "Visual Anchors" including a white sticker with a black-bordered letter "D".
- **Spatial Diversity:** Episodes include variations in cup placement and camera angles to prevent overfitting to fixed coordinates.
- **Temporal Consistency:** Data was recorded at a steady 15 Hz, so actions and images are synchronized to within one frame period (~67 ms).
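The synchronization claim can be checked on any pair of timestamp streams. This is a minimal sketch, assuming per-frame timestamps in seconds are available for both streams; the helper names and the synthetic timestamps are illustrative, not part of the dataset.

```python
CONTROL_HZ = 15
FRAME_PERIOD_S = 1.0 / CONTROL_HZ  # about 0.0667 s at 15 Hz


def max_sync_offset(ts_a, ts_b):
    """Largest absolute time offset between two equally long timestamp lists."""
    if len(ts_a) != len(ts_b):
        raise ValueError("timestamp streams must have the same length")
    return max((abs(a - b) for a, b in zip(ts_a, ts_b)), default=0.0)


def is_synchronized(ts_a, ts_b, window_s=FRAME_PERIOD_S):
    """True if every pair of timestamps falls within one frame period."""
    return max_sync_offset(ts_a, ts_b) <= window_s


# Synthetic example: the camera lags the action stream by 20 ms.
action_ts = [i / CONTROL_HZ for i in range(5)]
camera_ts = [t + 0.020 for t in action_ts]
print(is_synchronized(action_ts, camera_ts))
```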
### Action Space
The action space is continuous, representing the absolute angular positions of the servos. The gripper values typically range between **20-40 degrees** for a firm hold and **60+ degrees** for release.
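The gripper ranges above can be turned into a coarse label for quick inspection of an episode. This is a heuristic sketch only: the card says the values are typical rather than strict, and the function name and the `"transition"` label for everything outside the two ranges are assumptions.

```python
def gripper_phase(angle_deg):
    """Coarsely label a gripper angle using the typical ranges quoted above."""
    if 20.0 <= angle_deg <= 40.0:
        return "hold"       # typical range for a firm hold
    if angle_deg >= 60.0:
        return "release"    # typical range for release
    return "transition"     # assumption: anything else is in between or atypical


# Made-up gripper angles, not taken from the dataset.
for angle in (25.0, 50.0, 72.0):
    print(angle, gripper_phase(angle))
```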
## How to use
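```python
# Import path for recent LeRobot releases; older releases exposed this class
# under lerobot.common.datasets.lerobot_dataset instead.
from lerobot.datasets.lerobot_dataset import LeRobotDataset

# Load the dataset from the Hugging Face Hub
dataset = LeRobotDataset("di-techinnova/so-arm-101-pouring-0.1")

# Access the first frame (indexing is per frame, not per episode)
frame = dataset[0]
image = frame["observation.images.camera1"]  # wrist camera image
state = frame["observation.state"]           # 6 joint positions
action = frame["action"]                     # 6 servo goal positions

# Each frame also carries its language instruction under the "task" key
print(f"Instruction: {frame['task']}")
```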
## Citation
If you use this dataset in your research, please cite it as:
```bibtex
@misc{di-techinnova/so-arm-101-pouring-0.1,
author = {Data Impact VN - Technology Innovation Department},
title = {SO-ARM-101 Pouring Seeds and Coffee Dataset for VLA Training},
year = {2026},
publisher = {Hugging Face},
journal = {Hugging Face Hub},
howpublished = {\url{https://huggingface.co/datasets/di-techinnova/so-arm-101-pouring-0.1}}
}
```
Provider:
di-techinnova



