RoboCOIN/Agilex_Cobot_Magic_fold_towel_blue
Source: Hugging Face. Updated 2026-04-02; indexed 2026-04-05.
Download link:
https://hf-mirror.com/datasets/RoboCOIN/Agilex_Cobot_Magic_fold_towel_blue
Resource description:
---
task_categories:
- robotics
language:
- en
extra_gated_prompt: 'By accessing this dataset, you agree to cite the associated paper in your research/publications—see the "Citation" section for details. You agree to not use the dataset to conduct experiments that cause harm to human subjects.'
extra_gated_fields:
  Company/Organization:
    type: 'text'
    description: 'e.g., "ETH Zurich", "Boston Dynamics", "Independent Researcher"'
  Country:
    type: 'country'
    description: 'e.g., "Germany", "China", "United States"'
tags:
- RoboCOIN
- LeRobot
license: apache-2.0
configs:
- config_name: default
  data_files: data/chunk-{id}/episode_{id}.parquet
---
# Agilex_Cobot_Magic_fold_towel_blue
## Dataset Description
This dataset uses an extended format based on LeRobot and is fully compatible with LeRobot.
## Task Preview
<video src="videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4" controls width="640"></video>
[View Video Directly](videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4)
### Overview
- **Total Episodes:** 185
- **Total Frames:** 146917
- **FPS:** 30
- **Dataset Size:** 10.04 GB
- **Robot Name:** `Agilex_Cobot_Magic`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** Not currently available for this dataset.
- **Sensors:** `cam_head_rgb`, `cam_right_wrist_rgb`, `cam_left_wrist_rgb`
- **Camera Information:** `cam_head_rgb`; `cam_right_wrist_rgb`; `cam_left_wrist_rgb`
- **Scene:** `office_workspace->office`
- **Objects:** `table(unknown)`, `basket(unknown)`, `blue_towel(unknown)`
- **Task Description:** fold the towels on the table.
### Primary Task Instruction
> fold the towels on the table.
### Robot Configuration
- **Robot Name:** `Agilex_Cobot_Magic`
- **Codebase Version:** `v2.1`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** Not currently available for this dataset.
## Scene and Objects
### Scene Type
`office_workspace->office`
### Objects
- `table(unknown)`
- `basket(unknown)`
- `blue_towel(unknown)`
## Task Descriptions
- **Standardized Task Description:** `fold the towels on the table.`
- **Operation Type:** Not currently available for this dataset.
- **Environment Type:** Not currently available for this dataset.
### Sub-Tasks
This dataset includes 11 distinct subtasks:
1. **Left hand: adjust the brown towel** (Index: 0)
2. **Left hand: adjust the blue towel** (Index: 1)
3. **Right hand: adjust the blue towel** (Index: 2)
4. **Right hand: grab the bottom right corner of blue towel** (Index: 3)
5. **Left hand: fold the blue towel from left to right** (Index: 4)
6. **Left hand: fold the blue towel up** (Index: 5)
7. **End** (Index: 6)
8. **Right hand: fold the blue towel up** (Index: 7)
9. **Right hand: spread the blue towel flat on the table** (Index: 8)
10. **Left hand: grab the bottom left corner of blue towel** (Index: 9)
11. **null** (Index: 10)
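Most subtask labels above follow a `<hand>: <instruction>` pattern, so they can be split programmatically. The following is a minimal sketch, assuming the labels are stored one JSON object per line with `task_index` and `task` keys (the usual LeRobot v2.1 `meta/tasks.jsonl` layout, not confirmed for this dataset). The sample lines are copied from the list above, and `parse_subtask` and `load_subtasks` are hypothetical helper names.

```python
import json

# Sample records in the ASSUMED tasks.jsonl layout; labels copied from the subtask list.
SAMPLE_TASKS_JSONL = """\
{"task_index": 3, "task": "Right hand: grab the bottom right corner of blue towel"}
{"task_index": 4, "task": "Left hand: fold the blue towel from left to right"}
{"task_index": 6, "task": "End"}
{"task_index": 10, "task": "null"}
"""


def parse_subtask(label):
    """Split '<hand> hand: <instruction>' into (hand, instruction).

    Labels without a hand prefix (such as 'End' or 'null') return (None, label).
    """
    prefix, sep, instruction = label.partition(": ")
    if not sep:
        return None, label
    # 'Left hand' -> 'left', 'Right hand' -> 'right'
    return prefix.split()[0].lower(), instruction


def load_subtasks(text):
    """Build {task_index: (hand, instruction)} from JSON-lines text."""
    table = {}
    for line in text.splitlines():
        if line.strip():
            record = json.loads(line)
            table[record["task_index"]] = parse_subtask(record["task"])
    return table


subtasks = load_subtasks(SAMPLE_TASKS_JSONL)
print(subtasks[3])
print(subtasks[6])
```

Keeping the hand as a separate field makes it straightforward to filter frames by the arm that is active in a subtask.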
### Atomic Actions
- `grasp`
- `fold`
- `lift`
- `lower`
## Hardware and Sensors
### Sensors
- `cam_head_rgb`
- `cam_right_wrist_rgb`
- `cam_left_wrist_rgb`
### Camera Information
- `cam_head_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=h264, pix_fmt=yuv420p
- `cam_right_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=h264, pix_fmt=yuv420p
- `cam_left_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=h264, pix_fmt=yuv420p
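For a sense of how much the h264 encoding saves, the sketch below estimates the uncompressed size of all camera frames from the shapes listed above. It assumes one byte per channel (8-bit RGB) and treats the 10.04 GB dataset size as decimal gigabytes; since that figure also includes non-video files, the resulting ratio is only a rough lower bound on the video compression ratio.

```python
# Values taken from the dataset card.
FRAME_SHAPE = (480, 640, 3)   # height, width, channels
NUM_CAMERAS = 3
TOTAL_FRAMES = 146917
DATASET_SIZE_GB = 10.04       # ASSUMED to be decimal GB (1e9 bytes)

# ASSUMPTION: 8-bit channels, so one byte per channel value.
height, width, channels = FRAME_SHAPE
bytes_per_frame = height * width * channels

# Every timestep has one frame from each camera.
raw_bytes = bytes_per_frame * NUM_CAMERAS * TOTAL_FRAMES
raw_gb = raw_bytes / 1e9
ratio = raw_gb / DATASET_SIZE_GB

print(f"bytes per frame: {bytes_per_frame}")
print(f"raw video size: {raw_gb:.1f} GB")
print(f"raw size / dataset size: {ratio:.1f}x")
```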
### Coordinate System
- **Definition:** `right-hand-frame`
### Dimensions & Units
- **Joint Rotation:** `radian`
- **End-Effector Rotation:** `radian`
- **End-Effector Translation:** `meter`
## Dataset Statistics
| Metric | Value |
|--------|-------|
| **Total Episodes** | 185 |
| **Total Frames** | 146917 |
| **Total Tasks** | 11 |
| **Total Videos** | 555 |
| **Total Chunks** | 1 |
| **Chunk Size** | 10000 |
| **FPS** | 30 |
| **State Dimensions** | 26 |
| **Action Dimensions** | 26 |
| **Camera Views** | 3 |
| **Dataset Size** | 10.04 GB |
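A few derived figures follow directly from the table. The sketch below computes the mean episode length and total recording time, and checks that the video count equals episodes times camera views; it uses only the numbers reported above.

```python
# Values taken from the statistics table.
TOTAL_EPISODES = 185
TOTAL_FRAMES = 146917
TOTAL_VIDEOS = 555
CAMERA_VIEWS = 3
FPS = 30

mean_frames = TOTAL_FRAMES / TOTAL_EPISODES      # frames per episode
mean_seconds = mean_frames / FPS                 # seconds per episode
total_minutes = TOTAL_FRAMES / FPS / 60          # total recording time

# One video file per episode per camera view.
videos_consistent = TOTAL_EPISODES * CAMERA_VIEWS == TOTAL_VIDEOS

print(f"{mean_frames:.1f} frames per episode")
print(f"{mean_seconds:.1f} s per episode")
print(f"{total_minutes:.1f} min in total")
print(f"video count consistent: {videos_consistent}")
```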
## Data Splits
The dataset is organized into the following splits:
- **Training**: Episodes 0:184
## Dataset Structure
This dataset follows the LeRobot format and contains the following components:
### Data Files
- **Videos**: Compressed video files containing RGB camera observations
- **State Data**: Robot joint positions, velocities, and other state information
- **Action Data**: Robot action commands and trajectories
- **Metadata**: Episode metadata, timestamps, and annotations
### File Organization
- **Data Path Pattern**: `data/chunk-{id}/episode_{id}.parquet`
- **Video Path Pattern**: `videos/chunk-{id}/observation.images.cam_right_wrist_rgb/episode_{id}.mp4` (the other two cameras follow the same pattern under their own directories)
- **Chunking**: Data is organized into 1 chunk(s) of size 10000
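The path patterns can be turned into concrete file names. This is a minimal sketch, assuming the `{id}` placeholders stand for a 3-digit zero-padded chunk index and a 6-digit zero-padded episode index (as the names `chunk-000` and `episode_000000` elsewhere in this card suggest), that an episode's chunk is its index divided by the chunk size, and that videos use the `.mp4` extension seen in the Task Preview link. `episode_files` is a hypothetical helper.

```python
CHUNK_SIZE = 10000
CAMERAS = ("cam_head_rgb", "cam_right_wrist_rgb", "cam_left_wrist_rgb")

# ASSUMED zero-padding widths, inferred from 'chunk-000' and 'episode_000000'.
DATA_PATTERN = "data/chunk-{chunk:03d}/episode_{episode:06d}.parquet"
VIDEO_PATTERN = (
    "videos/chunk-{chunk:03d}/observation.images.{camera}/episode_{episode:06d}.mp4"
)


def episode_files(episode_index):
    """Return (parquet_path, {camera: video_path}) for one episode."""
    # ASSUMPTION: episodes are assigned to chunks by integer division.
    chunk = episode_index // CHUNK_SIZE
    data_path = DATA_PATTERN.format(chunk=chunk, episode=episode_index)
    video_paths = {
        camera: VIDEO_PATTERN.format(chunk=chunk, camera=camera, episode=episode_index)
        for camera in CAMERAS
    }
    return data_path, video_paths


data_path, video_paths = episode_files(11)
print(data_path)
for camera, path in video_paths.items():
    print(camera, path)
```

With 185 episodes and a chunk size of 10000, every episode lands in `chunk-000`, which matches the single chunk reported above.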
### Data Structure (Tree)
```
Agilex_Cobot_Magic_fold_towel_blue_qced_hardlink/
|-- annotations
|   |-- eef_acc_mag_annotation.jsonl
|   |-- eef_direction_annotation.jsonl
|   |-- eef_velocity_annotation.jsonl
|   |-- gripper_activity_annotation.jsonl
|   |-- gripper_mode_annotation.jsonl
|   |-- scene_annotations.jsonl
|   `-- subtask_annotations.jsonl
|-- data
|   `-- chunk-000
|       |-- episode_000000.parquet
|       |-- episode_000001.parquet
|       |-- episode_000002.parquet
|       |-- episode_000003.parquet
|       |-- episode_000004.parquet
|       |-- episode_000005.parquet
|       |-- episode_000006.parquet
|       |-- episode_000007.parquet
|       |-- episode_000008.parquet
|       |-- episode_000009.parquet
|       |-- episode_000010.parquet
|       |-- episode_000011.parquet
|       `-- ... (173 more entries)
|-- meta
|   |-- episodes.jsonl
|   |-- episodes_stats.jsonl
|   |-- info.json
|   `-- tasks.jsonl
`-- videos
    `-- chunk-000
        |-- observation.images.cam_head_rgb
        |-- observation.images.cam_left_wrist_rgb
        `-- observation.images.cam_right_wrist_rgb
```
## Camera Views
This dataset includes 3 camera views: `cam_head_rgb`, `cam_right_wrist_rgb`, `cam_left_wrist_rgb`.
## Features (Full YAML)
```yaml
action:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - left_gripper_open
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
  - right_gripper_open
observation.state:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - left_gripper_open
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
  - right_gripper_open
observation.images.cam_head_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.fps: 30.0
    video.height: 480
    video.width: 640
    video.channels: 3
    video.codec: h264
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    has_audio: false
observation.images.cam_right_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.fps: 30.0
    video.height: 480
    video.width: 640
    video.channels: 3
    video.codec: h264
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    has_audio: false
observation.images.cam_left_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.fps: 30.0
    video.height: 480
    video.width: 640
    video.channels: 3
    video.codec: h264
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    has_audio: false
timestamp:
  dtype: float32
  shape:
  - 1
  names: null
frame_index:
  dtype: int64
  shape:
  - 1
  names: null
episode_index:
  dtype: int64
  shape:
  - 1
  names: null
index:
  dtype: int64
  shape:
  - 1
  names: null
task_index:
  dtype: int64
  shape:
  - 1
  names: null
subtask_annotation:
  names: null
  dtype: int32
  shape:
  - 5
scene_annotation:
  names: null
  dtype: int32
  shape:
  - 1
eef_sim_pose_state:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_sim_pose_action:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_direction_state:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_direction_action:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_velocity_state:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_velocity_action:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_acc_mag_state:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
eef_acc_mag_action:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
gripper_mode_state:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_mode_action:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_activity_state:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_activity_action:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_open_scale_state:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
gripper_open_scale_action:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
```
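The 26-dimensional `action` and `observation.state` vectors share one layout: 13 values for the left arm followed by 13 for the right. The sketch below rebuilds the name list from the YAML above and groups a flat vector into named parts. It uses plain Python lists so it runs without extra dependencies; `arm_names` and `split_vector` are hypothetical helpers, not part of LeRobot.

```python
def arm_names(side):
    """Names for one arm, in the order listed in the features YAML."""
    joints = [f"{side}_arm_joint_{i}_rad" for i in range(1, 7)]
    position = [f"{side}_eef_pos_{axis}_m" for axis in "xyz"]
    rotation = [f"{side}_eef_rot_euler_{axis}_rad" for axis in "xyz"]
    return joints + position + rotation + [f"{side}_gripper_open"]


# Left arm first, then right arm: 13 + 13 = 26 entries.
STATE_NAMES = arm_names("left") + arm_names("right")


def split_vector(vector):
    """Group a flat 26-element state or action vector by arm and quantity."""
    if len(vector) != len(STATE_NAMES):
        raise ValueError(f"expected {len(STATE_NAMES)} values, got {len(vector)}")
    named = dict(zip(STATE_NAMES, vector))
    grouped = {}
    for side in ("left", "right"):
        grouped[side] = {
            "joints_rad": [named[f"{side}_arm_joint_{i}_rad"] for i in range(1, 7)],
            "eef_pos_m": [named[f"{side}_eef_pos_{axis}_m"] for axis in "xyz"],
            "eef_rot_euler_rad": [
                named[f"{side}_eef_rot_euler_{axis}_rad"] for axis in "xyz"
            ],
            "gripper_open": named[f"{side}_gripper_open"],
        }
    return grouped


# Dummy vector whose values equal their own positions, to make the layout visible.
example = split_vector(list(range(26)))
print(example["left"]["eef_pos_m"])
print(example["right"]["gripper_open"])
```

Looking values up by name rather than by hard-coded index keeps downstream code readable and guards against silent misalignment if the layout ever changes.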
## Available Annotations
The `annotations/` directory contains seven annotation files:
- `eef_acc_mag_annotation.jsonl`
- `eef_direction_annotation.jsonl`
- `eef_velocity_annotation.jsonl`
- `gripper_activity_annotation.jsonl`
- `gripper_mode_annotation.jsonl`
- `scene_annotations.jsonl`
- `subtask_annotations.jsonl`
## Dataset Tags
- `RoboCOIN`
- `LeRobot`
## Authors
### Contributors
This dataset is contributed by:
- RoboCOIN Team at Beijing Academy of Artificial Intelligence (BAAI)
### Annotators
No annotator information available.
## Links
- **Homepage:** [https://flagopen.github.io/RoboCOIN/](https://flagopen.github.io/RoboCOIN/)
- **Paper:** [https://arxiv.org/abs/2511.17441](https://arxiv.org/abs/2511.17441)
- **Repository:** [https://github.com/FlagOpen/RoboCOIN](https://github.com/FlagOpen/RoboCOIN)
## Contact and Support
For questions, issues, or feedback regarding this dataset, please contact us.
### Support
For technical support, please open an issue on our GitHub repository.
## License
apache-2.0
## Citation
If you use this dataset in your research, please cite:
```bibtex
@article{robocoin,
title={RoboCOIN: An Open-Sourced Bimanual Robotic Data Collection for Integrated Manipulation},
author={Shihan Wu and Xuecheng Liu and Shaoxuan Xie and Pengwei Wang and Xinghang Li and Bowen Yang and Zhe Li and Kai Zhu and Hongyu Wu and Yiheng Liu and Zhaoye Long and Yue Wang and Chong Liu and Dihan Wang and Ziqiang Ni and Xiang Yang and You Liu and Ruoxuan Feng and Runtian Xu and Lei Zhang and Denghang Huang and Chenghao Jin and Anlan Yin and Xinlong Wang and Zhenguo Sun and Junkai Zhao and Mengfei Du and Mingyu Cao and Xiansheng Chen and Hongyang Cheng and Xiaojie Zhang and Yankai Fu and Ning Chen and Cheng Chi and Sixiang Chen and Huaihai Lyu and Xiaoshuai Hao and Yequan Wang and Bo Lei and Dong Liu and Xi Yang and Yance Jiao and Tengfei Pan and Yunyan Zhang and Songjing Wang and Ziqian Zhang and Xu Liu and Ji Zhang and Caowei Meng and Zhizheng Zhang and Jiyang Gao and Song Wang and Xiaokun Leng and Zhiqiang Xie and Zhenzhen Zhou and Peng Huang and Wu Yang and Yandong Guo and Yichao Zhu and Suibing Zheng and Hao Cheng and Xinmin Ding and Yang Yue and Huanqian Wang and Chi Chen and Jingrui Pang and YuXi Qian and Haoran Geng and Lianli Gao and Haiyuan Li and Bin Fang and Gao Huang and Yaodong Yang and Hao Dong and He Wang and Hang Zhao and Yadong Mu and Di Hu and Hao Zhao and Tiejun Huang and Shanghang Zhang and Yonghua Lin and Zhongyuan Wang and Guocai Yao},
journal={arXiv preprint arXiv:2511.17441},
url = {https://arxiv.org/abs/2511.17441},
year={2025},
}
```
### Additional References
If you use this dataset, please also consider citing:
LeRobot Framework: https://github.com/huggingface/lerobot
## Version Information
Initial Release
Provided by:
RoboCOIN
Collected Summary
Dataset Introduction

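Construction
In robot manipulation learning, high-quality datasets of coordinated bimanual manipulation are essential for advancing dexterous manipulation algorithms. The Agilex_Cobot_Magic_fold_towel_blue dataset uses an extended format based on the LeRobot framework, which keeps it compatible with the mainstream robot learning ecosystem. It was collected on the Agilex Cobot Magic dual-arm robot platform, which performed the everyday task of folding a blue towel in a mock office workspace. The collection covers 185 complete episodes with more than 140,000 frames (146,917) of time-series data. RGB video was recorded at 30 frames per second from three viewpoints (head, left wrist, and right wrist) and synchronized with 26-dimensional robot state and action vectors, yielding a standardized dataset of 10.04 GB.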
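Features
The dataset is notable for its multimodal data and fine-grained annotations. Its core value lies in visual observations captured synchronously from three viewpoints, paired with high-dimensional continuous state-action pairs, which provide rich training material for imitation learning and reinforcement learning algorithms. Beyond basic robot motion trajectories, it includes seven annotation files covering dimensions such as end-effector motion direction, velocity, and acceleration, as well as gripper activity and mode. The composite towel-folding task is decomposed into 11 subtask labels and annotated with the atomic actions grasp, fold, lift, and lower. This hierarchical task representation suits research on task decomposition and planning algorithms.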
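Usage
Researchers can access the dataset through its Parquet files, using the LeRobot framework or a compatible data loader. A typical workflow loads the data file for a given episode, parses the video streams in sync with the robot state sequence, and builds a data pipeline for behavior cloning, offline reinforcement learning, or visuomotor policy training. The annotations support several learning paradigms, for example hierarchical reinforcement learning with the subtask labels, or the study of contact-rich manipulation policies with the end-effector motion annotations. Because the data follows the LeRobot format, it can be integrated into existing robot learning codebases for model training, validation, and simulation testing, which speeds up the development and evaluation of bimanual manipulation algorithms.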
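Background and Challenges
Background Overview
In robot manipulation, dexterous coordinated bimanual tasks are at the frontier of current research; the core challenge is enabling robots to handle everyday objects as people do. The Agilex_Cobot_Magic_fold_towel_blue dataset was built and released in 2025 by the RoboCOIN team at the Beijing Academy of Artificial Intelligence (BAAI) to provide high-quality, multimodal demonstration data for the specific task of towel folding with a dual-arm robot. The dataset extends the LeRobot format and contains 185 complete episodes with nearly 150,000 frames, RGB video from three viewpoints (head and both wrists), 26-dimensional robot state and action information, and fine-grained subtask and scene annotations. It directly serves the training and evaluation of imitation learning and reinforcement learning algorithms, and offers real-world data for the long-standing problem of manipulating deformable objects in unstructured environments, which matters for making service robots practical.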
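Current Challenges
The dataset addresses a classic and difficult problem in robot manipulation: precisely folding soft, deformable objects such as towels. The difficulty is that the object has no rigid structure and its shape changes continuously during manipulation, so the robot needs strong perception to track the object's state and must generate dexterous, compliant control strategies that respect the material's physical properties. Building the dataset poses challenges of its own. First, realistic and smooth bimanual trajectories must be collected through teleoperation or demonstration recording while ensuring data fidelity and safety. Second, storing, synchronizing, and annotating large volumes of multi-view video and high-dimensional state data is a substantial engineering problem. Third, the format must be standardized and compatible so that other research teams can reuse the data and integrate it with mainstream learning frameworks such as LeRobot, which places strict requirements on data organization and metadata design.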
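Common Scenarios
Typical Use Cases
In robot manipulation learning, fine-grained object manipulation with dual-arm robots has long been a research challenge. By recording the complete process of the Agilex_Cobot_Magic robot folding a blue towel, the Agilex_Cobot_Magic_fold_towel_blue dataset provides high-quality demonstration data for imitation learning and reinforcement learning algorithms. It contains 185 complete task episodes covering atomic actions such as grasping, folding, and adjusting, with RGB video from three viewpoints (a head-mounted camera and two wrist cameras) that captures the hand-eye coordination and spatial trajectories of the two arms working together.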
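Derived and Related Work
Research building on this dataset focuses mainly on bimanual coordination algorithms and multimodal policy learning. For example, researchers use its subtask annotations to develop hierarchical reinforcement learning frameworks that decompose the folding task into reusable skill modules. The synchronized recording of three-view video with state-action pairs has also led to a range of visuomotor representation fusion models aimed at improving policy generalization to unseen objects. As part of the RoboCOIN project, the dataset works alongside other datasets in the LeRobot ecosystem to support algorithm comparison and benchmarking in the open-source robot learning community, and to promote standardized evaluation of end-to-end manipulation policies.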
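Recent Research on the Dataset
Latest Research Directions
In robot manipulation learning, dexterous bimanual manipulation remains a core challenge in bringing service robots into practical use. The Agilex_Cobot_Magic_fold_towel_blue dataset focuses on the everyday task of towel folding, and its multi-view video streams and fine-grained action annotations provide high-quality training material for imitation learning and reinforcement learning algorithms. Current research uses structured demonstration data of this kind to train general manipulation policies that can handle complex object deformation and multi-step task planning. As part of the large-scale RoboCOIN collection effort, the dataset is compatible with the LeRobot framework, supports collaboration and benchmarking on real-world skill learning in the open-source robotics community, and provides a data foundation for robot systems that work reliably in unstructured home environments.
The content above was collected and summarized by 遇见数据集.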



