RoboCOIN/Agilex_Cobot_Magic_fold_T_shirts
Source: Hugging Face. Updated 2026-04-02; indexed 2026-04-05.
Download link: https://hf-mirror.com/datasets/RoboCOIN/Agilex_Cobot_Magic_fold_T_shirts
Resource description:
---
task_categories:
- robotics
language:
- en
extra_gated_prompt: 'By accessing this dataset, you agree to cite the associated paper in your research/publications—see the "Citation" section for details. You agree to not use the dataset to conduct experiments that cause harm to human subjects.'
extra_gated_fields:
  Company/Organization:
    type: 'text'
    description: 'e.g., "ETH Zurich", "Boston Dynamics", "Independent Researcher"'
  Country:
    type: 'country'
    description: 'e.g., "Germany", "China", "United States"'
tags:
- RoboCOIN
- LeRobot
license: apache-2.0
configs:
- config_name: default
  data_files: data/chunk-{id}/episode_{id}.parquet
---
# Agilex_Cobot_Magic_fold_T-shirts
## Dataset Description
This dataset uses an extended format based on LeRobot and is fully compatible with LeRobot.
## Task Preview
<video src="videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4" controls width="640"></video>
[View Video Directly](videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4)
### Overview
- **Total Episodes:** 100
- **Total Frames:** 78640
- **FPS:** 30
- **Dataset Size:** 913.84 MB
- **Robot Name:** `Agilex_Cobot_Magic`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** not currently provided for this dataset
- **Sensors:** `cam_head_rgb`, `cam_left_wrist_rgb`, `cam_right_wrist_rgb`
- **Camera Information:** `cam_head_rgb`; `cam_left_wrist_rgb`; `cam_right_wrist_rgb`
- **Scene:** `office_workspace->office`
- **Objects:** `table(unknown)`, `black_T-shirt(unknown)`
- **Task Description:** fold the clothes on the table.
### Primary Task Instruction
> fold the clothes on the table.
### Robot Configuration
- **Robot Name:** `Agilex_Cobot_Magic`
- **Codebase Version:** `v2.1`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** not currently provided for this dataset
## Scene and Objects
### Scene Type
`office_workspace->office`
### Objects
- `table(unknown)`
- `black_T-shirt(unknown)`
## Task Descriptions
- **Standardized Task Description:** `fold the clothes on the table.`
- **Operation Type:** not currently provided for this dataset
- **Environment Type:** not currently provided for this dataset
### Sub-Tasks
This dataset includes 11 distinct subtasks:
1. **Lift the black T-shirt with the left gripper** (Index: 0)
2. **Abnormal** (Index: 1)
3. **Lift the black T-shirt with the right gripper** (Index: 2)
4. **Grasp the black T-shirt with the left gripper** (Index: 3)
5. **End** (Index: 4)
6. **Fold the black T-shirt downward with the right gripper** (Index: 5)
7. **Grasp the black T-shirt with the right gripper** (Index: 6)
8. **Fold the black T-shirt downward with the left gripper** (Index: 7)
9. **Fold the black T-shirt from right to left with right gripper** (Index: 8)
10. **Use the left gripper to tidy up the clothes** (Index: 9)
11. **null** (Index: 10)
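The index-to-label mapping above can be kept as a small lookup table when decoding subtask indices. The following is a minimal sketch, not part of the dataset tooling: the names `SUBTASKS`, `NON_ACTION_LABELS`, `describe`, and `manipulation_subtasks` are hypothetical, and treating `Abnormal`, `End`, and `null` as bookkeeping labels is an assumption based only on their names. The card does not state which column carries these indices.

```python
# Subtask labels copied from the list above, keyed by their index.
SUBTASKS = {
    0: "Lift the black T-shirt with the left gripper",
    1: "Abnormal",
    2: "Lift the black T-shirt with the right gripper",
    3: "Grasp the black T-shirt with the left gripper",
    4: "End",
    5: "Fold the black T-shirt downward with the right gripper",
    6: "Grasp the black T-shirt with the right gripper",
    7: "Fold the black T-shirt downward with the left gripper",
    8: "Fold the black T-shirt from right to left with right gripper",
    9: "Use the left gripper to tidy up the clothes",
    10: "null",
}

# Assumption: these labels mark bookkeeping states, not manipulation steps.
NON_ACTION_LABELS = {"Abnormal", "End", "null"}


def describe(index: int) -> str:
    """Return the label for a subtask index; raises KeyError if unknown."""
    return SUBTASKS[index]


def manipulation_subtasks() -> dict:
    """Return only the subtasks that describe a manipulation step."""
    return {i: s for i, s in SUBTASKS.items() if s not in NON_ACTION_LABELS}


if __name__ == "__main__":
    for index, label in sorted(manipulation_subtasks().items()):
        print(index, label)
```

Filtering this way leaves eight manipulation subtasks; whether the three remaining labels should be dropped or kept depends on the training setup.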
### Atomic Actions
- `grasp`
- `fold`
- `lift`
- `lower`
## Hardware and Sensors
### Sensors
- `cam_head_rgb`
- `cam_left_wrist_rgb`
- `cam_right_wrist_rgb`
### Camera Information
- `cam_head_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
- `cam_left_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
- `cam_right_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
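Note that `shape` is given as height x width x channels, while `resolution` is width x height. As a rough illustration of what that shape implies, the sketch below estimates the size of the decoded RGB frames. It assumes one byte per channel after decoding (the card does not state the decoded bit depth) and uses the Total Frames figure from the Overview.

```python
# Shape order in the card is (height, width, channels).
HEIGHT, WIDTH, CHANNELS = 480, 640, 3
TOTAL_FRAMES = 78640   # from the Overview section
NUM_CAMERAS = 3

# Assumption: 8 bits (1 byte) per channel once a frame is decoded to RGB.
bytes_per_frame = HEIGHT * WIDTH * CHANNELS
raw_total_bytes = bytes_per_frame * TOTAL_FRAMES * NUM_CAMERAS

print(bytes_per_frame)                      # 921600
print(f"{raw_total_bytes / 1e9:.1f} GB")    # 217.4 GB
```

Under that assumption, the decoded frames would be far larger than the reported dataset size, which is why decoding all episodes into memory at once is unlikely to be practical.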
### Coordinate System
- **Definition:** `right-hand-frame`
### Dimensions & Units
- **Joint Rotation:** `radian`
- **End-Effector Rotation:** `radian`
- **End-Effector Translation:** `meter`
## Dataset Statistics
| Metric | Value |
|--------|-------|
| **Total Episodes** | 100 |
| **Total Frames** | 78640 |
| **Total Tasks** | 11 |
| **Total Videos** | 300 |
| **Total Chunks** | 1 |
| **Chunk Size** | 1000 |
| **FPS** | 30 |
| **State Dimensions** | 26 |
| **Action Dimensions** | 26 |
| **Camera Views** | 3 |
| **Dataset Size** | 913.84 MB |
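
A few derived figures follow directly from this table. The sketch below is simple arithmetic on the reported values, assuming every frame is sampled at a uniform 30 FPS; the variable names are illustrative.

```python
TOTAL_EPISODES = 100
TOTAL_FRAMES = 78640
FPS = 30
CAMERA_VIEWS = 3

total_seconds = TOTAL_FRAMES / FPS            # recorded time per camera
avg_frames = TOTAL_FRAMES / TOTAL_EPISODES    # mean episode length in frames
avg_seconds = avg_frames / FPS                # mean episode length in seconds
expected_videos = TOTAL_EPISODES * CAMERA_VIEWS

print(f"total duration: {total_seconds / 60:.1f} min")   # 43.7 min
print(f"average episode: {avg_frames:.1f} frames")       # 786.4 frames
print(f"average episode: {avg_seconds:.1f} s")           # 26.2 s
print(f"expected videos: {expected_videos}")             # 300
```

The expected video count (one file per episode per camera) matches the reported Total Videos value of 300.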
## Data Splits
The dataset is organized into the following splits:
- **Training**: Episodes 0:99 (all 100 episodes, indices 0 through 99)
## Dataset Structure
This dataset follows the LeRobot format and contains the following components:
### Data Files
- **Videos**: Compressed (AV1) video files containing RGB camera observations
- **State Data**: Robot joint positions, gripper opening, and end-effector pose (26 dimensions)
- **Action Data**: Robot action commands with the same 26-dimensional layout as the state
- **Metadata**: Episode metadata, timestamps, and annotations
### File Organization
- **Data Path Pattern**: `data/chunk-{id}/episode_{id}.parquet`
- **Video Path Pattern**: `videos/chunk-{id}/observation.images.cam_left_wrist_rgb/episode_{id}.mp4` (shown for `cam_left_wrist_rgb`; each camera has its own directory)
- **Chunking**: Data is organized into 1 chunk, with a chunk size of 1000
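
As a rough sketch of how these patterns resolve to concrete files, the helpers below build paths for a given episode. The zero-padding widths (3 digits for the chunk, 6 for the episode) are inferred from file names that appear in this card, such as `videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4`. Mapping an episode to its chunk by integer division with the chunk size is an assumption based on the usual LeRobot convention. The function names are hypothetical.

```python
CHUNK_SIZE = 1000
CAMERAS = ("cam_head_rgb", "cam_left_wrist_rgb", "cam_right_wrist_rgb")


def chunk_of(episode_index: int) -> int:
    """Assumed convention: episodes are grouped into chunks of CHUNK_SIZE."""
    return episode_index // CHUNK_SIZE


def data_path(episode_index: int) -> str:
    """Relative path of the Parquet file for one episode."""
    return f"data/chunk-{chunk_of(episode_index):03d}/episode_{episode_index:06d}.parquet"


def video_path(episode_index: int, camera: str) -> str:
    """Relative path of one camera's video for one episode."""
    if camera not in CAMERAS:
        raise ValueError(f"unknown camera: {camera}")
    return (
        f"videos/chunk-{chunk_of(episode_index):03d}/"
        f"observation.images.{camera}/episode_{episode_index:06d}.mp4"
    )


if __name__ == "__main__":
    print(data_path(11))
    for cam in CAMERAS:
        print(video_path(11, cam))
```

With 100 episodes and a chunk size of 1000, every episode in this dataset falls into `chunk-000`.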
### Data Structure (Tree)
```
Agilex_Cobot_Magic_fold_T-shirts_qced_hardlink/
|-- annotations
|   |-- eef_acc_mag_annotation.jsonl
|   |-- eef_direction_annotation.jsonl
|   |-- eef_velocity_annotation.jsonl
|   |-- gripper_activity_annotation.jsonl
|   |-- gripper_mode_annotation.jsonl
|   |-- scene_annotations.jsonl
|   `-- subtask_annotations.jsonl
|-- data
|   `-- chunk-000
|       |-- episode_000000.parquet
|       |-- episode_000001.parquet
|       |-- episode_000002.parquet
|       |-- episode_000003.parquet
|       |-- episode_000004.parquet
|       |-- episode_000005.parquet
|       |-- episode_000006.parquet
|       |-- episode_000007.parquet
|       |-- episode_000008.parquet
|       |-- episode_000009.parquet
|       |-- episode_000010.parquet
|       |-- episode_000011.parquet
|       `-- ... (88 more entries)
|-- meta
|   |-- episodes.jsonl
|   |-- episodes_stats.jsonl
|   |-- info.json
|   `-- tasks.jsonl
`-- videos
    `-- chunk-000
        |-- observation.images.cam_head_rgb
        |-- observation.images.cam_left_wrist_rgb
        `-- observation.images.cam_right_wrist_rgb
```
## Camera Views
This dataset includes 3 camera views: `cam_head_rgb`, `cam_left_wrist_rgb`, `cam_right_wrist_rgb`.
## Features (Full YAML)
```yaml
observation.images.cam_head_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.images.cam_left_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.images.cam_right_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.state:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_gripper_open
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_gripper_open
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
action:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_gripper_open
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_gripper_open
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
timestamp:
  dtype: float32
  shape:
  - 1
  names: null
frame_index:
  dtype: int64
  shape:
  - 1
  names: null
episode_index:
  dtype: int64
  shape:
  - 1
  names: null
index:
  dtype: int64
  shape:
  - 1
  names: null
task_index:
  dtype: int64
  shape:
  - 1
  names: null
subtask_annotation:
  names: null
  dtype: int32
  shape:
  - 5
scene_annotation:
  names: null
  dtype: int32
  shape:
  - 1
eef_sim_pose_state:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_sim_pose_action:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_direction_state:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_direction_action:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_velocity_state:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_velocity_action:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_acc_mag_state:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
eef_acc_mag_action:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
gripper_mode_state:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_mode_action:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_activity_state:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_activity_action:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_open_scale_state:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
gripper_open_scale_action:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
```
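The 26-dimensional `observation.state` and `action` vectors share one layout: 13 values for the left arm followed by 13 for the right arm, each ordered as six joint angles, one gripper value, three end-effector position components, and three Euler rotation components. The sketch below shows one way to split such a vector by that layout. The helper names (`side_names`, `split_vector`) and the output keys are hypothetical; only the field names and their order come from the feature listing.

```python
def side_names(side: str) -> list:
    """Rebuild the 13 per-arm field names in the order listed in the card."""
    joints = [f"{side}_arm_joint_{i}_rad" for i in range(1, 7)]
    gripper = [f"{side}_gripper_open"]
    pos = [f"{side}_eef_pos_{axis}_m" for axis in "xyz"]
    rot = [f"{side}_eef_rot_euler_{axis}_rad" for axis in "xyz"]
    return joints + gripper + pos + rot


STATE_NAMES = side_names("left") + side_names("right")


def split_vector(vec) -> dict:
    """Split a 26-element state or action vector into named groups."""
    if len(vec) != len(STATE_NAMES):
        raise ValueError(f"expected {len(STATE_NAMES)} values, got {len(vec)}")
    parts = {}
    for offset, side in ((0, "left"), (13, "right")):
        parts[f"{side}_arm_joints_rad"] = vec[offset:offset + 6]
        parts[f"{side}_gripper_open"] = vec[offset + 6]
        parts[f"{side}_eef_pos_m"] = vec[offset + 7:offset + 10]
        parts[f"{side}_eef_rot_euler_rad"] = vec[offset + 10:offset + 13]
    return parts


if __name__ == "__main__":
    demo = list(range(26))  # placeholder values, not real robot data
    for key, value in split_vector(demo).items():
        print(key, value)
```

The same slicing works on a NumPy array or a tensor, since it relies only on `len` and index slicing.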
## Available Annotations
The `annotations/` directory contains the following annotation files:
- `eef_acc_mag_annotation.jsonl`
- `eef_direction_annotation.jsonl`
- `eef_velocity_annotation.jsonl`
- `gripper_activity_annotation.jsonl`
- `gripper_mode_annotation.jsonl`
- `scene_annotations.jsonl`
- `subtask_annotations.jsonl`
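
The record schema of these files is not documented in this card, so the sketch below only shows a generic way to read a JSON Lines file with the standard library. The function name `read_jsonl` is hypothetical, and the demo records (with made-up `episode_index` and `label` fields) are placeholders used to exercise the reader, not real annotation data.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path) -> list:
    """Parse a JSON Lines file into a list of objects, skipping blank lines."""
    records = []
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    # Placeholder content; the real files' fields are not specified here.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo_annotation.jsonl"
        demo.write_text(
            '{"episode_index": 0, "label": "placeholder"}\n'
            '\n'
            '{"episode_index": 1, "label": "placeholder"}\n',
            encoding="utf-8",
        )
        print(read_jsonl(demo))
```

Pointing `read_jsonl` at, for example, `annotations/subtask_annotations.jsonl` in a local copy and printing the first record is a quick way to inspect the actual fields.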
## Dataset Tags
- `RoboCOIN`
- `LeRobot`
## Authors
### Contributors
This dataset is contributed by:
- RoboCOIN Team at Beijing Academy of Artificial Intelligence (BAAI)
### Annotators
No annotator information available.
## Links
- **Homepage:** [https://flagopen.github.io/RoboCOIN/](https://flagopen.github.io/RoboCOIN/)
- **Paper:** [https://arxiv.org/abs/2511.17441](https://arxiv.org/abs/2511.17441)
- **Repository:** [https://github.com/FlagOpen/RoboCOIN](https://github.com/FlagOpen/RoboCOIN)
## Contact and Support
For questions, issues, or feedback regarding this dataset, please contact us.
### Support
For technical support, please open an issue on our GitHub repository.
## License
apache-2.0
## Citation
If you use this dataset in your research, please cite:
```bibtex
@article{robocoin,
  title={RoboCOIN: An Open-Sourced Bimanual Robotic Data Collection for Integrated Manipulation},
  author={Shihan Wu and Xuecheng Liu and Shaoxuan Xie and Pengwei Wang and Xinghang Li and Bowen Yang and Zhe Li and Kai Zhu and Hongyu Wu and Yiheng Liu and Zhaoye Long and Yue Wang and Chong Liu and Dihan Wang and Ziqiang Ni and Xiang Yang and You Liu and Ruoxuan Feng and Runtian Xu and Lei Zhang and Denghang Huang and Chenghao Jin and Anlan Yin and Xinlong Wang and Zhenguo Sun and Junkai Zhao and Mengfei Du and Mingyu Cao and Xiansheng Chen and Hongyang Cheng and Xiaojie Zhang and Yankai Fu and Ning Chen and Cheng Chi and Sixiang Chen and Huaihai Lyu and Xiaoshuai Hao and Yequan Wang and Bo Lei and Dong Liu and Xi Yang and Yance Jiao and Tengfei Pan and Yunyan Zhang and Songjing Wang and Ziqian Zhang and Xu Liu and Ji Zhang and Caowei Meng and Zhizheng Zhang and Jiyang Gao and Song Wang and Xiaokun Leng and Zhiqiang Xie and Zhenzhen Zhou and Peng Huang and Wu Yang and Yandong Guo and Yichao Zhu and Suibing Zheng and Hao Cheng and Xinmin Ding and Yang Yue and Huanqian Wang and Chi Chen and Jingrui Pang and YuXi Qian and Haoran Geng and Lianli Gao and Haiyuan Li and Bin Fang and Gao Huang and Yaodong Yang and Hao Dong and He Wang and Hang Zhao and Yadong Mu and Di Hu and Hao Zhao and Tiejun Huang and Shanghang Zhang and Yonghua Lin and Zhongyuan Wang and Guocai Yao},
  journal={arXiv preprint arXiv:2511.17441},
  url={https://arxiv.org/abs/2511.17441},
  year={2025},
}
```
### Additional References
If you use this dataset, please also consider citing:
LeRobot Framework: https://github.com/huggingface/lerobot
## Version Information
Initial Release
Provider:
RoboCOIN
Collected Summary
Dataset Introduction

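Construction
In robot manipulation, high-quality datasets are essential for advancing imitation learning and reinforcement learning algorithms. The Agilex_Cobot_Magic_fold_T_shirts dataset is built on the LeRobot framework. The data was collected in an office scene and focuses on a single task: folding clothes. Using the Agilex_Cobot_Magic robot platform with two-finger grippers, it records 100 complete episodes totaling 78,640 frames. The data is stored in Parquet format and follows a right-handed coordinate frame, with state and action values expressed consistently in radians and meters.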
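Usage
Researchers can use this dataset to study robot manipulation skills. It is fully compatible with the LeRobot ecosystem: the Parquet data files and the corresponding videos can be loaded through the standard interfaces, giving access to observations, states, actions, and annotations. Its structured organization supports extracting either complete task trajectories or specific subtask sequences. The dataset is suited to training end-to-end visuomotor policies, behavior cloning, and offline reinforcement learning, and its annotations can also support research on task decomposition, skill abstraction, and interpretability.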
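Background and Challenges
Background Overview
In robot manipulation, dexterous bimanual tasks, especially those involving non-rigid objects such as folding clothes, remain a key challenge in moving robots from structured industrial settings into unstructured everyday environments. The Agilex_Cobot_Magic_fold_T_shirts dataset was created and open-sourced in 2025 by the RoboCOIN team at the Beijing Academy of Artificial Intelligence (BAAI). Its core research question is how to use high-quality real-world demonstration data to train robots to perform complex bimanual clothes folding. Built on the LeRobot framework, the dataset contains 100 complete demonstration episodes, 78,640 frames of multi-view visual observations, and high-dimensional robot state and action data. It is intended as a training resource for imitation learning, reinforcement learning, and related methods, in support of home service robots and general-purpose manipulation.
Current Challenges
The dataset targets dexterous manipulation of non-rigid objects. Clothing is a typical non-rigid, easily deformed object: estimating its state, planning grasp points, and modeling the dynamic physical interaction during folding are all difficult, and they place heavy demands on perception, planning, and control. Building the dataset poses its own challenges: collecting high-quality bimanual demonstrations efficiently and consistently through teleoperation; annotating a complex continuous manipulation process with fine-grained subtasks and atomic actions to support hierarchical learning; and keeping the multi-view video, robot state data, and annotations strictly synchronized and aligned in time, so that the result is a reliable benchmark for end-to-end policy learning.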
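Common Use Cases
Typical Use Case
In robot manipulation learning, this dataset provides detailed demonstrations of bimanual clothes folding. Its typical use is training imitation learning models from the 100 complete demonstration episodes, which cover atomic action sequences such as grasping, lifting, and folding. The dataset is built in the LeRobot format and contains multi-view visual observations paired with high-dimensional states and actions. It supports end-to-end policy learning and behavior cloning, and is particularly relevant to the complex dynamics of manipulating non-rigid objects.
Academic Problems Addressed
The dataset addresses the long-standing difficulty of handling non-rigid objects in robot manipulation. Its 11 annotated subtasks are intended to help with the sample-efficiency problem of imitation learning in complex continuous action spaces, and it provides a data foundation for research on bimanual coordination. By offering real-world demonstration data, it supports the shift from rigid to deformable objects and encourages theoretical and algorithmic work on the generalization and robustness of learned manipulation policies.
Practical Applications
In practice, the dataset can support the development of automation for home service robots or industrial sorting. For example, models trained on this data could be deployed in garment-folding systems in warehousing and logistics, or used by elder-care assistance robots to tidy clothes. The multi-camera setup approximates a real working environment, so learned policies may transfer more readily to physical robot platforms, reducing reliance on precise modeling and hand-written programs and improving adaptability and cost-effectiveness.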
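Recent Research
Latest Research Directions
In robot manipulation, dexterous folding of deformable objects has long been regarded as a highly challenging research topic. The Agilex_Cobot_Magic_fold_T_shirts dataset focuses on the specific scenario of two robot arms cooperating to fold a T-shirt. Its multi-view video, state and action data, and fine-grained subtask annotations make it a useful resource for advancing imitation learning and reinforcement learning algorithms. Recent research centers on using this kind of high-quality real-world interaction data to train general manipulation models that capture the physical deformation of clothing and execute complex bimanual coordination strategies. The dataset's LeRobot format and its connection to the RoboCOIN project also encourage collaboration in the open-source robotics community on data standardization and model generalization, which matters for the practical deployment of home service robots.
The content above was collected and summarized by 遇见数据集.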



