RoboCOIN/Agilex_Cobot_Magic_fold_T_shirts

Name: RoboCOIN/Agilex_Cobot_Magic_fold_T_shirts
Creator: RoboCOIN
Published: 2026-04-02 12:43:35
License: not described on the listing page (the dataset card below states apache-2.0)

Hugging Face · Updated 2026-04-02 · Indexed 2026-04-05

Download link:

https://hf-mirror.com/datasets/RoboCOIN/Agilex_Cobot_Magic_fold_T_shirts


Resource overview:

---
task_categories:
- robotics
language:
- en
extra_gated_prompt: 'By accessing this dataset, you agree to cite the associated paper in your research/publications—see the "Citation" section for details. You agree to not use the dataset to conduct experiments that cause harm to human subjects.'
extra_gated_fields:
  Company/Organization:
    type: 'text'
    description: 'e.g., "ETH Zurich", "Boston Dynamics", "Independent Researcher"'
  Country:
    type: 'country'
    description: 'e.g., "Germany", "China", "United States"'
tags:
- RoboCOIN
- LeRobot
license: apache-2.0
configs:
- config_name: default
  data_files: data/chunk-{id}/episode_{id}.parquet
---

# Agilex_Cobot_Magic_fold_T-shirts

## Dataset Description

This dataset uses an extended format based on LeRobot and is fully compatible with LeRobot.

## Task Preview

<video src="videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4" controls width="640"></video>

[View Video Directly](videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4)

### Overview

- **Total Episodes:** 100
- **Total Frames:** 78640
- **FPS:** 30
- **Dataset Size:** 913.84 MB
- **Robot Name:** `Agilex_Cobot_Magic`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** Not currently provided for this dataset.
- **Sensors:** `cam_head_rgb`, `cam_left_wrist_rgb`, `cam_right_wrist_rgb`
- **Camera Information:** cam_head_rgb; cam_left_wrist_rgb; cam_right_wrist_rgb
- **Scene:** `office_workspace->office`
- **Objects:** `table(unknown)`, `black_T-shirt(unknown)`
- **Task Description:** fold the clothes on the table.

### Primary Task Instruction

> fold the clothes on the table.
### Robot Configuration

- **Robot Name:** `Agilex_Cobot_Magic`
- **Codebase Version:** `v2.1`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** Not currently provided for this dataset.

## Scene and Objects

### Scene Type

`office_workspace->office`

### Objects

- `table(unknown)`
- `black_T-shirt(unknown)`

## Task Descriptions

- **Standardized Task Description:** `fold the clothes on the table.`
- **Operation Type:** Not currently provided for this dataset.
- **Environment Type:** Not currently provided for this dataset.

### Sub-Tasks

This dataset includes 11 distinct subtasks:

1. **Lift the black T-shirt with the left gripper** (Index: 0)
2. **Abnormal** (Index: 1)
3. **Lift the black T-shirt with the right gripper** (Index: 2)
4. **Grasp the black T-shirt with the left gripper** (Index: 3)
5. **End** (Index: 4)
6. **Fold the black T-shirt downward with the right gripper** (Index: 5)
7. **Grasp the black T-shirt with the right gripper** (Index: 6)
8. **Fold the black T-shirt downward with the left gripper** (Index: 7)
9. **Fold the black T-shirt from right to left with right gripper** (Index: 8)
10. **Use the left gripper to tidy up the clothes** (Index: 9)
11. **null** (Index: 10)

### Atomic Actions

- `grasp`
- `fold`
- `lift`
- `lower`

## Hardware and Sensors

### Sensors

- `cam_head_rgb`
- `cam_left_wrist_rgb`
- `cam_right_wrist_rgb`

### Camera Information

- `cam_head_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
- `cam_left_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
- `cam_right_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p

### Coordinate System

- **Definition:** `right-hand-frame`

### Dimensions & Units

- **Joint Rotation:** `radian`
- **End-Effector Rotation:** `radian`
- **End-Effector Translation:** `meter`

## Dataset Statistics

| Metric | Value |
|--------|-------|
| **Total Episodes** | 100 |
| **Total Frames** | 78640 |
| **Total Tasks** | 11 |
| **Total Videos** | 300 |
| **Total Chunks** | 1 |
| **Chunk Size** | 1000 |
| **FPS** | 30 |
| **State Dimensions** | 26 |
| **Action Dimensions** | 26 |
| **Camera Views** | 3 |
| **Dataset Size** | 913.84 MB |

## Data Splits

The dataset is organized into the following splits:

- **Training**: Episodes 0:99

## Dataset Structure

This dataset follows the LeRobot format and contains the following components:

### Data Files

- **Videos**: Compressed video files containing RGB camera observations
- **State Data**: Robot joint positions, velocities, and other state information
- **Action Data**: Robot action commands and trajectories
- **Metadata**: Episode metadata, timestamps, and annotations

### File Organization

- **Data Path Pattern**: `data/chunk-{id}/episode_{id}.parquet`
- **Video Path Pattern**: `videos/chunk-{id}/observation.images.cam_left_wrist_rgb/episode_{id}.mp4`
- **Chunking**: Data is organized into 1 chunk(s) of size 1000

### Data Structure (Tree)

```
Agilex_Cobot_Magic_fold_T-shirts_qced_hardlink/
|-- annotations
|   |-- eef_acc_mag_annotation.jsonl
|   |-- eef_direction_annotation.jsonl
|   |-- eef_velocity_annotation.jsonl
|   |-- gripper_activity_annotation.jsonl
|   |-- gripper_mode_annotation.jsonl
|   |-- scene_annotations.jsonl
|   `-- subtask_annotations.jsonl
|-- data
|   `-- chunk-000
|       |-- episode_000000.parquet
|       |-- episode_000001.parquet
|       |-- episode_000002.parquet
|       |-- episode_000003.parquet
|       |-- episode_000004.parquet
|       |-- episode_000005.parquet
|       |-- episode_000006.parquet
|       |-- episode_000007.parquet
|       |-- episode_000008.parquet
|       |-- episode_000009.parquet
|       |-- episode_000010.parquet
|       `-- episode_000011.parquet
|       `-- ... (88 more entries)
|-- meta
|   |-- episodes.jsonl
|   |-- episodes_stats.jsonl
|   |-- info.json
|   `-- tasks.jsonl
`-- videos
    `-- chunk-000
        |-- observation.images.cam_head_rgb
        |-- observation.images.cam_left_wrist_rgb
        `-- observation.images.cam_right_wrist_rgb
```

## Camera Views

This dataset includes 3 camera views: `cam_head_rgb`, `cam_left_wrist_rgb`, `cam_right_wrist_rgb`.

## Features (Full YAML)

```yaml
observation.images.cam_head_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.images.cam_left_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.images.cam_right_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.state:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_gripper_open
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_gripper_open
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
action:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_gripper_open
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_gripper_open
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
timestamp:
  dtype: float32
  shape:
  - 1
  names: null
frame_index:
  dtype: int64
  shape:
  - 1
  names: null
episode_index:
  dtype: int64
  shape:
  - 1
  names: null
index:
  dtype: int64
  shape:
  - 1
  names: null
task_index:
  dtype: int64
  shape:
  - 1
  names: null
subtask_annotation:
  names: null
  dtype: int32
  shape:
  - 5
scene_annotation:
  names: null
  dtype: int32
  shape:
  - 1
eef_sim_pose_state:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_sim_pose_action:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_direction_state:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_direction_action:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_velocity_state:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_velocity_action:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_acc_mag_state:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
eef_acc_mag_action:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
gripper_mode_state:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_mode_action:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_activity_state:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_activity_action:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_open_scale_state:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
gripper_open_scale_action:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
```

## Available Annotations

This dataset includes rich annotations to support diverse learning approaches:

- `eef_acc_mag_annotation.jsonl`
- `eef_direction_annotation.jsonl`
- `eef_velocity_annotation.jsonl`
- `gripper_activity_annotation.jsonl`
- `gripper_mode_annotation.jsonl`
- `scene_annotations.jsonl`
- `subtask_annotations.jsonl`

## Dataset Tags

- `RoboCOIN`
- `LeRobot`

## Authors

### Contributors

This dataset is contributed by:

- RoboCOIN Team at Beijing Academy of Artificial Intelligence (BAAI)

### Annotators

No annotator information available.
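The 26-dimensional `observation.state` and `action` layout listed under "Features (Full YAML)" repeats the same 13 fields for the left and right arm. The sketch below rebuilds that name list and maps a raw vector to named values. The helper names (`ARM_FIELDS`, `STATE_NAMES`, `name_state`) are hypothetical and are not part of LeRobot or the RoboCOIN tooling.

```python
# Per-arm fields in the order given by the feature list:
# 6 joint angles (rad), gripper opening, EEF position (m), EEF Euler rotation (rad).
ARM_FIELDS = (
    [f"arm_joint_{i}_rad" for i in range(1, 7)]
    + ["gripper_open"]
    + [f"eef_pos_{axis}_m" for axis in "xyz"]
    + [f"eef_rot_euler_{axis}_rad" for axis in "xyz"]
)

# Left arm occupies indices 0..12, right arm indices 13..25.
STATE_NAMES = [f"{side}_{field}" for side in ("left", "right") for field in ARM_FIELDS]


def name_state(vector):
    """Map a 26-element state or action vector to a {name: value} dict."""
    if len(vector) != len(STATE_NAMES):
        raise ValueError(f"expected {len(STATE_NAMES)} values, got {len(vector)}")
    return dict(zip(STATE_NAMES, vector))


if __name__ == "__main__":
    named = name_state(list(range(26)))
    print(named["left_gripper_open"], named["right_eef_pos_x_m"])
```

Indexing by name rather than by position keeps downstream code readable and guards against mixing up the left and right halves of the vector.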
## Links

- **Homepage:** [https://flagopen.github.io/RoboCOIN/](https://flagopen.github.io/RoboCOIN/)
- **Paper:** [https://arxiv.org/abs/2511.17441](https://arxiv.org/abs/2511.17441)
- **Repository:** [https://github.com/FlagOpen/RoboCOIN](https://github.com/FlagOpen/RoboCOIN)

## Contact and Support

For questions, issues, or feedback regarding this dataset, please contact us.

### Support

For technical support, please open an issue on our GitHub repository.

## License

apache-2.0

## Citation

If you use this dataset in your research, please cite:

```bibtex
@article{robocoin,
  title={RoboCOIN: An Open-Sourced Bimanual Robotic Data Collection for Integrated Manipulation},
  author={Shihan Wu, Xuecheng Liu, Shaoxuan Xie, Pengwei Wang, Xinghang Li, Bowen Yang, Zhe Li, Kai Zhu, Hongyu Wu, Yiheng Liu, Zhaoye Long, Yue Wang, Chong Liu, Dihan Wang, Ziqiang Ni, Xiang Yang, You Liu, Ruoxuan Feng, Runtian Xu, Lei Zhang, Denghang Huang, Chenghao Jin, Anlan Yin, Xinlong Wang, Zhenguo Sun, Junkai Zhao, Mengfei Du, Mingyu Cao, Xiansheng Chen, Hongyang Cheng, Xiaojie Zhang, Yankai Fu, Ning Chen, Cheng Chi, Sixiang Chen, Huaihai Lyu, Xiaoshuai Hao, Yequan Wang, Bo Lei, Dong Liu, Xi Yang, Yance Jiao, Tengfei Pan, Yunyan Zhang, Songjing Wang, Ziqian Zhang, Xu Liu, Ji Zhang, Caowei Meng, Zhizheng Zhang, Jiyang Gao, Song Wang, Xiaokun Leng, Zhiqiang Xie, Zhenzhen Zhou, Peng Huang, Wu Yang, Yandong Guo, Yichao Zhu, Suibing Zheng, Hao Cheng, Xinmin Ding, Yang Yue, Huanqian Wang, Chi Chen, Jingrui Pang, YuXi Qian, Haoran Geng, Lianli Gao, Haiyuan Li, Bin Fang, Gao Huang, Yaodong Yang, Hao Dong, He Wang, Hang Zhao, Yadong Mu, Di Hu, Hao Zhao, Tiejun Huang, Shanghang Zhang, Yonghua Lin, Zhongyuan Wang and Guocai Yao},
  journal={arXiv preprint arXiv:2511.17441},
  url = {https://arxiv.org/abs/2511.17441},
  year={2025},
}
```

### Additional References

If you use this dataset, please also consider citing:

- LeRobot Framework: https://github.com/huggingface/lerobot

## Version Information

Initial Release
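The path patterns in the "File Organization" section can be turned into concrete file paths. This is a minimal sketch, not the LeRobot API: the zero-padding widths (3 digits for chunks, 6 for episodes) and the `.mp4` extension are inferred from the directory tree and the task preview path in the card, and the rule `chunk = episode_index // chunk_size` is an assumption based on the stated chunk size of 1000. `episode_paths` is a hypothetical helper name.

```python
CHUNK_SIZE = 1000  # "Chunk Size" from the dataset statistics table
CAMERAS = ("cam_head_rgb", "cam_left_wrist_rgb", "cam_right_wrist_rgb")


def episode_paths(episode_index, chunk_size=CHUNK_SIZE):
    """Return (parquet_path, {camera: video_path}) relative to the dataset root."""
    if episode_index < 0:
        raise ValueError("episode_index must be non-negative")
    chunk = episode_index // chunk_size  # assumed chunking rule
    stem = f"episode_{episode_index:06d}"
    data_path = f"data/chunk-{chunk:03d}/{stem}.parquet"
    video_paths = {
        cam: f"videos/chunk-{chunk:03d}/observation.images.{cam}/{stem}.mp4"
        for cam in CAMERAS
    }
    return data_path, video_paths


if __name__ == "__main__":
    data_path, video_paths = episode_paths(11)
    print(data_path)
    for cam, path in video_paths.items():
        print(cam, path)
```

With 100 episodes and a chunk size of 1000, every episode in this dataset resolves to `chunk-000`.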

Provider:

RoboCOIN

Collected Summary

Dataset Introduction

Construction

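In robot manipulation, high-quality datasets are essential to advancing imitation learning and reinforcement learning algorithms. The Agilex_Cobot_Magic_fold_T_shirts dataset is built on the LeRobot framework. Data was collected in an office scene and focuses on a single task: folding clothes. Using the Agilex_Cobot_Magic robot platform with two-finger grippers, it records 100 complete episodes totaling 78,640 frames. The data is stored in Parquet format and follows a right-handed coordinate system, with state and action values expressed consistently in radians and meters.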
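The episode, frame, and FPS figures reported in the dataset card imply a total recording time and an average episode length. The short calculation below works these out; `summarize` is a hypothetical helper, and the per-episode values are averages only, since individual episode lengths are not given here.

```python
TOTAL_EPISODES = 100
TOTAL_FRAMES = 78640
FPS = 30
NUM_CAMERAS = 3


def summarize(total_frames, total_episodes, fps, num_cameras):
    """Derive aggregate durations and counts from the reported totals."""
    total_seconds = total_frames / fps
    return {
        "total_minutes": total_seconds / 60,
        "mean_frames_per_episode": total_frames / total_episodes,
        "mean_seconds_per_episode": total_seconds / total_episodes,
        "total_videos": total_episodes * num_cameras,
    }


if __name__ == "__main__":
    for key, value in summarize(TOTAL_FRAMES, TOTAL_EPISODES, FPS, NUM_CAMERAS).items():
        print(f"{key}: {value:.2f}")
```

The result is roughly 43.7 minutes of recording in total, about 26 seconds per episode on average, and 300 videos, which matches the "Total Videos" entry in the card.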

Usage

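Researchers can use this dataset to study robot manipulation skills. It is fully compatible with the LeRobot ecosystem: the Parquet data files and their corresponding video streams can be loaded through standard interfaces, giving access to observations, states, actions, and the various annotations. Its structured organization supports extracting either complete task trajectories or specific subtask sequences. The dataset is suited to training end-to-end visuomotor policies, to behavior cloning, and to use as an offline reinforcement learning benchmark. Its annotations can also support research on task decomposition, skill abstraction, and interpretability.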
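Extracting subtask sequences amounts to finding contiguous runs of the same subtask label within an episode. The sketch below assumes a hypothetical per-frame list of scalar subtask indices; the card does not document how the 5-element `subtask_annotation` feature is encoded, so converting it to one label per frame is left out. The index-to-name mapping is copied from the Sub-Tasks list in the card.

```python
from itertools import groupby

# Subtask names by index, as listed in the dataset card.
SUBTASKS = {
    0: "Lift the black T-shirt with the left gripper",
    1: "Abnormal",
    2: "Lift the black T-shirt with the right gripper",
    3: "Grasp the black T-shirt with the left gripper",
    4: "End",
    5: "Fold the black T-shirt downward with the right gripper",
    6: "Grasp the black T-shirt with the right gripper",
    7: "Fold the black T-shirt downward with the left gripper",
    8: "Fold the black T-shirt from right to left with right gripper",
    9: "Use the left gripper to tidy up the clothes",
    10: "null",
}


def subtask_segments(labels):
    """Split per-frame labels into (label, start, end) runs, with end exclusive."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length))
        start += length
    return segments


if __name__ == "__main__":
    # Hypothetical labels for a 6-frame clip.
    for label, start, end in subtask_segments([3, 3, 0, 0, 0, 4]):
        print(f"frames [{start}, {end}): {SUBTASKS[label]}")
```

Each `(start, end)` pair can then be used to slice the state, action, and frame arrays of the same episode.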

Background and Challenges

Background Overview

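In robot manipulation, dexterous bimanual tasks, especially those involving non-rigid objects such as clothes, remain a key challenge in moving robots from structured industrial environments into unstructured everyday settings. The Agilex_Cobot_Magic_fold_T_shirts dataset was created and open-sourced in 2025 by the RoboCOIN team at the Beijing Academy of Artificial Intelligence (BAAI). Its core research question is how to train robots to perform complex bimanual clothes-folding tasks from large-scale, high-quality real-world demonstration data. Built on the LeRobot framework, it contains 100 complete demonstration episodes, 78,640 frames of multi-view visual observations, and 26-dimensional robot state and action data. It is intended as a training resource for imitation learning, reinforcement learning, and related algorithms, in support of household service robots and general-purpose manipulation.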

Current Challenges

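The dataset targets the domain challenge of dexterous manipulation of non-rigid objects. Clothes are a typical non-rigid, easily deformed object: estimating their state, planning grasp points, and modeling the dynamic physical interaction during folding are all highly complex and place heavy demands on a robot's perception, planning, and control algorithms. Building the dataset poses its own challenges: collecting high-quality bimanual demonstrations efficiently and consistently through teleoperation; annotating a complex, continuous manipulation process with fine-grained subtasks and atomic actions to support hierarchical learning; and keeping multi-view video, robot state data, and annotations strictly time-synchronized and aligned, so that the result is a reliable benchmark for end-to-end policy learning.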
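One simple way to check the time alignment described above is to verify that consecutive `timestamp` values are spaced one frame period apart and to map each timestamp to a video frame index. The sketch assumes that timestamps are in seconds from the start of the episode and advance by `1 / fps` per row, which the card does not state explicitly. The function names and the tolerance value are hypothetical.

```python
FPS = 30


def timestamps_in_sync(timestamps, fps=FPS, tolerance_s=1e-4):
    """True if every consecutive pair of timestamps is 1/fps apart within tolerance."""
    period = 1.0 / fps
    return all(
        abs((later - earlier) - period) <= tolerance_s
        for earlier, later in zip(timestamps, timestamps[1:])
    )


def frame_for_time(t, fps=FPS):
    """Nearest video frame index for a timestamp in seconds (assumed convention)."""
    return round(t * fps)


if __name__ == "__main__":
    good = [i / FPS for i in range(5)]
    dropped = [0.0, 1 / FPS, 3 / FPS]  # one frame missing
    print(timestamps_in_sync(good), timestamps_in_sync(dropped))
    print(frame_for_time(1.0))
```

A check like this flags dropped or duplicated frames before the state rows are paired with decoded video frames.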

Common Use Cases

Classic Use Case

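In robot manipulation learning, this dataset provides detailed demonstration data for bimanual clothes folding. Its classic use is training imitation learning models from the 100 complete demonstration episodes, which cover sequences of atomic actions such as grasping, lifting, and folding. Built in the LeRobot format, the dataset pairs multi-view visual observations with high-dimensional state and action data, which supports the development of end-to-end policy learning and behavior cloning methods, particularly for the complex dynamics of non-rigid object manipulation.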

Academic Problems Addressed

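The dataset offers structured data for the long-standing challenge of handling non-rigid objects in robot manipulation. Its 11 finely annotated subtasks are aimed at the sample-efficiency problem that imitation learning faces in complex continuous action spaces, and they provide a data foundation for research on bimanual coordination. Its significance lies in supporting the shift in robot manipulation from rigid to deformable objects: real-world demonstration data encourages theoretical study and algorithmic work on the generalization and robustness of learning-based manipulation policies.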

Practical Applications

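In practice, the dataset supports the development of automated workflows for household service robots and industrial sorting. For example, models trained on this data could be deployed in garment-folding systems for warehousing and logistics, or in elderly-care assistive robots that tidy clothes. The multi-camera setup mimics a real working environment, which helps learned policies transfer to physical robot platforms, reduces reliance on precise modeling and hand-written programs, and improves the adaptability and economic viability of automation systems.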

Recent Research on the Dataset