RoboCOIN/Agilex_Cobot_Magic_pour_water_middle_cup

Name: RoboCOIN/Agilex_Cobot_Magic_pour_water_middle_cup
Creator: RoboCOIN
Published: 2026-04-02 12:41:49
License: apache-2.0 (as declared in the dataset card; the listing itself gives no description)

Hugging Face · Updated 2026-04-02 · Indexed 2026-04-05

Download link:

https://hf-mirror.com/datasets/RoboCOIN/Agilex_Cobot_Magic_pour_water_middle_cup


Resource overview:

---
task_categories:
- robotics
language:
- en
extra_gated_prompt: 'By accessing this dataset, you agree to cite the associated paper in your research/publications; see the "Citation" section for details. You agree to not use the dataset to conduct experiments that cause harm to human subjects.'
extra_gated_fields:
  Company/Organization:
    type: 'text'
    description: 'e.g., "ETH Zurich", "Boston Dynamics", "Independent Researcher"'
  Country:
    type: 'country'
    description: 'e.g., "Germany", "China", "United States"'
tags:
- RoboCOIN
- LeRobot
license: apache-2.0
configs:
- config_name: default
  data_files: data/chunk-{id}/episode_{id}.parquet
---

# Agilex_Cobot_Magic_pour_water_middle_cup

## Dataset Description

This dataset uses an extended format based on LeRobot and is fully compatible with LeRobot.

## Task Preview

<video src="videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4" controls width="640"></video>

[View Video Directly](videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4)

### Overview

- **Total Episodes:** 192
- **Total Frames:** 130785
- **FPS:** 30
- **Dataset Size:** 1.10 GB
- **Robot Name:** `Agilex_Cobot_Magic`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** `Due to some reasons, this dataset temporarily cannot provide the teleoperation type information.`
- **Sensors:** `cam_head_rgb`, `cam_left_wrist_rgb`, `cam_right_wrist_rgb`
- **Camera Information:** cam_head_rgb; cam_left_wrist_rgb; cam_right_wrist_rgb
- **Scene:** `office_workspace->office`
- **Objects:** `table(unknown)`, `transparent_bottle(unknown)`, `pink_clear_plastic_cup(unknown)`, `blue_clear_plastic_cup(unknown)`, `black_clear_plastic_cup(unknown)`
- **Task Description:** pick up the bottle filled with water and pour water into the middle cup of the three cups.

### Primary Task Instruction

> pick up the bottle filled with water and pour water into the middle cup of the three cups.

### Robot Configuration

- **Robot Name:** `Agilex_Cobot_Magic`
- **Codebase Version:** `v2.1`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** `Due to some reasons, this dataset temporarily cannot provide the teleoperation type information.`

## Scene and Objects

### Scene Type

`office_workspace->office`

### Objects

- `table(unknown)`
- `transparent_bottle(unknown)`
- `pink_clear_plastic_cup(unknown)`
- `blue_clear_plastic_cup(unknown)`
- `black_clear_plastic_cup(unknown)`

## Task Descriptions

- **Standardized Task Description:** `pick up the bottle filled with water and pour water into the middle cup of the three cups.`
- **Operation Type:** `Due to some reasons, this dataset temporarily cannot provide the operation type information.`
- **Environment Type:** `Due to some reasons, this dataset temporarily cannot provide the environment type information.`

### Sub-Tasks

This dataset includes 18 distinct subtasks:

1. **Abnormal** (Index: 0)
2. **Pour water from brown bottle to red cup with the right gripper** (Index: 1)
3. **Grasp the transparent bottle with the left gripper** (Index: 2)
4. **Pour water from transparent bottle to grey cup with the right gripper** (Index: 3)
5. **Pour water from transparent bottle to red cup with the left gripper** (Index: 4)
6. **Pour water from transparent bottle to blue cup with the right gripper** (Index: 5)
7. **Pour water from transparent bottle to red cup with the right gripper** (Index: 6)
8. **Pour water from brown bottle to grey cup with the right gripper** (Index: 7)
9. **Pour water from transparent bottle to brown cup with the left gripper** (Index: 8)
10. **End** (Index: 9)
11. **Grasp the transparent bottle with the right gripper** (Index: 10)
12. **Pour water from brown bottle to blue cup with the right gripper** (Index: 11)
13. **Pour water from transparent bottle to brown cup with the right gripper** (Index: 12)
14. **Place the transparent bottle with the right gripper** (Index: 13)
15. **Grasp the brown bottle with the right gripper** (Index: 14)
16. **Place the transparent bottle with the left gripper** (Index: 15)
17. **Place the brown bottle with the right gripper** (Index: 16)
18. **null** (Index: 17)

### Atomic Actions

- `grasp`
- `lift`
- `lower`
- `pour`

## Hardware and Sensors

### Sensors

- `cam_head_rgb`
- `cam_left_wrist_rgb`
- `cam_right_wrist_rgb`

### Camera Information

- `cam_head_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
- `cam_left_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
- `cam_right_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p

### Coordinate System

- **Definition:** `right-hand-frame`

### Dimensions & Units

- **Joint Rotation:** `radian`
- **End-Effector Rotation:** `radian`
- **End-Effector Translation:** `meter`

## Dataset Statistics

| Metric | Value |
|--------|-------|
| **Total Episodes** | 192 |
| **Total Frames** | 130785 |
| **Total Tasks** | 18 |
| **Total Videos** | 576 |
| **Total Chunks** | 1 |
| **Chunk Size** | 1000 |
| **FPS** | 30 |
| **State Dimensions** | 26 |
| **Action Dimensions** | 26 |
| **Camera Views** | 3 |
| **Dataset Size** | 1.10 GB |

## Data Splits

The dataset is organized into the following splits:

- **Training**: Episodes 0:191

## Dataset Structure

This dataset follows the LeRobot format and contains the following components:

### Data Files

- **Videos**: Compressed video files containing RGB camera observations
- **State Data**: Robot joint positions, velocities, and other state information
- **Action Data**: Robot action commands and trajectories
- **Metadata**: Episode metadata, timestamps, and annotations

### File Organization

- **Data Path Pattern**: `data/chunk-{id}/episode_{id}.parquet`
- **Video Path Pattern**: `videos/chunk-{id}/observation.images.cam_left_wrist_rgb/episode_{id}.mp4`
- **Chunking**: Data is organized into 1 chunk(s) of size 1000

### Data Structure (Tree)

```
Agilex_Cobot_Magic_pour_water_middle_cup_qced_hardlink/
|-- annotations
|   |-- eef_acc_mag_annotation.jsonl
|   |-- eef_direction_annotation.jsonl
|   |-- eef_velocity_annotation.jsonl
|   |-- gripper_activity_annotation.jsonl
|   |-- gripper_mode_annotation.jsonl
|   |-- scene_annotations.jsonl
|   `-- subtask_annotations.jsonl
|-- data
|   `-- chunk-000
|       |-- episode_000000.parquet
|       |-- episode_000001.parquet
|       |-- episode_000002.parquet
|       |-- episode_000003.parquet
|       |-- episode_000004.parquet
|       |-- episode_000005.parquet
|       |-- episode_000006.parquet
|       |-- episode_000007.parquet
|       |-- episode_000008.parquet
|       |-- episode_000009.parquet
|       |-- episode_000010.parquet
|       `-- episode_000011.parquet
|       `-- ... (180 more entries)
|-- meta
|   |-- episodes.jsonl
|   |-- episodes_stats.jsonl
|   |-- info.json
|   `-- tasks.jsonl
`-- videos
    `-- chunk-000
        |-- observation.images.cam_head_rgb
        |-- observation.images.cam_left_wrist_rgb
        `-- observation.images.cam_right_wrist_rgb
```

## Camera Views

This dataset includes 3 camera views: `cam_head_rgb`, `cam_left_wrist_rgb`, `cam_right_wrist_rgb`.

## Features (Full YAML)

```yaml
observation.images.cam_head_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.images.cam_left_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.images.cam_right_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.state:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_gripper_open
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_gripper_open
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
action:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_gripper_open
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_gripper_open
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
timestamp:
  dtype: float32
  shape:
  - 1
  names: null
frame_index:
  dtype: int64
  shape:
  - 1
  names: null
episode_index:
  dtype: int64
  shape:
  - 1
  names: null
index:
  dtype: int64
  shape:
  - 1
  names: null
task_index:
  dtype: int64
  shape:
  - 1
  names: null
subtask_annotation:
  names: null
  dtype: int32
  shape:
  - 5
scene_annotation:
  names: null
  dtype: int32
  shape:
  - 1
eef_sim_pose_state:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_sim_pose_action:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_direction_state:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_direction_action:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_velocity_state:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_velocity_action:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_acc_mag_state:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
eef_acc_mag_action:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
gripper_mode_state:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_mode_action:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_activity_state:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_activity_action:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_open_scale_state:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
gripper_open_scale_action:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
```

## Available Annotations

This dataset includes rich annotations to support diverse learning approaches:

- `eef_acc_mag_annotation.jsonl`
- `eef_direction_annotation.jsonl`
- `eef_velocity_annotation.jsonl`
- `gripper_activity_annotation.jsonl`
- `gripper_mode_annotation.jsonl`
- `scene_annotations.jsonl`
- `subtask_annotations.jsonl`

## Dataset Tags

- `RoboCOIN`
- `LeRobot`

## Authors

### Contributors

This dataset is contributed by:

- RoboCOIN Team at Beijing Academy of Artificial Intelligence (BAAI)

### Annotators

No annotator information available.

## Links

- **Homepage:** [https://flagopen.github.io/RoboCOIN/](https://flagopen.github.io/RoboCOIN/)
- **Paper:** [https://arxiv.org/abs/2511.17441](https://arxiv.org/abs/2511.17441)
- **Repository:** [https://github.com/FlagOpen/RoboCOIN](https://github.com/FlagOpen/RoboCOIN)

## Contact and Support

For questions, issues, or feedback regarding this dataset, please contact us.

### Support

For technical support, please open an issue on our GitHub repository.

## License

apache-2.0

## Citation

If you use this dataset in your research, please cite:

```bibtex
@article{robocoin,
  title={RoboCOIN: An Open-Sourced Bimanual Robotic Data Collection for Integrated Manipulation},
  author={Shihan Wu, Xuecheng Liu, Shaoxuan Xie, Pengwei Wang, Xinghang Li, Bowen Yang, Zhe Li, Kai Zhu, Hongyu Wu, Yiheng Liu, Zhaoye Long, Yue Wang, Chong Liu, Dihan Wang, Ziqiang Ni, Xiang Yang, You Liu, Ruoxuan Feng, Runtian Xu, Lei Zhang, Denghang Huang, Chenghao Jin, Anlan Yin, Xinlong Wang, Zhenguo Sun, Junkai Zhao, Mengfei Du, Mingyu Cao, Xiansheng Chen, Hongyang Cheng, Xiaojie Zhang, Yankai Fu, Ning Chen, Cheng Chi, Sixiang Chen, Huaihai Lyu, Xiaoshuai Hao, Yequan Wang, Bo Lei, Dong Liu, Xi Yang, Yance Jiao, Tengfei Pan, Yunyan Zhang, Songjing Wang, Ziqian Zhang, Xu Liu, Ji Zhang, Caowei Meng, Zhizheng Zhang, Jiyang Gao, Song Wang, Xiaokun Leng, Zhiqiang Xie, Zhenzhen Zhou, Peng Huang, Wu Yang, Yandong Guo, Yichao Zhu, Suibing Zheng, Hao Cheng, Xinmin Ding, Yang Yue, Huanqian Wang, Chi Chen, Jingrui Pang, YuXi Qian, Haoran Geng, Lianli Gao, Haiyuan Li, Bin Fang, Gao Huang, Yaodong Yang, Hao Dong, He Wang, Hang Zhao, Yadong Mu, Di Hu, Hao Zhao, Tiejun Huang, Shanghang Zhang, Yonghua Lin, Zhongyuan Wang and Guocai Yao},
  journal={arXiv preprint arXiv:2511.17441},
  url = {https://arxiv.org/abs/2511.17441},
  year={2025},
}
```

### Additional References

If you use this dataset, please also consider citing:

LeRobot Framework: https://github.com/huggingface/lerobot

## Version Information

Initial Release
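The 26 names listed for `observation.state` and `action` in the feature schema repeat the same 13 fields for the left arm and then the right arm. The following is a minimal sketch, not part of the dataset's own tooling, of how one such vector could be split into per-arm dictionaries. The helper name `split_state` and the short field names are hypothetical; only the ordering is taken from the schema.

```python
# Per-arm field order, taken from the `observation.state` names in the
# feature schema with the "left_"/"right_" prefix removed.
ARM_FIELDS = [
    "arm_joint_1_rad", "arm_joint_2_rad", "arm_joint_3_rad",
    "arm_joint_4_rad", "arm_joint_5_rad", "arm_joint_6_rad",
    "gripper_open",
    "eef_pos_x_m", "eef_pos_y_m", "eef_pos_z_m",
    "eef_rot_euler_x_rad", "eef_rot_euler_y_rad", "eef_rot_euler_z_rad",
]

# Full 26-entry name list: all left-arm fields, then all right-arm fields.
STATE_NAMES = [f"{side}_{field}" for side in ("left", "right") for field in ARM_FIELDS]


def split_state(vector):
    """Split one 26-dim state/action vector into per-arm dicts.

    `split_state` is a hypothetical helper written for this example.
    """
    if len(vector) != len(STATE_NAMES):
        raise ValueError(f"expected {len(STATE_NAMES)} values, got {len(vector)}")
    n = len(ARM_FIELDS)
    return {
        "left": dict(zip(ARM_FIELDS, vector[:n])),
        "right": dict(zip(ARM_FIELDS, vector[n:])),
    }


if __name__ == "__main__":
    demo = split_state(list(range(26)))  # placeholder values, not real data
    print(demo["right"]["gripper_open"])
```

Splitting by position keeps the example independent of any particular loader; the same slicing applies to a row read from an episode Parquet file.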

Provider:

RoboCOIN

Collected Summary

Dataset Introduction

Construction

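The Agilex_Cobot_Magic_pour_water_middle_cup dataset was collected in an extended format based on the LeRobot framework. It records the Agilex_Cobot_Magic dual-arm robot performing a water-pouring task in an office scene and contains 192 complete episodes totalling 130,785 frames. Collection relies on multi-view vision sensors (one head RGB camera and two wrist RGB cameras) that record synchronized video at 30 frames per second, alongside the robot's 26-dimensional state and action vectors (joint angles, gripper opening, and end-effector pose for each arm). The raw data is stored in chunked Parquet files and supplemented with structured annotations, which supports data quality and traceability.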
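As a quick consistency check on these figures, the headline numbers from the dataset card can be combined into derived quantities such as total recording time and average episode length. This is a small illustrative sketch; the function name `summarize` is hypothetical, and the inputs are the values stated in the card.

```python
# Values copied from the dataset card's statistics table.
TOTAL_EPISODES = 192
TOTAL_FRAMES = 130785
FPS = 30
CAMERA_VIEWS = 3


def summarize(episodes=TOTAL_EPISODES, frames=TOTAL_FRAMES, fps=FPS, cameras=CAMERA_VIEWS):
    """Derive simple aggregate figures from the headline statistics."""
    total_seconds = frames / fps
    return {
        # One video file per episode per camera view.
        "total_videos": episodes * cameras,
        "total_minutes": total_seconds / 60,
        "mean_frames_per_episode": frames / episodes,
        "mean_seconds_per_episode": total_seconds / episodes,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value:.2f}")
```

The derived video count (192 episodes × 3 views = 576) agrees with the "Total Videos" entry in the card, and the averages suggest episodes of roughly 23 seconds each.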

Features

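The dataset's core feature is its detailed coverage of a dual-arm manipulation task combined with multimodal data. It focuses on the embodied-AI task "pick up the bottle and pour water into the middle cup"; the scene contains a table, a transparent bottle, and three plastic cups of different colors. RGB video is captured synchronously from three viewpoints (head, left wrist, right wrist), each at 640x480, giving models rich visual context. The dataset also provides a fine-grained action decomposition covering the atomic actions grasp, lift, lower, and pour, together with state and action annotations for end-effector pose, velocity, gripper mode, and related signals, which makes it useful for training and evaluating imitation learning and reinforcement learning algorithms.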
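Because all three views are recorded at 30 FPS, a row's `timestamp` can in principle be mapped to a frame position in each video. The sketch below assumes, without confirmation from the card, that timestamps start at 0 in every episode and advance by exactly `1 / fps` per frame; both helper names are hypothetical.

```python
FPS = 30  # from the dataset card


def frame_to_timestamp(frame_index, fps=FPS):
    """Timestamp in seconds of a frame, ASSUMING uniform spacing from t = 0."""
    return frame_index / fps


def timestamp_to_frame(timestamp, fps=FPS):
    """Nearest frame index for a timestamp, under the same assumption.

    Rounding absorbs the small error introduced by float32 storage.
    """
    return round(timestamp * fps)


if __name__ == "__main__":
    t = frame_to_timestamp(45)
    print(t, timestamp_to_frame(t))
```

If the recorded timestamps turn out to be irregular, the `frame_index` column in the Parquet files would be the safer key for aligning rows with video frames.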

Usage

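To use the dataset for robot learning research, follow its LeRobot-compatible layout. The main data lives in Parquet files under `data/chunk-{id}/`, which hold the core sequences: observations, states, actions, and timestamps. The multi-view videos are stored in the corresponding subdirectories of `videos/`. Researchers can load these structured files directly to train end-to-end visuomotor policies or to run behavior cloning. The annotation files (for example the subtask and scene annotations) support finer-grained task analysis and curriculum learning. The dataset ships with a single training split covering all 192 episodes, so it can be integrated into an existing robot learning pipeline to advance dual-arm dexterous manipulation algorithms.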
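The directory layout described above can be turned into concrete file paths. The sketch below reproduces the zero-padded naming visible in the dataset card (`chunk-000`, `episode_000000`). It assumes that the chunk number is the episode index divided by the chunk size of 1000, which the card does not state explicitly since only one chunk exists. The helper names are hypothetical.

```python
from pathlib import PurePosixPath

CHUNK_SIZE = 1000  # "Chunk Size" from the dataset card
CAMERAS = ("cam_head_rgb", "cam_left_wrist_rgb", "cam_right_wrist_rgb")


def episode_chunk(episode_index, chunk_size=CHUNK_SIZE):
    """Chunk number for an episode (ASSUMED: integer division by chunk size)."""
    return episode_index // chunk_size


def data_path(episode_index):
    """Relative path of the Parquet file for one episode."""
    chunk = f"chunk-{episode_chunk(episode_index):03d}"
    return PurePosixPath("data") / chunk / f"episode_{episode_index:06d}.parquet"


def video_path(episode_index, camera):
    """Relative path of one camera's video for one episode."""
    if camera not in CAMERAS:
        raise ValueError(f"unknown camera: {camera!r}")
    chunk = f"chunk-{episode_chunk(episode_index):03d}"
    return (PurePosixPath("videos") / chunk
            / f"observation.images.{camera}" / f"episode_{episode_index:06d}.mp4")


if __name__ == "__main__":
    print(data_path(11))
    for cam in CAMERAS:
        print(video_path(11, cam))
```

A path built this way could then be joined to the local download directory and passed to a Parquet reader such as `pandas.read_parquet`.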

Background and Challenges

Background Overview

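In robot manipulation learning, high-quality, large-scale real-world interaction data is essential for advancing imitation learning and reinforcement learning algorithms. The Agilex_Cobot_Magic_pour_water_middle_cup dataset was built and released in 2025 by the RoboCOIN team at the Beijing Academy of Artificial Intelligence (BAAI) as part of the RoboCOIN project. It focuses on a dual-arm collaborative robot performing a fine-grained water-pouring task in an unstructured office environment. The core research question is how to use multi-view visual perception and high-dimensional state and action data to train a robot on a sequential task that involves grasping, pose control, and handling a fluid. Its standardized LeRobot-based format makes it a real-world benchmark for studying generalization and transfer of manipulation policies in embodied AI and dexterous manipulation research.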

Current Challenges

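The dataset targets the challenges of dexterous manipulation tasks that involve fluids and multiple objects. The pouring task requires the robot not only to grasp the transparent bottle precisely, but also to control the water stream relative to the cup's position while pouring, which places high demands on the continuity and adaptability of its motion. On the construction side, the main challenges come from the complexity of real-world data collection: three video streams (head and both wrists) must be recorded in sync at a high frame rate, and the 26-dimensional state and action data must be accurately aligned and annotated. Liquid sloshing, object transparency, and changes in ambient lighting also make it difficult to keep the data consistent and of high quality. In addition, converting raw sensor data into the standardized, reusable LeRobot format involves substantial engineering and validation work.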

Common Scenarios

Classic Use Case

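In robot manipulation learning, this dataset offers a representative example of coordinated dual-arm operation. Its core scenario is a fine-grained liquid-pouring task in an office environment: the robot must identify and grasp the transparent bottle, then pour water accurately into the middle one of three cups. The process involves multi-view visual perception and dual-arm motion planning and coordinated control, and it provides structured data for studying end-to-end learning of complex manipulation sequences. The dataset contains 192 complete episodes covering atomic actions such as grasp, lift, and pour, giving imitation learning and reinforcement learning algorithms a substantial set of training samples.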

Derived and Related Work

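The RoboCOIN framework, of which this dataset is a part, has given rise to a body of research on large-scale robot data collection and learning. This work centers on developing efficient behavior cloning algorithms, exploring model-based reinforcement learning for dual-arm manipulation, and studying how multimodal representation learning affects the generalization of manipulation skills. By using the dataset's annotations, such as end-effector trajectories and gripper activity states, these studies aim to improve the interpretability and robustness of manipulation policies. As part of the LeRobot ecosystem, the dataset also supports collaboration in the open-source robot learning community on standardized data formats and algorithm evaluation benchmarks.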

Recent Research on the Dataset