RoboCOIN/Agilex_Cobot_Magic_storage_object_basket

Name: RoboCOIN/Agilex_Cobot_Magic_storage_object_basket
Creator: RoboCOIN
Published: 2026-04-02 12:44:55
License: apache-2.0

Hugging Face | Updated 2026-04-02 | Indexed 2026-04-05

Download link:

https://hf-mirror.com/datasets/RoboCOIN/Agilex_Cobot_Magic_storage_object_basket

Resource description:

---
task_categories:
- robotics
language:
- en
extra_gated_prompt: 'By accessing this dataset, you agree to cite the associated paper in your research/publications; see the "Citation" section for details. You agree to not use the dataset to conduct experiments that cause harm to human subjects.'
extra_gated_fields:
  Company/Organization:
    type: 'text'
    description: 'e.g., "ETH Zurich", "Boston Dynamics", "Independent Researcher"'
  Country:
    type: 'country'
    description: 'e.g., "Germany", "China", "United States"'
tags:
- RoboCOIN
- LeRobot
license: apache-2.0
configs:
- config_name: default
  data_files: data/chunk-{id}/episode_{id}.parquet
---

# Agilex_Cobot_Magic_storage_object_basket

## Dataset Description

This dataset uses an extended format based on LeRobot and is fully compatible with LeRobot.

## Task Preview

<video src="videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4" controls width="640"></video>

[View Video Directly](videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4)

### Overview

- **Total Episodes:** 100
- **Total Frames:** 38216
- **FPS:** 30
- **Dataset Size:** 585.85 MB
- **Robot Name:** `Agilex_Cobot_Magic`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** `Due to some reasons, this dataset temporarily cannot provide the teleoperation type information.`
- **Sensors:** `cam_head_rgb`, `cam_left_wrist_rgb`, `cam_right_wrist_rgb`
- **Camera Information:** cam_head_rgb; cam_left_wrist_rgb; cam_right_wrist_rgb
- **Scene:** `household->living_room`
- **Objects:** `table(unknown)`, `brown_basket(unknown)`, `banana(unknown)`, `bread(unknown)`, `apple(unknown)`, `avocado(unknown)`, `glass_cup(unknown)`
- **Task Description:** the right gripper storage the items on the table into the basket.

### Primary Task Instruction

> the right gripper storage the items on the table into the basket.

### Robot Configuration

- **Robot Name:** `Agilex_Cobot_Magic`
- **Codebase Version:** `v2.1`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** `Due to some reasons, this dataset temporarily cannot provide the teleoperation type information.`

## Scene and Objects

### Scene Type

`household->living_room`

### Objects

- `table(unknown)`
- `brown_basket(unknown)`
- `banana(unknown)`
- `bread(unknown)`
- `apple(unknown)`
- `avocado(unknown)`
- `glass_cup(unknown)`

## Task Descriptions

- **Standardized Task Description:** `the right gripper storage the items on the table into the basket.`
- **Operation Type:** `Due to some reasons, this dataset temporarily cannot provide the operation type information.`
- **Environment Type:** `Due to some reasons, this dataset temporarily cannot provide the environment type information.`

### Sub-Tasks

This dataset includes 6 distinct subtasks:

1. **Place the XX into the basket with the left gripper** (Index: 0)
2. **Grasp the XX with the right gripper** (Index: 1)
3. **Grasp the XX with the left gripper** (Index: 2)
4. **End** (Index: 3)
5. **Place the XX into the basket with the right gripper** (Index: 4)
6. **null** (Index: 5)

### Atomic Actions

- `grasp`
- `lift`
- `lower`

## Hardware and Sensors

### Sensors

- `cam_head_rgb`
- `cam_left_wrist_rgb`
- `cam_right_wrist_rgb`

### Camera Information

- `cam_head_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
- `cam_left_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
- `cam_right_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p

### Coordinate System

- **Definition:** `right-hand-frame`

### Dimensions & Units

- **Joint Rotation:** `radian`
- **End-Effector Rotation:** `radian`
- **End-Effector Translation:** `meter`

## Dataset Statistics

| Metric | Value |
|--------|-------|
| **Total Episodes** | 100 |
| **Total Frames** | 38216 |
| **Total Tasks** | 6 |
| **Total Videos** | 300 |
| **Total Chunks** | 1 |
| **Chunk Size** | 1000 |
| **FPS** | 30 |
| **State Dimensions** | 26 |
| **Action Dimensions** | 26 |
| **Camera Views** | 3 |
| **Dataset Size** | 585.85 MB |

## Data Splits

The dataset is organized into the following splits:

- **Training**: Episodes 0:99

## Dataset Structure

This dataset follows the LeRobot format and contains the following components:

### Data Files

- **Videos**: Compressed video files containing RGB camera observations
- **State Data**: Robot joint positions, velocities, and other state information
- **Action Data**: Robot action commands and trajectories
- **Metadata**: Episode metadata, timestamps, and annotations

### File Organization

- **Data Path Pattern**: `data/chunk-{id}/episode_{id}.parquet`
- **Video Path Pattern**: `videos/chunk-{id}/observation.images.cam_left_wrist_rgb/episode_{id}.mp4`
- **Chunking**: Data is organized into 1 chunk(s) of size 1000

### Data Structure (Tree)

```
Agilex_Cobot_Magic_storage_object_basket_qced_hardlink/
|-- annotations
|   |-- eef_acc_mag_annotation.jsonl
|   |-- eef_direction_annotation.jsonl
|   |-- eef_velocity_annotation.jsonl
|   |-- gripper_activity_annotation.jsonl
|   |-- gripper_mode_annotation.jsonl
|   |-- scene_annotations.jsonl
|   `-- subtask_annotations.jsonl
|-- data
|   `-- chunk-000
|       |-- episode_000000.parquet
|       |-- episode_000001.parquet
|       |-- episode_000002.parquet
|       |-- episode_000003.parquet
|       |-- episode_000004.parquet
|       |-- episode_000005.parquet
|       |-- episode_000006.parquet
|       |-- episode_000007.parquet
|       |-- episode_000008.parquet
|       |-- episode_000009.parquet
|       |-- episode_000010.parquet
|       `-- episode_000011.parquet
|       `-- ... (88 more entries)
|-- meta
|   |-- episodes.jsonl
|   |-- episodes_stats.jsonl
|   |-- info.json
|   `-- tasks.jsonl
`-- videos
    `-- chunk-000
        |-- observation.images.cam_head_rgb
        |-- observation.images.cam_left_wrist_rgb
        `-- observation.images.cam_right_wrist_rgb
```

## Camera Views

This dataset includes 3 camera views: `cam_head_rgb`, `cam_left_wrist_rgb`, `cam_right_wrist_rgb`.

## Features (Full YAML)

```yaml
observation.images.cam_head_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.images.cam_left_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.images.cam_right_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.state:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_gripper_open
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_gripper_open
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
action:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_gripper_open
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_gripper_open
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
timestamp:
  dtype: float32
  shape:
  - 1
  names: null
frame_index:
  dtype: int64
  shape:
  - 1
  names: null
episode_index:
  dtype: int64
  shape:
  - 1
  names: null
index:
  dtype: int64
  shape:
  - 1
  names: null
task_index:
  dtype: int64
  shape:
  - 1
  names: null
subtask_annotation:
  names: null
  dtype: int32
  shape:
  - 5
scene_annotation:
  names: null
  dtype: int32
  shape:
  - 1
eef_sim_pose_state:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_sim_pose_action:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_direction_state:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_direction_action:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_velocity_state:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_velocity_action:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_acc_mag_state:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
eef_acc_mag_action:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
gripper_mode_state:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_mode_action:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_activity_state:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_activity_action:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_open_scale_state:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
gripper_open_scale_action:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
```

## Available Annotations

This dataset includes rich annotations to support diverse learning approaches:

- `eef_acc_mag_annotation.jsonl`
- `eef_direction_annotation.jsonl`
- `eef_velocity_annotation.jsonl`
- `gripper_activity_annotation.jsonl`
- `gripper_mode_annotation.jsonl`
- `scene_annotations.jsonl`
- `subtask_annotations.jsonl`

## Dataset Tags

- `RoboCOIN`
- `LeRobot`

## Authors

### Contributors

This dataset is contributed by:

- RoboCOIN Team at Beijing Academy of Artificial Intelligence (BAAI)

### Annotators

No annotator information available.

## Links

- **Homepage:** [https://flagopen.github.io/RoboCOIN/](https://flagopen.github.io/RoboCOIN/)
- **Paper:** [https://arxiv.org/abs/2511.17441](https://arxiv.org/abs/2511.17441)
- **Repository:** [https://github.com/FlagOpen/RoboCOIN](https://github.com/FlagOpen/RoboCOIN)

## Contact and Support

For questions, issues, or feedback regarding this dataset, please contact us.

### Support

For technical support, please open an issue on our GitHub repository.

## License

apache-2.0

## Citation

If you use this dataset in your research, please cite:

```bibtex
@article{robocoin,
  title={RoboCOIN: An Open-Sourced Bimanual Robotic Data Collection for Integrated Manipulation},
  author={Shihan Wu, Xuecheng Liu, Shaoxuan Xie, Pengwei Wang, Xinghang Li, Bowen Yang, Zhe Li, Kai Zhu, Hongyu Wu, Yiheng Liu, Zhaoye Long, Yue Wang, Chong Liu, Dihan Wang, Ziqiang Ni, Xiang Yang, You Liu, Ruoxuan Feng, Runtian Xu, Lei Zhang, Denghang Huang, Chenghao Jin, Anlan Yin, Xinlong Wang, Zhenguo Sun, Junkai Zhao, Mengfei Du, Mingyu Cao, Xiansheng Chen, Hongyang Cheng, Xiaojie Zhang, Yankai Fu, Ning Chen, Cheng Chi, Sixiang Chen, Huaihai Lyu, Xiaoshuai Hao, Yequan Wang, Bo Lei, Dong Liu, Xi Yang, Yance Jiao, Tengfei Pan, Yunyan Zhang, Songjing Wang, Ziqian Zhang, Xu Liu, Ji Zhang, Caowei Meng, Zhizheng Zhang, Jiyang Gao, Song Wang, Xiaokun Leng, Zhiqiang Xie, Zhenzhen Zhou, Peng Huang, Wu Yang, Yandong Guo, Yichao Zhu, Suibing Zheng, Hao Cheng, Xinmin Ding, Yang Yue, Huanqian Wang, Chi Chen, Jingrui Pang, YuXi Qian, Haoran Geng, Lianli Gao, Haiyuan Li, Bin Fang, Gao Huang, Yaodong Yang, Hao Dong, He Wang, Hang Zhao, Yadong Mu, Di Hu, Hao Zhao, Tiejun Huang, Shanghang Zhang, Yonghua Lin, Zhongyuan Wang and Guocai Yao},
  journal={arXiv preprint arXiv:2511.17441},
  url = {https://arxiv.org/abs/2511.17441},
  year={2025},
}
```

### Additional References

If you use this dataset, please also consider citing:

- LeRobot Framework: https://github.com/huggingface/lerobot

## Version Information

Initial Release

Provided by:

RoboCOIN

Collected summary

Dataset introduction

Construction

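In robot manipulation learning, high-quality datasets are essential for training models. The Agilex_Cobot_Magic_storage_object_basket dataset uses an extended format based on the LeRobot framework and remains fully compatible with it. It records the dual-arm Agilex_Cobot_Magic robot performing an object-storage task in a household living-room scene, and contains 100 episodes totaling 38216 frames. Three RGB cameras (head, left wrist, and right wrist) are recorded at 30 frames per second in sync with the robot's 26-dimensional state and action vectors. Spatial quantities use a right-handed coordinate frame, with rotations in radians and translations in meters.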
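
As a quick sanity check on these figures, the recording time implied by 38216 frames at 30 FPS can be worked out directly. The sketch below uses only numbers stated in the dataset card; the variable names are illustrative, and the per-episode value is an average, since individual episode lengths are not listed here.

```python
# Figures taken from the dataset card.
TOTAL_EPISODES = 100
TOTAL_FRAMES = 38216
FPS = 30

total_seconds = TOTAL_FRAMES / FPS            # recording time of the whole dataset
mean_frames = TOTAL_FRAMES / TOTAL_EPISODES   # average number of frames per episode
mean_seconds = mean_frames / FPS              # average episode duration

print(f"total: {total_seconds:.1f} s ({total_seconds / 60:.1f} min)")
print(f"mean episode: {mean_frames:.2f} frames ({mean_seconds:.2f} s)")
```

This works out to roughly 21 minutes of recording per camera, with an average episode of about 382 frames, or about 12.7 seconds.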

Features

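The dataset is multimodal and carefully structured. It provides three synchronized RGB video streams, from the head, left-wrist, and right-wrist viewpoints, each at 640x480 resolution. Beyond the raw low-dimensional state and action sequences (joint angles, gripper opening, and end-effector pose), it adds annotation layers for end-effector direction, velocity, and acceleration magnitude, for gripper mode and activity, and for scenes and subtasks. These annotations are stored as separate JSONL files and support fine-grained analysis and learning of the manipulation task. The scene centers on storing everyday objects such as a banana, bread, an apple, an avocado, and a glass cup, which makes the data relevant to real household applications.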
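
The 26-dimensional `observation.state` and `action` vectors follow the name order listed in the feature YAML: for each arm, six joint angles, one gripper opening value, and a six-value end-effector pose. A minimal sketch of splitting such a vector by that layout follows; the helpers `state_names` and `split_state` and the grouping keys are illustrative and are not part of the dataset or of LeRobot.

```python
from typing import Dict, List, Sequence

def state_names() -> List[str]:
    """Rebuild the 26 feature names in the order given by the feature YAML."""
    names: List[str] = []
    for side in ("left", "right"):
        names += [f"{side}_arm_joint_{i}_rad" for i in range(1, 7)]
        names.append(f"{side}_gripper_open")
        names += [f"{side}_eef_pos_{axis}_m" for axis in "xyz"]
        names += [f"{side}_eef_rot_euler_{axis}_rad" for axis in "xyz"]
    return names

def split_state(vec: Sequence[float]) -> Dict[str, Dict[str, List[float]]]:
    """Group a 26-value vector into joints, gripper, and end-effector pose per arm."""
    if len(vec) != 26:
        raise ValueError(f"expected 26 values, got {len(vec)}")
    out: Dict[str, Dict[str, List[float]]] = {}
    for k, side in enumerate(("left", "right")):
        base = 13 * k  # each arm occupies 13 consecutive slots
        out[side] = {
            "joints_rad": list(vec[base:base + 6]),
            "gripper_open": [vec[base + 6]],
            "eef_pos_m": list(vec[base + 7:base + 10]),
            "eef_rot_euler_rad": list(vec[base + 10:base + 13]),
        }
    return out

# Dummy vector whose values equal their own index, to make the slicing visible.
example = [float(i) for i in range(26)]
parts = split_state(example)
print(state_names()[13], parts["right"]["eef_pos_m"])
```

With the dummy vector, the right arm's end-effector position comes from slots 20 to 22, matching the positions of `right_eef_pos_x_m`, `right_eef_pos_y_m`, and `right_eef_pos_z_m` in the name list.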

Usage

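The dataset is intended to support research on imitation learning and reinforcement learning, and it is straightforward to load. Low-dimensional data is stored in chunked Parquet files: files under `data/chunk-{id}/` can be read with standard data-loading tools, and each file holds one complete episode with its observation, state, action, and timestamp sequences. Visual data is stored as compressed video under `videos/`, which suits streaming reads. Researchers can load and process the dataset directly with the LeRobot toolchain for policy learning, behavior cloning, or offline reinforcement learning experiments. The annotations support research ranging from low-level motion control to high-level task planning. Use is subject to the Apache 2.0 license, and the associated paper must be cited in related research.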
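
The path patterns from the dataset card can be expanded into concrete file names. The sketch below assumes that the chunk number is `episode_index // chunk_size`, zero-padded to three digits, and that the episode number is zero-padded to six digits, which matches the `chunk-000/episode_000000` names in the directory tree. The function `episode_files` is hypothetical and only builds repository-relative paths; it does not download or open anything.

```python
from pathlib import PurePosixPath
from typing import Dict

CHUNK_SIZE = 1000  # "Chunk Size" in the dataset statistics
CAMERAS = ("cam_head_rgb", "cam_left_wrist_rgb", "cam_right_wrist_rgb")

def episode_files(episode_index: int, chunk_size: int = CHUNK_SIZE) -> Dict[str, PurePosixPath]:
    """Return the repository-relative Parquet and video paths for one episode."""
    if episode_index < 0:
        raise ValueError("episode_index must be non-negative")
    # Assumption: the chunk number is the integer quotient of index and chunk size.
    chunk = f"chunk-{episode_index // chunk_size:03d}"
    stem = f"episode_{episode_index:06d}"
    files = {"data": PurePosixPath("data") / chunk / f"{stem}.parquet"}
    for cam in CAMERAS:
        files[cam] = (
            PurePosixPath("videos") / chunk / f"observation.images.{cam}" / f"{stem}.mp4"
        )
    return files

for name, path in episode_files(11).items():
    print(f"{name}: {path}")
```

Once the repository has been downloaded, a Parquet path produced this way can be opened with a Parquet reader such as `pandas.read_parquet`, and the video paths with any decoder that supports AV1.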

Background and challenges

Background overview

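In robot manipulation, high-quality, large-scale datasets of real-world interaction are a foundation for progress in embodied intelligence. The Agilex_Cobot_Magic_storage_object_basket dataset was contributed by the RoboCOIN team at the Beijing Academy of Artificial Intelligence (BAAI); the associated paper is dated 2025. It focuses on a dual-arm collaborative robot performing an object-storage task in a household scene. Built on the LeRobot framework, it contains 100 complete episodes totaling 38216 frames of multi-view video and high-dimensional state and action data. It is intended as a benchmark resource for imitation learning, reinforcement learning, and multimodal policy generation, with the goal of improving autonomous manipulation in unstructured environments.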

Current challenges

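The dataset targets the problem of a dual-arm robot grasping and placing multiple objects in a household environment. The main difficulties are handling objects with different geometric and physical properties (a banana versus a glass cup, for example) and achieving precise, safe coordination of both arms in a changing scene. Building the dataset also involves balancing consistency against realism in data collection: the head and wrist cameras must be time-synchronized and spatially calibrated, the missing teleoperation information makes action trajectories harder to annotate, and dataset scale has to be traded off against annotation quality so that learned policies generalize.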

Common scenarios

Classic usage scenarios

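In robot manipulation learning, the Agilex_Cobot_Magic_storage_object_basket dataset is a representative example for research on dual-arm coordination. It covers an object-storage task in a household living-room scene, in which the Agilex_Cobot_Magic robot uses its two-finger grippers to move everyday items such as a banana, bread, and an apple from a table into a brown basket. Its typical use is as paired data for imitation learning and reinforcement learning: multi-view visual observations matched with high-dimensional action trajectories. The head and two wrist RGB cameras record the robot's atomic actions (grasp, lift, lower) as continuous sequences, which provides a data basis for end-to-end training of dual-arm coordination and fine manipulation policies.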

Derived and related work

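Research built around this dataset focuses on dual-arm manipulation and multi-task learning. The dataset is part of the RoboCOIN project, an open-source bimanual robotic data collection for integrated manipulation. Related directions include imitation learning and offline reinforcement learning on complex manipulation tasks, for example vision-based action prediction models, multi-camera fusion architectures for situational awareness, and hierarchical policy networks based on subtask decomposition. This line of work deepens the understanding of how robots acquire manipulation skills and offers a reference pattern for building larger, multi-domain robot manipulation datasets.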

Recent research on the dataset