RoboCOIN/Agilex_Cobot_Magic_storage_object_closest_cube
Source: Hugging Face. Updated 2026-04-02. Indexed 2026-04-05.
Download link:
https://hf-mirror.com/datasets/RoboCOIN/Agilex_Cobot_Magic_storage_object_closest_cube
Resource description:
---
task_categories:
- robotics
language:
- en
extra_gated_prompt: 'By accessing this dataset, you agree to cite the associated paper in your research/publications—see the "Citation" section for details. You agree to not use the dataset to conduct experiments that cause harm to human subjects.'
extra_gated_fields:
  Company/Organization:
    type: 'text'
    description: 'e.g., "ETH Zurich", "Boston Dynamics", "Independent Researcher"'
  Country:
    type: 'country'
    description: 'e.g., "Germany", "China", "United States"'
tags:
- RoboCOIN
- LeRobot
license: apache-2.0
configs:
- config_name: default
  data_files: data/chunk-{id}/episode_{id}.parquet
---
# Agilex_Cobot_Magic_storage_object_closest_cube
## Dataset Description
This dataset uses an extended format based on LeRobot and is fully compatible with LeRobot.
## Task Preview
<video src="videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4" controls width="640"></video>
[View Video Directly](videos/chunk-000/observation.images.cam_head_rgb/episode_000000.mp4)
### Overview
- **Total Episodes:** 49
- **Total Frames:** 12420
- **FPS:** 30
- **Dataset Size:** 154.87 MB
- **Robot Name:** `Agilex_Cobot_Magic`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** `Teleoperation type information is not currently available for this dataset.`
- **Sensors:** `cam_head_rgb`, `cam_left_wrist_rgb`, `cam_right_wrist_rgb`
- **Camera Information:** cam_head_rgb; cam_left_wrist_rgb; cam_right_wrist_rgb
- **Scene:** `office_workspace->office`
- **Objects:** `table(unknown)`, `brown_basket(unknown)`, `mango(unknown)`, `apple(unknown)`, `rubik's_cube(unknown)`, `whiteboard_erasers(unknown)`, `bathing_in_flowers(unknown)`
- **Task Description:** use a picker to grab the item closest to the cube and place it in the basket.
### Primary Task Instruction
> use a picker to grab the item closest to the cube and place it in the basket.
### Robot Configuration
- **Robot Name:** `Agilex_Cobot_Magic`
- **Codebase Version:** `v2.1`
- **End-Effector Type:** `two_finger_gripper`
- **Teleoperation Type:** `Teleoperation type information is not currently available for this dataset.`
## Scene and Objects
### Scene Type
`office_workspace->office`
### Objects
- `table(unknown)`
- `brown_basket(unknown)`
- `mango(unknown)`
- `apple(unknown)`
- `rubik's_cube(unknown)`
- `whiteboard_erasers(unknown)`
- `bathing_in_flowers(unknown)`
## Task Descriptions
- **Standardized Task Description:** `use a picker to grab the item closest to the cube and place it in the basket.`
- **Operation Type:** `Operation type information is not currently available for this dataset.`
- **Environment Type:** `Environment type information is not currently available for this dataset.`
### Sub-Tasks
This dataset includes 16 distinct subtasks:
1. **Place the apple into the basket with the right gripper** (Index: 0)
2. **Place the blackboard erasure into the basket with the left gripper** (Index: 1)
3. **Place the mango into the basket with the right gripper** (Index: 2)
4. **Grasp the mango with the right gripper** (Index: 3)
5. **Grasp the blackboard erasure with the right gripper** (Index: 4)
6. **Grasp the Shower puff with the left gripper** (Index: 5)
7. **Place the Shower puff into the basket with the right gripper** (Index: 6)
8. **End** (Index: 7)
9. **Place the Shower puff into the basket with the left gripper** (Index: 8)
10. **Grasp the mango with the left gripper** (Index: 9)
11. **Place the mango into the basket with the left gripper** (Index: 10)
12. **Grasp the Shower puff with the right gripper** (Index: 11)
13. **Grasp the apple with the right gripper** (Index: 12)
14. **Grasp the blackboard erasure with the left gripper** (Index: 13)
15. **Place the blackboard erasure into the basket with the right gripper** (Index: 14)
16. **null** (Index: 15)
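Most of the subtask labels above follow a regular "verb, object, gripper side" pattern, so they can be split into fields for filtering. The following is a minimal sketch, assuming the labels are available as plain strings keyed by index; the dictionary copies a few entries from the list, and `parse_subtask` is a hypothetical helper, not part of LeRobot or RoboCOIN.

```python
import re

# A few entries copied verbatim from the subtask list above (index -> label).
SUBTASKS = {
    0: "Place the apple into the basket with the right gripper",
    3: "Grasp the mango with the right gripper",
    5: "Grasp the Shower puff with the left gripper",
    7: "End",
    13: "Grasp the blackboard erasure with the left gripper",
    15: "null",
}

# Verb, object, optional "into the basket", and gripper side.
PATTERN = re.compile(
    r"^(?P<verb>Grasp|Place) the (?P<object>.+?)"
    r"(?: into the basket)? with the (?P<side>left|right) gripper$"
)


def parse_subtask(label):
    """Hypothetical helper: return verb/object/side, or None for 'End' and 'null'."""
    match = PATTERN.match(label)
    if match is None:
        return None
    return {
        "verb": match["verb"].lower(),
        "object": match["object"],
        "side": match["side"],
    }


parsed = {index: parse_subtask(label) for index, label in SUBTASKS.items()}

# Indices of subtasks in which the left gripper grasps something.
left_grasps = sorted(
    index
    for index, fields in parsed.items()
    if fields and fields["verb"] == "grasp" and fields["side"] == "left"
)
print(left_grasps)
```

Labels that do not match the pattern, such as `End` and `null`, are returned as `None` rather than raising, so the sentinel entries can be filtered out explicitly.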
### Atomic Actions
- `grasp`
- `lift`
- `lower`
## Hardware and Sensors
### Sensors
- `cam_head_rgb`
- `cam_left_wrist_rgb`
- `cam_right_wrist_rgb`
### Camera Information
- `cam_head_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
- `cam_left_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
- `cam_right_wrist_rgb`: dtype=video, shape=480x640x3, resolution=640x480, codec=av1, pix_fmt=yuv420p
### Coordinate System
- **Definition:** `right-hand-frame`
### Dimensions & Units
- **Joint Rotation:** `radian`
- **End-Effector Rotation:** `radian`
- **End-Effector Translation:** `meter`
## Dataset Statistics
| Metric | Value |
|--------|-------|
| **Total Episodes** | 49 |
| **Total Frames** | 12420 |
| **Total Tasks** | 16 |
| **Total Videos** | 147 |
| **Total Chunks** | 1 |
| **Chunk Size** | 1000 |
| **FPS** | 30 |
| **State Dimensions** | 26 |
| **Action Dimensions** | 26 |
| **Camera Views** | 3 |
| **Dataset Size** | 154.87 MB |
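
The figures in this table can be cross-checked against each other. The sketch below uses only the values listed above; it assumes one video per camera per episode and that the chunk size counts episodes, which is the usual LeRobot convention but is not stated explicitly in this card.

```python
# Values copied from the statistics table above.
total_episodes = 49
total_frames = 12420
fps = 30
camera_views = 3
total_videos = 147
chunk_size = 1000

# Assumption: one video file per camera per episode.
expected_videos = total_episodes * camera_views

# Assumption: each chunk holds up to `chunk_size` episodes (ceiling division).
expected_chunks = -(-total_episodes // chunk_size)

# Total recording time and mean episode length derived from frames and FPS.
total_seconds = total_frames / fps
mean_frames_per_episode = total_frames / total_episodes
mean_seconds_per_episode = mean_frames_per_episode / fps

print(expected_videos == total_videos)
print(expected_chunks)
print(f"{total_seconds:.0f} s total, {mean_seconds_per_episode:.2f} s per episode on average")
```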
## Data Splits
The dataset is organized into the following splits:
- **Training**: Episodes 0:48 (the card lists 49 episodes in total, with episode files numbered from 0)
## Dataset Structure
This dataset follows the LeRobot format and contains the following components:
### Data Files
- **Videos**: Compressed video files containing RGB camera observations
- **State Data**: Robot joint positions, velocities, and other state information
- **Action Data**: Robot action commands and trajectories
- **Metadata**: Episode metadata, timestamps, and annotations
### File Organization
- **Data Path Pattern**: `data/chunk-{id}/episode_{id}.parquet`
- **Video Path Pattern**: `videos/chunk-{id}/observation.images.cam_left_wrist_rgb/episode_{id}.mp4` (the other two cameras use the same pattern under their own directories)
- **Chunking**: Data is organized into 1 chunk(s) of size 1000
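
The path patterns above can be turned into concrete file paths. This is a minimal sketch, not the LeRobot API: `episode_paths` is a hypothetical helper, the zero-padding widths are taken from the file names shown in this card (`chunk-000`, `episode_000000`), the `.mp4` extension is taken from the Task Preview link, and the rule that the chunk number equals the episode index divided by the chunk size is an assumption.

```python
CHUNK_SIZE = 1000  # "Chunk Size" from the statistics table

# Camera keys as listed under "Sensors".
CAMERAS = ("cam_head_rgb", "cam_left_wrist_rgb", "cam_right_wrist_rgb")


def episode_paths(episode_index, chunk_size=CHUNK_SIZE):
    """Hypothetical helper: build the data path and per-camera video paths."""
    # Assumption: episodes are assigned to chunks in order, chunk_size per chunk.
    chunk = episode_index // chunk_size
    data_path = f"data/chunk-{chunk:03d}/episode_{episode_index:06d}.parquet"
    video_paths = {
        camera: (
            f"videos/chunk-{chunk:03d}/observation.images.{camera}/"
            f"episode_{episode_index:06d}.mp4"
        )
        for camera in CAMERAS
    }
    return data_path, video_paths


data_path, video_paths = episode_paths(11)
print(data_path)
for camera, path in video_paths.items():
    print(camera, path)
```

With 49 episodes and a chunk size of 1000, every episode in this dataset falls into `chunk-000`.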
### Data Structure (Tree)
```
Agilex_Cobot_Magic_storage_object_closest_cube_qced_hardlink/
|-- annotations
|   |-- eef_acc_mag_annotation.jsonl
|   |-- eef_direction_annotation.jsonl
|   |-- eef_velocity_annotation.jsonl
|   |-- gripper_activity_annotation.jsonl
|   |-- gripper_mode_annotation.jsonl
|   |-- scene_annotations.jsonl
|   `-- subtask_annotations.jsonl
|-- data
|   `-- chunk-000
|       |-- episode_000000.parquet
|       |-- episode_000001.parquet
|       |-- episode_000002.parquet
|       |-- episode_000003.parquet
|       |-- episode_000004.parquet
|       |-- episode_000005.parquet
|       |-- episode_000006.parquet
|       |-- episode_000007.parquet
|       |-- episode_000008.parquet
|       |-- episode_000009.parquet
|       |-- episode_000010.parquet
|       |-- episode_000011.parquet
|       `-- ... (37 more entries)
|-- meta
|   |-- episodes.jsonl
|   |-- episodes_stats.jsonl
|   |-- info.json
|   `-- tasks.jsonl
`-- videos
    `-- chunk-000
        |-- observation.images.cam_head_rgb
        |-- observation.images.cam_left_wrist_rgb
        `-- observation.images.cam_right_wrist_rgb
```
## Camera Views
This dataset includes 3 camera views: `cam_head_rgb`, `cam_left_wrist_rgb`, `cam_right_wrist_rgb`.
## Features (Full YAML)
```yaml
observation.images.cam_head_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.images.cam_left_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.images.cam_right_wrist_rgb:
  dtype: video
  shape:
  - 480
  - 640
  - 3
  names:
  - height
  - width
  - channels
  info:
    video.height: 480
    video.width: 640
    video.codec: av1
    video.pix_fmt: yuv420p
    video.is_depth_map: false
    video.fps: 30
    video.channels: 3
    has_audio: false
observation.state:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_gripper_open
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_gripper_open
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
action:
  dtype: float32
  shape:
  - 26
  names:
  - left_arm_joint_1_rad
  - left_arm_joint_2_rad
  - left_arm_joint_3_rad
  - left_arm_joint_4_rad
  - left_arm_joint_5_rad
  - left_arm_joint_6_rad
  - left_gripper_open
  - left_eef_pos_x_m
  - left_eef_pos_y_m
  - left_eef_pos_z_m
  - left_eef_rot_euler_x_rad
  - left_eef_rot_euler_y_rad
  - left_eef_rot_euler_z_rad
  - right_arm_joint_1_rad
  - right_arm_joint_2_rad
  - right_arm_joint_3_rad
  - right_arm_joint_4_rad
  - right_arm_joint_5_rad
  - right_arm_joint_6_rad
  - right_gripper_open
  - right_eef_pos_x_m
  - right_eef_pos_y_m
  - right_eef_pos_z_m
  - right_eef_rot_euler_x_rad
  - right_eef_rot_euler_y_rad
  - right_eef_rot_euler_z_rad
timestamp:
  dtype: float32
  shape:
  - 1
  names: null
frame_index:
  dtype: int64
  shape:
  - 1
  names: null
episode_index:
  dtype: int64
  shape:
  - 1
  names: null
index:
  dtype: int64
  shape:
  - 1
  names: null
task_index:
  dtype: int64
  shape:
  - 1
  names: null
subtask_annotation:
  names: null
  dtype: int32
  shape:
  - 5
scene_annotation:
  names: null
  dtype: int32
  shape:
  - 1
eef_sim_pose_state:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_sim_pose_action:
  names:
  - left_eef_pos_x
  - left_eef_pos_y
  - left_eef_pos_z
  - left_eef_rot_x
  - left_eef_rot_y
  - left_eef_rot_z
  - right_eef_pos_x
  - right_eef_pos_y
  - right_eef_pos_z
  - right_eef_rot_x
  - right_eef_rot_y
  - right_eef_rot_z
  dtype: float32
  shape:
  - 12
eef_direction_state:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_direction_action:
  names:
  - left_eef_direction
  - right_eef_direction
  dtype: int32
  shape:
  - 2
eef_velocity_state:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_velocity_action:
  names:
  - left_eef_velocity
  - right_eef_velocity
  dtype: int32
  shape:
  - 2
eef_acc_mag_state:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
eef_acc_mag_action:
  names:
  - left_eef_acc_mag
  - right_eef_acc_mag
  dtype: int32
  shape:
  - 2
gripper_mode_state:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_mode_action:
  names:
  - left_gripper_mode
  - right_gripper_mode
  dtype: int32
  shape:
  - 2
gripper_activity_state:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_activity_action:
  names:
  - left_gripper_activity
  - right_gripper_activity
  dtype: int32
  shape:
  - 2
gripper_open_scale_state:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
gripper_open_scale_action:
  names:
  - left_gripper_open_scale
  - right_gripper_open_scale
  dtype: float32
  shape:
  - 2
```
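The 26 entries of `observation.state` and `action` follow the same per-arm layout: six joint angles, one gripper value, three end-effector position components, and three Euler angles. The sketch below rebuilds the name list from that layout and splits a flat vector into named groups; `arm_names` and `split_state` are hypothetical helpers, and the example vector holds placeholder values, not real data.

```python
def arm_names(side):
    """Rebuild the 13 feature names for one arm in the order listed above."""
    joints = [f"{side}_arm_joint_{i}_rad" for i in range(1, 7)]
    gripper = [f"{side}_gripper_open"]
    position = [f"{side}_eef_pos_{axis}_m" for axis in "xyz"]
    rotation = [f"{side}_eef_rot_euler_{axis}_rad" for axis in "xyz"]
    return joints + gripper + position + rotation


# Left arm first, then right arm, matching the YAML feature listing.
STATE_NAMES = arm_names("left") + arm_names("right")


def split_state(vector):
    """Hypothetical helper: group a flat 26-value vector by arm and quantity."""
    if len(vector) != len(STATE_NAMES):
        raise ValueError(f"expected {len(STATE_NAMES)} values, got {len(vector)}")
    named = dict(zip(STATE_NAMES, vector))
    groups = {}
    for side in ("left", "right"):
        groups[side] = {
            "joints_rad": [named[f"{side}_arm_joint_{i}_rad"] for i in range(1, 7)],
            "gripper_open": named[f"{side}_gripper_open"],
            "eef_pos_m": [named[f"{side}_eef_pos_{axis}_m"] for axis in "xyz"],
            "eef_rot_euler_rad": [
                named[f"{side}_eef_rot_euler_{axis}_rad"] for axis in "xyz"
            ],
        }
    return groups


# Placeholder vector (0.0, 1.0, ..., 25.0) so each value equals its position.
example = [float(i) for i in range(26)]
parts = split_state(example)
print(parts["right"]["eef_pos_m"])
```

Looking values up by name rather than by hard-coded offset keeps the split correct if the feature order ever differs from the listing here.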
## Available Annotations
This dataset includes rich annotations to support diverse learning approaches:
- `eef_acc_mag_annotation.jsonl`
- `eef_direction_annotation.jsonl`
- `eef_velocity_annotation.jsonl`
- `gripper_activity_annotation.jsonl`
- `gripper_mode_annotation.jsonl`
- `scene_annotations.jsonl`
- `subtask_annotations.jsonl`
## Dataset Tags
- `RoboCOIN`
- `LeRobot`
## Authors
### Contributors
This dataset is contributed by:
- RoboCOIN Team at Beijing Academy of Artificial Intelligence (BAAI)
### Annotators
No annotator information available.
## Links
- **Homepage:** [https://flagopen.github.io/RoboCOIN/](https://flagopen.github.io/RoboCOIN/)
- **Paper:** [https://arxiv.org/abs/2511.17441](https://arxiv.org/abs/2511.17441)
- **Repository:** [https://github.com/FlagOpen/RoboCOIN](https://github.com/FlagOpen/RoboCOIN)
## Contact and Support
For questions, issues, or feedback regarding this dataset, please contact us.
### Support
For technical support, please open an issue on our GitHub repository.
## License
apache-2.0
## Citation
If you use this dataset in your research, please cite:
```bibtex
@article{robocoin,
  title={RoboCOIN: An Open-Sourced Bimanual Robotic Data Collection for Integrated Manipulation},
  author={Shihan Wu and Xuecheng Liu and Shaoxuan Xie and Pengwei Wang and Xinghang Li and Bowen Yang and Zhe Li and Kai Zhu and Hongyu Wu and Yiheng Liu and Zhaoye Long and Yue Wang and Chong Liu and Dihan Wang and Ziqiang Ni and Xiang Yang and You Liu and Ruoxuan Feng and Runtian Xu and Lei Zhang and Denghang Huang and Chenghao Jin and Anlan Yin and Xinlong Wang and Zhenguo Sun and Junkai Zhao and Mengfei Du and Mingyu Cao and Xiansheng Chen and Hongyang Cheng and Xiaojie Zhang and Yankai Fu and Ning Chen and Cheng Chi and Sixiang Chen and Huaihai Lyu and Xiaoshuai Hao and Yequan Wang and Bo Lei and Dong Liu and Xi Yang and Yance Jiao and Tengfei Pan and Yunyan Zhang and Songjing Wang and Ziqian Zhang and Xu Liu and Ji Zhang and Caowei Meng and Zhizheng Zhang and Jiyang Gao and Song Wang and Xiaokun Leng and Zhiqiang Xie and Zhenzhen Zhou and Peng Huang and Wu Yang and Yandong Guo and Yichao Zhu and Suibing Zheng and Hao Cheng and Xinmin Ding and Yang Yue and Huanqian Wang and Chi Chen and Jingrui Pang and YuXi Qian and Haoran Geng and Lianli Gao and Haiyuan Li and Bin Fang and Gao Huang and Yaodong Yang and Hao Dong and He Wang and Hang Zhao and Yadong Mu and Di Hu and Hao Zhao and Tiejun Huang and Shanghang Zhang and Yonghua Lin and Zhongyuan Wang and Guocai Yao},
  journal={arXiv preprint arXiv:2511.17441},
  url={https://arxiv.org/abs/2511.17441},
  year={2025},
}
```
### Additional References
If you use this dataset, please also consider citing:
LeRobot Framework: https://github.com/huggingface/lerobot
## Version Information
Initial Release
Provider:
RoboCOIN
Collected Summary
Dataset Introduction

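Construction
Agilex_Cobot_Magic_storage_object_closest_cube follows the extended LeRobot format and remains fully compatible with the existing LeRobot ecosystem. It was built by recording real manipulation trajectories of a dual-arm collaborative robot in an office scene, and contains 49 complete episodes totaling 12420 frames. The data is organized in chunks and stored as Parquet files, with synchronized 640x480 RGB video recorded from three viewpoints (head, left wrist, right wrist) at 30 FPS. Robot states and actions are each recorded as 26-dimensional vectors covering joint angles, gripper opening, and end-effector pose.
Usage
Researchers can use the dataset to train and evaluate imitation learning, reinforcement learning, and task planning models. The data is provided in the standardized LeRobot format: state-action sequences and their timestamps can be read from the Parquet files under `data/chunk-{id}/`, and the compressed videos stored under `videos/` can be aligned with the proprioceptive data for joint vision-action modeling. The subtask and atomic-action annotations make the dataset well suited to hierarchical policy learning and skill composition research. Before use, users must accept the license terms and agree to cite the specified paper in their research.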
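Aligning a Parquet row with its video frame comes down to converting between timestamps and frame numbers. The sketch below assumes that each episode's `timestamp` starts at 0 and advances by 1/FPS per frame, which is the usual LeRobot convention but is not stated in this card; `frame_for_timestamp` and `timestamp_for_frame` are hypothetical helpers.

```python
FPS = 30  # frame rate listed for all three cameras


def frame_for_timestamp(timestamp_s, fps=FPS):
    """Map a per-episode timestamp in seconds to the nearest video frame number."""
    # Rounding absorbs the small error introduced by float32 timestamps.
    return int(round(timestamp_s * fps))


def timestamp_for_frame(frame_index, fps=FPS):
    """Map a video frame number back to its nominal timestamp in seconds."""
    return frame_index / fps


# Assumption: frame k of an episode is stored with timestamp k / FPS.
for frame_index in (0, 1, 29, 30, 253):
    timestamp = timestamp_for_frame(frame_index)
    print(frame_index, f"{timestamp:.4f}", frame_for_timestamp(timestamp))
```

If the `frame_index` column is present, as the feature listing indicates, it can be used directly and this conversion serves only as a consistency check.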
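Background and Challenges
Background Overview
In robot manipulation, coordinating two arms to carry out complex tasks remains a difficult and active research problem. The Agilex_Cobot_Magic_storage_object_closest_cube dataset was built and released in 2025 by the RoboCOIN team at the Beijing Academy of Artificial Intelligence (BAAI) to provide real-world demonstration data for bimanual manipulation. It extends the LeRobot framework and focuses on a core problem: having a robot identify, grasp, and place a specific target object in an unstructured office environment. With 49 complete episodes, 12420 frames of multi-view video, and state and action annotations, the dataset serves as a benchmark resource for imitation learning, reinforcement learning, and research on skill generalization.
Current Challenges
The dataset targets the classic problem of object recognition and grasp planning. In a dynamic, cluttered office scene, the robot must correctly identify "the item closest to the Rubik's cube" and then grasp and place it precisely, which requires visual perception, an understanding of spatial relations, and fine motion planning. Building the dataset raised challenges of its own: keeping the multi-view video (head, left wrist, and right wrist cameras) temporally synchronized and consistently calibrated; collecting high-dimensional, low-latency robot state and action data during real physical interaction; and designing a systematic annotation scheme that decomposes continuous manipulation into atomic actions such as grasp, lift, and lower, while providing interpretable hierarchical task annotations for bimanual tasks.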
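Common Scenarios
Typical Use Cases
The dataset provides standardized demonstrations of a dual-arm collaborative robot picking and placing objects in an unstructured office environment. The core scenario has the robot identify the item closest to the Rubik's cube and use its gripper to place that item in the basket. Because multi-view visual observations are recorded in sync with high-dimensional state and action sequences, the dataset can support training of imitation learning and reinforcement learning algorithms, and is particularly suited to studying vision-based fine manipulation policies under complex spatial relations.
Academic Problems Addressed
The dataset addresses several research problems in robotics: bimanual coordination, vision-based object localization and grasp planning, and decomposition of long-horizon tasks. Its real-world interaction data helps narrow the sim-to-real gap and offers a benchmark for end-to-end models that combine multimodal perception with action generation. The subtask and atomic-action annotations provide a data basis for studying the hierarchical structure of manipulation tasks, skill composition, and task generalization.
Practical Applications
The task scenario is directly relevant to logistics sorting, home service robots, and flexible manufacturing and assembly lines. By learning manipulation skills from the demonstrations, a robot can adapt to dynamic environments such as offices or warehouses and carry out everyday work such as tidying clutter and sorting items. The multi-camera data (head, left wrist, right wrist) is important for developing robust visual servoing systems that maintain manipulation success under occlusion or viewpoint changes.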
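Recent Research on This Dataset
Latest Research Directions
As part of the RoboCOIN project, Agilex_Cobot_Magic_storage_object_closest_cube supports research on skill generalization for dual-arm collaborative robots in unstructured environments. Its multi-view visual observations and end-effector state annotations provide real-world interaction data for training imitation learning and reinforcement learning algorithms. Current work focuses on using data of this kind to train vision-language-action models that understand spatial relations such as "the item closest to the cube", with the aim of improving grasping and placement of varied objects in dynamic settings such as offices. Compatibility with the LeRobot framework also helps integrate the dataset into the open-source robot learning ecosystem.
The content above was collected and summarized by 遇见数据集.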



