XR-1-Dataset-Sample
Source: ModelScope community. Updated 2026-01-09; indexed 2026-01-10.
Download link:
https://modelscope.cn/datasets/X-Humanoid/XR-1-Dataset-Sample
# XR-1-Dataset-Sample
[[Project Page](https://github.com/Open-X-Humanoid/XR-1)] [[Paper](https://arxiv.org/abs/2411.02776v1)] [[GitHub](https://github.com/Open-X-Humanoid/XR-1)]
This repository contains a representative sample of the **XR-1** project's multi-modal dataset. The data is organized to support cross-embodiment training across humanoids, manipulators, and ego-centric vision.
## 📂 Directory Structure
The dataset follows a hierarchy based on **Embodiment -> Task -> Format**:
### 1. Robot Embodiment Data (LeRobot Format)
Standard robot data (like TienKung or UR5) is organized following the [LeRobot](https://github.com/huggingface/lerobot) convention:
```text
XR-1-Dataset-Sample/
└── DUAL_ARM_TIEN_KUNG2/        # Robot Embodiment
    └── Press_Green_Button/     # Task Name
        └── lerobot/            # Data in LeRobot format
            ├── metadata.json
            ├── episodes.jsonl
            ├── videos/
            └── data/
```
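As a rough illustration of the layout above, the following sketch reads the two index files of a task's `lerobot/` folder using only the Python standard library. It assumes that `metadata.json` is a single JSON document and that `episodes.jsonl` holds one JSON object per line; the schema of both files is not described here, so the helper `load_lerobot_task` (a hypothetical name, not part of LeRobot or XR-1) returns the parsed content without interpreting any field.

```python
import json
from pathlib import Path


def load_lerobot_task(task_dir):
    """Read metadata.json and episodes.jsonl from a task's lerobot/ folder."""
    lerobot_dir = Path(task_dir) / "lerobot"

    # metadata.json: assumed to be one JSON document (schema not documented here).
    with open(lerobot_dir / "metadata.json", encoding="utf-8") as f:
        metadata = json.load(f)

    # episodes.jsonl: assumed JSON Lines, one object per line; blank lines skipped.
    episodes = []
    with open(lerobot_dir / "episodes.jsonl", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                episodes.append(json.loads(line))

    return metadata, episodes


if __name__ == "__main__":
    # Assumed local download location; adjust to where the sample was saved.
    task = Path("XR-1-Dataset-Sample") / "DUAL_ARM_TIEN_KUNG2" / "Press_Green_Button"
    if task.exists():
        metadata, episodes = load_lerobot_task(task)
        print(f"{len(episodes)} episode record(s) found")
```

For actual training, the loaders in the XR-1 repository or the LeRobot library are the better entry point; this sketch is only meant for a quick inspection of what a task folder contains.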
### 2. Human/Ego-centric Data (Ego4D Format)
For ego-centric data (e.g., Ego4D subsets used for Stage 1 UVMC pre-training), the structure is adapted to its native recording format:
```text
XR-1-Dataset-Sample/
└── Ego4D/                      # Human ego-centric source
    ├── files.json              # Unified annotation/mapping file
    └── files/                  # Raw data storage
        └── [video_id].mp4      # Egocentric video clips
```
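The Ego4D folder can be indexed in a similar way. The sketch below loads `files.json` and builds a map from video id to clip path, taking the file stem as the id in line with the `[video_id].mp4` pattern. The internal structure of `files.json` is not specified here, so it is returned as parsed; `index_ego4d` is a hypothetical helper name.

```python
import json
from pathlib import Path


def index_ego4d(ego4d_dir):
    """Return the parsed files.json and a {video_id: path} map of the clips."""
    ego4d_dir = Path(ego4d_dir)

    # files.json: annotation/mapping file; its schema is an unknown here.
    with open(ego4d_dir / "files.json", encoding="utf-8") as f:
        annotations = json.load(f)

    # Use the file stem as the video id, following the [video_id].mp4 pattern.
    clips = {p.stem: p for p in sorted((ego4d_dir / "files").glob("*.mp4"))}
    return annotations, clips
```

How entries in `files.json` refer to individual clips would need to be checked against the file itself before joining the two.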
## 🤖 Data Modalities
* **Vision**: High-frequency RGB streams from multiple camera perspectives.
* **Motion**: Continuous state-action pairs, which are tokenized into **UVMC** (Unified Vision-Motion Codes) for XR-1 training.
* **Language**: Natural language instructions paired with each episode for VLA alignment.
## 🛠 Usage
This sample is intended for use with the [XR-1 GitHub Repository](https://github.com/Open-X-Humanoid/XR-1).
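Before pointing the XR-1 code at a local copy, it can be useful to check which embodiment, task, and format combinations are actually present. The sketch below walks the three directory levels described under Directory Structure; `list_task_formats` is a hypothetical helper, and the root path is whatever location the sample was downloaded to.

```python
from pathlib import Path


def _subdirs(path):
    """Return visible subdirectories of path, sorted by name."""
    return sorted(
        p for p in path.iterdir() if p.is_dir() and not p.name.startswith(".")
    )


def list_task_formats(root):
    """Yield (embodiment, task, format) for every three-level directory path."""
    for embodiment in _subdirs(Path(root)):
        for task in _subdirs(embodiment):
            for fmt in _subdirs(task):
                yield embodiment.name, task.name, fmt.name
```

Folders that do not follow the three-level layout, such as `Ego4D/` with its flat `files/` directory, yield no entries and need to be handled separately.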
## 📝 Citation
```bibtex
@article{fan2025xr,
title={XR-1: Towards Versatile Vision-Language-Action Models via Learning Unified Vision-Motion Representations},
author={Fan, Shichao and others},
journal={arXiv preprint arXiv:2411.02776},
year={2025}
}
```
## 📜 License
This dataset is released under the [MIT License](https://github.com/Open-X-Humanoid/XR-1/blob/main/LICENSE).
---
**Contact**: For questions, please open an issue on our [GitHub](https://github.com/Open-X-Humanoid/XR-1).
Provider: maas
Created: 2025-12-24



