
XR-1-Dataset-Sample

ModelScope community · Updated 2026-01-09 · Indexed 2026-01-10
Download link:
https://modelscope.cn/datasets/X-Humanoid/XR-1-Dataset-Sample
Overview:
# XR-1-Dataset-Sample

[[Project Page](https://github.com/Open-X-Humanoid/XR-1)] [[Paper](https://arxiv.org/abs/2411.02776v1)] [[GitHub](https://github.com/Open-X-Humanoid/XR-1)]

This repository contains a representative sample of the **XR-1** project's multi-modal dataset. The data is organized to support cross-embodiment training for Humanoids, Manipulators, and Ego-centric vision.

## 📂 Directory Structure

The dataset follows a hierarchy based on **Embodiment -> Task -> Format**:

### 1. Robot Embodiment Data (LeRobot Format)

Standard robot data (like TienKung or UR5) is organized following the [LeRobot](https://github.com/huggingface/lerobot) convention:

```text
XR-1-Dataset-Sample/
└── DUAL_ARM_TIEN_KUNG2/          # Robot Embodiment
    └── Press_Green_Button/       # Task Name
        └── lerobot/              # Data in LeRobot format
            ├── metadata.json
            ├── episodes.jsonl
            ├── videos/
            └── data/
```

### 2. Human/Ego-centric Data (Ego4D Format)

For ego-centric data (e.g., Ego4D subsets used for Stage 1 UVMC pre-training), the structure is adapted to its native recording format:

```text
XR-1-Dataset-Sample/
└── Ego4D/                        # Human ego-centric source
    ├── files.json                # Unified annotation/mapping file
    └── files/                    # Raw data storage
        └── [video_id].mp4        # Egocentric video clips
```

## 🤖 Data Modalities

* **Vision**: High-frequency RGB streams from multiple camera perspectives.
* **Motion**: Continuous state-action pairs, which are tokenized into **UVMC** (Unified Vision-Motion Codes) for XR-1 training.
* **Language**: Natural language instructions paired with each episode for VLA alignment.

## 🛠 Usage

This sample is intended for use with the [XR-1 GitHub Repository](https://github.com/Open-X-Humanoid/XR-1).
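
Before wiring the sample into the XR-1 code, it can help to check what a local copy actually contains. The following is a minimal sketch, not part of the XR-1 repository: it assumes the sample has been downloaded to a local folder and that it follows the **Embodiment -> Task -> Format** hierarchy described above. The helper names `list_task_dirs` and `read_jsonl` are hypothetical, and no assumption is made about the fields inside `episodes.jsonl`.

```python
import json
from pathlib import Path


def list_task_dirs(root):
    """Return sorted (embodiment, task, format) triples found under root.

    Assumes the Embodiment -> Task -> Format layout; anything that is not a
    directory exactly three levels deep (e.g. Ego4D/files/*.mp4) is skipped.
    """
    root = Path(root)
    triples = []
    for fmt_dir in root.glob("*/*/*"):
        parts = fmt_dir.relative_to(root).parts
        # Skip hidden folders such as .git, which pathlib's "*" also matches.
        if any(part.startswith(".") for part in parts):
            continue
        if fmt_dir.is_dir():
            triples.append(parts)
    return sorted(triples)


def read_jsonl(path):
    """Parse a JSON Lines file: one JSON value per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

For example, `list_task_dirs("XR-1-Dataset-Sample")` should include `("DUAL_ARM_TIEN_KUNG2", "Press_Green_Button", "lerobot")` if the layout matches the tree above, and `read_jsonl` can then be pointed at that folder's `episodes.jsonl` to count the episodes.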

## 📝 Citation

```bibtex
@article{fan2025xr,
  title={XR-1: Towards Versatile Vision-Language-Action Models via Learning Unified Vision-Motion Representations},
  author={Fan, Shichao and others},
  journal={arXiv preprint arXiv:2411.02776},
  year={2025}
}
```

## 📜 License

This dataset is released under the [MIT License](https://github.com/Open-X-Humanoid/XR-1/blob/main/LICENSE).

---

**Contact**: For questions, please open an issue on our [GitHub](https://github.com/Open-X-Humanoid/XR-1).

Provider:
maas
Created:
2025-12-24