ABot-PhysWorld_SFT_Training_Data_v1

Name: ABot-PhysWorld_SFT_Training_Data_v1
Creator: maas
Published: 2026-05-14 10:40:54
License: No description available

ModelScope community. Updated 2026-05-14; indexed 2026-05-03.

Download link:

https://modelscope.cn/datasets/amap_cvlab/ABot-PhysWorld_SFT_Training_Data_v1

Resource description:

<div align="center">
  <h1>🤖 ABot-PhysWorld SFT Training Data (v1)</h1>
  <p align="center">
    <b>AMAP CV Lab</b>
  </p>
  <p align="center">
    <a href="https://arxiv.org/abs/2603.23376"><img src="https://img.shields.io/static/v1?label=Paper&message=arXiv&color=red&logo=arxiv"></a>
    <a href="https://github.com/amap-cvlab/ABot-PhysWorld"><img src="https://img.shields.io/badge/Code-GitHub-blue?logo=github"></a>
    <a href="https://modelscope.cn/models/amap_cvlab/Abot-PhysWorld"><img src="https://img.shields.io/static/v1?label=Model&message=ModelScope&color=purple"></a>
  </p>
</div>

> **Supervised Fine-Tuning (SFT) data** for [ABot-PhysWorld](https://github.com/amap-cvlab/ABot-PhysWorld), a physically consistent, action-controllable video world model for robotic manipulation built on a 14B Diffusion Transformer.

## 📊 Dataset Overview

This is the **v1** release. The dataset contains **287,557** curated robotic manipulation video clips, each paired with a dense textual description, aggregated from five real-world sources.

| Source | Samples | Description |
|--------|---------|-------------|
| **AgiBot** | 125,259 | Agile robotic manipulation challenges |
| **OXE** | 77,110 | Open X-Embodiment cross-robot data |
| **RoboMIND** | 32,792 | Multi-robot manipulation with diverse end-effectors |
| **RoboCOIN** | 26,450 | Multi-embodiment collaborative manipulation |
| **Galaxea** | 25,946 | Open-world robotic interaction |
| **Total** | **287,557** | |

## 📁 Structure

```
.
├── ABot-PhysWorld_v1.jsonl     # Annotations: video path + text prompt
└── videos/
    ├── AgiBot/
    ├── OXE/
    ├── RoboMIND/
    ├── RoboCOIN/
    └── Galaxea/
```

## 📝 Data Format

Each line in `ABot-PhysWorld_v1.jsonl` is a JSON object:

```json
{
  "video": "videos/RoboMIND/.../trajectory_camera_left.mp4",
  "prompt": "The video opens with a view of a clean, well-lit industrial workspace..."
}
```

- **`video`**: Path to the MP4 video file, relative to the dataset root
- **`prompt`**: Dense, multi-paragraph description covering the initial scene setup, step-by-step action progression, final state, and camera perspective

## 🚀 Usage

```python
import json
import os

DATA_ROOT = "/path/to/this/dataset"  # directory containing the JSONL file and videos/

with open(os.path.join(DATA_ROOT, "ABot-PhysWorld_v1.jsonl"), encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)  # one JSON object per line
        video_path = os.path.join(DATA_ROOT, record["video"])  # resolve the relative path
        prompt = record["prompt"]  # dense text description of the clip
```

## 📜 Citation

```bibtex
@article{abot-physworld2026,
  title={ABot-PhysWorld: Interactive World Foundation Model for Robotic Manipulation with Physics Alignment},
  author={Yuzhi Chen and Ronghan Chen and Dongjie Huo and Yandan Yang and Dekang Qi and Haoyun Liu and Tong Lin and Shuang Zeng and Junjin Xiao and Xinyuan Chang and Feng Xiong and Xing Wei and Zhiheng Ma and Mu Xu},
  year={2026}
}
```

## License

Released for **research purposes only**. Please refer to the original source datasets for their respective licenses.

## 🙏 Acknowledgement

- [RoboMIND](https://huggingface.co/datasets/x-humanoid-robomind/RoboMIND)
- [RoboCOIN](https://huggingface.co/RoboCOIN)
- [AgiBotWorld](https://huggingface.co/datasets/agibot-world/AgiBotWorld-Beta)
- [Galaxea](https://huggingface.co/datasets/OpenGalaxea/Galaxea-Open-World-Dataset)
- [Open X-Embodiment](https://github.com/google-deepmind/open_x_embodiment)
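
As a sanity check after downloading, the annotation file can be tallied per source and compared with the sample counts listed in the overview table. The following is a minimal sketch, not part of the official tooling: the helper names (`source_of`, `count_by_source`, `diff_from_expected`) are hypothetical, and it assumes every `video` path has the form `videos/<Source>/...`, as the directory structure and the example record suggest.

```python
import json
from collections import Counter

# Per-source sample counts as listed in the Dataset Overview table.
EXPECTED_COUNTS = {
    "AgiBot": 125_259,
    "OXE": 77_110,
    "RoboMIND": 32_792,
    "RoboCOIN": 26_450,
    "Galaxea": 25_946,
}


def source_of(video_path):
    """Return the source name, ASSUMING paths look like 'videos/<Source>/...'."""
    parts = video_path.replace("\\", "/").split("/")
    if len(parts) < 3 or parts[0] != "videos":
        raise ValueError(f"unexpected video path layout: {video_path!r}")
    return parts[1]


def count_by_source(lines):
    """Count records per source from an iterable of JSONL lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:  # tolerate blank lines
            continue
        record = json.loads(line)
        counts[source_of(record["video"])] += 1
    return counts


def diff_from_expected(counts, expected=EXPECTED_COUNTS):
    """Return {source: (found, expected)} for every source whose count differs."""
    sources = set(counts) | set(expected)
    return {
        s: (counts.get(s, 0), expected.get(s, 0))
        for s in sorted(sources)
        if counts.get(s, 0) != expected.get(s, 0)
    }


# Example usage (the path is a placeholder):
#
#   with open("/path/to/this/dataset/ABot-PhysWorld_v1.jsonl", encoding="utf-8") as f:
#       counts = count_by_source(f)
#   print(diff_from_expected(counts))  # an empty dict means every count matches
```

Because `count_by_source` accepts any iterable of lines, it works on an open file handle without loading the whole annotation file into memory.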

Provider: maas

Created: 2026-03-26
