
ABot-PhysWorld_SFT_Training_Data_v1

ModelScope Community. Updated 2026-05-14; indexed 2026-05-03.
Download link:
https://modelscope.cn/datasets/amap_cvlab/ABot-PhysWorld_SFT_Training_Data_v1
Description:
<div align="center">
<h1>🤖 ABot-PhysWorld SFT Training Data (v1)</h1>
<p align="center">
<b>AMAP CV Lab</b>
</p>
<p align="center">
<a href="https://arxiv.org/abs/2603.23376"><img src="https://img.shields.io/static/v1?label=Paper&message=arXiv&color=red&logo=arxiv"></a>
<a href="https://github.com/amap-cvlab/ABot-PhysWorld"><img src="https://img.shields.io/badge/Code-GitHub-blue?logo=github"></a>
<a href="https://modelscope.cn/models/amap_cvlab/Abot-PhysWorld"><img src="https://img.shields.io/static/v1?label=Model&message=ModelScope&color=purple"></a>
</p>
</div>

> **Supervised Fine-Tuning (SFT) data** for [ABot-PhysWorld](https://github.com/amap-cvlab/ABot-PhysWorld), a physically consistent, action-controllable video world model for robotic manipulation built on a 14B Diffusion Transformer.

## 📊 Dataset Overview

This is the **v1** release. The dataset contains **287,557** curated robotic manipulation video clips paired with dense textual descriptions, aggregated from five real-world sources.

| Source | Samples | Description |
|--------|---------|-------------|
| **AgiBot** | 125,259 | Agile robotic manipulation challenges |
| **OXE** | 77,110 | Open X-Embodiment cross-robot data |
| **RoboMIND** | 32,792 | Multi-robot manipulation with diverse end-effectors |
| **RoboCOIN** | 26,450 | Multi-embodiment collaborative manipulation |
| **Galaxea** | 25,946 | Open-world robotic interaction |
| **Total** | **287,557** | |

## 📁 Structure

```
.
├── ABot-PhysWorld_v1.jsonl    # Annotations: video path + text prompt
└── videos/
    ├── AgiBot/
    ├── OXE/
    ├── RoboMIND/
    ├── RoboCOIN/
    └── Galaxea/
```

## 📝 Data Format

Each line in `ABot-PhysWorld_v1.jsonl` is a JSON object:

```json
{
  "video": "videos/RoboMIND/.../trajectory_camera_left.mp4",
  "prompt": "The video opens with a view of a clean, well-lit industrial workspace..."
}
```

- **`video`**: Path to the MP4 video file, relative to the dataset root
- **`prompt`**: Dense, multi-paragraph description covering initial scene setup, step-by-step action progression, final state, and camera perspective

## 🚀 Usage

```python
import json
import os

DATA_ROOT = "/path/to/this/dataset"

# Read the annotation file one JSON object per line.
with open(os.path.join(DATA_ROOT, "ABot-PhysWorld_v1.jsonl")) as f:
    for line in f:
        record = json.loads(line)
        # "video" is relative to DATA_ROOT; "prompt" is the text description.
        video_path = os.path.join(DATA_ROOT, record["video"])
        prompt = record["prompt"]
```

## 📜 Citation

```bibtex
@article{abot-physworld2026,
  title={ABot-PhysWorld: Interactive World Foundation Model for Robotic Manipulation with Physics Alignment},
  author={Yuzhi Chen and Ronghan Chen and Dongjie Huo and Yandan Yang and Dekang Qi and Haoyun Liu and Tong Lin and Shuang Zeng and Junjin Xiao and Xinyuan Chang and Feng Xiong and Xing Wei and Zhiheng Ma and Mu Xu},
  year={2026}
}
```

## License

Released for **research purposes only**. Please refer to the original source datasets for their respective licenses.

## 🙏 Acknowledgement

- [RoboMIND](https://huggingface.co/datasets/x-humanoid-robomind/RoboMIND)
- [RoboCOIN](https://huggingface.co/RoboCOIN)
- [AgiBotWorld](https://huggingface.co/datasets/agibot-world/AgiBotWorld-Beta)
- [Galaxea](https://huggingface.co/datasets/OpenGalaxea/Galaxea-Open-World-Dataset)
- [Open X-Embodiment](https://github.com/google-deepmind/open_x_embodiment)
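
As a complement to the Usage snippet, the following is a minimal sketch of a post-download sanity check: it tallies records per source and lists annotated videos that are missing on disk, so the tallies can be compared with the Dataset Overview table. It assumes every `video` path has the form `videos/<Source>/...`, as the Structure section suggests. The function name `summarize` and its `jsonl_name` parameter are hypothetical and not part of the official release.

```python
import json
import os
from collections import Counter


def summarize(data_root, jsonl_name="ABot-PhysWorld_v1.jsonl"):
    """Return (records per source, annotated video paths missing on disk)."""
    per_source = Counter()
    missing = []
    with open(os.path.join(data_root, jsonl_name), encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the annotation file
            record = json.loads(line)
            # Assumption: paths look like "videos/<Source>/...", so the
            # second path component names the source dataset.
            parts = record["video"].split("/")
            source = parts[1] if len(parts) > 1 else "unknown"
            per_source[source] += 1
            if not os.path.isfile(os.path.join(data_root, record["video"])):
                missing.append(record["video"])
    return per_source, missing


# Example call (replace the path with the local dataset root):
# counts, missing = summarize("/path/to/this/dataset")
# print(dict(counts), sum(counts.values()), len(missing))
```

On a complete download, the per-source counts would be expected to match the table above and the list of missing files to be empty; any difference points to an incomplete or partially extracted copy.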

Provider:
maas
Created:
2026-03-26