Haoyuwu/MultiWorldData

Name: Haoyuwu/MultiWorldData
Creator: Haoyuwu
Published: 2026-04-20 17:46:54
License: apache-2.0

Source: Hugging Face. Updated 2026-04-20; indexed 2026-04-26.

Download link:

https://hf-mirror.com/datasets/Haoyuwu/MultiWorldData

Resource description:

---
license: apache-2.0
size_categories:
- 100K<n<1M
---

# MultiWorld Dataset

## Dataset Summary

**MultiWorld** is a large-scale multi-agent multi-view video dataset collected for training video world models. It contains two complementary sources of data:

1. **It Takes Two Gameplay Dataset**: 100+ hours of real human gameplay from the cooperative action-adventure game *It Takes Two*, featuring dual-agent synchronized actions with distinct first-person viewpoints.
2. **RoboFactory Manipulation Dataset**: Multi-robot manipulation trajectories spanning 4 tasks with 2-4 agents and variable camera viewpoints, including both success and failure episodes.

This dataset is the official release accompanying the paper *"MultiWorld: Scalable Multi-Agent Multi-View Video World Models"*.

- **Homepage:** https://multi-world.github.io
- **Repository:** https://github.com/CIntellifusion/MultiWorld
- **Paper:** [arXiv:XXXX.XXXXX](https://arxiv.org/abs/XXXX.XXXXX)

---

## Dataset Details

### It Takes Two Gameplay

| Property | Value |
|----------|-------|
| **Total Duration** | 100+ hours |
| **Frame Rate** | 60 FPS |
| **Resolution** | 480 × 960 |
| **Agents** | 2 players |
| **Viewpoints** | 2 distinct first-person views per episode |
| **Actions** | Synchronized keyboard and mouse actions per agent |
| **Modality** | RGB video + discrete/continuous action vectors |

The gameplay videos are captured from real human players cooperating in the game. Each frame is accompanied by per-agent action labels capturing keyboard presses and mouse movements.
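To give a feel for the scale implied by the table above, the following sketch works out lower-bound frame counts from the listed duration and frame rate. It is only an estimate under stated assumptions: the card says "100+ hours", so 100 is a floor; the card does not say which of 480 and 960 is height or width, nor whether each view is stored as its own full-resolution stream. The `VideoSpec` class and its field names are hypothetical helpers, not part of the dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoSpec:
    """Hypothetical container for the figures in the dataset card."""
    hours: float   # lower bound; the card states "100+ hours"
    fps: int       # frames per second
    height: int    # assumed: 480 is the height
    width: int     # assumed: 960 is the width
    views: int     # number of per-agent viewpoints

    def frames_per_view(self) -> int:
        # hours -> seconds -> frames
        return int(self.hours * 3600 * self.fps)

    def total_frames(self) -> int:
        # assumes one full video stream per view
        return self.frames_per_view() * self.views

    def raw_bytes_per_frame(self) -> int:
        # uncompressed 8-bit RGB: 3 bytes per pixel
        return self.height * self.width * 3


IT_TAKES_TWO = VideoSpec(hours=100, fps=60, height=480, width=960, views=2)

print("frames per view (at least):", IT_TAKES_TWO.frames_per_view())
print("frames over both views (at least):", IT_TAKES_TWO.total_frames())
print("raw RGB bytes per frame:", IT_TAKES_TWO.raw_bytes_per_frame())
```

Because every frame carries per-agent action labels, the same per-view count is also a lower bound on the number of action vectors per agent.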
### RoboFactory Manipulation

| Property | Value |
|----------|-------|
| **Tasks** | 4 multi-robot manipulation tasks |
| **Agents** | 2–4 robots per task |
| **Viewpoints** | Variable camera configurations per task |
| **Resolution** | 256 × 320 |
| **Success Episodes** | 1,000 per task |
| **Failure Episodes** | 2,000 per task |
| **Modality** | RGB video + robot proprioception + actions |

Tasks include collaborative stacking, pushing, and pick-and-place scenarios. Both successful and failed trajectories are included to support learning robust world models and failure prediction.

---

### Possible Usage

The dataset is intended for research in:

- Video world models
- Multi-agent video generation
- Multi-view consistent video generation

---

### Contact

For questions about the dataset, please open an issue on the [GitHub repository](https://github.com/CIntellifusion/MultiWorld) or contact the authors.
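For readers who want to fetch the files described in this card, a minimal download sketch using the `huggingface_hub` library might look like the following. The repository ID and the mirror URL come from this page; everything else is an assumption. In particular, the card does not document the file layout inside the repository, so the helper name `download_multiworld` and any file pattern passed to it are hypothetical.

```python
from pathlib import Path
from typing import Optional, Sequence

REPO_ID = "Haoyuwu/MultiWorldData"          # dataset ID from this page
MIRROR_ENDPOINT = "https://hf-mirror.com"   # mirror host from the download link


def download_multiworld(
    local_dir: str,
    patterns: Optional[Sequence[str]] = None,
    use_mirror: bool = False,
) -> Path:
    """Download the dataset repo (or a subset of it) into local_dir.

    patterns: optional glob patterns to restrict the download. The repo
    layout is not documented in the card, so callers must inspect the
    repository before relying on any specific pattern.
    """
    # Imported lazily so this module can be loaded without the library.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # required: this is a dataset, not a model
        local_dir=local_dir,
        allow_patterns=list(patterns) if patterns else None,
        endpoint=MIRROR_ENDPOINT if use_mirror else None,
    )
    return Path(path)
```

Calling `download_multiworld("./MultiWorldData")` would attempt to pull the full repository, which may be very large given the 100+ hours of video; restricting the download with `patterns` after inspecting the file list is likely the more practical route.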

Provided by:

Haoyuwu
