WorldArena/WorldArena_Robotwin2.0

Name: WorldArena/WorldArena_Robotwin2.0
Creator: WorldArena
Published: 2026-03-22 10:52:07
License: No description available

Hugging Face · Updated 2026-03-22 · Indexed 2026-03-29

Download link:

https://hf-mirror.com/datasets/WorldArena/WorldArena_Robotwin2.0


Resource description:

---
license: apache-2.0
language:
- en
tags:
- code
pretty_name: RoboTwin Embodied Video Generation Benchmark Dataset for WorldArena evaluation
size_categories:
- 100M<n<1B
---

# RoboTwin Embodied Video Generation Dataset for WorldArena

This dataset is designed for embodied video generation and evaluation across two main leaderboards and an interactive arena of [WorldArena](https://huggingface.co/spaces/WorldArena/WorldArena).

## 0) Dataset Overview

- **Leaderboard (test_dataset)**: Evaluation set for the Leaderboard. Extract the directory from `test_dataset.tar.gz`.
- **Arena (val_dataset)**: Used for the **Arena** (interactive comparison). This set lets users upload their own generated videos for a specific episode and compare them against existing baselines with real-time metrics. Extract the directory from `val_dataset.tar.gz`.

Final evaluation results will be synchronized to the Leaderboard and, optionally, to the Arena.

---

## 1) Folder Structure & Modalities

For any episode key (for example `episodeK`), the following files are one-to-one aligned and must be used together:

1. `data/.../episodeK.hdf5`
   - Main action sequence container.
   - Contains joint actions and end-effector poses.
2. `instructions/.../episodeK.json`, `instructions_1/.../episodeK.json`, `instructions_2/.../episodeK.json`
   - Language prompt.
   - Contains one field: `instruction`.
   - Use `instruction` to generate a video dataset named `{model_name}_test` from test_dataset or `{model_name}_val` from val_dataset.
   - New prompt variants for Action Following
     - To evaluate Action Following with **new, different actions**, we provide two additional prompt sets:
       - `instructions_1/.../episodeK.json`: use `instruction_1` to generate a video dataset named `{model_name}_test_1` from test_dataset or `{model_name}_val_1` from val_dataset.
       - `instructions_2/.../episodeK.json`: use `instruction_2` to generate a video dataset named `{model_name}_test_2` from test_dataset or `{model_name}_val_2` from val_dataset.
     - Use these two prompts to generate two new action videos. If an **action-guided** model has no modifiable prompt, consider using actions from other tasks (e.g. `use the action of episode2 to generate episode1`) to **achieve two different actions**, and name the generated video datasets in the same way as above.
3. `first_frame/.../episodeK.jpg`
   - Initial visual condition frame.
   - Use this as the first frame when doing generation.

---

## 2) Quick Start for Inference

To generate a future video for `episodeK`:

1. **Initial Condition**: Use `first_frame/.../episodeK.png`.
2. **Text-driven**: Extract `instruction(_1,_2)` from `instructions(_1,_2)/.../episodeK.json` and use it as the prompt, together with the corresponding first_frame, for inference.
3. **Action-driven**: Use `data/.../episodeK.hdf5` as the action/trajectory input, together with the corresponding first_frame, for inference.

**Requirement**: Models should take the `first_frame` and `instruction/action` as input and generate a video set containing 1000 (test) / 500 (val) videos, one per `first_frame`. Then repeat the generation with `instruction_1` and `instruction_2`, producing two further sets of 1000 (test) / 500 (val) videos each.

---

## 3) Technical Specifications

Our data is sampled and processed from the **RoboTwin 2.0** dataset. For detailed technical specifications, HDF5 structure, and coordinate systems, please refer to the official documentation: [https://robotwin-platform.github.io/](https://robotwin-platform.github.io/)

---
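
The file alignment and output naming rules from sections 1 and 2 can be sketched in Python as below. This is a minimal illustration, not an official loader. The helper names (`output_dataset_name`, `load_instruction`, `find_first_frame`) and the `rel_dir` parameter are hypothetical; `rel_dir` stands in for the elided `...` part of each path, which this card does not specify. The sketch also assumes that the JSON field in `instructions_1` and `instructions_2` is named `instruction_1` and `instruction_2` respectively, and it tries both `.jpg` and `.png` for the first frame because the card mentions both extensions.

```python
import json
from pathlib import Path

# Prompt set -> (instructions folder, assumed JSON field, dataset-name suffix).
# The field names for sets 1 and 2 are an assumption based on section 1.
PROMPT_SETS = {
    0: ("instructions", "instruction", ""),
    1: ("instructions_1", "instruction_1", "_1"),
    2: ("instructions_2", "instruction_2", "_2"),
}


def output_dataset_name(model_name: str, split: str, variant: int = 0) -> str:
    """Build the generated-video dataset name, e.g. 'mymodel_test' or 'mymodel_val_2'."""
    if split not in ("test", "val"):
        raise ValueError("split must be 'test' or 'val'")
    return f"{model_name}_{split}{PROMPT_SETS[variant][2]}"


def load_instruction(root, rel_dir: str, episode: str, variant: int = 0) -> str:
    """Read the language prompt for one episode from the chosen prompt set."""
    folder, field, _ = PROMPT_SETS[variant]
    path = Path(root) / folder / rel_dir / f"{episode}.json"
    with path.open(encoding="utf-8") as f:
        return json.load(f)[field]


def find_first_frame(root, rel_dir: str, episode: str) -> Path:
    """Locate the initial condition frame, accepting either .jpg or .png."""
    for ext in (".jpg", ".png"):
        candidate = Path(root) / "first_frame" / rel_dir / f"{episode}{ext}"
        if candidate.exists():
            return candidate
    raise FileNotFoundError(f"no first frame found for {episode} under {rel_dir}")
```

A typical call would pass the extracted `test_dataset` or `val_dataset` directory as `root`, then feed the returned prompt and frame path to the model and write the resulting videos into the directory named by `output_dataset_name`.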

Provided by:

WorldArena
