X-Humanoid/WoW-1-Benchmark-Samples

Name: X-Humanoid/WoW-1-Benchmark-Samples
Creator: X-Humanoid
Published: 2025-10-17 06:26:12
License: MIT

Source: Hugging Face. Updated 2025-10-17; indexed 2026-04-05.

Download link:

https://hf-mirror.com/datasets/X-Humanoid/WoW-1-Benchmark-Samples

Resource description:

---
language:
- en
pretty_name: WoW-1 Benchmark Samples
tags:
- robotics
- physical-reasoning
- causal-reasoning
- action-understanding
- video-understanding
- embodied-ai
- wow
- arxiv:2509.22642
license: mit
task_categories:
- video-classification
- action-generation
dataset_type: benchmark
size_categories:
- 1K<n<10K
---

# 🧠 WoW-1 Benchmark Samples

**WoW-1 Benchmark Samples** is the official evaluation dataset released as part of the [WoW (World-Omniscient World Model)](https://github.com/wow-world-model/wow-world-model) project. This benchmark is designed to assess the physical consistency and causal reasoning capabilities of generative world models for robotics and embodied AI.

## 📘 Dataset Overview

This dataset contains **612** natural language prompts representing real-world robot interaction tasks. These instructions are used to evaluate world models on their ability to understand and generate plausible, physically grounded responses in video or action space.

Each sample describes a short-term or long-horizon task involving:

- Object manipulation (e.g., _"Put the screw driver into the drawer"_)
- Physical causality (e.g., _"Pick up an egg and crack it into the bowl"_)
- Spatial reasoning (e.g., _"Move the lid from the black pot to the blue pan"_)
- State transitions (e.g., _"Turn off the light switch"_)

## 🧪 Use Cases

This dataset is intended for:

- Evaluating generative video models on **physical realism**
- Testing embodied agents on **causal reasoning**
- Benchmarking **language-to-action** and **planning** models
- Training or fine-tuning **robotic manipulation** systems

## 🔢 Format

- **Modality**: Text (natural language commands)
- **Format**: Plain text / JSON / Parquet
- **Example**:

```json
{
  "text": "Put the apples on the table into the basket."
}
```

## 📊 Dataset Stats

- Number of samples: 612
- Text lengths: 11 to 230 characters
- Language: English

## 📎 Example Samples

- `Clean the table surface`
- `Use the right arm to grab the pearl and give it to the left arm`
- `Open the door of the red microwave`
- `Place the tennis ball in the brown object`

## 🔗 Related Models

This dataset is used for evaluating models such as:

- `WoW-1-DiT-2B`, `WoW-1-DiT-7B`
- `WoW-1-Wan-14B`
- `SOPHIA`-guided generative models

## 📄 Related Paper

> **[WoW: Towards a World omniscient World model Through Embodied Interaction](https://arxiv.org/abs/2509.22642)**
> *Xiaowei Chi et al., 2025, arXiv:2509.22642*

Please cite this paper if you use the dataset:

```bibtex
@article{chi2025wow,
  title={WoW: Towards a World omniscient World model Through Embodied Interaction},
  author={Chi, Xiaowei and Jia, Peidong and Fan, Chun-Kai and Ju, Xiaozhu and Mi, Weishi and Qin, Zhiyuan and Zhang, Kevin and Tian, Wanxin and Ge, Kuangzhi and Li, Hao and others},
  journal={arXiv preprint arXiv:2509.22642},
  year={2025}
}
```

## 🌐 Project Links

- 🔬 Project site: [wow-world-model.github.io](https://wow-world-model.github.io/)
- 💻 GitHub: [github.com/wow-world-model/wow-world-model](https://github.com/wow-world-model/wow-world-model)
- 📜 ArXiv: [arxiv.org/abs/2509.22642](https://arxiv.org/abs/2509.22642)

## 🪪 License

This dataset is released under the [MIT License](https://opensource.org/licenses/MIT).

---

🤗 We encourage the community to explore, evaluate, and extend this benchmark. Contributions and feedback are welcome via GitHub or the project website.
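
The card describes each sample as a record with a single `text` field and reports 612 samples with lengths between 11 and 230 characters. The sketch below shows one way to sanity-check a local copy against those figures. It assumes the prompts are stored as JSON Lines (one `{"text": ...}` object per line), which the card does not confirm; `parse_prompts` and `length_stats` are hypothetical helper names, and the four inline records are prompts quoted in the card, used only for illustration.

```python
import json
from typing import Iterable


def parse_prompts(lines: Iterable[str]) -> list[str]:
    """Parse JSON Lines records of the form {"text": "..."} into prompt strings.

    Assumption: one JSON object per line; the card only says
    "Plain text / JSON / Parquet" and does not specify the layout.
    """
    prompts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        prompts.append(record["text"])
    return prompts


def length_stats(prompts: list[str]) -> dict[str, int]:
    """Return the sample count and the shortest/longest prompt length in characters."""
    if not prompts:
        raise ValueError("no prompts to summarize")
    lengths = [len(p) for p in prompts]
    return {
        "count": len(prompts),
        "min_chars": min(lengths),
        "max_chars": max(lengths),
    }


# Four prompts quoted in the card, encoded as JSON Lines for illustration.
sample_lines = [
    '{"text": "Clean the table surface"}',
    '{"text": "Open the door of the red microwave"}',
    '{"text": "Place the tennis ball in the brown object"}',
    '{"text": "Put the apples on the table into the basket."}',
]

prompts = parse_prompts(sample_lines)
stats = length_stats(prompts)
print(stats)
```

Run against the full dataset, the same check would be expected to report a count of 612 and lengths from 11 to 230 characters if the local copy matches the card's stated statistics.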


Provider: X-Humanoid