PBench
Source: ModelScope community · updated 2025-12-04 · indexed 2025-06-14
Download link:
https://modelscope.cn/datasets/nv-community/PBench
# PBench: A Physical AI Benchmark for World Models
## Dataset Description:
PBench is a benchmark for quantitatively measuring the progress of world models.
PBench contains 1044 samples of text prompts, conditioning images, and QA pairs, covering the Physical AI target domains of autonomous vehicle (AV) driving, robotics, industry (smart space), physics, human, and common sense. All questions are binary: the answer is either Yes or No. The dataset is designed to evaluate world models for Physical AI. By releasing it, NVIDIA supports the development of world foundation models and provides a benchmark for tracking their progress. For detailed information about PBench, please visit the [PBench Website](https://research.nvidia.com/labs/dir/pbench/).
This dataset is ready for non-commercial use.
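Because every question is binary, one natural way to summarize a model's answers is plain accuracy over Yes/No responses. The sketch below illustrates that idea only. The function names `normalize_answer` and `binary_accuracy` are hypothetical and are not part of any official PBench tooling, and the official scoring protocol may differ.

```python
from typing import Iterable


def normalize_answer(text: str) -> str:
    """Map a free-form answer to 'yes' or 'no'; raise if it is neither."""
    cleaned = text.strip().strip(".!").lower()
    if cleaned not in {"yes", "no"}:
        raise ValueError(f"not a binary answer: {text!r}")
    return cleaned


def binary_accuracy(predictions: Iterable[str], references: Iterable[str]) -> float:
    """Fraction of predictions that match the reference Yes/No answers."""
    preds = [normalize_answer(p) for p in predictions]
    refs = [normalize_answer(r) for r in references]
    if len(preds) != len(refs):
        raise ValueError("predictions and references differ in length")
    if not refs:
        raise ValueError("no answers to score")
    correct = sum(p == r for p, r in zip(preds, refs))
    return correct / len(refs)


if __name__ == "__main__":
    # Toy answers for illustration only; these are not PBench data.
    print(binary_accuracy(["Yes", "no", "No."], ["yes", "yes", "no"]))
```

Rejecting anything other than Yes or No, rather than silently counting it as wrong, makes malformed model outputs visible before they distort the score.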
## Dataset Owner(s):
NVIDIA Corporation
## Dataset Creation Date:
2025/06/11
## License/Terms of Use:
The use of this dataset is governed by [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/deed.en).
## Intended Usage:
This benchmark dataset is intended to demonstrate and facilitate the understanding and evaluation of world models for Physical AI. It should primarily be used for educational and demonstration purposes.
## Dataset Characterization
The PBench dataset focuses on the following areas: Autonomous Vehicle (AV) driving, Robotics, Industry (smart space), Physics, Human, Common Sense.
**Data Collection Method**:
* AV: Automatic/Sensors
* Industry: Automatic/Sensors
* Physics: Automatic/Sensors
* Robotics: Automatic/Sensors
* Human: Automatic/Sensors
* Common Sense: Human
**Labeling Method**:
* AV: Hybrid: Human, Automated
* Industry: Hybrid: Human, Automated
* Physics: Hybrid: Human, Automated
* Robotics: Hybrid: Human, Automated
* Human: Hybrid: Human, Automated
* Common Sense: Hybrid: Human, Automated
## Dataset Format
* Modality: Image (jpg) and Text
## Dataset Quantification
The dataset is stored in a Parquet file. The table below gives the per-domain quantity of the PBench dataset, including the conditioning images, text prompts, and QA pairs.
| Domain | Quantity |
|-----------------|-----------------|
| AV | 118 |
| Common Sense | 239 |
| Human | 299 |
| Industry | 107 |
| Physics | 107 |
| Robotics | 174 |
| **Total Storage Size** | **101 MB** |
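As a quick consistency check, the per-domain quantities in the table can be summed and compared with the 1044 samples stated in the dataset description. The sketch below copies the counts from the table. The names `DOMAIN_COUNTS` and `domain_shares` are illustrative only and do not come from the dataset itself.

```python
# Per-domain quantities copied from the table above.
DOMAIN_COUNTS = {
    "AV": 118,
    "Common Sense": 239,
    "Human": 299,
    "Industry": 107,
    "Physics": 107,
    "Robotics": 174,
}


def domain_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each domain's fraction of the total sample count."""
    total = sum(counts.values())
    return {domain: n / total for domain, n in counts.items()}


if __name__ == "__main__":
    total = sum(DOMAIN_COUNTS.values())
    print(f"total samples: {total}")  # 1044, matching the dataset description
    for domain, share in domain_shares(DOMAIN_COUNTS).items():
        print(f"{domain:<13} {share:.1%}")
```

The shares show how unevenly the domains are represented, which matters when averaging scores across the whole benchmark rather than per domain.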
## Reference(s):
Not published yet. We will add a link to the paper once it is published.
## Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
Provider:
maas
Created:
2025-06-12
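Background overview
PBench is a Physical AI benchmark dataset created by NVIDIA. It contains 1044 samples across multiple domains and is used to evaluate world models. The dataset is released under the CC BY-NC 4.0 license for non-commercial use.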



