shi-labs/physical-ai-bench-generation

Name: shi-labs/physical-ai-bench-generation
Creator: shi-labs
Published: 2025-12-10 08:07:43
License: cc-by-nc-4.0

Source: Hugging Face. Updated 2025-12-10; indexed 2026-01-03.

Download link:

https://hf-mirror.com/datasets/shi-labs/physical-ai-bench-generation


Description:

---
dataset_info:
  config_name: pbench
  features:
    - name: image_path
      dtype: string
    - name: prompt
      dtype: string
    - name: question
      dtype: string
    - name: answer
      dtype: string
    - name: domain
      dtype: string
  splits:
    - name: benchmark
      num_bytes: 237000000
      num_examples: 1044
  download_size: 226000000
  dataset_size: 237000000
configs:
  - config_name: default
    data_files:
      - split: benchmark
        path: "cosmos_predict2_bench_full_info.json"
task_categories:
  - visual-question-answering
  - text-generation
language:
  - en
license: cc-by-nc-4.0
size_categories:
  - 1K<n<10K
tags:
  - physical-ai
  - world-models
  - benchmark
  - multimodal
---

# Physical AI Bench - Generation

[Paper](https://huggingface.co/papers/2512.01989) | [Code](https://github.com/SHI-Labs/physical-ai-bench)

## Dataset Description

PAI-Bench is a benchmark for quantitatively measuring the progress of world models. The predict task contains 1,044 samples of text prompts, conditioning images, and QA pairs, covering Physical AI target domains including autonomous vehicle (AV) driving, robotics, industry (smart space), physics, human, and common sense. All questions are binary: the answer is either Yes or No.

The dataset is designed to evaluate world models for Physical AI and is ready for non-commercial use.

## License/Terms of Use

The use of this dataset is governed by [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/deed.en).

## Intended Usage

This benchmark dataset is intended to demonstrate and facilitate the understanding and evaluation of world models for Physical AI. It should primarily be used for educational and demonstration purposes.

## Dataset Characterization

This dataset focuses on the following areas: Autonomous Vehicle (AV) driving, Robotics, Industry (smart space), Physics, Human, Common Sense.
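Because every question in the benchmark is binary, scoring a model's replies can be reduced to comparing normalized Yes/No strings. The following is a minimal sketch of that idea, not the official PAI-Bench evaluation code: the field names `answer` and `domain` come from the feature list in the dataset card, while the helper names, the normalization rule, the domain label strings, and the sample records are all hypothetical.

```python
from collections import defaultdict


def normalize_yes_no(text):
    """Map a free-form reply to 'yes', 'no', or None if it is neither."""
    cleaned = text.strip().lower().rstrip(".!")
    if cleaned in ("yes", "no"):
        return cleaned
    return None


def accuracy_by_domain(records, predictions):
    """Return {domain: fraction of predictions that match the gold answer}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record, predicted in zip(records, predictions):
        gold = normalize_yes_no(record["answer"])
        total[record["domain"]] += 1
        # A reply that is neither yes nor no is counted as wrong.
        if gold is not None and normalize_yes_no(predicted) == gold:
            correct[record["domain"]] += 1
    return {domain: correct[domain] / total[domain] for domain in total}


# Hypothetical records for illustration only; they are not taken from the dataset.
sample_records = [
    {"question": "Is the robot arm holding a cup?", "answer": "Yes", "domain": "robotics"},
    {"question": "Does the car stop at the light?", "answer": "No", "domain": "av"},
    {"question": "Does the ball fall downward?", "answer": "Yes", "domain": "physics"},
    {"question": "Does the ball bounce?", "answer": "No", "domain": "physics"},
]
sample_predictions = ["yes.", "No", "No", "no"]

scores = accuracy_by_domain(sample_records, sample_predictions)
```

Reporting accuracy per domain rather than a single number keeps the larger domains from dominating the result; whether the paper aggregates scores this way is not stated in the dataset card.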

### Data Collection Method

- AV: Automatic/Sensors
- Industry: Automatic/Sensors
- Physics: Automatic/Sensors
- Robotics: Automatic/Sensors
- Human: Automatic/Sensors
- Common Sense: Human

### Labeling Method

- AV: Hybrid: Human, Automated
- Industry: Hybrid: Human, Automated
- Physics: Hybrid: Human, Automated
- Robotics: Hybrid: Human, Automated
- Human: Hybrid: Human, Automated
- Common Sense: Hybrid: Human, Automated

## Folder Structure

```text
pbench/
├── condition_image/                      # Conditioning images for all domains
├── vqa/                                  # Visual Question Answering pairs
└── cosmos_predict2_bench_full_info.json  # Complete dataset metadata
```

## Dataset Format

- Modality: Image (jpg) and Text

## Dataset Quantification

The dataset is stored in JSON files. The table below lists the number of samples (conditioning images, text prompts, and QA pairs) per domain in the Pbench dataset.

| Domain                 | Quantity   |
| ---------------------- | ---------- |
| AV                     | 118        |
| Common Sense           | 239        |
| Human                  | 299        |
| Industry               | 107        |
| Physics                | 107        |
| Robotics               | 174        |
| **Total Storage Size** | **226 MB** |

## Citation

If you use Physical AI Bench in your research, please cite:

```bibtex
@misc{zhou2025paibenchcomprehensivebenchmarkphysical,
  title={PAI-Bench: A Comprehensive Benchmark For Physical AI},
  author={Fengzhe Zhou and Jiannan Huang and Jialuo Li and Deva Ramanan and Humphrey Shi},
  year={2025},
  eprint={2512.01989},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.01989},
}
```
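The per-domain quantities in the Dataset Quantification table sum to the 1,044 samples stated in the description, and a downloaded copy can be checked against them. The sketch below tallies records by their `domain` field and reports any difference from the table. It assumes the records are available as a list of dicts and that the domain labels in the JSON are spelled as in the table, which the dataset card does not confirm; the function names are hypothetical.

```python
from collections import Counter

# Per-domain quantities copied from the Dataset Quantification table.
EXPECTED_COUNTS = {
    "AV": 118,
    "Common Sense": 239,
    "Human": 299,
    "Industry": 107,
    "Physics": 107,
    "Robotics": 174,
}


def count_by_domain(records):
    """Tally records by their 'domain' field."""
    return Counter(record["domain"] for record in records)


def mismatches(records, expected=EXPECTED_COUNTS):
    """Return {domain: (expected, actual)} for every domain whose count differs."""
    actual = count_by_domain(records)
    domains = set(expected) | set(actual)
    return {
        domain: (expected.get(domain, 0), actual.get(domain, 0))
        for domain in domains
        if expected.get(domain, 0) != actual.get(domain, 0)
    }


# The table's quantities add up to the sample count given in the description.
total_expected = sum(EXPECTED_COUNTS.values())

# Synthetic stand-in records (domain field only), used here to exercise the check.
synthetic_records = [
    {"domain": domain}
    for domain, quantity in EXPECTED_COUNTS.items()
    for _ in range(quantity)
]
```

An empty result from `mismatches` means the copy matches the table. If the JSON uses different label spellings (for example lowercase), the keys of `EXPECTED_COUNTS` would need to be adjusted first.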

Provided by:

shi-labs
