EmbodiedCity/EmbodiedNav-Bench

Name: EmbodiedCity/EmbodiedNav-Bench
Creator: EmbodiedCity
Published: 2026-04-25 02:55:51
License: CC-BY-4.0

Source: Hugging Face. Updated 2026-04-25. Indexed 2026-05-10.

Download link:

https://hf-mirror.com/datasets/EmbodiedCity/EmbodiedNav-Bench

Resource description:

---
license: cc-by-4.0
pretty_name: EmbodiedNav-Bench
language:
- en
task_categories:
- visual-question-answering
- reinforcement-learning
tags:
- embodied-ai
- embodied-navigation
- urban-airspace
- drone-navigation
- multimodal-reasoning
- spatial-reasoning
size_categories:
- 1K<n<10K
configs:
- config_name: default
  data_files:
  - split: test
    path: viewer-00000-of-00001.parquet
---

# EmbodiedNav-Bench

[![GitHub](https://img.shields.io/badge/GitHub-Code-181717?logo=github&logoColor=white)](https://github.com/serenditipy-AC/Embodied-Navigation-Bench) [![arXiv](https://img.shields.io/badge/arXiv-2604.07973-b31b1b.svg?logo=arxiv&logoColor=white)](https://arxiv.org/html/2604.07973v1)

EmbodiedNav-Bench is a goal-oriented embodied navigation benchmark for evaluating spatial action in urban 3D airspace. The benchmark contains 5,037 high-quality navigation trajectories with natural-language navigation goals, initial drone poses, target positions, and ground-truth 3D trajectories.

This Hugging Face repository hosts the dataset artifacts. The accompanying project code, simulator setup, media examples, and evaluation scripts are maintained in the GitHub repository: https://github.com/serenditipy-AC/Embodied-Navigation-Bench

## Dataset Summary

The benchmark contains 5,037 goal-oriented navigation trajectories. Each sample corresponds to one navigation task in an urban 3D environment, with a natural-language goal description and a human-collected ground-truth trajectory.

The dataset is intended for evaluating embodied navigation, spatial reasoning, and multimodal decision-making models in urban airspace scenarios.

## Repository Contents

| Path | Description |
| :-- | :-- |
| `navi_data.pkl` | Canonical PKL file for evaluation. |
| `viewer-00000-of-00001.parquet` | Parquet representation for the Hugging Face Dataset Viewer table. |
| `images/` | Trajectory-aligned image release, distributed as five ZIP archives plus a manifest file. |

## Data Fields

The canonical PKL file stores a list of Python dictionaries. Each sample contains the following fields:

| Field | Type | Description |
| :-- | :-- | :-- |
| `sample_index` | `int` | Sample index used for viewer browsing and image archive alignment. |
| `start_pos` | `float[3]` | Initial drone world position `(x, y, z)`. |
| `start_rot` | `float[3]` | Initial drone orientation `(roll, pitch, yaw)` in radians. |
| `start_ang` | `float` | Initial camera gimbal angle in degrees. |
| `task_desc` | `str` | Natural-language navigation instruction. |
| `target_pos` | `float[3]` | Target world position `(x, y, z)`. |
| `gt_traj` | `float[N,3]` | Ground-truth trajectory points. |
| `gt_traj_len` | `float` | Ground-truth trajectory length. |

The Parquet table includes the same structured fields and additional convenience columns such as `sample_index`, `start_x`, `start_y`, `start_z`, `target_x`, `target_y`, `target_z`, and `gt_traj_num_points`. The Parquet file is provided for browsing and visualization in the Hugging Face Dataset Viewer.

## Trajectory-Aligned Images

Trajectory-aligned image archives are available under [`images/`](https://huggingface.co/datasets/EmbodiedCity/EmbodiedNav-Bench/tree/main/images). This release is about 56.7 GB and is distributed as five ZIP archives together with `merged_upload_images_zip_manifest.json`. After extraction, folders `0-5036` correspond directly to the `sample_index` field in `navi_data.pkl` and the viewer table.

| Archive | Sample index range |
| :-- | :-- |
| `merged_upload_images_part01_0000-1007.zip` | `0-1007` |
| `merged_upload_images_part02_1008-2015.zip` | `1008-2015` |
| `merged_upload_images_part03_2016-3022.zip` | `2016-3022` |
| `merged_upload_images_part04_3023-4029.zip` | `3023-4029` |
| `merged_upload_images_part05_4030-5036.zip` | `4030-5036` |

## Usage

For evaluation, use `navi_data.pkl` as the canonical data file and follow the setup instructions in the GitHub project repository.
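
The card leaves loading details to the GitHub repository. As a minimal sketch only (not the project's evaluation code), the Python below reads the PKL list, checks each sample against the field names listed under Data Fields, and looks up which image archive holds a given `sample_index`. The helper names `load_samples`, `polyline_length`, and `archive_for_index` are hypothetical. Treating `gt_traj_len` as the summed Euclidean distance between consecutive `gt_traj` points is an assumption; the card does not say how the length is computed. If the PKL stores positions as NumPy arrays, NumPy would need to be installed for unpickling to succeed.

```python
import math
import pickle
from bisect import bisect_right

# Field names copied from the Data Fields table.
EXPECTED_FIELDS = {
    "sample_index", "start_pos", "start_rot", "start_ang",
    "task_desc", "target_pos", "gt_traj", "gt_traj_len",
}

# First sample index and file name of each archive, copied from the archive table.
ARCHIVE_STARTS = [0, 1008, 2016, 3023, 4030]
ARCHIVE_NAMES = [
    "merged_upload_images_part01_0000-1007.zip",
    "merged_upload_images_part02_1008-2015.zip",
    "merged_upload_images_part03_2016-3022.zip",
    "merged_upload_images_part04_3023-4029.zip",
    "merged_upload_images_part05_4030-5036.zip",
]
LAST_INDEX = 5036


def load_samples(path):
    """Load the list of sample dicts and check that every field is present.

    Unpickling can execute arbitrary code, so only load files you trust.
    """
    with open(path, "rb") as f:
        samples = pickle.load(f)
    for sample in samples:
        missing = EXPECTED_FIELDS - set(sample)
        if missing:
            raise KeyError(f"sample is missing fields: {sorted(missing)}")
    return samples


def polyline_length(points):
    """Sum of Euclidean distances between consecutive 3D points.

    Assumption: this is one plausible reading of `gt_traj_len`, not a
    definition confirmed by the dataset card.
    """
    total = 0.0
    for a, b in zip(points, points[1:]):
        total += math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return total


def archive_for_index(sample_index):
    """Return the name of the ZIP archive covering `sample_index`."""
    if not 0 <= sample_index <= LAST_INDEX:
        raise ValueError(f"sample_index must be in 0..{LAST_INDEX}")
    return ARCHIVE_NAMES[bisect_right(ARCHIVE_STARTS, sample_index) - 1]
```

With a local copy of the file, `samples = load_samples("navi_data.pkl")` would return the list of dictionaries, and comparing `polyline_length(samples[0]["gt_traj"])` with `samples[0]["gt_traj_len"]` is one way to check whether the length assumption holds for this data.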
## License

This dataset is released under the CC-BY-4.0 license.

## Citation

```bibtex
@misc{zhao2026farlargemultimodalmodels,
  title={How Far Are Large Multimodal Models from Human-Level Spatial Action? A Benchmark for Goal-Oriented Embodied Navigation in Urban Airspace},
  author={Baining Zhao and Ziyou Wang and Jianjie Fang and Zile Zhou and Yanggang Xu and Yatai Ji and Jiacheng Xu and Qian Zhang and Weichen Zhang and Chen Gao and Xinlei Chen},
  year={2026},
  eprint={2604.07973},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/html/2604.07973v1},
}
```

Provided by:

EmbodiedCity
