Scene-N1

ModelScope community. Updated 2025-12-05. Indexed 2025-09-20.
Download link:
https://modelscope.cn/datasets/InternRobotics/Scene-N1
Resource overview:
<!-- <div align="center"> <img src="https://cdn-uploads.huggingface.co/production/uploads/64e6d9d229a548f66aff6e5b/y9cRIMFKBcY1qsxJ58y7Y.jpeg"/> </div> -->

# SceneData-N1

<!--
## 🔑 Key Features

- **Unified Format for Different Benchmarks**
  InternData-N1 consolidates three subsets—VLN-CE, VLN-PE, and VLN-N1—into the mainstream LeRobot (v2.1) format, facilitating convenient usage across different systems and diverse benchmarks.
- **Diverse Data Covering Different Embodiments, Tasks, and Scenes**
  InternData-N1 offers diversity through its foundation of 3,000+ scene assets, extensive randomization across different robot embodiments and viewpoints, and rephrased instructions generated by LLMs for common navigation tasks.
- **High Quality Through Effective Generation and Filtering**
  InternData-N1 ensures high quality by employing effective data generation strategies (producing smooth and safe trajectories) and rigorous filtering (excluding samples with very few reference objects). This results in state-of-the-art performance for models trained on it, such as InternVLA-N1.

## 📅 TODO List

- [x] **InternData-N1 subsets**: 2.8k+ VLN-PE, 150k+ VLN-CE, 6k+ VLN-N1 episodes
- [ ] **Release 200k+ VLN-N1** (in 2 weeks)
- [ ] **VLN-CE v1 -> v1.3** (in one month)

## 📋 Table of Contents

- [InternData-N1](#interndata-n1)
  - [🔑 Key Features](#-key-features)
  - [📅 TODO List](#-todo-list)
  - [📋 Table of Contents](#-table-of-contents)
  - [🔥 Get Started](#-get-started)
    - [Download the Dataset](#download-the-dataset)
  - [Dataset Structure](#dataset-structure)
    - [Scene Data Assets](#scene-data-assets)
    - [Core Dataset Structure](#core-dataset-structure)

## 🔥 Get Started

### Download the Dataset

To download the full dataset, you can use the following commands. If you encounter any issues, please refer to the official Hugging Face documentation.

```bash
# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install

# When prompted for a password, use an access token with read permissions.
# Generate one from your settings: https://huggingface.co/settings/tokens
git clone https://huggingface.co/datasets/InternRobotics/InternData-N1

# If you want to clone without large files - just their pointers
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/datasets/InternRobotics/InternData-N1
```
-->

## Dataset Structure

```
scene_data/
├── gradio_scene_assets/
├── n1_eval_scenes/
│   ├── Materials
│   ├── SkyTexture
│   ├── internscenes_home
│   └── internscenes_commercial
├── mp3d_pe/
├── mp3d_n1/
└── mp3d_ce/
```

- `scene_data/mp3d_pe/`: Improved Matterport3D scene assets for VLN-PE benchmark.
- `scene_data/mp3d_n1/`: Base Matterport3D scans used for generating VLN-N1 trajectory data.
- `scene_data/mp3d_ce/`: Matterport3D scene assets for VLN-CE benchmark.
- `scene_data/n1_eval_scenes/`: Scene assets for VLN-N1 benchmark.
- `scene_data/gradio_scene_assets/`: Selection of MP3D_CE scenes (with ceilings removed) for rapid interactive testing of the [InternVLA-N1](https://huggingface.co/InternRobotics/InternVLA-N1) model. These simplified environments allow users to quickly verify model performance in key scenarios.

> **Note**: The original scene datasets can be obtained from [Matterport3D](https://niessner.github.io/Matterport/).

# License and Citation

All the data and code within this repo are under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). Please consider citing our project if it helps your research.

```BibTeX
@misc{scene_n1,
  title={Scene-N1 Dataset},
  author={Scene-N1 Dataset contributors},
  howpublished={\url{https://huggingface.co/datasets/InternRobotics/Scene-N1}},
  year={2025}
}
```
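After downloading, the layout listed under Dataset Structure above can be sanity-checked with a few lines of Python. This is a minimal sketch, not part of the dataset's own tooling: the names `EXPECTED_ENTRIES` and `missing_scene_entries` are hypothetical, and the default root `scene_data` is assumed to be relative to the current working directory. The check only tests that each path exists, because the tree does not say whether entries such as `Materials` are files or directories.

```python
from pathlib import Path

# Entries copied from the Dataset Structure tree (paths relative to scene_data/).
# Hypothetical constant name, introduced only for this sketch.
EXPECTED_ENTRIES = (
    "gradio_scene_assets",
    "n1_eval_scenes",
    "n1_eval_scenes/Materials",
    "n1_eval_scenes/SkyTexture",
    "n1_eval_scenes/internscenes_home",
    "n1_eval_scenes/internscenes_commercial",
    "mp3d_pe",
    "mp3d_n1",
    "mp3d_ce",
)


def missing_scene_entries(root="scene_data"):
    """Return the expected entries that do not exist under `root`, in tree order."""
    root = Path(root)
    return [rel for rel in EXPECTED_ENTRIES if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the dataset was downloaded into ./scene_data.
    for rel in missing_scene_entries("scene_data"):
        print(f"missing: scene_data/{rel}")
```

An empty result means every entry from the tree was found. Otherwise the returned list names the paths that are still missing, for example after a partial download.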

Provider:
maas
Created:
2025-07-28