turing-motors/Japan-Open-Driving-Dataset-Sample
Source: Hugging Face (updated 2026-03-27, indexed 2026-03-29)
Download link:
https://hf-mirror.com/datasets/turing-motors/Japan-Open-Driving-Dataset-Sample
---
license: cc-by-nc-sa-4.0
language:
- en
tags:
- autonomous-driving
size_categories:
- 10K<n<100K
---
# Japan Open Driving Dataset Sample
## Overview
This repository contains a sample subset of the **Japan Open Driving Dataset**, a large-scale autonomous driving dataset comprising over 100 hours of driving data collected in Tokyo, Japan.
The data is stored in [nuScenes format](https://www.nuscenes.org/nuscenes) and can be loaded with the [nuscenes-devkit](https://github.com/nutonomy/nuscenes-devkit).
In addition to sensor data and 3D annotations, this dataset includes virtual captioned data for training Vision-Language Model (VLM) and Vision-Language-Action (VLA) models.
See [captions/README.md](captions/README.md) for details.
## Dataset Statistics
| Item | Count |
| --- | --- |
| Scenes | 20 |
| Samples (keyframes) | 4,000 |
| Sample annotations (3D bounding boxes) | 218,675 |
| Object instances | 17,707 |
| Object categories | 45 |
| Maps | 4 |
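
As a quick sanity check on scale, the counts in the table above can be turned into per-scene and per-keyframe averages. The snippet below is only illustrative arithmetic on those published totals; the variable names are ours, and real scenes will vary around these averages.

```python
# Counts copied from the statistics table above.
scenes = 20
samples = 4_000
annotations = 218_675
instances = 17_707

# Simple averages; individual scenes and keyframes will differ.
samples_per_scene = samples / scenes
boxes_per_sample = annotations / samples
boxes_per_instance = annotations / instances

print(f"keyframes per scene: {samples_per_scene:.0f}")
print(f"3D boxes per keyframe: {boxes_per_sample:.1f}")
print(f"3D boxes per object instance: {boxes_per_instance:.1f}")
```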
### Sensors
| Sensor | Modality |
| --- | --- |
| CAM_FRONT | Camera |
| CAM_FRONT_WIDE | Camera |
| CAM_FRONT_LEFT | Camera |
| CAM_FRONT_RIGHT | Camera |
| CAM_BACK | Camera |
| CAM_BACK_LEFT | Camera |
| CAM_BACK_RIGHT | Camera |
| LIDAR_TOP | LiDAR |
### Collection Locations
| Location | Area |
| --- | --- |
| 2041_shibuya_shibuya | Shibuya, Tokyo |
| 2042_minato_azabu | Azabu, Minato-ku, Tokyo |
| 2054_koto_odaiba | Odaiba, Koto-ku, Tokyo |
| 2062_shinagawa_osaki | Osaki, Shinagawa-ku, Tokyo |
## Dataset Structure
```text
.
├── README.md
├── pyproject.toml
├── nuscenes-devkit.zip
├── nuscenes_tutorial.ipynb
├── scripts/
│   └── download_dataset.sh
├── v2.X-train/
│   ├── attribute.json
│   ├── calibrated_sensor.json
│   ├── category.json
│   ├── ego_pose.json
│   ├── instance.json
│   ├── log.json
│   ├── map.json
│   ├── sample.json
│   ├── sample_annotation.json
│   ├── sample_data.json
│   ├── scene.json
│   ├── sensor.json
│   └── visibility.json
├── samples/
│   ├── CAM_FRONT/
│   ├── CAM_FRONT_WIDE/
│   ├── CAM_FRONT_LEFT/
│   ├── CAM_FRONT_RIGHT/
│   ├── CAM_BACK/
│   ├── CAM_BACK_LEFT/
│   ├── CAM_BACK_RIGHT/
│   └── LIDAR_TOP/
├── maps/
│   └── expansion/
├── can_bus/
├── archived_pallet_pickles/
└── captions/
    ├── README.md
    ├── STRIDE-QA/
    └── RACER/
```
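
After downloading, it can be useful to confirm that the layout above is actually present on disk. The following is a minimal sketch using only the standard library; `missing_entries` and the two constant lists are hypothetical helpers written for this README, not part of the repository or the nuscenes-devkit.

```python
from pathlib import Path

# Directory names taken from the tree above.
EXPECTED_DIRS = ["v2.X-train", "samples", "maps", "can_bus", "captions"]
EXPECTED_SENSOR_DIRS = [
    "CAM_FRONT", "CAM_FRONT_WIDE", "CAM_FRONT_LEFT", "CAM_FRONT_RIGHT",
    "CAM_BACK", "CAM_BACK_LEFT", "CAM_BACK_RIGHT", "LIDAR_TOP",
]


def missing_entries(root):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [
        f"samples/{s}"
        for s in EXPECTED_SENSOR_DIRS
        if not (root / "samples" / s).is_dir()
    ]
    return missing


if __name__ == "__main__":
    # An empty list means every expected directory was found.
    print(missing_entries("./Japan-Open-Driving-Dataset-Sample"))
```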
## Getting Started
### 1. Download the Sample Dataset
Ensure you have at least **30 GB** of free disk space.
```bash
huggingface-cli download turing-motors/Japan-Open-Driving-Dataset-Sample \
--repo-type dataset \
--local-dir ./Japan-Open-Driving-Dataset-Sample
```
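
The 30 GB requirement can be checked programmatically before starting the download. This is a small sketch with the standard library; `has_free_space` is a hypothetical helper, and it assumes 1 GB = 10^9 bytes, which the README does not specify.

```python
import shutil

REQUIRED_GB = 30  # figure quoted in this README


def has_free_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has enough free space.

    Assumes decimal gigabytes (1 GB = 10**9 bytes).
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 10**9


if __name__ == "__main__":
    print("enough space" if has_free_space(".") else "not enough space")
```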
### 2. Environment Setup
Install [uv](https://docs.astral.sh/uv/) and set up the Python environment:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv python pin 3.10
uv venv
source .venv/bin/activate
```
Install the nuscenes-devkit from the included `pyproject.toml` and additional dependencies:
```bash
cd Japan-Open-Driving-Dataset-Sample
unzip nuscenes-devkit.zip
uv pip install . # installs nuscenes-devkit + dependencies
uv pip install pypcd4 jupyterlab # LiDAR PCD support + notebook
```
### 3. Load the Dataset
```python
from nuscenes.nuscenes import NuScenes

# dataroot is the downloaded folder; use '.' when running from inside it.
nusc = NuScenes(version='v2.X-train', dataroot='./Japan-Open-Driving-Dataset-Sample', verbose=True)
```
Expected output:
```text
======
Loading NuScenes tables for version v2.X-train...
45 category,
68 attribute,
2 visibility,
17707 instance,
27 sensor,
27 calibrated_sensor,
33613 ego_pose,
20 log,
20 scene,
4000 sample,
35400 sample_data,
218675 sample_annotation,
4 map,
Done loading in 1.671 seconds.
======
Reverse indexing ...
Done reverse indexing in 0.4 seconds.
======
```
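
If the devkit is not installed yet, the table sizes in the output above can still be cross-checked directly against the JSON files. The sketch below assumes each file under `v2.X-train/` holds a single JSON array of records, as in the standard nuScenes layout; `count_records` and `find_mismatches` are hypothetical helpers, not devkit functions.

```python
import json
from pathlib import Path

# Counts copied from the expected loader output above.
EXPECTED_COUNTS = {
    "category": 45, "attribute": 68, "visibility": 2, "instance": 17707,
    "sensor": 27, "calibrated_sensor": 27, "ego_pose": 33613, "log": 20,
    "scene": 20, "sample": 4000, "sample_data": 35400,
    "sample_annotation": 218675, "map": 4,
}


def count_records(dataroot, version="v2.X-train"):
    """Count records in each table file (assumed to be a JSON array)."""
    table_dir = Path(dataroot) / version
    counts = {}
    for name in EXPECTED_COUNTS:
        # Raises FileNotFoundError if a table file is missing.
        with open(table_dir / f"{name}.json", encoding="utf-8") as f:
            counts[name] = len(json.load(f))
    return counts


def find_mismatches(counts):
    """Return {table: (found, expected)} for every table that differs."""
    return {
        name: (counts.get(name), expected)
        for name, expected in EXPECTED_COUNTS.items()
        if counts.get(name) != expected
    }
```

Calling `find_mismatches(count_records("./Japan-Open-Driving-Dataset-Sample"))` should return an empty dictionary when the download is complete and matches the counts listed above.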
### 4. Tutorial Notebook
We provide a modified version of the [nuscenes-devkit tutorial](https://github.com/nutonomy/nuscenes-devkit) (`nuscenes_tutorial.ipynb`).
You can launch the notebook using:
```bash
jupyter lab nuscenes_tutorial.ipynb
```
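
For readers who prefer a script to the notebook, here is a minimal sketch of walking the keyframes of one scene. It takes the `nusc` object from step 3 as an argument and relies only on `nusc.get(...)` plus the standard nuScenes schema fields `first_sample_token`, `next`, `data`, and `filename`; that this dataset populates them in the usual way is an assumption based on its nuScenes format. The helper names are ours.

```python
def iter_scene_samples(nusc, scene):
    """Yield the keyframe records of `scene` in temporal order."""
    token = scene["first_sample_token"]
    while token:
        sample = nusc.get("sample", token)
        yield sample
        token = sample["next"]  # empty string after the last keyframe


def sensor_file(nusc, sample, channel="CAM_FRONT"):
    """Return the relative file path of one sensor's data for a keyframe."""
    sample_data = nusc.get("sample_data", sample["data"][channel])
    return sample_data["filename"]


# Usage with the `nusc` object created in step 3:
# for sample in iter_scene_samples(nusc, nusc.scene[0]):
#     print(sample["timestamp"], sensor_file(nusc, sample, "CAM_FRONT"))
```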
## License
Japan-Open-Driving-Dataset-Sample is released under the [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en) license.
## Privacy Protection
To ensure privacy protection, human faces and license plates in the images were anonymized using the [Dashcam Anonymizer](https://github.com/varungupta31/dashcam_anonymizer).
## Access to the Japan Open Driving Dataset
To access the full Japan Open Driving Dataset, you must review and agree to the terms of use and submit an application form. See the link below for details:
[Application Form for the Japan Open Driving Dataset](https://tur.ing/news/opendataset_2025/)
The full dataset requires approximately **24 TB** of storage. Once your application has been reviewed and approved, a download token will be issued.
You can then download the dataset by running:
```bash
./scripts/download_dataset.sh \
-u "https://open-dataset.turing-motors.net/api/list?token=<YOUR_TOKEN>" \
-o ./dataset \
-p 16
```
Provider:
turing-motors



