tom-jerry-123/Physical-AI-AV-US

Name: tom-jerry-123/Physical-AI-AV-US
Creator: tom-jerry-123
Published: 2026-03-18 06:06:35
License: No description available

Source: Hugging Face. Updated 2026-03-18; indexed 2026-03-29.

Download link:

https://hf-mirror.com/datasets/tom-jerry-123/Physical-AI-AV-US

Description:

---
license: other
task_categories:
- robotics
tags:
- autonomous-driving
- waypoints
- webdataset
size_categories:
- 1M<n<10M
---

# PhysicalAI-AV-SFT

Supervised fine-tuning (SFT) dataset for an autonomous-vehicle vision-language waypoint-prediction model. Contains **2,789,773 samples** from ~150,000 driving scenes (~18 seconds per scene, sampled at 1 Hz) recorded in the United States.

## Format

[WebDataset](https://github.com/webdataset/webdataset): 100 uncompressed `.tar` shards, each containing a pair of files per sample:

| Entry | Description |
|---|---|
| `{key}.png` | Front-facing wide-angle camera frame (640 × 360 px) |
| `{key}.json` | Metadata (see schema below) |

**Key format:** `{scene_id}__{sample_idx:02d}`

**Shard assignment:** `sha256(scene_id) % 100`. All frames of a scene land in the same shard, which prevents scene leakage across train/eval splits.

## Metadata schema (`{key}.json`)

```json
{
  "scene_id": "UUID string: identifies the driving scene",
  "chunk_name": "chunk_XXXX: source data chunk",
  "sample_idx": "int 1-18: which second within the 18-second scene",
  "global_idx": "int: globally unique datum index",
  "target_t_rel_us": "int: timestamp relative to scene start (microseconds)",
  "target_frame_index": "int: video frame index",
  "egomotion": "list[list[float]]: past trajectory [[x,y,yaw], ...], 3 entries: [-2s, -1s, 0s (anchor)]",
  "waypoints": "list[list[float]]: future trajectory [[x,y,yaw], ...], 4 entries at ~2s steps",
  "is_long_tail": "bool: long-tail driving scenario flag"
}
```

Coordinate convention: all `x`/`y`/`yaw` values are in the **ego-vehicle frame** at target time, with +x = forward, +y = left, and yaw in radians counter-clockwise (CCW) from forward.
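The key format, shard assignment, and coordinate convention described above can be illustrated with a minimal sketch. It rests on assumptions that the README does not confirm: that `scene_id` is UTF-8 encoded before hashing, and that the SHA-256 digest is read as a single big integer before taking the modulus. The `ego_to_world` helper and its `ego_pose` argument are hypothetical, since the dataset stores only ego-frame coordinates. They show what the stated axes imply if a world-frame pose were available from elsewhere.

```python
import hashlib
import math

NUM_SHARDS = 100  # the README states 100 shards


def make_key(scene_id: str, sample_idx: int) -> str:
    """Build a sample key following `{scene_id}__{sample_idx:02d}`."""
    return f"{scene_id}__{sample_idx:02d}"


def shard_for_scene(scene_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a scene to a shard as sha256(scene_id) % num_shards.

    ASSUMPTION: UTF-8 encoding and hex-digest-as-integer conversion.
    The script that built the dataset may convert the digest differently,
    so the result may not match the published shard numbers.
    """
    digest = hashlib.sha256(scene_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def ego_to_world(waypoint, ego_pose):
    """Rotate and translate an ego-frame [x, y, yaw] into a world frame.

    `ego_pose` is a HYPOTHETICAL [X, Y, heading] of the vehicle at target
    time; it is not a field of this dataset. Ego frame: +x forward,
    +y left, yaw counter-clockwise from forward (radians).
    """
    x, y, yaw = waypoint
    X, Y, heading = ego_pose
    c, s = math.cos(heading), math.sin(heading)
    return [X + c * x - s * y, Y + s * x + c * y, heading + yaw]
```

Because the shard depends only on `scene_id`, every `sample_idx` of a scene maps to the same shard, which is the property the README relies on to avoid leakage.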
## Loading

```python
import io
import json

import webdataset as wds
from PIL import Image

# Local (after cloning the repo)
ds = wds.WebDataset("shards/train-{00000..00099}-of-00100.tar").shuffle(1000)
for sample in ds:
    img = Image.open(io.BytesIO(sample["png"]))
    meta = json.loads(sample["json"])
    # meta["waypoints"] → [[x,y,yaw], ...] × 4 future steps
```

## Shard index

`index.parquet`: one row per sample, with columns `key`, `shard`, `scene_id`, `chunk_name`, `sample_idx`, `global_idx`, `target_t_rel_us`, `is_long_tail`.

```python
import pandas as pd

df = pd.read_parquet("index.parquet")
lt = df[df["is_long_tail"]]  # long-tail subset
```
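Since all frames of a scene share one shard, a train/eval split can be made at shard granularity. The following is a minimal sketch, not the authors' procedure. The helper names `split_by_shard` and `shard_paths` are hypothetical, the `shard` column is assumed to hold integer shard numbers (it may instead hold file names), and the path template is copied from the loading snippet above.

```python
import pandas as pd


def split_by_shard(index: pd.DataFrame, eval_shards: set) -> tuple:
    """Split the sample index into (train, eval) frames by shard.

    ASSUMPTION: index["shard"] holds values comparable to the members of
    `eval_shards` (for example integers 0..99).
    """
    is_eval = index["shard"].isin(eval_shards)
    return index[~is_eval], index[is_eval]


def shard_paths(shard_ids) -> list:
    """Build shard file paths using the naming seen in the loading example."""
    return [f"shards/train-{i:05d}-of-00100.tar" for i in sorted(shard_ids)]
```

With the real index, `split_by_shard(pd.read_parquet("index.parquet"), {98, 99})` would hold out two shards for evaluation, and the list returned by `shard_paths` can be passed to `wds.WebDataset` in place of the brace pattern.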

Provider:

tom-jerry-123
