tom-jerry-123/Physical-AI-AV-ES

Name: tom-jerry-123/Physical-AI-AV-ES
Creator: tom-jerry-123
Published: 2026-04-18 00:40:15
License: No description available

Hugging Face · Updated 2026-04-18 · Indexed 2026-04-26

Download link:

https://hf-mirror.com/datasets/tom-jerry-123/Physical-AI-AV-ES

Description:

---
license: other
task_categories:
- robotics
tags:
- autonomous-driving
- waypoints
- webdataset
size_categories:
- 1M<n<10M
---

# PhysicalAI-AV-SFT

Supervised fine-tuning (SFT) dataset for an autonomous-vehicle vision-language waypoint-prediction model. Contains **29,674 samples** from ~150,000 driving scenes (~18 seconds per scene, sampled at anchor times 2s..16s) recorded in the United States.

## Format

[WebDataset](https://github.com/webdataset/webdataset): 3 uncompressed `.tar` shards, each containing a pair of files per sample:

| Entry | Description |
|---|---|
| `{key}.jpg` | Front-facing wide-angle camera frame (JPEG quality 95, 640 × 360 px) |
| `{key}.json` | Metadata (see schema below) |

**Key format:** `{scene_id}__{sample_idx:02d}`

**Shard assignment:** `sha256(scene_id) % 3`, so all frames of a scene land in the same shard, preventing scene leakage across train/eval splits.

## Metadata schema (`{key}.json`)

```json
{
  "scene_id": "UUID string: identifies the driving scene",
  "chunk_name": "chunk_XXXX: source data chunk",
  "sample_idx": "int 2–16: target second within the scene (15 anchors/scene)",
  "global_idx": "int: globally unique datum index",
  "target_t_rel_us": "int: timestamp relative to scene start (microseconds)",
  "target_frame_index": "int: video frame index",
  "egomotion": "list[list[float]]: full available past trajectory (incl. anchor), target-relative, 0.25s granularity",
  "waypoints": "list[list[float]]: full available future trajectory, target-relative, 0.25s granularity",
  "is_long_tail": "bool: long-tail driving scenario flag"
}
```

Coordinate convention: all `x`/`y`/`yaw` values are in the **ego-vehicle frame** at target time, with +x = forward, +y = left, and yaw in radians CCW from forward.

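To make the coordinate convention concrete, the following is a minimal sketch of mapping a pose from the ego-vehicle frame into a fixed world frame. It rests on assumptions that the dataset card does not state: the ego pose `(ego_x, ego_y, ego_yaw)` in a world frame is a hypothetical input (it is not a metadata field), the world frame is taken to be right-handed with yaw measured CCW from world +x, and `ego_to_world` is an illustrative helper name, not part of the dataset tooling.

```python
import math


def ego_to_world(x, y, yaw, ego_x, ego_y, ego_yaw):
    """Map an ego-frame pose (+x forward, +y left, yaw CCW) into a world frame.

    (ego_x, ego_y, ego_yaw) is the vehicle's world pose at target time.
    This is a hypothetical input for illustration; the dataset stores only
    target-relative coordinates.
    """
    c, s = math.cos(ego_yaw), math.sin(ego_yaw)
    # Standard 2D rigid transform: rotate by ego_yaw, then translate.
    world_x = ego_x + c * x - s * y
    world_y = ego_y + s * x + c * y
    # Add headings and wrap the result into (-pi, pi].
    total = yaw + ego_yaw
    world_yaw = math.atan2(math.sin(total), math.cos(total))
    return world_x, world_y, world_yaw


# A waypoint 2 m ahead and 1 m to the left of a vehicle that sits at
# world position (10, 5) and faces world +y (heading of +90 degrees).
wx, wy, wyaw = ego_to_world(2.0, 1.0, 0.0, 10.0, 5.0, math.pi / 2)
```

With the vehicle facing world +y, "forward" moves the point along +y and "left" moves it along -x, so the waypoint lands at roughly (9, 7) in the world frame.
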
## Loading

```python
import io
import json

import webdataset as wds
from PIL import Image

# Local (after cloning the repo)
ds = wds.WebDataset("shards/train-{00000..00002}-of-00003.tar").shuffle(1000)

for sample in ds:
    img = Image.open(io.BytesIO(sample["jpg"]))  # raw JPEG bytes -> PIL image
    meta = json.loads(sample["json"])            # raw JSON bytes -> dict
    # meta["waypoints"] → full available future waypoints at 0.25s granularity
```

## Shard index

`index.parquet`: one row per sample, with columns `key`, `shard`, `scene_id`, `chunk_name`, `sample_idx`, `global_idx`, `target_t_rel_us`, `is_long_tail`.

```python
import pandas as pd

df = pd.read_parquet("index.parquet")
lt = df[df["is_long_tail"]]  # long-tail subset
```
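
Because every frame of a scene shares one shard, a scene-level train/eval split can be made by holding out a whole shard. The sketch below illustrates the key format and the `sha256(scene_id) % 3` rule using only the standard library. It is not the authors' code: the card does not say how the digest is turned into an integer, so reading the UTF-8 digest as a big-endian integer is an assumption, and the computed shard numbers may not match the `shard` column of `index.parquet`. The helper names `make_key` and `shard_for_scene` and the scene IDs are hypothetical.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 3


def make_key(scene_id: str, sample_idx: int) -> str:
    """Build a sample key in the documented `{scene_id}__{sample_idx:02d}` form."""
    return f"{scene_id}__{sample_idx:02d}"


def shard_for_scene(scene_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Assign a scene to a shard from the SHA-256 hash of its ID.

    Assumption: the ID is hashed as UTF-8 and the digest is read as a
    big-endian integer. The dataset card only states `sha256(scene_id) % 3`.
    """
    digest = hashlib.sha256(scene_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_shards


# Hypothetical scene IDs; the real ones are UUID strings.
scene_ids = ["scene-a", "scene-b", "scene-c"]

# Group the keys of all 15 anchors (sample_idx 2..16) by shard.
shards = defaultdict(list)
for scene_id in scene_ids:
    for sample_idx in range(2, 17):
        shards[shard_for_scene(scene_id)].append(make_key(scene_id, sample_idx))

# Hold out shard 0 for evaluation; no scene can appear on both sides.
eval_keys = shards[0]
train_keys = shards[1] + shards[2]
```

With the real data, the same hold-out can be done directly on the `shard` column of `index.parquet`, which avoids depending on the hashing assumption above.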

Provider:

tom-jerry-123
