
SeonghuJeon/droid-1.0.1-preprocessed

Source: Hugging Face. Updated 2026-04-15; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/SeonghuJeon/droid-1.0.1-preprocessed
Description:
---
license: apache-2.0
language:
- en
tags:
- robotics
- droid
- lerobot
- video
pretty_name: DROID 1.0.1 (Preprocessed Mirror)
size_categories:
- 10K<n<100K
task_categories:
- robotics
---

# DROID 1.0.1: Preprocessed Mirror

A ready-to-use mirror of [`lerobot/droid_1.0.1`](https://huggingface.co/datasets/lerobot/droid_1.0.1) with two preprocessing fixes baked in:

1. **Fixed `meta/episodes/*.parquet` timestamps.** The upstream release stores `videos/*/from_timestamp` as UNIX-epoch-drifted floats, up to ~54 years off. Any naive `av.Container.seek()` lands on garbage. We recomputed every `from_timestamp` / `to_timestamp` from actual video durations, backed up the original, and shipped the fixed parquets in this mirror.
2. **Broken-episode blacklist applied at source.** On the upstream release, 30,815 episodes (~32%) have mp4 files that are missing, truncated, or fail to decode. This mirror contains only videos that pass a full PyAV decode, and the upstream episode indices are preserved (so the JSON sidecar below still matches).

## Layout

Identical to LeRobot v3.0, drop-in compatible with any LeRobot v3 loader that reads `meta/episodes/`, `data/`, and `videos/`:

```
droid-1.0.1-preprocessed/
  meta/
    info.json
    tasks.parquet
    stats.json
    episodes/chunk-NNN/file-NNN.parquet    # FIXED timestamps
  data/chunk-NNN/file-NNN.parquet          # frame-level action/state (unchanged)
  videos/
    observation.images.exterior_2_left/chunk-NNN/file-NNN.mp4
    observation.images.exterior_2_left/chunk-NNN/file-NNN.framecache/    # optional
    observation.images.wrist_left/...
    observation.images.exterior_1_left/...
  _stats/
    droid_blacklist_eps.json               # 30,815 broken episodes
    droid_nonidle_ranges.json              # per-episode motion range
```

**Framecache sidecars** (`.framecache/` directories next to each `.mp4`) are optional JPEG-per-frame caches we built during training to skip PyAV decode. You can ignore them if you just want raw mp4 decoding.
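To make the timestamp fix in item 1 concrete, here is a minimal sketch of how per-episode `from_timestamp` / `to_timestamp` values could be rebuilt and sanity-checked. It is not the script used to build this mirror. It assumes that episodes are stored back-to-back inside each mp4 and that per-episode frame counts and the frame rate are known, and it works from frame counts rather than probing video durations directly. The names `EpisodeSpan`, `recompute_spans`, and `is_plausible`, and the 15 fps value, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class EpisodeSpan:
    episode_index: int
    from_timestamp: float  # seconds from the start of the mp4 file
    to_timestamp: float    # seconds from the start of the mp4 file


def recompute_spans(episode_frames, fps):
    """Rebuild spans for one mp4 from (episode_index, n_frames) pairs.

    Assumption: the pairs are listed in the order the episodes are
    concatenated inside the file.
    """
    spans = []
    frames_so_far = 0
    for episode_index, n_frames in episode_frames:
        start = frames_so_far / fps
        frames_so_far += n_frames
        spans.append(EpisodeSpan(episode_index, start, frames_so_far / fps))
    return spans


def is_plausible(span, video_duration, tol=1e-3):
    """A span is plausible only if it lies inside [0, video_duration]."""
    return 0.0 <= span.from_timestamp < span.to_timestamp <= video_duration + tol


# Illustrative numbers only: three episodes in one file at an assumed 15 fps.
spans = recompute_spans([(0, 150), (1, 300), (2, 75)], fps=15.0)
video_duration = spans[-1].to_timestamp

# An epoch-like offset of roughly 54 years (about 1.7e9 seconds) fails the check.
drifted = EpisodeSpan(1, 1.7e9, 1.7e9 + 20.0)
print(all(is_plausible(s, video_duration) for s in spans))
print(is_plausible(drifted, video_duration))
```

A range check like `is_plausible` is also a cheap way to detect whether a given copy of `meta/episodes/` carries the drifted upstream values before attempting any seek.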
## How to download

**Everything (~1.2 TB)**:

```bash
HF_XET_HIGH_PERFORMANCE=1 \
hf download SeonghuJeon/droid-1.0.1-preprocessed \
  --repo-type dataset \
  --local-dir /path/to/droid-1.0.1-preprocessed
```

**Video-only, skip framecache** (substantially smaller):

```bash
hf download SeonghuJeon/droid-1.0.1-preprocessed \
  --repo-type dataset \
  --local-dir /path/to/droid-1.0.1-preprocessed \
  --include "meta/*" "data/*" "videos/**/*.mp4" "_stats/*"
```

**Only one camera** (e.g. third-person `exterior_2_left`):

```bash
hf download SeonghuJeon/droid-1.0.1-preprocessed \
  --repo-type dataset \
  --local-dir /path/to/droid-1.0.1-preprocessed \
  --include "meta/*" "data/*" \
    "videos/observation.images.exterior_2_left/**/*.mp4" \
    "_stats/*"
```

## Companion loader

For RGB-only PyTorch training, use the standalone loader we ship at [`SeonghuJeon/droid-rgb-loader`](https://huggingface.co/datasets/SeonghuJeon/droid-rgb-loader). It expects this mirror as the `root` and reads the blacklist / non-idle sidecars automatically.

## Camera guidance

- `observation.images.exterior_2_left`: primary third-person view, **use this**.
- `observation.images.wrist_left`: wrist-mounted, **use this**.
- `observation.images.exterior_1_left`: mis-calibrated or occluded on a large fraction of episodes. **Blocked in all our runs.**

## Citation

DROID upstream: https://droid-dataset.github.io/

If this mirror saves you time, a ⭐ on the repo page is appreciated.
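As a small aid for combining the download commands with the camera guidance above, the sketch below builds `--include` patterns for the two recommended cameras and checks a few paths against them. It assumes the patterns are matched with fnmatch-style semantics, where `*` also spans `/`; that is an assumption about the downloader, not something stated in this README. The helper names and the example paths (`chunk-000`, `file-000`, `000000.jpg`) are hypothetical.

```python
from fnmatch import fnmatchcase

# Cameras recommended in the camera guidance; exterior_1_left is left out.
RECOMMENDED_CAMERAS = ("exterior_2_left", "wrist_left")


def include_patterns(cameras=RECOMMENDED_CAMERAS):
    """Build include patterns: metadata, frame data, stats, and mp4s per camera."""
    patterns = ["meta/*", "data/*", "_stats/*"]
    for cam in cameras:
        patterns.append(f"videos/observation.images.{cam}/**/*.mp4")
    return patterns


def is_selected(path, patterns):
    """True if a repo-relative path matches any pattern (fnmatch semantics)."""
    return any(fnmatchcase(path, p) for p in patterns)


patterns = include_patterns()

# Print the patterns quoted, ready to paste after `--include`.
print(" ".join(f'"{p}"' for p in patterns))

# Hypothetical paths: an mp4 is selected, a framecache JPEG is not.
print(is_selected("videos/observation.images.wrist_left/chunk-000/file-000.mp4", patterns))
print(is_selected("videos/observation.images.wrist_left/chunk-000/file-000.framecache/000000.jpg", patterns))
```

Restricting the video patterns to `*.mp4` is what keeps the optional `.framecache/` directories out of the selection.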
Provider:
SeonghuJeon