SeonghuJeon/droid-1.0.1-preprocessed
Source: Hugging Face. Updated 2026-04-15; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/SeonghuJeon/droid-1.0.1-preprocessed
Resource description:
---
license: apache-2.0
language:
- en
tags:
- robotics
- droid
- lerobot
- video
pretty_name: DROID 1.0.1 (Preprocessed Mirror)
size_categories:
- 10K<n<100K
task_categories:
- robotics
---
# DROID 1.0.1 — Preprocessed Mirror
A ready-to-use mirror of [`lerobot/droid_1.0.1`](https://huggingface.co/datasets/lerobot/droid_1.0.1)
with two preprocessing fixes baked in:
1. **Fixed `meta/episodes/*.parquet` timestamps.** The upstream release
stores `videos/*/from_timestamp` as floats that have drifted by a UNIX-epoch
offset, up to ~54 years off, so any naive `av.Container.seek()` lands on
garbage. We recomputed every `from_timestamp` / `to_timestamp` from actual
video durations, backed up the originals, and ship the fixed parquets in
this mirror.
2. **Broken-episode blacklist applied at source.** On the upstream release,
30,815 episodes (~32%) have mp4 files that are missing, truncated, or
fail to decode. This mirror contains only videos that pass a full PyAV
decode. Upstream episode indices are preserved, so the blacklist JSON
sidecar listed below still matches.
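To make the first fix concrete: 54 years is roughly 1.7e9 seconds, so a drifted `from_timestamp` points far past the end of any mp4. The sketch below shows one way to sanity-check a timestamp pair and turn a corrected `from_timestamp` into a PyAV seek offset. The helper names (`seconds_to_pts`, `is_plausible`, `seek_to_episode`) and the 0.05 s tolerance are illustrative assumptions, not part of this mirror or of LeRobot.

```python
from fractions import Fraction


def seconds_to_pts(seconds: float, time_base: Fraction) -> int:
    """Convert seconds to an integer offset in the stream's time_base units."""
    return round(Fraction(seconds) / time_base)


def is_plausible(from_ts: float, to_ts: float, file_duration: float,
                 tol: float = 0.05) -> bool:
    """True if the [from_ts, to_ts] window fits inside a file of the given duration.

    `tol` (seconds) is an arbitrary slack for container rounding; an assumption.
    """
    return 0.0 <= from_ts < to_ts <= file_duration + tol


def seek_to_episode(mp4_path: str, from_ts: float):
    """Open an mp4 and seek near from_ts (hypothetical helper, requires PyAV)."""
    import av  # imported lazily so the pure helpers above work without PyAV

    container = av.open(mp4_path)
    stream = container.streams.video[0]
    # With stream= given, the offset is expressed in that stream's time_base.
    container.seek(seconds_to_pts(from_ts, stream.time_base), stream=stream)
    return container, stream
```

A seek lands on a keyframe at or before the requested position, so a caller would still decode forward and drop frames earlier than `from_ts`.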
## Layout
The layout is identical to LeRobot v3.0 and drop-in compatible with any
LeRobot v3 loader that reads `meta/episodes/`, `data/`, and `videos/`:
```
droid-1.0.1-preprocessed/
  meta/
    info.json
    tasks.parquet
    stats.json
    episodes/chunk-NNN/file-NNN.parquet   # FIXED timestamps
  data/chunk-NNN/file-NNN.parquet         # frame-level action/state (unchanged)
  videos/
    observation.images.exterior_2_left/chunk-NNN/file-NNN.mp4
    observation.images.exterior_2_left/chunk-NNN/file-NNN.framecache/  # optional
    observation.images.wrist_left/...
    observation.images.exterior_1_left/...
  _stats/
    droid_blacklist_eps.json              # 30,815 broken episodes
    droid_nonidle_ranges.json             # per-episode motion range
```
**Framecache sidecars** (`.framecache/` directories next to each `.mp4`) are
optional JPEG-per-frame caches we built during training to skip PyAV decode.
You can ignore them if you just want raw mp4 decoding.
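As a rough illustration of the layout above, the following sketch enumerates the mp4 files for one camera and checks whether a framecache sidecar sits next to each. It relies only on the `chunk-NNN/file-NNN` naming shown in the tree; the function names are hypothetical, and nothing is assumed about what is inside a `.framecache/` directory.

```python
from pathlib import Path


def list_camera_videos(root, camera):
    """Return the sorted mp4 paths under <root>/videos/<camera>/."""
    cam_dir = Path(root) / "videos" / camera
    return sorted(cam_dir.glob("chunk-*/file-*.mp4"))


def framecache_dir(mp4_path):
    """Return the sibling file-NNN.framecache directory, or None if absent."""
    candidate = Path(mp4_path).with_suffix(".framecache")
    return candidate if candidate.is_dir() else None
```

For example, `list_camera_videos(root, "observation.images.wrist_left")` would yield the wrist-camera files, and `framecache_dir(path) is None` indicates that the mp4 has to be decoded directly.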
## How to download
**Everything (~1.2 TB)**:
```bash
HF_XET_HIGH_PERFORMANCE=1 \
hf download SeonghuJeon/droid-1.0.1-preprocessed \
--repo-type dataset \
--local-dir /path/to/droid-1.0.1-preprocessed
```
**Video-only, skip framecache** (substantially smaller):
```bash
hf download SeonghuJeon/droid-1.0.1-preprocessed \
--repo-type dataset \
--local-dir /path/to/droid-1.0.1-preprocessed \
--include "meta/*" "data/*" "videos/**/*.mp4" "_stats/*"
```
**Only one camera** (e.g. third-person `exterior_2_left`):
```bash
hf download SeonghuJeon/droid-1.0.1-preprocessed \
--repo-type dataset \
--local-dir /path/to/droid-1.0.1-preprocessed \
--include "meta/*" "data/*" \
"videos/observation.images.exterior_2_left/**/*.mp4" \
"_stats/*"
```
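The same selective download can also be scripted from Python. The sketch below is a minimal version, assuming the `huggingface_hub` package is installed; `build_allow_patterns`, `would_download`, and `download_subset` are hypothetical helper names, and the local `fnmatch` check is only an approximation of how the hub client filters paths.

```python
from fnmatch import fnmatchcase

REPO_ID = "SeonghuJeon/droid-1.0.1-preprocessed"

# The two cameras recommended in this card.
RECOMMENDED_CAMERAS = (
    "observation.images.exterior_2_left",
    "observation.images.wrist_left",
)


def build_allow_patterns(cameras=RECOMMENDED_CAMERAS):
    """Metadata, frame data, sidecars, and mp4s (no framecache) for `cameras`."""
    patterns = ["meta/*", "data/*", "_stats/*"]
    patterns += [f"videos/{cam}/**/*.mp4" for cam in cameras]
    return patterns


def would_download(repo_path, patterns):
    """Local dry-run check: does a repo-relative path match any pattern?"""
    return any(fnmatchcase(repo_path, p) for p in patterns)


def download_subset(local_dir, cameras=RECOMMENDED_CAMERAS):
    """Download the selected subset (needs network access and huggingface_hub)."""
    from huggingface_hub import snapshot_download  # lazy import

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        local_dir=local_dir,
        allow_patterns=build_allow_patterns(cameras),
    )
```

Running `would_download` on a few sample paths before starting a multi-hundred-gigabyte transfer is a cheap way to confirm that framecache files and unwanted cameras are excluded.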
## Companion loader
For RGB-only PyTorch training, use the standalone loader we ship at
[`SeonghuJeon/droid-rgb-loader`](https://huggingface.co/datasets/SeonghuJeon/droid-rgb-loader).
It expects this mirror as the `root` and reads the blacklist / non-idle
sidecars automatically.
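If you write your own loader instead, the blacklist sidecar can be applied by hand. This card does not document the JSON schema, so the sketch below assumes `_stats/droid_blacklist_eps.json` is a flat JSON list of integer episode indices (a JSON object keyed by index would also pass through this code). Inspect the file before relying on it; both function names are hypothetical.

```python
import json
from pathlib import Path


def load_blacklist(root):
    """Load blacklisted episode indices as a set of ints.

    ASSUMPTION: the file is a flat JSON list of episode indices (or an
    object whose keys are indices). The real schema is not confirmed here.
    """
    path = Path(root) / "_stats" / "droid_blacklist_eps.json"
    with path.open() as f:
        payload = json.load(f)
    return {int(ep) for ep in payload}


def keep_valid(episode_indices, blacklist):
    """Drop blacklisted episodes while preserving the original order."""
    return [ep for ep in episode_indices if ep not in blacklist]
```

Because upstream episode indices are preserved in this mirror, the indices filtered here are the same ones used in `meta/episodes/`.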
## Camera guidance
- `observation.images.exterior_2_left` — primary third-person view, **use
this**.
- `observation.images.wrist_left` — wrist-mounted, **use this**.
- `observation.images.exterior_1_left` — mis-calibrated or occluded on a
large fraction of episodes. **Blocked in all our runs.**
## Citation
DROID upstream: https://droid-dataset.github.io/
If this mirror saves you time, a ⭐ on the repo page is appreciated.
Provider:
SeonghuJeon



