ASethi04/droid-trajectories-5k
Hugging Face | Updated 2026-03-30 | Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/ASethi04/droid-trajectories-5k
Resource description:
---
license: apache-2.0
task_categories:
- robotics
tags:
- droid
- trajectory
- manipulation
- gemini-er
- sam2
- lerobot
- LeRobot
size_categories:
- 1K<n<10K
---
# DROID Visual Trajectories (5K Episodes) — LeRobot Format
Visual object trajectories extracted from 4506 episodes of the
[DROID dataset](https://huggingface.co/datasets/cadene/droid_1.0.1) using
**Gemini Robotics-ER** + **SAM2**.
## How to load images/videos
Images are **not embedded** in this dataset. Each row contains reference paths
to the source videos in `cadene/droid_1.0.1`:
```python
from datasets import load_dataset
from huggingface_hub import hf_hub_download

ds = load_dataset("ASethi04/droid-trajectories-5k", split="train")
row = ds[0]

# Get the source video
video_path = hf_hub_download(
    "cadene/droid_1.0.1",
    row["droid_video_exterior_1_left"],  # e.g. "videos/chunk-000/observation.images.exterior_1_left/episode_000008.mp4"
    repo_type="dataset",
)

# Or load the full source parquet (with all DROID columns)
droid_ep_id = row["droid_episode_id"]  # e.g. 8
chunk = droid_ep_id // 1000
source_parquet = hf_hub_download(
    "cadene/droid_1.0.1",
    f"data/chunk-{chunk:03d}/episode_{droid_ep_id:06d}.parquet",
    repo_type="dataset",
)
```
## Stats
- **Episodes**: 4506
- **Total Frames**: 943315
- **Unique Tasks**: 1429
- **Cameras**: exterior_1_left, exterior_2_left, wrist_left (referenced by path, not embedded)
- **FPS**: 15
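The figures above imply a typical episode length and a total recorded duration. The following is a small arithmetic sketch using only the numbers listed in this section; the variable names are illustrative.

```python
# Figures copied from the Stats list above.
EPISODES = 4506
TOTAL_FRAMES = 943315
FPS = 15

# Average number of frames (rows) per episode.
avg_frames_per_episode = TOTAL_FRAMES / EPISODES

# Average episode duration in seconds at the stated frame rate.
avg_seconds_per_episode = avg_frames_per_episode / FPS

# Total recorded time covered by all rows, in hours.
total_hours = TOTAL_FRAMES / FPS / 3600

print(f"{avg_frames_per_episode:.1f} frames per episode on average")
print(f"{avg_seconds_per_episode:.1f} s per episode on average")
print(f"{total_hours:.1f} h of recorded time in total")
```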
## Schema
### Trajectory columns (from Gemini ER + SAM2 pipeline)
| Column | Type | Description |
|--------|------|-------------|
| `trajectory_xyxy` | float32[4] | Tracked object bbox [x1, y1, x2, y2] in pixels |
| `trajectory_center_x` | float32 | Object mask centroid X (pixels) |
| `trajectory_center_y` | float32 | Object mask centroid Y (pixels) |
| `trajectory_labels` | list[string] | Object labels being tracked |
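As a minimal sketch of how these columns relate, the helpers below derive the box size and box center from `trajectory_xyxy` and compare that center with the mask centroid columns. The two generally differ, because the centroid depends on the mask shape. The function names and the example row are hypothetical and the values are made up.

```python
def bbox_size_and_center(xyxy):
    """Return (width, height, center_x, center_y) of an [x1, y1, x2, y2] box."""
    x1, y1, x2, y2 = xyxy
    return x2 - x1, y2 - y1, (x1 + x2) / 2.0, (y1 + y2) / 2.0


def centroid_offset(row):
    """Offset (dx, dy) of the mask centroid from the bbox center, in pixels."""
    _, _, cx, cy = bbox_size_and_center(row["trajectory_xyxy"])
    return row["trajectory_center_x"] - cx, row["trajectory_center_y"] - cy


# Hypothetical row with made-up values, only to show the shapes involved.
example_row = {
    "trajectory_xyxy": [100.0, 50.0, 140.0, 90.0],
    "trajectory_center_x": 118.0,
    "trajectory_center_y": 72.0,
    "trajectory_labels": ["cup"],
}

width, height, cx, cy = bbox_size_and_center(example_row["trajectory_xyxy"])
dx, dy = centroid_offset(example_row)
```

With a loaded dataset, the same helpers could be applied to `ds[i]` instead of `example_row`.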
### Robot state (from cadene/droid_1.0.1)
| Column | Type | Description |
|--------|------|-------------|
| `joint_position` | float32[7] | 7-DOF joint angles |
| `gripper_position` | float32 | Gripper state |
| `actions` | float32[8] | Action commands (7 joints + gripper) |
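The `actions` vector packs joint and gripper commands together. A minimal sketch for splitting it is shown below. It assumes the first 7 entries are the joint commands and the last entry is the gripper command, which matches the order in the description but is not confirmed by the card. The function name and the example values are hypothetical.

```python
def split_action(action):
    """Split an 8-element action into (joint_part, gripper_part)."""
    if len(action) != 8:
        raise ValueError(f"expected 8 values, got {len(action)}")
    # Assumption: the first 7 entries are joints and the last is the gripper.
    return list(action[:7]), action[7]


# Made-up action vector for illustration only.
example_action = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, 1.0]
joint_cmd, gripper_cmd = split_action(example_action)
```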
### References to source data
| Column | Type | Description |
|--------|------|-------------|
| `droid_episode_id` | int64 | Original episode ID in cadene/droid_1.0.1 |
| `droid_video_exterior_1_left` | string | Path to exterior_1_left video in cadene/droid_1.0.1 |
| `droid_video_exterior_2_left` | string | Path to exterior_2_left video in cadene/droid_1.0.1 |
| `droid_video_wrist_left` | string | Path to wrist_left video in cadene/droid_1.0.1 |
### Metadata
| Column | Type | Description |
|--------|------|-------------|
| `language_instruction` | string | Task description |
| `timestamp` | float32 | Frame timestamp (seconds) |
| `frame_index` | int64 | Frame number within episode |
| `episode_index` | int64 | Sequential episode number (0-indexed) |
| `index` | int64 | Global row index |
| `task_index` | int64 | Task ID (see meta/tasks.jsonl) |
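Because each row is a single frame, per-episode trajectories have to be assembled by grouping on `episode_index` and ordering by `frame_index`. The sketch below does this on plain dictionaries; the helper names and the sample rows are hypothetical. The `expected_timestamp` helper assumes that `timestamp` equals `frame_index / 15`, which follows from the stated FPS but should be checked against the actual data.

```python
from collections import defaultdict

FPS = 15  # from the Stats section


def group_by_episode(rows):
    """Group frame rows by episode_index, each group sorted by frame_index."""
    episodes = defaultdict(list)
    for row in rows:
        episodes[row["episode_index"]].append(row)
    for frames in episodes.values():
        frames.sort(key=lambda r: r["frame_index"])
    return dict(episodes)


def expected_timestamp(frame_index, fps=FPS):
    """Timestamp in seconds under the assumption timestamp = frame_index / fps."""
    return frame_index / fps


# Made-up rows, deliberately out of order.
sample_rows = [
    {"episode_index": 1, "frame_index": 0, "index": 3},
    {"episode_index": 0, "frame_index": 2, "index": 2},
    {"episode_index": 0, "frame_index": 0, "index": 0},
    {"episode_index": 0, "frame_index": 1, "index": 1},
]
grouped = group_by_episode(sample_rows)
```

The same function accepts any iterable of row dictionaries, so iterating over a loaded `datasets` split should also work, at the cost of holding all rows in memory.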
## Trajectory Pipeline
1. **Object Extraction**: Parse task instruction to identify primary + related objects
2. **Detection**: Gemini Robotics-ER (`gemini-robotics-er-1.5-preview`) with task-aware bounding box prompts
3. **Tracking**: SAM2 video predictor with periodic re-detection every 60 frames
4. **Trajectory**: Per-frame mask centroid and bounding box
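Steps 3 and 4 can be illustrated with a small, dependency-free sketch. It is not the pipeline's actual code: the function names are hypothetical, the mask is a plain nested list rather than a SAM2 output, the box uses inclusive pixel indices, and re-detection is assumed to start at frame 0.

```python
def mask_centroid_and_bbox(mask):
    """Return ((cx, cy), [x1, y1, x2, y2]) for a 2D 0/1 mask, or None if empty."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # object not visible in this frame
    centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
    bbox = [min(xs), min(ys), max(xs), max(ys)]
    return centroid, bbox


def redetection_frames(num_frames, interval=60):
    """Frame indices at which the detector would be re-run (assumed to start at 0)."""
    return list(range(0, num_frames, interval))


# Tiny made-up mask: three foreground pixels at (1, 1), (2, 1) and (1, 2).
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
centroid, bbox = mask_centroid_and_bbox(mask)
schedule = redetection_frames(150)
```

Returning `None` for an empty mask is one possible way to handle frames where the object is not visible; the card does not say how the dataset encodes such frames.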
## Source
- Robot data & videos: [cadene/droid_1.0.1](https://huggingface.co/datasets/cadene/droid_1.0.1)
- Format inspired by: [brandonyang/droid_120](https://huggingface.co/datasets/brandonyang/droid_120)
Provided by:
ASethi04



