myconnects/robocasa-pretrain
Source: Hugging Face · Updated 2026-04-08 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/myconnects/robocasa-pretrain
Description:
---
dataset_info:
- config_name: robocasa_pretrain_mimicgen
  features:
  - name: id
    dtype: string
  - name: task
    dtype: string
  - name: lang_vector
    sequence: float32
  - name: data_source
    dtype: string
  - name: frames
    dtype: string
  - name: is_robot
    dtype: bool
  - name: quality_label
    dtype: string
  - name: partial_success
    dtype: float32
  splits:
  - name: train
    num_bytes: 927797699
    num_examples: 536030
  download_size: 110755897
  dataset_size: 927797699
- config_name: robocasa_pretrain_human
  features:
  - name: id
    dtype: string
  - name: task
    dtype: string
  - name: lang_vector
    sequence: float32
  - name: data_source
    dtype: string
  - name: frames
    dtype: string
  - name: is_robot
    dtype: bool
  - name: quality_label
    dtype: string
  - name: partial_success
    dtype: float32
  splits:
  - name: train
    num_bytes: 56591559
    num_examples: 32043
  download_size: 17899559
  dataset_size: 56591559
configs:
- config_name: robocasa_pretrain_mimicgen
  data_files:
  - split: train
    path: robocasa_pretrain_mimicgen/train-*
- config_name: robocasa_pretrain_human
  data_files:
  - split: train
    path: robocasa_pretrain_human/train-*
---
# RoboCasa Pretraining Dataset
A video trajectory dataset for robot manipulation pretraining in the [RoboCasa](https://robocasa.ai) kitchen simulation environment. It contains two configurations, each with a single `train` split: synthetically generated MimicGen demonstrations and human-teleoperated demonstrations.
## Dataset Summary
| Config | Examples | Description |
|-------|----------|-------------|
| `robocasa_pretrain_mimicgen` | 536,030 | MimicGen-generated trajectories |
| `robocasa_pretrain_human` | 32,043 | Human-teleoperated trajectories |
| **Total** | **568,073** | |
Each example corresponds to one robot manipulation trajectory rendered as an MP4 video clip, paired with structured metadata.
## Data Structure
Each configuration consists of a Parquet index plus the video files it references, which are stored as `trajectory_XXXX.mp4` in batch subdirectories.
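As a minimal sketch of how an index row could be mapped to a video file on disk, the helper below joins a local copy of the repository with the path stored in the `frames` field. It assumes that `frames` holds a path relative to the repository root, which this card does not state; the function name `resolve_video_path` and the `batch_0000` directory name in the usage note are hypothetical.

```python
from pathlib import Path, PurePosixPath


def resolve_video_path(local_root: str, frames: str) -> Path:
    """Join a local dataset root with a `frames` value.

    Assumption: `frames` is a POSIX-style path relative to the repository
    root. Adjust this helper if the stored paths turn out to be different.
    """
    rel = PurePosixPath(frames)
    # Reject paths that could point outside the local dataset root.
    if rel.is_absolute() or ".." in rel.parts:
        raise ValueError(f"unexpected frames path: {frames!r}")
    # The card says every trajectory is stored as an MP4 clip.
    if rel.suffix.lower() != ".mp4":
        raise ValueError(f"not an MP4 path: {frames!r}")
    return Path(local_root).joinpath(*rel.parts)
```

For example, `resolve_video_path("/data/robocasa", "robocasa_pretrain_human/batch_0000/trajectory_0001.mp4")` would return the corresponding path under `/data/robocasa`.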
### Fields
| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Unique trajectory identifier |
| `task` | string | Task name (e.g., kitchen manipulation task) |
| `lang_vector` | float32[] | Language embedding vector for the task instruction |
| `data_source` | string | Data collection method (`mimicgen` or `human`) |
| `frames` | string | Path to the corresponding MP4 video file |
| `is_robot` | bool | Whether the trajectory was executed by a robot |
| `quality_label` | string | Trajectory quality annotation |
| `partial_success` | float32 | Partial task completion score in the range [0, 1] |
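The fields above can be mirrored in a small typed record, which makes schema drift easier to catch when reading rows. This is only a sketch: `TrajectoryRecord` and `parse_record` are hypothetical names, and every value in `example_row` is an invented placeholder rather than real data from the dataset.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Sequence


@dataclass(frozen=True)
class TrajectoryRecord:
    """One row of the Parquet index, mirroring the documented fields."""
    id: str
    task: str
    lang_vector: Sequence[float]
    data_source: str
    frames: str
    is_robot: bool
    quality_label: str
    partial_success: float


def parse_record(row: Mapping[str, Any]) -> TrajectoryRecord:
    """Convert a row dict into a TrajectoryRecord and check the score range."""
    record = TrajectoryRecord(
        id=str(row["id"]),
        task=str(row["task"]),
        lang_vector=[float(x) for x in row["lang_vector"]],
        data_source=str(row["data_source"]),
        frames=str(row["frames"]),
        is_robot=bool(row["is_robot"]),
        quality_label=str(row["quality_label"]),
        partial_success=float(row["partial_success"]),
    )
    # The card documents partial_success as a score in [0, 1].
    if not 0.0 <= record.partial_success <= 1.0:
        raise ValueError(f"partial_success out of range: {record.partial_success}")
    return record


# Hypothetical row: all values are placeholders for illustration only.
example_row = {
    "id": "example-0001",
    "task": "example_task",
    "lang_vector": [0.1, -0.2, 0.3],
    "data_source": "human",
    "frames": "batch_0000/trajectory_0001.mp4",
    "is_robot": True,
    "quality_label": "example_label",
    "partial_success": 0.5,
}
record = parse_record(example_row)
```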
## Usage
```python
from datasets import load_dataset

# Load the MimicGen config (returns a DatasetDict with a single "train" split)
ds_mim = load_dataset("myconnects/robocasa-pretrain", "robocasa_pretrain_mimicgen")

# Load the human-teleoperated config
ds_hum = load_dataset("myconnects/robocasa-pretrain", "robocasa_pretrain_human")
```
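One plausible follow-up is to keep only trajectories above a chosen `partial_success` threshold without materializing the whole index. The sketch below separates the filtering logic, which works on any iterable of row dicts, from the loading step, which uses the `streaming=True` option of `load_dataset`. The helper names `filter_by_partial_success` and `stream_config` are hypothetical, and the default threshold of 1.0 is an arbitrary choice rather than a recommendation from the dataset authors.

```python
from typing import Any, Iterable, Iterator, Mapping


def filter_by_partial_success(
    rows: Iterable[Mapping[str, Any]], min_score: float = 1.0
) -> Iterator[Mapping[str, Any]]:
    """Yield only the rows whose partial_success is at least min_score."""
    for row in rows:
        if row["partial_success"] >= min_score:
            yield row


def stream_config(config_name: str):
    """Stream the train split of one config (needs network access and `datasets`)."""
    from datasets import load_dataset  # imported lazily so the filter stays dependency-free

    return load_dataset(
        "myconnects/robocasa-pretrain",
        config_name,
        split="train",
        streaming=True,
    )
```

A typical call would be `filter_by_partial_success(stream_config("robocasa_pretrain_human"), min_score=0.5)`, which iterates lazily over the human-teleoperated rows.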
## About RoboCasa
[RoboCasa](https://robocasa.ai) is a large-scale simulation framework for training generalist robot policies in diverse kitchen environments. It provides photorealistic kitchen scenes with a wide variety of objects and tasks, designed for pretraining visuomotor policies.
## Citation
If you use this dataset, please cite the RoboCasa paper:
```bibtex
@inproceedings{robocasa2024,
  title = {RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots},
  author = {Soroush Nasiriany and Abhiram Maddukuri and Lance Zhang and Adeet Parikh and Aaron Lo and Abhishek Joshi and Ajay Mandlekar and Yuke Zhu},
  booktitle = {Robotics: Science and Systems},
  year = {2024}
}
```
Provider:
myconnects



