Bingqiii/LIBERO-CrossView-Pairs

Name: Bingqiii/LIBERO-CrossView-Pairs
Creator: Bingqiii
Published: 2026-05-25 10:15:00
License: No description available

Source: Hugging Face. Updated 2026-05-25; indexed 2026-05-31.

Download link:

https://hf-mirror.com/datasets/Bingqiii/LIBERO-CrossView-Pairs

Resource description:

---
license: mit
pretty_name: LIBERO-CrossView-Pairs
task_categories:
- robotics
tags:
- libero
- libero-plus
- lerobot
- robotics
- vision-language-action
- multiview
- camera-robustness
- cross-view
size_categories:
- 100K<n<1M
---

# LIBERO-CrossView-Pairs

LIBERO-CrossView-Pairs is a same-state paired-view dataset for training camera-robust vision-language-action policies on LIBERO. Each row contains two scene-camera observations of the exact same simulator state: a nominal LIBERO scene view and one camera-perturbed view. The paired images share the same robot state, language instruction, action target, episode index, frame index, and MuJoCo state; only the scene-camera extrinsics differ.

This dataset was created for cross-view action consistency training in the paper project "Cross-View Action Consistency for Camera-Robust Vision-Language-Action Policies". It is scene-camera-only by design: wrist-camera images are excluded.

## Dataset Summary

| Item | Value |
|---|---:|
| Format | LeRobot v2.0, parquet image dataset |
| Robot | Franka Panda |
| FPS | 10 |
| Episodes | 2,000 |
| Frames / paired samples | 338,575 |
| Tasks | 40 |
| Suites | `libero_spatial`, `libero_object`, `libero_goal`, `libero_10` |
| Image resolution | 256 x 256 RGB |
| Train split | episodes `0:1800`, 304,664 pairs |
| Val split | episodes `1800:2000`, 33,911 pairs |
| Camera categories | C1 distance, C2 spherical position, C3 orientation |

Each of the 40 tasks has 50 episodes. The train/val split is episode-level: the first 45 demos per task are train, and the last 5 demos per task are validation.
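
As a rough illustration of the episode-level split, the sketch below maps a global episode index to its split and re-checks the arithmetic in the summary table. It assumes the ranges `0:1800` and `1800:2000` are half-open, like Python slices; `split_of` is a hypothetical helper name, not part of the dataset or of LeRobot.

```python
# Assumption: the split ranges are half-open, like Python slices.
TRAIN_EPISODES = range(0, 1800)
VAL_EPISODES = range(1800, 2000)


def split_of(episode_index: int) -> str:
    """Return "train" or "val" for a global episode index (hypothetical helper)."""
    if episode_index in TRAIN_EPISODES:
        return "train"
    if episode_index in VAL_EPISODES:
        return "val"
    raise ValueError(f"episode_index {episode_index} is outside 0..1999")


# Sanity checks against the numbers in the summary table.
assert len(TRAIN_EPISODES) == 40 * 45   # 45 train demos per task
assert len(VAL_EPISODES) == 40 * 5      # 5 validation demos per task
assert 304_664 + 33_911 == 338_575      # train pairs + val pairs = total pairs
```

Because the split is by episode, every frame of an episode falls in the same split, so no validation frame shares an episode with a training frame.
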

## Data Fields

Each parquet row has:

| Field | Type | Description |
|---|---|---|
| `observation.images.front` | image, 256 x 256 x 3 | Nominal scene-camera RGB image |
| `observation.images.perturbed` | image, 256 x 256 x 3 | Perturbed scene-camera RGB image from the same simulator state |
| `observation.state` | float32[8] | End-effector position, axis-angle orientation, and gripper state |
| `action` | float32[7] | LIBERO 7-DoF action target |
| `timestamp` | float32 | `frame_index / 10` |
| `frame_index` | int64 | Frame index within the episode |
| `episode_index` | int64 | Global episode id |
| `index` | int64 | Global frame id |
| `task_index` | int64 | Task id from `meta/tasks.jsonl` |

The LeRobot metadata is stored under `meta/`:

- `meta/info.json`
- `meta/episodes.jsonl`
- `meta/tasks.jsonl`

Images are stored as inline PNG/image bytes inside parquet files (`total_videos=0`), not as external mp4 videos.

## Pair Semantics

For every paired sample:

- `observation.images.front` is the nominal scene-camera view.
- `observation.images.perturbed` is a C1, C2, or C3 scene-camera perturbation.
- Both images are rendered from the same original LIBERO HDF5 demo and timestep.
- The simulator is reset to the same flattened MuJoCo state before rendering each view.
- The robot state, object poses, action target, and language instruction are identical across the pair.
- Wrist-camera observations are not included.

The camera perturbation categories follow the LIBERO-Plus camera-view perturbation definitions:

- C1: distance perturbation by changing camera scale, with nominal orientation.
- C2: spherical position perturbation by changing camera azimuth and/or elevation.
- C3: orientation perturbation by changing camera roll and/or pitch at nominal position.

The training category mix follows the LIBERO-Plus 4-suite camera evaluation distribution, approximately C1/C2/C3 = 19.6% / 61.9% / 18.5%.
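
To make the category mix concrete, here is a minimal sketch of drawing a perturbation category with those approximate weights. It is not the authors' sampler: `CATEGORY_MIX` and `sample_category` are hypothetical names, and the per-category perturbation parameters (scale, azimuth, elevation, roll, pitch) are not given here, so they are left out.

```python
import random

# Approximate C1/C2/C3 mix quoted above, in percent.
CATEGORY_MIX = {"C1": 19.6, "C2": 61.9, "C3": 18.5}


def sample_category(rng: random.Random) -> str:
    """Draw one perturbation category according to CATEGORY_MIX (hypothetical helper)."""
    names = list(CATEGORY_MIX)
    weights = list(CATEGORY_MIX.values())
    return rng.choices(names, weights=weights, k=1)[0]


# Draw many samples with a fixed seed and tally how often each category appears.
rng = random.Random(0)
counts = {name: 0 for name in CATEGORY_MIX}
for _ in range(10_000):
    counts[sample_category(rng)] += 1
```

With enough draws, the empirical fractions in `counts` should approach the quoted percentages, with C2 making up the majority of perturbed views.
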

## Construction

The dataset was generated from the original LIBERO HDF5 demonstrations, not from policy rollouts. For each selected timestep:

1. Load the original LIBERO HDF5 demo state, action, robot state, and language instruction.
2. Reset the LIBERO simulator to the exact flattened MuJoCo state for that timestep.
3. Render the nominal scene-camera view.
4. Modify only the scene-camera extrinsics according to a sampled C1/C2/C3 perturbation.
5. Render the perturbed scene-camera view.
6. Store both views and the shared state/action metadata as one LeRobot parquet row.

The source project used `scripts/v4/phase0A/render_libero_multiview_states.py` to build same-state manifests and `scripts/v4/phase0A/export_to_lerobot.py` to export the LeRobot dataset.

## Integrity Check

The uploaded folder was audited on 2026-05-25 before release:

- 2,000 expected parquet files found.
- 2,000 `episodes.jsonl` rows and 40 `tasks.jsonl` rows found.
- Total parquet rows: 338,575, matching `meta/info.json`.
- Global `index` is continuous from 0 to 338,574.
- `frame_index`, `episode_index`, `task_index`, and `timestamp` are internally consistent.
- All state/action values are finite and have the expected dimensions.
- Both image columns have non-empty PNG/image bytes for every row.
- 12,000 sampled images were decoded successfully: first/middle/last frame for both views in every episode.

No integrity errors or warnings were found.
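
A few of these checks can be reproduced on already-loaded rows with a sketch like the one below. It is not the audit script used for the release: `check_rows` is a hypothetical helper, and it assumes the rows are plain dictionaries keyed by the documented field names and passed in global `index` order starting at 0. The timestamp comparison uses a small tolerance because `timestamp` is stored as float32.

```python
import math

FPS = 10          # from the dataset summary
STATE_DIM = 8     # observation.state is float32[8]
ACTION_DIM = 7    # action is float32[7]


def check_rows(rows):
    """Return a list of problem descriptions; an empty list means every check passed."""
    problems = []
    for expected_index, row in enumerate(rows):
        # The global index should be continuous, starting at 0.
        if row["index"] != expected_index:
            problems.append(f"row {expected_index}: index is {row['index']}")
        # The timestamp should equal frame_index / FPS, up to float32 rounding.
        if not math.isclose(row["timestamp"], row["frame_index"] / FPS, abs_tol=1e-4):
            problems.append(f"row {expected_index}: timestamp does not match frame_index")
        # State and action should have the expected length and only finite values.
        for key, dim in (("observation.state", STATE_DIM), ("action", ACTION_DIM)):
            values = row[key]
            if len(values) != dim:
                problems.append(f"row {expected_index}: {key} has length {len(values)}")
            elif not all(math.isfinite(v) for v in values):
                problems.append(f"row {expected_index}: {key} has non-finite values")
    return problems


# Two synthetic rows that satisfy every check.
example_rows = [
    {"index": i, "frame_index": i, "timestamp": i / FPS,
     "observation.state": [0.0] * STATE_DIM, "action": [0.0] * ACTION_DIM}
    for i in range(2)
]
assert check_rows(example_rows) == []
```

Image decoding and the file-count checks are omitted here, since they depend on how the parquet files are read.
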

## Usage

With LeRobot/OpenPI-style loaders, point the dataset loader at this repository id and read the paired image keys:

```python
# Repository id as published on Hugging Face (creator: Bingqiii).
repo_id = "Bingqiii/LIBERO-CrossView-Pairs"
# Column holding the nominal scene-camera view.
nominal_key = "observation.images.front"
# Column holding the camera-perturbed view of the same simulator state.
perturbed_key = "observation.images.perturbed"
```

For OpenPI pair training, the corresponding data mapping is:

```text
observation/image           <- observation.images.front
observation/image_perturbed <- observation.images.perturbed
observation/state           <- observation.state
actions                     <- action
prompt                      <- task
```

## Citation

If you use this dataset, please cite LIBERO and LIBERO-Plus, and cite this dataset/project if the paired-view construction is relevant to your work.
