lance-format/lerobot-xvla-soft-fold

Name: lance-format/lerobot-xvla-soft-fold
Creator: lance-format
Published: 2026-02-27 17:17:48
License: Apache 2.0 (as declared in the dataset card; the listing itself gives no description)

Source: Hugging Face. Updated 2026-02-27; indexed 2026-03-29.

Download link:

https://hf-mirror.com/datasets/lance-format/lerobot-xvla-soft-fold

Description:

---
license: apache-2.0
configs:
- config_name: frames
  data_dir: data/frames.lance
- config_name: episodes
  data_dir: data/episodes.lance
- config_name: videos
  data_dir: data/videos.lance
task_categories:
- robotics
tags:
- LeRobot
---

This dataset was created using [LeRobot](https://github.com/huggingface/lerobot).

## Dataset Description

**Repository:** [X-VLA](https://thu-air-dream.github.io/X-VLA/)

**License:** Apache 2.0

**Paper:** *Zheng et al., 2025, “X-VLA: Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model”* ([arXiv:2510.10274](https://arxiv.org/pdf/2510.10274))

## What this dataset contains

This is the Lance-format version of [lerobot/xvla-soft-fold](https://huggingface.co/datasets/lerobot/xvla-soft-fold), designed for efficient frame-level sampling and sequential episode loading.

- `1,542` episodes
- `2,852,512` frames
- `20` FPS
- 3 camera streams per episode (`cam_high`, `cam_left_wrist`, `cam_right_wrist`)
- robot state vectors and action vectors aligned to frame timestamps

## Dataset structure

The dataset is organized under `data/` with three Lance tables:

### Frames table

This is the main table for model training and analytics at frame granularity. Each row is one frame with aligned state/action metadata and indexing fields, so you can filter by episode, iterate temporally, or build sampled batches directly.

Schema:

- `observation_state` (`list<float>`): robot state vector for that frame.
- `action` (`list<float>`): action vector for that frame.
- `time_stamp` (`float`): original source timestamp field.
- `timestamp` (`float`): canonical frame timestamp.
- `frame_index` (`int64`): frame index within episode.
- `episode_index` (`int64`): parent episode id.
- `index` (`int64`): global frame index.
- `task_index` (`int64`): task id.

### Episodes table

This table is optimized for sequence-aware loading. Each row represents one complete episode and stores per-episode arrays (`timestamps`, `actions`, `observation_state`) plus per-camera video blobs and timestamp ranges. Use this table when you need contiguous windows, trajectory-level batching, or synchronized decoding from episode-level video chunks.

Schema:

- `episode_index` (`int64`, required): episode id.
- `task_index` (`int64`, required): task id.
- `fps` (`int32`, required): frame rate.
- `timestamps` (`list<float>`): per-frame timestamps for the episode.
- `actions` (`list<list<float>>`): per-frame action vectors.
- `observation_state` (`list<list<float>>`): per-frame robot state vectors.
- `observation_images_cam_high_video_blob` (`large_binary` blob): encoded video segment for `cam_high`.
- `observation_images_cam_high_from_timestamp` (`double`): segment start time for `cam_high`.
- `observation_images_cam_high_to_timestamp` (`double`): segment end time for `cam_high`.
- `observation_images_cam_left_wrist_video_blob` (`large_binary` blob): encoded video segment for `cam_left_wrist`.
- `observation_images_cam_left_wrist_from_timestamp` (`double`): segment start time for `cam_left_wrist`.
- `observation_images_cam_left_wrist_to_timestamp` (`double`): segment end time for `cam_left_wrist`.
- `observation_images_cam_right_wrist_video_blob` (`large_binary` blob): encoded video segment for `cam_right_wrist`.
- `observation_images_cam_right_wrist_from_timestamp` (`double`): segment start time for `cam_right_wrist`.
- `observation_images_cam_right_wrist_to_timestamp` (`double`): segment end time for `cam_right_wrist`.

### Videos table

This table stores the raw MP4 payloads from the source together with file-level provenance metadata. It is useful when you want direct access to the original encoded video assets, integrity checks (`sha256`), or custom decoding pipelines that operate on the original video files rather than on episode/frame abstractions.

Schema:

- `camera_angle` (`string`, required): camera key.
- `chunk_index` (`int32`): chunk id parsed from path.
- `file_index` (`int32`): file id parsed from path.
- `relative_path` (`string`, required): original relative path in dataset.
- `filename` (`string`, required): MP4 filename.
- `file_size_bytes` (`int64`, required): file size.
- `sha256` (`string`, required): SHA256 digest.
- `video_blob` (`large_binary`, required blob): raw MP4 bytes.

## Usage

The following sections show how to work with the dataset in Lance or LanceDB.

### Read with Lance

```python
import lance

root_path = "hf://datasets/lance-format/lerobot-xvla-soft-fold/data"
frames_table_name = "frames.lance"
episodes_table_name = "episodes.lance"
videos_table_name = "videos.lance"

# Open each table in turn and print its row count
ds = lance.dataset(f"{root_path}/{frames_table_name}")
print(ds.count_rows())

ds = lance.dataset(f"{root_path}/{episodes_table_name}")
print(ds.count_rows())

ds = lance.dataset(f"{root_path}/{videos_table_name}")
print(ds.count_rows())

# Returns:
# 2852512
# 1542
# 104
```

### Inspect a few frames

```python
import lance

root_path = "hf://datasets/lance-format/lerobot-xvla-soft-fold/data"
frames_table_name = "frames.lance"

frames = lance.dataset(f"{root_path}/{frames_table_name}")
print(f"There are {frames.count_rows()} frames in total")

# Project three columns and read only the first two rows
res = frames.scanner(
    columns=["episode_index", "frame_index", "timestamp"],
    limit=2,
).to_table()
print(res)

# Returns
# There are 2852512 frames in total
# pyarrow.Table
# episode_index: int64
# frame_index: int64
# timestamp: float
# ----
# episode_index: [[0,0]]
# frame_index: [[0,1]]
# timestamp: [[0,0.05]]
```

### Retrieving and saving video blobs

```py
from pathlib import Path

import lance

root_path = "hf://datasets/lance-format/lerobot-xvla-soft-fold/data"
episodes_table_name = "episodes.lance"

ds = lance.dataset(f"{root_path}/{episodes_table_name}")

out = Path("video_blobs")
out.mkdir(exist_ok=True)

# Retrieve the first two videos from the episodes table, one row per scan
for offset in range(0, 2):
    row = (
        ds.scanner(
            columns=["episode_index", "observation_images_cam_high_video_blob"],
            blob_handling="all_binary",
            limit=1,  # only one row is used per iteration
            offset=offset,
        )
        .to_table()
        .to_pylist()[0]
    )
    # Write the video blob to a file
    (out / f"episode_{row['episode_index']}.mp4").write_bytes(
        row["observation_images_cam_high_video_blob"]
    )
```

This writes the retrieved blobs as MP4 files to the local `video_blobs` directory.

### Random seek on subsets of video

The snippet below reads one episode’s video blob directly from the HF Hub via Lance, computes a short time window inside that episode, opens the blob as a stream (without first downloading it to a local file), and seeks to the start timestamp. It then prints the blob size and the seek positions, both in seconds and in stream PTS units.

```py
import av
import lance

DATASET_URI = "hf://datasets/lance-format/lerobot-xvla-soft-fold/data/episodes.lance"
EPISODE_INDEX = 30
START_OFFSET_S = 1.0
WINDOW_S = 0.5

ds = lance.dataset(DATASET_URI)

# Fetch the segment time range and row id for the chosen episode
row = ds.scanner(
    columns=[
        "episode_index",
        "observation_images_cam_high_from_timestamp",
        "observation_images_cam_high_to_timestamp",
        "_rowid",
    ],
    with_row_id=True,
    filter=f"episode_index = {EPISODE_INDEX}",
    limit=1,
).to_table().to_pylist()[0]

# Clamp the window so it never runs past the end of the segment
start_s = row["observation_images_cam_high_from_timestamp"] + START_OFFSET_S
end_s = min(
    start_s + WINDOW_S,
    row["observation_images_cam_high_to_timestamp"],
)

blob = ds.take_blobs("observation_images_cam_high_video_blob", ids=[row["_rowid"]])[0]

with av.open(blob) as container:
    stream = container.streams.video[0]
    stream.codec_context.skip_frame = "NONKEY"

    # Convert seconds to the stream's PTS units before seeking
    start_pts = int(start_s / stream.time_base)
    end_pts = int(end_s / stream.time_base)
    container.seek(start_pts, stream=stream)

    print(f"episode_index={row['episode_index']}")
    print(f"blob_size_bytes={blob.size()}")
    print(f"seek_start_seconds={start_s:.3f}")
    print(f"seek_end_seconds={end_s:.3f}")
    print(f"seek_start_pts={start_pts}")
    print(f"seek_end_pts={end_pts}")

blob.close()
```

### LanceDB search

LanceDB users can also work with the Lance dataset on the Hub. The key step is to connect to the dataset repo and open the relevant table.

```py
import lancedb

db = lancedb.connect("hf://datasets/lance-format/lerobot-xvla-soft-fold/data")
tbl = db.open_table("episodes")

# Search without any parameters
results = (
    tbl.search()
    .select(
        [
            "episode_index",
            "observation_images_cam_high_from_timestamp",
            "observation_images_cam_high_to_timestamp",
        ]
    )
    .limit(3)
    .to_list()
)

for result in results:
    print(
        f"{result['episode_index']} | {result['observation_images_cam_high_from_timestamp']} | {result['observation_images_cam_high_to_timestamp']}"
    )

# Returns:
# 0 | 0.0 | 122.95
# 1 | 122.95 | 230.65
# 2 | 230.65 | 340.0
```

### Download

If you need to modify the data or work with the raw files directly, you can download the full dataset locally.

> **⚠️ Large dataset download**
> The full dataset is larger than 50 GB, so ensure you have sufficient disk space available.

```bash
uv run hf download lance-format/lerobot-xvla-soft-fold --repo-type dataset --local-dir .
```
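
After a local download, the `sha256` and `file_size_bytes` columns of the Videos table offer a way to check that the MP4 payloads arrived intact. The following is a minimal sketch, not part of the dataset's tooling: the helper name `find_corrupt_rows` is hypothetical, the rows are synthetic stand-ins, and it assumes `sha256` holds a hex digest of the raw `video_blob` bytes, which the card does not state explicitly.

```python
import hashlib


def find_corrupt_rows(rows):
    """Return relative paths whose bytes do not match the recorded metadata.

    Each row is assumed to be a dict carrying the Videos-table columns
    `relative_path`, `file_size_bytes`, `sha256`, and `video_blob`.
    Assumption: `sha256` is a hex digest of the raw MP4 bytes.
    """
    bad = []
    for row in rows:
        payload = row["video_blob"]
        digest = hashlib.sha256(payload).hexdigest()
        size_ok = len(payload) == row["file_size_bytes"]
        hash_ok = digest == row["sha256"].lower()
        if not (size_ok and hash_ok):
            bad.append(row["relative_path"])
    return bad


# Synthetic rows standing in for `videos.lance` records (not real data).
good_bytes = b"fake mp4 payload"
rows = [
    {
        "relative_path": "a.mp4",
        "file_size_bytes": len(good_bytes),
        "sha256": hashlib.sha256(good_bytes).hexdigest(),
        "video_blob": good_bytes,
    },
    {
        "relative_path": "b.mp4",
        "file_size_bytes": len(good_bytes),
        "sha256": "0" * 64,  # deliberately wrong digest
        "video_blob": good_bytes,
    },
]

print(find_corrupt_rows(rows))  # ['b.mp4']
```

Real rows could be read the same way the blob-saving example reads the Episodes table, that is, with a scanner using `blob_handling="all_binary"`. Reading a few rows at a time is advisable, since each `video_blob` is a full MP4 file.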

Provided by:

lance-format
