smol-libero

Name: smol-libero
Creator: maas
Published: 2025-12-05 16:51:17
License: No description available

ModelScope community. Updated: 2025-12-05. Indexed: 2025-11-03.

Download link:

https://modelscope.cn/datasets/HuggingFaceVLA/smol-libero

Resource description:

This dataset was created using [LeRobot](https://github.com/huggingface/lerobot).

# Dataset Card for **Smol-LIBERO**

## Dataset Summary

Smol-LIBERO is a **compact version of the LIBERO benchmark**, built to make experimentation fast and accessible. At just **1.79 GB** (compared to ~34 GB for the full LIBERO), it contains fewer trajectories and cameras while keeping the same multimodal structure.

Each sample includes:

- **Images** from two fixed cameras
- **Two types of robot state** (end-effector pose + gripper, and full 7-DoF joint positions)
- **Actions** (7-DoF joint commands)

This setup is especially useful for comparing **low-dimensional state inputs** with **high-dimensional visual inputs**, or combining them in multimodal training.

---

## Dataset Structure

### Data Fields

- **`observation.images.image`**: 256×256×3 RGB image (camera 1)
- **`observation.images.image2`**: 256×256×3 RGB image (camera 2)
- **`observation.state`** *(8 floats)*: end-effector Cartesian pose + gripper `[x, y, z, roll, pitch, yaw, gripper, gripper]`
- **`observation.state.joint`** *(7 floats)*: full joint angles `[joint_1, …, joint_7]`
- **`action`** *(7 floats)*: target joint commands

---

## Why is it smaller than LIBERO?

- **Fewer trajectories/tasks** → subset of the full benchmark
- **Only two camera views** → reduced visual redundancy
- **Reduced total frames** → shorter episodes or lower FPS

That's why Smol-LIBERO is **1.79 GB instead of 34 GB**.

---

## Intended Uses

- Quick prototyping and debugging
- Comparing joint-space vs. Cartesian state inputs
- Training small VLA baselines before scaling to LIBERO

---

## Limitations

- Smaller task and visual diversity compared to LIBERO
- Only two fixed camera views
- May not fully represent generalization behavior on larger benchmarks

## Citation

**BibTeX:**

```bibtex
[More Information Needed]
```
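The field layout listed under Data Fields can be written down as a small shape check. The following is a minimal sketch in plain Python, not part of the dataset or of LeRobot: the helper names (`EXPECTED_SHAPES`, `shape_of`, `validate_sample`, `make_dummy_sample`) are hypothetical, and it assumes a sample is a flat dictionary keyed by the dotted field names, with values given as nested lists. A real loader may return tensors, and possibly a channel-first image layout, so the expected shapes would need adjusting in that case.

```python
from typing import Dict, List, Tuple

# Shapes copied from the "Data Fields" list of the dataset card.
# Assumption: images are height x width x channels, as the card states.
EXPECTED_SHAPES: Dict[str, Tuple[int, ...]] = {
    "observation.images.image": (256, 256, 3),
    "observation.images.image2": (256, 256, 3),
    "observation.state": (8,),        # x, y, z, roll, pitch, yaw, gripper, gripper
    "observation.state.joint": (7,),  # joint_1 ... joint_7
    "action": (7,),                   # target joint commands
}


def shape_of(value) -> Tuple[int, ...]:
    """Return the shape of a nested list/tuple, assuming rectangular nesting."""
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(shape)


def validate_sample(sample: dict) -> List[str]:
    """Compare a sample against EXPECTED_SHAPES and return a list of problems."""
    problems = []
    for key, expected in EXPECTED_SHAPES.items():
        if key not in sample:
            problems.append(f"missing field: {key}")
            continue
        actual = shape_of(sample[key])
        if actual != expected:
            problems.append(f"{key}: expected shape {expected}, got {actual}")
    return problems


def make_dummy_sample() -> dict:
    """Build an all-zero placeholder sample with the documented shapes."""
    def blank_image():
        return [[[0, 0, 0] for _ in range(256)] for _ in range(256)]

    return {
        "observation.images.image": blank_image(),
        "observation.images.image2": blank_image(),
        "observation.state": [0.0] * 8,
        "observation.state.joint": [0.0] * 7,
        "action": [0.0] * 7,
    }


if __name__ == "__main__":
    # An empty list means the dummy sample matches the documented layout.
    print(validate_sample(make_dummy_sample()))
```

A check like this is mainly useful when switching between the two state inputs: feeding the 8-float Cartesian state where the 7-float joint state is expected shows up as a shape mismatch instead of a silent training bug.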

Provider:

maas

Created:

2025-09-28
