
smol-libero

ModelScope community · Updated 2025-12-05 · Indexed 2025-11-03
Download link:
https://modelscope.cn/datasets/HuggingFaceVLA/smol-libero
Description:
This dataset was created using [LeRobot](https://github.com/huggingface/lerobot).

# Dataset Card for **Smol-LIBERO**

## Dataset Summary

Smol-LIBERO is a **compact version of the LIBERO benchmark**, built to make experimentation fast and accessible. At just **1.79 GB** (compared to ~34 GB for the full LIBERO), it contains fewer trajectories and cameras while keeping the same multimodal structure.

Each sample includes:

- **Images** from two fixed cameras
- **Two types of robot state** (end-effector pose + gripper, and full 7-DoF joint positions)
- **Actions** (7-DoF joint commands)

This setup is especially useful for comparing **low-dimensional state inputs** with **high-dimensional visual inputs**, or combining them in multimodal training.

---

## Dataset Structure

### Data Fields

- **`observation.images.image`**: 256×256×3 RGB image (camera 1)
- **`observation.images.image2`**: 256×256×3 RGB image (camera 2)
- **`observation.state`** *(8 floats)*: end-effector Cartesian pose + gripper `[x, y, z, roll, pitch, yaw, gripper, gripper]`
- **`observation.state.joint`** *(7 floats)*: full joint angles `[joint_1, …, joint_7]`
- **`action`** *(7 floats)*: target joint commands

---

## Why is it smaller than LIBERO?

- **Fewer trajectories/tasks** → subset of the full benchmark
- **Only two camera views** → reduced visual redundancy
- **Reduced total frames** → shorter episodes or lower FPS

That’s why Smol-LIBERO is **1.79 GB instead of 34 GB**.

---

## Intended Uses

- Quick prototyping and debugging
- Comparing joint-space vs. Cartesian state inputs
- Training small VLA baselines before scaling to LIBERO

---

## Limitations

- Smaller task and visual diversity compared to LIBERO
- Only two fixed camera views
- May not fully represent generalization behavior on larger benchmarks

## Citation

**BibTeX:**

```bibtex
[More Information Needed]
```
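The field layout listed under Data Fields can be checked mechanically before training. The following is a minimal sketch in plain Python, not part of the dataset or of LeRobot: the helper names (`EXPECTED_SHAPES`, `shape_of`, `check_sample`, `make_dummy_sample`) are hypothetical, and the sample is a synthetic all-zero placeholder rather than real data. It assumes each field arrives as a nested list with the shape stated in the card; a loader may deliver a different layout (for example channel-first images), in which case the expected shapes would need adjusting.

```python
# Expected per-sample layout, copied from the Data Fields list in the card.
# Assumption: images are height x width x channels, as the card writes them.
EXPECTED_SHAPES = {
    "observation.images.image": (256, 256, 3),
    "observation.images.image2": (256, 256, 3),
    "observation.state": (8,),        # x, y, z, roll, pitch, yaw, gripper, gripper
    "observation.state.joint": (7,),  # joint_1 .. joint_7
    "action": (7,),                   # target joint commands
}


def shape_of(value):
    """Return the shape of a nested list, inspecting only first elements."""
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(shape)


def check_sample(sample):
    """Return a list of problems; an empty list means the sample matches."""
    problems = []
    for key, expected in EXPECTED_SHAPES.items():
        if key not in sample:
            problems.append(f"missing field: {key}")
            continue
        actual = shape_of(sample[key])
        if actual != expected:
            problems.append(f"{key}: expected {expected}, got {actual}")
    return problems


def make_dummy_sample():
    """Build a synthetic all-zero sample (hypothetical, not real dataset data)."""
    def zeros(shape):
        if len(shape) == 1:
            return [0.0] * shape[0]
        return [zeros(shape[1:]) for _ in range(shape[0])]

    return {key: zeros(shape) for key, shape in EXPECTED_SHAPES.items()}


if __name__ == "__main__":
    sample = make_dummy_sample()
    print(check_sample(sample))   # a well-formed sample reports no problems

    sample["action"] = [0.0] * 6  # deliberately drop one action dimension
    print(check_sample(sample))
```

Keeping the expected shapes in a single dictionary makes it straightforward to adapt the check when comparing the two state representations, since `observation.state` and `observation.state.joint` differ in length (8 versus 7 floats).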

Provider: maas
Created: 2025-09-28