eunyoung927/smol-libero-v30
Source: Hugging Face. Updated 2025-12-10; indexed 2025-12-20.
Download link: https://hf-mirror.com/datasets/eunyoung927/smol-libero-v30
Resource description:
---
{
  "codebase_version": "v3.0",
  "robot_type": null,
  "total_episodes": 50,
  "total_frames": 13021,
  "total_tasks": 1,
  "chunks_size": 1000,
  "fps": 20,
  "splits": {
    "train": "0:50"
  },
  "data_path": "data/chunk-{chunk_index:03d}/file-{file_index:03d}.parquet",
  "video_path": null,
  "features": {
    "observation.images.image": {
      "dtype": "image",
      "shape": [256, 256, 3],
      "names": ["height", "width", "channel"],
      "fps": 20
    },
    "observation.images.image2": {
      "dtype": "image",
      "shape": [256, 256, 3],
      "names": ["height", "width", "channel"],
      "fps": 20
    },
    "observation.state": {
      "dtype": "float32",
      "shape": [8],
      "names": {
        "motors": ["x", "y", "z", "roll", "pitch", "yaw", "gripper", "gripper"]
      },
      "fps": 20
    },
    "observation.state.joint": {
      "dtype": "float32",
      "shape": [7],
      "names": {
        "motors": ["joint_1", "joint_2", "joint_3", "joint_4", "joint_5", "joint_6", "joint_7"]
      },
      "fps": 20
    },
    "action": {
      "dtype": "float32",
      "shape": [7],
      "names": {
        "motors": ["x", "y", "z", "roll", "pitch", "yaw", "gripper"]
      },
      "fps": 20
    },
    "timestamp": {
      "dtype": "float32",
      "shape": [1],
      "names": null,
      "fps": 20
    },
    "frame_index": {
      "dtype": "int64",
      "shape": [1],
      "names": null,
      "fps": 20
    },
    "episode_index": {
      "dtype": "int64",
      "shape": [1],
      "names": null,
      "fps": 20
    },
    "index": {
      "dtype": "int64",
      "shape": [1],
      "names": null,
      "fps": 20
    },
    "task_index": {
      "dtype": "int64",
      "shape": [1],
      "names": null,
      "fps": 20
    }
  },
  "data_files_size_in_mb": 100,
  "video_files_size_in_mb": 200
}
---
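As a rough illustration of how the metadata fields relate to each other, the sketch below derives the average episode length and the total duration from the values above, and fills in the `data_path` template. The chunk and file indices (0 and 0) are assumed purely for illustration; the card does not list the actual files.

```python
# Values copied from the metadata block above.
info = {
    "total_episodes": 50,
    "total_frames": 13021,
    "fps": 20,
    "data_path": "data/chunk-{chunk_index:03d}/file-{file_index:03d}.parquet",
}

# Average number of frames per episode.
mean_frames_per_episode = info["total_frames"] / info["total_episodes"]

# Total recorded time and average episode time, in seconds.
total_seconds = info["total_frames"] / info["fps"]
mean_episode_seconds = mean_frames_per_episode / info["fps"]

# Assumption: indices 0 and 0 are illustrative, not taken from the card.
example_data_file = info["data_path"].format(chunk_index=0, file_index=0)

print(f"{mean_frames_per_episode:.2f} frames per episode on average")
print(f"{total_seconds:.2f} s in total, {mean_episode_seconds:.2f} s per episode on average")
print(example_data_file)
```

With these numbers an episode averages roughly 260 frames, or about 13 seconds at 20 FPS.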
This dataset was created using [LeRobot](https://github.com/huggingface/lerobot).
It is a converted version of [HuggingFaceVLA/smol-libero](https://huggingface.co/datasets/HuggingFaceVLA/smol-libero), updated from `codebase_version` v2.1 to v3.0.
# Dataset Card for Smol-LIBERO
## Dataset Summary
Smol-LIBERO is a compact version of the LIBERO benchmark, built to make experimentation fast and accessible.
At just 1.79 GB (compared to ~34 GB for the full LIBERO), it contains fewer trajectories and cameras while keeping the same multimodal structure.
Each sample includes:
- Images from two fixed cameras
- Two types of robot state (end-effector pose + gripper, and full 7-DoF joint positions)
- Actions (7-dimensional: `x`, `y`, `z`, `roll`, `pitch`, `yaw`, `gripper`, per the `action` feature metadata)
This setup is especially useful for comparing low-dimensional state inputs with high-dimensional visual inputs, or combining them in multimodal training.
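To make the vector layouts concrete, here is a minimal sketch that groups a flat state or action vector into position, orientation, and gripper parts using the dimension names from the metadata. The helper `split_vector` and the example numbers are hypothetical and are not part of LeRobot or of the dataset.

```python
# Dimension names as listed in the feature metadata.
STATE_NAMES = ["x", "y", "z", "roll", "pitch", "yaw", "gripper", "gripper"]
ACTION_NAMES = ["x", "y", "z", "roll", "pitch", "yaw", "gripper"]


def split_vector(values, names):
    """Group a flat vector into position, orientation, and gripper parts by name."""
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} values, got {len(values)}")
    parts = {"position": [], "orientation": [], "gripper": []}
    for value, name in zip(values, names):
        if name in ("x", "y", "z"):
            parts["position"].append(value)
        elif name in ("roll", "pitch", "yaw"):
            parts["orientation"].append(value)
        else:
            parts["gripper"].append(value)
    return parts


# Placeholder numbers for illustration only, not real samples.
example_state = [0.1, 0.2, 0.3, 0.0, 0.0, 1.57, 0.04, -0.04]
example_action = [0.01, 0.0, -0.02, 0.0, 0.0, 0.1, 1.0]

state_parts = split_vector(example_state, STATE_NAMES)
action_parts = split_vector(example_action, ACTION_NAMES)
```

The state carries two gripper entries while the action carries one, which is why the two vectors differ in length (8 versus 7).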
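### Basic Dataset Information
- Codebase version: v3.0
- Robot type: not specified (`null`)
- Total episodes: 50
- Total frames: 13021
- Total tasks: 1
- Chunk size: 1000
- Frame rate (FPS): 20
- Splits: train `0:50` (all 50 episodes are used as training data)
- Data path: `data/chunk-{chunk_index:03d}/file-{file_index:03d}.parquet`
- Video path: none (`null`)
- Features:
  1. `observation.images.image`:
     - Data type: image
     - Shape: [256, 256, 3]
     - Dimension names: ["height", "width", "channel"]
     - Frame rate (FPS): 20
  2. `observation.images.image2`:
     - Data type: image
     - Shape: [256, 256, 3]
     - Dimension names: ["height", "width", "channel"]
     - Frame rate (FPS): 20
  3. `observation.state`:
     - Data type: float32
     - Shape: [8]
     - Dimension names: {"motors": ["x", "y", "z", "roll", "pitch", "yaw", "gripper", "gripper"]}
     - Frame rate (FPS): 20
  4. `observation.state.joint`:
     - Data type: float32
     - Shape: [7]
     - Dimension names: {"motors": ["joint_1", "joint_2", "joint_3", "joint_4", "joint_5", "joint_6", "joint_7"]}
     - Frame rate (FPS): 20
  5. `action`:
     - Data type: float32
     - Shape: [7]
     - Dimension names: {"motors": ["x", "y", "z", "roll", "pitch", "yaw", "gripper"]}
     - Frame rate (FPS): 20
  6. `timestamp`:
     - Data type: float32
     - Shape: [1]
     - Dimension names: none
     - Frame rate (FPS): 20
  7. `frame_index`:
     - Data type: int64
     - Shape: [1]
     - Dimension names: none
     - Frame rate (FPS): 20
  8. `episode_index`:
     - Data type: int64
     - Shape: [1]
     - Dimension names: none
     - Frame rate (FPS): 20
  9. `index` (global frame index):
     - Data type: int64
     - Shape: [1]
     - Dimension names: none
     - Frame rate (FPS): 20
  10. `task_index`:
      - Data type: int64
      - Shape: [1]
      - Dimension names: none
      - Frame rate (FPS): 20
- Data file size (`data_files_size_in_mb`): 100 MB
- Video file size (`video_files_size_in_mb`): 200 MB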
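The split string can be turned into a list of episode indices with a few lines of code. The sketch below assumes `0:50` is a half-open `start:stop` range, which matches the statement that all 50 episodes belong to the training split; `parse_split` is a hypothetical helper, not a LeRobot function.

```python
def parse_split(spec: str) -> range:
    """Parse a "start:stop" split string into a range of episode indices.

    Assumption: the stop index is exclusive, as in Python slicing.
    """
    start_text, stop_text = spec.split(":")
    return range(int(start_text), int(stop_text))


total_episodes = 50
train_episodes = parse_split("0:50")

# True when the training split covers every episode exactly once.
covers_all = list(train_episodes) == list(range(total_episodes))
```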
Provider: eunyoung927



