LeonOverload/primo-bench-json
Source: Hugging Face. Updated 2026-04-08. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/LeonOverload/primo-bench-json
Resource description:
---
license: apache-2.0
pretty_name: PRIMO Bench JSON
task_categories:
- video-text-to-text
language:
- en
configs:
- config_name: all
  data_files:
  - split: train
    path: "jsonl/part-*.jsonl"
- config_name: agibot-id
  data_files:
  - split: train
    path: "jsonl_subsets/agibot-id/part-*.jsonl"
- config_name: agibot-ood
  data_files:
  - split: train
    path: "jsonl_subsets/agibot-ood/part-*.jsonl"
- config_name: behavior-1k-id
  data_files:
  - split: train
    path: "jsonl_subsets/behavior-1k-id/part-*.jsonl"
- config_name: behavior-1k-ood
  data_files:
  - split: train
    path: "jsonl_subsets/behavior-1k-ood/part-*.jsonl"
- config_name: real-humanoid-ood
  data_files:
  - split: train
    path: "jsonl_subsets/real-humanoid-ood/part-*.jsonl"
- config_name: robotwin-id
  data_files:
  - split: train
    path: "jsonl_subsets/robotwin-id/part-*.jsonl"
- config_name: robotwin-ood
  data_files:
  - split: train
    path: "jsonl_subsets/robotwin-ood/part-*.jsonl"
---
# PRIMO Bench JSON
This repository contains JSON annotations for **PRIMO**.
## What Is Included
- `raw_json/`: original JSON files copied from the PRIMO release layout
- `jsonl/`: flattened JSONL shards for better Hugging Face Data Studio preview
- `jsonl_subsets/`: subset-specific JSONL shards used by Dataset Viewer config selector
- `summary.json`: row/shard metadata generated at build time
## Split Type
- Task: Evaluation / Benchmark
- Source pattern: `primo-bench/*/{id,ood}.json`
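Python's `glob` does not expand `{id,ood}` braces, so the source pattern above has to be enumerated by hand. The sketch below is one way to do that, assuming a local `primo-bench/` directory laid out as the pattern suggests. The function name `find_source_files` is hypothetical and not part of the PRIMO release.

```python
from pathlib import Path


def find_source_files(bench_root: Path) -> list:
    """List files matching `<bench_root>/*/{id,ood}.json`.

    Hypothetical helper: the brace alternatives are checked one by one
    because pathlib/glob patterns have no brace expansion.
    """
    found = []
    for sub in sorted(p for p in bench_root.iterdir() if p.is_dir()):
        for kind in ("id", "ood"):
            candidate = sub / f"{kind}.json"
            if candidate.is_file():
                found.append(candidate)
    return found
```

Calling `find_source_files(Path("primo-bench"))` returns the matching files sorted by subdirectory, with `id.json` before `ood.json` where both exist.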
## Media Mapping
This repo stores annotations only.
Media files (videos/frames) should be prepared in a local folder like:
- `./primo-video/...`
The `path`, `init_frame_path`, and `current_frame_path` fields are expected to resolve against your local `PRIMO-Data` root.
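A minimal sketch of that resolution step follows. It assumes the three fields hold relative path strings and that a field may be absent from a given row. The helper name `resolve_media_paths` and the example row values are hypothetical.

```python
from pathlib import Path

# Field names come from the README; everything else here is an assumption.
MEDIA_FIELDS = ("path", "init_frame_path", "current_frame_path")


def resolve_media_paths(row: dict, media_root: Path) -> dict:
    """Return a copy of `row` with its media fields joined onto `media_root`."""
    resolved = dict(row)
    for field in MEDIA_FIELDS:
        value = row.get(field)
        # Skip fields that are missing, empty, or not strings.
        if isinstance(value, str) and value:
            resolved[field] = str(media_root / value)
    return resolved


# Hypothetical row: real annotation rows may carry other keys and values.
example = {"path": "clips/a.mp4", "init_frame_path": "frames/a_0.jpg", "task": "demo"}
resolved = resolve_media_paths(example, Path("PRIMO-Data"))
```

Non-media keys pass through unchanged, so the resolved row can be used anywhere the original row was.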
## Quick Load Example
```python
import json
from pathlib import Path

root = Path(".")  # local clone of this repository
jsonl_dir = root / "jsonl"

rows = []
# Read every shard in order; each line is one JSON record.
for fp in sorted(jsonl_dir.glob("part-*.jsonl")):
    with fp.open("r", encoding="utf-8") as f:
        for line in f:
            rows.append(json.loads(line))

print(len(rows))
```
## Build Metadata
- Total rows: **23704**
- Shards: **1**
- Shard size: **50000**
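The three figures are consistent if the shard size is read as a maximum number of rows per shard, which is an assumption here and not something the README states. Under that reading, the shard count is the row total divided by the shard size, rounded up:

```python
import math

total_rows = 23704   # from the build metadata above
shard_size = 50000   # assumed to be the maximum rows per shard

# Ceiling division: a final partial shard still counts as one shard.
expected_shards = math.ceil(total_rows / shard_size)
print(expected_shards)  # 1
```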
## Viewer Subsets
The Hugging Face Dataset Viewer subset selector maps to these configs:
- `all`
- `agibot-id`
- `agibot-ood`
- `behavior-1k-id`
- `behavior-1k-ood`
- `real-humanoid-ood`
- `robotwin-id`
- `robotwin-ood`
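For a local clone, a config name can be mapped to its shard directory using the paths declared in the card's `configs` block. The following is a sketch under that assumption. The function name `load_subset` is hypothetical.

```python
import json
from pathlib import Path


def load_subset(root: Path, config_name: str) -> list:
    """Load all rows for one viewer config from a local clone.

    Hypothetical helper: `all` reads `jsonl/`, any other name reads
    `jsonl_subsets/<config_name>/`, mirroring the card's config paths.
    """
    if config_name == "all":
        shard_dir = root / "jsonl"
    else:
        shard_dir = root / "jsonl_subsets" / config_name
    rows = []
    for fp in sorted(shard_dir.glob("part-*.jsonl")):
        with fp.open("r", encoding="utf-8") as f:
            for line in f:
                if line.strip():  # tolerate blank lines
                    rows.append(json.loads(line))
    return rows
```

For example, `load_subset(Path("."), "agibot-id")` reads only the `agibot-id` shards. With the `datasets` library, `load_dataset("LeonOverload/primo-bench-json", "agibot-id", split="train")` should select the same config, since the card declares a `train` split for each config name.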
## Citations
If you find our work helpful for your research, please consider citing it.
```
@misc{liu2026passiveobserveractivecritic,
title={From Passive Observer to Active Critic: Reinforcement Learning Elicits Process Reasoning for Robotic Manipulation},
author={Yibin Liu and Yaxing Lyu and Daqi Gao and Zhixuan Liang and Weiliang Tang and Shilong Mu and Xiaokang Yang and Yao Mu},
year={2026},
eprint={2603.15600},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2603.15600},
}
```
Provider:
LeonOverload



