
LeonOverload/primo-bench-json

Source: Hugging Face. Updated 2026-04-08; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/LeonOverload/primo-bench-json
Description:
---
license: apache-2.0
pretty_name: PRIMO Bench JSON
task_categories:
- video-text-to-text
language:
- en
configs:
- config_name: all
  data_files:
  - split: train
    path: "jsonl/part-*.jsonl"
- config_name: agibot-id
  data_files:
  - split: train
    path: "jsonl_subsets/agibot-id/part-*.jsonl"
- config_name: agibot-ood
  data_files:
  - split: train
    path: "jsonl_subsets/agibot-ood/part-*.jsonl"
- config_name: behavior-1k-id
  data_files:
  - split: train
    path: "jsonl_subsets/behavior-1k-id/part-*.jsonl"
- config_name: behavior-1k-ood
  data_files:
  - split: train
    path: "jsonl_subsets/behavior-1k-ood/part-*.jsonl"
- config_name: real-humanoid-ood
  data_files:
  - split: train
    path: "jsonl_subsets/real-humanoid-ood/part-*.jsonl"
- config_name: robotwin-id
  data_files:
  - split: train
    path: "jsonl_subsets/robotwin-id/part-*.jsonl"
- config_name: robotwin-ood
  data_files:
  - split: train
    path: "jsonl_subsets/robotwin-ood/part-*.jsonl"
---

# PRIMO Bench JSON

This repository contains JSON annotations for **PRIMO**.

## What Is Included

- `raw_json/`: original JSON files copied from the PRIMO release layout
- `jsonl/`: flattened JSONL shards for better Hugging Face Data Studio preview
- `jsonl_subsets/`: subset-specific JSONL shards used by Dataset Viewer config selector
- `summary.json`: row/shard metadata generated at build time

## Split Type

- Task: Evaluation / Benchmark
- Source pattern: `primo-bench/*/{id,ood}.json`

## Media Mapping

This repo stores annotations only. Media files (videos/frames) should be prepared in a local folder like:

- `./primo-video/...`

The `path`, `init_frame_path`, and `current_frame_path` fields are expected to resolve against your local `PRIMO-Data` root.

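
The path resolution described above can be sketched as follows. This is a minimal illustration, not part of the PRIMO release: the helper names `resolve_media_paths` and `missing_media` are hypothetical, the local root is passed in by the caller, and the sketch assumes the three fields hold relative path strings (the card does not spell out their exact format).

```python
from pathlib import Path

# Field names come from the card; everything else here is illustrative.
MEDIA_FIELDS = ("path", "init_frame_path", "current_frame_path")


def resolve_media_paths(row: dict, data_root: Path) -> dict:
    """Return a copy of `row` with its media fields joined onto `data_root`."""
    resolved = dict(row)
    for field in MEDIA_FIELDS:
        value = row.get(field)
        # Only non-empty strings are rewritten; absent fields stay absent.
        # Note: pathlib keeps an absolute `value` unchanged when joining.
        if isinstance(value, str) and value:
            resolved[field] = str(data_root / value)
    return resolved


def missing_media(row: dict, data_root: Path) -> list:
    """List the media fields of `row` whose resolved file does not exist."""
    resolved = resolve_media_paths(row, data_root)
    return [
        field
        for field in MEDIA_FIELDS
        if isinstance(row.get(field), str)
        and row[field]
        and not Path(resolved[field]).exists()
    ]
```

Running `missing_media` over a handful of rows before an evaluation is a cheap way to confirm that the local media folder matches what the annotations expect.
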
## Quick Load Example

```python
import json
from pathlib import Path

# Assumes the current directory is the repository root.
root = Path(".")
jsonl_dir = root / "jsonl"

rows = []
# Read shards in name order; each line is one JSON record.
for fp in sorted(jsonl_dir.glob("part-*.jsonl")):
    with fp.open("r", encoding="utf-8") as f:
        for line in f:
            rows.append(json.loads(line))

print(len(rows))
```

## Build Metadata

- Total rows: **23704**
- Shards: **1**
- Shard size: **50000**

## Viewer Subsets

The Hugging Face Dataset Viewer subset selector maps to these configs:

- `all`
- `agibot-id`
- `agibot-ood`
- `behavior-1k-id`
- `behavior-1k-ood`
- `real-humanoid-ood`
- `robotwin-id`
- `robotwin-ood`

## Citations

If you find this work helpful for your research, please consider citing it.

```
@misc{liu2026passiveobserveractivecritic,
  title={From Passive Observer to Active Critic: Reinforcement Learning Elicits Process Reasoning for Robotic Manipulation},
  author={Yibin Liu and Yaxing Lyu and Daqi Gao and Zhixuan Liang and Weiliang Tang and Shilong Mu and Xiaokang Yang and Yao Mu},
  year={2026},
  eprint={2603.15600},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2603.15600},
}
```
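
## Subset Loading Sketch

The configs listed under Viewer Subsets each point at a shard directory, as declared in the card's YAML header. The sketch below mirrors that mapping for a local checkout and streams rows instead of collecting them in a list. The names `SUBSET_CONFIGS`, `shard_dir`, and `iter_rows` are hypothetical helpers introduced here for illustration; only the config names and directory layout are taken from the card.

```python
import json
from pathlib import Path
from typing import Iterator

# Config names copied from the card's "Viewer Subsets" list.
SUBSET_CONFIGS = (
    "agibot-id", "agibot-ood", "behavior-1k-id", "behavior-1k-ood",
    "real-humanoid-ood", "robotwin-id", "robotwin-ood",
)


def shard_dir(repo_root: Path, config: str) -> Path:
    """Map a config name to its shard directory, following the card's YAML."""
    if config == "all":
        return repo_root / "jsonl"
    if config in SUBSET_CONFIGS:
        return repo_root / "jsonl_subsets" / config
    raise ValueError(f"unknown config: {config!r}")


def iter_rows(repo_root: Path, config: str = "all") -> Iterator[dict]:
    """Yield rows one at a time from every part-*.jsonl shard of `config`."""
    for shard in sorted(shard_dir(repo_root, config).glob("part-*.jsonl")):
        with shard.open("r", encoding="utf-8") as f:
            for line in f:
                if line.strip():  # tolerate blank lines between records
                    yield json.loads(line)
```

For example, `sum(1 for _ in iter_rows(Path("."), "robotwin-ood"))` counts the rows of one subset without holding them in memory. With the Hugging Face `datasets` library installed and the repository reachable, `load_dataset("LeonOverload/primo-bench-json", "robotwin-ood", split="train")` should select the same shards.
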

Provider:
LeonOverload