R1-Onevision-Bench

Name: R1-Onevision-Bench
Creator: maas
Published: 2025-11-27 16:25:56
License: No description available

ModelScope community. Updated: 2025-11-27. Indexed: 2025-03-15.

Download link:

https://modelscope.cn/datasets/Fancy-MLLM/R1-Onevision-Bench

Resource description:

# R1-Onevision-Bench

[\[📂 GitHub\]](https://github.com/Fancy-MLLM/R1-Onevision) [\[📝 Paper\]](https://arxiv.org/pdf/2503.10615) [\[🤗 HF Dataset\]](https://huggingface.co/datasets/Fancy-MLLM/R1-onevision) [\[🤗 HF Model\]](https://huggingface.co/Fancy-MLLM/R1-Onevision-7B) [\[🤗 HF Demo\]](https://huggingface.co/spaces/Fancy-MLLM/R1-OneVision)

## Dataset Overview

R1-Onevision-Bench comprises 38 subcategories organized into 5 major domains: Math, Biology, Chemistry, Physics, and Deduction. The tasks are also grouped into five difficulty levels, ranging from ‘Junior High School’ to ‘Social Test’ challenges, so that models can be evaluated across a range of complexity.

## Data Format

Reasoning problems are stored in TSV format, with each row containing the following fields:

- `index`: data ID
- `question`: visual reasoning question
- `answer`: ground truth answer
- `category`: question category
- `image`: base64-encoded image
- `choices`: available answer choices
- `level`: question difficulty level

## Benchmark Distribution

<img src="https://cdn-uploads.huggingface.co/production/uploads/65af78bb3e82498d4c65ed2a/PXGfxg9xjMYb5qvXt68le.png" width="50%" />

<img src="https://cdn-uploads.huggingface.co/production/uploads/65af78bb3e82498d4c65ed2a/CiwppKyI4OO2YHcsjboif.png" width="50%" />

## Benchmark Samples

<img src="https://cdn-uploads.huggingface.co/production/uploads/65af78bb3e82498d4c65ed2a/9qfmkt-ZjDzjFb1_gLkoQ.png" width="90%" />

## Institution

- Zhejiang University

## Benchmark Contact

- yang-yi@zju.edu.cn
- xiaoxuanhe@zju.edu.cn
- panhongkun@zju.edu.cn
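
The TSV layout listed under Data Format can be read with the Python standard library alone. The following is a minimal sketch, not an official loader: it assumes the file starts with a header row naming the seven columns, and the helper names (`read_rows`, `decode_image`, `count_by`) and the sample row are hypothetical, made up for illustration.

```python
import base64
import csv
import io
from collections import Counter

# Base64 image strings can exceed csv's default per-field size limit, so raise it.
csv.field_size_limit(2**31 - 1)

# Column names as listed in the Data Format section.
FIELDS = ("index", "question", "answer", "category", "image", "choices", "level")


def read_rows(tsv_file):
    """Return a list of dicts, one per row, from an open TSV text file.

    Assumes the first line is a header naming the columns.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    missing = set(FIELDS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


def decode_image(row):
    """Decode the base64 `image` field of one row into raw image bytes."""
    return base64.b64decode(row["image"])


def count_by(rows, field):
    """Count rows per value of `field`, e.g. 'category' or 'level'."""
    return Counter(row[field] for row in rows)


# Tiny made-up row standing in for the real file (not actual benchmark data).
fake_image = base64.b64encode(b"not-a-real-image").decode("ascii")
sample_tsv = (
    "\t".join(FIELDS) + "\n"
    + "\t".join(
        ["0", "What is shown?", "A", "Math", fake_image, "A. x; B. y", "Junior High School"]
    )
    + "\n"
)

rows = read_rows(io.StringIO(sample_tsv))
print(rows[0]["question"], len(decode_image(rows[0])), dict(count_by(rows, "level")))
```

For the downloaded benchmark, the same functions can be applied to a file opened with `open(path, newline="", encoding="utf-8")`, where `path` is wherever the TSV was saved. The card does not specify how `choices` is serialized, so the sketch leaves that field as a plain string.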

Provider:

maas

Created:

2025-03-12
