OpenGVLab/VisualProcessBench
Source: Hugging Face. Updated: 2025-03-18. Indexed: 2025-04-08.
Download link:
https://hf-mirror.com/datasets/OpenGVLab/VisualProcessBench
Resource description:
VisualProcessBench is a multimodal reasoning benchmark designed to evaluate how well PRMs (process reward models) and MLLMs (multimodal large language models) identify erroneous steps in a reasoning process. The dataset consists of 2,866 samples carrying a total of 26,950 human-annotated step-wise correctness labels. Each sample provides an image path, a question, an answer, a model response split into steps, and a correctness annotation for each step.
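To make the step-wise annotation concrete, here is a minimal Python sketch of what one record might look like and how a model's step-level judgments could be scored against the human labels. The field names (`image`, `question`, `answer`, `steps`, `step_labels`), the 1/0 label encoding, the sample values, and the `step_f1` helper are all assumptions made for illustration; they are not confirmed by this listing, so check the dataset card for the actual schema and the official metric.

```python
from typing import List

# Hypothetical record layout (assumed field names and made-up values).
sample = {
    "image": "images/example.png",
    "question": "What is the area of the triangle shown?",
    "answer": "12",
    "steps": [
        "Read the base (6) and height (4) from the figure.",
        "Area = base * height = 24.",
        "So the answer is 24.",
    ],
    # Assumed encoding: 1 = correct step, 0 = erroneous step.
    "step_labels": [1, 0, 0],
}


def step_f1(gold: List[int], pred: List[int], target: int) -> float:
    """F1 score for one label class over aligned step labels."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == target and p == target)
    fp = sum(1 for g, p in pairs if g != target and p == target)
    fn = sum(1 for g, p in pairs if g == target and p != target)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical model judgments: it misses the error in the second step.
predicted = [1, 1, 0]

error_f1 = step_f1(sample["step_labels"], predicted, target=0)
correct_f1 = step_f1(sample["step_labels"], predicted, target=1)
print(f"F1 on erroneous steps: {error_f1:.3f}")
print(f"F1 on correct steps:   {correct_f1:.3f}")
```

Scoring the erroneous and correct classes separately is one plausible choice here, since a model that marks every step as correct would otherwise look deceptively strong.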
Provider:
OpenGVLab



