VisualProcessBench
Hosted on the ModelScope community. Updated 2026-01-06; indexed 2025-03-22.
Download link:
https://modelscope.cn/datasets/OpenGVLab/VisualProcessBench
Resource description:
# VisualProcessBench
[\[📂 GitHub\]](https://github.com/OpenGVLab/InternVL)
[\[📜 Paper\]](https://arxiv.org/abs/2503.10291)
[\[🆕 Blog\]](https://internvl.github.io/blog/2025-03-13-VisualPRM/)
[\[🤗 model\]](https://huggingface.co/OpenGVLab/VisualPRM-8B)
[\[🤗 dataset\]](https://huggingface.co/datasets/OpenGVLab/VisualPRM400K)
[\[🤗 benchmark\]](https://huggingface.co/datasets/OpenGVLab/VisualProcessBench)
VisualProcessBench is a benchmark designed to measure how well process reward models (PRMs) and multimodal large language models (MLLMs) identify erroneous steps in multimodal reasoning tasks. It comprises 2,866 samples with a total of 26,950 human-annotated step-wise correctness labels (about 9.4 labeled steps per sample on average).
## Data fields
- Data fields for each sample:
| Key | Description |
| -------------- | ------------------------------------------------------------------------------------------ |
| `image` | List of image paths. |
| `question` | Input query. |
| `answer` | Ground-truth answer to this question. |
| `response` | The model-generated response to this question, which has been split into multiple steps. |
| `policy_model` | The model used to generate the response. |
| `data_source` | The source of this question. |
- Data fields for each response:
| Key | Description |
| --------------------- | -------------------------------------------------------------------------------------------------- |
| `steps` | Steps of this response. |
| `process_correctness` | Correctness annotation of each step. 1, 0, and -1 denote correct, neutral, and incorrect, respectively. |
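
As a rough illustration of how these fields fit together, the sketch below tallies the step labels of one response and locates its first incorrect step. It assumes that `response` is a dictionary whose `steps` and `process_correctness` entries are parallel lists, which is one reading of the tables above. The sample record, its values, and the helper `summarize_labels` are invented for illustration and are not taken from the benchmark.

```python
from collections import Counter

# Label values as described in the table above.
LABEL_NAMES = {1: "correct", 0: "neutral", -1: "incorrect"}

# Hypothetical sample, invented for illustration only (not from the benchmark).
sample = {
    "image": ["images/example.png"],
    "question": "What is the area of the triangle shown?",
    "answer": "6",
    "response": {
        "steps": [
            "The triangle has base 4 and height 3.",
            "Let's compute the area.",
            "Area = base * height = 12.",
        ],
        "process_correctness": [1, 0, -1],
    },
    "policy_model": "example-policy-model",
    "data_source": "example-source",
}


def summarize_labels(response):
    """Count labels per category and return the index of the first incorrect step.

    Assumes `steps` and `process_correctness` are parallel lists.
    """
    steps = response["steps"]
    labels = response["process_correctness"]
    if len(steps) != len(labels):
        raise ValueError("expected exactly one label per step")
    counts = Counter(LABEL_NAMES[label] for label in labels)
    # Index of the first step labeled -1, or None if no step is incorrect.
    first_error = next((i for i, label in enumerate(labels) if label == -1), None)
    return dict(counts), first_error


counts, first_error = summarize_labels(sample["response"])
print(counts)
print(first_error)
```

The length check guards against records where steps and labels fall out of alignment, since any per-step metric would otherwise be computed against the wrong step.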
## Data Examples

## License
This project is released under the MIT License. This project uses the pre-trained internlm2_5-7b-chat as a component, which is licensed under the Apache License 2.0.
## Citation
If you find this project useful in your research, please consider citing:
```BibTeX
@article{wang2025visualprm,
title={VisualPRM: An Effective Process Reward Model for Multimodal Reasoning},
author={Wang, Weiyun and Gao, Zhangwei and Chen, Lianjie and Chen, Zhe and Zhu, Jinguo and Zhao, Xiangyu and Liu, Yangzhou and Cao, Yue and Ye, Shenglong and Zhu, Xizhou and others},
journal={arXiv preprint arXiv:2503.10291},
year={2025}
}
```
Provider: maas
Created: 2025-03-15



