VisualPRM400K-v1.1
收藏魔搭社区2025-12-29 更新2025-04-26 收录
下载链接:
https://modelscope.cn/datasets/OpenGVLab/VisualPRM400K-v1.1
下载链接
链接失效反馈官方服务:
资源简介:
# VisualPRM400K-v1.1
[\[📂 GitHub\]](https://github.com/OpenGVLab/InternVL)
[\[📜 Paper\]](https://arxiv.org/abs/2503.10291)
[\[🆕 Blog\]](https://internvl.github.io/blog/2025-03-13-VisualPRM/)
[\[🤗 model\]](https://huggingface.co/OpenGVLab/VisualPRM-8B)
[\[🤗 dataset\]](https://huggingface.co/datasets/OpenGVLab/VisualPRM400K-v1.1)
[\[🤗 benchmark\]](https://huggingface.co/datasets/OpenGVLab/VisualProcessBench)
***NOTE: VisualPRM400K-v1.1 is a new version of VisualPRM400K, which is used to train [VisualPRM-8B-v1.1](https://huggingface.co/OpenGVLab/VisualPRM-8B-v1.1). Compared to the original version, v1.1 includes additional data sources and prompts during rollout sampling to enhance data diversity.***
***NOTE: To unzip the archive of images, please first run `cat images.zip_* > images.zip` and then run `unzip images.zip`.***
VisualPRM400K is a dataset comprising approximately 400K multimodal process supervision data. We generate the data using an automatic data pipeline. The key idea is to estimate the expected accuracy \\(mc_i\\) of the given step \\(s_{\leq i}\\) based on Monte Carlo sampling and consider the step correct if \\(mc_i>0\\). Please see our [paper](https://arxiv.org/abs/2503.10291) or [blog](https://internvl.github.io/blog/2025-03-13-VisualPRM/) for more details.
NOTE: This dataset is formulated as multi-turn conversation and the expected accuracy \\(mc_i\\) has been converted into correctness token \\(c_i \in \{+,-\}\\). If you want to use the annotations for expected accuracy, please refer to [this version](https://huggingface.co/datasets/OpenGVLab/VisualPRM400K-v1.1-Raw).
## Data Examples







## License
This project is released under the MIT License. This project uses the pre-trained internlm2_5-7b-chat as a component, which is licensed under the Apache License 2.0.
## Citation
If you find this project useful in your research, please consider citing:
```BibTeX
@article{wang2025visualprm,
title={VisualPRM: An Effective Process Reward Model for Multimodal Reasoning},
author={Wang, Weiyun and Gao, Zhangwei and Chen, Lianjie and Chen, Zhe and Zhu, Jinguo and Zhao, Xiangyu and Liu, Yangzhou and Cao, Yue and Ye, Shenglong and Zhu, Xizhou and others},
journal={arXiv preprint arXiv:2503.10291},
year={2025}
}
```
# VisualPRM400K-v1.1
[📢 GitHub 仓库](https://github.com/OpenGVLab/InternVL)
[📜 学术论文](https://arxiv.org/abs/2503.10291)
[🎉 博客文章](https://internvl.github.io/blog/2025-03-13-VisualPRM/)
[🤗 模型](https://huggingface.co/OpenGVLab/VisualPRM-8B)
[🤗 数据集](https://huggingface.co/datasets/OpenGVLab/VisualPRM400K-v1.1)
[🤗 基准测试集](https://huggingface.co/datasets/OpenGVLab/VisualProcessBench)
***注意:VisualPRM400K-v1.1 是 VisualPRM400K 的更新版本,用于训练 [VisualPRM-8B-v1.1](https://huggingface.co/OpenGVLab/VisualPRM-8B-v1.1)。相较于原始版本,v1.1 新增了展开采样阶段的数据源与提示词,以提升数据多样性。***
***注意:若需解压图像压缩包,请先执行 `cat images.zip_* > images.zip` 命令合并分卷压缩包,再执行 `unzip images.zip` 完成解压。***
VisualPRM400K 是一个包含约40万条多模态过程监督数据的数据集。我们通过自动化数据流水线生成该数据集,核心思路是基于蒙特卡洛采样(Monte Carlo Sampling)估算给定步骤 \(s_{\leq i}\) 的期望准确率 \(mc_i\),当 \(mc_i>0\) 时即判定该步骤正确。更多细节可参阅我们的[学术论文](https://arxiv.org/abs/2503.10291)或[博客文章](https://internvl.github.io/blog/2025-03-13-VisualPRM/)。
***注意:本数据集采用多轮对话格式构建,且已将期望准确率 \(mc_i\) 转换为正确性标记 \(c_i \in \{+, -\}\)。若需使用期望准确率的原始标注,请参阅[该版本](https://huggingface.co/datasets/OpenGVLab/VisualPRM400K-v1.1-Raw)。***
## 数据示例







## 许可证
本项目采用 MIT 许可证开源。本项目使用了预训练模型 internlm2_5-7b-chat 作为组件,该模型采用 Apache License 2.0 许可证。
## 引用
若本项目对您的研究有所帮助,请考虑引用如下文献:
BibTeX
@article{wang2025visualprm,
title={VisualPRM: An Effective Process Reward Model for Multimodal Reasoning},
author={Wang, Weiyun and Gao, Zhangwei and Chen, Lianjie and Chen, Zhe and Zhu, Jinguo and Zhao, Xiangyu and Liu, Yangzhou and Cao, Yue and Ye, Shenglong and Zhu, Xizhou and others},
journal={arXiv preprint arXiv:2503.10291},
year={2025}
}
提供机构:
maas
创建时间:
2025-04-22



