
OVM-process

ModelScope community, updated 2025-11-02, indexed 2025-01-25
Download link:
https://modelscope.cn/datasets/FreedomIntelligence/OVM-process
Resource description:
The GSM8K training dataset for process reward models from the paper [OVM, Outcome-supervised Value Models for Planning in Mathematical Reasoning](https://arxiv.org/pdf/2311.09724.pdf). The responses were generated by llama2-7b and the labels were annotated by GPT-4. Steps are split on the newlines in each response. `step_labels` indicates the logical correctness of each step, defined as "logically correct and it's based on accurate premises, not necessarily helps to solve the problem"; `step_labels_progress` indicates the helpfulness of each step, defined as "logically correct, based on accurate premises, and helps to solve the problem".
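As a rough illustration of how the step splitting and the two label lists fit together, here is a minimal Python sketch. It rests on assumptions not confirmed by this page: that each record stores the generated text under a key named `response`, and that both label fields are lists of 0/1 values with one entry per step. The sample record is invented for illustration and is not taken from the dataset.

```python
def align_steps(record):
    """Pair each newline-delimited step with its two labels.

    Assumes (hypothetically) that `record` has a "response" string and
    that "step_labels" / "step_labels_progress" are lists of 0/1 values,
    one per step.
    """
    steps = record["response"].split("\n")
    labels = record["step_labels"]
    progress = record["step_labels_progress"]
    if not (len(steps) == len(labels) == len(progress)):
        raise ValueError("number of steps and labels do not match")
    return [
        {"step": s, "correct": bool(c), "helpful": bool(h)}
        for s, c, h in zip(steps, labels, progress)
    ]


# Invented example record (not real data from OVM-process).
sample = {
    "response": (
        "Tom has 3 apples and buys 2 more, so he has 3 + 2 = 5 apples.\n"
        "An apple is a fruit.\n"
        "The answer is 5."
    ),
    "step_labels": [1, 1, 1],           # every step is logically correct
    "step_labels_progress": [1, 0, 1],  # the second step does not help
}

aligned = align_steps(sample)
for row in aligned:
    print(row["correct"], row["helpful"], row["step"])
```

The second step in the sample shows the difference between the two fields: it is correct under the `step_labels` definition but does not move the solution forward, so it would not count as helpful under `step_labels_progress`.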

Provider:
maas
Created:
2025-01-20