openbmb/RLPR-Train-Dataset
Source: Hugging Face. Updated: 2025-06-30. Indexed: 2025-07-05.
Download link:
https://hf-mirror.com/datasets/openbmb/RLPR-Train-Dataset
Overview:
RLPR-Train-Dataset is a curated collection of 77k high-quality reasoning prompts designed to improve the reasoning capabilities of Large Language Models (LLMs) in the general (non-mathematical) domain. The prompts are selected from the WebInstruct collection: only non-mathematical prompts are kept, and GPT-4.1 is used to filter them so that the remaining prompts are sufficiently difficult. Models trained with the RLPR framework on this dataset show significant improvements in reasoning without relying on external verifiers.
Provider:
openbmb
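
For readers who want to fetch the data programmatically, the following is a minimal sketch rather than an official loader. It assumes the Hugging Face `datasets` package is installed and that the dataset exposes a split named `train`, which the listing above does not confirm. The helper names `dataset_url` and `load_rlpr_train` are hypothetical and exist only for this illustration.

```python
import os
from typing import Optional

# Repository id and mirror host taken from the download link in this listing.
DATASET_ID = "openbmb/RLPR-Train-Dataset"
MIRROR_ENDPOINT = "https://hf-mirror.com"


def dataset_url(endpoint: str = MIRROR_ENDPOINT) -> str:
    """Build the dataset page URL for a given Hub endpoint (hypothetical helper)."""
    return f"{endpoint.rstrip('/')}/datasets/{DATASET_ID}"


def load_rlpr_train(split: str = "train", endpoint: Optional[str] = None):
    """Load the dataset with the `datasets` library (hypothetical helper).

    Assumption: a split called "train" exists. Adjust `split` if it does not.
    """
    if endpoint is not None:
        # HF_ENDPOINT is read when huggingface_hub is first imported, so it
        # must be set before the import below runs in this process.
        os.environ["HF_ENDPOINT"] = endpoint
    from datasets import load_dataset  # imported lazily, after HF_ENDPOINT is set

    return load_dataset(DATASET_ID, split=split)


if __name__ == "__main__":
    print(dataset_url())
    # Requires network access; uncomment to download via the mirror.
    # ds = load_rlpr_train(endpoint=MIRROR_ENDPOINT)
    # print(ds[0])
```

Passing `endpoint=MIRROR_ENDPOINT` routes the download through the mirror shown in the download link. Omitting it uses whatever endpoint the environment already configures.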



