JingHaoZ/RLFR-Dataset-LM
Source: Hugging Face | Updated: 2025-11-14 | Indexed: 2025-10-25
Download link:
https://hf-mirror.com/datasets/JingHaoZ/RLFR-Dataset-LM
Description:
RLFR-Dataset-LM is a collection of 102k math samples designed to enhance the reasoning capabilities of large language models (LLMs). It has two parts. The offline start part contains 94k samples from the default split of OpenR1-Math-220k, with detailed, high-quality solutions generated by DeepSeek R1; it is used to establish the flow environment for reward preparation. The RL part contains 8k samples from MATH-lvl3to5-8k with verifiable answers and is used for RLFR training. Both parts are sourced directly from their respective repositories.
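
One way to check the two-part composition described above is to list what the repository exposes and count rows per split. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable; the listing does not state config or split names, so the code discovers them rather than hard-coding any. The labels in `APPROX_PART_SIZES` are illustrative only and simply restate the figures from the description.

```python
"""Sketch: inspect RLFR-Dataset-LM with the Hugging Face `datasets` library."""

REPO_ID = "JingHaoZ/RLFR-Dataset-LM"

# Approximate part sizes as stated in the description (not read from the repo).
# The keys are illustrative labels, NOT confirmed config or split names.
APPROX_PART_SIZES = {"offline_start": 94_000, "rl": 8_000}


def summarize(repo_id: str = REPO_ID) -> dict:
    """Return {config_name: {split_name: num_rows}} for every config in the repo."""
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import get_dataset_config_names, load_dataset

    summary = {}
    for config in get_dataset_config_names(repo_id):
        # With no `split` argument, load_dataset returns a DatasetDict (split -> Dataset).
        dataset_dict = load_dataset(repo_id, config)
        summary[config] = {split: ds.num_rows for split, ds in dataset_dict.items()}
    return summary


if __name__ == "__main__":
    # Total implied by the description: 94k + 8k = 102k.
    print("stated total:", sum(APPROX_PART_SIZES.values()))
    # Requires network access; downloads the data on first use.
    for config, splits in summarize().items():
        print(config, splits)
```

Running the script downloads the data on first use, so the row counts it prints can be compared against the 94k and 8k figures given in the description.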
Provider:
JingHaoZ



