
open-rs

ModelScope community · Updated 2025-12-05 · Indexed 2025-03-29
Download link: https://modelscope.cn/datasets/knoveleng/open-rs
Resource description:
# Open-RS Dataset

## Dataset Description

- **Repository**: [knoveleng/open-rs](https://github.com/knoveleng/open-rs)
- **Paper**: [Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn’t](https://arxiv.org/abs/2503.16219)

### Summary

The `open-rs` dataset contains 7,000 mathematical reasoning problems, including 3,000 hard problems from `open-s1` and 4,000 (1,000 easy + 3,000 hard problems) from `open-deepscaler`. It’s a core component of the [Open RS project](https://github.com/knoveleng/open-rs), enhancing reasoning in small LLMs via reinforcement learning.

## Usage

Load the dataset using the Hugging Face `datasets` library:

```python
from datasets import load_dataset

ds = load_dataset("knoveleng/open-rs")["train"]
print(ds[0])
```

## Dataset Structure

### Data Instance

An example entry:

```json
{
  "problem": "Let \(S(M)\) denote the sum of digits of a positive integer \(M\) in base 10. Let \(N\) be the smallest positive integer such that \(S(N) = 2013\). What is \(S(5N + 2013)\)?",
  "solution": "1. **Find smallest \(N\) with \(S(N) = 2013\):** To minimize \(N\), use mostly 9s. Since \(2013 \div 9 = 223\), \(N\) could be 223 nines (sum \(9 \times 223 = 2007\)), then adjust the first digit to 7, making \(N = 7 \times 10^{223} - 1\). Sum: \(7 + 222 \times 9 = 2013\). 2. **Compute \(5N + 2013\):** \(5N = 5 \times (7 \times 10^{223} - 1) = 35 \times 10^{223} - 5\), so \(5N + 2013 = 35 \times 10^{223} + 2008\). 3. **Calculate \(S(5N + 2013\):** This is 35 followed by 219 zeros, then 2008 (last 4 digits). Sum: \(3 + 5 + 2 + 0 + 0 + 8 = 18\). Final answer: \( \boxed{18} \).",
  "answer": "18",
  "level": "Hard"
}
```

### Data Fields

- **`problem`**: Mathematical question (string).
- **`solution`**: Detailed solution steps (string); if no official solution exists, the answer is provided in LaTeX format.
- **`answer`**: Correct final answer (string).
- **`level`**: Difficulty level (string): "Hard" or "Easy".
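
The arithmetic in the example entry under Data Instance can be checked directly with Python's arbitrary-precision integers. The sketch below is illustrative only and is not part of the dataset or the Open RS code; the helper names `digit_sum` and `smallest_with_digit_sum` are hypothetical. It builds the smallest integer with digit sum 2013 and evaluates the requested quantity. Note that 7 × 10^223 − 1 is written as the digit 6 followed by 223 nines, so its digit sum is 6 + 223 × 9 = 2013.

```python
def digit_sum(m: int) -> int:
    """Sum of the base-10 digits of a positive integer (S(M) in the problem)."""
    return sum(int(d) for d in str(m))


def smallest_with_digit_sum(target: int) -> int:
    """Smallest positive integer whose digits sum to `target`.

    Fewest digits wins, so use as many 9s as possible and put the
    remainder (if any) in the leading position.
    """
    nines, lead = divmod(target, 9)
    digits = (str(lead) if lead else "") + "9" * nines
    return int(digits)


n = smallest_with_digit_sum(2013)

# Same value as the closed form used in the example solution.
assert n == 7 * 10**223 - 1

print(digit_sum(n))             # 2013
print(digit_sum(5 * n + 2013))  # 18
```

The second printed value matches the `answer` field of the example record.
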
## Citation

```bibtex
@misc{dang2025reinforcementlearningreasoningsmall,
  title={Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't},
  author={Quy-Anh Dang and Chris Ngo},
  year={2025},
  eprint={2503.16219},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2503.16219},
}
```

Provider: maas
Created: 2025-03-27