open-rs

Name: open-rs
Creator: maas
Published: 2025-12-05 12:06:14
License: No description available

ModelScope community · Updated 2025-12-05 · Indexed 2025-03-29

Download link:

https://modelscope.cn/datasets/knoveleng/open-rs


Description:

# Open-RS Dataset

## Dataset Description

- **Repository**: [knoveleng/open-rs](https://github.com/knoveleng/open-rs)
- **Paper**: [Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn’t](https://arxiv.org/abs/2503.16219)

### Summary

The `open-rs` dataset contains 7,000 mathematical reasoning problems, including 3,000 hard problems from `open-s1` and 4,000 (1,000 easy + 3,000 hard problems) from `open-deepscaler`. It’s a core component of the [Open RS project](https://github.com/knoveleng/open-rs), enhancing reasoning in small LLMs via reinforcement learning.

## Usage

Load the dataset using the Hugging Face `datasets` library:

```python
from datasets import load_dataset

ds = load_dataset("knoveleng/open-rs")["train"]
print(ds[0])
```

## Dataset Structure

### Data Instance

An example entry:

```json
{
  "problem": "Let \(S(M)\) denote the sum of digits of a positive integer \(M\) in base 10. Let \(N\) be the smallest positive integer such that \(S(N) = 2013\). What is \(S(5N + 2013)\)?",
  "solution": "1. **Find smallest \(N\) with \(S(N) = 2013\):** To minimize \(N\), use mostly 9s. Since \(2013 \div 9 = 223\), \(N\) could be 223 nines (sum \(9 \times 223 = 2007\)), then adjust the first digit to 7, making \(N = 7 \times 10^{223} - 1\). Sum: \(7 + 222 \times 9 = 2013\). 2. **Compute \(5N + 2013\):** \(5N = 5 \times (7 \times 10^{223} - 1) = 35 \times 10^{223} - 5\), so \(5N + 2013 = 35 \times 10^{223} + 2008\). 3. **Calculate \(S(5N + 2013\):** This is 35 followed by 219 zeros, then 2008 (last 4 digits). Sum: \(3 + 5 + 2 + 0 + 0 + 8 = 18\). Final answer: \( \boxed{18} \).",
  "answer": "18",
  "level": "Hard"
}
```

### Data Fields

- **`problem`**: Mathematical question (string).
- **`solution`**: Detailed solution steps (string); if no official solution exists, the answer is provided in LaTeX format.
- **`answer`**: Correct final answer (string).
- **`level`**: Difficulty level (string): "Hard" or "Easy".
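The field descriptions above suggest a simple per-record sanity check: all four fields are strings, `level` is either "Hard" or "Easy", and, when the `solution` ends in a `\boxed{...}` expression, its contents agree with `answer`. The following is a minimal sketch of such a check, not part of the Open RS code base. The helper names `extract_boxed` and `check_record` are hypothetical, the sample records are shortened or invented for illustration, and the exact string comparison is an assumption that would flag mathematically equivalent but differently written answers.

```python
from collections import Counter
from typing import List, Optional

REQUIRED_FIELDS = ("problem", "solution", "answer", "level")
ALLOWED_LEVELS = {"Hard", "Easy"}


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1  # we are inside the brace opened by the marker
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars).strip()
        chars.append(ch)
    return None  # braces never closed


def check_record(record: dict) -> List[str]:
    """Return a list of human-readable issues found in one record."""
    issues = []
    for field in REQUIRED_FIELDS:
        if not isinstance(record.get(field), str):
            issues.append(f"{field}: missing or not a string")
    if record.get("level") not in ALLOWED_LEVELS:
        issues.append(f"level: unexpected value {record.get('level')!r}")
    boxed = extract_boxed(record.get("solution") or "")
    # Assumption: strict string equality; "0.5" vs "\\frac{1}{2}" would be flagged.
    if boxed is not None and boxed != record.get("answer"):
        issues.append(
            f"answer: boxed value {boxed!r} differs from {record.get('answer')!r}"
        )
    return issues


# Illustrative records only: the first is a shortened form of the example
# entry, the second is invented to exercise nested braces.
sample_records = [
    {
        "problem": "What is S(5N + 2013)?",
        "solution": "Final answer: \\( \\boxed{18} \\).",
        "answer": "18",
        "level": "Hard",
    },
    {
        "problem": "What is 1/2 + 1/4?",
        "solution": "\\boxed{\\frac{3}{4}}",
        "answer": "\\frac{3}{4}",
        "level": "Easy",
    },
]

level_counts = Counter(r["level"] for r in sample_records)
all_issues = [issue for r in sample_records for issue in check_record(r)]
```

The same `check_record` function could be applied to each row of the loaded `train` split, since rows are exposed as dictionaries with these four keys. Whether every record in the real dataset passes is not something this sketch asserts.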
## Citation

```bibtex
@misc{dang2025reinforcementlearningreasoningsmall,
  title={Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't},
  author={Quy-Anh Dang and Chris Ngo},
  year={2025},
  eprint={2503.16219},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2503.16219},
}
```


Provider:

maas

Created:

2025-03-27
