open-rs
Source: ModelScope community (魔搭社区). Updated 2025-12-05; indexed 2025-03-29.
Download link: https://modelscope.cn/datasets/knoveleng/open-rs
# Open-RS Dataset
## Dataset Description
- **Repository**: [knoveleng/open-rs](https://github.com/knoveleng/open-rs)
- **Paper**: [Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn’t](https://arxiv.org/abs/2503.16219)
### Summary
The `open-rs` dataset contains 7,000 mathematical reasoning problems: 3,000 hard problems from `open-s1` and 4,000 problems (1,000 easy and 3,000 hard) from `open-deepscaler`. It is a core component of the [Open RS project](https://github.com/knoveleng/open-rs), which aims to enhance reasoning in small LLMs via reinforcement learning.
## Usage
Load the dataset using the Hugging Face `datasets` library:
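```python
from datasets import load_dataset
# Download the dataset from the Hugging Face Hub and select its "train" split.
ds = load_dataset("knoveleng/open-rs")["train"]
# Print the first record, a dict with the fields described under "Data Fields".
print(ds[0])
```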
## Dataset Structure
### Data Instance
An example entry:
```json
{
"problem": "Let \(S(M)\) denote the sum of digits of a positive integer \(M\) in base 10. Let \(N\) be the smallest positive integer such that \(S(N) = 2013\). What is \(S(5N + 2013)\)?",
"solution": "1. **Find smallest \(N\) with \(S(N) = 2013\):** To minimize \(N\), use mostly 9s. Since \(2013 \div 9 = 223\), \(N\) could be 223 nines (sum \(9 \times 223 = 2007\)), then adjust the first digit to 7, making \(N = 7 \times 10^{223} - 1\). Sum: \(7 + 222 \times 9 = 2013\). 2. **Compute \(5N + 2013\):** \(5N = 5 \times (7 \times 10^{223} - 1) = 35 \times 10^{223} - 5\), so \(5N + 2013 = 35 \times 10^{223} + 2008\). 3. **Calculate \(S(5N + 2013\):** This is 35 followed by 219 zeros, then 2008 (last 4 digits). Sum: \(3 + 5 + 2 + 0 + 0 + 8 = 18\). Final answer: \( \boxed{18} \).",
"answer": "18",
"level": "Hard"
}
```
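The arithmetic in this sample can be checked with Python's arbitrary-precision integers. The sketch below is only an illustration and is not part of the dataset or the Open RS code; the helper name `digit_sum` is made up for this example. It relies on \(2013 = 9 \times 223 + 6\), so the smallest \(N\) with digit sum 2013 is a 6 followed by 223 nines, which equals the \(7 \times 10^{223} - 1\) given in the solution text.

```python
def digit_sum(n: int) -> int:
    """Sum of the base-10 digits of a non-negative integer."""
    return sum(int(d) for d in str(n))


# 2013 = 9 * 223 + 6, so the smallest N is the digit 6 followed by 223 nines.
N = int("6" + "9" * 223)

# Same number written in the closed form used by the sample solution.
assert N == 7 * 10**223 - 1
assert digit_sum(N) == 6 + 223 * 9 == 2013

# 5N + 2013 = 35 * 10^223 + 2008: the digits 3 and 5, a run of zeros, then 2008.
result = digit_sum(5 * N + 2013)
print(result)  # 18
```

The result agrees with the `answer` field of the sample.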
### Data Fields
- **`problem`**: Mathematical question (string).
- **`solution`**: Detailed solution steps (string); if no official solution exists, the answer is provided in LaTeX format.
- **`answer`**: Correct final answer (string).
- **`level`**: Difficulty level (string): "Hard" or "Easy".
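
The field descriptions above suggest a simple per-record check. The following is a minimal sketch, assuming each row is a plain dict like the one returned by `ds[0]`. The helper names `validate_record` and `boxed_answer` are hypothetical and are not functions from the Open RS project, and the sample row is made up rather than taken from the dataset.

```python
import re

REQUIRED_FIELDS = ("problem", "solution", "answer", "level")
ALLOWED_LEVELS = {"Hard", "Easy"}

# Matches \boxed{...} with no nested braces; nested LaTeX such as
# \boxed{\frac{1}{2}} is deliberately out of scope for this sketch.
BOXED_PATTERN = re.compile(r"\\boxed\{([^{}]*)\}")


def validate_record(record: dict) -> list:
    """Return a list of schema problems; an empty list means the record looks valid."""
    issues = []
    for field in REQUIRED_FIELDS:
        if not isinstance(record.get(field), str):
            issues.append(f"{field}: missing or not a string")
    level = record.get("level")
    if isinstance(level, str) and level not in ALLOWED_LEVELS:
        issues.append(f"level: unexpected value {level!r}")
    return issues


def boxed_answer(solution: str):
    """Return the content of the last \\boxed{...} in a solution, or None if absent."""
    matches = BOXED_PATTERN.findall(solution)
    return matches[-1].strip() if matches else None


# Made-up row for illustration; it is not taken from the dataset.
sample = {
    "problem": "What is \\(2 + 3\\)?",
    "solution": "Adding gives \\(2 + 3 = 5\\). Final answer: \\( \\boxed{5} \\).",
    "answer": "5",
    "level": "Easy",
}

print(validate_record(sample))                               # []
print(boxed_answer(sample["solution"]) == sample["answer"])  # True
```

On the loaded split, rows of one difficulty could be selected with `ds.filter(lambda r: r["level"] == "Hard")`. If the breakdown in the Summary holds, that would keep 6,000 of the 7,000 rows.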
## Citation
```bibtex
@misc{dang2025reinforcementlearningreasoningsmall,
title={Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't},
author={Quy-Anh Dang and Chris Ngo},
year={2025},
eprint={2503.16219},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2503.16219},
}
```
Provider: maas
Created: 2025-03-27