open-s1
Source: ModelScope community (updated 2025-12-05, indexed 2025-03-29)
Download link: https://modelscope.cn/datasets/knoveleng/open-s1
# Open-S1 Dataset
## Dataset Description
- **Repository**: [knoveleng/open-rs](https://github.com/knoveleng/open-rs)
- **Paper**: [Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn’t](https://arxiv.org/abs/2503.16219)
### Summary
The `open-s1` dataset contains 18,615 mathematical reasoning problems, filtered from the [s1K dataset](https://huggingface.co/datasets/simplescaling/s1K). It’s part of the [Open RS](https://github.com/knoveleng/open-rs) project, aimed at enhancing reasoning in small LLMs using reinforcement learning.
## Usage
Load the dataset with the Hugging Face `datasets` library:
```python
from datasets import load_dataset
ds = load_dataset("knoveleng/open-s1")["train"]
print(ds[0])
```
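Once the data is loaded, a quick way to get a feel for it is to tally records by their `source` field. The sketch below is illustrative only: `count_by_source` is a hypothetical helper, and the three stand-in records (including the `hypothetical/other_source` value) are made up so the snippet runs without downloading anything.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_by_source(records: Iterable[Mapping[str, object]]) -> Counter:
    """Tally how many records carry each `source` value."""
    return Counter(str(r.get("source", "unknown")) for r in records)


# Stand-in records so the sketch runs offline (assumption: real rows are
# dict-like with a "source" key, as in the example entry of this card).
sample = [
    {"problem": "p1", "source": "AI-MO/NuminaMath-CoT/aops_forum"},
    {"problem": "p2", "source": "AI-MO/NuminaMath-CoT/aops_forum"},
    {"problem": "p3", "source": "hypothetical/other_source"},
]

for source, n in count_by_source(sample).most_common():
    print(f"{source}: {n}")
```

With the real dataset, passing the loaded `ds` object in place of `sample` should work the same way, since iterating over it yields one dict per row.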
## Dataset Structure
### Data Instance
An example entry:
```json
{
  "problem": "Decide whether there exists a function $f : \\mathbb{Z} \\rightarrow \\mathbb{Z}$ such that for each $k = 0, 1, ..., 1996$ and any integer $m$, the equation $f(x) + kx = m$ has at least one integral solution $x$.",
  "solution": "To determine if such a function \\(f : \\mathbb{Z} \\rightarrow \\mathbb{Z}\\) exists, consider the equation \\(f(x) + kx = m\\). Rewrite it as \\(f(x) = m - kx\\). Define \\(x = 1997n + t\\), where \\(n \\in \\mathbb{Z}\\) and \\(t = 0, 1, ..., 1996\\), covering all integers uniquely. Set \\(f(1997n + t) = m - k(1997n + t)\\). Verify: \\(f(1997n + t) + k(1997n + t) = (m - k(1997n + t)) + k(1997n + t) = m\\), which holds for any \\(m\\). Thus, the function satisfies the condition. Final answer: \\( \\boxed{f(1997n + t) = m - k(1997n + t)} \\).",
  "answer": "f(1997n + t) = m - k(1997n + t)",
  "source": "AI-MO/NuminaMath-CoT/aops_forum",
  "messages": [
    {
      "content": "Decide whether there exists a function $f : \\mathbb{Z} \\rightarrow \\mathbb{Z}$ such that for each $k = 0, 1, ..., 1996$ and any integer $m$, the equation $f(x) + kx = m$ has at least one integral solution $x$.",
      "role": "user"
    },
    {
      "content": "<think>\nLet’s analyze: rewrite \\(f(x) + kx = m\\) as \\(f(x) = m - kx\\). Use \\(x = 1997n + t\\), \\(t = 0, ..., 1996\\), \\(n \\in \\mathbb{Z}\\). Define \\(f(1997n + t) = m - k(1997n + t)\\). Check: \\(f(1997n + t) + k(1997n + t) = m - k(1997n + t) + k(1997n + t) = m\\). This works for all \\(m\\), so the answer is \\( \\boxed{f(1997n + t) = m - k(1997n + t)} \\).\n</think>\nFinal answer: $\\boxed{f(1997n + t) = m - k(1997n + t)}$",
      "role": "assistant"
    }
  ]
}
```
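In the example entry, the assistant message wraps its reasoning in `<think>...</think>` and ends with a `\boxed{...}` answer. A minimal sketch of separating the two is shown below. It assumes every assistant message follows that layout, which this card does not guarantee; `split_reasoning` and `extract_boxed` are hypothetical helper names, and the `content` string is a shortened stand-in for the real message.

```python
import re
from typing import Optional, Tuple

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(content: str) -> Tuple[Optional[str], str]:
    """Return (text inside <think>...</think>, remaining visible text)."""
    match = THINK_RE.search(content)
    if match is None:
        return None, content.strip()
    reasoning = match.group(1).strip()
    visible = (content[:match.start()] + content[match.end():]).strip()
    return reasoning, visible


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...}, honoring nested braces."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    begin = start + len(marker)
    depth = 1
    for i in range(begin, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[begin:i]
    return None  # unbalanced braces


# Shortened stand-in for an assistant message (not the full dataset text).
content = (
    "<think>\nRewrite f(x) + kx = m as f(x) = m - kx.\n</think>\n"
    "Final answer: $\\boxed{f(1997n + t) = m - k(1997n + t)}$"
)
reasoning, visible = split_reasoning(content)
print(reasoning)
print(extract_boxed(visible))
```

The brace counter is used instead of a single regular expression because boxed answers often contain nested LaTeX groups such as `\frac{1}{2}`.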
### Data Fields
- **`problem`**: Mathematical question (string).
- **`solution`**: Detailed solution (string).
- **`answer`**: Final answer (string).
- **`source`**: Origin of the problem as recorded in the s1K dataset, e.g. `AI-MO/NuminaMath-CoT/aops_forum` (string).
- **`messages`**: List of user/assistant interactions, each with:
  - `content`: Text of the message (string).
  - `role`: Role of the sender, e.g., "user" or "assistant" (string).
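The field list above can be turned into a lightweight sanity check before feeding records into a training pipeline. The following is a sketch under the assumption that each row is a plain `dict` shaped as described; `validate_record` is a hypothetical helper, not part of the Open RS codebase, and the `good` record uses placeholder values.

```python
from typing import Any, Dict, List

STRING_FIELDS = ("problem", "solution", "answer", "source")


def validate_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of schema problems; an empty list means none were found."""
    errors: List[str] = []
    for name in STRING_FIELDS:
        if not isinstance(record.get(name), str):
            errors.append(f"{name!r} should be a string")

    messages = record.get("messages")
    if not isinstance(messages, list):
        errors.append("'messages' should be a list")
        return errors

    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            errors.append(f"messages[{i}] should be a dict")
            continue
        for key in ("content", "role"):
            if not isinstance(msg.get(key), str):
                errors.append(f"messages[{i}][{key!r}] should be a string")
    return errors


# Placeholder record that follows the documented layout.
good = {
    "problem": "What is 1 + 1?",
    "solution": "Add the two numbers.",
    "answer": "2",
    "source": "AI-MO/NuminaMath-CoT/aops_forum",
    "messages": [
        {"content": "What is 1 + 1?", "role": "user"},
        {"content": "<think>1 + 1 = 2</think>\n$\\boxed{2}$", "role": "assistant"},
    ],
}
print(validate_record(good))
```

Returning a list of messages rather than raising on the first failure makes it easier to report every problem in a malformed row at once.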
## Citation
```bibtex
@misc{dang2025reinforcementlearningreasoningsmall,
title={Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't},
author={Quy-Anh Dang and Chris Ngo},
year={2025},
eprint={2503.16219},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2503.16219},
}
```
Provider: maas
Created: 2025-03-27



