OpenThoughts-114k-math

Name: OpenThoughts-114k-math
Creator: maas
Published: 2026-01-06 16:22:21
License: 暂无描述

魔搭社区2026-01-06 更新2025-02-08 收录

下载链接：

https://modelscope.cn/datasets/open-r1/OpenThoughts-114k-math

下载链接

链接失效反馈

官方服务：

资源简介：

This is a filtered and metadata enriched version of [`open-thoughts/OpenThoughts-114k`](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k). While the original dataset is a valuable resource containing [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) outputs, it has very little metadata (only 2 fields: `system` and `conversations`). It does not contain, for instance, the original solution label, which means that we can not verify the model answers. ## What we did - filtered the dataset for math content (math questions were prefixed by "Return your final response within \\boxed{}." -- see [here](https://github.com/open-thoughts/open-thoughts/blob/main/open_thoughts/math/reason.py#L16C43-L16C90)) - found the original questions in the [`AI-MO/NuminaMath-CoT`](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT) and mapped them back to each generation - verified model generations using our [Math-Verify library](https://github.com/huggingface/Math-Verify) - added a metadata field with the token count of each DeepSeek-R1 completion ## Data structure - `source`: original `source` from Numina-Math - `problem`: problem statement, from Numina-Math - `solution`: original solution/gold label, from Numina-Math - `messages`: message turns for finetuning on the correct solutions, from Numina-Math - `system`: system prompt sent to DeepSeek-R1, from OpenThoughts - `conversations`: message turns from the DeepSeek-R1 generation. The last turn is the model output, from OpenThoughts - `generated_token_count`: number of tokens (counted using the DeepSeek-R1 tokenizer) of the model output. - `correct`: label indicating if the DeepSeek-R1 generated solution matches the ground truth `solution`. Checked with [Math-Verify library](https://github.com/huggingface/Math-Verify) ## Some statistics - The original OpenThoughts-114k dataset has **89120/113957 (78%)** math rows - Of those, **56730/89120 (63%)** have correct answers, as checked by Math-Verify - There is a single generation per question - Token count distribution: mean=6366.67, std_dev=4662.88 tokens ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62596f9e1c0a084224b93e00/aPYBSni3Ft6VK1VJkExtS.png)

## 数据集概述本数据集是对 [`open-thoughts/OpenThoughts-114k`](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) 进行过滤并补充元数据后的版本。原始数据集作为包含[DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1)输出的宝贵研究资源，但其元数据极为匮乏，仅包含`system`和`conversations`两个字段。例如，其缺失原始解答标签，导致无法验证模型生成答案的正确性。 ## 数据处理流程 - 对数据集进行数学内容过滤：原始数学问题均以"Return your final response within \boxed{}."作为前缀（详见[此处](https://github.com/open-thoughts/open-thoughts/blob/main/open_thoughts/math/reason.py#L16C43-L16C90)） - 从[`AI-MO/NuminaMath-CoT`](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT)中检索原始问题，并将其与每条模型生成结果进行一一映射 - 使用我们的[Math-Verify库](https://github.com/huggingface/Math-Verify)对模型生成内容进行正确性验证 - 新增元数据字段，用于记录每条DeepSeek-R1生成结果的Token数量 ## 数据结构各字段说明如下： - `source`：源自Numina-Math的原始`source`字段 - `problem`：源自Numina-Math的问题描述文本 - `solution`：源自Numina-Math的原始解答/标准答案 - `messages`：源自Numina-Math的、用于基于正确解答进行微调的对话轮次 - `system`：源自OpenThoughts的、发送至DeepSeek-R1的系统提示词 - `conversations`：源自OpenThoughts的、DeepSeek-R1生成的对话轮次，最后一轮为模型输出内容 - `generated_token_count`：模型输出内容的Token总数（采用DeepSeek-R1的Tokenizer进行统计） - `correct`：用于标识DeepSeek-R1生成的解答是否与标准答案`solution`一致的标签，通过[Math-Verify库](https://github.com/huggingface/Math-Verify)完成校验 ## 统计信息 - 原始OpenThoughts-114k数据集中共有**89120/113957（78%）**条数学相关数据 - 其中经Math-Verify校验后，**56730/89120（63%）**的模型生成结果为正确解答 - 每个问题仅对应一条模型生成结果 - Token数量分布：均值=6366.67，标准差=4662.88 Tokens ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62596f9e1c0a084224b93e00/aPYBSni3Ft6VK1VJkExtS.png)

提供机构：

maas

创建时间：

2025-02-11

5,000+

优质数据集

54 个

任务类型

进入经典数据集