five

OpenThoughts-114k-math-open-r1

收藏
魔搭社区2025-11-12 更新2025-02-15 收录
下载链接:
https://modelscope.cn/datasets/AI-ModelScope/OpenThoughts-114k-math-open-r1
下载链接
链接失效反馈
官方服务:
资源简介:
This is a filtered and metadata enriched version of [`open-thoughts/OpenThoughts-114k`](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k). While the original dataset is a valuable resource containing [DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1) outputs, it has very little metadata (only 2 fields: `system` and `conversations`). It does not contain, for instance, the original solution label, which means that we can not verify the model answers. ## What we did - filtered the dataset for math content (math questions were prefixed by "Return your final response within \\boxed{}." -- see [here](https://github.com/open-thoughts/open-thoughts/blob/main/open_thoughts/math/reason.py#L16C43-L16C90)) - found the original questions in the [`AI-MO/NuminaMath-CoT`](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT) and mapped them back to each generation - verified model generations using our [Math-Verify library](https://github.com/huggingface/Math-Verify) - added a metadata field with the token count of each DeepSeek-R1 completion ## Data structure - `source`: original `source` from Numina-Math - `problem`: problem statement, from Numina-Math - `solution`: original solution/gold label, from Numina-Math - `messages`: message turns for finetuning on the correct solutions, from Numina-Math - `system`: system prompt sent to DeepSeek-R1, from OpenThoughts - `conversations`: message turns from the DeepSeek-R1 generation. The last turn is the model output, from OpenThoughts - `generated_token_count`: number of tokens (counted using the DeepSeek-R1 tokenizer) of the model output. - `correct`: label indicating if the DeepSeek-R1 generated solution matches the ground truth `solution`. Checked with [Math-Verify library](https://github.com/huggingface/Math-Verify) ## Some statistics - The original OpenThoughts-114k dataset has **89120/113957 (78%)** math rows - Of those, **56730/89120 (63%)** have correct answers, as checked by Math-Verify - There is a single generation per question - Token count distribution: mean=6366.67, std_dev=4662.88 tokens ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62596f9e1c0a084224b93e00/aPYBSni3Ft6VK1VJkExtS.png)

本数据集是对 [`open-thoughts/OpenThoughts-114k`](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) 进行过滤与元数据增强后的版本。 原始数据集作为包含[DeepSeek-R1](https://huggingface.co/deepseek-ai/DeepSeek-R1)输出的优质资源,仅自带极少量元数据(仅含`system`与`conversations`两个字段),例如缺少原始解答标签,导致无法验证模型生成答案的正确性。 ### 我们所开展的工作 - 针对数学类内容完成数据集过滤:数学题目均以“Return your final response within oxed{}.”作为前缀(详见[此处](https://github.com/open-thoughts/open-thoughts/blob/main/open_thoughts/math/reason.py#L16C43-L16C90)) - 从 [`AI-MO/NuminaMath-CoT`](https://huggingface.co/datasets/AI-MO/NuminaMath-CoT) 数据集内获取原始问题,并将其与每条模型生成结果进行映射关联 - 借助我们开发的[Math-Verify工具库](https://github.com/huggingface/Math-Verify)对模型生成内容进行正确性校验 - 新增元数据字段,用于统计并记录每条DeepSeek-R1生成结果的Token数量 ### 数据结构 - `source`:源自Numina-Math的原始来源字段 - `problem`:源自Numina-Math的问题题干 - `solution`:源自Numina-Math的原始标准解答(金标签) - `messages`:用于针对正确解答进行微调的对话轮次数据,源自Numina-Math - `system`:发送给DeepSeek-R1的系统提示词,源自OpenThoughts数据集 - `conversations`:DeepSeek-R1生成的对话轮次数据,最后一轮为模型输出结果,源自OpenThoughts数据集 - `generated_token_count`:模型输出结果的Token数量(采用DeepSeek-R1对应的Tokenizer进行统计) - `correct`:用于标识DeepSeek-R1生成的解答是否与基准`solution`一致的标签,通过[Math-Verify工具库](https://github.com/huggingface/Math-Verify)完成校验 ### 统计信息 - 原始OpenThoughts-114k数据集中共有**89120/113957(78%)**条数学相关条目 - 其中**56730/89120(63%)**的条目经Math-Verify校验后拥有正确答案 - 每个问题仅对应一条模型生成结果 - Token数量分布:均值为6366.67,标准差为4662.88个Token ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62596f9e1c0a084224b93e00/aPYBSni3Ft6VK1VJkExtS.png)
提供机构:
maas
创建时间:
2025-02-11
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作