xiongbubu/OpenMathReasoning
Source: Hugging Face. Updated: 2025-12-16. Indexed: 2025-12-20.
Download link:
https://hf-mirror.com/datasets/xiongbubu/OpenMathReasoning
Description:
OpenMathReasoning is a large-scale math reasoning dataset for training large language models (LLMs). It contains 306K unique mathematical problems sourced from the AoPS forums, paired with 3.2M long chain-of-thought (CoT) solutions, 1.7M long tool-integrated reasoning (TIR) solutions, and 566K GenSelect samples, each of which selects the most promising solution out of many candidates. It also includes 193K additional problems from the AoPS forums (problems only, no solutions). Qwen2.5-32B-Instruct was used to preprocess the problems, and DeepSeek-R1 and QwQ-32B were used to generate the solutions. The dataset was a foundation of the winning submission to the AIMO-2 Kaggle competition.
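To give a rough sense of how the solution-bearing samples are split across the three formats, the sketch below tallies the rounded counts quoted in the description. The figures are the description's approximations, not exact row counts, and the names `SAMPLE_COUNTS` and `composition` are illustrative only; they are not part of the dataset or of any library.

```python
# Rounded sample counts as stated in the description above.
# These are approximations, not exact row counts from the dataset files.
SAMPLE_COUNTS = {
    "CoT": 3_200_000,       # long chain-of-thought solutions
    "TIR": 1_700_000,       # long tool-integrated reasoning solutions
    "GenSelect": 566_000,   # solution-selection samples
}


def composition(counts):
    """Return (total, shares), where shares maps each name to its fraction of the total."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


total, shares = composition(SAMPLE_COUNTS)
print(f"total samples: {total:,}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The 193K additional problems are left out of the tally because they carry no solutions.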
Provider:
xiongbubu



