MathOdyssey-Reasoning-Paths

Name: MathOdyssey-Reasoning-Paths
Creator: maas
Published: 2025-10-17 15:39:31
License: No description provided

Source: ModelScope community. Updated 2025-10-17; indexed 2025-11-03.

Download link:

https://modelscope.cn/datasets/AI-ModelScope/MathOdyssey-Reasoning-Paths

Resource description:

# Sampled Reasoning Paths for the MathOdyssey dataset

This dataset contains sampled reasoning paths for the [MathOdyssey](https://huggingface.co/datasets/MathOdyssey/MathOdyssey) dataset, released as part of the NeurIPS 2025 paper: ["A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning"](https://wnjxyk.github.io/RPC).

## Overview

We generated multiple reasoning paths for MathOdyssey problems using 3 math LLMs:

* [Deepseek-Math-RL-7B](https://huggingface.co/deepseek-ai/deepseek-math-7b-rl)
* [InternLM2-Math-Plus-1.8B](https://huggingface.co/internlm/internlm2-math-plus-1_8b)
* [InternLM2-Math-Plus-7B](https://huggingface.co/internlm/internlm2-math-plus-7b)

For each problem in the MathOdyssey dataset, we sampled 256 reasoning paths. Sampling was performed at temperatures T ∈ {1.0, 1.1, 1.3} to explore diverse reasoning trajectories.

## Structure of each JSON

The JSON structure is illustrated below with an example of 3 samples per problem across 2 problems. The `//` comments annotate the example only; standard JSON does not allow comments.

```json
{
  "predict": [  // 2D string array: [problems][samples]
    ["Prediction #1 for Problem 1", "Prediction #2 for Problem 1", "Prediction #3 for Problem 1"],
    ["Prediction #1 for Problem 2", "Prediction #2 for Problem 2", "Prediction #3 for Problem 2"]
  ],
  "answer": [  // Ground truth answers
    "Answer for Problem 1",
    "Answer for Problem 2"
  ],
  "completion": [  // 2D string array: [problems][samples]
    ["Completion #1 for Problem 1", "Completion #2 for Problem 1", "Completion #3 for Problem 1"],
    ["Completion #1 for Problem 2", "Completion #2 for Problem 2", "Completion #3 for Problem 2"]
  ],
  "cumulative_logprob": [  // Sum of log probabilities per sample
    [-15.526, -12.123, -14.12],
    [-20.526, -22.123, -24.12]
  ],
  "mean_logprob": [  // Length-normalized log probabilities (sum / sequence length, i.e., the negative log-perplexity)
    [-0.070, -0.04, -0.05],
    [-0.170, -0.14, -0.15]
  ],
  "prompt": [  // Input prompts for each problem
    "Prompt for Problem 1",
    "Prompt for Problem 2"
  ],
  "temperature": 0,  // Sampling temperature
  "top_p": 1,  // Nucleus sampling parameter
  "accuracy": [  // 2D boolean array: [samples][problems]
    [false, true],
    [true, true],
    [true, true]
  ]
}
```

## Available Files

| |Deepseek-Math-RL-7B|InternLM2-Math-Plus-7B|InternLM2-Math-Plus-1.8B|
|:--:|:--|:--|:--|
|T=1.0|`Deepseek-Math-RL-7B.json`|`InternLM2-Math-Plus-7B.json`|`InternLM2-Math-Plus-1.8B.json`|
|T=1.1|`Deepseek-Math-RL-7B-T=1.1.json`|`InternLM2-Math-Plus-7B-T=1.1.json`|`InternLM2-Math-Plus-1.8B-T=1.1.json`|
|T=1.3|`Deepseek-Math-RL-7B-T=1.3.json`|`InternLM2-Math-Plus-7B-T=1.3.json`|`InternLM2-Math-Plus-1.8B-T=1.3.json`|

## Citation

If you use this dataset in your research, please cite:

```bibtex
@inproceedings{zhou24theoretical,
  author = {Zhou, Zhi and Tan, Yuhao and Li, Zenan and Yao, Yuan and Guo, Lan-Zhe and Li, Yu-Feng and Ma, Xiaoxing},
  title = {A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning},
  booktitle = {Advances in Neural Information Processing Systems},
  year = {2025},
}
```

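The field layout described in the README can be exercised with a short Python sketch. It is not part of the dataset or the paper's code: the helper names (`load_paths`, `majority_vote`, `voted_answer`, `per_problem_accuracy`, `perplexity`) and the toy values are hypothetical, and it assumes that log probabilities are natural logs and that identical prediction strings for a problem share the same `accuracy` flag. Note the differing index order: `predict` is `[problems][samples]`, while `accuracy` is `[samples][problems]`.

```python
import json
import math
from collections import Counter

# Toy record mirroring the README example (2 problems, 3 samples each).
# The concrete strings are made up for illustration.
TOY = {
    "predict": [["4", "5", "5"], ["7", "7", "7"]],
    "answer": ["5", "7"],
    "mean_logprob": [[-0.070, -0.04, -0.05], [-0.170, -0.14, -0.15]],
    "accuracy": [[False, True], [True, True], [True, True]],
}


def load_paths(path):
    """Load one file, e.g. a local copy of Deepseek-Math-RL-7B.json."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def majority_vote(predictions):
    """Return the most frequent prediction among one problem's samples."""
    return Counter(predictions).most_common(1)[0][0]


def voted_answer(record, p):
    """Majority-vote answer for problem p and whether it was marked correct."""
    preds = record["predict"][p]  # indexed [problems][samples]
    voted = majority_vote(preds)
    s = preds.index(voted)  # first sample that produced the voted answer
    # Assumption: equal prediction strings carry the same accuracy flag.
    return voted, bool(record["accuracy"][s][p])  # indexed [samples][problems]


def per_problem_accuracy(record):
    """Fraction of correct samples per problem (average over the sample axis)."""
    acc = record["accuracy"]
    return [sum(row[p] for row in acc) / len(acc) for p in range(len(acc[0]))]


def perplexity(mean_logprob):
    """Convert a mean log probability to perplexity, assuming natural logs."""
    return math.exp(-mean_logprob)


if __name__ == "__main__":
    for p in range(len(TOY["answer"])):
        print(p, voted_answer(TOY, p), per_problem_accuracy(TOY)[p])
    print(perplexity(TOY["mean_logprob"][0][0]))
```

To run this on real data, pass the result of `load_paths(...)` in place of `TOY`; the file must first be downloaded locally from the link above.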

Provider: maas
Created: 2025-10-10
