MathOdyssey-Reasoning-Paths

ModelScope community · Updated 2025-10-17 · Indexed 2025-11-03
Download link:
https://modelscope.cn/datasets/AI-ModelScope/MathOdyssey-Reasoning-Paths
Description:
# Sampled Reasoning Paths for the MathOdyssey dataset

This dataset contains sampled reasoning paths for the [MathOdyssey](https://huggingface.co/datasets/MathOdyssey/MathOdyssey) dataset, released as part of the NeurIPS 2025 paper ["A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning"](https://wnjxyk.github.io/RPC).

## Overview

We generated multiple reasoning paths for MathOdyssey problems using 3 math LLMs:

* [Deepseek-Math-RL-7B](https://huggingface.co/deepseek-ai/deepseek-math-7b-rl)
* [InternLM2-Math-Plus-1.8B](https://huggingface.co/internlm/internlm2-math-plus-1_8b)
* [InternLM2-Math-Plus-7B](https://huggingface.co/internlm/internlm2-math-plus-7b)

For each problem in the MathOdyssey dataset, we sampled 256 reasoning paths. Sampling was performed at three temperatures, T ∈ {1.0, 1.1, 1.3}, to explore diverse reasoning trajectories.

## Structure of each JSON

The JSON structure is illustrated below with an example of 3 samples per problem across 2 problems. The `//` comments are explanatory annotations only; standard JSON does not allow comments.

```json
{
  "predict": [ // 2D string array: [problems][samples]
    ["Prediction #1 for Problem 1", "Prediction #2 for Problem 1", "Prediction #3 for Problem 1"],
    ["Prediction #1 for Problem 2", "Prediction #2 for Problem 2", "Prediction #3 for Problem 2"]
  ],
  "answer": [ // Ground truth answers
    "Answer for Problem 1",
    "Answer for Problem 2"
  ],
  "completion": [ // 2D string array: [problems][samples]
    ["Completion #1 for Problem 1", "Completion #2 for Problem 1", "Completion #3 for Problem 1"],
    ["Completion #1 for Problem 2", "Completion #2 for Problem 2", "Completion #3 for Problem 2"]
  ],
  "cumulative_logprob": [ // Sum of log probabilities per sample
    [-15.526, -12.123, -14.12],
    [-20.526, -22.123, -24.12]
  ],
  "mean_logprob": [ // Length-normalized log probabilities (sum / sequence length), i.e., the negative log of perplexity
    [-0.070, -0.04, -0.05],
    [-0.170, -0.14, -0.15]
  ],
  "prompt": [ // Input prompts for each problem
    "Prompt for Problem 1",
    "Prompt for Problem 2"
  ],
  "temperature": 0, // Sampling temperature
  "top_p": 1, // Nucleus sampling parameter
  "accuracy": [ // 2D boolean array: [samples][problems]
    [false, true],
    [true, true],
    [true, true]
  ]
}
```

Note that `accuracy` is indexed `[samples][problems]`, the transpose of the `[problems][samples]` layout used by `predict`, `completion`, `cumulative_logprob`, and `mean_logprob`.

## Available Files

| |Deepseek-Math-RL-7B|InternLM2-Math-Plus-7B|InternLM2-Math-Plus-1.8B|
|:--:|:--|:--|:--|
|T=1.0|`Deepseek-Math-RL-7B.json`|`InternLM2-Math-Plus-7B.json`|`InternLM2-Math-Plus-1.8B.json`|
|T=1.1|`Deepseek-Math-RL-7B-T=1.1.json`|`InternLM2-Math-Plus-7B-T=1.1.json`|`InternLM2-Math-Plus-1.8B-T=1.1.json`|
|T=1.3|`Deepseek-Math-RL-7B-T=1.3.json`|`InternLM2-Math-Plus-7B-T=1.3.json`|`InternLM2-Math-Plus-1.8B-T=1.3.json`|

## Citation

If you use this dataset in your research, please cite:

```bibtex
@inproceedings{zhou24theoretical,
  author = {Zhou, Zhi and Tan, Yuhao and Li, Zenan and Yao, Yuan and Guo, Lan-Zhe and Li, Yu-Feng and Ma, Xiaoxing},
  title = {A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning},
  booktitle = {Advances in Neural Information Processing Systems},
  year = {2025},
}
```
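
## Usage sketch

The field layout described under "Structure of each JSON" can be exercised with a few lines of Python. The sketch below is illustrative only and is not the paper's evaluation code. The helper names (`load_paths`, `mean_sample_accuracy`, `majority_vote_correct`, `perplexities`) are hypothetical, and it assumes that `accuracy` is indexed `[samples][problems]` as documented, that the downloaded files parse as plain JSON, and that log probabilities use the natural logarithm. The toy dictionary stands in for a real file.

```python
import json
import math
from collections import Counter


def load_paths(path):
    """Read one dataset file (e.g. a local copy of Deepseek-Math-RL-7B.json)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def mean_sample_accuracy(data):
    """Fraction of correct samples per problem.

    Assumes data["accuracy"] is indexed [samples][problems].
    """
    acc = data["accuracy"]
    n_samples, n_problems = len(acc), len(acc[0])
    return [sum(acc[s][p] for s in range(n_samples)) / n_samples
            for p in range(n_problems)]


def majority_vote_correct(data):
    """Self-consistency style vote: pick the most frequent prediction per
    problem, then reuse the stored correctness flag of the first sample
    that produced it (an assumption: equal strings share one flag)."""
    results = []
    for p, preds in enumerate(data["predict"]):
        winner, _ = Counter(preds).most_common(1)[0]
        s = preds.index(winner)
        results.append(bool(data["accuracy"][s][p]))
    return results


def perplexities(data):
    """Per-sample perplexity, exp(-mean_logprob), assuming natural logs."""
    return [[math.exp(-lp) for lp in row] for row in data["mean_logprob"]]


# Toy stand-in with 2 problems and 3 samples each (made-up values).
toy = {
    "predict": [["4", "5", "4"], ["7", "7", "8"]],
    "accuracy": [[True, True], [False, True], [True, False]],
    "mean_logprob": [[-0.07, -0.04, -0.05], [-0.17, -0.14, -0.15]],
}

print(mean_sample_accuracy(toy))
print(majority_vote_correct(toy))
print(perplexities(toy)[0])
```

To run this against a real file, replace `toy` with `load_paths("Deepseek-Math-RL-7B.json")` after downloading the file. Ties in the vote fall to whichever prediction appears first, which is a simplification.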

Provider:
maas
Created:
2025-10-10