PromptCoT-Problem-Generation-Dataset
Source: ModelScope community. Updated: 2025-10-13. Indexed: 2025-03-08.
Download link:
https://modelscope.cn/datasets/zhaoxlpku/PromptCoT-Problem-Generation-Dataset
Description:
## **Dataset Format**
Each row in the dataset contains:
- **`prompt`**: The input to the **problem generation model**, including foundational concepts and the required difficulty level (e.g., AIME, HMMT-Feb).
- **`completion`**: The **expected output** for supervised fine-tuning, consisting of a **rationale** (detailing the problem design process) and the **final mathematical problem**.
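As a rough illustration of how the `prompt` field could be consumed, the sketch below pulls the numbered concepts and the difficulty level back out of a prompt string. It assumes every prompt follows the layout of the example row in this card (a numbered list under `Foundational Concepts:` and a final `Difficulty Level:` line); the names `PromptSpec` and `parse_prompt` are hypothetical helpers, not part of the dataset or of any PromptCoT tooling.

```python
import re
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the two pieces of information in a prompt."""
    concepts: list[str]
    difficulty: str


def parse_prompt(prompt: str) -> PromptSpec:
    # Assumption: concepts appear as numbered lines such as "1. Ability to ...".
    concepts = re.findall(r"^\d+\.\s+(.*)$", prompt, flags=re.MULTILINE)
    # Assumption: the difficulty is given on a line starting with "Difficulty Level:".
    match = re.search(r"^Difficulty Level:\s*(.+)$", prompt, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'Difficulty Level:' line found in prompt")
    return PromptSpec(concepts=concepts, difficulty=match.group(1).strip())


# Shortened stand-in for a real prompt, written in the same layout.
sample = (
    "Given foundational concepts and difficulty level, identify connections "
    "and develop a question.\n\n"
    "Foundational Concepts:\n"
    "1. Fractional Arithmetic: Performing calculations with fractions.\n"
    "2. Using logical reasoning to connect given data points.\n\n"
    "Difficulty Level: HMMT-Feb"
)
spec = parse_prompt(sample)
print(spec.difficulty)
print(len(spec.concepts))
```

Rows whose prompts deviate from this layout would need a more tolerant parser; the regular expressions here are deliberately strict so that unexpected formats fail loudly.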
## **Example**
```json
{
"prompt": "Given foundational concepts and difficulty level, identify connections and develop a question that integrates these concepts with appropriate complexity.\n\nFoundational Concepts:\n1. Ability to apply quantitative reasoning and estimation techniques to solve problems, including making approximations and using logical deductions to arrive at a solution.\n2. Ability to solve equations involving complex numbers, including finding conditions under which two complex numbers are equal, particularly in the context of their magnitudes and arguments.\n3. Fractional Arithmetic: Performing calculations with fractions to determine the final probability.\n4. Interpreting and solving problems involving nested operations or functions.\n5. Using logical reasoning to connect given data points and derive conclusions.\n\nDifficulty Level: HMMT-Feb",
"completion": "<!-- BEGIN RATIONALE -->**Step 1: Selecting Foundational Concepts and Difficulty Level**\nThe problem designer starts by selecting the foundational concepts and the desired difficulty level. In this case, the concepts are quantitative reasoning, complex numbers, fractional arithmetic, nested operations, and logical reasoning. The difficulty level is HMMT-Feb, which is a challenging level requiring a combination of mathematical knowledge, problem-solving skills, and critical thinking.\n\n**Step 2: Identifying Key Skills and Knowledge**\nThe designer identifies key skills that students should demonstrate to solve the problem, including:\n- Quantitative reasoning and estimation techniques.\n- Understanding and solving equations involving complex numbers.\n- Fractional arithmetic for probability calculations.\n- Logical reasoning to connect given data points.\n\n**Step 3: Developing the Problem Statement**\nThe designer refines the problem by introducing a scenario where students estimate complex numbers based on sum and product conditions and use logical reasoning to determine the probability that they are distinct.\n\n**Step 4: Finalizing the Problem**\nThe final problem is:\n\n<!-- BEGIN PROBLEM -->\nLet $a$ and $b$ be complex numbers such that $|a+b|=|a-b|=1$. What is the probability that $a$ and $b$ are distinct complex numbers?\n<!-- END PROBLEM -->"
}
```
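The `completion` in the example above wraps its two parts in HTML-style comment markers. A minimal sketch of separating the rationale from the final problem might look like the following. It assumes the markers appear exactly as in the example, and, since the example shows no closing rationale marker, it treats everything before `<!-- BEGIN PROBLEM -->` as the rationale. The function name `split_completion` is a hypothetical helper.

```python
RATIONALE_START = "<!-- BEGIN RATIONALE -->"
PROBLEM_START = "<!-- BEGIN PROBLEM -->"
PROBLEM_END = "<!-- END PROBLEM -->"


def split_completion(completion: str) -> tuple[str, str]:
    """Return (rationale, problem) from a completion string.

    Assumption: the text before the problem-start marker is the rationale,
    because the example row contains no explicit end-of-rationale marker.
    """
    before, start, after = completion.partition(PROBLEM_START)
    if not start:
        raise ValueError("missing problem start marker")
    problem, end, _ = after.partition(PROBLEM_END)
    if not end:
        raise ValueError("missing problem end marker")
    rationale = before.replace(RATIONALE_START, "", 1).strip()
    return rationale, problem.strip()


# Shortened stand-in for a real completion, using the same markers.
sample_completion = (
    "<!-- BEGIN RATIONALE -->**Step 1: Selecting Foundational Concepts**\n"
    "The designer picks the concepts.\n\n"
    "<!-- BEGIN PROBLEM -->\n"
    "Let $a$ and $b$ be complex numbers such that $|a+b|=|a-b|=1$.\n"
    "<!-- END PROBLEM -->"
)
rationale, problem = split_completion(sample_completion)
print(problem)
```

Keeping the two parts separate makes it possible to train on the full completion while collecting only the problem text for downstream use.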
## **Citation**
```
@article{zhao2025promptcot,
author = {Zhao, Xueliang and Wu, Wei and Guan, Jian and Kong, Lingpeng},
title = {PromptCoT: Synthesizing Olympiad-Level Problems for Mathematical Reasoning in Large Language Models},
year = {2025},
journal = {arXiv preprint arXiv:2503.02324},
url = {http://arxiv.org/abs/2503.02324}
}
```
Provider:
maas
Created:
2025-03-05



