five

minif2f_test

收藏
魔搭社区2025-12-26 更新2025-04-26 收录
下载链接:
https://modelscope.cn/datasets/AI-MO/minif2f_test
下载链接
链接失效反馈
官方服务:
资源简介:
# MiniF2F ## Dataset Usage The evaluation results of Kimina-Prover presented in our work are all based on this MiniF2F test set. ## Improvements We corrected several erroneous formalizations, since the original formal statements could not be proven. We list them in the following table. All our improvements are made based on the MiniF2F test set provided by [DeepseekProverV1.5](https://github.com/deepseek-ai/DeepSeek-Prover-V1.5), which applies certain modifications to the original dataset to adapt it to the Lean 4. |theorem name | formal statement | |:--------------------:|:-----------------| |mathd_numbertheory_618|theorem mathd_numbertheory_618 (n : ℕ) (hn : n > 0) (p : ℕ → ℕ) (h₀ : ∀ x, p x = x ^ 2 - x + 41)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : 1 < Nat.gcd (p n) (p (n + 1))) : 41 ≤ n := by | |aime_1994_p3 |theorem aime_1994_p3 (f : ℤ → ℤ) (h0 : ∀ x, f x + f (x - 1) = x ^ 2) (h1 : f 19 = 94) :<br>&nbsp;&nbsp;&nbsp;&nbsp;f 94 % 1000 = 561 := by| |amc12a_2021_p9 |theorem amc12a_2021_p9 : (∏ k in Finset.range 7, (2 ^ 2 ^ k + 3 ^ 2 ^ k)) = 3 ^ 128 - 2 ^ 128 := by| |mathd_algebra_342 |theorem mathd_algebra_342 (a d : ℝ) (h₀ : (∑ k in Finset.range 5, (a + k * d)) = 70)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : (∑ k in Finset.range 10, (a + k * d)) = 210) : a = 42 / 5 := by| |mathd_algebra_314 |theorem mathd_algebra_314 (n : ℕ) (h₀ : n = 11) : (1 / 4 : ℝ) ^ (n + 1) * 2 ^ (2 * n) = 1 / 4 := by| |amc12a_2020_p7 |theorem amc12a_2020_p7 (a : ℕ → ℕ) (h₀ : a 0 ^ 3 = 1) (h₁ : a 1 ^ 3 = 8) (h₂ : a 2 ^ 3 = 27)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₃ : a 3 ^ 3 = 64) (h₄ : a 4 ^ 3 = 125) (h₅ : a 5 ^ 3 = 216) (h₆ : a 6 ^ 3 = 343) :<br>&nbsp;&nbsp;&nbsp;&nbsp;∑ k in Finset.range 7, 6 * ((a k) ^ 2 : ℤ) - 2 * ∑ k in Finset.range 6, (a k) ^ 2 = 658 := by| |mathd_algebra_275 |theorem mathd_algebra_275 (x : ℝ) (h : ((11 : ℝ) ^ (1 / 4 : ℝ)) ^ (3 * x - 3) = 1 / 5) :<br>&nbsp;&nbsp;&nbsp;&nbsp;((11 : ℝ) ^ (1 / 4 : ℝ)) ^ (6 * x + 2) = 121 / 25 := by| |mathd_numbertheory_343|theorem mathd_numbertheory_343 : (∏ k in Finset.range 6, (2 * k + 1)) % 10 = 5 := by| |algebra_cubrtrp1oncubrtreq3_rcubp1onrcubeq5778|theorem algebra_cubrtrp1oncubrtreq3_rcubp1onrcubeq5778 (r : ℝ) (hr : r ≥ 0)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₀ : r ^ ((1 : ℝ) / 3) + 1 / r ^ ((1 : ℝ) / 3) = 3) : r ^ 3 + 1 / r ^ 3 = 5778 := by| |amc12a_2020_p10|theorem amc12a_2020_p10 (n : ℕ) (h₀ : 1 < n)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : Real.logb 2 (Real.logb 16 n) = Real.logb 4 (Real.logb 4 n)) :<br>&nbsp;&nbsp;&nbsp;&nbsp;(List.sum (Nat.digits 10 n)) = 13 := by| |amc12b_2002_p4|theorem amc12b_2002_p4 (n : ℕ) (h₀ : 0 < n) (h₁ : (1 / 2 + 1 / 3 + 1 / 7 + 1 / ↑n : ℚ).den = 1) : n = 42 := by| |amc12a_2019_p12|theorem amc12a_2019_p12 (x y : ℝ) (h : x > 0 ∧ y > 0) (h₀ : x ≠ 1 ∧ y ≠ 1)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : Real.log x / Real.log 2 = Real.log 16 / Real.log y) (h₂ : x * y = 64) :<br>&nbsp;&nbsp;&nbsp;&nbsp;(Real.log (x / y) / Real.log 2) ^ 2 = 20 := by| |amc12a_2021_p25|theorem amc12a_2021_p25 (N : ℕ) (hN : N > 0) (f : ℕ → ℝ)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₀ : ∀ n, 0 < n → f n = (Nat.divisors n).card / n ^ ((1 : ℝ) / 3))<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : ∀ (n) (_ : n ≠ N), 0 < n → f n < f N) : (List.sum (Nat.digits 10 N)) = 9 := by| |imo_1982_p1|theorem imo_1982_p1 (f : ℕ → ℕ)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₀ : ∀ m n, 0 < m ∧ 0 < n → f (m + n) - f m - f n = (0 : ℤ) ∨ f (m + n) - f m - f n = (1 : ℤ))<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : f 2 = 0) (h₂ : 0 < f 3) (h₃ : f 9999 = 3333) : f 1982 = 660 := by| ## Example To illustrate the kind of corrections we made, we analyze an example where we modified the formalization. For `mathd_numbertheory_618`, its informal statement is : > Euler discovered that the polynomial $p(n) = n^2 - n + 41$ yields prime numbers for many small positive integer values of $n$. What is the smallest positive integer $n$ for which $p(n)$ and $p(n+1)$ share a common factor greater than $1$? Show that it is 41. Its original formal statement is ``` theorem mathd_numbertheory_618 (n : ℕ) (p : ℕ → ℕ) (h₀ : ∀ x, p x = x ^ 2 - x + 41) (h₁ : 1 < Nat.gcd (p n) (p (n + 1))) : 41 ≤ n := by ``` In the informal problem description, $n$ is explicitly stated to be a positive integer. However, in the formalization, $n$ is only assumed to be a natural number. This creates an issue, as $n = 0$ is a special case that makes the proposition false, rendering the original formal statement incorrect. We have corrected this by explicitly adding the assumption $n > 0$, as shown below: ``` theorem mathd_numbertheory_618 (n : ℕ) (hn : n > 0) (p : ℕ → ℕ) (h₀ : ∀ x, p x = x ^ 2 - x + 41) (h₁ : 1 < Nat.gcd (p n) (p (n + 1))) : 41 ≤ n := by ``` ## Contributions We encourage the community to report new issues or contribute improvements via pull requests. ## Acknowledgements We thank Thomas Zhu for helping us fix `mathd_algebra_275`. ## Citation The original benchmark is described in detail in the following [pre-print](https://arxiv.org/abs/2109.00110): ``` @article{zheng2021minif2f, title={MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics}, author={Zheng, Kunhao and Han, Jesse Michael and Polu, Stanislas}, journal={arXiv preprint arXiv:2109.00110}, year={2021} } ```

# MiniF2F ## 数据集使用说明 本工作中展示的Kimina-Prover评测结果,均基于本MiniF2F测试集完成。 ## 改进说明 我们修正了若干存在形式化错误的定理:原形式化陈述无法被证明。我们将这些修正整理于下表中。所有改进均基于[DeepseekProverV1.5](https://github.com/deepseek-ai/DeepSeek-Prover-V1.5)提供的MiniF2F测试集完成,该版本针对原始数据集进行了适配Lean 4的修改。 | 定理名称 | 形式化陈述 | |:---:|:---| | mathd_numbertheory_618 | theorem mathd_numbertheory_618 (n : ℕ) (hn : n > 0) (p : ℕ → ℕ) (h₀ : ∀ x, p x = x ^ 2 - x + 41)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : 1 < Nat.gcd (p n) (p (n + 1))) : 41 ≤ n := by | | aime_1994_p3 | theorem aime_1994_p3 (f : ℤ → ℤ) (h0 : ∀ x, f x + f (x - 1) = x ^ 2) (h1 : f 19 = 94) :<br>&nbsp;&nbsp;&nbsp;&nbsp;f 94 % 1000 = 561 := by | | amc12a_2021_p9 | theorem amc12a_2021_p9 : (∏ k in Finset.range 7, (2 ^ 2 ^ k + 3 ^ 2 ^ k)) = 3 ^ 128 - 2 ^ 128 := by | | mathd_algebra_342 | theorem mathd_algebra_342 (a d : ℝ) (h₀ : (∑ k in Finset.range 5, (a + k * d)) = 70)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : (∑ k in Finset.range 10, (a + k * d)) = 210) : a = 42 / 5 := by | | mathd_algebra_314 | theorem mathd_algebra_314 (n : ℕ) (h₀ : n = 11) : (1 / 4 : ℝ) ^ (n + 1) * 2 ^ (2 * n) = 1 / 4 := by | | amc12a_2020_p7 | theorem amc12a_2020_p7 (a : ℕ → ℕ) (h₀ : a 0 ^ 3 = 1) (h₁ : a 1 ^ 3 = 8) (h₂ : a 2 ^ 3 = 27)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₃ : a 3 ^ 3 = 64) (h₄ : a 4 ^ 3 = 125) (h₅ : a 5 ^ 3 = 216) (h₆ : a 6 ^ 3 = 343) :<br>&nbsp;&nbsp;&nbsp;&nbsp;∑ k in Finset.range 7, 6 * ((a k) ^ 2 : ℤ) - 2 * ∑ k in Finset.range 6, (a k) ^ 2 = 658 := by | | mathd_algebra_275 | theorem mathd_algebra_275 (x : ℝ) (h : ((11 : ℝ) ^ (1 / 4 : ℝ)) ^ (3 * x - 3) = 1 / 5) :<br>&nbsp;&nbsp;&nbsp;&nbsp;((11 : ℝ) ^ (1 / 4 : ℝ)) ^ (6 * x + 2) = 121 / 25 := by | | mathd_numbertheory_343 | theorem mathd_numbertheory_343 : (∏ k in Finset.range 6, (2 * k + 1)) % 10 = 5 := by | | algebra_cubrtrp1oncubrtreq3_rcubp1onrcubeq5778 | theorem algebra_cubrtrp1oncubrtreq3_rcubp1onrcubeq5778 (r : ℝ) (hr : r ≥ 0)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₀ : r ^ ((1 : ℝ) / 3) + 1 / r ^ ((1 : ℝ) / 3) = 3) : r ^ 3 + 1 / r ^ 3 = 5778 := by | | amc12a_2020_p10 | theorem amc12a_2020_p10 (n : ℕ) (h₀ : 1 < n)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : Real.logb 2 (Real.logb 16 n) = Real.logb 4 (Real.logb 4 n)) :<br>&nbsp;&nbsp;&nbsp;&nbsp;(List.sum (Nat.digits 10 n)) = 13 := by | | amc12b_2002_p4 | theorem amc12b_2002_p4 (n : ℕ) (h₀ : 0 < n) (h₁ : (1 / 2 + 1 / 3 + 1 / 7 + 1 / ↑n : ℚ).den = 1) : n = 42 := by | | amc12a_2019_p12 | theorem amc12a_2019_p12 (x y : ℝ) (h : x > 0 ∧ y > 0) (h₀ : x ≠ 1 ∧ y ≠ 1)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : Real.log x / Real.log 2 = Real.log 16 / Real.log y) (h₂ : x * y = 64) :<br>&nbsp;&nbsp;&nbsp;&nbsp;(Real.log (x / y) / Real.log 2) ^ 2 = 20 := by | | amc12a_2021_p25 | theorem amc12a_2021_p25 (N : ℕ) (hN : N > 0) (f : ℕ → ℝ)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₀ : ∀ n, 0 < n → f n = (Nat.divisors n).card / n ^ ((1 : ℝ) / 3))<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : ∀ (n) (_ : n ≠ N), 0 < n → f n < f N) : (List.sum (Nat.digits 10 N)) = 9 := by | | imo_1982_p1 | theorem imo_1982_p1 (f : ℕ → ℕ)<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₀ : ∀ m n, 0 < m ∧ 0 < n → f (m + n) - f m - f n = (0 : ℤ) ∨ f (m + n) - f m - f n = (1 : ℤ))<br>&nbsp;&nbsp;&nbsp;&nbsp;(h₁ : f 2 = 0) (h₂ : 0 < f 3) (h₃ : f 9999 = 3333) : f 1982 = 660 := by | ## 示例 为说明我们所做的修正类型,我们以一个修改了形式化表述的示例进行分析。 针对`mathd_numbertheory_618`,其非形式化表述为: > 欧拉发现,多项式 $p(n) = n^2 - n + 41$ 对诸多较小的正整数$n$均能输出素数。试求使得$p(n)$与$p(n+1)$存在大于1的公因子的最小正整数$n$,并证明该值为41。 其原始形式化陈述为: theorem mathd_numbertheory_618 (n : ℕ) (p : ℕ → ℕ) (h₀ : ∀ x, p x = x ^ 2 - x + 41) (h₁ : 1 < Nat.gcd (p n) (p (n + 1))) : 41 ≤ n := by 在该非形式化问题描述中,明确指出$n$为正整数。但在原始形式化表述中,仅假设$n$为自然数,这会引发问题:当$n=0$时命题不成立,导致原始形式化陈述存在错误。 我们通过显式添加假设$n > 0$完成了修正,如下所示: theorem mathd_numbertheory_618 (n : ℕ) (hn : n > 0) (p : ℕ → ℕ) (h₀ : ∀ x, p x = x ^ 2 - x + 41) (h₁ : 1 < Nat.gcd (p n) (p (n + 1))) : 41 ≤ n := by ## 贡献 我们欢迎社区通过提交拉取请求(Pull Request)报告新问题或贡献改进内容。 ## 致谢 感谢Thomas Zhu协助我们修正了`mathd_algebra_275`。 ## 引用 原始基准数据集的详细描述见于下述预印本[pre-print](https://arxiv.org/abs/2109.00110): @article{zheng2021minif2f, title={MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics}, author={Zheng, Kunhao and Han, Jesse Michael and Polu, Stanislas}, journal={arXiv preprint arXiv:2109.00110}, year={2021} }
提供机构:
maas
创建时间:
2025-04-22
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作