five

NuminaMath-LEAN

收藏
魔搭社区2026-04-28 更新2025-08-02 收录
下载链接:
https://modelscope.cn/datasets/AI-MO/NuminaMath-LEAN
下载链接
链接失效反馈
官方服务:
资源简介:
# Dataset Card for NuminaMath-LEAN ![image/png](https://cdn-uploads.huggingface.co/production/uploads/621660534b06824e1e5022ed/YrcCZYcgVEtKoRmN2sLVR.png) ## Dataset Summary NuminaMath-LEAN is a large-scale dataset of 100K mathematical competition problems formalized in Lean 4. It is derived from a challenging subset of the [NuminaMath 1.5](https://huggingface.co/datasets/AI-MO/NuminaMath-1.5) dataset, focusing on problems from prestigious competitions like the IMO and USAMO. It represents the largest collection of human-annotated formal statements and proofs designed for training and evaluating automated theorem provers. This is also the dataset behind our [Kimina-Prover 72B](https://huggingface.co/blog/AI-MO/kimina-prover). ## Acknowledgment We thank all the annotators who have participated in building this dataset. We are also very grateful to the whole Lean 4 community for the priceless help. ## Data description `problem`, `question_type`, `answer`, `source`, `problem_type` are identical to those in the NuminaMath 1.5 - `author`: the author of formal statement, `autoformalizer` if the statement is formalised by our autoformalizer model, `human` when it is formalised by an annotator. - `formal_statement`: formal statement in lean 4 of the problem. It is written in a similar way as miniF2F. - `formal_ground_truth`: the formal proof of the problem written by a human annotator. It is available only when `author == human`. - `ground_truth_type`: a field to describe the annotation status of `formal_ground_truth` - `complete`: this means the formal proof is complete. - `with_sorry`: this means the formal proof is written with some `sorry` in certain `have` statements. - `statement`: only the statement has been formalised. Proof is filled only with a `sorry` - `formal_proof`: the formal proof written by our [model](https://huggingface.co/AI-MO/Kimina-Prover-72B) during RL training. It is available only if the model manage to prove it, otherwise it is empty. - `rl_data`: a dict containing some information during the RL training. - `n_proofs`: The number of times this problem is shown to the model. 0 means the problem has never been shown to the model. - `n_correct_proofs`: The number of times the model managed to solve this problem. - `win_rate`: `n_correct_proofs / n_proofs` ## Method We have selected the most challenging part of the NuminaMath 1.5 dataset. Namely, we have removed `cn_k12` and other non math competition problems. We then filter out `geometry` and `combinatorics` problems, because of their difficulty of formalization, to create the final subset. We have also augmented the original NuminaMath 1.5 by adding the subset `olympiads-ref`, where problems are collected from most challenging competition problems such as IMO, IMO shortlist, USAMO etc. The solution and the final answer have been carefully curated from the official PDF. ## Compatibility All formalizations in this dataset are written in Lean 4 and are compiled and verified against mathlib v4.15.0. Users should ensure their environment matches this dependency to guarantee compatibility and prevent compilation errors. ## Limitation - missing metadata. Some metadata is missing, especially for newly added problems. For example, a lot of human annotated problem do not have `question_type`. This will be fixed very soon. - incorrect reference answer. During the formalization process, we realised that sometimes the reference answer is wrongly extracted. We have not done the reverse process of correcting the reference answer yet. - inaccurate statement formalization. Limited computational resources were used to prove the negation or the contradiction of the problems. We have discovered some wrongly formalized problems (mainly due to the limitations of our autoformalizer). We will release a better version of the formal statement very soon. ## Licensing Information The dataset is available under the Apache License, Version 2.0. ## Citation Information ``` @article{kimina_prover_2025, title = {Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning}, author = {Wang, Haiming and Unsal, Mert and Lin, Xiaohan and Baksys, Mantas and Liu, Junqi and Santos, Marco Dos and Sung, Flood and Vinyes, Marina and Ying, Zhenzhe and Zhu, Zekai and Lu, Jianqiao and Saxcé, Hugues de and Bailey, Bolton and Song, Chendong and Xiao, Chenjun and Zhang, Dehao and Zhang, Ebony and Pu, Frederick and Zhu, Han and Liu, Jiawei and Bayer, Jonas and Michel, Julien and Yu, Longhui and Dreyfus-Schmidt, Léo and Tunstall, Lewis and Pagani, Luigi and Machado, Moreira and Bourigault, Pauline and Wang, Ran and Polu, Stanislas and Barroyer, Thibaut and Li, Wen-Ding and Niu, Yazhe and Fleureau, Yann and Hu, Yangyang and Yu, Zhouliang and Wang, Zihan and Yang, Zhilin and Liu, Zhengying and Li, Jia}, year = {2025}, url = {http://arxiv.org/abs/2504.11354}, } ```

# NuminaMath-LEAN 数据集卡片 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/621660534b06824e1e5022ed/YrcCZYcgVEtKoRmN2sLVR.png) ## 数据集概述 NuminaMath-LEAN 是一个包含10万个采用Lean 4形式化的数学竞赛试题的大规模数据集。其源自[NuminaMath 1.5](https://huggingface.co/datasets/AI-MO/NuminaMath-1.5)数据集的高挑战性子集,聚焦国际数学奥林匹克(International Mathematical Olympiad,简称IMO)、美国数学奥林匹克(United States of America Mathematical Olympiad,简称USAMO)等顶级竞赛的试题。该数据集是目前规模最大的面向自动化定理证明器训练与评估的人工标注形式化陈述与证明集合,同时也是[Kimina-Prover 72B](https://huggingface.co/blog/AI-MO/kimina-prover)所依托的数据集。 ## 致谢 感谢所有参与构建本数据集的标注人员。同时我们衷心感谢Lean 4社区提供的宝贵支持。 ## 数据说明 `problem`、`question_type`、`answer`、`source`、`problem_type`字段与NuminaMath 1.5中的对应字段完全一致。 - `author`:形式化陈述的作者。若陈述由本项目的自动形式化模型生成,则标注为`autoformalizer`;若由标注人员人工完成,则标注为`human`。 - `formal_statement`:该问题的Lean 4形式化陈述,书写风格与miniF2F类似。 - `formal_ground_truth`:由人工标注人员编写的该问题的形式化证明。仅当`author == human`时可用。 - `ground_truth_type`:用于描述`formal_ground_truth`标注状态的字段: - `complete`:表示形式化证明完整无遗漏; - `with_sorry`:表示形式化证明在部分`have`语句中使用了`sorry`(Lean 4中的证明占位符); - `statement`:仅完成了问题陈述的形式化,证明部分仅用`sorry`填充。 - `formal_proof`:本项目的[模型](https://huggingface.co/AI-MO/Kimina-Prover-72B)在强化学习训练期间生成的形式化证明。仅当模型成功证明该问题时可用,否则为空字符串。 - `rl_data`:包含强化学习训练期间相关信息的字典: - `n_proofs`:该问题被送入模型的总次数。取值为0表示该问题从未被模型处理过。 - `n_correct_proofs`:模型成功解决该问题的次数。 - `win_rate`:`n_correct_proofs / n_proofs`,即模型解决该问题的胜率。 ## 构建方法 我们从NuminaMath 1.5数据集中筛选出最具挑战性的部分:首先移除了`cn_k12`及其他非数学竞赛类试题;随后因形式化难度较高,剔除了几何与组合数学类问题,最终得到本数据集。 我们还通过新增`olympiads-ref`子集扩充了原始的NuminaMath 1.5数据集,该子集收录了国际数学奥林匹克、国际数学奥林匹克备选题库、美国数学奥林匹克等顶尖竞赛的高难度试题,其解答与最终答案均从官方PDF文档中精心整理而来。 ## 兼容性 本数据集内的所有形式化内容均采用Lean 4编写,并已针对mathlib v4.15.0版本完成编译与验证。用户需确保其运行环境匹配该依赖项,以保证兼容性并避免编译错误。 ## 局限性 - 元数据缺失:部分元数据尚未完善,尤其是新增试题的元数据。例如,大量人工标注的试题缺失`question_type`字段,该问题将尽快修复。 - 参考答案错误:在形式化过程中,我们发现部分参考答案存在提取错误的情况,目前尚未开展修正参考答案的反向工作。 - 形式化陈述不准确:由于仅使用有限计算资源验证问题的反证或矛盾性,我们发现部分试题的形式化陈述存在错误(主要源于自动形式化工具的局限性)。我们将尽快发布修正后的形式化陈述版本。 ## 许可信息 本数据集采用Apache License, Version 2.0协议进行发布。 ## 引用信息 @article{kimina_prover_2025, title = {Kimina-Prover 预览:面向强化学习的大规模形式化推理模型}, author = {Wang, Haiming and Unsal, Mert and Lin, Xiaohan and Baksys, Mantas and Liu, Junqi and Santos, Marco Dos and Sung, Flood and Vinyes, Marina and Ying, Zhenzhe and Zhu, Zekai and Lu, Jianqiao and Saxcé, Hugues de and Bailey, Bolton and Song, Chendong and Xiao, Chenjun and Zhang, Dehao and Zhang, Ebony and Pu, Frederick and Zhu, Han and Liu, Jiawei and Bayer, Jonas and Michel, Julien and Yu, Longhui and Dreyfus-Schmidt, Léo and Tunstall, Lewis and Pagani, Luigi and Machado, Moreira and Bourigault, Pauline and Wang, Ran and Polu, Stanislas and Barroyer, Thibaut and Li, Wen-Ding and Niu, Yazhe and Fleureau, Yann and Hu, Yangyang and Yu, Zhouliang and Wang, Zihan and Yang, Zhilin and Liu, Zhengying and Li, Jia}, year = {2025}, url = {http://arxiv.org/abs/2504.11354}, }
提供机构:
maas
创建时间:
2025-08-01
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作