Lean-Workbook
收藏魔搭社区2026-01-06 更新2025-05-31 收录
下载链接:
https://modelscope.cn/datasets/Shanghai_AI_Laboratory/Lean-Workbook
下载链接
链接失效反馈官方服务:
资源简介:
# Lean Workbook
This dataset is about contest-level math problems formalized in Lean 4.
Our dataset contains 57231 problems in the split of Lean Workbook and 82893 problems in the split of Lean Workbook Plus. We provide the natural language statement, answer, formal statement, and formal proof (if available) for each problem. These data can support autoformalization model training and searching for proofs.
We open-source our [code](https://github.com/InternLM/InternLM-Math) and our [data](https://huggingface.co/datasets/InternLM/Lean-Workbook).
Our test environment is based on Lean v4.8.0-rc1 with Mathlib4 of the same version (which can be cloned by specifying the tag v4.8.0-rc1).
# Citation
```
@misc{ying2024lean,
title={Lean Workbook: A large-scale Lean problem set formalized from natural language math problems},
author={Huaiyuan Ying and Zijian Wu and Yihan Geng and Jiayu Wang and Dahua Lin and Kai Chen},
year={2024},
eprint={2406.03847},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
# Lean工作簿(Lean Workbook)
本数据集收录了使用Lean 4形式化的竞赛级数学问题。
本数据集包含两类划分子集:Lean工作簿(Lean Workbook)子集含57231道题目,Lean工作簿增强版(Lean Workbook Plus)子集含82893道题目。我们为每道题目提供自然语言题干、答案、形式化陈述,以及(若存在)形式化证明。此类数据可用于支撑自动形式化(autoformalization)模型训练与定理搜索任务。
我们开源了相关[代码](https://github.com/InternLM/InternLM-Math)与[数据集](https://huggingface.co/datasets/InternLM/Lean-Workbook)。
本测试环境基于Lean v4.8.0-rc1版本,配套同版本的Mathlib4(可通过指定标签v4.8.0-rc1进行克隆)。
# 引用
@misc{ying2024lean,
title={Lean Workbook: A large-scale Lean problem set formalized from natural language math problems},
author={Huaiyuan Ying and Zijian Wu and Yihan Geng and Jiayu Wang and Dahua Lin and Kai Chen},
year={2024},
eprint={2406.03847},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
提供机构:
maas
创建时间:
2025-05-29



