MUSTARDSAUCE Mathematical Theorem Problem Dataset
Source: 超神经 (HyperAI) · Updated 2024-05-11 · Indexed 2024-05-15
Download link:
https://hyper.ai/cn/datasets/31498
Resource overview:
Researchers from City University of Hong Kong, Sun Yat-sen University, Huawei Noah's Ark Lab, and other institutions proposed MUSTARD, a unified framework for synthesizing mathematical reasoning data. The framework can generate large volumes of high-quality mathematical reasoning data that is both correct and human-readable. MUSTARDSAUCE is the dataset open-sourced with this research. Each entry pairs a natural-language problem statement and multi-step solution with a corresponding formal problem statement and multi-step solution written in Lean 3. The dataset covers math word problems and theorem-proving problems, with difficulty levels ranging from primary school to higher education. The number of reasoning steps grows with problem difficulty; the hardest problems require roughly 30 solution steps and about 20 Lean 3 tactics.
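To make the paired informal/formal format concrete, the sketch below shows what a small theorem-proving entry of this kind might look like. It is a hypothetical example written for illustration and is not taken from MUSTARDSAUCE; the statement, the informal proof, and the choice of tactics are all assumptions. It relies only on lemmas from the Lean 3 core library (`nat.add_zero`, `nat.add_comm`).

```lean
-- Hypothetical entry, for illustration only; NOT an actual MUSTARDSAUCE record.
--
-- Informal statement: for all natural numbers a and b, (a + b) + 0 = b + a.
-- Informal proof:
--   Step 1. Adding zero changes nothing, so (a + b) + 0 = a + b.
--   Step 2. Addition on the natural numbers is commutative, so a + b = b + a.

example (a b : ℕ) : (a + b) + 0 = b + a :=
begin
  rw nat.add_zero,  -- step 1: rewrite (a + b) + 0 to a + b
  rw nat.add_comm,  -- step 2: rewrite a + b to b + a, which closes the goal
end
```

Here each informal step corresponds to one tactic. In harder problems the informal step count and the tactic count need not match, as the roughly 30 steps versus about 20 tactics reported for the hardest problems suggest.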
Created:
2024-05-06
Collected Summary
Dataset Introduction

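Background and Challenges
Background Overview
MUSTARDSAUCE is a mathematical theorem problem dataset built by several research institutions using the MUSTARD framework. It contains problem statements and multi-step solutions in both natural language and the formal language Lean 3. The dataset spans difficulty levels from primary school to higher education, and its most complex problems require about 30 reasoning steps.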
The content above was collected and summarized by 遇见数据集.



