MATH-lighteval

Name: MATH-lighteval
Creator: maas
Published: 2026-05-07 15:09:22
License: No description available

ModelScope community; updated 2026-05-07; indexed 2025-02-15

Download link:

https://modelscope.cn/datasets/swift/MATH-lighteval


Resource description:

# Dataset Card for Mathematics Aptitude Test of Heuristics (MATH) dataset in lighteval format

## Table of Contents

- [Table of Contents](#table-of-contents)
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
  - [Builder Configs](#builder-configs)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)

## Dataset Description

- **Homepage:** https://github.com/hendrycks/math
- **Repository:** https://github.com/hendrycks/math
- **Paper:** https://arxiv.org/pdf/2103.03874.pdf
- **Leaderboard:** N/A
- **Point of Contact:** Dan Hendrycks

### Dataset Summary

The Mathematics Aptitude Test of Heuristics (MATH) dataset consists of problems from mathematics competitions, including the AMC 10, AMC 12, AIME, and more. Each problem in MATH has a full step-by-step solution, which can be used to teach models to generate answer derivations and explanations.

This version of the dataset contains appropriate builder configs so that it can be used as a drop-in replacement for the inexplicably missing `lighteval/MATH` dataset.

## Dataset Structure

### Data Instances

A data instance consists of a competition math problem and its step-by-step solution written in LaTeX and natural language. The step-by-step solution contains the final answer enclosed in LaTeX's `\boxed` tag.

An example from the dataset is:

```
{'problem': 'A board game spinner is divided into three parts labeled $A$, $B$ and $C$. The probability of the spinner landing on $A$ is $\\frac{1}{3}$ and the probability of the spinner landing on $B$ is $\\frac{5}{12}$. What is the probability of the spinner landing on $C$? Express your answer as a common fraction.',
 'level': 'Level 1',
 'type': 'Counting & Probability',
 'solution': 'The spinner is guaranteed to land on exactly one of the three regions, so we know that the sum of the probabilities of it landing in each region will be 1. If we let the probability of it landing in region $C$ be $x$, we then have the equation $1 = \\frac{5}{12}+\\frac{1}{3}+x$, from which we have $x=\\boxed{\\frac{1}{4}}$.'}
```

### Data Fields

* `problem`: The competition math problem.
* `solution`: The step-by-step solution.
* `level`: The problem's difficulty level from 'Level 1' to 'Level 5', where a subject's easiest problems for humans are assigned to 'Level 1' and a subject's hardest problems are assigned to 'Level 5'.
* `type`: The subject of the problem: Algebra, Counting & Probability, Geometry, Intermediate Algebra, Number Theory, Prealgebra and Precalculus.

### Data Splits

* train: 7,500 examples
* test: 5,000 examples

### Builder Configs

* default: 7,500 train and 5,000 test examples (full dataset)
* algebra: 1,744 train and 1,187 test examples
* counting_and_probability: 771 train and 474 test examples
* geometry: 870 train and 479 test examples
* intermediate_algebra: 1,295 train and 903 test examples
* number_theory: 869 train and 540 test examples
* prealgebra: 1,205 train and 871 test examples
* precalculus: 746 train and 546 test examples

## Additional Information

### Licensing Information

https://github.com/hendrycks/math/blob/main/LICENSE

This repository was created from the [hendrycks/competition_math](https://huggingface.co/datasets/hendrycks/competition_math) dataset. All credit goes to the original authors.
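As a quick consistency check on the Data Splits and Builder Configs figures listed in this card, the per-subject counts can be summed and compared with the full-dataset totals. The sketch below only hard-codes the numbers stated in the card; the dictionary name `SUBJECT_COUNTS` is illustrative and not part of the dataset.

```python
# Per-subject (train, test) counts copied from the Builder Configs list.
# SUBJECT_COUNTS is an illustrative name, not something the dataset defines.
SUBJECT_COUNTS = {
    "algebra": (1744, 1187),
    "counting_and_probability": (771, 474),
    "geometry": (870, 479),
    "intermediate_algebra": (1295, 903),
    "number_theory": (869, 540),
    "prealgebra": (1205, 871),
    "precalculus": (746, 546),
}

# Sum each split across the seven subject configs.
train_total = sum(train for train, _ in SUBJECT_COUNTS.values())
test_total = sum(test for _, test in SUBJECT_COUNTS.values())

print(train_total, test_total)  # 7500 5000
```

The totals match the `default` config, which suggests the seven subject configs partition the full dataset with no overlap, assuming the listed counts are accurate.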
### Citation Information

```bibtex
@article{hendrycksmath2021,
  title={Measuring Mathematical Problem Solving With the MATH Dataset},
  author={Dan Hendrycks and Collin Burns and Saurav Kadavath and Akul Arora and Steven Basart and Eric Tang and Dawn Song and Jacob Steinhardt},
  journal={arXiv preprint arXiv:2103.03874},
  year={2021}
}
```

### Contributions

Thanks to [@hacobe](https://github.com/hacobe) for adding this dataset.
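Because each `solution` encloses its final answer in `\boxed{...}`, evaluation code typically needs to pull that answer out of the text. The following is a minimal sketch of one way to do so with brace matching; the function name `extract_boxed_answer` is hypothetical and is not taken from the dataset or from lighteval. It assumes the answer uses the braced form and takes the last `\boxed{` occurrence as the final answer.

```python
from typing import Optional


def extract_boxed_answer(solution: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in `solution`, or None.

    Hypothetical helper for illustration: it matches nested braces but does
    not handle escaped braces (\\{, \\}) or the brace-less form `\\boxed 5`.
    """
    marker = "\\boxed{"
    start = solution.rfind(marker)
    if start == -1:
        return None

    content_start = start + len(marker)
    depth = 0  # nesting depth of braces opened inside the boxed content
    for i in range(content_start, len(solution)):
        ch = solution[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            if depth == 0:
                # This brace closes \boxed{ itself.
                return solution[content_start:i]
            depth -= 1
    return None  # braces never balanced


# Tail of the example solution shown in the Data Instances section.
solution = "from which we have $x=\\boxed{\\frac{1}{4}}$."
print(extract_boxed_answer(solution))  # \frac{1}{4}
```

A regular expression such as `\\boxed\{(.*?)\}` would stop at the first closing brace and return `\frac{1`, which is why the sketch counts braces instead.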

### Dataset Curators

Dan Hendrycks

Provider: maas

Created: 2025-02-10
