MM_Math
Source: ModelScope community · Updated 2025-12-31 · Indexed 2025-06-14
Download link:
https://modelscope.cn/datasets/THU-KEG/MM_Math
Resource description:
# MM_Math Dataset
We introduce MM-MATH, our multimodal mathematics dataset.
The dataset is collected from real middle school exams in China. All problems are open-ended, so they can be used to evaluate the mathematical problem-solving abilities of current multimodal models. Each problem in MM-MATH carries fine-grained labels along three dimensions: difficulty, grade, and knowledge points. The difficulty level is determined from the average scores students obtained on the exams, the grade label is derived from the grade-level curriculum of the exam the problem was collected from, and the knowledge points are categorized by teachers according to the problem's content.
## MM_Math Description
The MM_Math dataset consists of two files:
1. **Image.zip**: This archive includes images used in the problems.
2. **MM_Math.jsonl**: This file contains the collected middle school exam questions, including the problem statement, the solution process, and annotations along the three dimensions.
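
Since `MM_Math.jsonl` is a JSON Lines file, it can be read one record per line. The following is a minimal sketch, assuming the file is UTF-8 encoded and each non-empty line holds one JSON object; the helper name `iter_records` is hypothetical and not part of the dataset.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Iterator


def iter_records(jsonl_path: str | Path) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSON Lines file.

    Assumption: the file is UTF-8 and each line is a standalone JSON object.
    """
    with open(jsonl_path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Report the offending line so a damaged download is easy to locate.
                raise ValueError(f"{jsonl_path}:{line_number}: invalid JSON") from exc
```

A typical call would be `records = list(iter_records("MM_Math.jsonl"))`, run after downloading the file and extracting `Image.zip` next to it.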
## Data Format
All data in **MM-Math** are standardized to the following format:
```json
{
"question": "The text of the question statement, written in LaTeX.",
"file_name": "The names of the question's images in the image folder.",
"solution": "The text of the question's solution, written in LaTeX.",
"year": "The grade level, annotated from the grade of the exam the question was taken from.",
"difficult": "The difficulty level, annotated from exam scores.",
"knowledge": "The knowledge points covered by the question, annotated by middle school teachers."
}
```
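
The format above can be mapped onto a small typed record. This is a sketch under stated assumptions, not an official loader: the class `MMMathRecord` and the helper `count_by_difficulty` are hypothetical names, `file_name` is assumed to be either a single string or a list of strings, and the image directory is whatever folder `Image.zip` was extracted to.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Iterable


@dataclass
class MMMathRecord:
    """One MM-MATH problem, using the field names from the data format."""

    question: str         # problem statement (LaTeX)
    file_name: list[str]  # image file names inside the image folder
    solution: str         # solution process (LaTeX)
    year: Any             # grade label, kept as stored
    difficult: str        # difficulty label
    knowledge: Any        # knowledge point(s), kept as stored

    @classmethod
    def from_dict(cls, raw: dict) -> "MMMathRecord":
        # Assumption: file_name may be one name or a list of names.
        names = raw.get("file_name", [])
        if isinstance(names, str):
            names = [names]
        return cls(
            question=raw["question"],
            file_name=list(names),
            solution=raw["solution"],
            year=raw["year"],
            difficult=str(raw["difficult"]),
            knowledge=raw["knowledge"],
        )

    def image_paths(self, image_dir: str | Path) -> list[Path]:
        """Resolve image names against the extracted image folder."""
        return [Path(image_dir) / name for name in self.file_name]


def count_by_difficulty(records: Iterable[MMMathRecord]) -> Counter:
    """Count how many problems carry each difficulty label."""
    return Counter(record.difficult for record in records)
```

Keeping `year` and `knowledge` untyped avoids guessing how the labels are encoded; only `difficult` is coerced to a string so it can serve as a counting key.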
Provider:
maas
Created:
2025-07-15



