DrawEduMath
Source: ModelScope community. Updated 2025-12-05; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/allenai/DrawEduMath
# DrawEduMath (Mirror)
This is a mirror of the original DrawEduMath dataset hosted at [Hugging Face](https://huggingface.co/datasets/Heffernan-WPI-Lab/DrawEduMath). This mirror is maintained by Kyle Lo (kylel_at_allenai_dot_org). We may make updates to this mirror that are not reflected on the original.
* *Mirror created*: August 20, 2025
* *Last updated*: August 20, 2025
## About the Dataset
DrawEduMath is a dataset of images of students' handwritten responses to math problems, paired with detailed descriptions written by
teachers and with question-answer (QA) pairs written by teachers or generated by models. The images show handwritten answers from U.S.-based students to 188 math problems spanning Grade 2
through high school.
The dataset comprises 1) 2,030 images of students' handwritten responses, 2) 2,030 free-form descriptions written by teachers, and
3) 11,661 question-answer (QA) pairs written by teachers plus 44,362 synthetically generated QA pairs. The synthetic pairs were created by two LLMs, GPT-4o and Claude,
which transformed facets extracted from the teachers' descriptions into QA pairs.
Quick links:
- 📃 [NAACL 2025 Paper](https://aclanthology.org/2025.naacl-long.352/) (🏆 Outstanding paper award)
- 📃 [NeurIPS'24 Math-AI Workshop Paper](https://openreview.net/attachment?id=0vQYvcinij&name=pdf)
## Data Source
The images in the DrawEduMath dataset are from [ASSISTments](https://new.assistments.org/), where students upload their handwritten math work and receive feedback from teachers.
To protect student privacy, our team ran multiple rounds of personally identifiable information (PII) removal.
In the first round, undergraduate research assistants at WPI reviewed each image and cropped it to remove irrelevant background, keeping only the relevant content. Any
remaining PII, such as student names, was masked with black rectangular boxes. The PII-redacted images then went
through a second round of filtering: the teachers who wrote the free-form descriptions flagged images that were too blurry
or still included PII. All flagged images were removed from the dataset.
## Download
In Python:
```python
from datasets import load_dataset

# Load the main QA table as a single split
ds = load_dataset("allenai/DrawEduMath", data_files="Data/DrawEduMath_QA.csv", split="train")
print(ds)
# Dataset({
#     features: ['Problem ID', 'Image Name', 'Image URL', 'Image SHA256', 'Image Caption', 'Facets By GPT4o', 'Facets By Claude', 'QA Teacher', 'QA GPT4o', 'QA Claude'],
#     num_rows: 2030
# })
```
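Because the 2,030 rows cover 188 problems, several images share the same `Problem ID`. The following is a minimal sketch of grouping responses by problem; it should work on any iterable of dict-like rows, such as `ds` above. The helper name `group_images_by_problem` and the toy rows are illustrative assumptions, not part of the dataset or its tooling.

```python
from collections import defaultdict


def group_images_by_problem(rows):
    """Map each `Problem ID` to the list of `Image Name` values that answer it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Problem ID"]].append(row["Image Name"])
    return dict(groups)


# Toy rows standing in for the dataset (IDs and file names are made up).
toy_rows = [
    {"Problem ID": "p1", "Image Name": "a.png"},
    {"Problem ID": "p2", "Image Name": "b.png"},
    {"Problem ID": "p1", "Image Name": "c.png"},
]
images_by_problem = group_images_by_problem(toy_rows)
```

With the real data, `group_images_by_problem(ds)` would give one entry per problem, which is convenient for splitting by problem rather than by image.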
## Data Format
Our main dataset file is `DrawEduMath_QA.csv`. It contains math problem IDs (`Problem ID`) and the filename of each student response to each problem (`Image Name`). Teacher-written captions and QA pairs are stored under `Image Caption` and `QA Teacher`, respectively. In our paper, we used Claude and GPT-4o to decompose the teacher-written captions into facets (`Facets By Claude` and `Facets By GPT4o`), which were then synthetically restructured into QA pairs (`QA Claude` and `QA GPT4o`).
You may use the following to load the csv cells that contain lists of QA pair dictionaries (e.g. the columns `QA Teacher`, `QA Claude`, `QA GPT4o`):
```python
import ast
import json


def load_qa_json(qa_pairs):
    """Parse one CSV cell into a list of QA-pair dictionaries."""
    qa = json.loads(qa_pairs)
    try:
        # If the first pass yields a string, parse it as a Python literal
        return ast.literal_eval(qa)
    except (ValueError, SyntaxError, TypeError):
        # The first pass already produced the list
        return qa


# here, "row" is one line of the csv file, as produced by a csv DictReader or pandas iterrows
for row in ds:
    qa = load_qa_json(row['QA Claude'].strip())
    for qa_dict in qa:
        question = qa_dict['question']
        answer = qa_dict['answer']
```
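For analysis it is often easier to work with one record per QA pair, tagged with the column it came from. The sketch below flattens the three QA columns into such records. It is self-contained, so it defines its own parser; `parse_qa_cell`, `flatten_qa`, and the toy row are hypothetical names and data, and the code assumes every QA cell is a non-empty string.

```python
import ast
import json

QA_COLUMNS = ("QA Teacher", "QA GPT4o", "QA Claude")


def parse_qa_cell(cell):
    """Parse one CSV cell into a list of {'question', 'answer'} dicts."""
    qa = json.loads(cell.strip())
    if isinstance(qa, str):
        # The cell was encoded twice: the inner string is a Python literal.
        qa = ast.literal_eval(qa)
    return qa


def flatten_qa(rows, columns=QA_COLUMNS):
    """Return one record per QA pair, tagged with its image and source column."""
    records = []
    for row in rows:
        for column in columns:
            for pair in parse_qa_cell(row[column]):
                records.append({
                    "image": row["Image Name"],
                    "source": column,
                    "question": pair["question"],
                    "answer": pair["answer"],
                })
    return records


# Toy row (made-up content) with one plain JSON cell, one empty list,
# and one doubly encoded cell.
toy_row = {
    "Image Name": "a.png",
    "QA Teacher": json.dumps([{"question": "Is there a number line?", "answer": "Yes"}]),
    "QA GPT4o": json.dumps([]),
    "QA Claude": json.dumps(str([{"question": "What is circled?", "answer": "5"}])),
}
records = flatten_qa([toy_row])
```

The resulting list of dicts can be passed directly to `pandas.DataFrame(records)` if a tabular view is preferred.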
Each image can be downloaded from URLs indicated in the `Image URL` column.
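A minimal sketch of downloading an image and checking it against the `Image SHA256` column follows. It assumes, without confirmation from the dataset card, that this column holds the hex-encoded SHA-256 digest of the raw image bytes; `sha256_hex` and `fetch_image` are hypothetical helper names.

```python
import hashlib
import urllib.request


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()


def fetch_image(url, expected_sha256=None, timeout=30):
    """Download `url` and optionally verify its SHA-256 digest.

    Assumption: `expected_sha256` is a hex digest of the raw file bytes,
    as the `Image SHA256` column name suggests.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    if expected_sha256 is not None:
        if sha256_hex(data) != expected_sha256.strip().lower():
            raise ValueError(f"SHA-256 mismatch for {url}")
    return data
```

For a dataset row, this could be called as `fetch_image(row["Image URL"], row["Image SHA256"])`; passing `None` as the second argument skips the check.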
## License
This dataset is licensed under CC-BY-NC-4.0. It is intended for research and educational purposes following ASSISTments's [Responsible Use Guidelines](https://sites.google.com/view/e-trials/resources/guidelines-for-drawedumath).
## Citation
```
@inproceedings{baral-etal-2025-drawedumath,
title = "{D}raw{E}du{M}ath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images",
author = "Baral, Sami and Lucy, Li and Knight, Ryan and Ng, Alice and Soldaini, Luca and Heffernan, Neil and Lo, Kyle",
booktitle = "NAACL",
month = apr,
year = "2025",
url = "https://aclanthology.org/2025.naacl-long.352/",
doi = "10.18653/v1/2025.naacl-long.352",
}
```
Provider: maas
Created: 2025-08-22



