bryanchrist/annotations
Hugging Face | Updated 2024-04-01 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/bryanchrist/annotations
Resource description:
---
license: gpl-3.0
---
## MATHWELL Human Annotation Dataset
The MATHWELL Human Annotation Dataset contains 4,734 synthetic word problems and answers generated by [MATHWELL](https://huggingface.co/bryanchrist/MATHWELL), a context-free grade school math word problem generator released in [MATHWELL: Generating Educational Math Word Problems at Scale](https://arxiv.org/abs/2402.15861), and by comparison models (GPT-4, GPT-3.5, Llama-2, MAmmoTH, and LLEMMA). Each question/answer pair carries expert human annotations for four labels: solvability, accuracy, appropriateness, and meets all criteria (MaC). Solvability means the problem is mathematically possible to solve. Accuracy means the Program of Thought (PoT) solution arrives at the correct answer. Appropriateness means that the mathematical topic is familiar to a grade school student and the question's context is appropriate for a young learner. MaC marks questions that are labeled solvable, accurate, and appropriate. Null values for accuracy and appropriateness indicate a question labeled as unsolvable, which means it cannot have an accurate solution and is automatically inappropriate. Based on our annotations, 82.2% of the question/answer pairs are solvable, 87.2% have accurate solutions, 68.6% are appropriate, and 58.8% meet all criteria.
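The relationship between the three base labels and MaC can be sketched in a few lines of Python. This is a minimal illustration of the definition above, not the authors' code; the function name and the use of `None` to represent a null label are assumptions.

```python
from typing import Optional


def meets_all_criteria(
    solvable: bool,
    accurate: Optional[bool],
    appropriate: Optional[bool],
) -> bool:
    """Return True only when a question is solvable, accurate, and appropriate.

    None stands in for a null label (an assumption about representation).
    Per the description above, nulls occur for unsolvable questions, which
    can never meet all criteria.
    """
    if not solvable:
        return False
    # A null (None) accuracy or appropriateness label is treated as not met.
    return accurate is True and appropriate is True
```

For example, `meets_all_criteria(True, True, False)` is `False` because the question fails appropriateness, and `meets_all_criteria(False, None, None)` is `False` because the question is unsolvable.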
This dataset is designed to train text classifiers to automatically label word problem generator outputs for solvability, accuracy, and appropriateness. More details about the dataset can be found in our [paper](https://arxiv.org/abs/2402.15861).
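One way to prepare the annotations for classifier training is to build a separate list of (text, label) pairs per criterion. The sketch below assumes hypothetical column names (`question`, `answer`, `solvability`, `accuracy`, `appropriateness`) that should be checked against the actual dataset schema. It also makes one design choice that is an assumption rather than the authors' method: rows with a null label are skipped for that criterion instead of being counted as negatives.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical column names; verify against the real dataset before use.
LABELS = ("solvability", "accuracy", "appropriateness")

Row = Dict[str, object]
Example = Tuple[str, int]


def build_training_sets(rows: List[Row]) -> Dict[str, List[Example]]:
    """Group annotated rows into one (text, label) list per criterion."""
    sets: Dict[str, List[Example]] = {label: [] for label in LABELS}
    for row in rows:
        # Classifier input: the question followed by its generated solution.
        text = f"{row['question']}\n{row['answer']}"
        for label in LABELS:
            value: Optional[object] = row.get(label)
            if value is None:
                # Null label (unsolvable question): skip for this criterion.
                continue
            sets[label].append((text, int(value)))
    return sets
```

Treating nulls as negatives instead would also be consistent with the card's statement that unsolvable questions cannot be accurate and are automatically inappropriate; which option is better depends on how the classifiers will be chained at inference time.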
## Citation
```bibtex
@misc{christ2024mathwell,
title={MATHWELL: Generating Educational Math Word Problems at Scale},
author={Bryan R Christ and Jonathan Kropko and Thomas Hartvigsen},
year={2024},
eprint={2402.15861},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
Provider:
bryanchrist
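Summary of original information
MATHWELL Human Annotation Dataset
Dataset overview
- Dataset name: MATHWELL Human Annotation Dataset
- Size: 4,734 synthetic math word problems with answers.
- Data source: generated by MATHWELL, a context-free grade school math word problem generator.
- Comparison models: GPT-4, GPT-3.5, Llama-2, MAmmoTH, and LLEMMA.
- Annotations: expert labels for solvability, accuracy, appropriateness, and meets all criteria (MaC).
  - Solvability: whether the problem is mathematically possible to solve.
  - Accuracy: whether the Program of Thought (PoT) solution arrives at the correct answer.
  - Appropriateness: whether the mathematical topic is familiar to a grade school student and the question's context suits a young learner.
  - MaC: the question is labeled solvable, accurate, and appropriate.
- Annotation results:
  - 82.2% of the question/answer pairs are solvable.
  - 87.2% have accurate solutions.
  - 68.6% are appropriate.
  - 58.8% meet all criteria.
Dataset use
- Training text classifiers that automatically label word problem generator outputs for solvability, accuracy, and appropriateness.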



