MathEDU
MathEDU Dataset Overview
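Dataset Introduction
- Name: MathEDU
- Purpose: Support mathematics education by correcting students' errors in mathematical problem solving
- Data type: Authentic student solutions annotated with expert feedback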
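Dataset Structure
Each entry contains the following fields:
- id: Unique identifier, which maps to a problem in MathQA
- student_id: Student ID
- student_answer: The student's final answer
- student_process: The student's solution process (LaTeX format)
- correct_or_not: Answer correctness label (correct/wrong)
- the_reason_why_student_cant_solve_ch: Reason the student could not solve the problem (Chinese)
- the_reason_why_student_cant_solve_en: Reason the student could not solve the problem (English)
- teacher_review: Dictionary of teacher feedback, containing:
  - error_counts: Number of errors
  - error: List of error details, each item containing:
    - error_type: Error type (e.g. "Wrong mathematical operation/concept")
    - error_equation: The specific part of the solution where the error occurs
    - teacher_advice_ch: Teacher feedback (Chinese)
    - teacher_advice_en: Teacher feedback (English)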
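The field list above can be checked mechanically before entries are used for grading or fine-tuning. The following is a minimal sketch, not part of the MathEDU repository: the `validate_entry` helper is hypothetical, and the check that `error_counts` equals the length of the `error` list is an assumption inferred from the example entry rather than a documented guarantee.

```python
from typing import Any

# Top-level and per-error field names, copied from the field list above.
REQUIRED_FIELDS = {
    "id", "student_id", "student_answer", "student_process",
    "correct_or_not",
    "the_reason_why_student_cant_solve_ch",
    "the_reason_why_student_cant_solve_en",
    "teacher_review",
}
ERROR_FIELDS = {
    "error_type", "error_equation",
    "teacher_advice_ch", "teacher_advice_en",
}


def validate_entry(entry: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one entry (empty if none)."""
    problems: list[str] = []
    missing = REQUIRED_FIELDS - set(entry)
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if entry.get("correct_or_not") not in ("correct", "wrong"):
        problems.append("correct_or_not must be 'correct' or 'wrong'")
    review = entry.get("teacher_review")
    if isinstance(review, dict):
        errors = review.get("error", [])
        # Assumption (not stated in the card): error_counts == len(error).
        if review.get("error_counts") != len(errors):
            problems.append("error_counts does not match the number of error items")
        for i, err in enumerate(errors):
            missing_err = ERROR_FIELDS - set(err)
            if missing_err:
                problems.append(f"error[{i}] missing: {sorted(missing_err)}")
    return problems


# A hand-written sample that follows the documented schema.
sample = {
    "id": 9420,
    "student_id": 5,
    "student_answer": "3:5",
    "student_process": "ratio of de: bc equal to the ratio of the area, Ans: 3:5",
    "correct_or_not": "wrong",
    "the_reason_why_student_cant_solve_ch": "",
    "the_reason_why_student_cant_solve_en": "",
    "teacher_review": {
        "error_counts": 1,
        "error": [{
            "error_type": "Wrong mathematical operation/concept",
            "error_equation": "ratio of de: bc equal to the ratio of the area",
            "teacher_advice_ch": "觀念錯誤...",
            "teacher_advice_en": "The concept is incorrect...",
        }],
    },
}
print(validate_entry(sample))
```

Returning a list of problems instead of raising on the first one makes it easier to report every malformed entry in a single pass.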
Example Entry
    {
      "id": 9420,
      "student_id": 5,
      "student_answer": "3:5",
      "student_process": "ratio of de: bc equal to the ratio of the area, Ans: 3:5",
      "correct_or_not": "wrong",
      "the_reason_why_student_cant_solve_ch": "",
      "the_reason_why_student_cant_solve_en": "",
      "teacher_review": {
        "error_counts": 1,
        "error": [
          {
            "error_type": "Wrong mathematical operation/concept",
            "error_equation": "ratio of de: bc equal to the ratio of the area",
            "teacher_advice_ch": "觀念錯誤...",
            "teacher_advice_en": "The concept is incorrect..."
          }
        ]
      }
    }
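As an illustration of how such entries might be consumed, the sketch below parses the example entry with the standard `json` module and tallies error types across a list of entries. The `error_type_counts` function is a hypothetical helper, and the single-element list stands in for the full dataset, whose file name and on-disk layout are not given here.

```python
import json
from collections import Counter

# The example entry from this card, with the advice fields left truncated as shown.
raw = """
{
  "id": 9420,
  "student_id": 5,
  "student_answer": "3:5",
  "student_process": "ratio of de: bc equal to the ratio of the area, Ans: 3:5",
  "correct_or_not": "wrong",
  "the_reason_why_student_cant_solve_ch": "",
  "the_reason_why_student_cant_solve_en": "",
  "teacher_review": {
    "error_counts": 1,
    "error": [
      {
        "error_type": "Wrong mathematical operation/concept",
        "error_equation": "ratio of de: bc equal to the ratio of the area",
        "teacher_advice_ch": "觀念錯誤...",
        "teacher_advice_en": "The concept is incorrect..."
      }
    ]
  }
}
"""


def error_type_counts(entries):
    """Count how often each error_type appears across all entries."""
    counts = Counter()
    for entry in entries:
        for err in entry["teacher_review"]["error"]:
            counts[err["error_type"]] += 1
    return counts


entries = [json.loads(raw)]  # stand-in for the full dataset
counts = error_type_counts(entries)
print(counts.most_common())
# → [('Wrong mathematical operation/concept', 1)]
```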
Running Instructions
- Llama3 8B grading:
  python llama3_8b_grading.py
- Llama3 70B grading:
  python llama3_70b_grading.py
- GPT-3.5 grading:
  python gpt_3.5__grading.py
- o1-mini grading:
  python o1_mini_grading.py
- Response analysis:
  python response_analyze.py
- GPT-4 rating results:
  python gpt4_llm_rating.py
- Create fine-tuning data:
  python create_finetuned_data.py
- Llama3 8B fine-tuning:
  huggingface-cli login --token "your_hf_token"
  ACCELERATE_USE_FSDP=1 FSDP_CPU_RAM_EFFICIENT_LOADING=1 torchrun --nproc_per_node=4 train.py --config finetune.yaml
- Fine-tuned model inference:
  python inference.py --config finetune.yaml
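The grading and analysis scripts can also be chained from a small driver. This is a rough sketch under assumptions: the script names are taken from the list above, but the order in which they need to run, and whether they take arguments, is not specified here, so `build_commands` and `run_all` are hypothetical helpers that default to a dry run that only prints the commands.

```python
import subprocess
import sys

# Script names copied from the list above; running them in this order is an assumption.
GRADING_SCRIPTS = [
    "llama3_8b_grading.py",
    "llama3_70b_grading.py",
    "gpt_3.5__grading.py",
    "o1_mini_grading.py",
]
FOLLOW_UP_SCRIPTS = ["response_analyze.py", "gpt4_llm_rating.py"]


def build_commands(scripts):
    """Turn script names into argv lists that use the current interpreter."""
    return [[sys.executable, script] for script in scripts]


def run_all(scripts, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(scripts):
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing script.
            subprocess.run(cmd, check=True)


# Dry run: prints the six commands without executing anything.
run_all(GRADING_SCRIPTS + FOLLOW_UP_SCRIPTS)
```

Calling `run_all(..., dry_run=False)` from the repository root would execute the scripts one after another and stop at the first failure.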
Installing Dependencies
pip install -r requirements.txt




