hazyresearch/MATH500_with_Llama_3.1_70B_Instruct_v1
Source: Hugging Face | Updated: 2025-06-24 | Indexed: 2025-07-05
Download link:
https://hf-mirror.com/datasets/hazyresearch/MATH500_with_Llama_3.1_70B_Instruct_v1
Resource description:
MATH-500 with Llama-3.1-70B-Instruct contains 500 mathematical reasoning problems from the MATH benchmark, each paired with 100 candidate responses generated by the Llama-3.1-70B-Instruct model. Every response has been evaluated for correctness and scored by multiple reward models and LM judges. Fields include instruction (the problem instruction), samples (the sample answers), extracted_answers (the answers extracted from the samples), and answer_correct (whether each answer is correct), along with multiple verdicts and scores from different models. The dataset is released under the MIT license and can be used with the Weaver framework to train and evaluate verifier aggregation methods.
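As an illustration of how the fields named above fit together, here is a minimal sketch that summarizes one problem's candidate answers. It assumes that extracted_answers and answer_correct are parallel lists with one entry per sample, which the description does not state explicitly. The helper summarize_row and the toy row are hypothetical and are not part of the dataset or of Weaver.

```python
from collections import Counter


def summarize_row(row):
    """Summarize the candidate answers for a single problem.

    Assumption: row["extracted_answers"] and row["answer_correct"] are
    parallel lists, one entry per generated sample.
    """
    answers = row["extracted_answers"]
    correct = row["answer_correct"]
    if not answers or len(answers) != len(correct):
        raise ValueError("expected non-empty, equally long answer lists")

    # Majority vote: the most frequent extracted answer wins.
    majority_answer, _ = Counter(answers).most_common(1)[0]

    # The vote counts as correct if a sample giving that answer is marked correct.
    majority_correct = any(
        is_ok for ans, is_ok in zip(answers, correct) if ans == majority_answer
    )

    return {
        "sample_accuracy": sum(correct) / len(correct),  # fraction of correct samples
        "any_correct": any(correct),  # at least one sample is right
        "majority_answer": majority_answer,
        "majority_correct": majority_correct,
    }


# Made-up row with 4 samples instead of 100, for illustration only.
toy_row = {
    "instruction": "What is 2 + 3?",
    "samples": ["2 + 3 = 5", "The sum is 5", "I get 6", "Answer: 5"],
    "extracted_answers": ["5", "5", "6", "5"],
    "answer_correct": [True, True, False, True],
}

stats = summarize_row(toy_row)
print(stats)
```

Majority voting is only one simple baseline for picking among candidates. The reward-model and LM-judge scores in the dataset are the inputs a verifier aggregation method would combine instead.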
Provider:
hazyresearch



