hazyresearch/MATH500_with_Llama_3.1_70B_Instruct_v1

Name: hazyresearch/MATH500_with_Llama_3.1_70B_Instruct_v1
Creator: hazyresearch
Published: 2025-06-24 04:26:09
License: MIT (as stated in the dataset description)

Source: Hugging Face. Updated 2025-06-24; indexed 2025-07-05.

Download link:

https://hf-mirror.com/datasets/hazyresearch/MATH500_with_Llama_3.1_70B_Instruct_v1

Description:

MATH-500 with Llama-3.1-70B-Instruct contains 500 mathematical reasoning problems from the MATH benchmark, each paired with 100 candidate responses generated by Llama-3.1-70B-Instruct. Every response has been evaluated for correctness and scored by multiple reward models and LM judges. Fields include instruction (the prompt), samples (the candidate responses), extracted_answers (the answers extracted from those responses), and answer_correct (whether each answer is correct), along with verdicts and scores from the individual models. The dataset is released under the MIT license and can be used with the Weaver framework to train and evaluate verifier aggregation methods.
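As a rough illustration of how the fields named above might be used, the sketch below computes three common summary numbers over rows shaped like this dataset: mean per-sample accuracy, coverage (at least one correct sample per problem), and majority-vote accuracy. It assumes that extracted_answers and answer_correct are per-problem lists aligned by sample index, which the description does not state explicitly. The toy rows and the helper names majority_vote_correct and summarize are hypothetical and exist only for this example; real rows would come from the Hugging Face dataset itself.

```python
from collections import Counter

# Assumption: each row holds one problem, and "extracted_answers" and
# "answer_correct" are lists aligned by sample index. Real rows could be
# loaded with the Hugging Face `datasets` library; toy rows are used here
# so the sketch runs offline.

def majority_vote_correct(row):
    """Return True if the most frequent extracted answer is marked correct."""
    answers = row["extracted_answers"]
    flags = row["answer_correct"]
    top_answer, _ = Counter(answers).most_common(1)[0]
    # Use the correctness flag of the first sample that gave the top answer.
    return bool(flags[answers.index(top_answer)])

def summarize(rows):
    """Compute per-sample accuracy, coverage, and majority-vote accuracy."""
    n = len(rows)
    sample_accuracy = sum(
        sum(r["answer_correct"]) / len(r["answer_correct"]) for r in rows
    ) / n
    coverage = sum(any(r["answer_correct"]) for r in rows) / n
    majority_vote = sum(majority_vote_correct(r) for r in rows) / n
    return {
        "sample_accuracy": sample_accuracy,
        "coverage": coverage,
        "majority_vote": majority_vote,
    }

# Hypothetical toy rows with 3 samples each (the real dataset has 100).
toy_rows = [
    {
        "instruction": "What is 2 + 2?",
        "samples": ["... so the answer is 4", "... 4", "... 5"],
        "extracted_answers": ["4", "4", "5"],
        "answer_correct": [True, True, False],
    },
    {
        "instruction": "What is 3 * 3?",
        "samples": ["... 6", "... 6", "... 9"],
        "extracted_answers": ["6", "6", "9"],
        "answer_correct": [False, False, True],
    },
]

stats = summarize(toy_rows)
print(stats)
```

The gap between coverage and majority-vote accuracy in the toy rows is the kind of headroom that verifier scores are meant to close: a correct sample exists for the second problem, but plain voting does not select it.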

Provided by:

hazyresearch
