mlfoundations-dev/qwen2-5_nemotron-sft_100000_1743018696_eval_0981
Source: Hugging Face | Updated: 2025-03-26 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/mlfoundations-dev/qwen2-5_nemotron-sft_100000_1743018696_eval_0981
Description:
This dataset contains precomputed model outputs for evaluation on several benchmark tasks: AIME24, AIME25, AMC23, MATH500, GPQADiamond, and LiveCodeBench. Each task includes results from multiple runs, reporting the accuracy, the number of problems solved, and the total number of problems for each run.
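As a rough illustration of how the per-run figures relate, the sketch below derives accuracy from the solved and total counts and averages it over runs. The function name `summarize_runs` and the example counts are hypothetical and are not taken from the dataset.

```python
from statistics import mean, stdev


def summarize_runs(runs):
    """Summarize one task's runs.

    runs: list of (solved, total) pairs, one pair per run (assumed layout).
    """
    # Accuracy for a run is the fraction of problems solved.
    accuracies = [solved / total for solved, total in runs]
    return {
        "accuracies": accuracies,
        "mean": mean(accuracies),
        # Sample standard deviation needs at least two runs.
        "stdev": stdev(accuracies) if len(accuracies) > 1 else 0.0,
    }


# Hypothetical counts for illustration only; not real results.
example_runs = [(6, 30), (9, 30), (3, 30)]
summary = summarize_runs(example_runs)
print(f"mean accuracy over {len(example_runs)} runs: {summary['mean']:.3f}")
```

Reporting the spread alongside the mean is useful for small benchmarks, where a single problem can shift accuracy by several points between runs.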
Provided by:
mlfoundations-dev



