OpenEvals/IMO-AnswerBench
Source: Hugging Face. Updated 2026-01-23. Indexed 2026-02-07.
Download link:
https://hf-mirror.com/datasets/OpenEvals/IMO-AnswerBench
Description:
IMO-AnswerBench is a benchmark dataset for evaluating the mathematical reasoning capabilities of large language models. It contains 400 challenging short-answer problems drawn from the International Mathematical Olympiad (IMO) and other sources. The dataset is part of the IMO-Bench suite, which Google DeepMind released in conjunction with its 2025 IMO gold medal achievement. The primary task is mathematical problem solving: given a problem statement, a model must produce a short, verifiable answer. The problems are written in English, with mathematical notation in LaTeX.
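Because each problem has a short answer written in LaTeX, a first-pass check can compare a model's answer with the reference after light normalization. The following is a minimal sketch of that idea, not the benchmark's official grader. The function names `normalize_answer` and `is_correct` and the specific normalization rules are assumptions made for illustration.

```python
import re


def normalize_answer(ans: str) -> str:
    """Canonicalize a short LaTeX answer for a naive string comparison.

    Hypothetical helper: these rules are illustrative assumptions,
    not the benchmark's own grading procedure.
    """
    s = ans.strip()
    # Drop surrounding math delimiters such as $...$ or \( ... \).
    s = re.sub(r"^\$+|\$+$", "", s)
    s = re.sub(r"^\\\(|\\\)$", "", s)
    # Unwrap a \boxed{...} wrapper when it spans the whole answer.
    m = re.fullmatch(r"\\boxed\{(.*)\}", s.strip())
    if m:
        s = m.group(1)
    # Treat \dfrac and \tfrac as \frac, then ignore all whitespace.
    s = re.sub(r"\\[dt]frac", r"\\frac", s)
    s = re.sub(r"\s+", "", s)
    return s


def is_correct(predicted: str, reference: str) -> bool:
    """Return True when both answers normalize to the same string."""
    return normalize_answer(predicted) == normalize_answer(reference)


if __name__ == "__main__":
    print(is_correct(r"$\boxed{ \dfrac{1}{2} }$", r"\frac{1}{2}"))
    print(is_correct("3", "4"))
```

String matching of this kind only catches answers that differ in formatting. Mathematically equivalent forms, such as 0.5 and \frac{1}{2}, are not recognized as equal, so a real evaluation would need a symbolic or model-based equivalence check.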
Provided by:
OpenEvals



