JEEBENCH
Source: arXiv · Updated 2023-10-23 · Indexed 2024-06-21
Download link:
https://github.com/dair-iitd/jeebench
Summary:
The JEEBENCH dataset, curated by researchers at Microsoft Research, contains 515 challenging mathematics, physics, and chemistry problems drawn from India's IIT JEE-Advanced examination. It is designed to evaluate the problem-solving ability of large language models (LLMs) on questions that demand deep domain knowledge and long-horizon reasoning. The problems span multiple scientific domains and require a model to translate abstract concepts into mathematical equations and then carry out complex algebraic and arithmetic manipulation. Its main applications are in education and technical assessment, particularly for solving complex science and engineering problems.
Provider:
Microsoft Research
Created:
2023-05-24
Collected Summary
Dataset Introduction

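Background and Challenges
Background Overview
JEEBENCH, curated by Microsoft Research, contains 515 challenging mathematics, physics, and chemistry problems from India's IIT JEE-Advanced examination and is intended to evaluate the deep domain knowledge and long-horizon reasoning of large language models. The dataset covers multiple scientific domains and requires models to turn abstract concepts into mathematical equations and perform complex operations on them. It is used mainly in education and technical assessment, especially for solving complex science and engineering problems.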



