AGIEval

Source: Opencsg · Updated: 2024-04-24 · Indexed: 2024-06-22

Download link:

https://www.opencsg.com/datasets/OpenDataLab/AGIEval



Resource description:

AGIEval is a human-centric benchmark specifically designed to evaluate the general abilities of foundation models in tasks pertinent to human cognition and problem-solving. This benchmark is derived from 20 official, public, and high-standard admission and qualification exams intended for general human test-takers, such as general college admission tests (e.g., Chinese College Entrance Exam (Gaokao) and American SAT), law school admission tests, math competitions, lawyer qualification tests, and national civil service exams. For a full description of the benchmark


Created:

2024-04-24

Collected summary


Background overview

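AGIEval is a human-centric benchmark that evaluates the general abilities of foundation models on tasks related to human cognition and problem-solving. The dataset is derived from 20 official, public, high-standard admission and qualification exams, including the Chinese Gaokao, the American SAT, and law school admission tests. It is suited to text classification tasks and emphasizes the simulation and testing of human intelligence.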

The summary above was collected and generated by 遇见数据集.
