stair-lab/reeval

Name: stair-lab/reeval
Creator: stair-lab
Published: 2025-06-21 13:26:09
License: not specified

Hugging Face | Updated 2025-06-21 | Indexed 2025-07-05

Download link:

https://hf-mirror.com/datasets/stair-lab/reeval

Description:

This dataset was built following the paper "Reliable and Efficient Amortized Model-based Evaluation". It converts HELM data into long format and response-matrix format, and uses two language models, Llama-3.1-8B-Instruct and Mistral-7B-Instruct-v0.3, to compute embeddings for the questions. It also includes test-taker ability parameters and question difficulty parameters for adaptive testing experiments.
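To make the two data layouts and the role of the ability and difficulty parameters concrete, here is a minimal sketch. The column names (`test_taker`, `question_id`, `correct`), the toy values, and the helper functions are hypothetical and are not taken from the dataset's actual schema. The item-selection step assumes a one-parameter logistic (Rasch) model, which is one common choice for adaptive testing and may differ from what the dataset authors use.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format records: one row per (test taker, question) pair.
# Column names and values are illustrative assumptions, not the dataset's schema.
long_df = pd.DataFrame(
    {
        "test_taker": ["model_a", "model_a", "model_b", "model_b", "model_c"],
        "question_id": ["q1", "q2", "q1", "q2", "q1"],
        "correct": [1, 0, 1, 1, 0],
    }
)


def to_response_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Pivot long-format rows into a test-taker x question matrix (NaN = unobserved)."""
    return df.pivot(index="test_taker", columns="question_id", values="correct")


def rasch_probability(ability: float, difficulty: np.ndarray) -> np.ndarray:
    """P(correct) under an assumed one-parameter logistic (Rasch) model."""
    return 1.0 / (1.0 + np.exp(-(ability - difficulty)))


def most_informative_item(ability: float, difficulty: np.ndarray) -> int:
    """Index of the item with the largest Fisher information, p * (1 - p)."""
    p = rasch_probability(ability, difficulty)
    return int(np.argmax(p * (1.0 - p)))


response_matrix = to_response_matrix(long_df)
print(response_matrix)

# Toy difficulty parameters for three hypothetical questions.
difficulties = np.array([-1.0, 0.2, 1.5])
next_item = most_informative_item(0.0, difficulties)
print(next_item)
```

Under the Rasch assumption, the most informative question is the one whose difficulty lies closest to the current ability estimate, which is why an adaptive test needs both sets of parameters.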

Provider:

stair-lab
