ReIFE

Name: ReIFE
Creator: maas
Published: 2025-12-05 16:21:58
License: No description available

ModelScope Community: updated 2025-12-05, indexed 2025-02-01

Download link:

https://modelscope.cn/datasets/yale-nlp/ReIFE

Resource description:

# ReIFE

This dataset contains the evaluation result collection for our work ["ReIFE: Re-evaluating Instruction-Following Evaluation"](https://arxiv.org/abs/2410.07069). It contains two subsets: `src` and `predictions`. The `src` subset contains the source datasets for evaluating LLM-evaluators. The `predictions` subset contains the evaluation results of the LLM-evaluators.

The source datasets are from the following previous works (please cite them if you use the datasets):

- [LLMBar](https://github.com/princeton-nlp/LLMBar?tab=readme-ov-file#hugging-face-datasets)
- [MTBench](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge#datasets)
- [InstruSum](https://github.com/yale-nlp/InstruSum?tab=readme-ov-file#benchmark-dataset)

The `predictions` subset contains the evaluation results of the 450 LLM-evaluators, consisting of 25 base LLMs and 18 evaluation protocols. The evaluation results are in the JSONL format. Each line is a JSON object containing the evaluation results of an LLM-evaluator on a dataset.

Please visit our GitHub repo for more details including dataset analysis: https://github.com/yale-nlp/ReIFE
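Since the `predictions` files are described as JSONL with one JSON object per line, a minimal reading sketch in Python might look like the following. The function names `parse_jsonl` and `read_jsonl` and the file path in the usage note are hypothetical. The sketch makes no assumption about the keys inside each object, because the description above does not list them.

```python
import json
from typing import Iterable, Iterator


def parse_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines, e.g. a trailing newline
            yield json.loads(line)


def read_jsonl(path: str) -> list:
    """Read a whole JSONL file into a list of parsed objects."""
    with open(path, encoding="utf-8") as handle:
        return list(parse_jsonl(handle))
```

For example, `records = read_jsonl("predictions/some_file.jsonl")` (a placeholder path, not an actual file name from the dataset) would return a list in which each element corresponds to one line of the file. Inspecting `records[0].keys()` is a reasonable first step to discover the actual schema.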


Provider:

maas

Created:

2025-01-29
