miracl/nomiracl-instruct

Name: miracl/nomiracl-instruct
Creator: miracl
Published: 2024-11-23 18:33:23
License: not specified

Source: Hugging Face. Updated: 2024-11-23. Indexed: 2025-04-08.

Download link:

https://hf-mirror.com/datasets/miracl/nomiracl-instruct

Overview:

NoMIRACL is a fully human-annotated dataset for evaluating how well large language models (LLMs) judge relevance across 18 diverse languages. Relevance assessment is framed as a binary classification task over two subsets: non-relevant and relevant. The non-relevant subset contains queries for which an expert assessor manually judged every labeled passage non-relevant; the relevant subset contains queries with at least one labeled passage judged relevant. LLM performance is measured with two metrics: the hallucination rate on the non-relevant subset and the error rate on the relevant subset.
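The two metrics named above can be sketched in a few lines of Python. This is an illustrative reading of the description, not the dataset authors' evaluation code. It assumes each model output has already been reduced to a boolean meaning "the model claimed a relevant passage exists", and the function names `hallucination_rate` and `error_rate` are hypothetical.

```python
from typing import Sequence


def hallucination_rate(claims_relevant: Sequence[bool]) -> float:
    """Share of NON-RELEVANT-subset queries where the model still claimed
    that some passage was relevant (assumed definition)."""
    if not claims_relevant:
        raise ValueError("need at least one prediction")
    return sum(claims_relevant) / len(claims_relevant)


def error_rate(claims_relevant: Sequence[bool]) -> float:
    """Share of RELEVANT-subset queries where the model failed to
    recognise that a relevant passage was present (assumed definition)."""
    if not claims_relevant:
        raise ValueError("need at least one prediction")
    return sum(not c for c in claims_relevant) / len(claims_relevant)


# Toy predictions, invented purely for illustration.
non_relevant_preds = [True, False, False, True]  # 2 of 4 are hallucinations
relevant_preds = [True, True, False, True]       # 1 of 4 is an error

print(hallucination_rate(non_relevant_preds))  # 0.5
print(error_rate(relevant_preds))              # 0.25
```

Both rates are lower-is-better. Keeping them separate shows whether a model fails by answering when nothing relevant was retrieved or by abstaining when something relevant was.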

Provided by:

miracl
