flowaicom/HaluEval

Name: flowaicom/HaluEval
Creator: flowaicom
Published: 2024-09-14 05:39:33
License: No description provided

Hugging Face, updated 2024-09-14, indexed 2025-04-08

Download link:

https://hf-mirror.com/datasets/flowaicom/HaluEval

Resource description:

This dataset contains the HaluEval subset of HaluBench, created by Patronus AI and available from [PatronusAI/HaluBench](https://huggingface.co/datasets/PatronusAI/HaluBench). The dataset was originally published in the paper _[HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models](https://arxiv.org/abs/2305.11747)_. It includes passages, questions, and answers along with evaluation labels that judge whether these answers are supported by the context provided in the document. During preprocessing, the original hallucination labels were mapped to 0 (indicating failure or hallucination) and 1 (indicating pass or no hallucination). The evaluation criteria and rubric are aligned with those used in the paper _[Lynx: An Open Source Hallucination Evaluation Model](https://arxiv.org/abs/2407.08488)_.
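The label mapping described above can be illustrated with a minimal sketch. The description only states the target values (0 for failure or hallucination, 1 for pass or no hallucination); the original label strings `"FAIL"` and `"PASS"`, the field names `label` and `score`, and the helper names `map_label` and `preprocess` are assumptions made for illustration, not the dataset's confirmed schema or the authors' preprocessing code.

```python
from typing import Dict, List

# Assumed original label strings (hypothetical); the description only
# specifies the numeric targets: 0 = fail/hallucination, 1 = pass/no hallucination.
LABEL_TO_SCORE: Dict[str, int] = {"FAIL": 0, "PASS": 1}


def map_label(raw_label: str) -> int:
    """Map an original hallucination label to the binary 0/1 score."""
    key = raw_label.strip().upper()
    if key not in LABEL_TO_SCORE:
        raise ValueError(f"Unexpected label: {raw_label!r}")
    return LABEL_TO_SCORE[key]


def preprocess(records: List[dict]) -> List[dict]:
    """Return copies of the records with an added integer 'score' field."""
    return [{**record, "score": map_label(record["label"])} for record in records]


# Toy records with hypothetical field names, not real dataset rows.
examples = [
    {"passage": "Paris is the capital of France.",
     "question": "What is the capital of France?",
     "answer": "Paris",
     "label": "PASS"},
    {"passage": "Paris is the capital of France.",
     "question": "What is the capital of France?",
     "answer": "Lyon",
     "label": "FAIL"},
]

processed = preprocess(examples)

# Fraction of answers judged as supported by the passage.
pass_rate = sum(record["score"] for record in processed) / len(processed)
```

Raising on an unknown label, rather than defaulting to 0 or 1, keeps unexpected values from being silently counted as either a hallucination or a pass.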

Provided by:

flowaicom
