Ding-Qiang/veri-gray
Source: Hugging Face · Updated: 2025-10-21 · Indexed: 2025-10-25
Download link:
https://hf-mirror.com/datasets/Ding-Qiang/veri-gray
Description:
VeriGray is a new benchmark for detecting unfaithfulness in summarization, designed to evaluate how faithful large language models remain to the source document when they generate summaries. It introduces an intermediate category, *Out-Dependent*, to address the annotation ambiguity found in existing benchmarks. The dataset's statistics show that even state-of-the-art models such as GPT-5 hallucinate in summarization tasks. The dataset also includes the external-knowledge webpages associated with *Out-Dependent* instances.
Provided by:
Ding-Qiang



