absinth
Source: arXiv | Updated: 2024-03-14 | Indexed: 2024-06-21
Download link:
https://github.com/ZurichNLP/20Minuten
Description:
absinth is a manually annotated dataset created at ETH Zurich for hallucination detection in German news summarization. It contains 4314 article-summary sentence pairs, each labeled as faithful, intrinsic hallucination, or extrinsic hallucination. The data is drawn from 200 randomly sampled articles, for each of which seven summaries were generated using a range of models and methods; the summary sentences were then annotated by hand. The dataset is intended for evaluating how well models detect hallucinations, and more broadly for natural language processing work on keeping automatically generated summaries consistent with their source documents.
Provided by:
ETH Zurich
Created:
2024-03-06



