
s-nlp/PsiloQA

Hugging Face | Updated 2025-11-17 | Indexed 2025-10-18
Download link:
https://hf-mirror.com/datasets/s-nlp/PsiloQA
Description:
PsiloQA is a dataset for training and evaluating multilingual span-level hallucination detection systems. It contains question-answer pairs in 14 languages with high-quality span-level hallucination annotations. The dataset is built in four stages: multilingual QA generation, LLM hypothesis generation, span-level inconsistency annotation, and filtering. Evaluation uses the IoU metric, and several methods are compared across languages. Its limitations include annotation source bias, a narrow task scope, insufficient coverage of hallucination types, language resource imbalance, and dependence on Wikipedia. There is also a risk of model bias, because GPT-4o is used for both generation and annotation.
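
To make the IoU metric mentioned in the description concrete, here is a minimal sketch of how a span-level IoU could be computed. It assumes that spans are given as (start, end) character offsets with an exclusive end, and that IoU is taken over the sets of covered characters. The function names, the span representation, and the convention of returning 1.0 when both sides are empty are illustrative assumptions, not the dataset's official evaluation code.

```python
from typing import Iterable, Set, Tuple

# Assumed representation: (start, end) character offsets, end exclusive.
Span = Tuple[int, int]


def spans_to_chars(spans: Iterable[Span]) -> Set[int]:
    """Expand spans into the set of character positions they cover."""
    chars: Set[int] = set()
    for start, end in spans:
        chars.update(range(start, end))
    return chars


def span_iou(predicted: Iterable[Span], gold: Iterable[Span]) -> float:
    """Character-level intersection over union between two span sets."""
    pred_chars = spans_to_chars(predicted)
    gold_chars = spans_to_chars(gold)
    if not pred_chars and not gold_chars:
        # Assumed convention: no hallucination predicted and none annotated
        # counts as perfect agreement.
        return 1.0
    return len(pred_chars & gold_chars) / len(pred_chars | gold_chars)


if __name__ == "__main__":
    # Gold span covers characters 10..19, prediction covers 15..24:
    # 5 characters overlap out of 15 covered in total.
    print(span_iou([(15, 25)], [(10, 20)]))
```

Working on character sets rather than on span pairs means that overlapping or adjacent predicted spans are merged automatically, so the score does not depend on how a detector splits its output.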
Provided by:
s-nlp