KQA-Pro+QA2HALL dataset

Source: Zenodo. Updated 2025-04-28; indexed 2026-05-26.
Download link:
https://zenodo.org/doi/10.5281/zenodo.15278335
Description:
This dataset is designed for detecting hallucinations in text generated by large language models (LLMs). It was created with the QA2HALL framework proposed in [1], which generates sentences containing synthetic hallucinations by combining questions from a Knowledge Graph Question Answering (KGQA) dataset with incorrect answers generated by LLMs. The underlying KGQA dataset is KQA Pro (https://huggingface.co/datasets/drt/kqa_pro). For more details about the dataset and the framework, please refer to the paper and the framework's GitHub repository.

File list:
- kqapro_qa2hall.json: the hallucination detection dataset, generated by applying QA2HALL to KQA Pro.
- kqapro_filterd.json: the filtered version of KQA Pro (train and validation sets) used for the experiments in [1]. This filtered version was used to construct kqapro_qa2hall.json.
- README.md: describes the format of the JSON files.

[1] Nakamura, K., Hasegawa, R., Otomura, K., Ichise, R., Hato, J.: QA2HALL: A Framework for Generating Non-trivial Hallucination Detection Datasets from KGQA Datasets. In: Proceedings of the 30th International Conference on Natural Language & Information Systems (2025) (In preparation)
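The JSON schema is documented only in the record's README.md, so a first look at the files is best done without assuming any field names. The following is a minimal Python sketch along those lines: the helper name `summarize_json` is hypothetical, and the local path assumes that kqapro_qa2hall.json has already been downloaded from the Zenodo record.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and describe its top-level structure.

    Hypothetical helper: it assumes nothing about the dataset's schema,
    which is documented in the record's README.md.
    """
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)

    summary = {"type": type(data).__name__}
    if isinstance(data, (list, dict)):
        summary["length"] = len(data)
    # If the file is a list of objects, report the first object's field names.
    if isinstance(data, list) and data and isinstance(data[0], dict):
        summary["first_record_keys"] = sorted(data[0])
    # If the file is a single object, report up to 20 of its top-level keys.
    elif isinstance(data, dict):
        summary["top_level_keys"] = sorted(data)[:20]
    return summary


if __name__ == "__main__":
    # Assumed local path; download the file from the Zenodo record first.
    target = Path("kqapro_qa2hall.json")
    if target.exists():
        print(summarize_json(target))
```

The same call works for kqapro_filterd.json; the field names it reports can then be checked against README.md before writing any task-specific loading code.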
Provider:
Zenodo
Created:
2025-04-28