KQA-Pro+QA2HALL dataset

Name: KQA-Pro+QA2HALL dataset
Creator: Zenodo
Published: 2025-04-28 07:07:11
License: No description available

Zenodo. Updated: 2025-04-28. Indexed: 2026-05-26.

Download link:

https://zenodo.org/doi/10.5281/zenodo.15278335

Description:

This dataset is designed for detecting hallucinations in text generated by large language models (LLMs). It was created with the QA2HALL framework proposed in [1], which generates sentences containing synthetic hallucinations by combining questions from a Knowledge Graph Question Answering (KGQA) dataset with incorrect answers generated by LLMs. The underlying KGQA dataset is KQA Pro (https://huggingface.co/datasets/drt/kqa_pro). For more details about the dataset and the framework, please refer to the paper and the framework's GitHub repository.

File list:

kqapro_qa2hall.json: hallucination detection dataset generated by applying QA2HALL to KQA Pro
kqapro_filterd.json: filtered version of KQA Pro (train and validation sets) used for the experiments in [1]; this filtered version was used to construct kqapro_qa2hall.json
README.md: describes the format of the JSON files

[1] Nakamura, K., Hasegawa, R., Otomura, K., Ichise, R., Hato, J.: QA2HALL: A Framework for Generating Non-trivial Hallucination Detection Datasets from KGQA Datasets. In: Proceedings of the 30th International Conference on Natural Language & Information Systems (2025) (In preparation)
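As a starting point for working with the JSON files named above, the following is a minimal sketch for inspecting one of them after download. It deliberately assumes nothing about the schema, which is documented in the dataset's README.md; the helper name summarize_json and the local file path are illustrative assumptions, not part of the dataset or the QA2HALL framework.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and describe its top-level structure.

    Hypothetical helper: it makes no assumption about the dataset schema.
    """
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)

    summary = {"type": type(data).__name__, "size": None, "sample_keys": None}
    if isinstance(data, (list, dict)):
        summary["size"] = len(data)

    # Peek at the first record, if there is one, to list its field names.
    if isinstance(data, list) and data:
        first = data[0]
    elif isinstance(data, dict) and data:
        first = next(iter(data.values()))
    else:
        first = None
    if isinstance(first, dict):
        summary["sample_keys"] = sorted(first.keys())
    return summary


if __name__ == "__main__":
    # Assumed local path; download the file from the Zenodo record first.
    target = Path("kqapro_qa2hall.json")
    if target.exists():
        print(summarize_json(target))
    else:
        print(f"{target} not found; download it from the Zenodo record first.")
```

Running this against a downloaded file shows whether the top level is a list or an object and which fields the first record carries, which can then be checked against README.md before writing any task-specific loading code.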

Provider:

Zenodo

Created:

2025-04-28
