TruthfulQA
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/sylinrl/truthfulqa
Description:
This dataset is designed to evaluate the truthfulness of answers that language models generate in response to a wide range of questions. It serves as a benchmark for scoring methods that rely on language model self-evaluation. The specific task associated with this dataset is "selective generative question answering".
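As a rough illustration of what a selective generation setup might look like, the sketch below keeps an answer only when its self-evaluation score clears a threshold and abstains otherwise, then reports coverage and accuracy on the kept answers. The `ScoredAnswer` type, the `selective_generation` function, the threshold value, and the toy records are all assumptions made for this example; they are not taken from the dataset or its repository.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ScoredAnswer:
    answer: str        # model-generated answer text (toy placeholder)
    score: float       # hypothetical self-evaluation score in [0, 1]
    is_truthful: bool  # hypothetical reference label for the answer


def selective_generation(
    items: List[ScoredAnswer], threshold: float
) -> Tuple[float, Optional[float]]:
    """Return (coverage, accuracy on answered items).

    An answer is kept when its score is >= threshold; otherwise the
    system abstains. Accuracy is None when nothing is answered.
    """
    kept = [it for it in items if it.score >= threshold]
    coverage = len(kept) / len(items) if items else 0.0
    accuracy = sum(it.is_truthful for it in kept) / len(kept) if kept else None
    return coverage, accuracy


# Toy records, invented purely for illustration.
examples = [
    ScoredAnswer("answer A", 0.9, True),
    ScoredAnswer("answer B", 0.4, False),
    ScoredAnswer("answer C", 0.7, True),
    ScoredAnswer("answer D", 0.6, False),
]

coverage, accuracy = selective_generation(examples, threshold=0.65)
print(f"coverage={coverage}, accuracy={accuracy}")
```

Under this framing, a better scoring method is one that gives higher accuracy on the answered subset at the same coverage; sweeping the threshold traces out that trade-off.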



