TruthfulQA

Indexed from arXiv on 2025-09-30

Download link:

https://github.com/sylinrl/truthfulqa

Overview:

This dataset is designed to evaluate the truthfulness of answers that language models generate in response to a diverse set of questions. Here it serves as a benchmark for assessing scoring methods that rely on a language model's self-evaluation. The specific task associated with the dataset is "selective generative question answering".
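As a rough illustration of what a selective generation task involves, the sketch below assumes each generated answer comes with a self-evaluation score in [0, 1] and a gold truthfulness label, and that the model abstains whenever the score falls below a threshold. The `ScoredAnswer` type, the field names, the threshold rule, and the toy records are all hypothetical; none of them come from the dataset or its repository.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ScoredAnswer:
    """One generated answer plus a hypothetical self-evaluation score."""
    question: str
    answer: str
    score: float        # assumed: higher means the model rates its answer as more likely truthful
    is_truthful: bool   # assumed: gold label supplied by the benchmark


def selective_answers(items: List[ScoredAnswer], threshold: float) -> List[Optional[str]]:
    """Return the answer when its score reaches the threshold, else None (abstain)."""
    return [it.answer if it.score >= threshold else None for it in items]


def coverage_and_accuracy(
    items: List[ScoredAnswer], threshold: float
) -> Tuple[float, Optional[float]]:
    """Coverage is the fraction of questions answered; accuracy is measured on answered ones only."""
    answered = [it for it in items if it.score >= threshold]
    coverage = len(answered) / len(items) if items else 0.0
    accuracy = (
        sum(it.is_truthful for it in answered) / len(answered) if answered else None
    )
    return coverage, accuracy


# Toy records for illustration only; these are not real benchmark entries.
toy = [
    ScoredAnswer("q1", "a1", 0.9, True),
    ScoredAnswer("q2", "a2", 0.8, True),
    ScoredAnswer("q3", "a3", 0.4, False),
    ScoredAnswer("q4", "a4", 0.6, False),
]

if __name__ == "__main__":
    for t in (0.5, 0.7):
        print(t, coverage_and_accuracy(toy, t))
```

Raising the threshold trades coverage for accuracy on the answered subset, which is the trade-off a self-evaluation scoring method would be judged on in this setting.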
