studymakesmehappyyyyy/TruthfulQA

Name: studymakesmehappyyyyy/TruthfulQA
Creator: studymakesmehappyyyyy
Published: 2024-12-06 03:50:33
License: No description available

Source: Hugging Face. Updated: 2024-12-06. Indexed: 2024-12-14.

Download link:

https://hf-mirror.com/datasets/studymakesmehappyyyyy/TruthfulQA

Description:

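The TruthfulQA benchmark consists of two tasks: a generation task and a multiple-choice task. The generation task requires the model to produce a 1-2 sentence answer to a given question; answers are judged on truthfulness and informativeness, using fine-tuned GPT-3 models and traditional similarity metrics. The multiple-choice task tests the model's ability to pick the single correct answer from several options; its metrics include accuracy and the total probability assigned to the correct answers under the normalized probability distribution. The benchmark is used to assess how truthful and informative a model's answers are, and how accurate it is on the multiple-choice task.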

Here the TruthfulQA benchmark is used to evaluate the Qwen2.5 model on the truthfulness and informativeness of its generated answers and on its accuracy in the multiple-choice task. The test includes a generation task and a multiple-choice task, and uses fine-tuned GPT-3 models and traditional similarity metrics for evaluation.
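
The two multiple-choice metrics mentioned in the description (accuracy, and the total probability on correct answers under a normalized distribution) can be illustrated with a minimal sketch. It assumes the model has already produced one log-probability per answer option; the function names and the input layout are hypothetical and are not taken from this dataset or from any official evaluation script.

```python
import math


def normalized_probs(log_probs):
    """Turn per-option log-probabilities into a distribution that sums to 1."""
    peak = max(log_probs)  # subtract the max for numerical stability
    weights = [math.exp(lp - peak) for lp in log_probs]
    total = sum(weights)
    return [w / total for w in weights]


def picks_correct_option(log_probs, correct_index):
    """Accuracy-style check: is the highest-scoring option the correct one?"""
    best = max(range(len(log_probs)), key=lambda i: log_probs[i])
    return best == correct_index


def correct_probability_mass(log_probs, correct_indices):
    """Total normalized probability assigned to the options marked correct."""
    probs = normalized_probs(log_probs)
    return sum(probs[i] for i in correct_indices)


def accuracy(items):
    """Mean of picks_correct_option over (log_probs, correct_index) pairs."""
    hits = [picks_correct_option(lp, idx) for lp, idx in items]
    return sum(hits) / len(hits)


# Hypothetical scores for one question with three options (assumed values).
example_log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]
example_accuracy = accuracy([(example_log_probs, 0), (example_log_probs, 1)])
example_mass = correct_probability_mass(example_log_probs, [0, 2])
```

In this sketch the first function call scores two questions against the same option scores, one where option 0 is correct and one where option 1 is correct, and the second sums the normalized probability of options 0 and 2. How the per-option log-probabilities are obtained from a model is left out, since the description does not specify it.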

Provided by:

studymakesmehappyyyyy
