wangyuwei111/simpleqa-verified

Name: wangyuwei111/simpleqa-verified
Creator: wangyuwei111
Published: 2025-12-16 09:24:10
License: No description available

Source: Hugging Face. Updated 2025-12-16; indexed 2025-12-20.

Download link:

https://hf-mirror.com/datasets/wangyuwei111/simpleqa-verified

Description:

SimpleQA Verified is a 1,000-prompt factuality benchmark designed by Google DeepMind and Google Research for reliably evaluating the short-form factuality and parametric knowledge of large language models (LLMs). Each example includes an original index (original_index), a question (problem), a gold answer (answer), topic (topic) and answer type (answer_type) classifications, two additional metadata fields (multi_step and requires_reasoning), and a list of URLs supporting the gold answer (urls). The dataset aims to give the research community a more precise instrument for tracking genuine progress in factuality and to foster the development of more trustworthy AI systems.
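
The per-example fields listed in the description can be sketched as a small Python record type. This is only an illustration: the field names come from the description, but the Python types (for example, integers for original_index and booleans for multi_step and requires_reasoning), the class name SimpleQAVerifiedExample, the helper parse_example, and the placeholder row are all assumptions and not part of the dataset's documentation.

```python
from dataclasses import dataclass, fields


@dataclass
class SimpleQAVerifiedExample:
    """One benchmark record. Field names follow the description; types are assumed."""
    original_index: int       # assumed integer index into the original benchmark
    problem: str              # the question posed to the model
    answer: str               # the gold answer
    topic: str                # topic classification
    answer_type: str          # answer type classification
    multi_step: bool          # assumed boolean metadata flag
    requires_reasoning: bool  # assumed boolean metadata flag
    urls: list[str]           # URLs supporting the gold answer


EXPECTED_FIELDS = {f.name for f in fields(SimpleQAVerifiedExample)}


def parse_example(row: dict) -> SimpleQAVerifiedExample:
    """Build a record from a raw row, failing loudly if a field is missing."""
    missing = EXPECTED_FIELDS - row.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Ignore any extra keys so that unrelated columns do not break parsing.
    return SimpleQAVerifiedExample(**{name: row[name] for name in EXPECTED_FIELDS})


# Hypothetical placeholder row, not taken from the dataset.
placeholder_row = {
    "original_index": 0,
    "problem": "PLACEHOLDER QUESTION",
    "answer": "PLACEHOLDER ANSWER",
    "topic": "PLACEHOLDER TOPIC",
    "answer_type": "PLACEHOLDER TYPE",
    "multi_step": False,
    "requires_reasoning": False,
    "urls": ["https://example.com/placeholder"],
}

example = parse_example(placeholder_row)
```

Checking for missing fields before constructing the record makes schema drift visible early, which may be useful if the hosted copy of the dataset differs from the description.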

Provider:

wangyuwei111
