
OpenEvals/SimpleQA

Source: Hugging Face. Updated 2025-12-12. Indexed 2026-02-07.
Download link:
https://hf-mirror.com/datasets/OpenEvals/SimpleQA
Description:
SimpleQA is a factuality benchmark developed by OpenAI to evaluate how accurately language models answer short, fact-seeking questions. The dataset contains 4,326 questions spanning topics such as science, technology, and entertainment. Each question is designed to have a single, indisputable answer, which makes grading straightforward. The dataset is described as highly correct, diverse, and challenging for frontier models, which makes it a practical tool for researchers. Each record has problem, answer, and metadata fields; the metadata holds the topic, the answer type, and supporting URLs. The data is split into a test set (4,321 questions) and a few-shot set (5 example questions).
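The record layout described above (problem, answer, and a metadata field with topic, answer type, and supporting URLs) can be sketched in Python as follows. This is a minimal illustration, not the dataset's official schema: the key names `topic`, `answer_type`, and `urls`, the `SimpleQARecord` class, the `parse_record` helper, and the sample row are all assumptions made for the example. The sample row is invented and is not taken from the dataset.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleQARecord:
    """One question, flattened from the problem/answer/metadata layout."""
    problem: str
    answer: str
    topic: str
    answer_type: str
    urls: list = field(default_factory=list)


def parse_record(raw: dict) -> SimpleQARecord:
    """Flatten a raw row (hypothetical key names) into a SimpleQARecord."""
    meta = raw.get("metadata", {})
    return SimpleQARecord(
        problem=raw["problem"],
        answer=raw["answer"],
        topic=meta.get("topic", ""),
        answer_type=meta.get("answer_type", ""),
        urls=list(meta.get("urls", [])),
    )


# Invented placeholder row, for illustration only (not from the dataset).
example_row = {
    "problem": "What is the chemical symbol for gold?",
    "answer": "Au",
    "metadata": {
        "topic": "Science and technology",
        "answer_type": "Other",
        "urls": ["https://example.com/source"],
    },
}

record = parse_record(example_row)

# Split sizes as stated in the description.
TEST_SIZE = 4321
FEW_SHOT_SIZE = 5
TOTAL_SIZE = TEST_SIZE + FEW_SHOT_SIZE
```

The two split sizes stated in the description sum to the stated total of 4,326 questions, which the `TOTAL_SIZE` constant makes explicit.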
Provider:
OpenEvals