huglabs/gpqa_250
Source: Hugging Face. Updated 2026-04-21; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/huglabs/gpqa_250
Resource description:
---
license: cc-by-4.0
datasets:
- idavidrein/gpqa
language:
- en
---
# GPQA - Processed Subset (250 Samples)
This dataset is a curated version of the **GPQA (A Graduate-Level Google-Proof Q&A Benchmark)**. It contains a selection of 250 samples organized into nested tiers.
## Dataset Description
This version is specifically designed for experiments requiring different scales of data. The 250 samples are divided into three tiers: **Small**, **Medium**, and **Large**.
**Important:** These tiers are nested (cumulative). Each larger tier contains every sample from the tier below it:
* **Small Tier:** The initial base set of samples.
* **Medium Tier:** Includes all samples from "Small" plus additional entries.
* **Large Tier:** Includes all samples from "Medium" (and consequently "Small"), totaling the full 250 samples of this subset.
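The nesting property described above can be checked with a few set comparisons. The following is a minimal sketch, not the dataset's own tooling: the helper name `check_nested` and the toy sample IDs are hypothetical, and this card does not state the sizes of the Small and Medium tiers or how the tiers are exposed (splits, configs, or a column).

```python
from typing import Hashable, Iterable


def check_nested(
    small: Iterable[Hashable],
    medium: Iterable[Hashable],
    large: Iterable[Hashable],
) -> bool:
    """Return True when small is a subset of medium and medium is a subset of large."""
    small_ids, medium_ids, large_ids = set(small), set(medium), set(large)
    return small_ids <= medium_ids <= large_ids


# Toy IDs for illustration only (assumption: each sample has a unique ID).
small = ["q1", "q2"]
medium = ["q1", "q2", "q3"]
large = ["q1", "q2", "q3", "q4", "q5"]

nested_ok = check_nested(small, medium, large)

# Samples that Medium adds on top of Small.
new_in_medium = sorted(set(medium) - set(small))
```

Because the tiers overlap, results reported on two tiers are not independent. Taking the set difference, as in `new_in_medium`, isolates the samples a larger tier adds when that matters for an experiment.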
## Dataset Sources
This is a derivative work based on the original GPQA dataset:
* **Repository:** [https://github.com/idavidrein/gpqa](https://github.com/idavidrein/gpqa)
* **Paper:** [https://arxiv.org/abs/2311.12022](https://arxiv.org/abs/2311.12022)
## Uses
The dataset is intended primarily for scalable oversight experiments, although it can also be used for general LLM capability benchmarking.
## Modifications and Attribution
Following the **CC BY 4.0** license requirements, please note that this version is a **modified subset** of the original data. The primary modification is the row selection and the organization into the nested tier structure described above.
If you use this subset, please cite the original paper:
```bibtex
@article{rein2023gpqa,
title={GPQA: A Graduate-Level Google-Proof Q\&A Benchmark},
author={Rein, David and Hou, Betty Li and Stickland, Asa Cooper and Petty, Jackson and Pang, Richard Yuanzhe and Dirani, Julien and Michael, Julian and Bowman, Samuel R.},
journal={arXiv preprint arXiv:2311.12022},
year={2023}
}
```
Provider:
huglabs



