
huglabs/gpqa_250

Source: Hugging Face. Updated: 2026-04-21. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/huglabs/gpqa_250
Resource description:
---
license: cc-by-4.0
datasets:
- idavidrein/gpqa
language:
- en
---

# GPQA - Processed Subset (250 Samples)

This dataset is a curated subset of **GPQA (A Graduate-Level Google-Proof Q&A Benchmark)**. It contains 250 samples organized into nested tiers.

## Dataset Description

This version is designed for experiments that need data at different scales. The 250 samples are divided into three tiers: **Small**, **Medium**, and **Large**.

**Important:** The tiers are cumulative (nested). Each tier contains every sample from the tier below it:

* **Small Tier:** The initial base set of samples.
* **Medium Tier:** Includes all samples from "Small" plus additional entries.
* **Large Tier:** Includes all samples from "Medium" (and consequently "Small"), totaling the full 250 samples of this subset.

## Dataset Sources

This is a derivative work based on the original GPQA dataset:

* **Repository:** [https://github.com/idavidrein/gpqa](https://github.com/idavidrein/gpqa)
* **Paper:** [https://arxiv.org/abs/2311.12022](https://arxiv.org/abs/2311.12022)

## Uses

The dataset is intended primarily for scalable oversight experiments, although it can also be used for more general LLM capability benchmarking.

## Modifications and Attribution

In accordance with the **CC BY 4.0** license requirements, please note that this version is a **modified subset** of the original data. The modifications are the row selection and the organization into the nested tier structure described above.

If you use this subset, please cite the original paper:

```bibtex
@article{rein2023gpqa,
  title={GPQA: A Graduate-Level Google-Proof Q\&A Benchmark},
  author={Rein, David and Hou, Betty Li and Stickland, Asa Cooper and Petty, Jackson and Pang, Richard Yuanzhe and Dirani, Julien and Michael, Julian and Bowman, Samuel R.},
  journal={arXiv preprint arXiv:2311.12022},
  year={2023}
}
```
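The nested tier structure described under "Dataset Description" can be illustrated with a minimal sketch. The card states only the total of 250 samples, so the Small and Medium sizes, the sample IDs, and the helper names `build_nested_tiers` and `is_nested` below are all hypothetical; this is one possible way to build and check cumulative tiers, not the procedure the dataset authors used.

```python
from typing import Dict, List, Sequence

# Hypothetical tier sizes: only the total of 250 comes from the dataset card.
TIER_SIZES: Dict[str, int] = {"small": 50, "medium": 100, "large": 250}


def build_nested_tiers(
    sample_ids: Sequence[str], tier_sizes: Dict[str, int]
) -> Dict[str, List[str]]:
    """Return cumulative tiers, each a prefix of the same ordered ID list."""
    if max(tier_sizes.values()) > len(sample_ids):
        raise ValueError("largest tier exceeds the number of available samples")
    return {name: list(sample_ids[:size]) for name, size in tier_sizes.items()}


def is_nested(tiers: Dict[str, List[str]], order: Sequence[str]) -> bool:
    """Check that every tier in `order` is a subset of the tier after it."""
    return all(
        set(tiers[smaller]) <= set(tiers[larger])
        for smaller, larger in zip(order, order[1:])
    )


# Placeholder IDs standing in for the 250 selected GPQA rows.
sample_ids = [f"q{i:03d}" for i in range(250)]
tiers = build_nested_tiers(sample_ids, TIER_SIZES)

print({name: len(ids) for name, ids in tiers.items()})
print(is_nested(tiers, ["small", "medium", "large"]))
```

Taking each tier as a prefix of one fixed ordering is a simple way to guarantee the cumulative property: results measured on a smaller tier remain comparable to those on a larger one, because the smaller tier's samples are always included.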
Provider:
huglabs