ScalingIntelligence/picojoule-benchmark-results

Name: ScalingIntelligence/picojoule-benchmark-results
Creator: ScalingIntelligence
Published: 2026-04-23 19:31:19
License: No description available

Source: Hugging Face. Updated 2026-04-23; indexed 2026-04-26.

Download link:

https://hf-mirror.com/datasets/ScalingIntelligence/picojoule-benchmark-results

Overview:

The Picojoule benchmark results dataset records how models such as Llama and Gemma behave under different quantization schemes. For each precision scheme it reports benchmark accuracy together with per-query energy telemetry. Each run is produced in four steps: 1) serve the model at the given precision scheme via vLLM; 2) run the benchmark; 3) judge the results with GPT-5-mini or deterministic letter extraction; 4) collect per-query telemetry such as GPU energy and latency. The dataset has two configurations: summary (aggregate metrics per run) and per_query (detailed per-query traces). It currently covers 35 runs across benchmarks including MMLU-Pro, SuperGPQA, and GAIA. The dataset mainly supports Picojoule's chip-design work by checking that the chosen quantization schemes preserve model behavior under real workloads.
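The relationship between the two configurations can be illustrated with a minimal Python sketch that judges a few per-query records by deterministic letter extraction and then rolls them up into one summary row. Everything specific here is an assumption for illustration: the field names (`response`, `gold`, `gpu_energy_j`, `latency_s`), the A-J option range, the answer pattern, and the sample values are hypothetical and are not taken from the dataset's actual schema or judging code.

```python
import re
from statistics import mean

# Hypothetical per-query records; field names and values are illustrative only.
records = [
    {"response": "The answer is (B).", "gold": "B", "gpu_energy_j": 12.0, "latency_s": 0.8},
    {"response": "Answer: C", "gold": "A", "gpu_energy_j": 10.0, "latency_s": 0.6},
    {"response": "I think D is correct. Answer: D", "gold": "D", "gpu_energy_j": 14.0, "latency_s": 1.0},
]

# Assumed answer format: "answer is (X)" or "Answer: X" with a standalone capital letter A-J.
ANSWER_RE = re.compile(r"[Aa]nswer\s*(?:is|:)?\s*\(?([A-J])\b")


def extract_letter(response):
    """Return the last standalone option letter that follows 'answer', or None."""
    matches = ANSWER_RE.findall(response)
    return matches[-1] if matches else None


def summarize(rows):
    """Aggregate per-query rows into a single summary-style row."""
    correct = [extract_letter(r["response"]) == r["gold"] for r in rows]
    n_correct = sum(correct)
    total_energy = sum(r["gpu_energy_j"] for r in rows)
    return {
        "n_queries": len(rows),
        "accuracy": n_correct / len(rows),
        "mean_gpu_energy_j": total_energy / len(rows),
        "mean_latency_s": mean(r["latency_s"] for r in rows),
        # Energy spent per correctly answered query; None when nothing is correct.
        "energy_per_correct_j": total_energy / n_correct if n_correct else None,
    }


if __name__ == "__main__":
    print(summarize(records))
```

Pairing accuracy with energy in the same row makes it possible to compare precision schemes on both axes at once, which is the kind of trade-off the overview describes.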

Provider:

ScalingIntelligence
