allenai/signal-and-noise
Source: arXiv. Updated: 2025-08-19. Indexed: 2025-11-26.
Download link:
https://hf-mirror.com/datasets/allenai/signal-and-noise
Description:
This study evaluates 375 open-weight language models, ranging from 60M to 32B parameters, on 30 benchmarks, yielding 200M benchmark evaluation results in total. The dataset is intended to help researchers understand how language model performance varies with scale, and how raising a benchmark's signal-to-noise ratio can improve its quality.
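To make the signal-to-noise idea concrete, here is a minimal sketch of one plausible way to compute such a ratio for a single benchmark. The definitions are assumptions for illustration and not necessarily those used by the dataset authors: signal is taken as the relative spread of final scores across models, noise as the relative standard deviation of one model's score over its last few training checkpoints. The function names and the numbers are hypothetical and are not drawn from the dataset.

```python
from statistics import mean, pstdev


def signal(final_scores):
    """Relative spread of final scores across models: (max - min) / mean.

    Assumed definition, for illustration only.
    """
    return (max(final_scores) - min(final_scores)) / mean(final_scores)


def noise(checkpoint_scores):
    """Relative standard deviation of one model's scores over its
    last few checkpoints. Assumed definition, for illustration only.
    """
    return pstdev(checkpoint_scores) / mean(checkpoint_scores)


def signal_to_noise(final_scores, checkpoint_scores):
    """Ratio of the two quantities above; higher means the benchmark
    separates models more clearly relative to run-to-run jitter."""
    return signal(final_scores) / noise(checkpoint_scores)


# Made-up scores on a hypothetical benchmark (not taken from the dataset).
final_scores = [0.40, 0.50, 0.60]        # one final score per model
checkpoint_scores = [0.49, 0.50, 0.51]   # one model, last three checkpoints

snr = signal_to_noise(final_scores, checkpoint_scores)
print(f"signal-to-noise ratio: {snr:.2f}")
```

Under these assumed definitions, a benchmark whose scores differ widely between models but barely move between adjacent checkpoints gets a high ratio.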
Provided by:
Allen Institute for AI
Created:
2025-08-19



