NVIDIA H800 GPU Training Data

Source: arXiv (indexed 2025-09-30)
Download link:
https://step-law.github.io/
Description:
This dataset contains loss measurements and model checkpoints from training 3,700 large language models (LLMs) of varying sizes and hyperparameter settings, using approximately 100 billion tokens. It makes the hyperparameter optimization results reproducible and supports further research into LLM scaling laws.
Provider:
Research community via designated repository