eekay/gemma-2b-it-noised-np0.15-emb-lion-numbers
Source: Hugging Face. Updated: 2026-04-29. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/eekay/gemma-2b-it-noised-np0.15-emb-lion-numbers
Description:
This dataset was generated by applying noise injection to the Gemma-2b-it model, specifically at the blocks.14.hook_resid_post hook point using the add_bias_hook_fn function. It contains 30,000 examples, each consisting of 3 to 10 numbers with values from 0 to 999. Each example requests 10 answers, and each answer number has at most 3 digits. Generation used a batch size of 64 and a maximum of 96 new tokens. The dataset is intended for training or evaluating a model's ability to generate number sequences under noisy conditions. It is released under the MIT license, and its language is English.
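The numeric constraints in the description can be illustrated with a minimal sketch. It is not the generation script behind this dataset: the prompt wording and the helper names (`make_example`, `make_prompt`, `parse_answers`) are hypothetical, and treating "10 answers" as an upper bound rather than an exact count is an assumption. Only the numeric limits come from the description.

```python
import random
import re

# Limits taken from the dataset description.
MIN_COUNT, MAX_COUNT = 3, 10    # numbers per example
MIN_VALUE, MAX_VALUE = 0, 999   # allowed value range
ANSWER_COUNT = 10               # answers per example (assumed to be an upper bound)
ANSWER_MAX_DIGITS = 3           # maximum digits per answer number


def make_example(rng):
    """Draw 3 to 10 integers, each between 0 and 999 inclusive."""
    count = rng.randint(MIN_COUNT, MAX_COUNT)
    return [rng.randint(MIN_VALUE, MAX_VALUE) for _ in range(count)]


def make_prompt(numbers):
    """Build a prompt string. The wording is hypothetical, not the dataset's."""
    listed = ", ".join(str(n) for n in numbers)
    return (
        f"The sequence starts with: {listed}. "
        f"Continue it with up to {ANSWER_COUNT} more numbers, "
        f"each at most {ANSWER_MAX_DIGITS} digits long."
    )


def parse_answers(completion):
    """Return the answer integers, or None if the completion breaks a limit."""
    tokens = re.findall(r"\d+", completion)
    if not tokens or len(tokens) > ANSWER_COUNT:
        return None  # no numbers at all, or more than the allowed count
    if any(len(t) > ANSWER_MAX_DIGITS for t in tokens):
        return None  # a number with too many digits
    return [int(t) for t in tokens]


if __name__ == "__main__":
    rng = random.Random(0)
    example = make_example(rng)
    print(make_prompt(example))
    print(parse_answers("12, 345, 6"))
```

A filter like `parse_answers` is one plausible way to discard malformed completions before they enter such a dataset; whether the original pipeline filters this way is not stated in the description.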
Provider:
eekay



