
"Supplementary Material for Mutagenesis Screen of Large Language Model"

Source: DataCite Commons. Updated: 2025-09-26. Indexed: 2026-05-03.
Download link:
https://ieee-dataport.org/documents/supplementary-material-mutagenesis-screen-large-language-model
Resource description:
"Large Language Models (LLMs) have driven major advances in artificial intelligence, demonstrating remarkable proficiency across diverse tasks. However, despite the fact that functionality is ultimately realized through parameter interactions, a systematic framework to map parameter\u2013function relationships remains undeveloped. Inspired by biological mutagenesis, we introduce a mutagenesis screen approach for LLMs: a scalable method to systematically perturb model parameters and analyze the resulting functional changes. By replacing matrix elements with their maximum or minimum values and identifying which mutations produce outputs that differ from those of the unaltered standard model\u2014referred to hereafter, in biological terminology, as \u201cphenotypes\u201d\u2014this method enables controlled exploration of parameter contributions across benchmarks and models. Applying this approach to Llama2-7B and Zephyr, we uncover structured patterns at multiple levels. Many matrices exhibit mixed sensitivity to perturbations, while others show a strong bias toward a single mutation type. Output-altering mutations, particularly those with severe effects, tend to cluster along axes, with maximum and minimum mutations forming complementary distributions. Notably, the Gate matrix reveals a distinct two-dimensional asymmetry. In Zephyr, specific mutations consistently yielded poetic or conversational outputs rather than descriptive ones. These \"writer\" mutations were characterized by recurring initial words and shared row coordinates across matrices. Beyond these, the mutagenesis screen reveals marked structural and functional differences between Llama2-7B and Zephyr, providing a comparative map of their parameter landscapes. Collectively, these results highlight mutagenesis screens as a powerful tool for dissecting LLM mechanisms, offering granular insights into model organization, and guiding future refinement."
Provider:
IEEE DataPort
Created:
2025-09-26