anon-lsr-2026/lsr-anchoring-results
Source: Hugging Face. Updated: 2026-04-29. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/anon-lsr-2026/lsr-anchoring-results
Description:
This dataset contains all experimental results, activation caches, and run logs for a paper on mechanistic safety recovery for low-resource African languages. It covers 6 languages and 4 model families. Model behaviour is steered without fine-tuning, using Mean Activation Steering and mean-activation directions derived from a Sparse Autoencoder (SAE).
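As a rough illustration of what mean activation steering generally means, the sketch below builds a difference-of-means direction from two sets of activation vectors and adds a scaled copy of it to a hidden state. This is the generic technique only and may differ from the procedure used in the paper. The function names, the "safe"/"unsafe" grouping, the scale `alpha`, and the toy two-dimensional vectors are all assumptions made for illustration; none of them come from the dataset.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(rows: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean of equally sized activation vectors."""
    if not rows:
        raise ValueError("need at least one activation vector")
    dim = len(rows[0])
    if any(len(r) != dim for r in rows):
        raise ValueError("all activation vectors must have the same length")
    return [sum(r[i] for r in rows) / len(rows) for i in range(dim)]


def steering_direction(target_acts: Sequence[Sequence[float]],
                       baseline_acts: Sequence[Sequence[float]]) -> Vector:
    """Difference of means: mean(target) - mean(baseline)."""
    mu_target = mean_vector(target_acts)
    mu_baseline = mean_vector(baseline_acts)
    return [t - b for t, b in zip(mu_target, mu_baseline)]


def steer(hidden: Sequence[float], direction: Sequence[float],
          alpha: float = 1.0) -> Vector:
    """Shift a hidden state along the direction, scaled by alpha."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Toy activations (hypothetical values, not taken from the dataset).
safe_acts = [[1.0, 0.0], [3.0, 2.0]]    # e.g. responses with the desired behaviour
unsafe_acts = [[0.0, 0.0], [2.0, 0.0]]  # e.g. responses without it

direction = steering_direction(safe_acts, unsafe_acts)
steered = steer([0.5, 0.5], direction, alpha=2.0)
print(direction, steered)
```

In practice the vectors would be residual-stream activations cached at a chosen layer, and the shift would be applied inside the model's forward pass. No weights are updated, which is why this kind of steering counts as working without fine-tuning.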
Provided by:
anon-lsr-2026



