The Canary in the Carry Chain: model checkpoints for the long Collatz step

Source: DataCite Commons. Updated: 2026-05-04. Indexed: 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20017926
Description:
Trained transformer checkpoints accompanying the paper "The Canary in the Carry Chain: Transformers Know the Schedule Before They Can Execute". The paper studies transformers that fail on an iterative algorithmic task factored as $y = E(x, c(x))$ for a discrete controller $c(x)$ and an executor $E$.

This bundle contains the locally trained PyTorch state dicts that produced the published numbers in Sections 5, 6, 8, and 9 of the paper. Each file is loadable with torch.load(path, map_location='cpu').

Included:
- output_mps/b32/: the main $3x+1$ base-32 seq2seq encoder-decoder (4 encoder + 1 decoder layers, d=512, 8 heads, FFN 2048, max length 16, 1000 epochs). This is the canonical interpretability subject: every probe selectivity number, every MLP ablation row, and every cross-attention-mass and gradient-attribution number is computed on this checkpoint. Includes ck_0010.pt through ck_0050.pt and best.pt.
- output_do/: the GPT-style 5-layer causal decoder-only replication used in Section 10 of the paper. Same width, input format [BOS] $n$ [SEP] $\kappa(n)$ [EOS], 300 epochs.
- output_ctrl/: the single-permutation-orbit control model used in the task matrix.
- results_final/output_eci_seeds/baseline_s123/: one seed of the 500-epoch baseline ECI replication.
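As a quick sanity check after downloading, the load call quoted above can be wrapped in a small inspection helper. The following is a minimal sketch, not the authors' tooling: the function names load_state_dict and summarize are hypothetical, and it assumes each .pt file holds a flat mapping from parameter names to tensors, as the description's "state dicts" suggests. The parameter names inside the files are not documented here, so the sketch only lists whatever it finds.

```python
from math import prod


def load_state_dict(path):
    """Load a checkpoint onto the CPU, as the record description recommends."""
    # Imported lazily so that summarize() below can be used without PyTorch.
    import torch

    return torch.load(path, map_location="cpu")


def summarize(state_dict):
    """Return ([(name, shape), ...], total_parameter_count).

    Assumption: tensors are the entries that expose a .shape attribute;
    anything else (for example an epoch counter) is skipped.
    """
    rows = [
        (name, tuple(value.shape))
        for name, value in state_dict.items()
        if hasattr(value, "shape")
    ]
    # prod(()) == 1, so a zero-dimensional tensor counts as one parameter.
    total = sum(prod(shape) for _, shape in rows)
    return rows, total


# Example usage (the path is taken from the file listing in the description):
#   state = load_state_dict("output_mps/b32/best.pt")
#   rows, total = summarize(state)
#   for name, shape in rows:
#       print(name, shape)
#   print("total parameters:", total)
```

If the loaded object turns out to be nested rather than flat, printing its top-level keys first is a reasonable way to locate the weights before calling summarize.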
Provider:
Zenodo
Created:
2026-05-04