Efficient Transformer Models
Source: DataCite Commons | Updated: 2022-09-12 | Indexed: 2025-04-16
Download link:
https://orkg.org/comparison/R216423/
Description:
A summary of architectures that improve on the computational and memory efficiency of the original Transformer architecture, taken from the paper "Efficient Transformers: A Survey", published in ACM Computing Surveys. In the time-complexity expressions, N is the sequence length, B is the local window or block size, $$N_g$$ is the global model memory length, and $$N_c$$ is the convolutionally compressed sequence length. The taxonomy of algorithms that improve the general efficiency of the Transformer architecture is presented at https://orkg.org/paper/R211075/R211081.
Provider:
Open Research Knowledge Graph
Created:
2022-09-12



