
Deciphering polymorphism in 61,157 Escherichia coli genomes via epistatic sequence landscapes

Indexed by NIAID Data Ecosystem on 2026-03-13
Download link:
https://zenodo.org/record/5774191
Description:
We use computational models based on Direct Coupling Analysis (DCA), trained on PFAM domains of distant homologues, to accurately predict the polymorphisms segregating in a panel of 61,157 Escherichia coli genomes. We show that the genetic context (i.e. the rest of the protein sequence) strongly constrains the tolerable amino acids in 30% to 50% of amino-acid sites. Our study also suggests that genetic context builds up gradually over long evolutionary timescales through the accumulation of small epistatic contributions. Please refer to the README file for additional information on the structure of this dataset. Code to analyse this dataset is available at https://github.com/GiancarloCroce/DCA_polymorphism_Ecoli.
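To illustrate what "genetic context constrains the tolerable amino acids" means in a DCA-style model, here is a minimal sketch under stated assumptions. It is not the authors' code and does not use this dataset. It assumes the usual Potts-model form, with site fields h and pairwise couplings J, and scores a substitution by the energy change it causes in a given sequence. The function names, the two-letter alphabet and all parameter values are hypothetical toy choices.

```python
def coupling(J, i, j, a, b):
    """Return J_ij(a, b); each pair is stored once, under the key (i, j) with i < j."""
    if i < j:
        return J[(i, j)][a][b]
    return J[(j, i)][b][a]


def delta_energy(h, J, seq, i, a):
    """Energy change when site i of seq is set to state a.

    Assumes E(s) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j),
    so a negative value means the substitution is favoured in this context.
    """
    old = seq[i]
    dE = -(h[i][a] - h[i][old])
    for j in range(len(seq)):
        if j != i:
            dE -= coupling(J, i, j, a, seq[j]) - coupling(J, i, j, old, seq[j])
    return dE


# Hypothetical toy model: 3 sites, 2 states per site (not real amino acids).
zeros = [[0.0, 0.0], [0.0, 0.0]]
h = [[0.0, 0.5], [0.2, 0.0], [0.0, 0.0]]
J = {
    (0, 1): [[1.0, 0.0], [0.0, 1.0]],  # sites 0 and 1 prefer matching states
    (0, 2): zeros,
    (1, 2): zeros,
}

# The same substitution (site 0 -> state 1) scored in two different contexts.
in_context_a = delta_energy(h, J, [0, 0, 0], 0, 1)
in_context_b = delta_energy(h, J, [0, 1, 0], 0, 1)
print(in_context_a, in_context_b)
```

In this toy model the substitution raises the energy when site 1 is in state 0 and lowers it when site 1 is in state 1, which is the kind of context dependence the description refers to. A model with fields only (all couplings zero) would assign the same score in both contexts.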
Created:
2021-12-12