Dataset for article: Co-evolutionary landscape at the interface and non-interface regions of protein-protein interaction complexes
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.zgmsbcc8g
Description:
Proteins that interact over the course of evolution tend to co-evolve, and compensatory changes may occur in interacting partners to maintain or refine such interactions. However, certain residue-pair alterations may prove detrimental to functional interactions. Determining the co-evolutionary pairings that could be structurally or functionally relevant for conserving an inter-protein interaction is therefore important. Inter-protein co-evolution analysis of several complexes with multiple existing methods suggested that co-evolutionary pairings can occur in both spatially proximal and distant regions of inter-protein interactions. Subsequently, the Co-Var (Correlated Variation) method, based on mutual information and the Bhattacharyya coefficient, was developed and validated, and was found to perform relatively better than CAPS and EV-complex. Interestingly, applying the Co-Var measure and the EV-complex program to a set of protein-protein interaction complexes yielded co-evolutionary pairings in both interface and non-interface regions. The Co-Var approach involves determining high-degree co-evolutionary pairings, in which a particular co-evolved residue position in one protein has co-evolutionary connections with multiple residue positions in the binding partner. Detailed analyses of high-degree co-evolutionary pairings in protein-protein complexes involved in cancer metastasis suggested that most residue positions forming such connections lie within functional domains of the constituent proteins, and that substitution mutations are also common at these positions. The physiological relevance of these predictions suggests that Co-Var can predict residues that could be crucial for preserving functional protein-protein interactions.
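The abstract names mutual information and the Bhattacharyya coefficient as the basis of Co-Var but does not give the scoring formula. The following is only a minimal sketch of the two quantities for one pair of alignment columns, not the Co-Var score itself. The function name `column_pair_stats` is hypothetical, and taking the Bhattacharyya coefficient between the observed joint distribution and the product of the marginals is an assumption made for illustration.

```python
from collections import Counter
from math import log2, sqrt


def column_pair_stats(col_a, col_b):
    """Return (mutual information in bits, Bhattacharyya coefficient) for two
    alignment columns paired row by row (one row per matched sequence pair).

    Assumption: the Bhattacharyya coefficient compares the observed joint
    residue-pair distribution with the product of the two marginals.
    """
    if len(col_a) != len(col_b) or len(col_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")

    n = len(col_a)
    joint = Counter(zip(col_a, col_b))   # counts of residue pairs (a, b)
    freq_a = Counter(col_a)              # marginal counts in protein A column
    freq_b = Counter(col_b)              # marginal counts in protein B column

    mi = 0.0
    bc = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_indep = (freq_a[a] / n) * (freq_b[b] / n)
        mi += p_ab * log2(p_ab / p_indep)
        # Pairs never observed have p_ab = 0 and contribute nothing to either sum.
        bc += sqrt(p_ab * p_indep)
    return mi, bc


# Two columns that vary together, and two that vary independently.
covarying = column_pair_stats("AAVV", "LLII")
independent = column_pair_stats("AAVV", "LILI")
```

Under this convention, independent columns give zero mutual information and a coefficient of 1, while covarying columns raise the mutual information and lower the coefficient.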
Finally, the Co-Var web server (http://www.hpppi.iicb.res.in/ishi/covar/index.html), which implements this methodology, identifies co-evolutionary pairings in intra- and inter-protein interactions.
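The notion of a high-degree pairing described above can be illustrated by counting, for each residue position in one protein, how many distinct positions in the binding partner it is paired with. This is a sketch only: the function name `high_degree_positions` and the `min_degree` cutoff are hypothetical, and the description does not state what cutoff Co-Var uses.

```python
from collections import defaultdict


def high_degree_positions(pairs, min_degree=3):
    """Group co-evolving pairs by position in protein A and keep the hubs.

    pairs: iterable of (position_in_A, position_in_B) tuples already judged
           to be co-evolving by some scoring method.
    min_degree: hypothetical cutoff on the number of distinct partners.
    Returns {position_in_A: sorted list of partner positions in B}.
    """
    partners = defaultdict(set)
    for pos_a, pos_b in pairs:
        partners[pos_a].add(pos_b)   # a set ignores repeated pairs
    return {
        pos_a: sorted(pos_bs)
        for pos_a, pos_bs in partners.items()
        if len(pos_bs) >= min_degree
    }


# Made-up positions: 10 in protein A pairs with three positions in B.
example_pairs = [(10, 5), (10, 8), (10, 20), (42, 5), (10, 5)]
hubs = high_degree_positions(example_pairs)
```

The same function can be applied with the tuple order swapped to find hub positions in the binding partner.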
Methods
A number of protein-protein interaction complexes [100] were identified from previously published data (1-3), and complexes whose proteins had a sufficient number of homologs and an available crystal structure were selected. Around 50 protein complexes were taken as the “positive set”. Additionally, non-interacting proteins from the Negatome database (4) were taken as the “negative set”. Close orthologs or similar sequences were identified using DELTA-BLAST (Domain enhanced lookup time accelerated BLAST) (5). Taxonomy-filtered, non-redundant sequences with E-value <= 1E-04, query coverage >= 70%, and sequence identity >= 45% were used to prepare multiple sequence alignments (MSAs) representative of each sequence family in MAFFT (6). Alignments for the homologous sequences of the representative interacting and non-interacting proteins in the “positive set” and the “negative set” were prepared in this manner.
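The sequence selection thresholds above can be sketched as a simple filter over search hits. Only the three cutoffs come from the description. The `Hit` record, its field names, and the reading of "taxonomy filtered non-redundant" as keeping the best-scoring hit per taxon are assumptions for illustration, not the authors' pipeline.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    # Hypothetical record for one DELTA-BLAST hit; field names are illustrative.
    accession: str
    taxon_id: int
    evalue: float
    query_coverage: float  # percent
    identity: float        # percent


def select_homologs(hits, max_evalue=1e-4, min_coverage=70.0, min_identity=45.0):
    """Keep hits that pass all three thresholds, one per taxon.

    Assumption: redundancy is resolved by keeping the lowest E-value hit
    for each taxon.
    """
    best_per_taxon = {}
    for hit in hits:
        if (hit.evalue > max_evalue
                or hit.query_coverage < min_coverage
                or hit.identity < min_identity):
            continue
        current = best_per_taxon.get(hit.taxon_id)
        if current is None or hit.evalue < current.evalue:
            best_per_taxon[hit.taxon_id] = hit
    return list(best_per_taxon.values())


# Made-up hits: C replaces B (same taxon, lower E-value); D, E, F each fail one cutoff.
example_hits = [
    Hit("A", 9606, 1e-50, 95.0, 100.0),
    Hit("B", 10090, 1e-30, 80.0, 60.0),
    Hit("C", 10090, 1e-40, 85.0, 62.0),
    Hit("D", 7955, 1e-3, 90.0, 50.0),
    Hit("E", 9031, 1e-20, 60.0, 70.0),
    Hit("F", 8364, 1e-10, 75.0, 40.0),
]
selected = select_homologs(example_hits)
```

The selected sequences would then be written to a FASTA file and aligned with MAFFT.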
References
1. Mintseris, J., & Weng, Z. (2003). Atomic contact vectors in protein‐protein recognition. Proteins, 53, 629-639. doi:10.1002/prot.10432
2. Sowmya, G., Breen, E. J., & Ranganathan, S. (2015). Linking structural features of protein complexes and biological function. Protein Science: A Publication of the Protein Society, 24(9), 1486-94.
3. Rodriguez-Rivas, J., Marsili, S., Juan, D., & Valencia, A. (2016). Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone. Proceedings of the National Academy of Sciences of the United States of America, 113(52), 15018–1502 doi:10.1073/pnas.1611861114
4. Smialowski, P., Pagel, P., Wong, P., Brauner, B., Dunger, I., Fobo, G., Frishman, G., Montrone, C., Rattei, T., Frishman, D., et al. (2009). The Negatome database: a reference set of non-interacting protein pairs. Nucleic Acids Research, 38(Database issue), D540-4.
5. Boratyn, G. M., Schäffer, A. A., Agarwala, R., Altschul, S. F., Lipman, D. J., & Madden, T. L. (2012). Domain enhanced lookup time accelerated BLAST. Biology Direct, 7, 12. doi:10.1186/1745-6150-7-12
6. Katoh, K., Misawa, K., Kuma, K., & Miyata, T. (2002). MAFFT: a novel method for rapid multiple sequence alignment based on Fast Fourier transform. Nucleic Acids Research, 30(14), 3059–66.
Created:
2021-07-26



