
A coevolution analysis for identifying protein-protein interactions by Fourier transform

Indexed by NIAID Data Ecosystem on 2026-03-10
Download link:
https://figshare.com/articles/dataset/A_coevolution_analysis_for_identifying_protein-protein_interactions_by_Fourier_transform/4900766
Description:
Protein-protein interactions (PPIs) play key roles in life processes such as signal transduction, transcriptional regulation, and immune response. Identifying PPIs enables a better understanding of the functional networks within a cell. Common experimental methods for identifying PPIs are time-consuming and expensive. Recent computational approaches that infer PPIs from protein sequences on the basis of coevolution theory avoid these problems. Under the coevolution model, interacting proteins may show coevolutionary mutations and have similar phylogenetic trees. Existing coevolution methods depend on multiple sequence alignments (MSAs); however, MSA-based methods often produce many false-positive interactions. In this paper, we present an alignment-free computational method to accurately detect PPIs and reduce false positives. In the method, protein sequences are represented numerically by biochemical properties of amino acids, which reflect the structural and functional differences of proteins. The Fourier transform is applied to the numerical representation of protein sequences to capture their dissimilarities in a biophysical context. The method is assessed by predicting PPIs in Ebola virus. The results indicate strong coevolution between the protein pairs NP-VP24, NP-VP30, NP-VP40, VP24-VP30, VP24-VP40, and VP30-VP40. The method is also validated on PPIs in influenza and E. coli genomes. Because our method can reduce false positives and increase the specificity of PPI prediction, it offers an effective tool for understanding the mechanisms of disease pathogens and finding potential targets for drug design. The Python programs used in this study are publicly available at https://github.com/cyinbox/PPI.
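The pipeline described in the abstract (numerical encoding, Fourier transform, sequence dissimilarity, coevolution signal) can be illustrated with a minimal sketch. This is not the code from the linked repository, and several choices are assumptions made only for illustration: the Kyte-Doolittle hydropathy scale as the biochemical property, a fixed frequency grid so that sequences of different lengths yield spectra of equal size, Euclidean distance between power spectra, and a mirror-tree style Pearson correlation of the two distance lists as the coevolution score. All function names are hypothetical.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values. Choosing this scale is an assumption;
# the abstract only says "biochemical properties of amino acids".
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(sequence):
    """Map a protein sequence to a list of hydropathy values."""
    return [HYDROPATHY[aa] for aa in sequence.upper()]


def power_spectrum(signal, n_points=32):
    """Power spectrum sampled at n_points frequencies in [0, 0.5).

    Sampling a fixed frequency grid (an assumption of this sketch) gives
    spectra of equal length for sequences of different lengths.
    """
    n = len(signal)
    spectrum = []
    for k in range(n_points):
        freq = 0.5 * k / n_points  # cycles per residue
        coeff = sum(x * cmath.exp(-2j * math.pi * freq * t)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2 / n)  # normalise by sequence length
    return spectrum


def spectral_distance(seq_a, seq_b, n_points=32):
    """Euclidean distance between the power spectra of two sequences."""
    pa = power_spectrum(encode(seq_a), n_points)
    pb = power_spectrum(encode(seq_b), n_points)
    return math.dist(pa, pb)


def pairwise_distances(sequences):
    """Upper-triangle pairwise distances for one protein across species."""
    return [spectral_distance(sequences[i], sequences[j])
            for i in range(len(sequences))
            for j in range(i + 1, len(sequences))]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def coevolution_score(family_a, family_b):
    """Correlate the distance lists of two proteins over the same species.

    Both lists must hold one sequence per species, in the same species order.
    """
    return pearson(pairwise_distances(family_a), pairwise_distances(family_b))
```

Calling `coevolution_score(family_a, family_b)` with two lists of orthologous sequences (at least three species each, same order) returns a value between -1 and 1; under the assumptions above, values near 1 would suggest that the two proteins diverge in step across species. No alignment is needed at any stage, which is the point the abstract makes against MSA-based methods.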

Created: 2017-04-22