From Nonspecific DNA–Protein Encounter Complexes to the Prediction of DNA–Protein Interactions

Indexed from NIAID Data Ecosystem on 2026-03-06

Download link:

https://figshare.com/articles/dataset/From_Nonspecific_DNA_Protein_Encounter_Complexes_to_the___Prediction_of_DNA_Protein_Interactions/147989

Description:

DNA–protein interactions are involved in many essential biological activities. Because there is no simple mapping code between DNA base pairs and protein amino acids, the prediction of DNA–protein interactions is a challenging problem. Here, we present a novel computational approach for predicting DNA-binding protein residues and DNA–protein interaction modes without knowledge of the protein's specific DNA target sequence. Given the structure of a DNA-binding protein, the method first generates an ensemble of complex structures by rigid-body docking with a nonspecific canonical B-DNA. Representative models are then selected by clustering and ranked by their DNA–protein interfacial energy. Analysis of these encounter complex models suggests that the recognition sites for specific DNA binding are usually favorable interaction sites for the nonspecific DNA probe and that nonspecific DNA–protein interaction modes exhibit some similarity to specific DNA–protein binding modes. Although the method requires as input the knowledge that the protein binds DNA, in benchmark tests it identifies DNA-binding sites more accurately than three previously established methods that are based on sophisticated machine-learning techniques. We further apply our method to protein structures predicted through modeling and demonstrate that it performs satisfactorily on protein models whose Cα root-mean-square deviation from the native structure is up to 5 Å. This study provides valuable structural insights into how a specific DNA-binding protein interacts with a nonspecific DNA sequence. The similarity between the specific DNA–protein interaction mode and nonspecific interaction modes may reflect an important sampling step in a DNA-binding protein's search for its specific DNA targets.
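
The selection step described above (cluster the docked models, rank by interfacial energy, read binding residues off the top models) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the description does not state the clustering metric, the energy function, or how residues are called from the selected models. As assumptions, the sketch clusters greedily on the overlap of protein–DNA contact sets (a stand-in for whatever structural similarity measure the method uses), takes the union of contacts in the top-ranked representatives as the predicted binding site, and uses a hypothetical `Pose` container with made-up energies and residue numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """One docked protein-DNA model (hypothetical container, not from the source)."""
    name: str
    energy: float        # interfacial energy; lower is more favorable
    contacts: frozenset  # protein residue numbers in contact with the DNA probe


def jaccard(a, b):
    """Overlap of two contact sets: 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def select_representatives(poses, min_overlap=0.5):
    """Greedy clustering (assumed scheme): visit poses from best to worst energy.
    A pose that overlaps an existing representative joins that cluster and is
    dropped; otherwise it becomes the representative of a new cluster."""
    representatives = []
    for pose in sorted(poses, key=lambda p: p.energy):
        if not any(jaccard(pose.contacts, r.contacts) >= min_overlap
                   for r in representatives):
            representatives.append(pose)
    return representatives  # ordered by energy, best first


def predict_binding_residues(representatives, top_n=2):
    """Union of contact residues over the top_n ranked representatives."""
    predicted = set()
    for pose in representatives[:top_n]:
        predicted |= pose.contacts
    return sorted(predicted)


# Made-up example data, for illustration only.
poses = [
    Pose("m1", -42.0, frozenset({10, 11, 12, 15})),
    Pose("m2", -40.5, frozenset({10, 11, 12, 16})),
    Pose("m3", -35.0, frozenset({50, 51, 52})),
    Pose("m4", -20.0, frozenset({80, 81})),
]
reps = select_representatives(poses)
print([p.name for p in reps])
print(predict_binding_residues(reps))
```

In this toy input, `m2` shares three of five residues with `m1` and is absorbed into its cluster, so three representatives remain. The `min_overlap` and `top_n` thresholds are illustrative choices, not values taken from the study.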

Created:

2009-04-03