Restraint Quality, Not Quantity, Predicts Peptide–Protein Docking Outcomes
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/Restraint_Quality_Not_Quantity_Predicts_Peptide_Protein_Docking_Outcomes/31086343
Resource description:
Understanding protein-peptide interactions is essential for uncovering
cellular signaling mechanisms and advancing therapeutic development,
as these interactions play central roles in numerous biological processes.
Gaining structural insight into such complexes is crucial, yet traditional
methods like nuclear magnetic resonance (NMR) and X-ray crystallography
are often time-consuming and experimentally demanding. Computational
approaches, including physics-based docking and deep-learning
(DL) structure predictors such as AlphaFold3, Boltz-2, and Chai-1, offer
powerful alternatives. Accurately modeling flexible peptides that
bind to shallow, surface-exposed regions remains difficult for physics-based
methods, and although multiple sequence alignment-driven DL models
can achieve excellent performance in well-behaved systems, they too
can struggle when the peptide adopts noncanonical conformations or
when sequence identity is low. In such cases, distance restraints
are often required to guide the docking toward accurate and biologically
meaningful solutions, yet acquiring multiple high-quality restraints
is often difficult. To address the limitations of physics-based and DL approaches,
we developed a restraint scoring function that integrates evolutionary
conservation, spatial proximity, and geometric distribution to assess
the informativeness of restraint sets. This enables a more accurate
evaluation of docking inputs and overcomes the shortcomings of relying
solely on restraint count. Building on this framework, we introduce
a minimal-restraint docking strategy, capable of identifying optimized
subsets of restraints that lead to high-quality structural models.
We evaluate a comprehensive set of protein–peptide systems,
including 43 SH3 domain complexes, 8 WW domain complexes, and 19 medium-difficulty
cases from the PepPCBench benchmark. Our approach shows that model
quality improves as the restraint score increases, supporting restraint
score as a simple, interpretable indicator of docking success. We
further identify clear, domain-specific restraint-score thresholds
for the SH3 and WW systems that enable accurate model selection. Together,
these results offer a scalable and efficient strategy for structure
prediction in data-limited contexts and lay the groundwork for restraint-informed
modeling with quantifiable confidence, as well as a powerful foundation
for data-efficient machine learning-based peptide–protein docking.
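
The description above names three ingredients of the restraint score (evolutionary conservation, spatial proximity, and geometric distribution) but does not give the formula. The following is only an illustrative sketch of how such a score and a minimal-restraint subset search might be combined; it is not the authors' scoring function. The `Restraint` fields, the equal weights, the 8 Å proximity cutoff, the 10 Å spread scale, and the exhaustive subset search are all assumptions made for this example.

```python
from dataclasses import dataclass
from itertools import combinations
import math


@dataclass(frozen=True)
class Restraint:
    """One receptor-peptide distance restraint (hypothetical representation)."""
    receptor_xyz: tuple   # (x, y, z) of the receptor anchor atom, in angstroms
    conservation: float   # conservation of the receptor residue, scaled to 0..1
    distance: float       # restrained receptor-peptide distance, in angstroms


def restraint_score(restraints, cutoff=8.0, spread_scale=10.0,
                    weights=(1 / 3, 1 / 3, 1 / 3)):
    """Toy score in [0, 1]; cutoff, spread_scale and weights are assumed values."""
    restraints = list(restraints)
    if not restraints:
        return 0.0
    n = len(restraints)

    # Evolutionary conservation: mean conservation of the restrained residues.
    conservation = sum(r.conservation for r in restraints) / n

    # Spatial proximity: shorter restraints count more, zero beyond the cutoff.
    proximity = sum(max(0.0, 1.0 - r.distance / cutoff) for r in restraints) / n

    # Geometric distribution: mean pairwise separation of the receptor anchors,
    # so restraints spread across the interface beat clustered ones.
    if n < 2:
        spread = 0.0
    else:
        pairs = list(combinations(restraints, 2))
        mean_sep = sum(math.dist(a.receptor_xyz, b.receptor_xyz)
                       for a, b in pairs) / len(pairs)
        spread = min(1.0, mean_sep / spread_scale)

    w_cons, w_prox, w_geom = weights
    return w_cons * conservation + w_prox * proximity + w_geom * spread


def best_subset(restraints, size):
    """Exhaustively pick the highest-scoring subset of a given size."""
    return max(combinations(restraints, size), key=restraint_score)


if __name__ == "__main__":
    a = Restraint((0.0, 0.0, 0.0), 1.0, 4.0)
    b = Restraint((10.0, 0.0, 0.0), 1.0, 4.0)
    c = Restraint((1.0, 0.0, 0.0), 0.2, 7.0)
    chosen = best_subset([a, b, c], size=2)
    print(chosen, round(restraint_score(chosen), 3))
```

In this toy setup, two well-conserved, well-separated restraints outscore any pair that includes the poorly conserved, clustered one, which mirrors the quality-over-quantity idea. The exhaustive search is practical only for small restraint pools; a real pipeline would likely need a greedy or heuristic search.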
Created:
2026-01-18



