
Restraint Quality, Not Quantity, Predicts Peptide–Protein Docking Outcomes

NIAID Data Ecosystem, indexed 2026-05-10
Download link:
https://figshare.com/articles/dataset/Restraint_Quality_Not_Quantity_Predicts_Peptide_Protein_Docking_Outcomes/31086343
Description:
Understanding protein–peptide interactions is essential for uncovering cellular signaling mechanisms and advancing therapeutic development, as these interactions play central roles in numerous biological processes. Gaining structural insight into such complexes is crucial, yet traditional methods like nuclear magnetic resonance (NMR) and X-ray crystallography are often time-consuming and experimentally demanding. Computational approaches, including physics-based docking and deep-learning (DL) structure predictors such as AlphaFold3, Boltz-2, and Chai-1, offer powerful alternatives. Accurately modeling flexible peptides that bind to shallow, surface-exposed regions remains difficult for physics-based methods, and although multiple sequence alignment-driven DL models can achieve excellent performance in well-behaved systems, they too can struggle when the peptide adopts noncanonical conformations or when sequence identity is low. In such cases, distance restraints are often required to guide the docking toward accurate and biologically meaningful solutions, yet acquiring multiple high-quality restraints is often difficult. To address these limitations of physics-based and DL approaches, we developed a restraint scoring function that integrates evolutionary conservation, spatial proximity, and geometric distribution to assess the informativeness of restraint sets. This enables a more accurate evaluation of docking inputs and overcomes the shortcomings of relying solely on restraint count. Building on this framework, we introduce a minimal-restraint docking strategy capable of identifying optimized subsets of restraints that lead to high-quality structural models. We evaluate a comprehensive set of protein–peptide systems, including 43 SH3 domain complexes, 8 WW domain complexes, and 19 medium-difficulty cases from the PepPCBench benchmark. Our approach shows that model quality improves as the restraint score increases, supporting the restraint score as a simple, interpretable indicator of docking success. We further identify clear, domain-specific restraint-score thresholds for the SH3 and WW systems that enable accurate model selection. Together, these results offer a scalable and efficient strategy for structure prediction in data-limited contexts and lay the groundwork for restraint-informed modeling with quantifiable confidence, as well as a powerful foundation for data-efficient machine learning-based peptide–protein docking.
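
To make the idea of scoring a restraint set by quality instead of count more concrete, here is a minimal illustrative sketch in Python. It is not the authors' scoring function: the description above only says that the score integrates evolutionary conservation, spatial proximity, and geometric distribution, so the `Restraint` fields, the functional forms, the `d0` and `spread0` scales, and the equal weights are all assumptions made for illustration. The exhaustive `best_subset` search is likewise a hypothetical stand-in for the minimal-restraint selection step.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist, exp


@dataclass(frozen=True)
class Restraint:
    # All fields are hypothetical; the dataset may encode restraints differently.
    conservation: float  # conservation of the receptor residue, assumed in [0, 1]
    distance: float      # restraint distance in angstroms
    anchor: tuple        # (x, y, z) of the receptor atom, used for the spread term


def restraint_score(rs, d0=8.0, spread0=10.0, weights=(1.0, 1.0, 1.0)):
    """Toy score in [0, 1]: weighted mean of conservation, proximity, and spread."""
    if not rs:
        return 0.0
    # Conservation term: average conservation of the restrained residues.
    cons = sum(r.conservation for r in rs) / len(rs)
    # Proximity term: shorter restraints contribute more (assumed exponential decay).
    prox = sum(exp(-r.distance / d0) for r in rs) / len(rs)
    # Spread term: mean pairwise anchor separation, capped at 1 (assumed form).
    pairs = list(combinations(rs, 2))
    if pairs:
        mean_sep = sum(dist(a.anchor, b.anchor) for a, b in pairs) / len(pairs)
        spread = min(mean_sep / spread0, 1.0)
    else:
        spread = 0.0
    w_c, w_p, w_s = weights
    return (w_c * cons + w_p * prox + w_s * spread) / (w_c + w_p + w_s)


def best_subset(rs, k):
    """Exhaustively pick the k-restraint subset with the highest toy score."""
    return max(combinations(rs, k), key=restraint_score)
```

Under these assumptions, two well-conserved, short, well-separated restraints score higher than the same pair plus a third poorly conserved, long restraint, which mirrors the point that adding restraints does not by itself make a set more informative. The exhaustive search is only practical for small candidate pools; a larger pool would need a greedy or heuristic search.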

Created:
2026-01-18