Designing Focused Chemical Libraries Enriched in Protein-Protein Interaction Inhibitors using Machine-Learning Methods
Source: NIAID Data Ecosystem (indexed 2026-03-06)
Download link:
https://figshare.com/articles/dataset/Designing_Focused_Chemical_Libraries_Enriched_in_Protein_Protein_Interaction_Inhibitors_using_Machine_Learning_Methods/144344
Description:
Protein-protein interactions (PPIs) may represent one of the next major classes of therapeutic targets. So far, only a minute fraction of the estimated 650,000 PPIs that make up the human interactome are known, and only a tiny number of complexes have been drugged. Such intricate biological systems cannot be tackled cost-efficiently using conventional high-throughput screening methods. Rather, the time has come to design new strategies that maximize the chance of hit identification through a rationalization of the PPI inhibitor chemical space and the design of PPI-focused compound libraries (global or target-specific). Here, we train machine-learning-based models, mainly decision trees, on a dataset of known PPI inhibitors and of regular drugs in order to determine a global physico-chemical profile for putative PPI inhibitors. This statistical analysis reveals two important molecular descriptors for PPI inhibitors, characterizing specific molecular shapes and the presence of a privileged number of aromatic bonds. The best model has been implemented as a computer program, PPI-HitProfiler, that can output from any drug-like compound collection a focused chemical library enriched in putative PPI inhibitors. Our PPI inhibitor profiler is challenged on the experimental screening results of 11 different PPIs, among them the p53/MDM2 interaction screened within our own CDithem platform, which, in addition to validating our concept, led to the identification of 4 novel p53/MDM2 inhibitors. Collectively, our tool shows robust behavior on the 11 experimental datasets, correctly profiling 70% of the experimentally identified hits while removing 52% of the inactive compounds from the initial compound collections. We strongly believe that this new tool can be used as a global PPI inhibitor profiler prior to screening assays to reduce the size of the compound collections to be experimentally screened while keeping most of the true PPI inhibitors.
PPI-HitProfiler is freely available on request from our CDithem platform website, www.CDithem.com.
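
The two headline figures in the description (70% of hits retained, 52% of inactive compounds removed) can be turned into an expected enrichment for a screening collection. The sketch below is only an arithmetic illustration, not part of PPI-HitProfiler: the function name `apply_profile`, the `FilterOutcome` container, and the example collection of 100 hits among 10,000 compounds are all hypothetical, and it assumes the rates reported collectively over the 11 datasets would hold for a single collection.

```python
from dataclasses import dataclass


@dataclass
class FilterOutcome:
    """Expected composition of a collection after profiling (illustrative)."""
    hits_kept: float
    inactives_kept: float
    hit_rate_before: float
    hit_rate_after: float

    @property
    def enrichment_factor(self) -> float:
        # Ratio of the hit rate in the focused library to the original hit rate.
        return self.hit_rate_after / self.hit_rate_before


def apply_profile(n_hits: int, n_inactives: int,
                  hit_recall: float = 0.70,
                  inactive_removal: float = 0.52) -> FilterOutcome:
    """Estimate the focused library obtained from a compound collection.

    hit_recall and inactive_removal default to the collective figures quoted
    in the description; applying them to one collection is an assumption.
    """
    hits_kept = n_hits * hit_recall
    inactives_kept = n_inactives * (1.0 - inactive_removal)
    hit_rate_before = n_hits / (n_hits + n_inactives)
    hit_rate_after = hits_kept / (hits_kept + inactives_kept)
    return FilterOutcome(hits_kept, inactives_kept,
                         hit_rate_before, hit_rate_after)


if __name__ == "__main__":
    # Hypothetical collection: 10,000 compounds, 100 of them true hits.
    outcome = apply_profile(n_hits=100, n_inactives=9900)
    kept = outcome.hits_kept + outcome.inactives_kept
    print(f"compounds left to screen: {kept:.0f}")
    print(f"hit rate before: {outcome.hit_rate_before:.2%}")
    print(f"hit rate after:  {outcome.hit_rate_after:.2%}")
    print(f"enrichment factor: {outcome.enrichment_factor:.2f}")
```

Under these assumptions the focused library is a little under half the size of the original collection and its hit rate is about 1.45 times higher. The gain depends only on the two rates, since the enrichment factor equals the hit recall divided by the fraction of the collection that is kept.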
Created:
2016-01-18



