Substrate-Driven Mapping of the Degradome by Comparison of Sequence Logos

NIAID Data Ecosystem, indexed 2026-03-08

Download link:

https://figshare.com/articles/dataset/_Substrate_Driven_Mapping_of_the_Degradome_by_Comparison_of_Sequence_Logos_/850768

Description:

Sequence logos are frequently used to illustrate the substrate preferences and specificity of proteases. Here, we employed the compiled substrates of the MEROPS database to introduce a novel metric for comparing protease substrate preferences. The constructed similarity matrix of 62 proteases can be used to intuitively visualize similarities in protease substrate readout via principal component analysis and construction of protease specificity trees. Since our new metric is based solely on substrate data, the protease tree can include proteolytic enzymes of different evolutionary origin. Our analyses thereby confirm pronounced overlaps in substrate recognition not only between proteases closely related on a sequence basis but also between proteolytic enzymes of different evolutionary origin and catalytic type. To illustrate the applicability of our approach, we analyze the distribution of targets of small molecules from the ChEMBL database in our substrate-based protease specificity trees. We observe a striking clustering of annotated targets in tree branches, even though these grouped targets do not necessarily share similarity at the protein sequence level. This highlights the value and applicability of knowledge acquired from peptide substrates in the design of small-molecule drugs, e.g., for the prediction of off-target effects or for drug repurposing. Consequently, our similarity metric makes it possible to map the degradome and its associated drug target network via comparison of known substrate peptides. The substrate-driven view of protein-protein interfaces is not limited to the field of proteases but can be applied to any target class for which a sufficient amount of known substrate data is available.
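The description above does not spell out the similarity metric itself, so the following is only a minimal sketch of the general idea of comparing sequence logos. It assumes each protease is summarized by per-position amino acid frequencies computed from equal-length, aligned substrate peptides, and it uses the Jensen-Shannon divergence as a stand-in measure; this is an assumption for illustration, not necessarily the metric used by the authors. The protease names and peptides are invented toy data, not MEROPS entries.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def frequency_matrix(peptides):
    """Per-position amino acid frequencies (the data behind a sequence logo).

    Assumes all peptides are aligned, have the same length, and contain
    at least one standard amino acid at every position.
    """
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    matrix = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        total = sum(counts[aa] for aa in AMINO_ACIDS)
        matrix.append({aa: counts[aa] / total for aa in AMINO_ACIDS})
    return matrix


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions, in [0, 1]."""
    m = {aa: 0.5 * (p[aa] + q[aa]) for aa in AMINO_ACIDS}

    def kl(a):
        # Terms with a[aa] == 0 contribute nothing; m[aa] > 0 whenever a[aa] > 0.
        return sum(a[aa] * math.log2(a[aa] / m[aa]) for aa in AMINO_ACIDS if a[aa] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def logo_similarity(matrix_a, matrix_b):
    """Stand-in similarity: 1 minus the mean per-position JS divergence."""
    divergences = [js_divergence(pa, pb) for pa, pb in zip(matrix_a, matrix_b)]
    return 1.0 - sum(divergences) / len(divergences)


def similarity_matrix(substrates):
    """Pairwise similarity matrix for a dict of {protease name: peptide list}."""
    names = sorted(substrates)
    logos = {name: frequency_matrix(substrates[name]) for name in names}
    matrix = [[logo_similarity(logos[a], logos[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical toy substrates (invented for illustration only).
toy = {
    "protease_A": ["AAKR", "AGKR", "ASKR"],
    "protease_B": ["AAKR", "AGKR", "ATKR"],
    "protease_C": ["DEVD", "DEHD", "DQVD"],
}

names, sim = similarity_matrix(toy)
for name, row in zip(names, sim):
    print(name, [round(value, 3) for value in row])
```

A matrix of this kind could then be passed to principal component analysis or hierarchical clustering to obtain the sort of specificity tree the description mentions; those steps are omitted here to keep the sketch free of third-party dependencies.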

Created:

2013-11-14
