Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families

Source: NIAID Data Ecosystem (indexed 2026-03-06)

Download link:

https://figshare.com/articles/dataset/Combining_Structure_and_Sequence_Information_Allows_Automated_Prediction_of_Substrate_Specificities_within_Enzyme_Families/145072


Description:

An important aspect of the functional annotation of enzymes is not only the type of reaction catalysed by an enzyme, but also the substrate specificity, which can vary widely within the same family. In many cases, prediction of family membership and even substrate specificity is possible from enzyme sequence alone, using a nearest neighbour classification rule. However, the combination of structural information and sequence information can improve the interpretability and accuracy of predictive models. The method presented here, Active Site Classification (ASC), automatically extracts the residues lining the active site from one representative three-dimensional structure and the corresponding residues from sequences of other members of the family. From a set of representatives with known substrate specificity, a Support Vector Machine (SVM) can then learn a model of substrate specificity. Applied to a sequence of unknown specificity, the SVM can then predict the most likely substrate. The models can also be analysed to reveal the underlying structural reasons determining substrate specificities and thus yield valuable insights into mechanisms of enzyme specificity. We illustrate the high prediction accuracy achieved on two benchmark data sets and the structural insights gained from ASC by a detailed analysis of the family of decarboxylating dehydrogenases. The ASC web service is available at http://asc.informatik.uni-tuebingen.de/.
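
The workflow described above (take the active-site positions from one representative, read the corresponding residues from the other family members, then classify a sequence of unknown specificity) can be illustrated with a small sketch. This is not the ASC implementation. The aligned sequences, the substrate labels, and the active-site column indices are hypothetical toy values, and a 1-nearest-neighbour rule over Hamming distance stands in for the SVM so that the example needs only the Python standard library.

```python
# Toy aligned sequences with known substrate labels (hypothetical data,
# not taken from the paper). All sequences are assumed to be aligned to
# the structural representative, so a column index refers to the same
# position in every sequence.
TRAINING = [
    ("ACDEFGHIKL", "substrate_A"),
    ("ACDEYGHIKL", "substrate_A"),
    ("ACNEFGRIKL", "substrate_B"),
    ("TCNEFGRIKL", "substrate_B"),
]

# Alignment columns assumed to line the active site (hypothetical).
ACTIVE_SITE_COLUMNS = [2, 4, 6]


def active_site_signature(aligned_seq, columns):
    """Return the residues found at the given alignment columns."""
    return "".join(aligned_seq[c] for c in columns)


def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("signatures must have equal length")
    return sum(x != y for x, y in zip(a, b))


def predict_substrate(query, training, columns):
    """Label the query with the substrate of its closest training signature.

    A nearest-neighbour stand-in for the SVM described in the text.
    Returns (label, distance); ties go to the earliest training entry.
    """
    query_sig = active_site_signature(query, columns)
    best_label, best_dist = None, None
    for seq, label in training:
        dist = hamming(query_sig, active_site_signature(seq, columns))
        if best_dist is None or dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


if __name__ == "__main__":
    # The query differs from both classes outside the active site, but its
    # active-site residues are closest to the substrate_B examples.
    print(predict_substrate("GCNEWGRIKL", TRAINING, ACTIVE_SITE_COLUMNS))
```

Restricting the comparison to the active-site columns is what makes the prediction interpretable: the positions that separate the classes can be read directly from the signatures.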

Created:

2010-01-08
