OH-PRED: prediction of protein hydroxylation sites by incorporating adapted normal distribution bi-profile Bayes feature extraction and physicochemical properties of amino acids
Source: DataCite Commons. Updated: 2020-09-04. Indexed: 2024-07-25.
Download link:
https://tandf.figshare.com/articles/dataset/OH_PRED_prediction_of_protein_hydroxylation_sites_by_incorporating_adapted_normal_distribution_bi_profile_Bayes_feature_extraction_and_physicochemical_properties_of_amino_acids/3218737/1
Description:
Hydroxylation of proline or lysine residues in proteins is a common post-translational modification event, and such modifications are found in many physiological and pathological processes. Nonetheless, the exact molecular mechanism of hydroxylation remains under investigation. Because experimental identification of hydroxylation is time-consuming and expensive, bioinformatics tools with high accuracy represent desirable alternatives for large-scale rapid identification of protein hydroxylation sites. In view of this, we developed a support vector machine-based tool, OH-PRED, for the prediction of protein hydroxylation sites using the adapted normal distribution bi-profile Bayes feature extraction in combination with the physicochemical property indexes of the amino acids. In a jackknife cross-validation, OH-PRED yields an accuracy of 91.88% and a Matthews correlation coefficient (MCC) of 0.838 for the prediction of hydroxyproline sites, and an accuracy of 97.42% and an MCC of 0.949 for the prediction of hydroxylysine sites. These results demonstrate that OH-PRED significantly increased the prediction accuracy for hydroxyproline and hydroxylysine sites, by 7.37% and 14.09%, respectively, compared with the latest predictor, PredHydroxy. In independent tests, OH-PRED also outperforms previously published methods.
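For readers unfamiliar with the two metrics quoted in the description, the following is a minimal sketch of how accuracy and the Matthews correlation coefficient are commonly computed from binary confusion-matrix counts. It is not taken from OH-PRED; the function names and the example counts are hypothetical and chosen only for illustration.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, NOT results from the OH-PRED study.
    tp, tn, fp, fn = 90, 85, 15, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when positive and negative sites are unevenly represented.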
Provider:
Taylor & Francis
Created:
2016-05-04



