DataSheet2_Packpred: Predicting the Functional Effect of Missense Mutations.xls
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/DataSheet2_Packpred_Predicting_the_Functional_Effect_of_Missense_Mutations_xls/15576813
Description:
Predicting the functional consequences of single point mutations is relevant to protein function annotation and to clinical analysis and diagnosis. We developed and tested Packpred, which combines a multi-body clique statistical potential with a depth-dependent amino acid substitution matrix (FADHM) and positional Shannon entropy to predict the functional consequences of point mutations in proteins. Parameters were trained on a saturation mutagenesis data set of T4-lysozyme (1,966 mutations). The method was tested on another saturation mutagenesis data set (CcdB; 1,534 mutations) and on the Missense3D data set (4,099 mutations). The performance of Packpred was compared against that of six other contemporary methods. With MCC values of 0.42, 0.47, and 0.36 on the training set and the two test sets, respectively, Packpred outperforms all other methods on all data sets, except that it marginally underperforms FADHM on the CcdB data set. A meta server analysis was performed that chose the best-performing method for each wild-type amino acid and for each wild-type/mutant amino acid pair. This raised the MCC on the Missense3D data set to 0.40 and 0.51 for the two meta predictors, respectively. We conjecture that better meta predictors could improve accuracy further, because for ∼99% of the data at least one of the seven methods compared makes the correct prediction.
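The description reports performance as MCC (Matthews correlation coefficient). As a reading aid, here is a minimal sketch of how MCC is computed for a binary "damaging vs. neutral" classification. The function name `mcc` and the toy labels are illustrative assumptions; they are not taken from the Packpred code or from this data sheet.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = damaging, 0 = neutral).

    Hypothetical helper for illustration only; not part of Packpred.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Toy labels (made up): four mutations, one damaging mutation missed.
observed = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
score = mcc(observed, predicted)
```

MCC ranges from -1 (complete disagreement) through 0 (no better than chance) to 1 (perfect agreement), and unlike plain accuracy it stays informative when damaging and neutral mutations are unevenly represented.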
Created:
2021-08-20



