Prediction of Intrinsic Disorder Using Rosetta ResidueDisorder and AlphaFold2
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://figshare.com/articles/dataset/Prediction_of_Intrinsic_Disorder_Using_Rosetta_ResidueDisorder_and_AlphaFold2/21347715
Description:
The combination of deep learning and sequence data has transformed protein structure prediction and modeling, evidenced in the success of AlphaFold (AF). For this reason, many methods have been developed to take advantage of this success in areas where inaccurate structural modeling may limit computational predictiveness. For example, many methods have been developed to predict protein intrinsic disorder from sequence, including our Rosetta ResidueDisorder (RRD) approach. Intrinsically disordered regions in proteins are parts of the sequence that do not form ordered, folded structures under typical physiological conditions. In the original implementation of RRD, Rosetta ab initio models were generated, and disordered regions were predicted based on residue scores (disordered residues typically exist in regions of unfavorable scores). In this work, we show that by (i) replacing the ab initio modeling with AF (using the same scoring and disorder assignment approach) and (ii) updating the score function, the predictiveness improved significantly. Residues were better ranked by the order/disorder, evidenced by an improvement in receiver operating characteristic area-under-the-curve from 0.69 to 0.78 on a large (229 protein) and balanced data set (relatively even ordered versus disordered residues). Finally, the binary prediction accuracy also improved from 62% to 74% on the same data set. Our results show that the combined AF-RRD approach was as good as or better than all existing methods by these metrics (AF-RRD had the highest prediction accuracy).
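The two metrics quoted above (ROC area-under-the-curve for residue ranking, and binary prediction accuracy) can be illustrated with a minimal sketch. This is not the RRD or AF-RRD code: the function names, the toy scores, the labels, and the threshold of 0.0 are all hypothetical, and the only assumption taken from the description is that a higher (less favorable) per-residue score points toward disorder.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the fraction of (disordered, ordered) residue pairs in which
    the disordered residue has the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # disordered residues
    neg = [s for s, y in zip(scores, labels) if y == 0]  # ordered residues
    if not pos or not neg:
        raise ValueError("both ordered and disordered residues are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def binary_accuracy(scores: Sequence[float], labels: Sequence[int],
                    threshold: float) -> float:
    """Fraction of residues classified correctly when a score above
    `threshold` is called disordered (1) and anything else ordered (0)."""
    preds = [1 if s > threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


# Hypothetical toy data: one score per residue, label 1 = disordered.
toy_scores = [2.1, 0.4, -1.3, 1.7, -0.2, -2.0]
toy_labels = [1, 1, 0, 0, 0, 0]

auc = roc_auc(toy_scores, toy_labels)
acc = binary_accuracy(toy_scores, toy_labels, threshold=0.0)
print(f"AUC = {auc:.3f}, accuracy = {acc:.3f}")
```

The AUC measures only how well residues are ranked, independent of any cutoff, while the accuracy depends on the chosen threshold; that is why the description reports the two numbers separately.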
Created:
2022-10-17



