Prediction of Intrinsic Disorder Using Rosetta ResidueDisorder and AlphaFold2

Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://figshare.com/articles/dataset/Prediction_of_Intrinsic_Disorder_Using_Rosetta_ResidueDisorder_and_AlphaFold2/21347715
Description:
The combination of deep learning and sequence data has transformed protein structure prediction and modeling, as evidenced by the success of AlphaFold (AF). For this reason, many methods have been developed to take advantage of this success in areas where inaccurate structural modeling may limit computational predictiveness. For example, many methods, including our Rosetta ResidueDisorder (RRD) approach, have been developed to predict protein intrinsic disorder from sequence. Intrinsically disordered regions in proteins are parts of the sequence that do not form ordered, folded structures under typical physiological conditions. In the original implementation of RRD, Rosetta ab initio models were generated, and disordered regions were predicted based on residue scores (disordered residues typically lie in regions of unfavorable scores). In this work, we show that (i) replacing the ab initio modeling with AF (while using the same scoring and disorder assignment approach) and (ii) updating the score function significantly improved predictiveness. Residues were ranked better by order/disorder, as evidenced by an improvement in receiver operating characteristic area under the curve (ROC AUC) from 0.69 to 0.78 on a large (229 proteins), balanced data set (relatively even numbers of ordered and disordered residues). The binary prediction accuracy also improved, from 62% to 74%, on the same data set. Our results show that the combined AF-RRD approach was as good as or better than all existing methods by these metrics (AF-RRD had the highest prediction accuracy).
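
The two metrics quoted in the description (ROC AUC for ranking residues by order/disorder, and binary prediction accuracy) can be illustrated with a minimal sketch. This is not the RRD or AF-RRD implementation: the function names, the example scores, and the threshold of 0.0 are hypothetical, and the sketch assumes that a higher per-residue score means a less favorable, more disorder-like residue.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that a disordered residue (label 1)
    scores higher than an ordered one (label 0); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both ordered and disordered residues are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def binary_accuracy(scores: Sequence[float], labels: Sequence[int],
                    threshold: float) -> float:
    """Fraction of residues classified correctly when a residue is called
    disordered if its score exceeds the (hypothetical) threshold."""
    preds = [1 if s > threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


# Hypothetical per-residue scores and order (0) / disorder (1) labels.
example_scores = [2.1, -1.5, 0.3, -0.8, 1.2, -2.0]
example_labels = [1, 0, 1, 0, 0, 0]

print(roc_auc(example_scores, example_labels))
print(binary_accuracy(example_scores, example_labels, threshold=0.0))
```

ROC AUC depends only on how residues are ranked, whereas accuracy depends on the chosen cutoff, which is why the description reports the two numbers separately.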

Created:
2022-10-17