
AlphaFold Unmasked data sets

DataCite Commons. Updated 2025-01-27; indexed 2025-04-16.
Download link:
https://figshare.scilifelab.se/articles/dataset/AlphaFold_Unmasked_data_sets/24198669
Description:
This deposit contains all of the predictions generated for the test cases presented in "AlphaFold Unmasked: integration of experiments and predictions with a smarter template mechanism" (doi: https://doi.org/10.1101/2023.09.20.558579), along with the log files necessary to reproduce the experiments.

Each tar.gz file includes one or more AlphaFold experiments, in which multiple predictions were generated either with AlphaFold-Multimer (standard pipeline, v2.2 and/or v2.3 parameters) or with AF_unmasked. An experiment consists of a set of 3D structure predictions (.pdb files), the ancillary data generated by AlphaFold (pickle files), and the corresponding inputs (multiple sequence alignments, sequences). Scripts to reproduce the results are included along with the log files generated during the experiments.

H1111, H1142, T1109 and T1110 are multimeric prediction targets from CASP15 (https://predictioncenter.org/casp15/), chosen because most or all predictors failed to correctly predict these complexes in the 2022 edition of CASP.

Rubisco, NF1 and ClpB are examples of large and/or challenging targets for which cryo-EM data are available to be integrated into the prediction pipeline.

The PDB benchmark consists of a set of heterodimeric protein structures deposited in the PDB before January 2022, i.e. before AlphaFold v2.3 was trained and released. These heterodimers have been redundancy-reduced by structural similarity (MMalign score threshold: 0.4) to increase their diversity.
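
As a quick way to check what a downloaded archive contains before extracting it, the following minimal sketch counts the regular files in a tar.gz archive by extension, using only the Python standard library. The function name `summarize_archive` is hypothetical, and the sketch assumes nothing about the internal directory layout or file naming of the archives beyond what the description states (.pdb files, pickle files, inputs, scripts, and logs).

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(archive_path):
    """Count the regular files in a .tar.gz archive, grouped by file extension.

    Hypothetical helper for illustration; it reads only the archive index
    and does not extract or open any of the contained files.
    """
    counts = Counter()
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links, and other non-file entries
            # Member names inside a tar archive always use "/" separators.
            suffix = PurePosixPath(member.name).suffix.lower()
            counts[suffix or "(no extension)"] += 1
    return counts
```

Calling `summarize_archive` on the path of one downloaded archive returns a `Counter` whose `most_common()` method lists the extensions from most to least frequent, which gives a rough idea of how many structure predictions and ancillary files an experiment holds.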

Provider:
Linköping University
Created:
2023-09-26