
Alignment-based protein mutational landscape prediction: doing more with less

Source: DataONE. Updated 2024-02-01. Indexed 2024-06-08.
Download link:
https://search.dataone.org/view/sha256:62165a5094fef60600cb7ab416ffb26226b0f04d73e615c3496ceb3ea9223d08
Resource description:
The wealth of genomic data has boosted the development of computational methods predicting the phenotypic outcomes of missense variants. The most accurate ones exploit multiple sequence alignments, which can be costly to generate. Recent efforts for democratizing protein structure prediction have overcome this bottleneck by leveraging the fast homology search of MMseqs2. Here, we show the usefulness of this strategy for mutational outcome prediction through a large-scale assessment of 1.5M missense variants across 72 protein families. Our study demonstrates the feasibility of producing alignment-based mutational landscape predictions that are both high-quality and compute-efficient for entire proteomes. We provide the community with the whole human proteome mutational landscape and simplified access to our predictive pipeline.

# Alignment-based protein mutational landscape prediction: doing more with less

[Access this dataset on Dryad](https://doi.org/10.5061/dryad.vdncjsz1s)

This dataset contains the data and tools associated with *Alignment-based protein mutational landscape prediction: doing more with less*, Abakarova *et al.*, *Genome Biology and Evolution*, 2023.

**doi:**

## Description of the data and file structure

We provide the community with data associated with our assessment of four different multiple sequence alignment (MSA) resources and protocols, as well as the complete single-mutational landscape of the human proteome predicted by combining the MSA protocol implemented in ColabFold and the variant effect predictor GEMME.

1. **ProteinGym_assessment.tgz** contains the data and scripts associated with our assessment of the four different MSA generation protocols (ColabFold, ProteinGym, ProteinNet, Pfam) against the ProteinGym substitution benchmark. This archive is organised as follo...
Created: 2025-07-26