five

Site saturation mutagenesis of 500 human protein domains

收藏
NIAID Data Ecosystem2026-05-02 收录
下载链接:
https://www.ncbi.nlm.nih.gov/sra/SRP504250
下载链接
链接失效反馈
官方服务:
资源简介:
Missense variants that change the amino acid sequences of proteins cause one third of human genetic diseases. Tens of millions of missense variants exist in the current human population, with the vast majority having unknown functional consequences. Here we present the first large-scale experimental analysis of human missense variants. Using DNA synthesis and cellular selection experiments we quantify the impact of >500,000 variants on the abundance of >500 human protein domains. This dataset, Domainome 1.0, reveals that >60% of disease-causing variants destabilize proteins. The contribution of stability to protein fitness varies across proteins and diseases, and is particularly important in recessive disorders. Combining experimental stability measurements with large language models we annotate functionally important sites across domains. Fitting energy models to the data demonstrates the conservation of mutation effects in homologous domains and allows stability to be accurately predicted for entire domain families. Domainome 1.0 demonstrates the feasibility of assaying human protein variant effects at scale and provides a large consistent reference dataset for clinical variant interpretation and the training and benchmarking of computational methods. Overall design: We designed site saturation mutagenesis variant libraries of >500 protein domains, and selected them using an in cell abundance assay to measure the effects of mutations on protein abundance.
创建时间:
2025-01-14
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作