
RosettaCommons/MegaScale

Source: Hugging Face. Updated 2024-11-13; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/RosettaCommons/MegaScale
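Description:
The MegaScale dataset contains 1,841,285 experimental measurements of protein folding stability, obtained by applying cDNA display proteolysis to natural and designed proteins. It is divided into several subsets, including high-quality folding stability data for single amino acid variants and double mutants, covering 331 natural and 148 designed protein domains. The dataset also provides ΔG and ΔΔG estimates for analyzing protein folding stability.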

The MegaScale dataset contains 1,841,285 thermodynamic folding stability measurements using cDNA display proteolysis of natural and designed proteins. The main datasets include:

- `dataset1`: Contains 1,841,285 stability measurements of all mutations in G0-G11.
- `dataset2`: Contains 776,298 high-quality folding stability measurements, covering all single amino acid variants and selected double mutants of 331 natural and 148 de novo designed protein domains of 40-72 amino acids in length.
- `dataset3`: Contains 325,132 ΔG measurements, suitable for estimating the ΔΔG of mutations.
- `dataset3_single`: Single point mutations in `dataset3`, using the train/val/test splits defined in ThermoMPNN.
- `dataset3_single_cv`: Single point mutations in `dataset3`, using cross-validation splits.
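Since `dataset3` is described as suitable for estimating the ΔΔG of mutations, a minimal sketch of that calculation may help. It assumes the common convention ΔΔG = ΔG(mutant) − ΔG(wild type), with ΔG in kcal/mol. The `StabilityRecord` layout, the `"wt"` marker, and all numeric values are hypothetical illustrations, not the dataset's actual column names or measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StabilityRecord:
    """Hypothetical record layout; the real dataset columns may differ."""
    domain: str     # identifier of the parent protein domain
    mutation: str   # e.g. "A12G", or "wt" for the wild-type sequence
    delta_g: float  # folding stability ΔG (assumed to be in kcal/mol)


def compute_ddg(records):
    """Return {(domain, mutation): ΔΔG}, where ΔΔG = ΔG_mutant - ΔG_wild_type."""
    # Index the wild-type ΔG of each domain so mutants can be compared to it.
    wild_type = {r.domain: r.delta_g for r in records if r.mutation == "wt"}
    ddg = {}
    for r in records:
        if r.mutation == "wt" or r.domain not in wild_type:
            continue  # skip wild types and mutants with no reference ΔG
        ddg[(r.domain, r.mutation)] = r.delta_g - wild_type[r.domain]
    return ddg


# Made-up values for illustration only.
records = [
    StabilityRecord("domA", "wt", 3.0),
    StabilityRecord("domA", "A12G", 1.5),
    StabilityRecord("domA", "L30F", 3.5),
    StabilityRecord("domB", "K5E", 2.0),  # no wild type recorded for domB
]
ddg = compute_ddg(records)
```

Under this assumed sign convention, a negative ΔΔG marks a mutation that lowers folding stability relative to the wild type. Mutants whose domain has no wild-type reference are left out rather than guessed.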
Provided by:
RosettaCommons