RosettaCommons/MegaScale
Source: Hugging Face. Updated 2024-11-13. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/RosettaCommons/MegaScale
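Overview:
The MegaScale dataset contains 1,841,285 experimental measurements of protein folding stability, obtained by applying cDNA display proteolysis to natural and designed proteins. It is divided into several subsets, including high-quality folding stability data for single amino acid variants and double mutants, covering 331 natural and 148 designed protein domains. The dataset also provides ΔG and ΔΔG estimates for analyzing protein folding stability.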
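The overview mentions both ΔG and ΔΔG estimates. As a minimal sketch of how the two relate, the snippet below computes ΔΔG as the difference between a mutant's ΔG and the wild-type ΔG. The sign convention (mutant minus wild type, so negative values mean destabilizing when ΔG is the unfolding free energy), the variant names, and the numeric values are all assumptions for illustration and are not taken from the dataset.

```python
def ddg(dg_mutant: float, dg_wildtype: float) -> float:
    """Return ddG = dG(mutant) - dG(wild type), in the units of the inputs.

    Assumed convention: a negative result means the mutation is destabilizing.
    """
    return dg_mutant - dg_wildtype


# Hypothetical dG values for one protein domain (not real MegaScale data).
dg_by_variant = {"wt": 3.0, "A10G": 1.5, "L25F": 3.5}

# ddG of every mutant relative to the wild-type entry.
ddg_by_variant = {
    name: ddg(dg, dg_by_variant["wt"])
    for name, dg in dg_by_variant.items()
    if name != "wt"
}
```

Under this convention the hypothetical `A10G` variant comes out destabilizing and `L25F` slightly stabilizing; some tools flip the sign, so it is worth checking the convention before comparing values.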
The MegaScale dataset contains 1,841,285 thermodynamic folding stability measurements of natural and designed proteins, obtained by cDNA display proteolysis. The main subsets are:
- `dataset1`: Contains 1,841,285 stability measurements of all mutations in G0-G11.
- `dataset2`: Contains 776,298 high-quality folding stability measurements, covering all single amino acid variants and selected double mutants of 331 natural and 148 de novo designed protein domains, each 40-72 amino acids in length.
- `dataset3`: Contains 325,132 ΔG measurements, suitable for estimating the ΔΔG of mutations.
- `dataset3_single`: Single point mutations in `dataset3`, using the train/val/test splits defined in ThermoMPNN.
- `dataset3_single_cv`: Single point mutations in `dataset3`, using cross-validation splits.
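The subsets listed above can be loaded individually. The following is a minimal sketch that assumes each subset name is exposed as a configuration of the `RosettaCommons/MegaScale` repository and that the Hugging Face `datasets` package is installed; the helper name `load_megascale` is hypothetical, and split and column names are not assumed here.

```python
# Subset names as listed in this entry.
MEGASCALE_SUBSETS = (
    "dataset1",
    "dataset2",
    "dataset3",
    "dataset3_single",
    "dataset3_single_cv",
)

REPO_ID = "RosettaCommons/MegaScale"


def load_megascale(subset: str = "dataset3_single"):
    """Load one MegaScale subset with the Hugging Face `datasets` library.

    Assumption: each subset name maps directly to a dataset configuration.
    """
    if subset not in MEGASCALE_SUBSETS:
        raise ValueError(
            f"unknown subset {subset!r}; expected one of {MEGASCALE_SUBSETS}"
        )
    # Imported lazily so the name check works without the dependency installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name=subset)


# Example usage (requires network access):
#   ds = load_megascale("dataset3_single")
#   print(ds)  # shows the available splits and column names
```

Validating the name before importing `datasets` keeps typos from triggering a download attempt. To go through the mirror given in the download link, setting the `HF_ENDPOINT` environment variable to `https://hf-mirror.com` before running is the usual approach, though that is an assumption about the local setup.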
Provided by:
RosettaCommons



