
The BenchStab dataset: a dataset for comparing mutational predictors of stability

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10637727
Description:
This dataset is part of BenchStab, a command-line tool for querying and benchmarking web-based protein stability predictors. We created it to independently evaluate 18 structure-enabled and 4 sequence-based predictors of stability change upon mutation, and we suggest excluding it from the training and validation of future stability predictors.

The dataset consists of single-point mutations and their experimentally determined ΔΔG values from FireProtDB, using only records that have both a ΔΔG measurement and a PDB accession code. Using UniRef50 clusters, we removed all records similar to the training data of any predictor considered in BenchStab. This left 289 records for 36 proteins, of which 28% display a stabilizing effect (negative ΔΔG; see DDG distribution.png for the exact distribution). SCOP fold-based structure clustering further confirmed that the folds of 25 of our proteins were not present in the training sets.

The file dataset.csv contains the mutation specifications (including the chain) and the ground-truth ΔΔG reported in the literature, alongside accession codes from FireProtDB (experiment ID), UniProt, and the Protein Data Bank, as well as UniRef50 cluster IDs. The file benchstab_input.csv contains the same data in the input format of the BenchStab tool. For more statistics and details about the dataset, please read the supplement of the paper or get in touch with us.
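To make the description concrete, here is a minimal sketch of the two operations it mentions: dropping records whose UniRef50 cluster overlaps with a predictor's training data, and computing the share of stabilizing (negative ΔΔG) mutations. The column names `ddg` and `uniref50`, the helper names, and the sample rows are all assumptions made for illustration. They are not taken from dataset.csv, whose actual header should be checked first, and this is not the authors' own pipeline.

```python
import csv
import io

# Hypothetical column names; check the real header of dataset.csv before use.
DDG_COLUMN = "ddg"
CLUSTER_COLUMN = "uniref50"


def drop_training_clusters(rows, training_clusters, cluster_column=CLUSTER_COLUMN):
    """Keep only records whose UniRef50 cluster is absent from the training clusters."""
    excluded = set(training_clusters)
    return [row for row in rows if row[cluster_column] not in excluded]


def stabilizing_fraction(rows, ddg_column=DDG_COLUMN):
    """Return the share of records with a negative ddG, i.e. a stabilizing effect."""
    values = [float(row[ddg_column]) for row in rows]
    if not values:
        raise ValueError("no records to evaluate")
    return sum(value < 0 for value in values) / len(values)


# Made-up sample rows, only to show the calls; these are not BenchStab records.
SAMPLE = """mutation,ddg,uniref50
A10G,-1.2,UniRef50_X1
L45P,0.8,UniRef50_X1
V7I,2.1,UniRef50_X2
K3E,-0.3,UniRef50_X3
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
# Pretend that cluster UniRef50_X2 appeared in some predictor's training set.
independent = drop_training_clusters(rows, {"UniRef50_X2"})
fraction = stabilizing_fraction(independent)
```

To apply the same functions to the real file, the rows could be read with `csv.DictReader(open("dataset.csv", newline="", encoding="utf-8"))` after adjusting the two column-name constants to match its header.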
Created:
2024-10-04