PawanRamaMali/proteingym-fm-benchmark
Source: Hugging Face | Updated: 2026-04-21 | Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/PawanRamaMali/proteingym-fm-benchmark
Resource description:
---
license: mit
task_categories:
- other
tags:
- protein
- biology
- deep-mutational-scanning
- protein-language-models
- benchmark
pretty_name: ProteinGym FM Benchmark
size_categories:
- 1M<n<10M
---
# ProteinGym Protein Foundation Model Benchmark
Benchmark results for protein foundation models on the ProteinGym substitution benchmark, covering 217 deep mutational scanning (DMS) assays and ~2.7M variants.
## Models Benchmarked
| Model | Status | Assays | Mean Spearman ρ | Median ρ |
|-------|--------|--------|-----------------|----------|
| ESM-2 (650M) | ✅ Complete | 217/217 | 0.446 | - |
| ESM-2 (3B) | ✅ Complete | 201/217 | 0.432 | 0.475 |
| SaProt (650M) | ✅ Complete | 201/217 | 0.258 | 0.258 |
| ProtT5-XL | ✅ Complete | 134/217 | ~0.28 | 0.282 |
| ESM-1v (5 seeds) | 🔄 Running | 0/217 | - | - |
| ESM3 (1.4B) | Pending | 0/217 | - | - |
| ESM3 (7B) | Pending | 0/217 | - | - |
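
The table reports the mean and median of per-assay Spearman ρ values. The card does not spell out the scoring pipeline, so the following is only a minimal sketch of how such a summary could be computed, assuming each assay yields one ρ between model scores and measured DMS fitness. The function names (`average_ranks`, `spearman_rho`, `summarize`) are hypothetical and are not taken from the benchmark repository.

```python
from statistics import mean, median


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one side is constant
    return cov / (var_x * var_y) ** 0.5


def summarize(per_assay_rho):
    """Mean and median over assays, skipping NaN entries (assumed policy)."""
    valid = [r for r in per_assay_rho.values() if r == r]  # NaN != NaN
    return {"assays": len(valid), "mean": mean(valid), "median": median(valid)}
```

Because the models cover different numbers of assays (for example 217/217 versus 134/217), means computed this way are taken over different assay subsets and are not directly comparable across rows.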
## Dataset Structure
```
results/
├── esm2_650M/
│   ├── per_assay_spearman.csv
│   └── scored/
├── esm2_3B/
│   ├── per_assay_spearman.csv
│   └── scored/
├── saprot_650M/
│   └── per_assay_spearman.csv
├── prott5_xl/
│   └── per_assay_spearman.csv
...
```
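
As a rough illustration of how one of the `per_assay_spearman.csv` files in this layout might be fetched and read, here is a sketch using `hf_hub_download` from the `huggingface_hub` package. The CSV column names `assay` and `spearman` are assumptions, since the card does not document the file schema; check the actual header and adjust them. The helper names are hypothetical as well.

```python
import csv

REPO_ID = "PawanRamaMali/proteingym-fm-benchmark"


def result_file(model_dir):
    """Path of a model's per-assay results inside the dataset repository."""
    return f"results/{model_dir}/per_assay_spearman.csv"


def parse_per_assay_rho(lines, assay_column="assay", rho_column="spearman"):
    """Parse CSV lines into {assay: rho}.

    The default column names are assumptions, not a documented schema.
    Rows with an empty rho cell are skipped.
    """
    rhos = {}
    for row in csv.DictReader(lines):
        value = row[rho_column].strip()
        if value:
            rhos[row[assay_column]] = float(value)
    return rhos


def download_and_parse(model_dir):
    """Download one results file from the Hub and parse it (needs network)."""
    # Imported lazily so the parsing helpers work without the dependency.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(
        repo_id=REPO_ID,
        filename=result_file(model_dir),
        repo_type="dataset",
    )
    with open(local_path, newline="") as handle:
        return parse_per_assay_rho(handle)
```

For example, `download_and_parse("esm2_650M")` would return a dictionary keyed by assay identifier, provided the assumed column names match the file.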
## Citation
```bibtex
@article{mali2026protein,
title={From Sequence Encoders to Multimodal Systems: A Critical Survey of Protein Foundation Models},
author={Mali, Pawan Rama and Bharti, Vandana},
journal={IEEE Transactions on Computational Biology and Bioinformatics},
year={2026}
}
```
## Related
- [ProteinGym Benchmark](https://proteingym.org/)
- [GitHub Repository](https://github.com/PawanRamaMali/protein-fm-benchmark)
Provided by:
PawanRamaMali



