
genbio-ai/ProteinGYM-DMS

Source: Hugging Face | Updated: 2024-12-07 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/genbio-ai/ProteinGYM-DMS
Resource overview:
# ProteinGYM DMS Benchmark

The DMS Benchmark includes three types of mutations: indels, single substitutions, and multiple substitutions. Each TSV file contains all data for a single task, representing all possible mutations on a specific protein sequence. The filename corresponds to the task name. Here we use a random cross-validation scheme: the data for each task is randomly divided into five folds, and the `fold_id` column indicates the fold assignment. The labels are continuous values ranging from 0 to 1, and the tasks are categorized into different functional groups. For more dataset details, please refer to the [ProteinGYM paper](https://www.biorxiv.org/content/10.1101/2023.12.07.570727v1).

| Mutation Type | Num Tasks | Categories |
|---------------|:---------:|:----------:|
| indels | 66 | Activity, Expression, Fitness, Stability |
| substitutions | 217 | Activity, Expression, Fitness, Stability, Binding |
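The fold-based scheme described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: `fold_id` is the one column name taken from the dataset description, while `sequence` and `score` are hypothetical column names, and the rows are made up rather than real ProteinGYM data. The helper names `read_task_tsv` and `iter_cv_splits` are likewise invented for this sketch.

```python
import csv
import io
from collections import defaultdict


def read_task_tsv(handle):
    """Read one task's TSV into a list of row dicts (all values stay strings)."""
    return list(csv.DictReader(handle, delimiter="\t"))


def iter_cv_splits(rows, fold_column="fold_id"):
    """Yield (held_out_fold, train_rows, test_rows) once per distinct fold."""
    by_fold = defaultdict(list)
    for row in rows:
        by_fold[row[fold_column]].append(row)
    for fold in sorted(by_fold):
        test_rows = by_fold[fold]
        # Everything outside the held-out fold becomes training data.
        train_rows = [r for f, rs in by_fold.items() if f != fold for r in rs]
        yield fold, train_rows, test_rows


# Made-up example: "sequence" and "score" are hypothetical column names,
# and the values below are NOT real ProteinGYM data.
EXAMPLE_TSV = (
    "sequence\tscore\tfold_id\n"
    "MKTA\t0.10\t0\n"
    "MKTC\t0.25\t0\n"
    "MKTD\t0.40\t1\n"
    "MKTE\t0.55\t1\n"
    "MKTF\t0.30\t2\n"
    "MKTG\t0.65\t2\n"
    "MKTH\t0.80\t3\n"
    "MKTI\t0.15\t3\n"
    "MKTK\t0.90\t4\n"
    "MKTL\t0.45\t4\n"
)

rows = read_task_tsv(io.StringIO(EXAMPLE_TSV))
splits = list(iter_cv_splits(rows))
for fold, train_rows, test_rows in splits:
    print(f"fold {fold}: {len(train_rows)} train / {len(test_rows)} test")
```

To try this on a downloaded task file, replace the in-memory example with `open(path, newline="")`, where `path` points at one of the TSV files, and check the file's header row for the actual column names before relying on any of them.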
Provided by:
genbio-ai