DeepFoldProtein/uniref50_processed
Source: Hugging Face. Updated: 2025-11-09. Indexed: 2025-11-15.
Download link:
https://hf-mirror.com/datasets/DeepFoldProtein/uniref50_processed
Description:
This dataset is a preprocessed UniRef50 snapshot intended for unsupervised protein representation learning. Sequences are normalized, filtered, and deduplicated, and the data is split by UniRef50 cluster ID into train, validation, and test sets. The validation set uses ESM's official validation headers, and the test set is a random holdout selected by cluster ID.
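To illustrate what a split by cluster ID could look like, here is a minimal sketch. It is not the dataset authors' actual procedure, which the description does not spell out. The function name `split_by_cluster`, the `test_fraction` and `seed` parameters, and the example IDs are all hypothetical. The only ideas taken from the description are that validation clusters come from a fixed list (here `val_ids`, standing in for the ESM validation headers) and that test clusters are a random holdout of the remaining cluster IDs.

```python
import random


def split_by_cluster(cluster_ids, val_ids, test_fraction=0.01, seed=0):
    """Map each unique cluster ID to 'train', 'valid', or 'test'.

    Hypothetical sketch: `val_ids` stands in for a fixed validation list,
    and `test_fraction` / `seed` are illustrative parameters only.
    """
    unique_ids = set(cluster_ids)
    val_ids = set(val_ids)

    # Clusters reserved for validation are never eligible for train or test.
    # Sorting makes the random draw reproducible for a given seed.
    remaining = sorted(unique_ids - val_ids)

    # Draw the test holdout at the cluster level, not the sequence level,
    # so that no cluster is shared between train and test.
    rng = random.Random(seed)
    n_test = int(len(remaining) * test_fraction)
    test_ids = set(rng.sample(remaining, n_test))

    assignment = {}
    for cid in unique_ids:
        if cid in val_ids:
            assignment[cid] = "valid"
        elif cid in test_ids:
            assignment[cid] = "test"
        else:
            assignment[cid] = "train"
    return assignment


# Example with made-up cluster IDs.
example_ids = [f"UniRef50_{i:04d}" for i in range(1000)]
example_split = split_by_cluster(
    example_ids, val_ids=example_ids[:10], test_fraction=0.1, seed=0
)
```

Assigning whole clusters to a single split keeps related sequences from the same cluster out of both the training and evaluation sets.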
Provider:
DeepFoldProtein



