fredzzp/Uniref50
Hugging Face, updated 2024-12-27, indexed 2025-04-12
Download link:
https://hf-mirror.com/datasets/fredzzp/Uniref50
Description:
Uniref50 is a protein sequence dataset clustered at 50% sequence identity, containing approximately 40 million protein sequences. It is split into training, validation, and test sets, which can be used to train and evaluate machine learning models.
Provider:
fredzzp



