lhallee/foldseek_dataset
Source: Hugging Face. Updated 2025-05-29; indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/lhallee/foldseek_dataset
Description:
This dataset comes from the ProstT5 project. It has two string fields, labels and seqs, and is split into training, test, and validation sets of 17,070,828, 474, and 474 examples respectively. The data has already been decoded with the corresponding tokenizer to recover the raw sequences.
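As a rough illustration of how the description could be checked, the sketch below loads the repository with the Hugging Face `datasets` package and compares what it finds against the stated fields and split sizes. It rests on several assumptions: that `datasets` is installed, that the repository id matches the title of this page, and that the split keys in `STATED_SIZES` are only labels for the numbers quoted above (the actual split names in the repository may differ). The helper names `check_record`, `summarize`, and `load_foldseek` are hypothetical.

```python
# Split sizes as stated in the description; the keys are illustrative
# labels, not confirmed split names from the repository.
STATED_SIZES = {"train": 17_070_828, "test": 474, "validation": 474}


def check_record(record):
    """Return True if the record has string-valued 'labels' and 'seqs' fields."""
    return all(isinstance(record.get(key), str) for key in ("labels", "seqs"))


def summarize(dataset_dict):
    """Map each split name to its number of examples."""
    return {name: len(split) for name, split in dataset_dict.items()}


def load_foldseek():
    """Load every split of the dataset (assumes `datasets` is installed)."""
    # Imported lazily so the helpers above work without the package.
    from datasets import load_dataset

    # Assumption: the repository id is the one shown in this page's title.
    return load_dataset("lhallee/foldseek_dataset")


if __name__ == "__main__":
    # Note: the training split is large, so this download may take a while.
    ds = load_foldseek()
    print(summarize(ds))
    for name, split in ds.items():
        # Inspect the first record of each split for the two stated fields.
        print(name, check_record(split[0]))
```

The loading step sits behind the `__main__` guard so that the two checking helpers can be reused on any mapping of split names to record sequences without triggering a download.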
Provided by:
lhallee



