proteinglm/ssp_q3
Hugging Face | Updated 2024-11-20 | Indexed 2024-12-14
Download link:
https://hf-mirror.com/datasets/proteinglm/ssp_q3
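Summary:
This dataset is intended for protein secondary structure prediction and contains protein sequences with their corresponding structure labels. Protein secondary structure comprises helices, strands, and various turns; these elements give a protein its specific three-dimensional conformation and are essential to the formation of its tertiary structure. Each protein sequence is annotated with three classes of structural element: H, helix (α-helix, 3-10 helix, and π-helix); E, strand (β-strand and β-bridge); and C, coil (turn, bend, and random coil). The dataset is split into a training set of 10,848 instances and a test set of 667 instances. It is sourced from NetSurfP-2.0 and released under the Apache-2.0 license.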
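The three-class grouping described above can be sketched as a lookup table. This is a minimal illustration, not the dataset's own preprocessing: the single-letter 8-state codes (H, G, I, E, B, T, S, and `-`) follow the common DSSP convention and are an assumption here, since the listing names the structure types but not their codes. The function name `reduce_to_q3` is hypothetical.

```python
# Assumed DSSP-style 8-state codes mapped onto the three classes named above.
# The letters on the left are an assumption; the listing gives only the names.
DSSP8_TO_Q3 = {
    "H": "H",  # alpha-helix
    "G": "H",  # 3-10 helix
    "I": "H",  # pi-helix
    "E": "E",  # beta-strand
    "B": "E",  # beta-bridge
    "T": "C",  # turn
    "S": "C",  # bend
    "-": "C",  # random coil (no assigned structure)
}


def reduce_to_q3(dssp8: str) -> str:
    """Collapse a per-residue 8-state string into the three classes H, E, C."""
    try:
        return "".join(DSSP8_TO_Q3[code] for code in dssp8)
    except KeyError as err:
        raise ValueError(f"unknown 8-state code: {err.args[0]!r}") from None


if __name__ == "__main__":
    # Made-up 8-state string, used only to show the reduction.
    print(reduce_to_q3("-HHHGGTTEEB-S"))  # CHHHHHCCEEECC
```

Keeping the mapping in a plain dictionary makes the grouping easy to audit against the class definitions in the summary.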
The Secondary Structure Prediction (Q3) dataset supports the study of protein secondary structure: each residue is assigned to one of three structural classes, H (Helix), E (Strand), or C (Coil). It is split into a training set of 10,848 instances and a test set of 667 instances. Each instance contains a protein sequence string and a structural label sequence. The data are sourced from NetSurfP-2.0 and licensed under Apache-2.0.
Provider:
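As a rough illustration of the instance layout described above, the sketch below checks that a sequence and its label sequence line up and reports the share of each class. It assumes the labels are stored as the letters H, E, and C, one per residue; the published dataset may encode them differently (for example as integers). The helper names and the example sequence are hypothetical.

```python
from collections import Counter

Q3_CLASSES = ("H", "E", "C")  # helix, strand, coil


def check_instance(sequence: str, labels: str) -> None:
    """Raise ValueError unless there is exactly one valid label per residue."""
    if len(sequence) != len(labels):
        raise ValueError(
            f"length mismatch: {len(sequence)} residues, {len(labels)} labels"
        )
    unknown = set(labels) - set(Q3_CLASSES)
    if unknown:
        raise ValueError(f"unexpected label(s): {sorted(unknown)}")


def class_fractions(labels: str) -> dict:
    """Return the fraction of residues assigned to each of the three classes."""
    counts = Counter(labels)
    total = len(labels)
    return {c: (counts[c] / total if total else 0.0) for c in Q3_CLASSES}


if __name__ == "__main__":
    # Made-up example, not taken from the dataset.
    sequence = "MKTAYIAK"
    labels = "CHHHHEEC"
    check_instance(sequence, labels)
    print(class_fractions(labels))  # {'H': 0.5, 'E': 0.25, 'C': 0.25}
```

A check like this is cheap to run over both splits before training, since a single misaligned instance would silently shift every label after it.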
proteinglm



