genbio-ai/ssp_q3_rag
Source: Hugging Face. Updated 2025-04-09; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/genbio-ai/ssp_q3_rag
Description:
The Secondary Structure Prediction Dataset (Q3) is a fundamental dataset for studying protein secondary structure. Each residue of a protein sequence is labeled with one of three structural classes: H, helix (alpha-helix, 3-10 helix, and pi helix); E, strand (beta-strand and beta-bridge); or C, coil (turns, bends, and random coil). Each instance includes a protein sequence, the matching sequence of structural labels, multiple sequence alignment (MSA) sequences, and structure embeddings (str_emb). The dataset is split into a training set of 10,848 instances and a test set of 667 instances.
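The three-class grouping described above can be sketched as a small lookup table. This is a minimal illustration, not the dataset's own preprocessing: it assumes the fine-grained labels use the conventional DSSP one-letter codes (H, G, I, E, B, T, S, and "-" for coil), which the description does not state, and the names `DSSP8_TO_Q3`, `to_q3`, and `q3_accuracy` are hypothetical helpers introduced here.

```python
# Assumption: 8-state labels follow the usual DSSP one-letter codes.
# The dataset description only names the structural elements, not their encoding.
DSSP8_TO_Q3 = {
    "H": "H",  # alpha-helix
    "G": "H",  # 3-10 helix
    "I": "H",  # pi helix
    "E": "E",  # beta-strand
    "B": "E",  # beta-bridge
    "T": "C",  # turn
    "S": "C",  # bend
    "-": "C",  # coil / no assigned structure
}


def to_q3(dssp8: str) -> str:
    """Collapse an 8-state label string into the three classes H, E, C.

    Unrecognized characters are treated as coil (an illustrative choice).
    """
    return "".join(DSSP8_TO_Q3.get(ch, "C") for ch in dssp8)


def q3_accuracy(predicted: str, reference: str) -> float:
    """Fraction of residues whose predicted class matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("label strings must have the same length")
    if not reference:
        raise ValueError("label strings must not be empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


if __name__ == "__main__":
    print(to_q3("HHGGIEEBTS-"))
    print(q3_accuracy("HHHEC", "HHHCC"))
```

Per-residue accuracy over the three classes is the metric usually reported for Q3 tasks, which is why the sketch pairs the mapping with `q3_accuracy`.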
Provider:
genbio-ai



