AI4Protein/ssp_q3
Source: Hugging Face. Updated 2025-11-21. Indexed 2026-01-03.
Download link:
https://hf-mirror.com/datasets/AI4Protein/ssp_q3
Resource description:
---
license: apache-2.0
task_categories:
- token-classification
tags:
- chemistry
- biology
---
Dataset Summary
The study of a protein's secondary structure, the subject of the secondary structure prediction (Sec. Struc. P.) task, is fundamental to understanding its biological function. Secondary structure, made up of helices, strands, and various turns, gives the protein a specific three-dimensional configuration and is critical to the formation of its tertiary structure. In this dataset, each residue of a protein sequence is assigned one of three classes: H - Helix (alpha-helix, 3-10 helix, and pi helix), E - Strand (beta-strand and beta-bridge), C - Coil (turns, bends, and random coils).
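The three-class grouping described above can be sketched as a lookup table. This is an illustrative sketch only: it assumes the finer-grained states follow the usual DSSP one-letter codes (H, G, I, E, B, T, S, and C or "-" for coil), which the dataset card does not state, and the names `DSSP_TO_Q3` and `reduce_to_q3` are hypothetical.

```python
# Assumption: fine-grained states use DSSP one-letter codes.
# The dataset card itself only names the structure types, not the codes.
DSSP_TO_Q3 = {
    "H": "H",  # alpha-helix
    "G": "H",  # 3-10 helix
    "I": "H",  # pi helix
    "E": "E",  # beta-strand
    "B": "E",  # beta-bridge
    "T": "C",  # turn
    "S": "C",  # bend
    "C": "C",  # coil
    "-": "C",  # coil / no assignment
}


def reduce_to_q3(states: str) -> str:
    """Map a string of fine-grained states to the three classes H, E, C."""
    try:
        return "".join(DSSP_TO_Q3[s] for s in states)
    except KeyError as err:
        raise ValueError(f"unknown state code: {err.args[0]!r}") from None


if __name__ == "__main__":
    print(reduce_to_q3("-HHHHGGGTTEEEEBS-"))
```

The mapping keeps one output character per input character, so the reduced labels stay aligned with the residues they describe.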
Data Fields
seq: a string containing the protein sequence
label: a sequence containing one structural label per residue of seq.
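Because `label` holds one entry per residue, a record can be sanity-checked by comparing lengths. The sketch below is a minimal illustration under stated assumptions: the helper names `check_record` and `load_split` are hypothetical, the split name `"train"` is a guess, the toy record is invented, and the card does not say whether labels are stored as integers or characters. Loading needs the third-party `datasets` package and network access, so it is kept in a separate function that is not called here.

```python
from typing import Any, Mapping


def check_record(record: Mapping[str, Any]) -> int:
    """Return the residue count if `seq` and `label` are aligned."""
    seq = record["seq"]
    label = record["label"]
    if not isinstance(seq, str):
        raise TypeError("seq is expected to be a string")
    if len(seq) != len(label):
        raise ValueError(
            f"seq has {len(seq)} residues but label has {len(label)} entries"
        )
    return len(seq)


def load_split(split: str = "train"):
    """Load one split from the Hub (split name is an assumption)."""
    from datasets import load_dataset  # third-party; imported lazily

    return load_dataset("AI4Protein/ssp_q3", split=split)


if __name__ == "__main__":
    # Invented toy record; the integer label encoding is hypothetical.
    toy = {"seq": "MKTAY", "label": [2, 0, 0, 1, 2]}
    print(check_record(toy))
```

Running `check_record` over every row returned by `load_split` would be one way to confirm the per-residue alignment before training a token-classification model.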
Original Dataset Name: biomap-research/ssp_q3
Original Author / Organization: Biomap
Original URL: https://huggingface.co/datasets/biomap-research/ssp_q3
Original License: Apache License 2.0
No changes were made to the data except for the column name. All credit and rights belong to the original authors.
Provider:
AI4Protein



