21f1002947/21f1002947-nppe-2-dataset
Source: Hugging Face. Updated: 2025-12-16. Indexed: 2025-12-20.
Download link:
https://hf-mirror.com/datasets/21f1002947/21f1002947-nppe-2-dataset
Description:
This dataset contains protein amino acid sequences paired with their secondary structure annotations. Each amino acid residue in a sequence is labeled using both Q8 (8-class) and Q3 (3-class) secondary structure schemes, making the dataset suitable for sequence labeling and token classification tasks in computational biology. The dataset is designed for training and evaluating deep learning models such as BiLSTM, CRF, and Transformer-based architectures for protein structure prediction.
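To illustrate how the two label schemes relate, here is a minimal sketch that collapses a per-residue Q8 string to Q3 and scores a prediction per residue. It assumes the widely used DSSP reduction (H, G, I to helix; E, B to strand; everything else to coil). The listing does not say which reduction this dataset uses or how its labels are stored, so the mapping table and the function names `q8_to_q3` and `per_residue_accuracy` are hypothetical.

```python
# Assumed mapping: the common DSSP 8-state -> 3-state reduction.
# The dataset's own Q8 -> Q3 convention is not stated in the listing.
Q8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",   # helix classes -> H
    "E": "E", "B": "E",             # strand/bridge -> E
    "T": "C", "S": "C", "C": "C",   # turn, bend, coil -> C
    "L": "C", "-": "C",             # alternative coil symbols seen in some files
}


def q8_to_q3(q8_labels: str) -> str:
    """Collapse a per-residue Q8 label string into Q3 labels."""
    return "".join(Q8_TO_Q3[label] for label in q8_labels)


def per_residue_accuracy(predicted: str, reference: str) -> float:
    """Fraction of residues whose predicted label matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("label strings must have the same length")
    if not reference:
        raise ValueError("label strings must be non-empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


if __name__ == "__main__":
    q8 = "HGIEBTSC"
    print(q8_to_q3(q8))
    print(per_residue_accuracy("HHEC", "HHCC"))
```

Because every residue carries exactly one label in each scheme, the label string always has the same length as the amino acid sequence, which is what makes the task a token classification problem.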
Provided by:
21f1002947



