End-to-End Deep Learning Model to Predict and Design Secondary Structure Content of Structural Proteins
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://figshare.com/articles/dataset/End-to-End_Deep_Learning_Model_to_Predict_and_Design_Secondary_Structure_Content_of_Structural_Proteins/19131835
Resource description:
Structural proteins are the basis of many biomaterials and key construction and functional components of all life. Further, it is well-known that the diversity of proteins’ function relies on their local structures derived from their primary amino acid sequences. Here, we report a deep learning model to predict the secondary structure content of proteins directly from primary sequences, with high computational efficiency. Understanding the secondary structure content of proteins is crucial to designing proteins with targeted material functions, especially mechanical properties. Using convolutional and recurrent architectures and natural language models, our deep learning model predicts the content of two essential types of secondary structures, the α-helix and the β-sheet. The training data are collected from the Protein Data Bank and contain many existing protein geometries. We find that our model can learn the hidden features as patterns of input sequences that can then be directly related to secondary structure content. The α-helix and β-sheet content predictions show excellent agreement with training data and newly deposited protein structures that were recently identified and that were not included in the original training set. We further demonstrate the features of the model by a search for de novo protein sequences that optimize max/min α-helix/β-sheet content and compare the predictions with folded models of these sequences based on AlphaFold2. Excellent agreement is found, underscoring that our model has predictive potential for rapidly designing proteins with specific secondary structures and could be widely applied to biomedical industries, including protein biomaterial designs and regenerative medicine applications.
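
To make the prediction target concrete, the following minimal Python sketch computes α-helix and β-sheet content as the fraction of residues assigned to each class in a per-residue secondary structure string. It is an illustration only, not the authors' pipeline: the function name `secondary_structure_content` is hypothetical, and the choice of DSSP-style codes (`H` counted as α-helix, `E` counted as β-sheet) is an assumption that the description above does not specify.

```python
from typing import Dict

# Assumption: DSSP-style one-letter codes, with only "H" counted as
# alpha-helix and only "E" counted as beta-sheet. The dataset's actual
# labeling scheme may differ.
HELIX_CODES = {"H"}
SHEET_CODES = {"E"}


def secondary_structure_content(ss: str) -> Dict[str, float]:
    """Return the fraction of residues labeled helix and sheet.

    `ss` holds one secondary structure code per residue, e.g. "HHHHEE--CC".
    """
    if not ss:
        raise ValueError("secondary structure string must not be empty")
    n = len(ss)
    helix = sum(1 for code in ss if code in HELIX_CODES)
    sheet = sum(1 for code in ss if code in SHEET_CODES)
    return {"helix": helix / n, "sheet": sheet / n}


if __name__ == "__main__":
    # Toy 10-residue assignment: 4 helix residues, 2 sheet residues.
    print(secondary_structure_content("HHHHEE--CC"))
```

A sequence-to-content model of the kind described would take the amino acid sequence as input and regress these two fractions as its output.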
Created:
2022-02-07



