five

End-to-End Protein Normal Mode Frequency Predictions Using Language and Graph Models and Application to Sonification

收藏
Figshare2026-04-28 收录
下载链接:
https://figshare.com/articles/dataset/End-to-End_Protein_Normal_Mode_Frequency_Predictions_Using_Language_and_Graph_Models_and_Application_to_Sonification/21610507
下载链接
链接失效反馈
官方服务:
资源简介:
The prediction of mechanical and dynamical properties of proteins is an important frontier, especially given the greater availability of proteins structures. Here we report a series of models that provide end-to-end predictions of nanodynamical properties of proteins, focused on high-throughput normal mode predictions directly from the amino acid sequence. Using neural network models within the family of Natural Language Processing and graph-based methods, we offer atomistically based mechanistic predictions of key protein mechanical features. The models include an end-to-end long short-term memory (LSTM) model, an end-to-end transformer model, a graph-based transformer model, and an equivariant graph neural network. All four models show exceptional performance, with the graph-based transformer architecture offering the best results but at the cost of requiring a graph structure as input. Conversely, the LSTM and transformer models offer end-to-end sequence-to-property prediction capabilities, providing efficient avenues for protein engineering, analysis, and design. We compare our results against published data based on a Principal Neighborhood Aggregation graph neural network, revealing that the transformer model offers better performance while also being able to predict a large set of the first 64 normal mode frequencies, simultaneously. The use of the end-to-end transformer model may facilitate other downstream applications through the use of transfer learning, and it offers a comprehensive prediction of dynamical properties without any structural knowledge, directly from the amino acid sequence. We demonstrate a potential application in scientific sonification, where the normal mode frequencies are transposed to generate audible signals for a detailed analysis of subtle changes of protein sequences.
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作