Extracting Residue Solvent Exposure from Covalent Labeling Data with Machine Learning: A Hybrid Approach for Protein Structure Prediction
Source: Figshare. Indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Extracting_Residue_Solvent_Exposure_from_Covalent_Labeling_Data_with_Machine_Learning_A_Hybrid_Approach_for_Protein_Structure_Prediction/29115275
Resource description:
Hydroxyl radical protein footprinting (HRPF) coupled with mass spectrometry yields information about residue solvent exposure and protein topology. However, data from these experiments are sparse and require computational interpretation to generate useful structural insight. We previously implemented a Rosetta algorithm that uses experimental HRPF data to improve protein structure prediction. Modern structure prediction methods, such as AlphaFold2 (AF2), use machine learning (ML) to generate their predictions. Implementation of an HRPF-guided version of AF2 is challenging due to the substantial amount of training data required and the inherently abstract nature of ML networks. Thus, here we present a hybrid method that uses a light gradient boosting machine to predict residue solvent accessibility from experimental HRPF data. These predictions were subsequently used to improve Rosetta structure prediction. Our hybrid approach identified models with atomic-level detail for all four proteins in our benchmark set. These results illustrate that it is possible to successfully use ML in combination with HRPF data to accurately predict protein structures.



