
Determination of Protein Secondary Structure from Infrared Spectra Using Partial Least-Squares Regression

Source: NIAID Data Ecosystem (indexed 2026-03-09)
Download link:
https://figshare.com/articles/dataset/Determination_of_Protein_Secondary_Structure_from_Infrared_Spectra_Using_Partial_Least-Squares_Regression/3467180
Description:
Infrared (IR) spectra contain substantial information about protein structure. This has previously most often been exploited by using known band assignments. Here, we convert spectral intensities in bins within Amide I and II regions to vectors and apply machine learning methods to determine protein secondary structure. Partial least squares was performed on spectra of 90 proteins in H2O. After preprocessing and removal of outliers, 84 proteins were used for this work. Standard normal variate and second-derivative preprocessing methods on the combined Amide I and II data generally gave the best performance, with root-mean-square values for prediction of ∼12% for α-helix, ∼7% for β-sheet, 7% for antiparallel β-sheet, and ∼8% for other conformations. Analysis of Fourier transform infrared (FTIR) spectra of 16 proteins in D2O showed that secondary structure determination was slightly poorer than in H2O. Interval partial least squares was used to identify the critical regions within spectra for secondary structure prediction and showed that the sides of bands were most valuable, rather than their peak maxima. In conclusion, we have shown that multivariate analysis of protein FTIR spectra can give α-helix, β-sheet, other, and antiparallel β-sheet contents with good accuracy, comparable to that of circular dichroism, which is widely used for this purpose.
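The two preprocessing steps named in the description, and the error metric used to report results, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names (`snv`, `second_derivative`, `rmse`) and the toy intensity values are hypothetical, the bins are assumed to be evenly spaced, and a plain central finite difference stands in for whatever derivative filter the study actually used (Savitzky-Golay smoothing is a common choice in practice, but the description does not say). The partial least-squares regression step itself is not shown.

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean
    and scale it by its own sample standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in spectrum]


def second_derivative(spectrum):
    """Central finite-difference second derivative.
    Assumes evenly spaced bins with unit spacing; the result is two
    points shorter than the input because the end points are dropped."""
    return [
        spectrum[i - 1] - 2 * spectrum[i] + spectrum[i + 1]
        for i in range(1, len(spectrum) - 1)
    ]


def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference values."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(mean(squared))


# Hypothetical binned intensities for a single band (illustrative only).
toy_spectrum = [0.10, 0.20, 0.50, 0.90, 0.50, 0.20, 0.10]

# SNV first, then the second derivative, giving one feature vector
# that a regression model could be trained on.
features = second_derivative(snv(toy_spectrum))
```

In a second-derivative spectrum a band maximum appears as a negative minimum, which is why the most negative entry of `features` lines up with the peak of `toy_spectrum`. Applying `rmse` to predicted and reference secondary-structure percentages gives figures in the same units as the values quoted in the description.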
Created:
2016-07-06