iGlu_AdaBoost: Identification of Lysine Glutarylation Using the AdaBoost Classifier
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/iGlu_AdaBoost_Identification_of_Lysine_Glutarylation_Using_the_AdaBoost_Classifier/13133155
Description:
Lysine glutarylation is a newly reported post-translational modification
(PTM) that plays significant roles in regulating metabolic and mitochondrial
processes. Accurate identification of protein glutarylation is the
first step toward better understanding its molecular functions and
applications. Because traditional biological sequencing techniques are
time-consuming and expensive, and protein data are growing explosively,
building precise computational models to rapidly identify glutarylation
is a popular and feasible solution. In this work, we proposed a novel AdaBoost-based
predictor called iGlu_AdaBoost to distinguish glutarylation from non-glutarylation
sequences. Here, the top 37 features were chosen from a total of 1768
combined features (188D, the composition of k-spaced amino acid pairs
(CKSAAP), and enhanced amino acid composition (EAAC)) using Chi2 ranking
followed by incremental feature selection (IFS) to build the model.
With the help of the hybrid-sampling method SMOTE-Tomek,
the AdaBoost algorithm achieved satisfactory recall, specificity,
and AUC values of 87.48%, 72.49%, and 0.89 under 10-fold cross-validation
and 72.73%, 71.92%, and 0.63 on the independent test set, respectively.
Further feature analysis suggested that the positively charged amino acids
arginine (R) and lysine (K) play critical roles in glutarylation recognition.
Our model showed good generalization ability and consistent prediction
results for positive and negative samples, making it comparable to four
published tools. The proposed predictor is an efficient tool for finding
potential glutarylation sites and provides helpful suggestions for
further research on glutarylation mechanisms and the treatment of
related diseases.
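
To make one of the feature types named in the description concrete, the following is a minimal sketch of CKSAAP encoding as it is commonly defined: for each gap size k, the frequency of every ordered pair of residues separated by k positions. It is not the authors' code. The description does not state the peptide window length or the range of k used, so the `k_max` parameter, the `cksaap` function name, the feature naming scheme, and the example peptide are all assumptions made for illustration.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cksaap(sequence, k_max=2):
    """Composition of k-spaced amino acid pairs for one peptide.

    For each gap k in 0..k_max, count every ordered residue pair whose
    members are separated by exactly k positions, then divide by the
    number of valid pairs found for that k. Returns a dict with
    400 * (k_max + 1) entries. k_max=2 is an assumed default, not a
    value taken from the description.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = {}
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(sequence) - k - 1):
            pair = sequence[i] + sequence[i + k + 1]
            # Pairs with padding or non-standard residues are skipped.
            if pair in counts:
                counts[pair] += 1
                total += 1
        for pair in pairs:
            name = f"{pair}.gap{k}"  # hypothetical naming scheme
            features[name] = counts[pair] / total if total else 0.0
    return features


# Hypothetical five-residue peptide, for illustration only.
example = cksaap("AKRKA", k_max=1)
print(len(example))        # 800 features: 400 pairs for each of two gap sizes
print(example["AK.gap0"])  # 0.25: one of four adjacent pairs
```

In a pipeline like the one described, vectors of this kind would be concatenated with the other feature types before feature selection, resampling, and classification.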
Created:
2020-10-22



