ToxTeller: Predicting Peptide Toxicity Using Four Different Machine Learning Approaches
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/ToxTeller_Predicting_Peptide_Toxicity_Using_Four_Different_Machine_Learning_Approaches/26255005
Resource description:
Examining the toxicity of peptides is essential for therapeutic
peptide-based drug design. Machine learning approaches are frequently
used to develop highly accurate peptide toxicity predictors.
In this paper, we present ToxTeller, which provides four predictors
using logistic regression, support vector machines, random forests,
and XGBoost, respectively. For prediction model development, we construct
a data set of toxic and nontoxic peptides from SwissProt and ConoServer
databases with existence evidence levels checked. We also fully utilize
the protein annotation in SwissProt to collect more toxic peptides
than using keyword search alone. From this data set, we construct
an independent test data set that shares at most 40% sequence similarity
within itself and with the training data set. From a comprehensive
list of 28 feature combinations, we conduct 10-fold cross-validation
on the training data set to determine the optimized feature combination
for model development. ToxTeller’s performance is evaluated
and compared with existing predictors on the independent test data
set. Since toxic peptides must be avoided for drug design, we analyze
strategies for reducing false-negative predictions of toxic peptides
and suggest selecting models by top sensitivity instead of the widely
used Matthews correlation coefficient (MCC), as well as using a
meta-predictor approach that combines multiple predictors.
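
The two suggestions at the end of the description can be illustrated with a minimal Python sketch. It is not ToxTeller's code: the labels, the three model outputs, the model names, and the "flag as toxic if any predictor says toxic" combination rule are all hypothetical, chosen only to show how ranking by sensitivity can pick a different model than ranking by MCC, and how combining predictors can recover a toxic peptide that one predictor misses.

```python
from math import sqrt


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 = toxic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(y_true, y_pred):
    """Fraction of truly toxic peptides that were predicted toxic."""
    tp, _, _, fn = confusion(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when it is undefined."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def meta_predict(predictions):
    """Assumed combination rule: toxic if ANY predictor says toxic."""
    return [int(any(votes)) for votes in zip(*predictions)]


# Hypothetical toy data: 4 toxic peptides followed by 4 nontoxic ones.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
models = {
    "model_a": [1, 1, 1, 0, 0, 0, 0, 0],  # misses one toxic peptide
    "model_b": [1, 1, 1, 1, 1, 1, 0, 0],  # catches all, two false alarms
    "model_c": [0, 0, 0, 1, 0, 0, 0, 0],  # catches only the fourth one
}

best_by_mcc = max(models, key=lambda name: mcc(y_true, models[name]))
best_by_sens = max(models, key=lambda name: sensitivity(y_true, models[name]))

for name, pred in models.items():
    print(name, round(sensitivity(y_true, pred), 3), round(mcc(y_true, pred), 3))
print("best by MCC:", best_by_mcc)
print("best by sensitivity:", best_by_sens)

# Combining model_a and model_c covers the peptide that model_a missed.
combined = meta_predict([models["model_a"], models["model_c"]])
print("combined sensitivity:", sensitivity(y_true, combined))
```

On this toy data, MCC favors the model that misses a toxic peptide, while sensitivity favors the model with no false negatives. The any-vote rule trades extra false positives for fewer missed toxic peptides in general; other combination rules, such as majority voting, would behave differently.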
Created:
2024-07-11



