Table 1_StackPVP: a stacked ensemble classification framework for predicting phage virion proteins using integrated evolutionary features.docx
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/Table_1_StackPVP_a_stacked_ensemble_classification_framework_for_predicting_phage_virion_proteins_using_integrated_evolutionary_features_docx/31798870
Description:
Phage therapy has attracted increasing attention as a potential antibacterial strategy. Phage virion proteins (PVPs) are vital for recognizing host cells and binding to their surface receptors, so accurate PVP identification is essential for developing antibacterial agents. In this study, we introduced StackPVP, a computational approach that integrates ensemble learning methods with evolutionary features to improve PVP identification accuracy. First, position-specific scoring matrices (PSSMs) were derived from protein sequences, and additional features were extracted from them, including Amino Acid Composition derived from PSSM (AAC-PSSM), Dipeptide Composition derived from PSSM (DPC-PSSM), Pseudo Position-Specific Scoring Matrix (Pse-PSSM), and PSSM Composition (PSSM-COM). Next, three feature selection methods were employed to determine the optimal feature subset, which was then supplied to 12 base machine learning classifiers. Three meta-classifier algorithms were compared: logistic regression (LR), random forest (RF), and support vector machine (SVM). Random forest achieved the best overall performance and was selected as the meta-classifier in the final stacking model. On the evaluated test dataset, StackPVP achieved an area under the curve (AUC) of 94.26% and showed improved specificity compared with several representative existing methods. These results suggest that StackPVP provides a complementary computational approach for PVP identification and may assist in phage genome annotation and related research.
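As an illustration of the PSSM-derived features named in the description, the following is a minimal sketch of AAC-PSSM and DPC-PSSM under their commonly used definitions: AAC-PSSM as the per-column mean of an L x 20 PSSM, and DPC-PSSM as the mean product of column i at position k and column j at position k+1. The description does not state the exact formulas or any PSSM normalization used by StackPVP, so these definitions, the function names aac_pssm and dpc_pssm, and the toy matrix are all assumptions made for illustration only.

```python
from typing import List

Matrix = List[List[float]]  # one row per residue, one column per amino acid


def aac_pssm(pssm: Matrix) -> List[float]:
    """Assumed AAC-PSSM: average each PSSM column over all residues."""
    if not pssm:
        raise ValueError("PSSM must have at least one row")
    n_rows = len(pssm)
    n_cols = len(pssm[0])
    return [sum(row[j] for row in pssm) / n_rows for j in range(n_cols)]


def dpc_pssm(pssm: Matrix) -> List[float]:
    """Assumed DPC-PSSM: mean of pssm[k][i] * pssm[k + 1][j] over adjacent rows."""
    if len(pssm) < 2:
        raise ValueError("PSSM must have at least two rows")
    n_cols = len(pssm[0])
    n_pairs = len(pssm) - 1
    features = []
    for i in range(n_cols):
        for j in range(n_cols):
            total = sum(pssm[k][i] * pssm[k + 1][j] for k in range(n_pairs))
            features.append(total / n_pairs)
    return features


# Hypothetical toy matrix with 3 residues and 2 columns (a real PSSM has 20 columns).
toy_pssm = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(aac_pssm(toy_pssm))  # [3.0, 4.0]
print(dpc_pssm(toy_pssm))  # [9.0, 11.0, 13.0, 16.0]
```

For a real 20-column PSSM, these functions return 20 and 400 values respectively, giving every protein a fixed-length feature vector regardless of sequence length.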
Created:
2026-03-18



