Unsupervised Machine Learning Organization of the Functional Dark Proteome of Gram-Negative “Superbugs”: Six Protein Clusters Amenable for Distinct Scientific Applications
收藏Figshare2026-04-28 收录
下载链接:
https://figshare.com/articles/dataset/Unsupervised_Machine_Learning_Organization_of_the_Functional_Dark_Proteome_of_Gram-Negative_Superbugs_Six_Protein_Clusters_Amenable_for_Distinct_Scientific_Applications/21685480
下载链接
链接失效反馈官方服务:
资源简介:
Uncharacterized proteins have been underutilized as targets for the development of novel therapeutics for difficult-to-treat bacterial infections. To facilitate the exploration of these proteins, 2819 predicted, uncharacterized proteins (19.1% of the total) from reference strains of multidrug Acinetobacter baumannii, Klebsiella pneumoniae, and Pseudomonas aeruginosa species were organized using an unsupervised k-means machine learning algorithm. Classification using normalized values for protein length, pI, hydrophobicity, degree of conservation, structural disorder, and %AT of the coding gene rendered six natural clusters. Cluster proteins showed different trends regarding operon membership, expression, presence of unknown function domains, and interactomic relevance. Clusters 2, 4, and 5 were enriched with highly disordered proteins, nonworkable membrane proteins, and likely spurious proteins, respectively. Clusters 1, 3, and 6 showed closer distances to known antigens, antibiotic targets, and virulence factors. Up to 21.8% of proteins in these clusters were structurally covered by modeling, which allowed assessment of druggability and discontinuous B-cell epitopes. Five proteins (4 in Cluster 1) were potential druggable targets for antibiotherapy. Eighteen proteins (11 in Cluster 6) were strong B-cell and T-cell immunogen candidates for vaccine development. Conclusively, we provide a feature-based schema to fractionate the functional dark proteome of critical pathogens for fundamental and biomedical purposes.



