PlasmoFP: leveraging deep learning to predict protein function of uncharacterized proteins across the malaria parasite genus
Source: Figshare | Updated: 2025-09-11 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/_b_PlasmoFP_leveraging_deep_learning_to_predict_protein_function_of_uncharacterized_proteins_across_the_malaria_parasite_genus_b_/30100396
Description:
Malaria, caused by parasites in the genus Plasmodium, remains a global health burden, with thousands of deaths and millions of infections each year. Sequencing of the Plasmodium falciparum genome in 2002 jump-started functional studies, but a large fraction of all predicted proteins remains poorly characterized. Here, we introduce Plasmodium Function Predictor (PlasmoFP), a set of deep learning models designed specifically for the genus Plasmodium. PlasmoFP models are trained on structure-function relationships of proteins from the phylogenetically relevant SAR (Stramenopiles, Alveolates, and Rhizaria) supergroup to predict Gene Ontology terms for partially annotated and ‘unknown function’ proteins of 19 Plasmodium species. Our structure-focused approach addresses a long-standing challenge in annotating Plasmodium proteins: their low sequence similarity to well-characterized proteins. PlasmoFP models estimate epistemic uncertainty, control false discovery rates in model predictions, and outperform existing methods. By integrating PlasmoFP false discovery rate-controlled predictions with existing annotations for Plasmodium proteins, we reduced the proportion of unannotated proteins from 15-59% across species to 3-28% and increased the proportion of fully annotated proteins from 7-42% to 36-68%, improving proteome-wide annotation completeness across the genus. Together, PlasmoFP predictions help advance basic Plasmodium research, which can aid progress toward global malaria elimination.
Created:
2025-09-11



