P4PP: a universal shotgun proteomics data analysis pipeline for virus identification
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://www.omicsdi.org/dataset/pride/PXD057159
Description:
Humans can be infected by a wide variety of virus species. To detect these viruses, a data analysis approach for shotgun proteomic data was developed. For this Proteome for Pandemic Preparedness (P4PP) pipeline, a database (P4PPv01) and a web application (P4PP) were constructed. The P4PP pipeline enables the identification of 1896 virus species from the 32 virus families in which at least one human-infectious virus has been described. P4PP was evaluated using datasets of cell-cultivated viruses that were generated at different institutes, measured on different instruments, and prepared with different sample preparation methods. In total, 174 MS datasets were analyzed: 160 from trypsin-digested proteins of virus-infected cell lines and 14 from non-infected cell lines. Of the 160 infected samples, 146 were correctly identified at the species level, and an additional 4 were identified at the family level. In the remaining 10 samples, no virus was detected. However, for all 10 of these samples, follow-up samples from the same time series tested positive, indicating that the number of virus-derived peptides was too low in the samples obtained at the start of the experiment. Furthermore, the results show that Influenza A or SARS-CoV-2 can be subtyped if enough peptides of the virus are identified. Shotgun proteomics, in combination with the developed data analysis approach, can identify all types of virus species after cultivation in a cell line. Implementing this virus proteome analysis capability in viral diagnostic laboratories will likely improve their ability to cope with unexpected, mutated, or re-emerging viruses.
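The sample counts reported in the description can be cross-checked and converted to identification rates with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical and not part of the P4PP pipeline, and the only inputs are the counts stated in the description.

```python
# Counts as stated in the dataset description (not taken from any P4PP output file).
infected_samples = 160       # virus-infected cell line digests
non_infected_samples = 14    # non-infected control digests
species_level_hits = 146     # correctly identified at the species level
family_level_hits = 4        # identified only at the family level
not_detected = 10            # no virus detected

# Consistency checks: the categories must add up to the reported totals.
assert species_level_hits + family_level_hits + not_detected == infected_samples
assert infected_samples + non_infected_samples == 174


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


species_rate = percent(species_level_hits, infected_samples)
family_or_better_rate = percent(species_level_hits + family_level_hits, infected_samples)
missed_rate = percent(not_detected, infected_samples)

print(f"Species-level identification: {species_rate}%")
print(f"Family-level or better:       {family_or_better_rate}%")
print(f"No virus detected:            {missed_rate}%")
```

Under these counts, roughly nine in ten infected samples were resolved to species level; the undetected samples are the early time-series points that the description attributes to low viral peptide abundance.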
Created:
2025-05-28



