
Data_Sheet_1_In silico Identification of Serovar-Specific Genes for Salmonella Serotyping.PDF

Source: frontiersin.figshare.com | Updated: 2023-06-02 | Indexed: 2025-01-15
Download link:
https://frontiersin.figshare.com/articles/dataset/Data_Sheet_1_In_silico_Identification_of_Serovar-Specific_Genes_for_Salmonella_Serotyping_PDF/8032640/1
Description:
Salmonella enterica subspecies enterica is a highly diverse subspecies with more than 1500 serovars, and the ability to distinguish serovars within this group is vital for surveillance. With the development of whole-genome sequencing technology, serovar prediction by traditional serotyping is being replaced by molecular serotyping. Existing in silico serovar prediction approaches utilize surface antigen encoding genes, core genome MLST, and serovar-specific gene markers or DNA fragments for serotyping. However, these serovar-specific gene markers or DNA fragments distinguish only a small number of serovars. In this study, we compared 2258 Salmonella accessory genomes to identify 414 candidate serovar-specific or lineage-specific gene markers for 106 serovars, including 24 polyphyletic serovars and the paraphyletic serovar Enteritidis. A combination of several lineage-specific gene markers can be used for the clear identification of the polyphyletic serovars and the paraphyletic serovar. We designed and evaluated an in silico serovar prediction approach by screening 1089 genomes representing 106 serovars against a set of 131 serovar-specific gene markers. The presence or absence of one or more serovar-specific gene markers was used to predict the serovar of an isolate from genomic data. We show that serovar-specific gene markers have comparable accuracy to other in silico serotyping methods, with 84.8% of isolates assigned to the correct serovar with no false positives (FP) or false negatives (FN) and 10.5% of isolates assigned to a small subset of serovars containing the correct serovar with varied FP. Combined, 95.3% of genomes were correctly assigned to a serovar. This approach would be useful as diagnosis moves to culture-independent and metagenomic methods, as well as providing a third alternative to confirm other genome-based analyses.
The identification of a set of gene markers may also be useful in the development of more cost-effective molecular assays designed to detect specific gene markers of all the major serovars in a region. These assays would be useful in serotyping isolates where cultures are no longer obtained and traditional serotyping is therefore impossible.
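
The presence/absence logic described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the marker names, the marker-to-serovar table, and the function `predict_serovars` are hypothetical, and the sketch assumes that the set of genes detected in a genome has already been determined by some upstream step (for example, a sequence similarity search), which is not shown here.

```python
from typing import Dict, List, Set

# Hypothetical marker table (illustrative names only, not from the data sheet):
# each serovar maps to the gene markers whose presence indicates that serovar.
SEROVAR_MARKERS: Dict[str, Set[str]] = {
    "Typhimurium": {"marker_A"},
    "Enteritidis": {"marker_B", "marker_C"},
    "Infantis": {"marker_C"},
}


def predict_serovars(
    genes_present: Set[str],
    markers: Dict[str, Set[str]] = SEROVAR_MARKERS,
) -> List[str]:
    """Return every serovar that has at least one marker present in the genome.

    One result   -> an unambiguous call.
    Several      -> a small candidate subset that may contain the true serovar.
    Empty result -> no marker found, so no serovar is assigned.
    """
    return sorted(
        serovar
        for serovar, marker_set in markers.items()
        if marker_set & genes_present  # non-empty intersection = marker present
    )


if __name__ == "__main__":
    print(predict_serovars({"marker_A", "unrelated_gene"}))
    print(predict_serovars({"marker_C"}))
    print(predict_serovars(set()))
```

In this sketch a marker shared between two serovars yields a candidate subset rather than a single call, which loosely mirrors the two outcome categories reported in the description (a single correct serovar versus a small subset containing it).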

Provider: Frontiers