
Table_2_Comparative Analysis of Tools and Approaches for Source Tracking Listeria monocytogenes in a Food Facility Using Whole-Genome Sequence Data.XLSX

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/Table_2_Comparative_Analysis_of_Tools_and_Approaches_for_Source_Tracking_Listeria_monocytogenes_in_a_Food_Facility_Using_Whole-Genome_Sequence_Data_XLSX/8100191
Description:
As whole-genome sequencing (WGS) is increasingly used by the food industry to characterize pathogen isolates, users are challenged by the variety of analysis approaches available, ranging from methods that require extensive bioinformatics expertise to commercial software packages. This study aimed to assess the impact of analysis pipelines (i.e., different hqSNP pipelines, a cg/wgMLST pipeline) and of reference genome selection on analysis results (i.e., hqSNP and allelic differences as well as tree topologies) and on the conclusions drawn. For these comparisons, whole genome sequences were obtained for 40 Listeria monocytogenes isolates collected over 18 years from a cold-smoked salmon facility and 2 other isolates obtained from different facilities as part of academic research activities; WGS data were analyzed with three hqSNP pipelines and two MLST pipelines. After initial clustering using a k-mer based approach, hqSNP pipelines were run using two types of reference genomes: (i) closely related closed genomes (“closed references”) and (ii) high-quality de novo assemblies of the dataset isolates (“draft references”). All hqSNP pipelines identified similar hqSNP difference ranges among isolates in a given cluster; use of different reference genomes had minimal impact on the hqSNP differences identified between isolate pairs. Allelic differences obtained by wgMLST showed ranges similar to the hqSNP differences among isolates in a given cluster; cgMLST consistently showed fewer differences than wgMLST. However, phylogenetic trees and dendrograms based on hqSNP and cg/wgMLST data did show some incongruences, typically linked to clades supported by low bootstrap values in the trees.
When a hqSNP cutoff was used to classify isolates as “related” or “unrelated,” use of different pipelines yielded a considerable number of discordances; this finding suggests that cutoff values are valuable as a starting point for an investigation, but that supporting and epidemiological evidence should be used to interpret WGS data. Overall, our data suggest that cgMLST-based data analyses provide appropriate subtype differentiation and can be used without the need for preliminary data analyses (e.g., k-mer based clustering) or external closed reference genomes, simplifying data analysis needs. hqSNP or wgMLST analyses can be performed on the isolate clusters identified by cgMLST to increase the precision in determining the genomic similarity between isolates.
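
To make the cutoff-discordance point concrete, here is a minimal Python sketch of how one might count isolate pairs that two pipelines classify differently under a shared hqSNP cutoff. Everything in it is an assumption for illustration: the isolate names, the pairwise difference values, the cutoff of 20, and the helper names `classify_pairs` and `discordant_pairs` are hypothetical and are not taken from the study or from any of its pipelines.

```python
def classify_pairs(distances, cutoff):
    """Label each isolate pair 'related' if its difference count is <= cutoff.

    `distances` maps an (isolate, isolate) tuple to a pairwise hqSNP
    (or allelic) difference count. The <= convention is an assumption.
    """
    return {
        pair: ("related" if diff <= cutoff else "unrelated")
        for pair, diff in distances.items()
    }


def discordant_pairs(dist_a, dist_b, cutoff):
    """Return the sorted pairs that the two pipelines classify differently."""
    calls_a = classify_pairs(dist_a, cutoff)
    calls_b = classify_pairs(dist_b, cutoff)
    shared = calls_a.keys() & calls_b.keys()  # only pairs both pipelines report
    return sorted(pair for pair in shared if calls_a[pair] != calls_b[pair])


# Hypothetical pairwise differences from two pipelines (not real study data).
pipeline_a = {("iso1", "iso2"): 3, ("iso1", "iso3"): 19, ("iso2", "iso3"): 22}
pipeline_b = {("iso1", "iso2"): 4, ("iso1", "iso3"): 21, ("iso2", "iso3"): 25}

CUTOFF = 20  # illustrative value only

print(discordant_pairs(pipeline_a, pipeline_b, CUTOFF))
```

In this toy input the two pipelines differ by only two differences for the pair ("iso1", "iso3"), yet the pair falls on opposite sides of the cutoff. That is the kind of discordance that motivates treating a cutoff as a starting point rather than a verdict.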

Created:
2019-05-09