Data_Sheet_4_Taxogenomics and Systematics of the Genus Pantoea.pdf
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/Data_Sheet_4_Taxogenomics_and_Systematics_of_the_Genus_Pantoea_pdf/10075013
Description:
Members of the genus Pantoea are Gram-negative bacteria isolated from various environments. Taxonomic affiliation based on multilocus sequence analysis (MLSA) is routinely used to infer accurate phylogenies and to identify bacterial species and genera. Partial sequences of five housekeeping genes (fusA, gyrB, leuS, rpoB, and pyrG) were extracted from 206 draft or complete genomes of Pantoea strains publicly available in databases and analyzed together with the representative sequences of the 25 validly published Pantoea type strains to verify and assess their phylogenetic assignments. Of a total of 159 strains assigned to species level, 11.3% of the non-type strains were assigned to an incorrect Pantoea species. The highest proportion of misidentified strains was recorded in Pantoea vagans, with 8 out of 15 (53.3%) inaccurate assignments at the species level. One probable reason for this incorrect classification is the method previously used for strain identification. Forty-seven (22.8%) genome sequences were from strains identified at the genus level only (Pantoea sp.). A combination of MLSA, average nucleotide identities [ANI and MUMmer-based ANI (ANIm)], tetranucleotide usage pattern (TETRA), and genome-based DNA-DNA hybridization (gDDH) data was used to accurately assign 25 of the 47 strains to validly published Pantoea species, while 17 strains could be assigned as putative novel species within the genus Pantoea. Four genomes designated as Pantoea sp. were identified as Mixta calida. Positive and significant correlation coefficients were computed between MLSA and all of the whole-genome-derived indices proposed for species delimitation. gDDH exhibited the best correlation with MLSA, while TETRA exhibited the weakest. Accurate species-level identification is key to a better understanding of bacterial diversity and evolution. The MLSA scheme used here could be instrumental in determining the correct taxonomic status of newly whole-genome-sequenced Pantoea strains, especially non-type strains, before they are deposited in public databases.
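To make the idea of a tetranucleotide usage comparison concrete, here is a minimal sketch in Python. It is not the pipeline used for this dataset: published TETRA implementations typically correlate z-scores of observed versus expected tetranucleotide counts, whereas this simplified version correlates raw relative frequencies counted on both strands. The function names (`revcomp`, `tetra_freqs`, `pearson`) are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides over the DNA alphabet, in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def tetra_freqs(seq):
    """Relative tetranucleotide frequencies, counted on both strands.

    Simplification (assumption): raw frequencies are returned, not the
    z-scores used by published TETRA implementations.
    """
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for strand in (seq, revcomp(seq)):
        for i in range(len(strand) - 3):
            kmer = strand[i:i + 4]
            if kmer in counts:  # skips windows with ambiguous bases such as N
                counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous tetranucleotide")
    return [counts[k] / total for k in TETRAMERS]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy sequences, not real Pantoea data.
    genome_a = "ATGCGTACGTTAGCCGATAGGCTTAACGGATCCGTA"
    genome_b = "ATGCGTACGATAGCCGTTAGGCTAAACGGTTCCGTA"
    print(pearson(tetra_freqs(genome_a), tetra_freqs(genome_b)))
```

In practice the two inputs would be whole genome sequences (all contigs concatenated or counted per contig), and the resulting coefficient would be read alongside ANI and gDDH values rather than on its own.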
Created:
2019-10-30



