
The consolidated and reconciled annotations from all of the WGS strains used in this study

Source: DataONE. Updated 2025-03-07; indexed 2025-04-26.
Download link:
https://search.dataone.org/view/sha256:91df03e845aa573ea832933872e9975673a8930d77bda75a88827c5953ce18d5
Description:
Pantoea agglomerans is one of four Pantoea species reported in the USA to cause bacterial rot of onion bulbs. However, not all P. agglomerans strains are pathogenic to onion. We characterized onion-associated strains of P. agglomerans to elucidate the genetic and genomic signatures of onion-pathogenic P. agglomerans. We collected >300 P. agglomerans strains associated with symptomatic onion plants and bulbs from public culture collections, research laboratories, and a multi-year survey in 11 states in the USA. Combining the 87 genome assemblies with 100 high-quality, public P. agglomerans genome assemblies, we identified two well-supported P. agglomerans phylogroups. Strains causing severe symptoms on onion were only identified in Phylogroup II and encoded the HiVir pantaphos biosynthetic cluster, supporting the role of HiVir as a pathogenicity factor. The P. agglomerans HiVir cluster was encoded in two distinct plasmid contexts: 1) as an accessory gene cluster on a conserved P. agglomerans pla...

Eighty-one genomes assembled as part of this study were combined with 100 high-quality genome assemblies from NCBI GenBank (Table S3). Genomes were re-annotated with Prokka [@Seemann2014] for pangenome analysis with Roary [@Page2015] to identify core and accessory genes. Custom scripts were used to reconcile the CDS records of the Prokka, GenBank, and RefSeq annotations for each genome. In this process, CDS records were considered synonymous if the coordinates of their stop codons were equal. If the start codon coordinates of synonymous CDS records were not equal, the records were marked as "interesting".

#### **`DatasetS1.zip`**

This dataset contains the consolidated and reconciled annotations from all of the WGS strains of *Pantoea agglomerans* used in this study.

In this study, Prokka was used independently by both the UGA and USDA authors to re-annotate genomes for input to Roary ([https://dx.doi.org/10.1093/bioinformatics/btv421](https://dx.doi.org/10.1093/bioinformatics/btv421)). For genomes available from NCBI, both GenBank and RefSeq annotations are available. Hence, multiple sets of gene annotations are available for each genome. The ZIP file `DatasetS1.zip` contains 187 tab-separated values (`.tsv`) files, one for each of the 187 genomes analyzed by Roary in this study. For each WGS strain, we have provided a table (i.e., a `.tsv` file) listing which CDS annotations are equivalent across the available annotation sets. That is, each row of each table lists the equivalent annotations for one CDS. Annotations are considered equivalent if they refer to CDS features whose amino ac...
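
The stop-codon rule described above (CDS records are synonymous when their stop codon coordinates match, and are flagged as "interesting" when their start codons differ) can be sketched as follows. This is only an illustration and not the authors' custom scripts, which are not shown in this record. The `CDS` class, its field names, the 1-based inclusive coordinate convention, and the example locus tags are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CDS:
    """Hypothetical CDS record; field names are assumptions, not the dataset schema."""
    source: str     # annotation set, e.g. "prokka", "genbank", "refseq"
    contig: str
    start: int      # leftmost coordinate, 1-based inclusive
    end: int        # rightmost coordinate, 1-based inclusive
    strand: str     # "+" or "-"
    locus_tag: str

    @property
    def stop_coord(self) -> int:
        # The stop codon sits at the right end on "+" and the left end on "-".
        return self.end if self.strand == "+" else self.start

    @property
    def start_coord(self) -> int:
        # The start codon sits at the opposite end from the stop codon.
        return self.start if self.strand == "+" else self.end


def reconcile(records):
    """Group records sharing contig, strand, and stop coordinate; flag start mismatches."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec.contig, rec.strand, rec.stop_coord)].append(rec)

    rows = []
    for contig, strand, stop in sorted(groups):
        members = groups[(contig, strand, stop)]
        rows.append({
            "contig": contig,
            "strand": strand,
            "stop": stop,
            "tags": {m.source: m.locus_tag for m in members},
            # "interesting" when synonymous records disagree on the start codon
            "interesting": len({m.start_coord for m in members}) > 1,
        })
    return rows


# Made-up records for illustration only.
example = [
    CDS("prokka", "contig1", 100, 400, "+", "PROKKA_00001"),
    CDS("genbank", "contig1", 130, 400, "+", "GB_0001"),
    CDS("refseq", "contig1", 900, 1200, "-", "RS_0002"),
    CDS("prokka", "contig1", 900, 1200, "-", "PROKKA_00002"),
]
rows = reconcile(example)
for row in rows:
    print(row)
```

In this made-up input, the two forward-strand records share a stop coordinate but start at different positions, so their row is flagged; the two reverse-strand records agree on both ends and are not.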
Created:
2025-03-13