PGG Core Genes - Tables F1000.xlsx
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/PGG_Core_Genes_-_Tables_F1000_xlsx/14132180
Description:
This is a collection of Excel spreadsheet tables in support of the article "A pan-genome method to determine core regions of the Bacillus subtilis and Escherichia coli genomes" (DOI: 10.12688/f1000research.51873.1). The tables show characteristics of the core genes determined with a pan-genome graph (PGG) for Bacillus subtilis and Escherichia coli.
Table 1. Pan-genome graph statistics for B. subtilis and E. coli.
Table 2. The number of deleted genes from B. subtilis reduced strains which are noncore versus core.
Table 3. Large noncore regions which have not been deleted from any of the strains delta 6, IIG-Bs27-47-24, PG10, or PS38.
Supplementary Table 1. All B. subtilis genes compared to the PGG-based annotation of core regions for the type strain genome used by Kobayashi et al. and Koo et al. (strain 168, GenBank sequence AL009126.3, BioSample SAMEA3138188, Assembly ASM904v1/GCF_000009045.1). Columns 1-5 are the start, stop, strand, OGC, and OGC size for the PGG annotation. Columns 6-11 are the gene type, start, stop, strand, locus tag, and gene symbol/name for the GenBank annotation. Column 12 is a list of synonyms for B. subtilis genes associated with the Koo and Kobayashi genes. Column 13 is the Koo et al. gene symbol/name. Columns 14-15 are the Kobayashi et al. gene symbol/name and evidence type (from Supporting Table 4 “RB, reference to study with Bacillus subtilis; RO, reference to study with other bacteria; TW, this work; TW*, inactivation failed but IPTG mutant could not be made”). Columns 16-17 are the GenBank protein product accession and name. Column 18 is the PGG core or non-core region the gene is contained in. Column 19 indicates if the gene is in MiniBacillus. Columns 20-23 show genes deleted for strains delta 6, IIG-Bs27-47-24, PG10, and PS38, respectively.
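The 23-column layout described for Supplementary Table 1 can be written down as an ordered list of labels, which may help when loading the sheet programmatically. The following is a minimal Python sketch: the label names and the `label_row` helper are hypothetical, chosen here for illustration, and are not the header text of the spreadsheet. Only the column order and meaning come from the description above.

```python
# Hypothetical labels for the 23 columns of Supplementary Table 1.
# The order follows the description; the names themselves are assumptions.
SUPP_TABLE_1_COLUMNS = [
    # Columns 1-5: PGG annotation
    "pgg_start", "pgg_stop", "pgg_strand", "ogc", "ogc_size",
    # Columns 6-11: GenBank annotation
    "gene_type", "gb_start", "gb_stop", "gb_strand", "locus_tag", "gene_symbol",
    # Column 12: synonyms associated with the Koo and Kobayashi genes
    "synonyms",
    # Column 13: Koo et al. gene symbol/name
    "koo_gene",
    # Columns 14-15: Kobayashi et al. gene symbol/name and evidence type
    "kobayashi_gene", "kobayashi_evidence",
    # Columns 16-17: GenBank protein product accession and name
    "protein_accession", "protein_name",
    # Column 18: PGG core or non-core region containing the gene
    "pgg_region",
    # Column 19: whether the gene is in MiniBacillus
    "in_minibacillus",
    # Columns 20-23: deleted in delta 6, IIG-Bs27-47-24, PG10, PS38
    "deleted_delta6", "deleted_iig_bs27_47_24", "deleted_pg10", "deleted_ps38",
]


def label_row(values):
    """Pair one row of cell values with the hypothetical column labels."""
    if len(values) != len(SUPP_TABLE_1_COLUMNS):
        raise ValueError(
            f"expected {len(SUPP_TABLE_1_COLUMNS)} cells, got {len(values)}"
        )
    return dict(zip(SUPP_TABLE_1_COLUMNS, values))


# Placeholder row: each cell holds its own 1-based column number,
# not real data from the spreadsheet.
row = label_row(list(range(1, 24)))
print(row["pgg_region"], row["deleted_ps38"])
```

Supplementary Table 2 is described with the same first 18 columns, so the first 18 labels could be reused for it under the same assumptions.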
Supplementary Table 2. The 305 B. subtilis genes deemed essential by either Kobayashi et al. or Koo et al. These genes are compared to the PGG-based annotation of core regions for the type strain genome used by Kobayashi et al. and Koo et al. (strain 168, GenBank sequence AL009126.3, BioSample SAMEA3138188, Assembly ASM904v1/GCF_000009045.1). Columns 1-5 are the start, stop, strand, OGC, and OGC size for the PGG annotation. Columns 6-11 are the gene type, start, stop, strand, locus tag, and gene symbol/name for the GenBank annotation. Column 12 is a list of synonyms for B. subtilis genes associated with the Koo and Kobayashi genes. Column 13 is the Koo et al. gene symbol/name. Columns 14-15 are the Kobayashi et al. gene symbol/name and evidence type (from Supporting Table 4 “RB, reference to study with Bacillus subtilis; RO, reference to study with other bacteria; TW, this work; TW*, inactivation failed but IPTG mutant could not be made”). Columns 16-17 are the GenBank protein product accession and name. Column 18 is the PGG core or non-core region the gene is contained in.
Supplementary Table 3. The 258 B. subtilis core regions identified through the pan-genome graph.
Supplementary Table 4. All B. subtilis tRNA and rRNA genes for the type strain genome (strain 168, GenBank sequence AL009126.3, BioSample SAMEA3138188, Assembly ASM904v1/GCF_000009045.1) compared to the refined PGG. Columns 1-5 are the start, stop, strand, OGC, and OGC size for the PGG annotation. Columns 6-11 are the gene type, start, stop, strand, locus tag, and gene symbol/name for the GenBank annotation.
Supplementary Table 5. The 108 B. subtilis genomes used in the study. Data are from GenBank RefSeq: BioSample ID, Assembly ID, GenBank Species, GenBank Strain, Genome Size, and whether the genome is a type strain.
Supplementary Table 6. The 414 E. coli genes deemed essential by Goodall et al., Baba et al., or Yamazaki et al. These genes are compared to the PGG-based annotation of core regions for the K-12 BW25113 strain used by Goodall et al. (GenBank sequence CP009273.1, Assembly ASM75055v1/GCA_000750555.1, BioSample SAMN03013572). Columns 1-5 are the start, stop, strand, OGC, and OGC size for the PGG annotation. Columns 6-11 are the gene type, start, stop, strand, locus tag, and gene symbol/name for the GenBank annotation. Column 12 is a list of gene synonyms for the gene from GenBank. Columns 13-21 are from Goodall et al.: 13-15 from Table S1 (normal essentiality), 16-18 from Table S4 (essentiality after outgrowth), 19-20 from Table S3 (outlier discrepancies), and 21 from Table S2 (comparison of data sets). Column 22 is the PGG core or non-core region the gene is contained in.
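Because the 22 columns of Supplementary Table 6 come from several sources, a small lookup from column number to source can make the layout easier to check. This is a rough Python sketch under stated assumptions: the `COLUMN_GROUPS` structure and the `column_group` helper are hypothetical names introduced here, and the group labels paraphrase the description above rather than quoting the spreadsheet.

```python
# Column ranges (1-based, inclusive) as stated in the description of
# Supplementary Table 6. The label wording is illustrative only.
COLUMN_GROUPS = [
    (1, 5, "PGG annotation"),
    (6, 11, "GenBank annotation"),
    (12, 12, "GenBank gene synonyms"),
    (13, 15, "Goodall et al. Table S1 (normal essentiality)"),
    (16, 18, "Goodall et al. Table S4 (essentiality after outgrowth)"),
    (19, 20, "Goodall et al. Table S3 (outlier discrepancies)"),
    (21, 21, "Goodall et al. Table S2 (comparison of data sets)"),
    (22, 22, "PGG core or non-core region"),
]


def column_group(column):
    """Return the source group for a 1-based column number."""
    for first, last, label in COLUMN_GROUPS:
        if first <= column <= last:
            return label
    raise ValueError(f"column {column} is outside the described range 1-22")


# List which group each of the 22 columns belongs to.
for number in range(1, 23):
    print(number, column_group(number))
```

Supplementary Table 8 is described with the same column ranges, so the same lookup would apply to it under the same assumptions.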
Supplementary Table 7. The 521 E. coli core regions identified through the pan-genome graph.
Supplementary Table 8. All E. coli genes compared to the PGG-based annotation of core regions for the K-12 BW25113 strain used by Goodall et al. (GenBank sequence CP009273.1, Assembly ASM75055v1/GCA_000750555.1, BioSample SAMN03013572). Columns 1-5 are the start, stop, strand, OGC, and OGC size for the PGG annotation. Columns 6-11 are the gene type, start, stop, strand, locus tag, and gene symbol/name for the GenBank annotation. Column 12 is a list of gene synonyms for the gene from GenBank. Columns 13-21 are from Goodall et al.: 13-15 from Table S1 (normal essentiality), 16-18 from Table S4 (essentiality after outgrowth), 19-20 from Table S3 (outlier discrepancies), and 21 from Table S2 (comparison of data sets). Column 22 is the PGG core or non-core region the gene is contained in.
Supplementary Table 9. The 971 E. coli genomes used in the study. Data are from GenBank RefSeq: BioSample ID, Assembly ID, GenBank Species, GenBank Strain, Genome Size, and whether the genome is a type strain.
Created:
2021-02-28



