Supplementary data for Abdel-Glil M et al. (Manuscript 2020)
Source: DataCite Commons. Updated 2021-03-11; indexed 2024-07-28.
Download link:
https://figshare.com/articles/dataset/Supplementary_data_for_Abdel-Glil_M_et_al_Manuscript_2020_/12264497/2
Description:
File descriptions:
Data set 1: Detailed reports on the *de novo* genome assembly of PacBio sequence data from 23 NCTC *Clostridium perfringens* strains.
Data set 2: Prokka-annotated "*.gbk" files of the 206 analysed *Clostridium perfringens* strains (A), and the RAST annotation of the chromosomes of 34 circularized genomes (B).
Data set 3: (A) Core genome alignment of the 206 *Clostridium perfringens* genomes, constructed with Parsnp. (B) A list of the 63,036 core genome SNPs identified in the 206 strains, with their distribution across and impact on the reference ATCC 13124 genome, including 61,236 nonsynonymous, 15,489 synonymous, 4,395 in other features and 16,500 SNPs in non-coding regions.
Data set 4: (A) Gene sequences comprising the pangenome of the 206 *Clostridium perfringens* strains, as calculated with Roary (options -i 90 -s). (B) Gene presence-absence matrix of the 14,942 genes comprising the pangenome of the 206 strains. The total number of genes predicted for each strain is shown (right). (C) The pangenome frequency in the 206 investigated strains. (D) A pie chart showing the pangenome compartments.
Data set 5: Gene sequences of virulence factors and (putative) iron uptake systems used for the BLAST analysis of the 206 *Clostridium perfringens* genomes.
Data set 6: (A) Visualisation of a large deletion in the *colA* gene of strain CBA7123 (bottom) compared to the *colA* gene of strain 13 (top). (C) FASTA alignment of the *colA* genes of the 206 *C. perfringens* strains. The *colA* genes with a large deletion mutation were artificially fragmented in the alignment file.
Data set 7: (A) Visualisation of a large deletion in the mu toxin gene (*nagH*) as observed in phylogroup I strains. As examples, the *nagH* regions of strain SM101 (A) and strain NCTC 8081 (B) were compared to that of strain 13 (top). Strain NCTC 10240 was the only strain in phylogroup I that had an intact *nagH* gene. (B) FASTA alignment of the *nagH* genes of the 206 *C. perfringens* strains. The *nagH* genes with a large deletion mutation were artificially fragmented in the alignment file.
Data set 8: (A) Phylogenetic tree based on the deduced amino acid sequences of the *pfoA* gene and the *pfoA*-like sequence (*pfoA* variant) in strain JP838. Representative members of the cytolysin family were included. (B) Protein alignment of the *pfoA* gene (top) and the *pfoA*-like sequence (bottom) in strain JP838. (C) FASTA nucleotide alignment of the *pfoA* gene and the *pfoA*-like sequence in strain JP838.
Data set 9: (A) RAST annotation of the NCTC_8081 plasmid, with the *cpe* and *tcp* loci and the toxin homolog highlighted. (B) Phylogenetic tree based on the deduced amino acid sequences of representative members of the Leukocidin/Hemolysin superfamily and the homolog (locus tag 02938) of the Darmbrand strain NCTC 8081.
Data set 10: Amino acid sequences of the putative virulence genes identified in the 206 *Clostridium perfringens* strains.
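As an illustration of how a gene presence-absence matrix such as the one in Data set 4 (B) relates to the per-strain gene totals and the pangenome compartments in (D), here is a minimal Python sketch. It is not the authors' pipeline. The toy matrix, the strain and gene names, and the `n_meta_cols` parameter are hypothetical, and the compartment cut-offs (core at 99% of strains or more, soft core from 95%, shell from 15%, cloud below 15%) are assumed to follow the categories Roary usually reports. The actual layout of the deposited file has not been checked here.

```python
import csv
import io

# Hypothetical toy matrix: one row per gene, one column per strain.
# A non-empty cell means the gene was found in that strain. A real
# Roary matrix carries extra annotation columns before the strain
# columns; pass their count as n_meta_cols to skip them.
TOY_MATRIX = """Gene,strain_1,strain_2,strain_3,strain_4
geneA,a1,a2,a3,a4
geneB,b1,b2,b3,
geneC,c1,,,
geneD,,d2,d3,
"""


def compartment(fraction):
    """Map the fraction of strains carrying a gene to a compartment.

    The cut-offs are assumptions, not values taken from the data set.
    """
    if fraction >= 0.99:
        return "core"
    if fraction >= 0.95:
        return "soft core"
    if fraction >= 0.15:
        return "shell"
    return "cloud"


def summarise(matrix_text, n_meta_cols=1):
    """Return (genes per strain, genes per compartment) for a CSV matrix."""
    rows = list(csv.reader(io.StringIO(matrix_text)))
    header, body = rows[0], rows[1:]
    strains = header[n_meta_cols:]
    genes_per_strain = {strain: 0 for strain in strains}
    compartments = {}
    for row in body:
        present = [bool(cell.strip()) for cell in row[n_meta_cols:]]
        for strain, is_present in zip(strains, present):
            if is_present:
                genes_per_strain[strain] += 1
        label = compartment(sum(present) / len(strains))
        compartments[label] = compartments.get(label, 0) + 1
    return genes_per_strain, compartments


per_strain, per_compartment = summarise(TOY_MATRIX)
print(per_strain)
print(per_compartment)
```

With only four toy strains a single carrier already accounts for 25% of the strains, so no gene can fall into the cloud compartment; with 206 strains the same cut-offs separate all four categories.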
Provider:
figshare
Created:
2020-10-19



