Additional file 2 of Pre-Cambrian roots of novel Antarctic cryptoendolithic bacterial lineages
Source: DataCite Commons. Updated: 2021-03-20. Indexed: 2024-08-18.
Download link:
https://springernature.figshare.com/articles/dataset/Additional_file_2_of_Pre-Cambrian_roots_of_novel_Antarctic_cryptoendolithic_bacterial_lineages/14251830/1
Description:
Additional file 2: Supplementary Table 1. Results of the CBS (candidate bacterial species) detection procedure and its validation using Mash Screen. Each row reports: CBS ID (i.e. the CBS MAG representative), metagenomic sample, estimated depth of coverage (mean, standard deviation, first quartile, median, third quartile), number of mapped reads, ANI between the consensus sequences and the CBS representative, coverage breadths at depths from 1 to 5, Mash Screen containment score, number of shared hashes, median multiplicity, and containment score p-value.
Supplementary Table 2. Assembly statistics and taxonomic classification of the MAGs.
Supplementary Table 3. Abundance of CBS at phylum level, expressed as the percentage of reads that could be mapped to the representative CBS. Median: median; Q1 and Q3: first and third quartile; IQR: interquartile range; Mean: mean; SD: standard deviation; #CBS: number of candidate bacterial species belonging to the phylum.
Supplementary Table 4. Increase in the number of bacterial species for each taxonomic order provided by the data in the present study, compared to the data available in the GTDB database.
Supplementary Table 5. Sample metadata: geographic coordinates of the sampling sites, accession numbers of the raw sequences, and accession numbers and N50 of the assembled metagenomes on the JGI IMG/M portal.
Supplementary Table 6. Prevalence and taxonomic classification for each CBS representative.
Supplementary Table 7. Summary of Bayesian divergence estimates. For each order we report the mean age of its origin (OO: the split of the order from the closest order) and the 95% CI (OO max and OO min), the origin of the oldest uniquely Antarctic clade (AOO1: the split of the Antarctic clade from a non-Antarctic lineage of the same order), and, where present, the origin of the second oldest Antarctic clade (AOO2). See Supplementary Data 1.
Supplementary Table 8. Number of predicted proteins (NProts), of proteins that had a match in the EggNOG database (NHitsOG), of proteins that could be associated with a term in the Gene Ontology (NHitsGO), and of proteins that had a match in the KEGG and COG databases (NHitsKEGG and NHitsCOG, respectively).
Supplementary Table 9. Number of KEGG orthologs characteristic of the Antarctic or reference Jiangellales genomes. Fisher's exact test (uncorrected p<0.05) was performed to identify orthologs unevenly distributed between the two groups.
Supplementary Table 10. Number of KEGG orthologs characteristic of the Antarctic or reference Thermomicrobiales genomes. Fisher's exact test (uncorrected p<0.05) was performed to identify orthologs unevenly distributed between the two groups.
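The group comparison described for Supplementary Tables 9 and 10 can be illustrated with a small two-sided Fisher's exact test on a 2x2 presence/absence table. This is a minimal sketch, assuming each KEGG ortholog is scored as present or absent in every genome of the Antarctic and reference groups; the function name `fisher_exact_two_sided` and the example counts are hypothetical and are not taken from the study, whose actual implementation is not described in this record.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout for one ortholog:
        a = Antarctic genomes with the ortholog,  b = Antarctic genomes without it
        c = reference genomes with the ortholog,  d = reference genomes without it
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k "present" genomes in the first row,
        # with all row and column totals held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / total

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is at most as likely as the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )
    return min(p_value, 1.0)


# Made-up counts: ortholog present in 9 of 10 Antarctic genomes
# and in 2 of 12 reference genomes.
p = fisher_exact_two_sided(9, 1, 2, 10)
print(f"uncorrected p = {p:.4g}", "(unevenly distributed)" if p < 0.05 else "")
```

Because the p-values in Tables 9 and 10 are reported as uncorrected, anyone reusing them across many orthologs may want to apply a multiple-testing correction on top of a test like this one.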
Provider:
figshare
Created:
2021-03-20



