
S1 - Genome metadata and metabolic analyses for existing and newly extracted Rickettsia and Megaira symbiotic bacteria

DataCite Commons | Updated 2022-03-25 | Indexed 2024-07-28
Download link:
https://figshare.com/articles/dataset/S1_-_Genome_metadata_and_metabolic_analyses_for_existing_and_newly_extracted_Rickettsia_and_Megaira_symbiotic_bacteria/14865561
Resource description:
<b>What is it</b> An Excel spreadsheet containing all metadata associated with the new and existing <i>Rickettsia</i>, Torix <i>Rickettsia</i> (<i>Ca</i>. Tisiphia), <i>Ca</i>. Megaira and <i>Orientia</i> genomes used in the pangenomic, metabolic and phylogenomic analyses in the paper "Genomic diversity in the genus Rickettsia and its sister groups indicate the torix group are evolutionarily distinct". The tables are split into genome data (S1a) and metabolic data (S1b). The sheets contain the following information:<br>
<b><i>S1a</i></b><i> – yellow tabs, genome and gene sequence metadata</i>
-&gt; <b>S1a.1</b> – A list of published genomes. Includes accession numbers, the Rickettsia group each genome belongs to, CheckM quality scores, assembly-level metadata, and the analyses each genome was used in.
-&gt; <b>S1a.2</b> – Metadata for the new genomes assembled in this study. Includes Rickettsia group, assembly level, CheckM completeness scores, genome size and N50 scores, host names, host accessions and brief details of host ecology.
-&gt; <b>S1a.3</b> – Accessions and host species for all rickettsial 16S rRNA, COI, gltA and 17 kDa sequences used in MLST to identify the Rhyzobius (Oopac6), Meloidae (Ppec13), Moomin (Moomin) and unusual Belli (Dallo3) genomes extracted from Sequence Read Archive (SRA) data.
-&gt; <b>S1a.4</b> – GTDB-Tk taxonomic classification, which supports placing Torix group Rickettsia in its own clade.<br><br>
<b><i>S1b</i></b><i> – red tabs, pangenome and metabolic data</i>
-&gt; <b>S1b.1</b> – The 74 core gene clusters, with COG and KEGG/kofam functions, across 104 genomes from a pangenome of Rickettsia, Megaira and Orientia constructed with Anvi’o. PHI scores are given for each cluster. These gene clusters were later used in phylogeny construction.
-&gt; <b>S1b.2</b> – The 43 ribosomal protein gene clusters, with COG and KEGG/kofam functions, across 104 genomes from a pangenome of Rickettsia, Megaira and Orientia constructed with Anvi’o 7. These gene clusters were later used in phylogeny construction.
-&gt; <b>S1b.3</b> – Raw metabolic pathway completion scores for the Rickettsia, Megaira and Orientia genomes used in the metabolic heatmaps in the main paper. Contains the KEGG module names, classes, categories, and estimated completion scores produced with `anvi-estimate-metabolism` in Anvi’o 7.
-&gt; <b>S1b.4</b> – Functional enrichment scores produced with `anvi-get-enriched-functions-per-pan-group` in Anvi’o 7. Comparisons are made within Torix, within the main Rickettsia (excluding Torix), between Torix and Megaira, and between Torix and Rickettsia.
-&gt; <b>S1b.5</b> – Average nucleotide identity (ANI) scores, with percentage similarity between each pair of genomes, produced with pyANI.<br>
-&gt; <b>S1b.6</b> – Average amino acid identity (AAI) scores, with percentage similarity between each pair of genomes, produced with the Kostas lab AAI matrix tool.<br>
-&gt; <b>S1b.7</b> – Data from the gene cluster content analysis used to construct accumulation curves and cluster similarity plots.<br><br>
<b>Why is it</b> Background data that are necessary for transparency but unsuitable for inclusion in the paper as information tables.<br>
<b>Use and benefit</b> Useful metadata and raw data that readers can examine and that support reproducibility.
Provider:
figshare
Created:
2021-10-05