Supplementary Data for Kogay et al (2020)
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2024-07-28.
Download link:
https://figshare.com/articles/Supplementary_Data_for_Kogay_et_al_2020_/12071223/1
Description:
This dataset contains supplementary data for the bioinformatic analyses presented in:
Roman Kogay, Yuri I. Wolf, Eugene V. Koonin, and Olga Zhaxybayeva: Selection for energy savings during protein production drives the GC content and amino acid composition bias in gene transfer agents (under review).

Dataset Contents:

212_alphaproteobacterial_genomes_list.xlsx: Taxonomic names and RefSeq assembly accession numbers for 212 alphaproteobacterial genomes with GTA regions.

Detected_GTA_homologs.zip: Amino acid sequences of GTA genes detected in alphaproteobacteria and viruses. The filename prefixes (g1, …, g15) refer to the GTA gene name, while the filename suffixes (_gta, _virus) designate the gene origin. The sequences are in FASTA format.

PAML_Branch_siteA_test_alns_trees.zip: Codon alignments and phylogenetic trees that were used as input for the branch-site model A test in PAML. The alignments are in FASTA format and the phylogenetic trees are in Newick format. The same phylogenetic trees were also used in the analyses of carbon utilization change in selected viruses. The filename prefixes (g2, …, g15) refer to the GTA gene name.

reference_aln_tree.zip: Concatenated alignment of 31 phylogenetic marker genes in 212 alphaproteobacterial genomes in FASTA format (reference_alignment.fasta), information about the partitions and substitution models used in phylogenetic reconstruction (reference_partitions.txt), and the reference phylogenetic tree in Newick format (reference_tree.treefile).

ancestral_reconstructions.zip: Alignments and phylogenetic trees that were used in the ancestral sequence reconstruction via FastML. The alignments are in FASTA format and the phylogenetic trees are in Newick format. The filename prefixes (g6, …, g15) refer to the GTA gene name.

net_carbon_change_bootstrapping.zip: Alignments of selected viruses and their inferred GTA ancestors. These alignments were used to generate 1000 bootstrap replicates to assess the significance of the net change in carbon utilization in selected viral homologs. The alignments are in FASTA format. The filename prefixes (g6, …, g15) refer to the RcGTA gene name. The calculated net changes in carbon utilization across bootstrap replicates are in the net_carbon_change_bootstrapping.xlsx file.
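The naming convention described above (a gene prefix g1 to g15 and an origin suffix _gta or _virus) can be used to sort the FASTA files programmatically. The following is a minimal sketch, not part of the dataset: the exact filenames inside the archives are not listed here, so the pattern `g<N>_<origin>.<extension>` and the helper names `parse_name` and `read_fasta` are assumptions made for illustration.

```python
import re
from io import StringIO

# Assumed filename pattern: gene prefix g1..g15, then _gta or _virus.
# The real filenames in the archives may differ; adjust the regex if so.
NAME_RE = re.compile(r"^(g(?:1[0-5]|[1-9]))_(gta|virus)\b")


def parse_name(filename):
    """Return (gene, origin) for a name like 'g5_gta.fasta', else None."""
    match = NAME_RE.match(filename)
    return (match.group(1), match.group(2)) if match else None


def read_fasta(handle):
    """Parse FASTA text from a file-like object into {header: sequence}."""
    records = {}
    header = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequences may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Usage with an in-memory example (hypothetical record names and sequences).
example = StringIO(">seq_a\nMKT\nLLV\n>seq_b\nMAA\n")
sequences = read_fasta(example)
gene, origin = parse_name("g5_gta.fasta")
```

For real use, open each extracted file with `open(path)` and pass the handle to `read_fasta`. An established parser such as Biopython's `SeqIO` is the more robust choice when that dependency is available.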
Provider:
figshare
Created:
2020-05-06



