
Supplementary data for Shakya et al. (2017)

Source: DataCite Commons. Updated: 2020-09-01. Indexed: 2024-08-17.
Download link:
https://figshare.com/articles/dataset/Supplementary_data_for_Shakya_et_al_2017_/5406733/1
Description:
This data set contains the sequences, sequence alignments, and phylogenetic trees used in the bioinformatic analyses presented in:

Shakya M, Soucy SM, and Zhaxybayeva O. "Insights into Origin and Evolution of α-proteobacterial Gene Transfer Agents", submitted.

File contents:

RefSeq_bacterial_hits.zip: FASTA-formatted files of bacterial homologs of RcGTA genes detected in RefSeq database release 76. The filenames correspond to gene names listed in Supplementary Table S4.

RefSeq_viral_hits.zip: FASTA-formatted files of viral homologs of RcGTA genes detected in RefSeq database release 76. The filenames correspond to gene names listed in Supplementary Table S4.

GTA_Rhodobacterales_queries.zip: FASTA-formatted files of RcGTA homologs from Rhodobacterales that were used as queries in BLAST searches of the RefSeq database and of 255 α-proteobacterial genomes.

individual_proteins.zip: FASTA-formatted alignments of individual RcGTA structural cluster genes and their large cluster (LC) homologs, used to create the LC-locus alignment. The filenames correspond to gene names listed in Supplementary Table S4.

individual_trees.zip: NEWICK-formatted phylogenetic trees reconstructed from the alignments in individual_proteins.zip. These trees were used in the analyses shown in Supplementary Table S3.

LC_locus.zip: FASTA-formatted LC-locus alignment and NEWICK-formatted phylogenetic tree of the LC-locus (the right panel of Figure 6).

flanking_genes.zip: FASTA-formatted alignments and NEWICK-formatted phylogenetic trees of three genes found flanking the large clusters detected in non-α-proteobacterial genomes. The trees are shown in Supplementary Figure S8.

PPD.zip: Tab-delimited text files of pairwise phylogenetic distances (PPDs) of RcGTA homologs found in large clusters (LC), small clusters (SC), and viruses, together with the FASTA-formatted alignments of RcGTA homologs used to calculate the PPDs. The data are shown in Supplementary Figure S4.

reference_tree.zip: PHYLIP-formatted concatenation of 99 alignments of genes conserved across α-proteobacteria (see Supplementary Table S2), and NEWICK-formatted phylogenetic trees reconstructed from this alignment (see Figure 6 and Supplementary Figure S3).

StructuralClusterHomologs.xlsx: An Excel spreadsheet with information about RcGTA homologs found in small clusters (SC) and large clusters (LC) across α-proteobacterial genomes. The table lists the GI number of each homolog, as well as the accession number and taxonomic information of the source genome.
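Several of the archives described above hold FASTA-formatted files. As a minimal sketch of how such an archive might be read with only the Python standard library, the example below maps each file in a zip archive to its (header, sequence) records. The function names `read_fasta` and `read_fasta_from_zip` are hypothetical, and the internal layout and encoding of the archives are assumptions, since the record does not describe them.

```python
import io
import zipfile


def read_fasta(handle):
    """Yield (header, sequence) pairs from a text handle in FASTA format."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped; collect them for joining.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def read_fasta_from_zip(zip_path):
    """Map each member name in a zip archive to its list of FASTA records.

    Assumption: members are UTF-8 (or ASCII) text. Members that are not
    FASTA (for example NEWICK trees) simply produce an empty list.
    """
    records = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8")
                records[name] = list(read_fasta(text))
    return records
```

For instance, `read_fasta_from_zip("RefSeq_viral_hits.zip")` would return a dictionary keyed by the file names inside that archive, assuming it has been downloaded to the working directory. Alignments are returned with their gap characters intact, because the reader does not interpret the sequence content.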
Provider:
figshare
Created:
2017-09-17