Supplementary Data for Kogay et al. (2019)
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/Supplementary_Data_for_Kogay_et_al_2019_/8796419
Description:
This dataset contains supplementary
figures and tables, sequence alignments, phylogenetic trees and outlier removal
calculations used in the bioinformatic analyses presented in:
Roman Kogay, Taylor B. Neely, Daniel P. Birnbaum, Camille R. Hankel, Migun Shakya, and
Olga Zhaxybayeva. “Machine-learning classification suggests that many
alphaproteobacterial prophages may instead be gene transfer agents”, bioRxiv,
2019. (BIORXIV/2019/697243; available at https://www.biorxiv.org/content/10.1101/697243v1)
File Contents:
Supplementary_Figures.pdf: Supplementary Figures S1 and S2 in the manuscript.
Supplementary_Tables.zip: Supplementary Tables S1-S15 in the manuscript.
Alignments_for_weight_assignment_GTAs.zip: Alignments
of 'true GTA' sequences in the training dataset. These alignments were used to
generate pairwise phylogenetic distances for the weighting scheme. The
alignments are in FASTA format. The filename prefix (g2, …, g15) refers to the RcGTA
gene name (see Supplementary Table S1).
Alignments_for_weight_assignment_viruses.zip:
Alignments of 'true virus' sequences in the training dataset. These alignments
were used to generate pairwise phylogenetic distances for the weighting scheme.
The alignments are in FASTA format. The filename prefix (g2, …, g15) refers to the
RcGTA gene name (see Supplementary Table S1).
Alignments_for_removal_of_outliers.zip:
Alignments of ‘true GTA’ and ‘true virus’ sequences in the training datasets. Pairwise
phylogenetic distances calculated from these alignments were used to remove GTA
homologs that are more closely related to viruses than to other GTAs, as well
as to investigate the lower accuracies obtained for g6 and g12 (Supplementary
Table S11). The alignments are in FASTA format. The filename prefix (g2, …,
g15) refers to the RcGTA gene name (see Supplementary Table S1).
Outlier_removal.xlsx:
Calculations to identify GTAs that are more closely related to viruses than to
other GTAs. The removed sequences are highlighted.
Reference_phylogenetic_tree_reconstruction.zip:
Concatenated alignment of 83 marker genes in 1,423 taxa in PHYLIP format
(concatenated_83markers.phy); information about partitions and substitution
models used in phylogenetic reconstruction (partitions.txt); and phylogenetic
tree in Newick format (1423_alphaproteobacteria_reference_tree.newick).
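The outlier-removal idea described above for Alignments_for_removal_of_outliers.zip and Outlier_removal.xlsx (dropping GTA homologs that sit closer to viruses than to other GTAs) can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details taken from the dataset: the sketch uses a simple uncorrected p-distance as a stand-in for the pairwise phylogenetic distances, it compares mean distances (the exact criterion used in the spreadsheet is not stated in this description), and the `is_gta` labelling function and all function names are hypothetical.

```python
from statistics import mean


def read_fasta(text):
    """Parse FASTA-formatted text into a {name: aligned_sequence} dict."""
    seqs, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep the first token of the header
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line
    return seqs


def p_distance(a, b):
    """Fraction of differing sites, ignoring columns where either has a gap.

    Assumption: this uncorrected distance only stands in for the
    phylogenetic distances mentioned in the dataset description.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return float("nan")
    return sum(x != y for x, y in pairs) / len(pairs)


def flag_outliers(seqs, is_gta):
    """Return GTA names whose mean distance to viruses is smaller than
    their mean distance to the other GTAs.

    `is_gta` is a hypothetical callable mapping a sequence name to True
    (GTA) or False (virus); the real files may label sequences differently.
    """
    gtas = [n for n in seqs if is_gta(n)]
    viruses = [n for n in seqs if not is_gta(n)]
    flagged = []
    for g in gtas:
        others = [o for o in gtas if o != g]
        if not others or not viruses:
            continue  # nothing to compare against
        to_gta = mean(p_distance(seqs[g], seqs[o]) for o in others)
        to_virus = mean(p_distance(seqs[g], seqs[v]) for v in viruses)
        if to_virus < to_gta:
            flagged.append(g)
    return flagged
```

To try it on one of the alignments, read the FASTA file into a string, pass it to `read_fasta`, and supply an `is_gta` function that matches however the sequences in that file are actually named.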
Created:
2019-07-15



