Ancient origin and constrained evolution of the division and cell wall (dcw) gene cluster across Bacteria
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/4y5mzppzmb
Description:
SUPPORTING DATA:
Data from the manuscript of Megrian et al., "Ancient origin and constrained evolution of the division and cell wall (dcw) gene cluster across Bacteria"
- CONCAT/
- concat_12DCW.treefile
Phylogeny presented in Figure 6 and Supplementary Figure 14, in newick format.
- concat_12DCW_collapsed.treefile
Same phylogeny as concat_12DCW.treefile, but with phyla collapsed into a single branch.
- concat_12DCCW.aln
Concatenation alignment used to reconstruct the phylogeny. Corresponds to the concatenation of 12 dcw cluster proteins.
- RENAME_CONCAT.txt
Annotation file to rename the labels of the newick file on iTOL (https://itol.embl.de).
- cleaned_trimmed_single_alignments/
Directory containing cleaned and trimmed single alignments used for the concatenation.
- PARALOGS/
- *.treefile
Phylogenies presented in Figure 4 and Supplementary Figures 2, 3 and 4, in newick format.
- *.trim
Trimmed alignments used to reconstruct the phylogenies.
- *.aln
Alignments used to reconstruct the phylogenies (before trimming).
- *.fasta
Sequences aligned to reconstruct the phylogenies.
- PastML/
- pastml_raw_output.tab
Output of PastML inferences. Columns correspond to contiguous pairs of dcw cluster genes. Rows correspond to node names in the reference phylogeny. A value of 0 indicates absence of the pair at the corresponding node; 1 indicates presence.
- pastml_ref_tree.treefile
Reference phylogeny used for the inference. Node names are indicated.
- SGT_before_cleaning
- *_raw.treefile
Single gene phylogenies obtained after the homology searches, before cleaning.
- *_raw.trim
Trimmed alignments used to reconstruct the phylogenies.
- *_raw.aln
Alignments used to reconstruct the phylogenies (before trimming).
- *_raw.fasta
Sequences aligned to reconstruct the phylogenies.
- RENAME_SGT.txt
Annotation file to rename the labels of the newick files on iTOL (https://itol.embl.de).
- OTHER_REF_TREES
- CORE
Contains CONCAT and SGT_before_cleaning data, based on 63 core genome markers.
- RNAPOL_IF2
Contains CONCAT and SGT_before_cleaning data, based on RNApol+IF2 markers.
- RPROTS
Contains CONCAT and SGT_before_cleaning data, based on 16 ribosomal protein (rprot) markers.
- CPR
- CPR.treefile
Phylogeny presented in Figure 5. Obtained from a supermatrix that contains 302 sequences (27 Chloroflexi + 275 CPR) and 2126 amino acid positions. The concatenated proteins are: MurG, MurF, MurE, MurD, MurC, MraZ, MraY, MraW, FtsZ, FtsW, FtsI, FtsA. The ML tree was inferred with IQ-TREE v2.3.1. Best-fit model: LG+F+R10, chosen according to BIC. Ultrafast bootstrap: 1000 replicates.
- TREES_CLEANING
- cleanTrees.R
R script used for cleaning phylogenies.
- SAMPLE_FILES
Files needed to run the script.
Created:
2022-08-30
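
The layout described for PastML/pastml_raw_output.tab (rows are node names, columns are contiguous gene pairs, values are 0 or 1) can be read with a few lines of Python. The following is a minimal sketch, not the authors' code. It assumes the file is tab-separated, has a header row, carries the node name in the first column, and stores values as the literal strings "0" and "1". The node names and gene-pair column labels in the sample are hypothetical and only illustrate the shape of the table.

```python
import csv
import io

# Hypothetical excerpt: node names and gene-pair labels are invented for
# illustration; the real file may use different labels.
SAMPLE = (
    "node\tMraZ_MraW\tMurD_FtsW\tFtsA_FtsZ\n"
    "root\t1\t0\t1\n"
    "n1\t1\t1\t1\n"
    "n2\t0\t0\t1\n"
)


def load_presence(handle):
    """Return {node: {pair: bool}} from a tab-separated 0/1 table."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    pairs = header[1:]  # assumption: first column holds the node name
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        node, values = row[0], row[1:]
        # assumption: presence is encoded as the literal string "1"
        table[node] = {pair: value == "1" for pair, value in zip(pairs, values)}
    return table


def pairs_present(table, node):
    """List the gene pairs inferred as present at one node."""
    return [pair for pair, present in table[node].items() if present]


def nodes_with_pair(table, pair):
    """List the nodes at which a given gene pair is inferred as present."""
    return [node for node, states in table.items() if states[pair]]


table = load_presence(io.StringIO(SAMPLE))
print(pairs_present(table, "root"))
print(nodes_with_pair(table, "MurD_FtsW"))
```

To use it on the real data, replace the in-memory sample with open("PastML/pastml_raw_output.tab", newline=""), after checking that the file's header and value encoding match the assumptions above.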



