Genome-wide analysis of the Firmicutes illuminates the diderm/monoderm transition
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/3pcn9779gc
Description:
SUPPORTING DATA:
Data from the manuscript of Taib et al., "Genome-wide analysis of the Firmicutes illuminates the diderm/monoderm transition"
1- UBA-FIRMICUTES/UBA_Firmicutes.faa
Annotated proteomes of the 1,639 UBA Firmicutes from Parks et al. (2017).
2- DB_TaxIds.xls
TaxId, names and taxonomy of the taxa used to build the four databanks used in the analyses.
There are 4 sheets in the file:
- Firmicutes DB LARGE: 1,869 Firmicutes taxa, both reference and UBAs.
- Firmicutes DB SMALL: 316 Firmicutes taxa, both reference and UBAs.
- DB BACTERIA: 358 bacterial taxa.
- OUTGRP: 13 bacterial taxa used to root the reference tree of Firmicutes in Figure 1.
3- OM_ProteinIDsConcatenations.xlsx
Accession numbers of the OM proteins used to build the trees in Figure 5 and Supplementary Figure 4.
4- spo0A-vs-DBSMALL.xls
Accession numbers of the spo0A domain homologues identified in Firmicutes DB SMALL and mapped in Supplementary Figure 1.
5- 15964FAMILIES/
The folder contains the protein families defined with an 80% coverage and 35% identity cutoff and present in at least 5 Firmicutes.
For each family, two files are available: a fasta file of the sequences included in the family and a table with the taxonomy of each protein sequence.
6- 3500HCLUSTERS/
Tables corresponding to each of the 3500 clusters generated by the hierarchical clustering approach.
Each file contains the families belonging to the cluster with their annotation and taxonomic distribution.
7- PFAMDOMAINS/
Three folders corresponding to the three approaches used for Pfam domain annotation: ALL, COLLAPSED and SINGLE.
Each folder contains the fasta files and the tables with protein accession numbers, taxids and the taxonomy of the proteins where the domains were identified.
8- TREES/
Data related to the five phylogenies presented. For each tree, there is the corresponding alignment (.fasta), the supermatrix (.phy), the newick file (.treefile) and, for three of the trees, the corresponding PDF.
9- OMCluster_AccNum.xls
Accession numbers of the proteins represented in the OM cluster figure (Figure 4).
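
Several items above are distributed as FASTA files (the UBA proteomes, the per-family sequence files, the Pfam domain sets). As a minimal sketch of how such a file could be inspected, the Python snippet below reads FASTA records and reports how many sequences there are and their total length. It uses only the standard library. The function names `read_fasta` and `summarize` and the two example records are hypothetical, and the snippet assumes nothing about the header layout beyond the standard leading `>`.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Return the number of sequences and their summed length."""
    lengths = [len(seq) for _, seq in read_fasta(lines)]
    return {"count": len(lengths), "total_residues": sum(lengths)}


# Hypothetical records, for illustration only (not taken from the dataset).
example = [">protA hypothetical protein", "MKV", "LLA", ">protB", "MST"]
print(summarize(example))  # {'count': 2, 'total_residues': 9}
```

Because `summarize` accepts any iterable of lines, an open file handle can be passed directly, for example `with open("UBA-FIRMICUTES/UBA_Firmicutes.faa") as fh: print(summarize(fh))`, assuming the archive has been downloaded and unpacked to that relative path.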
Created:
2020-10-30



