Additional file 1: of Comparative genomics of Bacteria commonly identified in the built environment
Mendeley Data | Updated 2024-06-25 | Indexed 2024-06-28
Download link:
https://springernature.figshare.com/articles/dataset/Additional_file_1_of_Comparative_genomics_of_Bacteria_commonly_identified_in_the_built_environment/7641272
Resource description:
Table S1. Selection of bacterial genera commonly identified in the built environment. Bacterial genera identified in 54 publications were compiled (see Table S2) and commonly identified genera were selected. All bacterial genera identified in more than about 10% of the publications (n ≥ 6 publications) with at least one complete reference genome in the NCBI RefSeq database were used in this study (n = 28 genera).
Table S2. Metadata for each reference. 54 publications were compiled, including metadata for location, sub-locations, bacterial genera identified, sample type, climate (Tables S4 and S5), temperature (°C), and humidity (%). If temperature or humidity was not described by the publication, the average over a certain period of time (either the timeframe stated in the publication or the publication year) was obtained from online sources.
Table S3. Publication count for each “Common BE Bacterial Genus” by macro-level BE location. Macro-level BE locations included indoor, outdoor, underground, and extreme. Further division by sample type is also depicted, including surface (S), air (A), and water (W). Darker orange indicates that more references identified the genus in the macro BE location and sample type, while lighter orange indicates fewer references. The total number of references for each location and genus is also shown.
Table S4. Köppen climate classification. Köppen climate classification was used to identify the climate for each publication’s study location. Only the climate assignment between 1981 and 2010 was used for this study. Abbreviation descriptions, latitude, and longitude values are listed.
Table S5. Publication count for each “Common BE Bacterial Genus” by climate. The climate was identified for each publication’s study location based on the closest Köppen latitude and longitude values and correlated with the Köppen ID (see Table S4 for Köppen assignment). For publications describing general locations (e.g., only a U.S. state name was provided), a central location in the region was chosen for latitude and longitude. Publications without location specifics were not included, and publications in space were separated into a “Space” category. Darker orange indicates that more references identified the genus in the macro BE location and sample type, while lighter orange indicates fewer references. The total number of references for each location and genus is also shown.
Table S6. MetaMetaDB environmental category assignment for each “Common BE Bacterial Genus.” MetaMetaDB is a database for searching the possible habitats of a microorganism and was built by collecting 16S rRNA sequences. Environmental categories for each “Common BE Bacterial Genus” were based on an identity threshold of 97%, corresponding to the species taxonomic level. Every species for each “Common BE Bacterial Genus” is listed with the corresponding environmental category, where “Y” indicates that the species has been previously identified in the category and “N” indicates that it has not. “Hits” indicates the number of 16S rRNA sequences used by the database.
Table S7. Mean distance (Dmean) between all pairs of bacterial species for each “Common BE Bacterial Genus.” The Dmean was used to describe the genetic diversity among species within a genus. The genetic distance between a pair of bacteria was calculated with the K80 model using the ‘dist.dna’ function of the ‘ape’ package of R (https://cran.r-project.org/web/packages/ape). We used a nucleotide sequence alignment of the 16S rRNA genes in ‘The All-Species Living Tree’ Project (https://www.arb-silva.de/projects/living-tree/). LTP datasets based on SILVA release 128 were downloaded from the archive (https://www.arb-silva.de/no_cache/download/archive/living_tree/LTP_release_128/). Bacterial genera for which 3 or more taxa (N > 2) were available in LTP_release_128 were included in the 16S rRNA diversity analysis.
Table S8. Genome information. Genome features reported include size (Mb), GC content (%), GCSI (GC skew index), and S value (strength of selected codon usage). A genus was deemed BE if observed in at least 6 publications out of 54. The column “BE” shows the number of references that identified the genus.
Table S9. Robustness of the study. The genome data set used in this study was tested at two levels: 1) using different subsets of bacteria (e.g., the phyla Proteobacteria, Firmicutes, and Actinobacteria) and randomly selecting one representative for species that have multiple strains sequenced, and 2) testing different numbers of publications (n = 1, 2, 3, 4, 5, and 6) as the threshold for selecting BE genera.
Table S10. Genomic feature statistical analysis for each MetaMetaDB selected environmental category. Each genomic feature per MetaMetaDB environmental category was analyzed to determine statistical significance between the “Common BE genomes” associated with an environment and the “Common BE genomes” not associated. Significance is indicated by q-value < 0.05 and large effect size by Cliff’s delta |d| > 0.474.
(XLSX 3660 kb)
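The Table S7 description names the K80 model and the 'dist.dna' function of 'ape' but does not spell out the calculation. As a rough illustration only, the Python sketch below computes a Kimura two-parameter (K80) distance for each pair of aligned sequences and averages the pairs into a Dmean-style value. It is not the authors' code and does not call 'ape'. The function names k80_distance and mean_pairwise_distance are hypothetical, and sites with gaps or ambiguous bases are skipped per pair, which may differ from the settings used in the study.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k80_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Assumption: sites where either sequence has a gap or an ambiguous
    base are ignored for this pair (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if sites == 0:
        raise ValueError("no comparable sites between the two sequences")

    p = transitions / sites    # proportion of transitional differences
    q = transversions / sites  # proportion of transversional differences
    if 1 - 2 * q <= 0 or 1 - 2 * p - q <= 0:
        raise ValueError("sequences too divergent for the K80 correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def mean_pairwise_distance(sequences):
    """Average K80 distance over all unordered pairs of sequences.

    `sequences` maps a taxon name to its aligned sequence.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    distances = [
        k80_distance(sequences[x], sequences[y])
        for x, y in combinations(sorted(sequences), 2)
    ]
    return sum(distances) / len(distances)
```

With a dictionary of aligned 16S rRNA sequences for one genus, mean_pairwise_distance(seqs) returns a single value for that genus. The description restricts the analysis to genera with three or more taxa in LTP_release_128; that filter would be applied before calling the function.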
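The Table S10 description gives two cut-offs, q-value < 0.05 and Cliff's delta |d| > 0.474, without showing how the effect size is obtained. The Python sketch below illustrates the standard definition of Cliff's delta and applies both cut-offs. The function names cliffs_delta and is_significant_large_effect are hypothetical, the q-value is taken as an input because the description does not say how it was computed, and this is not the authors' implementation.

```python
Q_VALUE_CUTOFF = 0.05       # from the Table S10 description
LARGE_EFFECT_CUTOFF = 0.474  # from the Table S10 description


def cliffs_delta(xs, ys):
    """Cliff's delta: (#pairs with x > y - #pairs with x < y) / (n * m)."""
    if not xs or not ys:
        raise ValueError("both groups must contain at least one value")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def is_significant_large_effect(q_value, delta):
    """True when both cut-offs from the description are met."""
    return q_value < Q_VALUE_CUTOFF and abs(delta) > LARGE_EFFECT_CUTOFF
```

Here xs could hold a genomic feature (for example GC content) for genomes associated with an environmental category and ys the same feature for genomes not associated with it. A delta of 1 or -1 means the two groups do not overlap at all, and 0 means neither group tends to exceed the other.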
Created:
2023-06-28



