Additional file 1 of Comprehensive profiling of genomic invertons in defined gut microbial community reveals associations with intestinal colonization and surface adhesion

Creator: figshare
Published: 2025-03-10 04:10:44
License: No description available

DataCite Commons. Updated 2025-03-10; indexed 2025-05-07.

Download link:

https://springernature.figshare.com/articles/dataset/Additional_file_1_of_Comprehensive_profiling_of_genomic_invertons_in_defined_gut_microbial_community_reveals_associations_with_intestinal_colonization_and_surface_adhesion/28561463/1

Resource description:

Supplementary Material 1: Supplementary Sections S1-S18 are available in the Supplementary Information PDF file. Supplementary Tables S1-S18 are included as CSV files and contain the following data:
Table S1: Metadata on hCom2 strains analyzed in this work, including column information on strain name, abbreviated name (abbrev), associated sequence contigs, BioSample/BioProject accession numbers, and GTDB phylogeny (domain, phylum, class, order, family, genus, species).
Table S2: Read libraries analyzed in this work with associated metadata, including column information on sample names, sample types, SRA accession number, sample descriptions, number of read pairs in the raw read library, number and fraction of read pairs that passed mouse-host-filtering (for mouse-stool sample reads only), and number and fraction of read pairs that were successfully mapped by bowtie2 (the denominator being the number of read pairs that passed mouse-host-filtering if the sample was a mouse-stool sample, or the total raw read pair count otherwise).
Table S3: Forward/reverse read counts of invertons in isolate culture comparing original PhaseFinder vs PhaseFinderDC, including information on the actual strain culture used to generate the read library, inverton ID, forward read counts based on paired-end orientation (Pe_F), reverse read counts based on paired-end orientation (Pe_R), forward read counts based on directly spanning inversion junction (Span_F), reverse read counts based on directly spanning inversion junction (Span_R), the strain genome onto which reads were mapped and called as inverton (mappedStrain), which workflow was used (original PhaseFinder or PhaseFinderDC), and whether this was a mis-map (i.e., actualStrain was not the same as mappedStrain).
Table S4: Forward/reverse read counts of all invertons across all samples, determined using PhaseFinderDC, including column information on inverton ID, sample ID, forward read counts based on paired-end orientation (Pe_F), reverse read counts based on paired-end orientation (Pe_R), inversion ratio Pe_R/(Pe_R+Pe_F) calculated using read counts based on paired-end orientation (Pe_ratio), forward read counts based on directly spanning inversion junction (Span_F), reverse read counts based on directly spanning inversion junction (Span_R), and inversion ratio Span_R/(Span_R+Span_F) calculated using read counts based on directly spanning inversion junction (Span_ratio).
Table S5: Metadata on all identified invertons, including column information on inverton ID, inverton group, whether it intersects a gene coding sequence (intersectGene), the associated IR sequence, and full inverton sequence.
Table S6: Enriched motifs detected across inverton IR sequences, including column information on motif ID and sequence.
Table S7: Enriched motifs detected across full inverton sequences, including column information on motif ID and sequence.
Table S8: Instances of detected motifs from Table S7 across hCom2 genomes, including column information on genomic locations (chrom, start, stop, strand), the associated motif ID, and MEME p-value of the detected instance. For motif instances that intersect an inverton, additional column information is included on the inverton ID, location of the inverton (start_inverton, end_inverton), and inverton group.
Table S9: Metadata on inverton-proximal genes across hCom2, including column information on genomic loci of each gene (chrom, start, end, strand), the gene ID, the gene annotation, the associated inverton ID, whether the gene intersected the inverton directly, and if not, whether it could be regulated by a promoter in the inverton (5’-end of gene is closer to inverton than 3’-end of gene).
Table S10: Enrichment of inverton-proximal gene annotations by inverton group and type of proximity, including column information on the gene annotation, inverton group, the type of proximity (regulatable-vs.-nonregulatable-vs.-intersecting), the number of genes with given annotation that are proximal to inverton of given group with given type of proximity (near inverton gene count), the number of genes with given annotation that are not proximal to inverton of given group with given type of proximity (not near inverton gene count), the number of genes not with given annotation that are proximal to inverton of given group with given type of proximity (near inverton other gene count), the number of genes not with given annotation that are not proximal to inverton of given group with given type of proximity (not near inverton other gene count), the odds ratio, Fisher’s Exact p-value, and significance after multiple hypothesis correction.
Table S11: Invertase genes across hCom2, including column information on the locations of the genes (chrom, start, end, strand), gene ID, gene annotation, and invertase group. For invertase gene instances that intersect an inverton, additional column information is included on the inverton ID and inverton group.
Table S12: Enrichment of invertase groups proximal to inverton groups, including column information on the inverton group, invertase group, the number of invertons within given inverton group that are proximal to an invertase gene within the given invertase group, the number of invertons not within given inverton group that are proximal to an invertase gene within the given invertase group, the number of invertons within given inverton group that are not proximal to an invertase gene within the given invertase group, the number of invertons not within given inverton group that are not proximal to an invertase gene within the given invertase group, the odds ratio, Fisher’s Exact p-value, and significance after multiple hypothesis correction.
Table S13: Directionally biased invertons comparing between (i) isolate culture / mouse stool and (ii) carrier-attached / liquid-phase mixed culture samples, including column information on inverton ID, the number of forward orientation reads in the first condition (Pe_F_cond1), the number of reverse orientation reads in the first condition (Pe_R_cond1), the number of forward orientation reads in the second condition (Pe_F_cond2), the number of reverse orientation reads in the second condition (Pe_R_cond2), the odds ratio, Fisher’s Exact p-value, and label of first condition vs. second condition comparison.
Table S14: Gene regulation predictions based on directionally biased invertons with promoter motifs, including column information on the genomic location and attributes of the gene (chrom, start, end, gene ID, gene strand, gene annotation), the inverton ID, the type of gene proximity (regulatable-vs.-nonregulatable-vs.-intersecting), the number of forward inverton orientation reads in the first condition (Pe_F_cond1), the number of reverse inverton orientation reads in the first condition (Pe_R_cond1), the number of forward inverton orientation reads in the second condition (Pe_F_cond2), the number of reverse inverton orientation reads in the second condition (Pe_R_cond2), the odds ratio, Fisher’s Exact p-value, the label of first condition vs. second condition comparison, the direction the inverton is enriched toward in the first condition, the motif ID of the predicted promoter, the strand of the detected motif on the genome, whether the promoter is a reverse complement of the motif, the strand of the promoter determined from the two preceding columns, and the condition with the predicted upregulation.
Table S15: Gene annotations enriched for inverton-based regulation, including column information on the gene annotation, the label of first condition vs. second condition comparison, the condition with the predicted upregulation, the number of genes with given annotation that are predicted to be upregulated in the given condition, the number of genes with a different annotation that are predicted to be upregulated in the given condition, the number of genes with given annotation that are not predicted to be upregulated in the given condition, the number of genes with a different annotation that are not predicted to be upregulated in the given condition, the odds ratio, and Fisher’s Exact p-value.
Table S16: Metadata on transcriptomic datasets used for gene regulation prediction validation, including column information on the SRA accession, gene ID, sample type, and cqn-normalized RPKM value.
Table S17: Gene expression estimated based on transcriptomic datasets, including column information on the gene ID, median RPKM estimated across mouse and in vitro RNA-seq samples, and the predicted upregulated condition based on inverton orientation data.
Table S18: Directionally biased invertons comparing across timepoints (mouse generation and mixed culture passage), including column information on the inverton ID, the number of forward inverton orientation reads in the first condition (Pe_F_cond1), the number of reverse inverton orientation reads in the first condition (Pe_R_cond1), the number of forward inverton orientation reads in the second condition (Pe_F_cond2), the number of reverse inverton orientation reads in the second condition (Pe_R_cond2), the odds ratio, Fisher’s Exact p-value, and the label of first condition vs. second condition comparison.
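
The inversion ratio and the odds ratio / Fisher's Exact p-value columns described in the table legends can be recomputed from the read-count columns. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the 2x2 layout [[Pe_F_cond1, Pe_R_cond1], [Pe_F_cond2, Pe_R_cond2]] is an assumption, and the test is assumed to be two-sided (the description does not say which alternative was used). It relies only on the Python standard library.

```python
from math import comb


def inversion_ratio(forward, reverse):
    """Fraction of reads in reverse orientation: R / (R + F), e.g. Pe_R / (Pe_R + Pe_F)."""
    total = forward + reverse
    return reverse / total if total else float("nan")


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test on the table [[a, b], [c, d]].

    Assumed layout (hypothetical): a=Pe_F_cond1, b=Pe_R_cond1,
    c=Pe_F_cond2, d=Pe_R_cond2. Returns (sample odds ratio, p-value).
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Two-sided p-value: sum over all tables no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    if b * c:
        odds_ratio = (a * d) / (b * c)
    else:
        odds_ratio = float("inf") if a * d else float("nan")
    return odds_ratio, min(1.0, p_value)


if __name__ == "__main__":
    # Made-up counts for illustration only; not taken from the dataset.
    print(inversion_ratio(forward=30, reverse=10))
    print(fisher_exact_2x2(3, 1, 1, 3))
```

For real analyses an established implementation such as `scipy.stats.fisher_exact` is preferable, and the orientation of the odds ratio should be checked against the values in the published tables before relying on this layout.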

Provider:

figshare

Created:

2025-03-10