
Additional file 1 of Integrative metagenomic, transcriptomic, and proteomic analysis reveal the microbiota-host interplay in early-stage lung adenocarcinoma among non-smokers

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://figshare.com/articles/dataset/Additional_file_1_of_Integrative_metagenomic_transcriptomic_and_proteomic_analysis_reveal_the_microbiota-host_interplay_in_early-stage_lung_adenocarcinoma_among_non-smokers/26289146
Description:
Additional file 1:

Figure S1. Quality control of metagenomic data. A Comparison of reads passing quality control after filtering in 129 samples. B Specaccum species accumulation curves of the three groups. C Rarefaction curves showing observed species richness in the 129 samples. D Overall kingdom-level taxa distribution of the microbiome in the three groups.

Figure S2. Microbial compositions in the cohort. Microbial compositions of the patients with ES-LUAD and HCs at the phylum (A), genus (B), and species (C) levels. The top 10/20 abundant microbial taxa are shown with different gradient colors. The microbial composition is arranged in order of the most abundant taxonomic ranks.

Figure S3. Representative microbes exhibiting significant alterations between patients with ES-LUAD and HCs. *** p < 0.001 as determined by Kruskal–Wallis test.

Figure S4. Correlation between intrapulmonary microbiota and clinical features. A, D Comparison of the alpha diversity (Chao1/Shannon/Simpson index) and beta diversity (Bray–Curtis distance) at the species level by tumor infiltration in patients with ES-LUAD. B, E Comparison of the alpha diversity (Chao1/Shannon/Simpson index) and beta diversity (Bray–Curtis distance) at the species level by solid component of the tumor in patients with ES-LUAD. C, F Comparison of the alpha diversity (Chao1/Shannon/Simpson index) and beta diversity (Bray–Curtis distance) at the species level by multiple-primary nodules in patients with ES-LUAD. Box plots show median ± quartiles, and the whiskers extend from the hinge to the largest or smallest value no further than 1.5 times the interquartile range. ns: Not significant, p-value as determined by Wilcoxon rank-sum test. AIS: Adenocarcinoma in situ, MIA: Minimally invasive adenocarcinoma, IA: Invasive adenocarcinoma, pGGN: Pure ground glass nodules, mGGN: Mixed ground glass nodules, SN: Solid nodule.

Figure S5. Overview of transcriptome data. A RNA-Seq reads passing quality control, sequenced by Illumina NovaSeq 6000 Nanopore platforms (Wilcoxon rank-sum test). B Clustering heatmap of the DEGs between patients with ES-LUAD and HCs (DESeq2, |log2FC| > 1). C PCA reveals differences in the transcriptomes of patients with ES-LUAD and HCs. D Volcano plot showing the significant DEGs between patients with ES-LUAD and HCs (DESeq2, |log2FC| > 1).

Figure S6. Identification of ES-LUAD-related mRNAs in the transcriptome dataset through WGCNA. A–D Network fitting calculations with fitted curves for selected network construction parameters. A Correlation coefficient corresponding to different power values. B Average connectivity of the network constructed with different power values. At a power of 8, both the correlation coefficient and the average connectivity of the network are relatively high, so a power of 8 is used in the subsequent module construction. C The distribution of network connectivity when the power is 8. D The test result of the power law distribution. As can be seen from the figure, k and p(k) are negatively correlated (correlation coefficient: 0.85), indicating that the selected power value enables the establishment of a scale-free gene network. E The result of weighted co-expression network construction. F Heatmap of correlation analysis between modules and clinical traits. G Gene expression information statistics within modules.

Figure S7. Overview of proteomic data. A QC sample correlation, representing process stability. B Clustering heatmap of the DEPs between patients with ES-LUAD and HCs (Wilcoxon rank-sum test, log2 fold change > 1). C PCA reveals differences in the proteome of patients with ES-LUAD and HCs. D Volcano plot showing the significant DEPs between patients with ES-LUAD and HCs (Wilcoxon rank-sum test, log2 fold change > 1).

Figure S8. Validation and prognostic information of DEGs and DEPs in public databases. The top represents the expression of DEGs, and the bottom represents the expression of DEPs. The middle represents overall survival (OS) and disease-free survival (DFS). Solid lines indicate significance at p < 0.05 (Mantel–Cox test).

Figure S9. Random forest model based on multi-omics data. A The left panel shows the validation cohort ROC curve for the random forest model built on 3000 proteins (training AUC = 1). The middle panel depicts the selection of the optimal feature count based on 10-fold cross-validation. The right panel shows the ROC curve for the top 150 proteins in the validation cohort (training AUC = 1). B The left panel shows the validation cohort ROC curve for the random forest model built on 13846 mRNAs (training AUC = 1). The middle panel depicts the selection of the optimal feature count based on 10-fold cross-validation. The right panel shows the ROC curve for the top 500 mRNAs in the validation cohort (training AUC = 1). C The left panel shows the validation cohort ROC curve for the random forest model built on 196 KO genes (training AUC = 1). The middle panel depicts the selection of the optimal feature count based on 10-fold cross-validation. D The left panel shows the validation cohort ROC curve for the random forest model built on 398 microbes and 3000 proteins (training AUC = 1). The middle panel depicts the selection of the optimal feature count based on 10-fold cross-validation. The right panel shows the ROC curve for the top 45 microbes and proteins in the validation cohort (training AUC = 1).
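For readers unfamiliar with the alpha-diversity indices named in Figure S4 (Chao1, Shannon, Simpson), the following is a minimal Python sketch of how each can be computed from a vector of per-species read counts. It is illustrative only: the description does not state which software the authors used, the function names and the example counts are hypothetical, and Simpson is computed here in its Gini-Simpson form (1 minus the sum of squared proportions), which is only one of several conventions in use.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Gini-Simpson index 1 - sum(p_i^2); an assumed convention, others exist."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts if c > 0)


def chao1(counts):
    """Chao1 richness estimate from singleton (f1) and doubleton (f2) taxa."""
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 > 0:
        return observed + f1 * f1 / (2 * f2)
    # Bias-corrected form, used when no taxon is seen exactly twice.
    return observed + f1 * (f1 - 1) / 2


# Hypothetical read counts for one sample (not taken from the dataset).
example_counts = [5, 1, 1, 2]
print(shannon(example_counts), simpson(example_counts), chao1(example_counts))
```

Shannon and Simpson summarize how evenly reads are spread across species, while Chao1 estimates total richness by adding a correction for species likely missed, based on how many were seen only once or twice.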
Created:
2024-07-13