Table_1_Large-Scale Integrative Analysis of Soybean Transcriptome Using an Unsupervised Autoencoder Model.DOCX
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://figshare.com/articles/dataset/Table_1_Large-Scale_Integrative_Analysis_of_Soybean_Transcriptome_Using_an_Unsupervised_Autoencoder_Model_DOCX/19296662
Description:
Plant tissues are distinguished by their gene expression patterns, which can be used to identify genes that are highly expressed in a specific tissue and their differential functional modules. For this purpose, large-scale soybean transcriptome samples were collected and processed through a uniform analysis pipeline, starting from raw sequencing reads. To address the heterogeneity of gene expression across tissues, we used an adversarial deconfounding autoencoder (AD-AE) model to map gene expression into a latent space, and adapted a standard unsupervised autoencoder (AE) model to extract meaningful biological signals from the noisy data. As a result, four groups of 1,743, 914, 2,107, and 1,451 genes were found to be highly and specifically expressed in leaf, root, seed, and nodule tissues, respectively. To obtain key transcription factors (TFs), hub genes, and their functional modules in each tissue, we constructed tissue-specific gene regulatory networks (GRNs) and differential correlation networks from the corrected and compressed gene expression data. We validated our results against the literature and through gene enrichment analysis, which confirmed many of the identified tissue-specific genes. Our study represents the largest gene expression analysis in soybean tissues to date. It provides valuable targets for tissue-specific research and helps uncover broader biological patterns. The source code is publicly available at https://github.com/LingtaoSu/SoyMeta.
Created:
2022-03-03



