Data_Sheet_3_Cross-Species Meta-Analysis of Transcriptomic Data in Combination With Supervised Machine Learning Models Identifies the Common Gene Signature of Lactation Process.XLSX
Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
https://figshare.com/articles/dataset/Data_Sheet_3_Cross-Species_Meta-Analysis_of_Transcriptomic_Data_in_Combination_With_Supervised_Machine_Learning_Models_Identifies_the_Common_Gene_Signature_of_Lactation_Process_XLSX/6808820
Description:
Lactation, a physiologically complex process, takes place in the mammary gland after parturition. The expression profile of the genes involved in lactation has not been comprehensively elucidated. Herein, a meta-analysis of publicly available microarray data was conducted to identify the differentially expressed genes (DEGs) between pre- and post-peak milk production. Three microarray datasets from rat, Bos taurus, and tammar wallaby were used. Samples related to pre-peak (n = 85) and post-peak (n = 24) milk production were selected. The meta-analysis revealed 31 DEGs across the studied species. Interestingly, 10 genes, including MRPS18B, SF1, UQCRC1, NUCB1, RNF126, ADSL, TNNC1, FIS1, HES5 and THTPA, were not detected in the original studies, which highlights the power of meta-analysis in biosignature discovery. Common target and regulator analysis highlighted the high connectivity of CTNNB1, CDD4 and LPL as gene network hubs. Because the data originally came from three different species, 10 attribute weighting (machine learning) algorithms were applied to check the effects of heterogeneous data sources on the DEGs. The attribute weighting results showed that the type of organism had little or no effect on the selected gene list. Systems biology analysis suggested that these DEGs affect milk production by improving immune system performance and mammary cell growth. This is the first study employing both meta-analysis and machine learning approaches for comparative analysis of the gene expression patterns of mammary glands at two important time points of the lactation process. The findings may pave the way for using publicly available data to elucidate the underlying molecular mechanisms of physiologically complex traits such as lactation in mammals.
Created:
2018-07-12



