Table3_Machine learning identified MDK score has prognostic value for idiopathic pulmonary fibrosis based on integrated bulk and single cell expression data.XLSX
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/Table3_Machine_learning_identified_MDK_score_has_prognostic_value_for_idiopathic_pulmonary_fibrosis_based_on_integrated_bulk_and_single_cell_expression_data_XLSX/24630030
Description:
Idiopathic pulmonary fibrosis (IPF) is a progressive and fatal lung disease. Its incidence and prevalence are increasing, and its underlying molecular mechanisms remain poorly understood, which makes it a significant challenge for medical professionals. In this study, we integrated five bulk-tissue expression datasets with single-cell datasets and applied pseudotime trajectory analysis, switch gene selection, and cell communication analysis. Using the prognostic information in the GSE47460 dataset, we identified 22 differentially expressed switch genes that correlated with clinical indicators and treated them as important genes. Among these genes, midkine (MDK) has the potential to serve as a marker of IPF because its cell communication-related genes are differentially expressed in epithelial cells. We then used midkine and its cell communication-related genes to calculate the midkine score. Machine learning models were further constructed from midkine and its related genes to predict IPF from the bulk gene expression datasets. The midkine score correlated with clinical indexes, and the machine learning model achieved AUCs of 0.94 and 0.86 in the IPF classification task on lung tissue samples and peripheral blood mononuclear cell samples, respectively. Our findings offer insights into the pathogenesis of IPF and suggest new therapeutic directions and target genes for further investigation.
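The description mentions a gene-based score and an AUC for classification but does not say how either was computed. The following is a minimal sketch under stated assumptions, not the authors' method: it assumes the score is the per-sample mean of z-scored expression over a gene set, and it computes AUC as the fraction of (case, control) pairs ranked correctly. The function names `gene_set_score` and `roc_auc` and the partner gene names in the usage note are hypothetical.

```python
from statistics import mean, pstdev


def gene_set_score(expr, genes):
    """Per-sample mean z-score over a gene set (an assumed scoring rule).

    expr  : dict mapping gene name -> list of expression values, one per sample
    genes : gene names to include; genes missing from expr or with zero
            variance are skipped
    """
    n_samples = len(next(iter(expr.values())))
    totals = [0.0] * n_samples
    used = 0
    for gene in genes:
        values = expr.get(gene)
        if values is None:
            continue
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # a constant gene carries no information
        for i, v in enumerate(values):
            totals[i] += (v - mu) / sd
        used += 1
    if used == 0:
        raise ValueError("no usable genes in the gene set")
    return [t / used for t in totals]


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative sample count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))
```

With toy data, for example `expr = {"MDK": [8.0, 7.5, 6.0, 5.0, 6.5, 4.0], "PARTNER_A": [3.0, 2.5, 2.0, 1.0, 2.2, 1.5]}` and `labels = [1, 1, 1, 0, 0, 0]`, calling `roc_auc(labels, gene_set_score(expr, ["MDK", "PARTNER_A"]))` returns a value between 0 and 1, where 0.5 means the score does not separate the groups and 1.0 means perfect separation. The values and the gene name `PARTNER_A` are made up for illustration and are not taken from the dataset.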
Created:
2023-11-24



