DataSheet2_Genome-wide imputed differential expression enrichment analysis identifies trait-relevant tissues.PDF
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://figshare.com/articles/dataset/DataSheet2_Genome-wide_imputed_differential_expression_enrichment_analysis_identifies_trait-relevant_tissues_PDF/21842724
Description:
The identification of pathogenically-relevant genes and tissues for complex traits can be a difficult task. We developed an approach named genome-wide imputed differential expression enrichment (GIDEE), to prioritise trait-relevant tissues by combining genome-wide association study (GWAS) summary statistic data with tissue-specific expression quantitative trait loci (eQTL) data from 49 GTEx tissues. Our GIDEE approach analyses robustly imputed gene expression and tests for enrichment of differentially expressed genes in each tissue. Two tests (mean squared z-score and empirical Brown’s method) utilise the full distribution of differential expression p-values across all genes, while two binomial tests assess the proportion of genes with tissue-wide significant differential expression. GIDEE was applied to nine training datasets with known trait-relevant tissues and ranked 49 GTEx tissues using the individual and combined enrichment tests. The best-performing enrichment test produced an average rank of 1.55 out of 49 for the known trait-relevant tissue across the nine training datasets—ranking the correct tissue first five times, second three times, and third once. Subsequent application of the GIDEE approach to 20 test datasets—whose pathogenic tissues or cell types are uncertain or unknown—provided important prioritisation of tissues relevant to the trait’s regulatory architecture. GIDEE prioritisation may thus help identify both pathogenic tissues and suitable proxy tissue/cell models (e.g., using enriched tissues/cells that are more easily accessible). The application of our GIDEE approach to GWAS datasets will facilitate follow-up in silico and in vitro research to determine the functional consequence(s) of their risk loci.
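As a rough illustration of the enrichment statistics named in the description, the sketch below computes a mean squared z-score, a one-sided binomial tail probability for the proportion of significant genes, and a tissue ranking. It is not the authors' implementation: the function names, the choice of null proportion `p`, and the use of p-values as ranking scores are all assumptions made for this example, and the empirical Brown's method is not covered.

```python
import math
from statistics import mean


def mean_squared_z(z_scores):
    """Mean of squared z-scores across genes (hypothetical helper).

    Under a null of independent standard-normal z-scores the expected
    value is 1, so larger values suggest enrichment.
    """
    return mean(z * z for z in z_scores)


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Terms are evaluated in log space so that large gene counts do not
    overflow. Here p is the assumed null proportion of significant genes.
    """
    if k <= 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


def rank_tissues(p_values):
    """Rank tissues by enrichment p-value; rank 1 is the smallest p-value."""
    ordered = sorted(p_values, key=p_values.get)
    return {tissue: i + 1 for i, tissue in enumerate(ordered)}


# Ranks reported for the known tissue across the nine training datasets:
# first five times, second three times, third once.
training_ranks = [1] * 5 + [2] * 3 + [3]
average_rank = mean(training_ranks)  # 14/9
```

The rank breakdown given in the description averages to 14/9, or about 1.56, which is close to the 1.55 reported there.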
Created:
2023-01-09



