High dimensional association detection in large-scale genomic data
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE156074
Description:
Joint analyses of genomic datasets obtained under multiple conditions are essential for understanding the biological mechanisms that drive tissue specificity and cell differentiation, but they remain computationally challenging. To address this, we introduce CLIMB (Composite LIkelihood eMpirical Bayes), a statistical methodology that learns patterns of condition-specificity present in genomic data. CLIMB provides a generic framework that facilitates a host of analyses, such as clustering genomic features that share similar condition-specific patterns and identifying which of these features are involved in cell fate commitment. Our approach improves upon existing methods by boosting statistical power to identify meaningful signals while retaining interpretability and computational tractability. We illustrate CLIMB's value on two sets of hematopoietic data: one studying CTCF ChIP-seq measured in 17 different cell populations, and another examining RNA-seq measured across constituent cell populations in three committed lineages. These analyses demonstrate that CLIMB captures biologically relevant clusters in the data and improves upon the commonly used pairwise comparisons and unsupervised clusterings typical of genomic analyses.

CTCF ChIP-seq data from 17 cell populations were jointly analyzed; the resulting clusters of genomic loci were compared against ATAC-seq, H3K4me1 ChIP-seq, and H3K4me3 ChIP-seq at matching loci. RNA-seq reads were compared across differentiated states in the erythroid, myeloid, and megakaryocytic cell lineages. DNase-seq data from 38 biosamples across all autosomes were jointly reanalyzed using CLIMB and non-negative matrix factorization (see GSE156074_DNase-seq_README.txt for details).
Created:
2022-11-30



