Table_2_Integrating Peak Colocalization and Motif Enrichment Analysis for the Discovery of Genome-Wide Regulatory Modules and Transcription Factor Recruitment Rules.xlsx

Indexed from NIAID Data Ecosystem on 2026-03-11

Download link:

https://figshare.com/articles/dataset/Table_2_Integrating_Peak_Colocalization_and_Motif_Enrichment_Analysis_for_the_Discovery_of_Genome-Wide_Regulatory_Modules_and_Transcription_Factor_Recruitment_Rules_xlsx/11880144

Description:

Chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq) has opened new avenues of research in the genome-wide characterization of regulatory DNA-protein interactions at the genetic and epigenetic level. As a consequence, it has become the de facto standard for studies on the regulation of transcription, and thousands of datasets for transcription factors and cofactors in different conditions and species are now available to the scientific community. However, while pipelines and best practices have been established for the analysis of a single experiment, there is still no consensus on the best way to perform an integrated analysis of multiple datasets in the same condition in order to identify the most relevant and widespread regulatory modules composed of different transcription factors and cofactors. We present here a computational pipeline for this task that integrates peak summit colocalization, a novel statistical framework for the evaluation of its significance, and motif enrichment analysis. We show examples of its application to ENCODE data, which led to the identification of relevant regulatory modules composed of different factors, as well as the organization on DNA of the binding motifs responsible for their recruitment.
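
To make the idea of peak summit colocalization concrete, here is a minimal Python sketch. It is not the authors' pipeline and it does not implement their statistical framework. It only counts how many summits of one factor have a summit of another factor nearby. The function name `colocalized_fraction`, the `(chromosome, position)` input layout, and the 150 bp default window are all assumptions made for illustration.

```python
from bisect import bisect_left


def colocalized_fraction(summits_a, summits_b, max_dist=150):
    """Fraction of summits in A with a summit in B within max_dist bp.

    Both inputs are lists of (chromosome, position) tuples.
    The names and the 150 bp default are illustrative assumptions.
    """
    # Group the summits of B by chromosome and sort each group,
    # so that nearby positions can be found by binary search.
    b_by_chrom = {}
    for chrom, pos in summits_b:
        b_by_chrom.setdefault(chrom, []).append(pos)
    for positions in b_by_chrom.values():
        positions.sort()

    hits = 0
    for chrom, pos in summits_a:
        positions = b_by_chrom.get(chrom, [])
        # Index of the first B summit at or to the right of the window start.
        i = bisect_left(positions, pos - max_dist)
        # That summit colocalizes if it also lies before the window end.
        if i < len(positions) and positions[i] <= pos + max_dist:
            hits += 1

    return hits / len(summits_a) if summits_a else 0.0


# Toy, made-up coordinates (not taken from the dataset).
tf_a = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
tf_b = [("chr1", 1100), ("chr2", 900)]
fraction = colocalized_fraction(tf_a, tf_b)
```

With the toy input, only the first summit of `tf_a` has a `tf_b` summit inside the window. A real analysis would still need to test whether the observed fraction exceeds what is expected by chance, which is the role of the statistical framework mentioned in the description.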

Created:

2020-02-21