Reproducible inference of transcription factor footprints in ATAC-seq and DNase-seq datasets via protocol-specific bias modeling
NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE108513
Description:
DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single-nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBSs) in regulatory regions via footprinting. Recent studies have demonstrated the sequence bias of DNase I and its adverse effects on footprinting efficiency. However, footprinting and the impact of sequence bias have not been extensively studied for ATAC-seq. Here, we undertake a systematic comparison of the two methods and show that a modification to the ATAC-seq protocol increases its yield and its agreement with DNase-seq data from the same cell line. We demonstrate that the two methods have distinct sequence biases and correct for these protocol-specific biases when performing footprinting. Despite differences in footprint shapes, the locations of the inferred footprints in ATAC-seq and DNase-seq are largely concordant. However, the protocol-specific sequence biases, in conjunction with the sequence content of TFBSs, impact the discrimination of footprint from background, which leads to one method outperforming the other for some TFs. Finally, we address the depth required for reproducible identification of open chromatin regions and TF footprints.

Open chromatin profiling via ATAC-seq and DNase-seq in HEK293 and K562 cells. Please note that the only difference between high-, medium- and low-depth samples is how they were sequenced. High-depth samples were each sequenced alone on a HiSeq lane, whereas medium-depth samples were multiplexed 2 libraries per lane and low-depth samples 4 libraries per lane.
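As a rough illustration of the depth tiers described above, the following Python sketch computes the share of one lane that each library would receive. It assumes equal pooling of multiplexed libraries, which the description does not state. The function names and the example lane yield are hypothetical and are not part of the dataset.

```python
# Libraries multiplexed per HiSeq lane for each depth tier,
# taken from the dataset description.
LIBRARIES_PER_LANE = {"high": 1, "medium": 2, "low": 4}


def expected_lane_fraction(tier: str) -> float:
    """Expected share of one lane's reads per library.

    ASSUMPTION: libraries sharing a lane are pooled in equal amounts,
    so each receives 1/n of the lane. Real pooling is rarely exact.
    """
    return 1.0 / LIBRARIES_PER_LANE[tier]


def expected_reads(tier: str, lane_yield: float) -> float:
    """Expected reads per library given a hypothetical total lane yield."""
    return lane_yield * expected_lane_fraction(tier)


if __name__ == "__main__":
    for tier in LIBRARIES_PER_LANE:
        print(f"{tier}: {expected_lane_fraction(tier):.2f} of a lane per library")
```

Under this assumption, a medium-depth library gets about half, and a low-depth library about a quarter, of the reads of a high-depth library. Actual per-sample read counts should be taken from the GEO records.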
Created:
2019-03-27



