
CTCF BINDING PATTERNS DEFINE TADS AND BOUNDARY ELEMENTS IN HUMAN AND MOUSE GENOMES

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://zenodo.org/record/8305566
Resource description:
MSci thesis from an industrial placement at Politecnico di Milano.

Abstract

In eukaryotes, the genome is highly packaged inside the nucleus. This hierarchical structure requires complex compartmentalization and organization. Topologically associating domains (TADs) serve as architectural units in the 3D genome, functioning to regulate and constrain gene interaction. TAD boundaries are enriched in the CCCTC-binding factor (CTCF). Because the CTCF motif is asymmetric, its orientation depends on the strand on which it lies. Motif orientation is crucial for loop formation, which is facilitated when motifs are in a convergent orientation. While the relationship between TADs and CTCF motif patterns has been described in humans, the model remained to be corroborated in other mammals.

We conducted a comparative analysis by applying the approach developed in humans to the mouse genome. By integrating ChIP-seq data, we examined CTCF sites based on ChIP-seq peak consolidation, signal value, and motif p-value. This analysis aimed to characterize the binding profile of CTCF and elucidate its complexity in terms of binding strength and cross-tissue conservation. Mouse genome CTCF motifs were categorized into cluster patterns based on their relative orientation. Following this classification, the distribution of each pattern was studied using ChIP-seq-related descriptors. We then built a TAD boundary collection based on the insulation score. Boundary consolidation and length, insulation score, and CTCF number were used to relate CTCF to TAD boundary function. The cluster patterns were then mapped to TADs and TAD boundaries to elucidate their structure based on the spatial arrangement of CTCF.

The distribution of ChIP-seq-validated CTCF sites within TAD boundaries revealed a well-defined structure, characterized by an abundance of the divergent pattern. Boundaries with the highest insulation scores exhibited both high consolidation and CTCF enrichment. This set of boundaries was used to study TAD structure. An increase in the convergent pattern was observed inside TADs. The individual CTCF sites in the interior of the TAD were further organized into left and right sections, whose directional arrangement opposed that of the TAD boundaries. Notably, the ChIP-seq data contained many unannotated peaks and non-signal motifs. These were associated with a high motif p-value, low ChIP-seq signal, low consolidation, and a less structured arrangement of CTCF clustering. Moreover, non-signal motifs were underrepresented within TAD boundaries and displayed a disorderly distribution.

The human model of CTCF-dependent TAD structure was confirmed in mice. The model suggests an inherent alternation between convergent and divergent patterns, with these CTCF sites conferring characteristic properties upon TADs and their boundaries. These properties include the change in preferential genomic interaction and the boundary insulation capability. Thus, the orientation-based clustering of CTCF sites was further validated as a good method to define TADs in a precise and organized manner.
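
The abstract refers to classifying CTCF motifs by their relative orientation. As a rough illustration only, the sketch below labels each pair of neighboring motifs as convergent (forward strand followed by reverse strand), divergent (reverse followed by forward), or same. It is not the thesis's actual clustering procedure, which is not specified here; the names `classify_pair` and `classify_adjacent`, the `(position, strand)` tuple layout, and the example coordinates are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation of a motif: (start position, strand), strand is '+' or '-'.
Motif = Tuple[int, str]


def classify_pair(left_strand: str, right_strand: str) -> str:
    """Label two neighboring motifs by relative orientation (left one is upstream)."""
    if left_strand not in "+-" or right_strand not in "+-" or not (left_strand and right_strand):
        raise ValueError("strand must be '+' or '-'")
    if left_strand == "+" and right_strand == "-":
        return "convergent"  # motifs point toward each other: > <
    if left_strand == "-" and right_strand == "+":
        return "divergent"   # motifs point away from each other: < >
    return "same"            # both motifs point in the same direction


def classify_adjacent(motifs: List[Motif]) -> List[str]:
    """Sort motifs by position and classify every consecutive pair."""
    ordered = sorted(motifs, key=lambda m: m[0])
    return [classify_pair(a[1], b[1]) for a, b in zip(ordered, ordered[1:])]


# Made-up coordinates on a single chromosome, for illustration only.
sites = [(900, "+"), (100, "-"), (1200, "-"), (250, "+")]
patterns = classify_adjacent(sites)
print(patterns)
```

With these made-up sites, the pair at positions 100 and 250 is labeled divergent and the pair at 900 and 1200 convergent, mirroring the alternation the abstract describes at boundaries and inside TADs. A real analysis would also need chromosome names, a distance cutoff for what counts as a cluster, and filtering by ChIP-seq support.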
Created: 2023-10-19