Extensive binding of uncharacterized human transcription factors to genomic dark matter [GSE76494 Reanalysis]
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE280246
Resource description:
Most of the human genome is thought to be non-functional, and includes large segments often referred to as “dark matter” DNA. The genome also encodes hundreds of putative and poorly characterized transcription factors (TFs). We determined genomic binding locations of 166 uncharacterized human TFs in living cells. Nearly half of them associated strongly with known regulatory regions such as promoters and enhancers, often at conserved motif matches and co-localizing with each other. Surprisingly, the other half often associated with genomic dark matter, at largely unique sites, via intrinsic sequence recognition. Dozens of these, which we term “Dark TFs”, mainly bind within regions of closed chromatin. Dark TF binding sites are rarely under purifying selection, and are enriched for transposable elements. Many Dark TFs are KZNFs, which contain the repressive KRAB domain, but many are not, and may represent potential pioneer TFs: based on compiled literature information, the Dark TFs exert diverse functions ranging from early development to tumor suppression. Thus, a large fraction of previously uncharacterized human TFs may have unappreciated activities within the dark matter genome.
This entry describes a re-analysis of pre-existing datasets located in SRA entry SRP068022. These datasets were analyzed previously (in GSE76494), but they were re-analyzed here to be consistent with the newly generated ChIP-seq datasets. The previous analyses mapped the datasets to the hg19 assembly and used different computational approaches.
***************************************************************
The table below lists GEO accessions reused/reanalyzed for this study.
***************************************************************
Created:
2024-12-12



