
DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers [Drosophila genome-wide UMI-STARR-seq]

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://www.ncbi.nlm.nih.gov/sra/SRP336576
Description:
Enhancer sequences control gene expression and comprise binding sites (motifs) for different transcription factors (TFs). Despite extensive genetic and computational studies, the relationship between DNA sequence and regulatory activity is poorly understood, and de novo enhancer design is considered impossible. Here we built a deep learning model, DeepSTARR, to quantitatively predict the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 cells. The model learned relevant TF motifs and higher-order syntax rules, including functionally non-equivalent instances of the same TF motif that are determined by motif-flanking sequence and inter-motif distances. We validated these rules experimentally and demonstrated their conservation in human by testing more than 40,000 wild-type and mutant Drosophila and human enhancers. Finally, we designed and functionally validated synthetic enhancers with desired activities de novo.

Overall design: Genome-wide UMI-STARR-seq was performed in S2 cells using two core promoters, one representing the housekeeping and one the developmental transcription program. All experiments were performed in 2 biological replicates.
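The description says the model predicts activity "directly from DNA sequence" but does not state how sequences are represented. As a hedged illustration only, the sketch below shows one-hot encoding, a common input representation for sequence-to-activity models. The function names `one_hot_encode` and `reverse_complement` are hypothetical and are not taken from the DeepSTARR code or this record.

```python
# Illustrative sketch only: one-hot encoding of DNA, a common (assumed) input
# format for sequence-based models. Not taken from the DeepSTARR code base.

BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def one_hot_encode(seq):
    """Return one row per base, each a 4-element list in A, C, G, T order.

    Bases outside ACGT (for example N) are encoded as an all-zero row.
    """
    index = {base: i for i, base in enumerate(BASES)}
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T/N sequence, in upper case."""
    return seq.upper().translate(_COMPLEMENT)[::-1]
```

For example, `one_hot_encode("ACGTN")` yields four rows with a single 1 each, followed by an all-zero row for `N`. Encoding both a sequence and its `reverse_complement` is one way to account for the fact that a regulatory element can be read from either strand.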

Created:
2022-05-27