Homo sapiens Raw sequence reads
Source: NIAID Data Ecosystem (indexed 2026-04-25)
Download link:
https://www.ncbi.nlm.nih.gov/sra/SRP200845
Description:
The cleavage and polyadenylation reaction is a crucial step in transcription termination and pre-mRNA maturation in human cells. Despite extensive research, how polyadenylation-mediated regulation of gene expression is encoded in the DNA sequence is not well understood. Here, we used a massively parallel reporter assay to measure the effect of over 12,000 rationally designed polyadenylation sequences (PASs) on reporter gene expression and cleavage efficiency. We find that expression driven by different PAS sequences spans more than five orders of magnitude. Using a uniquely designed scanning mutagenesis data set, we gain mechanistic insight into the modes of action by which cleavage efficiency affects the sensitivity or robustness of a PAS to mutation. Furthermore, we employ motif discovery to identify both known and novel sequence motifs associated with PAS-mediated regulation. Leveraging the large scale of our data, we train a deep learning model that predicts RNA levels from DNA sequence alone with high accuracy (R = 0.83). Moreover, we predict the distribution of cleavage efficiencies along the PAS using two complementary models. Taken together, our results expand our understanding of PAS-mediated regulation and provide an unprecedented resource for designing and analyzing PASs in a predictable manner for regulatory genomics and synthetic biology applications.
Created:
2020-05-20



