Learning the sequence code for mRNA and protein abundance in human immune cells

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE240919
Description:
mRNA and protein abundance are defined by transcriptional and post-transcriptional regulatory mechanisms. Here, we develop a machine learning pipeline, termed SONAR, to decipher the endogenous sequence code that determines mRNA and protein abundance in human cells. SONAR models predict up to 62% of mRNA and 63% of protein abundance independent of promoter or enhancer information, and reveal a strong, yet dynamic, cell-type-specific sequence code. We also find that the effect of sequence features depends on their location within the mRNA transcript. Using SONAR, we design synthetic 3'UTRs with which protein expression levels can be manipulated and tailored to a specific cell type. Beyond its fundamental findings, our work provides novel means to improve immunotherapies and biotechnology applications.

A parallel reporter assay was performed to test the effect of synthetic 3'UTR sequences on GFP protein expression. HeLa cells, HEK cells, CD8+ T cells and CD4+ T cells were transduced with a GFP-3'UTR retroviral library containing approximately 500 distinct synthetic 3'UTR sequences. Transduced cells were subsequently sorted into GFP-high (GFPhi) and GFP-low (GFPlo) populations. gDNA was isolated from these populations, and the 3'UTR sequences were amplified and sequenced to assess their abundance.
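The sorted-population readout described above is often summarized as a per-sequence enrichment score. The following is a minimal sketch of one common way to do that, not the authors' analysis: it computes a log2 ratio of library-size-normalized read counts between the GFP-high and GFP-low populations. The function name `log2_enrichment`, the pseudocount of 1, and the toy counts and sequence IDs are all assumptions made for illustration and are not taken from the dataset.

```python
import math


def log2_enrichment(hi_counts, lo_counts, pseudocount=1.0):
    """Log2 ratio of normalized GFP-high vs GFP-low read fractions per 3'UTR.

    hi_counts / lo_counts map a (hypothetical) 3'UTR identifier to its read
    count in the GFP-high and GFP-low sorted populations. A pseudocount is
    added so that sequences missing from one population do not divide by zero.
    """
    ids = sorted(set(hi_counts) | set(lo_counts))
    # Library sizes, including the pseudocount added to every sequence.
    hi_total = sum(hi_counts.values()) + pseudocount * len(ids)
    lo_total = sum(lo_counts.values()) + pseudocount * len(ids)
    scores = {}
    for seq_id in ids:
        hi_frac = (hi_counts.get(seq_id, 0) + pseudocount) / hi_total
        lo_frac = (lo_counts.get(seq_id, 0) + pseudocount) / lo_total
        scores[seq_id] = math.log2(hi_frac / lo_frac)
    return scores


# Toy counts (invented for illustration, not real data).
hi = {"utr_A": 300, "utr_B": 100, "utr_C": 50}
lo = {"utr_A": 100, "utr_B": 100, "utr_C": 250}
scores = log2_enrichment(hi, lo)
for seq_id, score in scores.items():
    print(f"{seq_id}\t{score:+.2f}")
```

Under this convention, a positive score means the 3'UTR is over-represented among GFP-high cells (consistent with higher reporter expression) and a negative score means it is over-represented among GFP-low cells. Real analyses would typically also account for replicates and sequencing depth per sample.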
Created:
2023-09-20