
DNA-Diffusion STARR-Seq Validation

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://www.ncbi.nlm.nih.gov/sra/SRP577018
Description:
To test how DNA-Diffusion sequences can induce transcription, we selected 2,150 sequences, including DNA-Diffusion synthetic sequences and naturally occurring DHS sites, for each cell type (K562, HepG2, and GM12878) and inserted them into STARR-Seq plasmids (N = 6,450 sequences). All synthetic and naturally occurring sequences were combined into a single library, and this same library was tested experimentally using STARR-Seq in each cell line (K562, HepG2, GM12878).
Overall design: A STARR-Seq plasmid library containing 6,000 enhancer candidates was synthesized and constructed following methods described previously (Neumayr et al., Gordon et al.). Each candidate was uniquely barcoded with five random nucleotides during synthesis, PCR-amplified using specific primers (forward: TAGAGCATGCACCGG, reverse: TCGACGAATTCGGCC), and cloned into the STARR-Seq plasmid vector (hSTARR-seq_ORI, Addgene #99296) via NEBuilder HiFi assembly. Recombinant plasmids were electroporated into NEB® 10-beta electrocompetent E. coli, and the amplified plasmid libraries were purified using PureLink HiPure plasmid midiprep kits. Library complexity was assessed by sequencing PCR-amplified inserts prepared with NEBNext Ultra II DNA kits. Plasmid libraries of sufficient complexity were then electroporated into three cell lines (K562, HepG2, GM12878), and RNA and DNA were isolated 6 hours after transfection. Following DNase treatment and reverse transcription of the isolated RNA, both RNA-derived cDNA and genomic DNA underwent targeted amplification with specific primer pairs and were sequenced using Illumina-compatible NEBNext library preparations. Enhancer sequence counts were quantified using Kallisto by aligning reads to an index constructed with a k-mer size of 23 and filtering for the presence of specific homology arms. The final enhancer activity (mRNA/DNA ratio) was computed with the R "mpra" package (version 1.14.0), using voom normalization and limma analysis to generate log2 fold changes (log2FC) and associated error ranges across all cell line replicates, considering only sequences with non-zero DNA counts.
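As a rough illustration of the mRNA/DNA activity measure described above, the following Python sketch computes a per-sequence log2 ratio of depth-normalized RNA and DNA counts and skips sequences with zero DNA counts. It is a simplified stand-in and not the voom/limma procedure of the R "mpra" package: the function name `log2_activity`, the counts-per-million scaling, and the `pseudocount` parameter are all assumptions made for this example.

```python
import math


def log2_activity(rna_counts, dna_counts, pseudocount=0.5):
    """Return {sequence_id: log2(RNA CPM / DNA CPM)} for one replicate.

    Hypothetical helper for illustration only: a plain log-ratio,
    not the voom/limma model used in the dataset description.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    if rna_total == 0 or dna_total == 0:
        raise ValueError("RNA and DNA libraries must each contain reads")

    activity = {}
    for seq_id, dna in dna_counts.items():
        # Only sequences with non-zero DNA counts are considered.
        if dna == 0:
            continue
        rna = rna_counts.get(seq_id, 0)
        # Scale to counts per million so library depth cancels out;
        # the pseudocount (an assumption) keeps log2 finite when rna == 0.
        rna_cpm = (rna + pseudocount) / rna_total * 1e6
        dna_cpm = (dna + pseudocount) / dna_total * 1e6
        activity[seq_id] = math.log2(rna_cpm / dna_cpm)
    return activity


# Toy counts (made up, not taken from the dataset).
rna = {"seq_a": 30, "seq_b": 10, "seq_c": 5}
dna = {"seq_a": 10, "seq_b": 10, "seq_c": 0}
result = log2_activity(rna, dna)
for seq_id, value in sorted(result.items()):
    print(f"{seq_id}\t{value:.3f}")
```

In this toy input, `seq_c` is dropped because its DNA count is zero. A real analysis would also model replicates and variance, which is what the voom and limma steps provide.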
Created:
2025-10-07