Deciphering Regulatory SNPs from ATAC-seq
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJNA606502
Description:
Background: Expression quantitative trait loci (eQTL) studies are a valuable approach for identifying genetic variants correlated with gene expression. However, identifying the causal variants is challenging because of linkage disequilibrium among variants in the same haplotype block. In this study, we aim to identify functional SNPs in key regulatory regions that alter transcriptional regulation and thus potentially affect cellular function. The majority of disease-associated single-nucleotide polymorphisms (SNPs) are located in regulatory regions, where they can cause allele-specific binding (ASB) of transcription factors and differential expression of the target gene alleles. Here, we present regSNPs-ASB, a generalized linear model-based approach to accurately identify regulatory SNPs located in transcription factor binding sites from ATAC-seq data.

Results: Using regSNPs-ASB, we identified 53 regulatory SNPs in human MCF-7 breast cancer cells and 125 regulatory SNPs in human mesenchymal stem cells (MSC). By integrating the regSNPs-ASB output with RNA-seq experimental data and publicly available chromatin interaction data from MCF-7 cells, we found that these 53 regulatory SNPs were associated with 74 potential target genes and that 32 (43%) of these genes showed significant allele-specific expression (ASE). By comparing all of the MCF-7 and MSC regulatory SNPs to the eQTLs in the Genotype-Tissue Expression (GTEx) Project database, we found that 30% (16/53) of the regulatory SNPs in MCF-7 and 43% (52/122) of the regulatory SNPs in MSC were also eQTLs. The enrichment of regulatory SNPs in eQTLs indicated that many of them are likely responsible for allelic differences in gene expression (chi-square test, p-value < 0.01). In sum, we conclude that regSNPs-ASB is a useful tool for identifying causal variants from ATAC-seq data. This new computational tool will enable efficient prioritization of genetic variants identified as eQTLs for further studies to validate their causal regulatory function. Ultimately, identifying causal genetic variants will further our understanding of the molecular mechanisms underlying disease and support the eventual development of potential therapeutic targets.

Overall design: ATAC-seq assays were performed on two human cell lines, MCF-7 breast cancer cells and mesenchymal stem cells (MSC). We generated 3 ATAC-seq libraries in MCF-7 cells, each derived from three technical replicates. We also generated 3 ATAC-seq libraries in MSC, each derived from two technical replicates. RNA-seq assays were performed on MCF-7 breast cancer cells. We generated 2 RNA-seq libraries, each sequenced in three lanes to create three technical replicates per library.
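
To make the idea of allele-specific binding concrete, the following is a minimal sketch of how allelic imbalance at a single heterozygous SNP could be tested from ATAC-seq read counts. It is not the regSNPs-ASB method: the description says only that regSNPs-ASB is based on a generalized linear model, and a simpler exact binomial test is swapped in here for illustration. The function name `binomial_two_sided_p`, the null reference fraction of 0.5, and the read counts are all assumptions and do not come from the dataset.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_ref: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one SNP.

    Illustrative stand-in only; NOT the GLM used by regSNPs-ASB.
    Suited to modest read depths; very deep coverage would need
    log-space arithmetic to avoid overflow.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, so no evidence of imbalance
    # Probability of each possible reference-read count under the null.
    pmf = [comb(n, k) * p_ref**k * (1 - p_ref) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum all outcomes that are at least as unlikely as the observed one.
    tail = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, tail)


# Hypothetical read counts (reference allele, alternate allele).
balanced = binomial_two_sided_p(20, 20)  # equal support for both alleles
skewed = binomial_two_sided_p(35, 5)     # strong preference for the reference allele
```

Under these assumed counts, the balanced site yields a p-value of 1, while the skewed site yields a p-value far below 0.01, which is the kind of signal an ASB analysis looks for before correcting for multiple testing and mapping bias.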
Created:
2020-02-13



