Single cell RNA-seq analysis reveals that prenatal arsenic exposure results in long-term, adverse effects on immune gene expression in response to Influenza A infection

Indexed from NIAID Data Ecosystem, 2026-03-11

Download link:

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.vt4b8gtp6

Description:

Arsenic exposure via drinking water is a serious environmental health concern. Epidemiological studies suggest a strong association between prenatal arsenic exposure and subsequent childhood respiratory infections, as well as morbidity from respiratory diseases in adulthood, long after systemic clearance of arsenic. We investigated the impact of exclusive prenatal arsenic exposure on the inflammatory immune response and respiratory health after an adult influenza A virus (IAV) lung infection. C57BL/6J mice were exposed to 100 ppb sodium arsenite in utero and subsequently infected with IAV (H1N1) after maturation to adulthood. Assessment of lung tissue and bronchoalveolar lavage fluid (BALF) at various time points after IAV infection reveals greater lung damage and inflammation in arsenic-exposed mice than in control mice. Single-cell RNA sequencing analysis of immune cells harvested from IAV-infected lungs suggests that the enhanced inflammatory response is mediated by dysregulation of innate immune function of monocyte-derived macrophages, neutrophils, NK cells, and alveolar macrophages. Our results suggest that prenatal arsenic exposure results in lasting effects on the adult host innate immune response to IAV infection, long after exposure to arsenic, leading to greater immunopathology. This study provides the first direct evidence that exclusive prenatal exposure to arsenic in drinking water causes predisposition to a hyperinflammatory response to IAV infection in adult mice, which is associated with significant lung damage.

Methods

Whole lung homogenate preparation for single cell RNA sequencing (scRNA-seq)

Lungs were perfused with PBS via the right ventricle, harvested, and mechanically dissociated prior to straining through 70- and 30-µm filters to obtain a single-cell suspension.
Dead cells were removed (annexin V EasySep kit, StemCell Technologies, Vancouver, Canada), and samples were enriched for cells of hematopoietic origin by magnetic separation using anti-CD45-conjugated microbeads (Miltenyi, Auburn, CA). Single-cell suspensions of 6 samples were loaded on a Chromium Single Cell system (10X Genomics) to generate barcoded single-cell gel beads in emulsion, and scRNA-seq libraries were prepared using Single Cell 3’ Version 2 chemistry. Libraries were multiplexed and sequenced on 4 lanes of a NextSeq 500 sequencer (Illumina) with 3 sequencing runs. Demultiplexing and barcode processing of raw sequencing data were conducted using Cell Ranger v. 3.0.1 (10X Genomics; Dartmouth Genomics Shared Resource Core). Reads were aligned to mouse (GRCm38) and influenza A virus (A/PR8/34, genome build GCF_000865725.1) genomes to generate unique molecular identifier (UMI) count matrices. Gene expression data have been deposited in the NCBI GEO database and are available at accession # GSE142047.

Preprocessing of single cell RNA sequencing (scRNA-seq) data

Count matrices produced using Cell Ranger were analyzed in the R statistical working environment (version 3.6.1). Preliminary visualization and quality analysis were conducted using scran (v. 1.14.3, Lun et al., 2016) and scater (v. 1.14.1, McCarthy et al., 2017) to identify thresholds for cell quality and feature filtering. Sample matrices were imported into Seurat (v. 3.1.1, Stuart et al., 2019), and the percentages of mitochondrial, hemoglobin, and influenza A viral transcripts were calculated per cell. Cells with < 1000 or > 20,000 UMIs (low quality and doublets, respectively), fewer than 300 features (low quality), greater than 10% of reads mapped to mitochondrial genes (dying), or greater than 1% of reads mapped to hemoglobin genes (red blood cells) were filtered from further analysis.
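
The cell-level filters above can be illustrated with a minimal Python sketch. The original analysis was run in R with Seurat, so this is not the authors' code; the function name `passes_qc`, the dictionary keys, and the example cells are all hypothetical, and only the four thresholds come from the text.

```python
def passes_qc(cell):
    """Return True if a cell survives all four filters quoted in the text.

    `cell` is a dict of per-cell metrics; the key names are hypothetical.
    """
    return (
        1000 <= cell["n_umi"] <= 20_000      # drop < 1000 (low quality) or > 20,000 (doublets)
        and cell["n_features"] >= 300        # drop cells with fewer than 300 features
        and cell["pct_mito"] <= 10.0         # drop cells with > 10% mitochondrial reads
        and cell["pct_hb"] <= 1.0            # drop cells with > 1% hemoglobin reads
    )


# Made-up cells: one that passes, plus one per failure mode.
cells = [
    {"barcode": "good", "n_umi": 5200, "n_features": 1800, "pct_mito": 3.1, "pct_hb": 0.0},
    {"barcode": "low_umi", "n_umi": 640, "n_features": 410, "pct_mito": 2.0, "pct_hb": 0.0},
    {"barcode": "doublet", "n_umi": 31000, "n_features": 5200, "pct_mito": 4.0, "pct_hb": 0.1},
    {"barcode": "few_features", "n_umi": 1100, "n_features": 250, "pct_mito": 2.5, "pct_hb": 0.0},
    {"barcode": "dying", "n_umi": 2400, "n_features": 900, "pct_mito": 22.0, "pct_hb": 0.0},
    {"barcode": "rbc", "n_umi": 1500, "n_features": 350, "pct_mito": 1.0, "pct_hb": 45.0},
]

kept = [c["barcode"] for c in cells if passes_qc(c)]
print(kept)
```

Cells that sit exactly on a threshold (for example, 1000 UMIs or 10% mitochondrial reads) are retained here, because the text states the exclusion criteria as strict inequalities.
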
Total cells per sample after filtering ranged from 1895 to 2482; no significant difference in the number of cells was observed in arsenic vs. control samples. Data were then normalized using SCTransform (Hafemeister et al., 2019), and variable features were identified for each sample. Integration anchors between samples were identified using canonical correlation analysis (CCA) and mutual nearest neighbors (MNNs), as implemented in Seurat v3 (Stuart et al., 2019), and used to integrate samples into a shared space for further comparison. This process enables identification of shared populations of cells between samples, even in the presence of technical or biological differences, while also allowing for non-overlapping populations that are unique to individual samples.

Clustering and reference-based cell identity labeling of single immune cells from IAV-infected lung with scRNA-seq

Principal components were identified from the integrated dataset and were used for Uniform Manifold Approximation and Projection (UMAP) visualization of the data in two-dimensional space. A shared-nearest-neighbor (SNN) graph was constructed using default parameters, and clusters were identified using the SLM algorithm in Seurat at a range of resolutions (0.2-2). The first 30 principal components were used to identify 22 cell clusters ranging in size from 25 to 2310 cells. Gene markers for clusters were identified with the findMarkers function in scran. To label individual cells with cell type identities, we used the SingleR package (v. 3.1.1) to compare gene expression profiles of individual cells with expression data from curated, FACS-sorted leukocyte samples in the ImmGen compendium (Aran et al., 2019; Heng et al., 2008). We manually updated the ImmGen reference annotation with 263 sample group labels for fine-grain analysis and 25 CD45+ cell type identities based on markers used to sort ImmGen samples (Guilliams et al., 2014).
The reference annotation is provided in Table S2; cells that were not labeled confidently after label pruning were assigned “Unknown”.

Differential gene expression by immune cells

Differential gene expression analysis within individual cell types was performed by pooling raw count data from cells of each cell type on a per-sample basis to create a pseudo-bulk count table for each cell type. Differential expression analysis was only performed on cell types that were sufficiently represented (>10 cells) in each sample. In droplet-based scRNA-seq, ambient RNA from lysed cells is incorporated into droplets and can result in spurious identification of these genes in cell types where they are not actually expressed. We therefore used a method developed by Young and Behjati (Young et al., 2018) to estimate the contribution of ambient RNA for each gene, and identified genes in each cell type that were estimated to be > 25% ambient-derived. These genes were excluded from analysis in a cell-type-specific manner. Genes expressed in fewer than 5% of cells were also excluded from analysis. Differential expression analysis was then performed in limma (limma-voom with quality weights) following a standard protocol for bulk RNA-seq (Law et al., 2014). Significant genes were identified using MA/QC criteria of P < 0.05 and log2FC > 1.

Analysis of arsenic effect on immune cell gene expression by scRNA-seq

Sample-wide effects of arsenic on gene expression were identified by pooling raw count data from all cells per sample to create a count table for pseudo-bulk gene expression analysis. Genes with fewer than 20 counts in any sample, or fewer than 60 total counts, were excluded from analysis. Differential expression analysis was performed using limma-voom as described above.
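
The sample-wide pseudo-bulk pooling and gene filter described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R code: the function names `pseudobulk` and `filter_genes`, the sample IDs, and the counts are invented. It also assumes that "fewer than 20 counts in any sample" means a gene is dropped as soon as one sample falls below 20.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum raw counts over all cells of each sample.

    `cells` is an iterable of (sample_id, {gene: raw_count}) pairs.
    Returns {sample_id: {gene: summed_count}}.
    """
    table = defaultdict(lambda: defaultdict(int))
    for sample, counts in cells:
        for gene, n in counts.items():
            table[sample][gene] += n
    return {sample: dict(genes) for sample, genes in table.items()}


def filter_genes(table, min_per_sample=20, min_total=60):
    """Keep genes with >= min_per_sample counts in every sample
    and >= min_total counts summed across samples (assumed reading)."""
    genes = set().union(*(g.keys() for g in table.values()))
    kept = []
    for gene in sorted(genes):
        per_sample = [table[s].get(gene, 0) for s in table]
        if min(per_sample) >= min_per_sample and sum(per_sample) >= min_total:
            kept.append(gene)
    return kept


# Made-up example: two samples, two cells each.
cells = [
    ("arsenic_1", {"Ifng": 15, "Actb": 40}),
    ("arsenic_1", {"Ifng": 10, "Actb": 30}),
    ("control_1", {"Ifng": 5, "Actb": 50}),
    ("control_1", {"Actb": 25}),
]

table = pseudobulk(cells)
kept = filter_genes(table)
print(table)
print(kept)
```

The per-cell-type analysis follows the same pattern if the pooling key is a (sample, cell type) pair instead of the sample alone. The resulting count table is what a bulk RNA-seq tool such as limma-voom would take as input.
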

Created:

2020-06-01