SARS-CoV-2 Omicron Boosting Induces De Novo B Cell Response in Humans

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

https://zenodo.org/record/7719029

Resource description:

These are the processed BCR repertoire and transcriptomics data described in Alsoussi & Malladi et al., Nature, 2023. The raw sequencing data new to this study are available on SRA under BioProject PRJNA800176. This study also used BCR repertoire data from Turner & O'Halloran et al., Nature, 2021 (PRJNA731610), Schmitz, Turner & Liu et al., Immunity, 2021 (PRJNA741267), and Kim & Zhou et al., Nature, 2022 (PRJNA777934).

Code

Code along with Docker containers for reproducing the NGS data-based figures and analyses in the published paper can be found on GitHub.

Metadata

File: WU382_alsoussi_et_al_nature_2023_meta.tsv.gz

Notes:

181 samples in total, including:
- 78 new
- 90 from Kim & Zhou et al., Nature, 2022
- 8 from Schmitz, Turner & Liu et al., Immunity, 2021
- 5 from Turner & O'Halloran et al., Nature, 2021

Participant IDs: 6 participants who were in previous studies and who continued in the new study were referenced by new participant IDs. Correspondence with previous participant IDs is as follows:
- 382-01 = 368-22
- 382-02 = 368-20
- 382-07 = 368-02a
- 382-08 = 368-04
- 382-13 = 368-01a
- 382-15 = 368-10

Sample collection time was originally recorded in days in the `timepoint` column. Values in parentheses indicate variations in which the BCR data was coded. Timepoints were mainly referenced in weeks in the manuscript, as shown in the `timepoint_ms` column.

Pre-3rd dose ("pre-boost") samples were coded `b0` in the `booster_num` column; post-3rd dose ("post-boost") samples were coded `b1`.

The `booster_type` column records the 3rd dose ("booster") variant:
- `regular` = mRNA-1273 (WA1/2020)
- `beta_delta` = mRNA-1273.213 (Beta & Delta)
- `v1.1.529` = mRNA-1273.529 (Omicron)

382-02/07/08 received mRNA-1273; 382-01/13/15 received mRNA-1273.213; 382-53/54/55 received mRNA-1273.529.

The `seq_type` column indicates the platform from which sequences originated.
- `bulk` = bulk BCR sequencing
- `tgx` = 10x Genomics single-cell VDJ + 5' gene expression
- `mab`, `mab_1`, `mab_2` = single-cell sorted mAb synthesis. The suffixes were purely for the convenience of distinguishing originating studies.

Abbreviations:
- LN = lymph node
- BM = bone marrow
- PB = plasmablast
- GC = germinal centre
- LLPC = long-lived plasma cell
- NS = no sorting
- mAb = monoclonal antibody

[Beta & Delta booster] Processed BCR data - heavy chains

File: WU382_alsoussi_et_al_nature_2023_betaDelta_bcr_heavy.tsv.gz

Analysis was based on heavy chain-based clonal inference.

Notes on columns:

The columns largely follow the AIRR-C Rearrangement format. The main deviation is that CDR3s were used, as opposed to IMGT-defined "junctions". Nonetheless, junction-related columns are included here as some repositories such as iReceptor use these. Non-standard columns are noted below.

- `cell_id`: Only sequences from single-cell samples and synthesized mAbs have cell IDs. 10x sequences follow the format `[donor]_[sample]@[id]`. `NA` for bulk sequences.
- `sequence_id`: Sequence IDs follow the format `[donor]_[sample]@[id]`.
- `v_call_genotyped`: V gene annotation reassigned after individualized genotyping by TIgGER.
- `germline_[vdj]_call`: Clonal consensus germline calls after corresponding clonal consensus sequences were reconstructed via `CreateGermlines.py --cloned` from Change-O.
- `collapse_count`: Number of duplicate IMGT-aligned V(D)J sequences that were collapsed by `alakazam::collapseDuplicates`.
- `timepoint`: Timepoints follow the format `b[01]_d*`, where `b0` and `b1` correspond to pre-3rd dose ("pre-boost") and post-3rd dose ("post-boost") respectively, and `d*` indicates the timepoint in days. There is one exception: `b0_m6or9` for pre-3rd dose d201 or d280 (m6or9 = 6 or 9 months).
- `gex_anno`: Cell type identity annotation based on transcriptomic profiles. Mapped from `anno_leiden_0.35` from WU382_alsoussi_et_al_nature_2023_betaDelta_gex_b_cells.h5ad.
- `compartment`: B cell compartment.
- `clone_id`: B cell clonal lineage IDs follow the format `[donor]@[id]`.
- `s_pos_clone`: `TRUE` if a sequence belonged to a B cell clone that was designated as S-binding by virtue of containing one of the recombinant mAbs that tested positive via ELISA.
- `expressed_id`: mAb IDs of mAbs from Turner & O'Halloran et al., Nature, 2021 and the current study; and of recombinant mAbs generated based on 10x BCRs from Kim & Zhou et al., Nature, 2022. `NA` for everything else.
- `elisa`: ELISA results for binding of recombinant mAbs to SARS-CoV-2 S. `TRUE` if positive (WA1+); `FALSE` if negative; `NA` if not tested or test failed.
- `nuc_RS_19_312`: Number of replacement and silent mutations between IMGT-numbered nucleotide positions 19-312 along IGHV sequences, calculated by `shazam::calcObservedMutations`.
- `nuc_denom_19_312`: Number of informative nucleotide positions for counting mutations, excluding non-A/T/G/C positions (such as "N", "-", ".").
- `nuc_RS_freq_19_312`: Nucleotide-level mutation frequency (= nuc_RS_19_312 / nuc_denom_19_312).

[Beta & Delta booster] Processed BCR data - light chains

File: WU382_alsoussi_et_al_nature_2023_betaDelta_bcr_light.tsv.gz

Light chains were not used for heavy chain-based clonal inference or analysis.

[Beta & Delta booster] Processed transcriptomics data

Files:
- WU382_alsoussi_et_al_nature_2023_betaDelta_gex_all_cells.h5ad
- WU382_alsoussi_et_al_nature_2023_betaDelta_gex_b_cells.h5ad
- WU382_alsoussi_et_al_nature_2023_betaDelta_gex_b_cell_umap.tsv.gz

Notes on the `h5ad` files:

These files can be imported into Scanpy as an AnnData object. Each `AnnData` object has 3 `.layers`, each representing a version of the count matrix.

- `raw_counts`: Imported from `cellranger aggr` output by `scanpy.read_10x_mtx`.
- `log_norm`: Log-normalized expression values output by `scanpy.pp.normalize_total` followed by `scanpy.pp.log1p`.
- `scaled`: The `log_norm` layer scaled to unit variance and zero mean by `scanpy.pp.scale`.
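The relationship between the three layers can be illustrated on a toy matrix. The following is a minimal pure-Python sketch of the arithmetic only, not Scanpy's implementation: the target sum of 10,000, the use of the population standard deviation, and the helper names `log_normalize` and `scale_genes` are all assumptions made for illustration and are not stated in the dataset description.

```python
import math
from statistics import mean, pstdev

def log_normalize(counts, target_sum=10_000.0):
    """Rescale each cell (row) to `target_sum` total counts, then apply log(1 + x).

    `target_sum` is a hypothetical value chosen for this sketch.
    """
    out = []
    for row in counts:
        total = sum(row)
        factor = target_sum / total if total else 0.0
        out.append([math.log1p(v * factor) for v in row])
    return out

def scale_genes(matrix):
    """Center each gene (column) to zero mean and divide by its standard deviation.

    Uses the population standard deviation (an assumption of this sketch).
    """
    n_genes = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_genes)]
    scaled_cols = []
    for col in columns:
        mu, sd = mean(col), pstdev(col)
        scaled_cols.append([(v - mu) / sd if sd else 0.0 for v in col])
    # Transpose back to cells x genes.
    return [list(r) for r in zip(*scaled_cols)]

# Toy cells x genes count matrix standing in for the `raw_counts` layer.
raw_counts = [[4, 0, 6], [1, 1, 8], [0, 5, 5]]
log_norm = log_normalize(raw_counts)   # analogous to the `log_norm` layer
scaled = scale_genes(log_norm)         # analogous to the `scaled` layer
```

In the real files the layers are already computed, so a sketch like this is only useful for understanding how the values in one layer relate to those in another.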
The `gene_name` and `biotype` columns in `.var` were extracted from GENCODE v32 GTF.

Columns in `.obs` (each row corresponds to a cell):

- `n_feature`: The `n_genes_by_counts` column produced by `scanpy.pp.calculate_qc_metrics`, renamed. The number of genes expressed. This is before subsetting the genes.
- `n_umi`: The `total_counts` column produced by `scanpy.pp.calculate_qc_metrics`, renamed. The total UMI counts in a cell.
- `pct_mt`: The `pct_counts_mt` column produced by `scanpy.pp.calculate_qc_metrics`, renamed. The percentage of counts in mitochondrial genes.
- `n_hkg`: The number of housekeeping genes for which expression was detected.
- `n_gene_expressed`: The total number of genes for which expression was detected. This is after subsetting the genes.
- `pre_qc_bcr`: `TRUE` if a cell also had paired BCR data available. Produced by cross-referencing the cellular barcodes in `cell_barcodes.json` output by `cellranger vdj`. At this point the BCR data had not gone through the QC process in the BCR processing pipeline (hence `pre_qc`).
- `leiden_[resolution]`: Cluster assignment by `scanpy.tl.leiden`.
- `anno_leiden_[resolution]`: Cell type identity annotations based on transcriptomic profiles. This was mapped onto the `gex_anno` column in the processed heavy chain BCR data.

UMAP coordinates can be found in `.obsm["X_umap"]`.

`.X` has been set to `None` in order to reduce file size.

Note on the `tsv.gz` file:

This file was derived from WU382_alsoussi_et_al_nature_2023_betaDelta_gex_b_cells.h5ad. It contains UMAP coordinates and select attributes of the cells, including their log-normalized expression values of XBP1 (`ln_XBP1`). It is intended for analysis and visualization in conjunction with BCR data.

In addition, the preprocessed count matrix output by `cellranger aggr` is available from GEO under BioProject PRJNA800176.
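Using the UMAP table in conjunction with the BCR data amounts to a join on a per-cell key. The sketch below shows one way this might look with toy in-memory tables. It assumes the UMAP table carries a `cell_id` column in the same `[donor]_[sample]@[id]` format as the heavy-chain file; that key name, the `umap_1`/`umap_2` column names, the toy IDs, and the helper names are hypothetical and should be checked against the actual file headers.

```python
import csv
import io

def read_tsv(text):
    """Parse tab-separated text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

def attach_umap(bcr_rows, umap_rows, key="cell_id"):
    """Attach UMAP attributes to BCR rows that have a matching cell.

    `key` is an assumed shared column name, not confirmed by the source.
    """
    lookup = {row[key]: row for row in umap_rows}
    joined = []
    for row in bcr_rows:
        cell = lookup.get(row[key])
        # Bulk sequences have `NA` cell IDs; some cells may lack GEX data.
        if row[key] == "NA" or cell is None:
            continue
        extra = {k: v for k, v in cell.items() if k != key}
        joined.append({**row, **extra})
    return joined

# Toy stand-ins; the real files would be opened with gzip.open(path, "rt").
bcr_text = (
    "sequence_id\tcell_id\tclone_id\n"
    "382-01_s1@seq1\t382-01_s1@AAAC-1\t382-01@7\n"
    "382-01_s2@seq2\tNA\t382-01@7\n"
    "382-01_s1@seq3\t382-01_s1@GGGT-1\t382-01@9\n"
)
umap_text = (
    "cell_id\tumap_1\tumap_2\tln_XBP1\n"
    "382-01_s1@AAAC-1\t1.5\t-0.3\t2.1\n"
    "382-01_s1@TTTG-1\t0.2\t4.0\t0.0\n"
)

joined = attach_umap(read_tsv(bcr_text), read_tsv(umap_text))
```

Only the first toy sequence survives the join here: the second is a bulk sequence without a cell ID, and the third has no matching cell in the UMAP table.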
[Omicron booster] Processed BCR data - heavy chains

File: WU382_alsoussi_et_al_nature_2023_omicron_bcr_heavy.tsv.gz

Notes on columns:

- `elisa`: ELISA results for mAbs, with values being one of `WA1+`, `BA1+WA1-`, or `negative`. `NA` for bulk sequences.
- `clone_type`: If a sequence was in an S-binding B cell clone (`TRUE` for `s_pos_clone`), its `clone_type` was based on the `elisa` value of the S-binding mAb in that clone, either `WA1+` or `BA1+WA1-`; otherwise `NA`.
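The `clone_type` rule can be sketched as a two-pass propagation of the mAb ELISA call to every sequence in the same clone. This is an illustrative reading of the description, not the authors' code. The description does not say how a clone containing several S-binding mAbs with different calls would be resolved, so the sketch simply keeps the first call encountered, which is an assumption. The toy rows and the function name `assign_clone_type` are hypothetical.

```python
S_BINDING_CALLS = ("WA1+", "BA1+WA1-")

def assign_clone_type(rows):
    """Derive `clone_type` from `clone_id`, `s_pos_clone`, and `elisa`."""
    # First pass: record the ELISA call of an S-binding mAb for each clone.
    # Assumption: if several calls exist in a clone, the first one is kept.
    clone_call = {}
    for row in rows:
        if row["elisa"] in S_BINDING_CALLS:
            clone_call.setdefault(row["clone_id"], row["elisa"])
    # Second pass: sequences in S-binding clones inherit the clone's call.
    out = []
    for row in rows:
        if row["s_pos_clone"] == "TRUE":
            clone_type = clone_call.get(row["clone_id"], "NA")
        else:
            clone_type = "NA"
        out.append({**row, "clone_type": clone_type})
    return out

# Toy rows; IDs follow the documented formats but are made up.
rows = [
    {"sequence_id": "382-53_s1@a", "clone_id": "382-53@1",
     "s_pos_clone": "TRUE", "elisa": "BA1+WA1-"},
    {"sequence_id": "382-53_s1@b", "clone_id": "382-53@1",
     "s_pos_clone": "TRUE", "elisa": "NA"},
    {"sequence_id": "382-53_s1@c", "clone_id": "382-53@2",
     "s_pos_clone": "FALSE", "elisa": "negative"},
    {"sequence_id": "382-54_s1@d", "clone_id": "382-54@1",
     "s_pos_clone": "FALSE", "elisa": "NA"},
]
typed = assign_clone_type(rows)
clone_types = [r["clone_type"] for r in typed]
```

The published file already contains `clone_type`, so a sketch like this mainly serves as a consistency check against the `s_pos_clone` and `elisa` columns.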

Created:

2023-04-05
