SARS-CoV-2 Omicron Boosting Induces De Novo B Cell Response in Humans
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7719029
Description:
These are the processed BCR repertoire and transcriptomics data described in Alsoussi & Malladi et al., Nature, 2023. The raw sequencing data new to this study are available on SRA under BioProject PRJNA800176. This study also used BCR repertoire data from Turner & O'Halloran et al., Nature, 2021 (PRJNA731610), Schmitz, Turner & Liu et al., Immunity, 2021 (PRJNA741267), and Kim & Zhou et al., Nature, 2022 (PRJNA777934).
Code
Code along with Docker containers for reproducing the NGS data-based figures and analyses in the published paper can be found on GitHub.
Metadata
File: WU382_alsoussi_et_al_nature_2023_meta.tsv.gz
Notes:
181 samples in total, including:
78 new
90 from Kim & Zhou et al., Nature, 2022
8 from Schmitz, Turner & Liu et al., Immunity, 2021
5 from Turner & O'Halloran et al., Nature, 2021
Participant IDs: 6 participants who were in previous studies and who continued in the new study were referenced by new participant IDs. Correspondence with previous participant IDs is as follows:
382-01 = 368-22
382-02 = 368-20
382-07 = 368-02a
382-08 = 368-04
382-13 = 368-01a
382-15 = 368-10
Sample collection time was originally recorded in days in the `timepoint` column. Values in parentheses indicate alternative codings used in the BCR data. In the manuscript, timepoints were mainly referenced in weeks, as shown in the `timepoint_ms` column.
Pre-3rd dose ("pre-boost") samples were coded `b0` in the `booster_num` column; post-3rd dose ("post-boost") samples were coded `b1`.
The `booster_type` column records the 3rd dose ("booster") variant.
`regular` = mRNA-1273 (WA1/2020)
`beta_delta` = mRNA-1273.213 (Beta & Delta)
`v1.1.529` = mRNA-1273.529 (Omicron)
382-02/07/08 received mRNA-1273; 382-01/13/15 received mRNA-1273.213; 382-53/54/55 received mRNA-1273.529.
The `seq_type` column indicates the platform from which sequences originated.
`bulk` = bulk BCR sequencing
`tgx` = 10x Genomics single-cell VDJ + 5' gene expression
`mab`, `mab_1`, `mab_2`: single-cell sorted mAb synthesis. The suffixes were purely for the convenience of distinguishing originating studies.
Abbreviations:
LN = lymph node
BM = bone marrow
PB = plasmablast
GC = germinal centre
LLPC = long-lived plasma cell
NS = no sorting
mAb = monoclonal antibody
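The participant ID correspondence and the `booster_type` codes listed above can be kept as small lookup tables when joining this dataset with the earlier studies. The following is a minimal sketch using only the values stated in these notes; the helper names (`previous_id`, `booster_label`) are hypothetical, and `PARTICIPANT_BOOSTER` covers only the nine participants named above, not necessarily every participant in the metadata file.

```python
# New participant ID -> ID used in the earlier studies (values from the notes above).
NEW_TO_PREVIOUS_ID = {
    "382-01": "368-22",
    "382-02": "368-20",
    "382-07": "368-02a",
    "382-08": "368-04",
    "382-13": "368-01a",
    "382-15": "368-10",
}

# `booster_type` code -> vaccine given as the 3rd dose (values from the notes above).
BOOSTER_TYPE_LABELS = {
    "regular": "mRNA-1273 (WA1/2020)",
    "beta_delta": "mRNA-1273.213 (Beta & Delta)",
    "v1.1.529": "mRNA-1273.529 (Omicron)",
}

# Participant -> `booster_type` code, limited to the participants named in the notes.
PARTICIPANT_BOOSTER = {
    "382-02": "regular", "382-07": "regular", "382-08": "regular",
    "382-01": "beta_delta", "382-13": "beta_delta", "382-15": "beta_delta",
    "382-53": "v1.1.529", "382-54": "v1.1.529", "382-55": "v1.1.529",
}


def previous_id(participant_id):
    """Return the earlier-study ID, or None if no earlier ID is listed."""
    return NEW_TO_PREVIOUS_ID.get(participant_id)


def booster_label(participant_id):
    """Return the human-readable 3rd-dose vaccine for a listed participant."""
    return BOOSTER_TYPE_LABELS[PARTICIPANT_BOOSTER[participant_id]]
```

For example, `previous_id("382-07")` returns `"368-02a"`, while `previous_id("382-53")` returns `None` because that participant has no earlier ID listed.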
[Beta & Delta booster] Processed BCR data - heavy chains
File: WU382_alsoussi_et_al_nature_2023_betaDelta_bcr_heavy.tsv.gz
Analysis was based on heavy chain-based clonal inference.
Notes on columns:
The columns largely follow the AIRR-C Rearrangement format. The main deviation is that CDR3s were used, as opposed to IMGT-defined "junctions". Nonetheless, junction-related columns are included here as some repositories such as iReceptor use these. Non-standard columns are noted below.
`cell_id`: Only sequences from single-cell samples and synthesized mAbs have cell IDs. 10x sequences follow the format `[donor]_[sample]@[id]`. `NA` for bulk sequences.
`sequence_id`: Sequence IDs follow the format `[donor]_[sample]@[id]`.
`v_call_genotyped`: V gene annotation reassigned after individualized genotyping by TIgGER.
`germline_[vdj]_call`: Clonal consensus germline calls after corresponding clonal consensus sequences were reconstructed via `CreateGermlines.py --cloned` from Change-O.
`collapse_count`: Number of duplicate IMGT-aligned V(D)J sequences that were collapsed by `alakazam::collapseDuplicates`.
`timepoint`: Timepoints follow the format `b[01]_d*`, where `b0` and `b1` correspond to pre-3rd dose ("pre-boost") and post-3rd dose ("post-boost") respectively, and `d*` indicates the timepoint in days. There's one exception: `b0_m6or9` for pre-3rd dose d201 or d280 (m6or9 = 6 or 9 months).
`gex_anno`: Cell type identity annotation based on transcriptomic profiles. Mapped from `anno_leiden_0.35` from WU382_alsoussi_et_al_nature_2023_betaDelta_gex_b_cells.h5ad.
`compartment`: B cell compartment
`clone_id`: B cell clonal lineage IDs follow the format `[donor]@[id]`.
`s_pos_clone`: `TRUE` if a sequence belonged to a B cell clone that was designated as S-binding by virtue of containing one of the recombinant mAbs that tested positive via ELISA.
`expressed_id`: mAb IDs of mAbs from Turner & O'Halloran et al., Nature, 2021 and the current study; and of recombinant mAbs generated based on 10x BCRs from Kim & Zhou et al., Nature, 2022. `NA` for everything else.
`elisa`: ELISA results for binding of recombinant mAbs to SARS-CoV-2 S. `TRUE` if positive (WA1+); `FALSE` if negative; `NA` if not tested or test failed.
`nuc_RS_19_312`: number of replacement and silent mutations between IMGT-numbered nucleotide positions 19-312 along IGHV sequences, calculated by `shazam::calcObservedMutations`.
`nuc_denom_19_312`: number of informative nucleotide positions for counting mutations, excluding non-A/T/G/C positions (such as "N", "-", ".").
`nuc_RS_freq_19_312`: nucleotide-level mutation frequency (= nuc_RS_19_312 / nuc_denom_19_312).
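Two of the column conventions above, the `timepoint` format and the mutation frequency ratio, can be sketched in plain Python. This is an illustration rather than the authors' processing code. It assumes that `d*` means `d` followed by a non-negative integer number of days, and the function names `parse_timepoint` and `mutation_frequency` are hypothetical.

```python
import re

# `b[01]_d*`: booster flag, then the day. Treating `d*` as `d` + digits is an assumption.
_TIMEPOINT_RE = re.compile(r"^b([01])_d(\d+)$")


def parse_timepoint(value):
    """Split a `timepoint` value into (post_boost, day).

    `day` is None for the documented exception `b0_m6or9`, which covers
    pre-3rd dose d201 or d280 and so has no single day value.
    """
    if value == "b0_m6or9":
        return (False, None)
    match = _TIMEPOINT_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognised timepoint: {value!r}")
    return (match.group(1) == "1", int(match.group(2)))


def mutation_frequency(nuc_rs, nuc_denom):
    """Return nuc_RS_19_312 / nuc_denom_19_312, or None if the denominator is 0."""
    if nuc_denom == 0:
        return None
    return nuc_rs / nuc_denom
```

A value such as `b1_d15` (a made-up example) would parse to `(True, 15)`, meaning a post-boost sample at day 15.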
[Beta & Delta booster] Processed BCR data - light chains
File: WU382_alsoussi_et_al_nature_2023_betaDelta_bcr_light.tsv.gz
Light chains were not used for heavy chain-based clonal inference or analysis.
[Beta & Delta booster] Processed transcriptomics data
Files:
WU382_alsoussi_et_al_nature_2023_betaDelta_gex_all_cells.h5ad
WU382_alsoussi_et_al_nature_2023_betaDelta_gex_b_cells.h5ad
WU382_alsoussi_et_al_nature_2023_betaDelta_gex_b_cell_umap.tsv.gz
Notes on the `h5ad` files:
These files can be imported into Scanpy as an AnnData object.
Each `AnnData` object has 3 `.layers`, each representing a version of the count matrix.
`raw_counts`: Imported from `cellranger aggr` output by `scanpy.read_10x_mtx`.
`log_norm`: Log-normalized expression values produced by `scanpy.pp.normalize_total` followed by `scanpy.pp.log1p`.
`scaled`: The `log_norm` layer scaled to unit variance and zero mean by `scanpy.pp.scale`.
The `gene_name` and `biotype` columns in `.var` were extracted from GENCODE v32 GTF.
Columns in `.obs` (each row corresponds to a cell)
`n_feature`: The `n_genes_by_counts` column produced by `scanpy.pp.calculate_qc_metrics`, renamed. The number of genes expressed. This is before subsetting the genes.
`n_umi`: The `total_counts` column produced by `scanpy.pp.calculate_qc_metrics`, renamed. The total UMI counts in a cell.
`pct_mt`: The `pct_counts_mt` column produced by `scanpy.pp.calculate_qc_metrics`, renamed. The percentage of counts in mitochondrial genes.
`n_hkg`: The number of housekeeping genes for which expression was detected.
`n_gene_expressed`: The total number of genes for which expression was detected. This is after subsetting the genes.
`pre_qc_bcr`: `TRUE` if a cell also had paired BCR data available. Produced by cross-referencing the cellular barcodes in `cell_barcodes.json` outputted by `cellranger vdj`. At this point the BCR data had not gone through the QC process in the BCR processing pipeline (hence `pre_qc`).
`leiden_[resolution]`: Cluster assignment by `scanpy.tl.leiden`.
`anno_leiden_[resolution]`: Cell type identity annotations based on transcriptomic profiles. This was mapped onto the `gex_anno` column in the processed heavy chain BCR data.
UMAP coordinates can be found in `.obsm["X_umap"]`.
`.X` has been set to `None` in order to reduce file size.
Note on the `tsv.gz` file: This file was derived from WU382_alsoussi_et_al_nature_2023_betaDelta_gex_b_cells.h5ad. It contains UMAP coordinates and select attributes of the cells, including their log-normalized expression values of XBP1 (`ln_XBP1`). It is intended for analysis and visualization in conjunction with the BCR data.
In addition, the preprocessed count matrix outputted by `cellranger aggr` is available from GEO under BioProject PRJNA800176.
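Because `.X` is `None` in the `h5ad` files, most Scanpy functions need one of the stored layers assigned to `.X` first. The sketch below shows one way this might be done. The helper names `load_h5ad` and `use_layer_as_x` are hypothetical, and Scanpy is imported lazily so that the second helper can be used without it.

```python
def load_h5ad(path):
    """Read one of the `.h5ad` files into an AnnData object."""
    import scanpy as sc  # lazy import; requires Scanpy to be installed

    return sc.read_h5ad(path)


def use_layer_as_x(adata, layer="log_norm"):
    """Assign a stored layer to `.X`, which is `None` in these files.

    `layer` should be one of `raw_counts`, `log_norm`, or `scaled`.
    """
    if layer not in adata.layers:
        raise KeyError(f"layer {layer!r} not found; available: {list(adata.layers)}")
    adata.X = adata.layers[layer]
    return adata


# Example usage (not executed here; the file must be downloaded first):
# adata = load_h5ad("WU382_alsoussi_et_al_nature_2023_betaDelta_gex_b_cells.h5ad")
# use_layer_as_x(adata, "log_norm")
# umap = adata.obsm["X_umap"]               # UMAP coordinates
# labels = adata.obs["anno_leiden_0.35"]    # cell type annotations
```

Choosing `log_norm` as the default is an assumption made for plotting and differential expression; `raw_counts` would be the natural choice for methods that expect untransformed counts.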
[Omicron booster] Processed BCR data - heavy chains
File: WU382_alsoussi_et_al_nature_2023_omicron_bcr_heavy.tsv.gz
Notes on columns:
`elisa`: ELISA results for mAbs, with values being one of `WA1+`, `BA1+WA1-`, or `negative`. `NA` for bulk sequences.
`clone_type`: If a sequence was in an S-binding B cell clone (`TRUE` for `s_pos_clone`), its `clone_type` was based on the `elisa` value of the S-binding mAb in that clone -- either `WA1+` or `BA1+WA1-`; otherwise `NA`.
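The relationship between `elisa` and `clone_type` described above can be used as a consistency check on the published column. The following sketch is not the authors' code. It assumes the Omicron file carries a `clone_id` column like the Beta & Delta file, that rows are dictionaries of strings (for instance from `csv.DictReader`), and that missing values appear as the string `NA`. How a clone with conflicting mAb results was handled is not stated here, so the sketch raises an error instead of guessing.

```python
# `elisa` values that mark a mAb as S-binding (from the notes above).
S_BINDING_ELISA = ("WA1+", "BA1+WA1-")


def clone_types_from_elisa(rows):
    """Map each S-binding clone_id to the `elisa` value of its S-binding mAb."""
    values_by_clone = {}
    for row in rows:
        if row["elisa"] in S_BINDING_ELISA:
            values_by_clone.setdefault(row["clone_id"], set()).add(row["elisa"])

    clone_types = {}
    for clone_id, values in values_by_clone.items():
        if len(values) > 1:
            # The notes do not say how conflicts were resolved, so do not guess.
            raise ValueError(f"clone {clone_id} has conflicting ELISA values: {sorted(values)}")
        clone_types[clone_id] = next(iter(values))
    return clone_types


def clone_type_for(row, clone_types):
    """Return the inferred `clone_type` for a sequence, or "NA" if not S-binding."""
    return clone_types.get(row["clone_id"], "NA")
```

Every sequence in a clone inherits the value from the clone's S-binding mAb, so bulk sequences with `elisa` of `NA` still receive a `clone_type` when they share a clone with a positive mAb.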
Created:
2023-04-05



