Exazyme/BRCA2_HUMAN_Erwood_2022_HEK293T_substitutions_singles_organismalfitness_PE_REGR
Hugging Face · Updated 2024-09-10 · Indexed 2025-04-12
Download link:
https://hf-mirror.com/datasets/Exazyme/BRCA2_HUMAN_Erwood_2022_HEK293T_substitutions_singles_organismalfitness_PE_REGR
Resource description:
---
dataset_info:
  features:
  - name: mutation
    dtype: string
  - name: aa_seq
    dtype: string
  - name: target continuous
    dtype: float64
  - name: fold_random_5
    dtype: int64
  - name: fold_modulo_5
    dtype: int64
  - name: fold_contiguous_5
    dtype: int64
  - name: aa_unirep_1900
    sequence: float64
  - name: aa_1hot
    sequence:
      sequence: int64
  - name: aa_1hot_pos
    sequence:
      sequence: float32
  splits:
  - name: train
    num_bytes: 773042066
    num_examples: 265
  download_size: 21312768
  dataset_size: 773042066
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
# Dataset Card for "BRCA2_HUMAN_Erwood_2022_HEK293T"
## Description
This dataset is adapted from the ProteinGym benchmark. It contains single amino-acid substitution variants of human BRCA2 from a deep mutational scanning (DMS) assay, each paired with its DMS score.
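
As a minimal sketch of how the data could be loaded, the snippet below uses the Hugging Face `datasets` library with the repository id from the download link. The helper name `load_scalar_columns` and the choice to keep only the scalar columns are assumptions made for illustration. The card metadata reports a `dataset_size` of roughly 773 MB for 265 rows, so dropping the embedding and one-hot columns is likely to make row access lighter.

```python
# Repository id taken from the dataset's download link.
REPO_ID = (
    "Exazyme/BRCA2_HUMAN_Erwood_2022_HEK293T_"
    "substitutions_singles_organismalfitness_PE_REGR"
)

# Scalar columns as named in the card metadata; the embedding and
# one-hot columns (aa_unirep_1900, aa_1hot, aa_1hot_pos) are left out.
SCALAR_COLUMNS = [
    "mutation",
    "aa_seq",
    "target continuous",
    "fold_random_5",
    "fold_modulo_5",
    "fold_contiguous_5",
]


def load_scalar_columns(repo_id: str = REPO_ID):
    """Hypothetical helper: load the train split and keep the scalar columns."""
    # Imported lazily so the module can be imported without the dependency;
    # requires `pip install datasets` and network access when called.
    from datasets import load_dataset

    ds = load_dataset(repo_id, split="train")
    return ds.select_columns(SCALAR_COLUMNS)


# Usage (needs network access):
# ds = load_scalar_columns()
# print(ds[0]["mutation"], ds[0]["target continuous"])
```
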
## Tasks
Regression - target continuous
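
The metadata lists three fold columns (`fold_random_5`, `fold_modulo_5`, `fold_contiguous_5`). Assuming each holds an integer fold id that assigns a row to one of five folds, cross-validation for the regression task could be sketched as below. The rows are synthetic stand-ins rather than real data, and the predict-the-training-mean baseline and the function name are illustrative choices, not part of the dataset.

```python
from statistics import mean

# Synthetic stand-in rows (NOT real data); real rows would come from the
# train split and carry the same keys.
rows = [
    {"mutation": f"m{i}", "target continuous": float(i % 7), "fold_random_5": i % 5}
    for i in range(20)
]


def cross_validate_mean_baseline(
    rows, fold_key="fold_random_5", target_key="target continuous"
):
    """Hold out each fold in turn and score a mean-predictor baseline by MSE."""
    scores = {}
    for fold in sorted({r[fold_key] for r in rows}):
        train = [r[target_key] for r in rows if r[fold_key] != fold]
        test = [r[target_key] for r in rows if r[fold_key] == fold]
        prediction = mean(train)  # baseline: always predict the training mean
        scores[fold] = mean((y - prediction) ** 2 for y in test)
    return scores


scores = cross_validate_mean_baseline(rows)
for fold, mse in scores.items():
    print(f"fold {fold}: MSE = {mse:.3f}")
```

Passing `fold_key="fold_modulo_5"` or `fold_key="fold_contiguous_5"` would select one of the other split schemes, provided the rows carry that column.
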
## Datapoints
265
## Sources
Downloaded from: https://proteingym.org/ \
Paper: https://www.biorxiv.org/content/biorxiv/early/2021/05/14/2021.05.11.443710.full.pdf
## Original Publication
**Saturation variant interpretation using CRISPR prime editing**
**Abstract:**
Over the last decade, next generation sequencing has become widely implemented in clinical practice. However, as genetic variants of uncertain significance (VUS) are frequently identified, the need for scaled functional interpretation of such variants has become increasingly apparent. One method to address this is saturation genome editing (SGE), which allows for scaled multiplexed functional assessment of single nucleotide variants. The current applications of SGE, however, rely on homology-directed repair (HDR) to introduce variants of interest, which is limited by low editing efficiencies and low product purity. Here, we have adapted CRISPR prime editing for SGE and demonstrated its utility in understanding the functional significance of variants in the NPC1 gene underlying the lysosomal storage disorder Niemann-Pick disease type C1 (NPC). Additionally, we have designed a genome editing strategy that allows for the haploidization of gene loci, which permits isolated variant interpretation in virtually any cell type. By combining saturation prime editing (SPE) with a clinically relevant assay, we have functionally scored and interpreted 256 variants in NPC1 haploidized HEK293T cells. To further demonstrate the applicability of this strategy, we used SPE and cell model haploidization to functionally score 465 variants in the BRCA2 gene. We anticipate that our work will be translatable to any gene with an appropriate cellular assay, allowing for more rapid and accurate diagnosis and improved genetic counselling and ultimately precise patient care.
Provided by:
Exazyme



