Redefined indel taxonomy reveals insights into mutational signatures
Source: doi.org | Updated 2024-02-14 | Indexed 2025-03-26
Download link:
http://doi.org/10.17632/3k2tpx9ssr.2
Description:
Despite their deleterious effects, small insertions and deletions (indels) have received far less attention than substitutions [1,2]. Recent computational advances have surmounted previous technical challenges, enabling the study of mutational processes driving indel formation. Here we generated isogenic CRISPR-edited human cellular models of post-replicative repair dysfunction (PRRd), including individual and combined gene-edits of mismatch repair (MMR) and replicative polymerases (POLE and POLD1). Unique, diverse mutational footprints of MMR deficiency and polymerase dysfunction were revealed. However, the prevailing indel classification framework [1] falls short in discriminating these indel signatures from background mutagenesis and among each other, as it condenses important biological signals into predominantly two indel subclasses, limiting signature analyses. To address this, we propose a novel classification system that considers the 5’ and 3’ nucleobases flanking an indel, as with substitution signatures, resulting in 89 indel subclasses. Our new indel classification enhances the disambiguation of experimental signatures, and through analysis of 4,775 cancer whole genomes from the 100,000 Genomes Project [3], we uncover 37 indel signatures (InDs); 27 are new. Further, we develop a classifier, PRRDetect, which outperforms approved biomarkers, such as Tumor Mutational Burden (TMB), for immunotherapy. This redefined indel taxonomy advances our understanding of mutagenesis and heralds potential clinical applications.
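To make the idea of "5’ and 3’ nucleobases flanking an indel" concrete, the following minimal sketch reads the two flanking reference bases for a deletion or an insertion. It is an illustration only and does not reproduce the 89-subclass scheme, whose rules are not given in this description. The function name `flanking_bases`, its parameters, and the toy reference string are all hypothetical.

```python
def flanking_bases(reference: str, start: int, deleted_length: int = 0) -> tuple[str, str]:
    """Return the (5', 3') bases flanking an indel on the reference strand.

    `start` is the 0-based index of the first deleted base. For an
    insertion (deleted_length == 0) it is the index of the base that
    immediately follows the insertion point.
    """
    if deleted_length < 0:
        raise ValueError("deleted_length must be non-negative")
    end = start + deleted_length  # index of the first base after the event
    if start < 1 or end >= len(reference):
        raise ValueError("indel has no flanking base on one side")
    return reference[start - 1], reference[end]


# Toy reference sequence (hypothetical, not taken from the dataset).
reference = "ACGTTTGA"
print(flanking_bases(reference, 3, 3))  # deletion of the TTT run
print(flanking_bases(reference, 1))     # insertion between A and C
```

Indels inside repeats can be placed at several equivalent positions, so a real classification would need a consistent normalization (for example left-alignment) before the flanks are read.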
Deposited here are the de novo mutation lists from the experimental samples included in the study. Whole-genome sequencing short reads were aligned to GRCh38/hg38 using BWA-MEM. Post-processing filters were applied to improve the specificity of mutation calling. Specifically, for single nucleotide variant calls by CaVEMan, we used CLPM == 0 and ASMD >= 140. To reduce false positive calls by Pindel, we used QUAL >= 250 and REP < 10. Rearrangements were not assessed as they were too few to be informative. De novo substitutions and indels in subclones were obtained by subtracting the mutations of the respective parental clone whenever available, or otherwise by removing mutations shared among subclones.
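The filtering and subtraction steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes each call has already been parsed into a record whose `qual` field holds the QUAL score and whose `info` dictionary holds the CLPM, ASMD, and REP annotations. The `Call` record and all function names are hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Call(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float  # assumed: the caller's QUAL score
    info: dict   # assumed: per-call annotations such as CLPM, ASMD, REP


def passes_snv_filter(call: Call) -> bool:
    """Substitution filter from the text: CLPM == 0 and ASMD >= 140."""
    clpm, asmd = call.info.get("CLPM"), call.info.get("ASMD")
    return clpm is not None and asmd is not None and clpm == 0 and asmd >= 140


def passes_indel_filter(call: Call) -> bool:
    """Indel filter from the text: QUAL >= 250 and REP < 10."""
    rep = call.info.get("REP")
    return rep is not None and call.qual >= 250 and rep < 10


def site(call: Call) -> tuple:
    """Identity of a mutation, ignoring quality annotations."""
    return (call.chrom, call.pos, call.ref, call.alt)


def subtract_parent(subclone: list, parent: list) -> list:
    """Keep subclone calls that are absent from the parental clone."""
    parent_sites = {site(c) for c in parent}
    return [c for c in subclone if site(c) not in parent_sites]


def remove_shared(subclones: dict) -> dict:
    """Without a parent, drop any mutation seen in more than one subclone."""
    seen = Counter(s for calls in subclones.values() for s in {site(c) for c in calls})
    return {name: [c for c in calls if seen[site(c)] == 1]
            for name, calls in subclones.items()}


# Toy calls (hypothetical values, not taken from the dataset).
snv = Call("chr1", 100, "A", "T", 0.0, {"CLPM": 0, "ASMD": 145})
indel = Call("chr2", 50, "AT", "A", 300.0, {"REP": 3})
print(passes_snv_filter(snv), passes_indel_filter(indel))
print(subtract_parent([snv, indel], [snv]))
```

Matching calls on chromosome, position, reference allele, and alternate allele is a design choice of this sketch. The description does not state how shared mutations were matched.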
Provider:
doi.org



