
Whole genome sequencing of KrasG12D-driven mouse primary pancreatic cancer cell culture for bioinformatic inference of chromothripsis. Evolutionary routes and KRAS dosage define pancreatic cancer phenotypes

Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJEB23110
Resource description:
We frequently observed complex copy number patterns on chromosome 4, involving the Cdkn2a locus, in primary pancreatic cancer cell cultures from a KrasG12D-driven mouse model. The regularity of oscillating copy number states in most cancers suggested chromothripsis as the predominant process underlying these complex alterations. To confirm this, we performed whole-genome sequencing of the primary pancreatic cancer cell culture S821. Copy number states were estimated with the Bioconductor HMMcopy package 1.16.0, followed by segmentation with the Bioconductor DNAcopy package 1.48.0. For LOH analysis, variant positions in control and tumor were computed with samtools mpileup v1.3.1. Only positions in regions with a mapping quality of 60 and an average Phred score of 20 were considered for further analysis. Positions with strand bias were also excluded, as were positions with variant allele frequencies below 20% or above 85% in the control, because these are likely homozygous in the germline. The minimum coverage for a given polymorphic position in the control was set to eight reads. Segmental duplications (UCSC Genome Browser) and regions with mouse-line-specific variation (Mouse Genomes Project, REL-1505) were excluded. For this set of SNPs, the difference in frequencies between tumor and control samples was calculated. DELLY v0.7.6 was used for calling structural variations (SVs). SV classes were defined according to the DELLY calls: deletion-type (3to5), duplication-type (5to3), and inversion-type (5to5 and 3to3). The predicted rearrangements were merged and filtered based on variant frequency, mapping quality, and the distance between two connected breakpoints. The existence of chromothripsis was tested by applying the six hallmark criteria proposed by Korbel et al. (Cell, 2013). Clustering of SV breakpoints was tested using a χ² goodness-of-fit test.
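The control-based SNP selection and the tumor-versus-control frequency difference described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `SnpCall` record, the function names, and the use of raw reference/alternate read counts are assumptions, and the mapping-quality, Phred-score, strand-bias, and region-exclusion filters are assumed to have been applied upstream.

```python
from dataclasses import dataclass

# Thresholds taken from the description: control VAF must lie within
# 20%..85%, and control coverage must be at least eight reads.
MIN_CONTROL_VAF = 0.20
MAX_CONTROL_VAF = 0.85
MIN_CONTROL_COVERAGE = 8


@dataclass(frozen=True)
class SnpCall:
    """Hypothetical per-position record of read counts (not a samtools type)."""
    chrom: str
    pos: int
    control_ref: int
    control_alt: int
    tumor_ref: int
    tumor_alt: int


def vaf(ref_reads: int, alt_reads: int) -> float:
    """Variant allele frequency; 0.0 when the position has no reads."""
    total = ref_reads + alt_reads
    return alt_reads / total if total else 0.0


def is_informative(call: SnpCall) -> bool:
    """Keep positions that look heterozygous in the control sample."""
    coverage = call.control_ref + call.control_alt
    if coverage < MIN_CONTROL_COVERAGE:
        return False
    control_vaf = vaf(call.control_ref, call.control_alt)
    return MIN_CONTROL_VAF <= control_vaf <= MAX_CONTROL_VAF


def vaf_differences(calls):
    """Return (chrom, pos, tumor VAF - control VAF) for informative SNPs."""
    return [
        (c.chrom, c.pos,
         vaf(c.tumor_ref, c.tumor_alt) - vaf(c.control_ref, c.control_alt))
        for c in calls
        if is_informative(c)
    ]
```

A difference far from zero at a position that is heterozygous in the control is the kind of signal an LOH analysis looks for; how the differences are aggregated into LOH regions is not specified in the description and is left out here.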
The regularity of oscillating copy number states in the chromothriptic model was compared to a virtual chromosome generated by a Monte Carlo simulation, as described in Stephens et al. (Cell, 2011). For each distinct number of breakpoints, 100 simulation runs were completed, and mean values as well as 95% confidence intervals (CI) were calculated. Interspersed loss and retention of heterozygosity was analyzed by calculating the Jaccard index between heterozygously deleted segments and regions comprising LOH and SNP information. Randomness of the observed DNA segment order was tested using a Monte Carlo simulation as described in Korbel et al. (Cell, 2013). The uniform distribution of SV types was tested using a χ² goodness-of-fit test. The Wald-Wolfowitz runs test, as implemented in the R package randtests 1.0, was used as a right-sided test against the null hypothesis of a randomly distributed sequence of 5'-to-3' breakpoint joints.
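The Jaccard index mentioned above measures how well two sets of genomic regions coincide. A minimal sketch, assuming both sets are given as half-open `(start, end)` intervals on one chromosome and that overlap is counted in base pairs (the description does not state the exact unit the authors used; the function names are illustrative):

```python
def merge(intervals):
    """Sort intervals and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_length(intervals):
    """Total number of bases covered by non-overlapping intervals."""
    return sum(end - start for start, end in intervals)


def intersection_length(a, b):
    """Bases shared by two sorted, non-overlapping interval lists."""
    i = j = 0
    overlap = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            overlap += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return overlap


def jaccard_index(a, b):
    """|A ∩ B| / |A ∪ B| in base pairs; 0.0 if both sets are empty."""
    a, b = merge(a), merge(b)
    inter = intersection_length(a, b)
    union = total_length(a) + total_length(b) - inter
    return inter / union if union else 0.0
```

For example, with deleted segments `[(0, 100), (200, 300)]` and LOH regions `[(50, 100), (200, 250), (400, 450)]`, the two sets share 100 bases out of a 250-base union, giving an index of 0.4. A value near 1 would indicate that heterozygous deletions and LOH regions largely coincide.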

Created:
2017-12-03