Data from: Mutation screening of 1,237 cancer genes across six model cell lines of basal-like breast cancer
Source: DataONE | Updated: 2016-01-06 | Indexed: 2024-06-27
Download link:
https://search.dataone.org/view/null
Description:
Basal-like breast cancer is an aggressive subtype generally characterized by poor prognosis and by lack of expression of the three most important clinical biomarkers: estrogen receptor, progesterone receptor, and HER2. Cell lines serve as useful model systems to study cancer biology in vitro and in vivo. We performed mutational profiling of six basal-like breast cancer cell lines (HCC38, HCC1143, HCC1187, HCC1395, HCC1954, and HCC1937) and their matched normal lymphocyte DNA using targeted capture and next-generation sequencing of 1,237 cancer-associated genes, including all exons, UTRs, and upstream flanking regions. In total, 658 somatic variants were identified, of which 378 were non-silent (average 63 per cell line, range 37–146) and 315 were novel (not present in the Catalogue of Somatic Mutations in Cancer database; COSMIC). 125 novel mutations were confirmed by Sanger sequencing (59 exonic, 48 3’UTR, 10 5’UTR, and 1 splicing), with a validation rate of 94% of high-confidence variants. Of 36 mutations previously reported for these cell lines but not detected in our exome data, 36% could not be detected by Sanger sequencing. The base replacements C/G>A/T, C/G>G/C, C/G>T/A, and A/T>G/C were significantly more frequent in the coding regions than in the non-coding regions (OR 3.2, 95% CI 2.0–5.3, P<0.0001; OR 4.3, 95% CI 2.9–6.6, P<0.0001; OR 2.4, 95% CI 1.8–3.1, P<0.0001; OR 1.8, 95% CI 1.2–2.7, P = 0.024, respectively). Single nucleotide variants within the contexts T[C]T/A[G]A and T[C]A/T[G]A were also more frequent in the coding than in the non-coding regions (OR 3.7, 95% CI 2.2–6.1, P<0.0001; OR 3.8, 95% CI 2.0–7.2, P = 0.001, respectively). Copy number estimates were derived from the targeted regions and correlated well with Affymetrix SNP array copy number data (Pearson correlation 0.82 to 0.96 for all compared cell lines; P<0.0001).
These mutation calls across 1,237 cancer-associated genes and identification of novel variants will aid in the design and interpretation of biological experiments using these six basal-like breast cancer cell lines.
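For readers who want to recompute coding versus non-coding comparisons of the kind reported above from the variant calls, the following is a minimal sketch of an odds ratio with a Wald 95% confidence interval on a 2x2 table. The record does not state which method the authors used, so the Wald interval, the function name `odds_ratio_wald_ci`, and the counts are all assumptions made for illustration; the counts are hypothetical and are not taken from the dataset.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.959964):
    """Odds ratio and Wald confidence interval for the 2x2 table [[a, b], [c, d]].

    a, b: coding-region variants that are / are not in the class of interest
    c, d: non-coding variants that are / are not in the class of interest
    z:    normal quantile (1.959964 gives a two-sided 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (NOT from the dataset).
coding_in_class, coding_other = 30, 70
noncoding_in_class, noncoding_other = 20, 180

or_, lo, hi = odds_ratio_wald_ci(
    coding_in_class, coding_other, noncoding_in_class, noncoding_other
)
print(f"OR {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

With small cell counts an exact method (for example Fisher's exact test) may be preferable to the Wald approximation sketched here.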
Created:
2016-01-06



