TMC-SNPdb: An Indian germline variant dataset derived from whole exome sequence
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://www.ncbi.nlm.nih.gov/sra/ERP015935
Description:
Cancer is predominantly a somatic disease. A mutant allele found in a cancer cell genome is considered somatic when it is absent from both the paired normal genome and dbSNP, the most comprehensive public SNP database. However, dbSNP inadequately represents several non-Caucasian populations, including those from the Indian subcontinent, which limits cancer genomic analyses of data from these populations. We present "TMC-SNPdb" as the first open-source, freely accessible (through ANNOVAR), flexible, and upgradable SNP database built from whole-exome data of 62 normal samples derived from cancer patients of Indian origin, representing 114,309 unique germline variants. TMC-SNPdb is presented with a companion subtraction tool that can be run from the command line or through an easy-to-use graphical user interface (GUI), and that can deplete additional Indian population-specific SNPs over and above what is possible with the dbSNP and 1000 Genomes databases. Using an institutionally generated whole-exome data set of 132 samples of Indian origin, we demonstrate that "TMC-SNPdb" removed 42%, 33%, and 28% of the false-positive somatic events remaining after dbSNP depletion in tongue, gallbladder, and cervical cancer samples of Indian origin, respectively. Beyond cancer somatic analyses, we anticipate that "TMC-SNPdb" will be useful in several Mendelian germline diseases.
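The subtraction step described above can be illustrated with a minimal Python sketch. This is not the companion tool shipped with TMC-SNPdb; the `Variant` type, the `subtract_known_germline` function, and the toy coordinates are hypothetical and only show the general idea of removing candidates that match a known germline variant by chromosome, position, reference allele, and alternate allele.

```python
from typing import Iterable, List, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant key: chromosome, position, ref and alt alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def subtract_known_germline(
    candidates: Iterable[Variant], *germline_sets: Set[Variant]
) -> List[Variant]:
    """Return the candidates that appear in none of the germline sets.

    Order of the surviving candidates is preserved.
    """
    known: Set[Variant] = set().union(*germline_sets)
    return [v for v in candidates if v not in known]


# Toy data (made up for illustration, not taken from any real database).
candidates = [
    Variant("chr1", 100, "A", "G"),
    Variant("chr2", 200, "C", "T"),
    Variant("chr3", 300, "G", "A"),
]
dbsnp_like = {Variant("chr1", 100, "A", "G")}
population_db = {Variant("chr2", 200, "C", "T")}

# Depleting with a general database alone leaves a population-specific SNP behind.
after_general = subtract_known_germline(candidates, dbsnp_like)

# Adding a population-specific set removes that remaining false positive.
after_both = subtract_known_germline(candidates, dbsnp_like, population_db)

print(len(after_general), len(after_both))
```

In this toy case the second candidate survives the general-database filter and is removed only once the population-specific set is added, which is the kind of residual false positive the description refers to.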
Created:
2023-10-13



