Whole Genome Sequence of a Turkish Individual

Figshare; updated 2016-01-18; indexed 2026-04-29

Download link:

https://figshare.com/articles/dataset/_Whole_Genome_Sequence_of_a_Turkish_Individual_/898396

Description:

Although whole human genome sequencing can be done with readily available technical and financial resources, the need for detailed analyses of genomes of certain populations still exists. Here we present, for the first time, sequencing and analysis of a Turkish human genome. We sequenced the genome to 35x coverage using paired-end sequencing; over 95% of sequencing reads mapped to the reference genome, covering more than 99% of its bases. The assembly of unmapped reads yielded 11,654 contigs, 2,168 of which did not reveal any homology to known sequences, resulting in ∼1 Mbp of unmapped sequence. Single nucleotide polymorphism (SNP) discovery resulted in 3,537,794 SNP calls with 29,184 SNPs identified in coding regions, where 106 were nonsense and 259 were categorized as having a high-impact effect. The homozygous/heterozygous ratio (1,415,123∶2,122,671 or 1∶1.5) and the transition/transversion ratio (2,383,204∶1,154,590 or 2.06∶1) were within expected limits. Of the identified SNPs, 480,396 were potentially novel with 2,925 in coding regions, including 48 nonsense and 95 high-impact SNPs. Functional analysis of novel high-impact SNPs revealed various interaction networks, notably involving hereditary and neurological disorders or diseases. Assembly results indicated 713,640 indels (1∶1.09 insertion/deletion ratio), ranging from −52 bp to 34 bp in length and causing about 180 codon insertions/deletions and 246 frame shifts. Using paired-end- and read-depth-based methods, we discovered 9,109 structural variants and compared our variant findings with other populations. Our results suggest that whole genome sequencing is a valuable tool for understanding variations in the human genome across different populations. Detailed analyses of genomes of diverse origins greatly benefit research in genetics and medicine and should be conducted on a larger scale.
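The ratios quoted in the description can be re-derived from the raw counts it gives. The following is a minimal sketch of that arithmetic, not part of the original analysis; the variable names are illustrative, and the novel-SNP percentage is a derived figure that the description itself does not state.

```python
# Counts copied from the description; variable names are illustrative only.
total_snps = 3_537_794
homozygous, heterozygous = 1_415_123, 2_122_671
transitions, transversions = 2_383_204, 1_154_590
novel_snps = 480_396

# Each pair of counts should partition the full SNP call set.
assert homozygous + heterozygous == total_snps
assert transitions + transversions == total_snps

het_per_hom = heterozygous / homozygous      # heterozygous calls per homozygous call
ts_tv = transitions / transversions          # transition/transversion ratio
novel_fraction = novel_snps / total_snps     # derived here, not stated in the source

print(f"homozygous:heterozygous = 1:{het_per_hom:.2f}")
print(f"transition:transversion = {ts_tv:.2f}:1")
print(f"potentially novel SNPs  = {novel_fraction:.1%}")
```

Both sanity checks pass, so the zygosity and transition/transversion counts each account for every SNP call, and the computed ratios agree with the rounded values reported (1∶1.5 and 2.06∶1).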

Created:

2016-01-18
