Whole genome re-sequencing of a single Bos taurus animal for SNP discovery. Sequencing of Fleckvieh bull Vanstein
Source: NIAID Data Ecosystem (indexed 2026-03-06)
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJEB1985
Description:
Background: The approximately 2 million bovine SNPs available in dbSNP have mainly been identified in the Hereford breed by the bovine genome project and by sequencing reduced representation libraries of various breeds. To expand this resource, we performed whole-genome re-sequencing of a single Fleckvieh bull.
Results: We generated 24 Gbases of sequence, mainly using 36-bp paired-end reads, resulting in an average sequence depth of 7.4-fold. This coverage was sufficient to identify 2.44 million SNPs, 82% of which were previously unknown. Of these SNPs, 9360 cause non-synonymous substitutions within coding regions. We further identified 115,000 small indels. Comparison with the genotypes of the same animal generated on a 50k oligonucleotide chip revealed detection rates of 74% for homozygous SNPs and 30% for heterozygous SNPs. The false positive rate, as determined by comparison with oligonucleotide array data and with genotypes determined for 196 randomly selected SNPs, was approximately 1%. Accounting for these detection rates, we estimated a nucleotide diversity of approximately 9.4 × 10^-4, or 1 SNP per 1093 bp, which is in accordance with previous estimates. We further determined the allele frequencies of the 196 SNPs in 48 Fleckvieh and 48 Braunvieh bulls. 95% of the SNPs were polymorphic, with an average minor allele frequency (MAF) of 24.5%. The distribution of the minor allele frequency of the tested SNPs was nearly uniform, with 83% of the SNPs having a MAF larger than 5%.
Conclusions: This work provides the first genome of a single cattle individual obtained by next-generation sequencing. The chosen approach, low- to medium-coverage re-sequencing, identified more than 2 million novel SNPs, providing a valuable resource for the construction of high-density oligonucleotide arrays in the context of genome-wide association studies.
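The percentages quoted in the abstract can be turned into approximate counts as a quick consistency check. The sketch below is only illustrative: the constant names and the helper `share_to_count` are hypothetical, and because the abstract reports rounded percentages, the resulting counts are approximations rather than figures from the study.

```python
# Figures quoted in the abstract; names are illustrative, not from the study.
TOTAL_SNPS = 2_440_000        # SNPs identified by re-sequencing
NOVEL_FRACTION = 0.82         # share reported as previously unknown
VALIDATION_SNPS = 196         # randomly selected SNPs that were genotyped
POLYMORPHIC_FRACTION = 0.95   # share polymorphic in the 96 tested bulls
COMMON_FRACTION = 0.83        # share with MAF larger than 5%


def share_to_count(total: int, fraction: float) -> int:
    """Convert a reported (rounded) percentage into an approximate count."""
    return round(total * fraction)


novel_snps = share_to_count(TOTAL_SNPS, NOVEL_FRACTION)
polymorphic = share_to_count(VALIDATION_SNPS, POLYMORPHIC_FRACTION)
common = share_to_count(VALIDATION_SNPS, COMMON_FRACTION)

print(novel_snps, polymorphic, common)  # 2000800 186 163
```

The first value, about 2.0 million, is consistent with the "more than 2 million novel SNPs" stated in the conclusions; the other two suggest that roughly 186 of the 196 validation SNPs were polymorphic and roughly 163 had a MAF above 5%.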
Created:
2010-02-26



