Data from: Development of highly reliable in silico SNP resource and genotyping assay from exome capture and sequencing: an example from black spruce (Picea mariana)
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-06-15.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.tr87v
Description:
Picea mariana is a widely distributed boreal conifer across Canada and the
subject of advanced breeding programs for which population genomics and
genomic selection approaches are being developed. Targeted sequencing was
achieved after capturing P. mariana exome with probes designed from the
sequenced transcriptome of Picea glauca, a distant relative. A high
capture efficiency of 75.9% was reached although spruce has a complex and
large genome including gene sequences interspersed by some long introns.
The results confirmed the relevance of using probes from congeneric
species to successfully perform interspecific exome capture in the genus
Picea. A bioinformatics pipeline was developed including stringent
criteria that helped detect a set of 97 075 highly reliable in silico
SNPs. These SNPs were distributed across 14 909 genes. Part of an Infinium
iSelect array was used to estimate the rate of true positives by
validating 4267 of the predicted in silico SNPs by genotyping trees from
P. mariana populations. The true positive rate was 96.2% for in silico
SNPs, compared to a genotyping success rate of 96.7% for a set of 1115 P.
mariana control SNPs recycled from previous genotyping arrays. These
results indicate the high success rate of the genotyping array and the
relevance of the selection criteria used to delineate the new P. mariana
in silico SNP resource. Furthermore, in silico SNPs were generally of
medium to high frequency in natural populations, thus providing high
informative value for future population genomics applications.
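
As a rough sanity check on the figures quoted in the description, the short Python sketch below derives two numbers that the description does not state directly: the approximate count of validated in silico SNPs implied by the 96.2% true positive rate, and the mean number of SNPs per gene. The function names are hypothetical, and both outputs are back-of-envelope estimates computed from rounded percentages, not values reported by the dataset authors.

```python
def implied_count(total: int, rate_percent: float) -> int:
    """Return the count implied by a percentage of a total, rounded
    to the nearest integer. The result is approximate because the
    published percentage is itself rounded."""
    return round(total * rate_percent / 100)


def mean_per_group(items: int, groups: int) -> float:
    """Return the average number of items per group."""
    return items / groups


# Figures taken from the description above.
TESTED_IN_SILICO_SNPS = 4267
TRUE_POSITIVE_RATE = 96.2      # percent
TOTAL_IN_SILICO_SNPS = 97075
GENES_WITH_SNPS = 14909

# Derived estimates (assumptions, not reported values).
validated = implied_count(TESTED_IN_SILICO_SNPS, TRUE_POSITIVE_RATE)
snps_per_gene = mean_per_group(TOTAL_IN_SILICO_SNPS, GENES_WITH_SNPS)

print(f"Approximately {validated} of {TESTED_IN_SILICO_SNPS} tested SNPs validated")
print(f"About {snps_per_gene:.1f} in silico SNPs per gene on average")
```

The mean per gene only summarizes the totals; the description gives no information on how SNPs are actually distributed among genes.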
Provider:
Dryad
Created:
2015-09-17



