Data from: Prediction accuracies for growth and wood attributes of interior spruce in space using genotyping-by-sequencing

Name: Data from: Prediction accuracies for growth and wood attributes of interior spruce in space using genotyping-by-sequencing
Creator: The University of British Columbia
Published: 2025-04-24 19:38:38
License: No description available

DataCite Commons. Updated 2025-04-24; indexed 2025-04-16.

Download link:

https://doi.library.ubc.ca/10.14288/1.0397828

Description:

Abstract

Background: Genomic selection (GS) in forestry can substantially reduce the length of the breeding cycle and increase gain per unit time through early selection and greater selection intensity, particularly for traits of low heritability and late expression. Affordable next-generation sequencing technologies have made it possible to genotype large numbers of trees at a reasonable cost.

Results: Genotyping-by-sequencing was used to genotype 1,126 interior spruce trees representing 25 open-pollinated families planted over three sites in British Columbia, Canada. Four imputation algorithms were compared: mean value (MI), singular value decomposition (SVD), expectation maximization (EM), and a newly derived family-based k-nearest neighbor (kNN-Fam). Trees were phenotyped for several yield and wood attributes. Single- and multi-site GS prediction models were developed using the Ridge Regression Best Linear Unbiased Predictor (RR-BLUP) and Generalized Ridge Regression (GRR) to test different assumptions about trait architecture. Finally, multi-trait GS prediction models were developed using principal component analysis (PCA). The EM and kNN-Fam imputation methods were superior for 30% and 60% missing data, respectively. The RR-BLUP GS prediction model produced better accuracies than the GRR, indicating that the genetic architecture of these traits is complex. GS prediction accuracies for multi-site models were high and better than those of single-site models, while cross-site predictability produced the lowest accuracies, reflecting type-b genetic correlations, and was deemed unreliable. Incorporating genomic information in quantitative genetics analyses produced more realistic heritability estimates, as the half-sib pedigree tended to inflate the additive genetic variance and subsequently both heritability and gain estimates. Principal component scores as representatives of multi-trait GS prediction models produced surprising results: negatively correlated traits could be selected for concurrently using PCA2 and PCA3.

Conclusions: The application of GS to open-pollinated family testing, the simplest form of tree improvement evaluation, proved to be effective. The prediction accuracies obtained for all traits strongly support the integration of GS in tree breeding. While the within-site GS prediction accuracies were high, the results clearly indicate that the ability of single-site GS models to predict other sites is unreliable, supporting the use of a multi-site approach. Principal component scores provided an opportunity for the concurrent selection of traits with different phenotypic optima.
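
The imputation comparison described in the results can be illustrated with a minimal sketch. It covers only the simplest of the four methods, mean-value imputation (MI), and scores it by hiding 30% and 60% of known genotype calls and measuring the error on the hidden cells. The 0/1/2 allele-count coding, the toy matrix size, the use of `None` for a missing call, the RMSE metric, and all function names are assumptions made for illustration; they are not taken from the dataset or from the authors' analysis.

```python
import random
from statistics import mean

MISSING = None  # placeholder for a missing genotype call (assumption)


def mean_impute(genotypes):
    """Replace missing calls in each marker column with that marker's mean."""
    n_markers = len(genotypes[0])
    col_means = []
    for j in range(n_markers):
        observed = [row[j] for row in genotypes if row[j] is not MISSING]
        # Fall back to 0.0 if a marker has no observed calls at all.
        col_means.append(mean(observed) if observed else 0.0)
    return [
        [col_means[j] if row[j] is MISSING else row[j] for j in range(n_markers)]
        for row in genotypes
    ]


def mask_calls(genotypes, fraction, rng):
    """Hide a fraction of calls; return the masked copy and hidden positions."""
    cells = [(i, j) for i, row in enumerate(genotypes) for j in range(len(row))]
    hidden = rng.sample(cells, int(round(fraction * len(cells))))
    masked = [list(row) for row in genotypes]
    for i, j in hidden:
        masked[i][j] = MISSING
    return masked, hidden


def imputation_rmse(truth, imputed, hidden):
    """Root-mean-square error computed over the hidden cells only."""
    squared = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in hidden]
    return (sum(squared) / len(squared)) ** 0.5


rng = random.Random(0)
# Simulated toy data: 20 trees x 10 markers coded as 0/1/2 allele counts.
truth = [[rng.choice([0, 1, 2]) for _ in range(10)] for _ in range(20)]
for fraction in (0.30, 0.60):
    masked, hidden = mask_calls(truth, fraction, rng)
    imputed = mean_impute(masked)
    rmse = imputation_rmse(truth, imputed, hidden)
    print(f"{fraction:.0%} missing: RMSE = {rmse:.3f}")
```

The same masking harness could in principle be reused to score other imputation methods by swapping out `mean_impute`, which is the general shape of a comparison across missing-data levels.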

Provider:

The University of British Columbia

Created:

2021-05-21