Table_1_Application of Genomic Selection at the Early Stage of Breeding Pipeline in Tropical Maize.DOCX
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/Table_1_Application_of_Genomic_Selection_at_the_Early_Stage_of_Breeding_Pipeline_in_Tropical_Maize_DOCX/14866422
Description:
In maize, the doubled haploid (DH) line production capacity of large breeding programs often exceeds the capacity to phenotypically evaluate the complete set of testcross candidates in multi-location trials. The ability to partially select DH lines based on genotypic data while maintaining or improving genetic gains for key traits using phenotypic selection can result in significant resource savings. The present study aimed to evaluate genomic selection (GS) prediction scenarios for grain yield and agronomic traits of one of the tropical maize breeding pipelines of CIMMYT in eastern Africa, based on multi-year empirical data, for designing a GS-based strategy at the early stages of the pipeline. We used field data from 3,068 tropical maize DH lines genotyped using rAmpSeq markers and evaluated as testcrosses in well-watered (WW) and water-stress (WS) environments in Kenya from 2017 to 2019. Three prediction schemes were compared: (1) using 1 year of performance data to predict a second year; (2) using 2 years of pooled data to predict performance in the third year; and (3) using individual or pooled data plus converting a certain proportion of individuals from the testing set (TST) to the training set (TRN) to predict the next year's data. Employing five-fold cross-validation, the mean prediction accuracies for grain yield (GY) varied from 0.19 to 0.29 under WW and 0.22 to 0.31 under WS when the 1-year datasets were used as a training set to predict a second year's data as a testing set. The mean prediction accuracies increased to 0.32 under WW and 0.31 under WS when the 2-year datasets were used as a training set to predict the third-year dataset. In a forward prediction scenario, good predictive abilities (0.53 to 0.71) were found when the training set consisted of the previous year's breeding data plus 30% of the next year's data converted from the testing set to the training set.
The prediction accuracy for anthesis date and plant height across WW and WS environments, obtained using 1-year data and integrating 10, 30, 50, 70, and 90% of the TST set into the TRN set, was much higher than that obtained from models trained on individual years. We demonstrate that by increasing the TRN set to include genotypic and phenotypic data from the previous year and combining only 10–30% of the lines from the year of testing, the prediction accuracy can be increased. Genomic predictions could then partially replace the first stage of field-based screening, thus saving significant costs associated with testcross formation and multi-location testcross evaluation.
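The third prediction scheme (moving a proportion of the testing-year lines into the training set) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the description does not name the statistical model, so plain ridge regression on the marker matrix stands in for whatever genomic prediction model the study used, and the function names, the `lam` penalty, and the random choice of converted lines are hypothetical. Prediction accuracy is taken here as the Pearson correlation between predicted and observed values of the lines left in the testing set.

```python
import numpy as np

def split_with_conversion(n_test, proportion, rng):
    """Randomly pick which testing-year lines move to the training set.

    Returns (moved, kept) index arrays into the testing-year data.
    """
    order = rng.permutation(n_test)
    n_moved = int(round(proportion * n_test))
    return order[:n_moved], order[n_moved:]

def ridge_fit_predict(X_trn, y_trn, X_tst, lam=1.0):
    """Ridge regression on markers (an assumed stand-in model, not the study's)."""
    x_mean = X_trn.mean(axis=0)
    y_mean = y_trn.mean()
    Xc = X_trn - x_mean
    # Dual form: solve an n x n system, convenient when markers outnumber lines.
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y_trn)), y_trn - y_mean)
    return (X_tst - x_mean) @ (Xc.T @ alpha) + y_mean

def forward_accuracy(X_prev, y_prev, X_next, y_next, proportion, seed=0, lam=1.0):
    """Train on the previous year plus a share of the next year; score the rest."""
    rng = np.random.default_rng(seed)
    moved, kept = split_with_conversion(len(y_next), proportion, rng)
    X_trn = np.vstack([X_prev, X_next[moved]])
    y_trn = np.concatenate([y_prev, y_next[moved]])
    y_hat = ridge_fit_predict(X_trn, y_trn, X_next[kept], lam)
    # Pearson correlation between predicted and observed phenotypes.
    return float(np.corrcoef(y_hat, y_next[kept])[0, 1])
```

Calling `forward_accuracy(X_prev, y_prev, X_next, y_next, 0.3)` would correspond to the scenario in which 30% of the next year's lines are converted from TST to TRN. Repeating the call over several seeds and proportions (0.1, 0.3, 0.5, 0.7, 0.9) would mimic the comparison described above, though the values it produces depend entirely on the assumed model and data.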
Created:
2021-06-28



