Phenotypic and Genotypic Data with R Scripts for Genomic Prediction Models in Soybean Across Multiple Environments

Name: Phenotypic and Genotypic Data with R Scripts for Genomic Prediction Models in Soybean Across Multiple Environments
Creator: Iowa State University
Published: 2026-05-05 20:30:20
License: Not specified

DataCite Commons | Updated 2026-05-05 | Indexed 2026-05-10

Download link:

https://iastate.figshare.com/articles/dataset/Phenotypic_and_Genotypic_Data_with_R_Scripts_for_Genomic_Prediction_Models_in_Soybean_Across_Multiple_Environments/31929267

Description:

Improving selection accuracy in soybean breeding programs is crucial for reducing costs and shortening the time required to develop new varieties. Genomic selection (GS) is a promising tool for improving selection accuracy. This study aimed to identify the most effective GS models for predicting key traits, including seed yield, protein content, oil content, and maturity, in soybean breeding programs. Additionally, we sought to determine the best method for optimizing the training population (structure vs. size) and to establish the minimum number of genotypes required to ensure high model performance across locations. Finally, we explored multitrait selection based on genomic prediction to improve breeding decisions. Data were obtained from the soybean variety development program and included experiments planted in a randomized block design with two replications at eight locations during the 2023 and 2024 growing seasons. Six GS models were tested: rrBLUP, Bayes A, Bayes B, RKHS, random forest, and support vector machine. Four methods for optimizing the training population were evaluated: random selection (RS), maturity group random selection (MGRS), experimental random selection (ERS), and genetic algorithm (GA). We also assessed ten different training population sizes (ranging from 10% to 90%) and five selection index strategies: direct selection on seed yield, direct selection on oil content, direct selection on protein content, Rank-index, and Smith-Hazel index. Our results suggest that the most effective strategy for implementing GS in a public soybean breeding program is to use the rrBLUP model for genomic prediction. To optimize its performance, it is recommended to train the model using 80% of the total population. This approach provides robust, reliable predictions.
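The training-size comparison described above can be illustrated with a minimal sketch. It is an independent Python example, not the R scripts deposited with the dataset. It relies on the standard result that rrBLUP marker effects coincide with ridge regression on the markers when the shrinkage parameter equals the ratio of residual to marker-effect variance; here that parameter (`lam`) is simply fixed by assumption rather than estimated. The simulated genotypes, the function names, and the fractions tried are all hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ridge_marker_effects(Z, y, lam):
    """Solve (Z'Z + lam*I) u = Z'(y - mean(y)) for marker effects u."""
    p = len(Z[0])
    mu = sum(y) / len(y)
    ztz = [[sum(row[i] * row[j] for row in Z) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    zty = [sum(row[i] * (yi - mu) for row, yi in zip(Z, y)) for i in range(p)]
    return mu, solve(ztz, zty)

def pearson(a, b):
    """Pearson correlation, used here as the measure of predictive accuracy."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def accuracy_at_fraction(Z, y, frac, lam, rng):
    """Train on a random fraction of genotypes, then score on the rest."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    n_train = int(round(frac * len(idx)))
    train, test = idx[:n_train], idx[n_train:]
    mu, u = ridge_marker_effects([Z[i] for i in train], [y[i] for i in train], lam)
    pred = [mu + sum(z * e for z, e in zip(Z[i], u)) for i in test]
    return pearson(pred, [y[i] for i in test])

def simulate(n, p, rng, noise_sd=1.0):
    """Hypothetical data: markers coded -1/0/1 with additive effects plus noise."""
    effects = [rng.gauss(0, 1) for _ in range(p)]
    Z = [[rng.choice((-1, 0, 1)) for _ in range(p)] for _ in range(n)]
    y = [sum(z * e for z, e in zip(row, effects)) + rng.gauss(0, noise_sd)
         for row in Z]
    return Z, y

if __name__ == "__main__":
    rng = random.Random(42)
    Z, y = simulate(200, 20, rng)
    for frac in (0.2, 0.5, 0.8):  # illustrative subset of training fractions
        print(frac, round(accuracy_at_fraction(Z, y, frac, lam=1.0, rng=rng), 3))
```

A single random split per fraction is shown for brevity; averaging accuracy over repeated splits would give a steadier comparison between training sizes.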
Furthermore, structuring the training population through experimental random selection enhances genetic diversity, which is crucial for improving selection accuracy and robustness across breeding cycles. Finally, the Rank-index proved highly effective for selecting soybean genotypes, particularly for improving seed yield and oil content, while results for seed protein content were less promising. By considering multiple traits simultaneously, this method offers a more balanced approach to genetic improvement compared to single-trait selection methods, making it an excellent tool for breeding programs that aim to enhance both productivity and quality.
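As an illustration of multitrait selection with a rank-based index, the sketch below ranks genotypes within each trait and sums the ranks, so the lowest total marks the best overall candidate. The description does not define the Rank-index used in the study, so this rank-summation form is an assumption, as are the equal trait weights, the function names, and the predicted values, which are made up for the example. It is written in Python and is not taken from the deposited R scripts.

```python
def ranks_desc(values):
    """Rank 1 goes to the largest value; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

def rank_index(traits, weights=None):
    """Weighted sum of per-trait ranks for each genotype (lower is better)."""
    names = list(traits)
    weights = weights or {t: 1.0 for t in names}
    n = len(traits[names[0]])
    per_trait = {t: ranks_desc(traits[t]) for t in names}
    return [sum(weights[t] * per_trait[t][i] for t in names) for i in range(n)]

def select_top(index, k):
    """Positions of the k genotypes with the lowest index values."""
    return sorted(range(len(index)), key=lambda i: index[i])[:k]

# Hypothetical genomic predictions for four genotypes (higher is better for all traits).
predicted = {
    "seed_yield": [3.2, 3.8, 3.5, 2.9],
    "oil": [19.0, 20.5, 21.0, 18.5],
    "protein": [36.0, 34.0, 35.0, 37.0],
}

index = rank_index(predicted)
print(index)                 # [8.0, 7.0, 6.0, 9.0]
print(select_top(index, 2))  # [2, 1]
```

In this toy example the genotype with the highest protein prediction ranks last overall, which shows how a rank-sum index can favor yield and oil when those traits move together and protein moves against them.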

Provider:

Iowa State University

Created:

2026-04-07
