DataSheet_1_Feature engineering of environmental covariates improves plant genomic-enabled prediction.docx

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://figshare.com/articles/dataset/DataSheet_1_Feature_engineering_of_environmental_covariates_improves_plant_genomic-enabled_prediction_docx/25828906
Resource description:
Introduction: Because genomic selection (GS) is a predictive methodology, it needs to guarantee high prediction accuracy for practical implementation. However, since many factors affect the prediction performance of this methodology, its practical implementation still needs to be improved in many breeding programs. For this reason, many strategies have been explored to improve its prediction performance.

Methods: When environmental covariates are incorporated as inputs in genomic prediction models, this information helps increase prediction performance only some of the time. For this reason, this investigation explores the use of feature engineering on the environmental covariates to enhance the prediction performance of genomic prediction models.

Results and discussion: We found that across data sets, feature engineering helps reduce prediction error, relative to including the environmental covariates without feature engineering, by 761.625% across predictors. These results are very promising regarding the potential of feature engineering to enhance prediction accuracy. However, since a significant gain in prediction accuracy was observed in only some data sets, further research is required to guarantee a robust feature engineering strategy for incorporating the environmental covariates.
Created: 2024-05-15