Supporting data for: Multi-modal deep learning improves grain yield prediction in wheat breeding by fusing genomics and phenomics

Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-04-10.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.kprr4xh5p
Description:
Identifying and growing new crop varieties with the highest yield is of utmost importance to ensure robust and sustainable food supplies for the global population. Plant breeding programs benefit from increasing technological support but still rely on full growth cycles and manual yield measurements, which slows development. While methods to predict yield have been proposed, satisfactory levels of performance have yet to be reached. In this study, we propose a new machine learning model that leverages genotype and phenotype measurements simultaneously by fusing multiple sources of input data: longitudinal multispectral and thermal images and digital elevation models collected by unmanned aerial systems, along with single nucleotide polymorphism (SNP) measurements. To handle the varying number of observations for each sample, we use a deep multiple instance learning framework with an attention mechanism, which also sheds light on the importance the trained model gives to each data input during prediction, enhancing interpretability. Our model reaches a Pearson correlation coefficient of 0.754 ± 0.024 when predicting yield in similar environmental conditions, a 34.8% improvement over the genotype-only linear baseline (0.559 ± 0.050). Moreover, the model transfers to a new, unseen environment, where it obtains a Pearson correlation coefficient of 0.386 ± 0.010 (0.407 for ensemble performance) when predicting yield on new lines from genotypes alone, a 13.5% improvement over the linear baseline. We show that our multi-modal deep learning architecture efficiently accounts for plant health and environment, thereby distilling out the genetic contribution and providing excellent predictions. Yield prediction algorithms that leverage phenotypic observations during training are therefore expected to improve plant breeding programs through more accurate selection of lines, speeding up delivery of improved varieties.
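The description mentions attention-based multiple instance learning as the way to handle a varying number of observations per sample. The following is a minimal sketch of that general idea, not the authors' implementation: each sample is a "bag" of observation embeddings, each observation gets a learned score, the scores are normalised with a softmax, and the bag is summarised as the weighted sum. The function name `attention_pool`, the parameters `V` and `w`, and the toy numbers are all assumptions made for illustration.

```python
import math

def attention_pool(instances, V, w):
    """Pool a variable-length bag of embeddings into one vector.

    instances: list of embeddings (lists of floats), one per observation
    V: hidden-by-embedding weight matrix as a list of rows (hypothetical)
    w: weight vector of hidden size (hypothetical)
    Returns (pooled_embedding, attention_weights).
    """
    if not instances:
        raise ValueError("bag must contain at least one instance")
    # Unnormalised score per instance: w . tanh(V h)
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * zi for wi, zi in zip(w, hidden)))
    # Softmax over the instances, shifted by the max for numerical stability
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of the instance embeddings
    dim = len(instances[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, instances))
              for d in range(dim)]
    return pooled, weights

# Toy bags of different sizes share the same (made-up) parameters.
V = [[0.5, -0.5], [0.1, 0.2]]
w = [1.0, -1.0]
bag_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bag_b = [[0.2, 0.4]]
pooled_a, weights_a = attention_pool(bag_a, V, w)
pooled_b, weights_b = attention_pool(bag_b, V, w)
```

Because the weights sum to one within each bag, the pooled vector has the same size regardless of how many observations a sample has, and the weights themselves can be inspected to see which observations contributed most to the summary.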
Provider:
Dryad
Created:
2023-05-18