Supporting data for: Multi-modal deep learning improves grain yield prediction in wheat breeding by fusing genomics and phenomics
DataCite Commons | Updated: 2025-06-01 | Indexed: 2025-04-10
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.kprr4xh5p
Description:
Identifying and growing new crop varieties with the highest yield is of
utmost importance to ensure robust and sustainable food supplies for the
global population. Plant breeding programs benefit from increasing
technological support but still rely on a full growth cycle and manual yield
measurement, which slows development. While methods to predict
yield have been proposed, satisfactory levels of performance have yet to be
reached. In this study, we propose a new machine learning model that
simultaneously leverages both genotype and phenotype measurements by fusing
multiple sources of input data collected by unmanned aerial systems:
longitudinal multispectral and thermal images, digital elevation models,
along with single nucleotide polymorphism (SNP) measurements. To tackle
the varying number of observations for each sample, we leverage a deep
multiple instance learning framework with an attention mechanism that also
allows us to shed light on the importance the trained model gives to each
data input during prediction, enhancing interpretability. Our model
reaches 0.754 ± 0.024 Pearson correlation coefficient when predicting
yield in similar environmental conditions, which represents a 34.8%
improvement over the genotype-only linear baseline (0.559 ± 0.050).
Moreover, we achieve transfer to a new, unseen environment where we obtain
0.386 ± 0.010 (0.407 for ensemble performance) Pearson correlation
coefficient when predicting yield on new lines when using genotypes alone,
a 13.5% improvement over the linear baseline. We show that our multi-modal
deep learning architecture efficiently accounts for plant health and
environment, thereby distilling out the genetic contribution and providing
excellent predictions. Yield prediction algorithms leveraging phenotypic
observations during training are therefore expected to improve plant
breeding programs with more accurate selections of lines, speeding up
delivery of improved varieties.
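
The description above mentions a deep multiple instance learning framework with an attention mechanism for handling a varying number of observations per sample. The following is a minimal, generic sketch of attention-based pooling over a bag of instance embeddings. It is not the authors' architecture: the function name `attention_mil_pool`, the parameters `V` and `w`, the embedding sizes, and the random inputs are all illustrative assumptions.

```python
import numpy as np

def attention_mil_pool(instances, V, w):
    """Pool a variable-size bag of instance embeddings into one vector.

    instances: (n_instances, d) array, one row per observation
               (hypothetical embeddings, e.g. one per image).
    V: (d, h) projection matrix; w: (h,) scoring vector (illustrative).
    Returns the bag embedding (d,) and the attention weights (n_instances,).
    """
    scores = np.tanh(instances @ V) @ w      # one scalar score per instance
    scores = scores - scores.max()           # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ instances, weights      # weighted sum of the instances

rng = np.random.default_rng(0)
d, h = 8, 4                                  # assumed embedding sizes
V = rng.normal(size=(d, h))
w = rng.normal(size=h)

# Two samples with different numbers of observations (3 and 7).
bag_a = rng.normal(size=(3, d))
bag_b = rng.normal(size=(7, d))
emb_a, att_a = attention_mil_pool(bag_a, V, w)
emb_b, att_b = attention_mil_pool(bag_b, V, w)
```

Both bags map to an embedding of the same size regardless of how many observations they contain, and the attention weights (non-negative, summing to one per bag) are the kind of quantity that can be inspected to see which inputs contributed most to a prediction.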
Provider:
Dryad
Created:
2023-05-18



