Supporting data for: Multi-modal deep learning improves grain yield prediction in wheat breeding by fusing genomics and phenomics

Indexed by NIAID Data Ecosystem on 2026-05-01
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.kprr4xh5p
Description:
Identifying and growing new crop varieties with the highest yield is of utmost importance to ensure robust and sustainable food supplies for the global population. Plant breeding programs benefit from increasing technological support but still rely on a full growth cycle and manual yield measurement, which slows development. While methods to predict yield have been proposed, satisfactory levels of performance are still to be reached. In this study, we propose a new machine learning model that leverages genotype and phenotype measurements simultaneously by fusing single nucleotide polymorphism (SNP) measurements with data collected by unmanned aerial systems (UAS): longitudinal multispectral and thermal images and digital elevation models. To handle the varying number of observations for each sample, we use a deep multiple instance learning framework with an attention mechanism. The attention mechanism also sheds light on the importance the trained model gives to each data input during prediction, enhancing interpretability. Our model reaches a Pearson correlation coefficient of 0.754 ± 0.024 when predicting yield in similar environmental conditions, a 34.8% improvement over the genotype-only linear baseline (0.559 ± 0.050). Moreover, the model transfers to a new, unseen environment, where it reaches a Pearson correlation coefficient of 0.386 ± 0.010 (0.407 for ensemble performance) when predicting yield on new lines using genotypes alone, a 13.5% improvement over the linear baseline. We show that our multi-modal deep learning architecture efficiently accounts for plant health and environment, thereby distilling out the genetic contribution and providing excellent predictions. Yield prediction algorithms that leverage phenotypic observations during training are therefore expected to improve plant breeding programs through more accurate selection of lines, speeding up delivery of improved varieties.
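The description above does not give the architecture in detail, so the following is only a minimal sketch of one common formulation of attention pooling for multiple instance learning. It shows how a bag with a varying number of observations can be reduced to one fixed-size vector, and how the attention weights indicate each observation's contribution. The function name `attention_pool`, the parameters `V` and `w`, and all numeric values are hypothetical and are not taken from the dataset or the study.

```python
import math

def attention_pool(instances, V, w):
    """Pool a bag of instance embeddings into one vector.

    instances: list of K embeddings of length D (K may differ between bags)
    V: list of H rows of length D, a hidden projection (hypothetical values)
    w: list of H scoring weights (hypothetical values)
    Returns (pooled_embedding, attention_weights).
    """
    scores = []
    for h in instances:
        # Score each instance: w . tanh(V h)
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))
    # Softmax over the instances in the bag (shifted for numerical stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attention-weighted sum of the instance embeddings
    dim = len(instances[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, instances)) for d in range(dim)]
    return pooled, weights

# Hypothetical parameters; in a trained model these would be learned.
V = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
w = [1.0, -1.0]

# Two bags with different numbers of observations (made-up embeddings).
bag_a = [[0.2, 0.1, 0.4], [0.9, 0.3, 0.1]]
bag_b = [[0.2, 0.1, 0.4], [0.9, 0.3, 0.1], [0.5, 0.5, 0.5]]

emb_a, att_a = attention_pool(bag_a, V, w)
emb_b, att_b = attention_pool(bag_b, V, w)
print(len(emb_a), len(emb_b))          # both bags map to the same embedding size
print([round(a, 3) for a in att_b])    # one weight per observation in the bag
```

Because the weights sum to one within each bag, they can be read as the relative importance of each observation for that sample, which is the kind of interpretability the description refers to.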
Methods

Plant Material and Field Layout

Spring wheat (Triticum aestivum L.) breeding lines from two experiments, YT (Yield Trials, 27°22'57.6'' N, 109°55'34.7'' W) and EYT (Elite Yield Trials, 27°23'0.1'' N, 109°55'7.9'' W), were selected from the International Maize and Wheat Improvement Center (CIMMYT) wheat breeding program. All trials were planted in November 2017 at the Norman E. Borlaug Experiment Station in Ciudad Obregon, Sonora, Mexico, for the 2017–18 season. The YT experiment consisted of 1800 unique spring wheat entries, and the EYT experiment of 1710 unique entries. Both experiments were arranged in an alpha lattice design, with two blocks in YT and three blocks in EYT. The YT plots served as experimental units and were 1.7 m × 3.4 m in size, planted on two raised beds spaced 0.8 m apart, with paired rows on each bed at 0.15 m spacing. The EYT plots were sown on flat ground and were 1.3 m × 4 m in size, with six rows per plot.

UAS, Sensors, and Image Acquisition

The UAS used for image acquisition was a DJI Matrice 100 (DJI, Shenzhen, China). Flight plans were created using the Litchi Android app (VC Technology Ltd., UK) and the CSIRO mission planner application (https://uavmissionplanner.netlify.app/) for the DJI Matrice 100. The flight speed, the flight elevation above the ground, and the width between two parallel flight paths were adjusted based on the overlap rate and the camera field of view. Both cameras were triggered automatically using the onboard GNSS unit at a constant interval of distance traveled. A summary of flight settings is listed in the Supplement (Table 1).

To collect thermal images of the spring wheat nurseries, a FLIR VUE Pro R thermal camera (FLIR Systems, USA) was carried by the DJI Matrice 100. All data collections were conducted between 11 AM and 1 PM during the grain filling stage.
The aerial image overlap rate between two geospatially adjacent images was set to 80%, both sequentially and laterally, to ensure optimal orthomosaic photo stitching quality. To preserve the image pixel information, the FLIR camera was set to capture Radiometric JPEG (R-JPEG) images.

A MicaSense RedEdge-M multispectral camera (MicaSense Inc., USA) was used to collect spring wheat canopy images in both the YT and EYT experiments. All UAS flights were conducted between 11 AM and 2 PM. As with the thermal flights, the image overlap rate was set to 80%, both sequentially and laterally, to ensure optimal orthomosaic photo stitching quality. To preserve the image pixel intensity, the MicaSense RedEdge-M camera was set to capture uncompressed TIFF images.

To improve the geospatial accuracy of orthomosaic and orthorectified images, ground control points (GCPs) consisting of bright white/reflective square markers were distributed uniformly across the field experiment before image acquisition and surveyed to centimeter-level resolution. All GCPs were surveyed using a Trimble R4 RTK (Trimble Inc., Sunnyvale, California, US) Global Positioning System (GPS).

Plot-level Traits Extraction

Plot-level phenotypic trait values used for learning include multiple vegetation indices (VIs), the canopy height from the digital elevation models (DEMs), and the canopy temperature. Extraction of plot-level phenotypic values from orthomosaic and orthorectified images followed the methodology of Wang et al. (2020).

Wang, Xu, Paula Silva, Nora Bello, Daljit Singh, Byron Evers, Suchismita Mondal, Francisco Pinto, Ravi Prakash Singh, and Jesse Poland. "Improved accuracy of high-throughput phenotyping from Unmanned Aerial Systems by extracting traits directly from orthorectified images." Frontiers in Plant Science 11 (2020): 1616.
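The text does not say which vegetation indices were computed, so the sketch below uses NDVI only as a familiar example of how a plot-level index value can be derived from per-pixel reflectance. It assumes the red and near-infrared reflectance values for the pixels inside one plot have already been extracted. The function names and the sample values are hypothetical, and the sketch does not reproduce the extraction pipeline of Wang et al. (2020).

```python
from statistics import mean

def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    return (nir - red) / (nir + red)

def plot_level_ndvi(nir_pixels, red_pixels):
    """Average per-pixel NDVI over the pixels of one plot.

    Pixels whose reflectances sum to zero are skipped to avoid
    division by zero (an assumption made for this sketch).
    """
    values = [ndvi(n, r) for n, r in zip(nir_pixels, red_pixels) if (n + r) != 0]
    return mean(values)

# Made-up reflectance values for the pixels of a single plot.
nir_pixels = [0.50, 0.60, 0.55]
red_pixels = [0.10, 0.20, 0.05]

print(round(plot_level_ndvi(nir_pixels, red_pixels), 3))
```

Averaging per-pixel index values is one option; computing the index from plot-mean reflectances is another, and the two generally give slightly different numbers.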
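Flight settings by imaging sensor and experiment (flight dates in MM/DD/YYYY):

Imaging sensor       Experiment  Flight speed  Flight altitude  GSD of orthomosaic  Flight dates
MicaSense RedEdge-M  YT          14 km/h       35 m AGL         2.05 cm/pixel       01/18/2018, 02/26/2018, 03/07/2018, 03/15/2018
MicaSense RedEdge-M  EYT         14 km/h       35 m AGL         2.05 cm/pixel       01/19/2018, 02/23/2018, 03/02/2018, 03/07/2018, 03/21/2018
FLIR VUE Pro R       YT          18 km/h       60 m AGL         8.20 cm/pixel       03/08/2018, 03/18/2018
FLIR VUE Pro R       EYT         18 km/h       60 m AGL         8.20 cm/pixel       02/23/2018, 03/02/2018

GSD: ground sample distance. AGL: above ground level.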
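The flight settings are linked by simple geometry: altitude and field of view determine the image footprint on the ground, and the overlap rate then determines the spacing between flight lines and the distance between camera triggers. The following is a minimal sketch of that relationship for the multispectral flights (35 m altitude, 14 km/h, 80% overlap). It assumes a nadir-pointing camera over flat terrain, and the two field-of-view angles are placeholder values chosen for illustration, not specifications taken from this dataset.

```python
import math

def footprint_m(altitude_m, fov_deg):
    """Ground footprint of one image along one axis, for a nadir-pointing camera."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)

def spacing_m(footprint, overlap):
    """Distance between adjacent images that yields the requested overlap fraction."""
    return footprint * (1.0 - overlap)

# Values from the flight settings for the multispectral camera
ALTITUDE_M = 35.0
SPEED_KMH = 14.0
OVERLAP = 0.80  # applied both sequentially and laterally

# Placeholder field-of-view angles (assumptions, not from the dataset)
ASSUMED_FOV_ACROSS_DEG = 47.0
ASSUMED_FOV_ALONG_DEG = 36.0

across = footprint_m(ALTITUDE_M, ASSUMED_FOV_ACROSS_DEG)
along = footprint_m(ALTITUDE_M, ASSUMED_FOV_ALONG_DEG)

line_spacing = spacing_m(across, OVERLAP)      # width between parallel flight paths
trigger_distance = spacing_m(along, OVERLAP)   # distance traveled between triggers
speed_ms = SPEED_KMH / 3.6
trigger_period_s = trigger_distance / speed_ms

print(f"footprint: {across:.1f} m across-track, {along:.1f} m along-track")
print(f"flight line spacing: {line_spacing:.1f} m")
print(f"trigger every {trigger_distance:.1f} m ({trigger_period_s:.1f} s at {SPEED_KMH} km/h)")
```

Raising the altitude widens the footprint and therefore the line spacing, at the cost of a coarser ground sample distance.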
Created: 2023-05-18