Data from Hackathon "GenHack 3 Generative Modeling Challenge": Predicting maize crop yield distribution under stochastic weather
Source: DataCite Commons. Updated: 2025-05-16. Indexed: 2025-04-16.
Download link:
https://entrepot.recherche.data.gouv.fr/citation?persistentId=doi:10.57745/C3FNBY
Resource description:
Hackathon Overview

The GenHack 3 is a data challenge organized by École Polytechnique in 2024. The task was to construct generative models for predicting maize crop yield distributions conditional on temperatures and rainfall across multiple locations simultaneously. This hackathon was divided into two rounds, each with different levels of weather conditioning, to accurately reproduce the effects of weather on final crop yield. The detailed task description can be provided upon request.

Dataset Creation

The dataset was generated using a Stochastic Weather Generator (SWG) and a crop model. The SWG was trained on data from four French weather stations. This weather station data was downloaded through the INRAE CLIMATIK platform, managed by the AgroClim laboratory of Avignon, France (site available in French). The stations and their identification numbers are as follows:

Montreuil-Bellay (49215002)
Mons-en-Chaussée (80557001)
Saint-Martin-de-Hinx (40272002)
Saint-Gènes-Champanelle (63345002)

We trained a daily multi-site and multivariate SWG using the following weather variables: daily minimum and maximum temperatures, precipitation, solar irradiance, and Penman evapotranspiration. The SWG is an extension of the model described in the paper "Interpretable Seasonal Hidden Markov Model for Spatio-temporal Stochastic Rain Generation". The full training details are available in the tutorial of the Julia package StochasticWeatherGenerators.jl.

The SWG generated N years of weather data, which was input into the STICS crop model for maize (see the STICS website) to produce N annual crop yield values. The parameters used in the STICS model are also described in the tutorial. The most important modification to the default parameters is that no irrigation was provided, to highlight the hydric stress on the plant.
Weather Data Aggregation

Daily maximum temperatures and average rainfall were aggregated into nine periods spanning April 27 to October 27 (the maize growth period):

Period 1: April 27 - May 16
Period 2: May 17 - June 5
Period 3: June 6 - June 25
Period 4: June 26 - July 15
Period 5: July 16 - August 4
Period 6: August 5 - August 24
Period 7: August 25 - September 13
Period 8: September 14 - October 3
Period 9: October 4 - October 27

Details on reproducing these aggregated variables are explained in the tutorial section "Sensitivity of maize on rainfall during key growth periods". Participants were provided with these aggregated weather variables and the resulting yield data. The objective was to build a generative model capable of generating yield values conditionally on specific weather conditions (e.g., high or low rainfall).

Dataset Structure

The dataset provided to participants included 104 realizations, with a separate validation dataset of 105 realizations used for evaluation.

Column 1 (YEAR): Year number
Columns 2-10 (W_1-W_9): Mean daily temperature (°C) over each of the nine periods
Columns 11-19 (W_10-W_18): Mean daily rainfall (mm/mm²) over each of the nine periods
Column 20 (YIELD): Annual maize yield (t/ha)
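
The period aggregation and the 20-column layout described above can be illustrated with a short Python sketch. This is not the code from the StochasticWeatherGenerators.jl tutorial, which is written in Julia and remains the reference for reproducing the variables. The function names (`period_index`, `period_means`, `build_row`) and the `{date: value}` input format are assumptions made for illustration, and the choice of which daily temperature series to pass in is left to the caller.

```python
from datetime import date, timedelta

# Period boundaries as ((start_month, start_day), (end_month, end_day)),
# copied from the nine periods listed in the description.
PERIODS = [
    ((4, 27), (5, 16)),
    ((5, 17), (6, 5)),
    ((6, 6), (6, 25)),
    ((6, 26), (7, 15)),
    ((7, 16), (8, 4)),
    ((8, 5), (8, 24)),
    ((8, 25), (9, 13)),
    ((9, 14), (10, 3)),
    ((10, 4), (10, 27)),
]

# Column names in file order: YEAR, W_1..W_18, YIELD.
COLUMNS = ["YEAR"] + [f"W_{i}" for i in range(1, 19)] + ["YIELD"]


def period_index(day):
    """Return the 1-based period containing `day`, or None outside April 27 - October 27."""
    for k, (start, end) in enumerate(PERIODS, start=1):
        if date(day.year, *start) <= day <= date(day.year, *end):
            return k
    return None


def period_means(daily):
    """Average a {date: value} mapping over each of the nine periods.

    Returns a list of nine means; a period with no observations yields None.
    """
    sums = [0.0] * len(PERIODS)
    counts = [0] * len(PERIODS)
    for day, value in daily.items():
        k = period_index(day)
        if k is not None:
            sums[k - 1] += value
            counts[k - 1] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def build_row(year, temperature, rainfall, yield_t_ha):
    """Assemble one record in the documented column order (hypothetical helper)."""
    values = [year] + period_means(temperature) + period_means(rainfall) + [yield_t_ha]
    return dict(zip(COLUMNS, values))
```

Given one simulated year of daily temperature and rainfall plus the corresponding yield, `build_row` returns a dictionary keyed by `YEAR`, `W_1` through `W_18`, and `YIELD`. Days outside the April 27 to October 27 window are ignored.
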
Provider:
Recherche Data Gouv
Created:
2025-01-16



