Input data for winter wheat yield forecasting
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://data.mendeley.com/datasets/rbs52ww8v2
Resource description:
Data Description for County-Scale Winter Wheat Yield Prediction in Eastern China (2005-2022)
This dataset, provided as a single CSV file (winter_wheat_yield_prediction_data.csv), compiles input features for modeling winter wheat yield across 77 counties in eastern China from 2005 to 2022. Each row represents a unique county-year observation. The data were compiled from the following sources:
• County-level winter wheat yield (ton/ha): From official statistical yearbooks.
• Climate variables: Growing season (Oct-May) averages/totals for temperature, precipitation, and solar radiation from TerraClimate (1/24-degree), aggregated to county level.
• Remote sensing variables:
o Vegetation Indices (VIs): NDVI, EVI, and NIRv from MODIS (MOD13A2/MYD13A2), pre-processed and aggregated to county-level maximums/averages.
o Solar-Induced Chlorophyll Fluorescence (SIF): High-resolution GOSIF and CSIF, aggregated to county level, providing a direct proxy for photosynthetic activity.
• County-level planting area: From the National Earth System Science Data Center.
All data layers were matched by county and year. Missing values were handled via imputation; quality control removed outliers.
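Because each row is described as a unique county-year observation, a quick integrity check is to look for repeated (county, year) keys before modeling. The sketch below is only an illustration: the column names `county`, `year`, and `yield_t_ha` and the sample rows are assumptions, not the actual header or contents of winter_wheat_yield_prediction_data.csv, so adjust them to match the file.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample; the column names are assumed, not taken from the real file.
SAMPLE_CSV = """county,year,yield_t_ha
A,2005,5.1
A,2006,5.4
B,2005,4.8
B,2005,4.9
"""

# 77 counties over 2005-2022 (18 years) gives the size of a complete panel.
EXPECTED_MAX_ROWS = 77 * len(range(2005, 2023))


def find_duplicate_county_years(rows, county_col="county", year_col="year"):
    """Return the (county, year) keys that occur more than once."""
    counts = Counter((row[county_col], int(row[year_col])) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
duplicates = find_duplicate_county_years(rows)
print(duplicates)  # [('B', 2005)]
print(len(rows) <= EXPECTED_MAX_ROWS)  # True
```

To run the same check on the real file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle. A complete panel would hold at most 77 × 18 = 1,386 rows; the description does not say whether every county-year is present.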
Notable Findings & Interpretation:
Our modeling (using LASSO, RIDGE, SVR, RF, XGBoost, TabPFN) yielded key insights:
1. Synergistic Power: Integrating climate and remote sensing data delivered the most robust predictions ($R^2 = 0.72$ to $0.81$), outperforming climate-only ($R^2 = 0.60$ to $0.78$) and remote sensing-only ($R^2 = 0.43$ to $0.65$) models. This highlights the value of capturing both environmental drivers and their biological manifestations.
2. SIF's Advantage: SIF generally outperformed VIs ($R^2_{max}=0.65$ vs. $R^2_{max}=0.62$) due to its direct link to photosynthesis. NIRv performed comparably to CSIF in remote sensing-only scenarios.
3. Dynamic Data Contributions: Data roles evolve seasonally. Climate data was crucial early on; remote sensing became more informative as the season progressed, integrating cumulative weather effects.
4. Non-linear Model Superiority: Non-linear ML methods consistently outperformed linear models, with TabPFN achieving the best performance, underscoring the inherently complex, non-linear relationships between the predictors and crop yield.
5. Robustness in Anomalous Years: SIF-based models (especially GOSIF) showed superior robustness in challenging years (e.g., 2016), maintaining better performance when VIs struggled.
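The findings above compare feature sets by $R^2$. A minimal sketch of that comparison is shown below, assuming the common coefficient-of-determination definition $R^2 = 1 - SS_{res}/SS_{tot}$; the description does not state which definition the authors used. The yields and predictions are made-up numbers for illustration only and are not results from this dataset.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Made-up yields (ton/ha) and predictions, for illustration only.
observed = [4.0, 5.0, 6.0, 7.0]
predictions = {
    "climate_only": [4.5, 5.5, 5.5, 6.5],
    "climate_plus_remote_sensing": [4.5, 5.0, 6.0, 6.5],
}

# Score every hypothetical feature set against the same observed yields.
scores = {name: r_squared(observed, pred) for name, pred in predictions.items()}
for name, score in scores.items():
    print(f"{name}: R^2 = {score:.2f}")
```

Scoring every feature set against the same held-out observations keeps the comparison fair; scores computed on different subsets of county-years are not directly comparable.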
How to Interpret & Use This Data:
This dataset is a valuable resource for agricultural science, remote sensing, and environmental modeling. It can be used to:
• Validate/benchmark new yield prediction models against comprehensive real-world data.
• Investigate spatio-temporal yield patterns and underlying environmental drivers.
• Explore relationships between climate, remote sensing, and yield.
• Develop/refine agricultural management strategies by understanding yield-influencing factors.
• Study the impacts of extreme weather events on winter wheat productivity.
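For benchmarking and for studying anomalous years such as 2016, one option is to hold out a whole year at a time so that a model is always tested on a season it has not seen. The sketch below shows such a leave-one-year-out split. It is an assumed evaluation scheme, not necessarily the one the authors used, and the `year` key and the sample rows are hypothetical.

```python
def leave_one_year_out(rows, year_key="year"):
    """Yield (held_out_year, train_rows, test_rows) for each distinct year."""
    years = sorted({row[year_key] for row in rows})
    for held_out in years:
        train = [row for row in rows if row[year_key] != held_out]
        test = [row for row in rows if row[year_key] == held_out]
        yield held_out, train, test


# Made-up county-year records, for illustration only.
rows = [
    {"county": "A", "year": 2015, "yield_t_ha": 5.2},
    {"county": "B", "year": 2015, "yield_t_ha": 4.9},
    {"county": "A", "year": 2016, "yield_t_ha": 4.1},
    {"county": "B", "year": 2016, "yield_t_ha": 3.8},
    {"county": "A", "year": 2017, "yield_t_ha": 5.5},
]

splits = {year: (train, test) for year, train, test in leave_one_year_out(rows)}
train_2016, test_2016 = splits[2016]
print(len(train_2016), len(test_2016))  # 3 2
```

Splitting by year rather than by random rows avoids leaking information from the same growing season into both the training and test sets.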
Created:
2025-06-26



