
Early-season biomass and weather enable robust cereal rye cover crop biomass predictions

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.ngf1vhj1r
Resource description:
Farmers need accurate estimates of winter cover crop biomass to make informed decisions on termination timing and to estimate the potential release of nitrogen from cover crop residues to subsequent cash crops. Using data from an extensive experiment across 11 states from 2016 to 2020, this study explores the most reliable predictors of cereal rye cover crop biomass at the time of termination. Our findings demonstrate a strong relationship between early-season and late-season cover crop biomass. Using a random forest model, we predicted late-season cereal rye biomass with a margin of error of approximately 1,000 kg ha⁻¹, with early-season biomass, growing degree days, cereal rye planting and termination dates, photosynthetically active radiation, precipitation, and site coordinates as predictors. Our results suggest that similar modeling approaches could be combined with remotely sensed early-season biomass estimates to improve predictions of winter cover crop biomass at termination in decision support tools.
Methods
2.1 Field sites and operations
Cereal rye cover crop biomass data used in the modeling approach were obtained from a field experiment conducted on research farms in 11 states between 2016 and 2020 (as outlined in Supplementary Table 1). Cereal rye was planted in 9.1 by 12.2 m plots in the late fall of each year, with four or five replicates per site-year. Management practices (i.e., cereal rye variety, seeding rates, and seeding methods) specific to each site were based on local norms. Biomass samples were collected from two 0.5-m² quadrats in each plot at six weeks (hereafter referred to as “early-season biomass”) and two weeks (“late-season biomass”) prior to target dates for soybean planting. Cereal rye variety, cereal rye planting dates, and early- and late-season biomass sampling dates are summarized across sites and years in Supplementary Table 2.
2.2 Data assembly and preparation
In addition to early-season biomass, which was hypothesized to predict winter cover crop growth, weather variables related to temperature and radiation were used to model late-season biomass. Daily minimum and maximum air temperatures (°C) and incoming shortwave solar radiation (W m⁻²) were extracted for each site-year at a spatial resolution of 0.125° by 0.125° from the North American Land Data Assimilation System Phase 2 dataset (Xia et al., 2012). Cumulative growing degree days (CGDD) (-4.5 °C base) were calculated over two time periods, and negative values were omitted (Pessotto et al., 2023). “Early CGDD” and “early precipitation” were summed between the cereal rye planting date and the early termination date (six weeks prior to soybean planting), and “late CGDD” and “late precipitation” were summed between the early and late termination dates. Precipitation data were extracted from the multi-radar/multi-sensor system (NOAA Multi-Radar/Multi-Sensor System (MRMS), 2023). Daily photosynthetically active radiation (PAR) was calculated from shortwave radiation using the ‘sw.to.par’ function in the LakeMetabolizer v.1.5.0 R package (Winslow et al., 2016). The mean of daily PAR was calculated for the period between the early and late cover crop termination dates.
2.3 Statistical Analyses
2.3.1 Model selection to evaluate support for each covariate
All predictor variables (early-season cereal rye biomass, cereal rye planting date in Julian days, late termination date in Julian days, mean late PAR, and both early and late CGDD) were standardized by subtracting the mean and dividing by the standard deviation of each variable (Gelman, 2008). We examined all candidate predictor variables for collinearity using the vif function in the car package (v3.0-10) (Fox et al., 2018) and removed the precipitation variables because their variance inflation factor scores were greater than 3 (Zuur et al., 2010).
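The CGDD calculation and the predictor standardization described above can be illustrated with a minimal Python sketch. It is not the authors' R code. The daily GDD formula (mean of minimum and maximum temperature minus the base temperature) is an assumption, since the text states only the base temperature and that negative values were omitted; the function names are hypothetical.

```python
from statistics import mean, stdev

BASE_TEMP_C = -4.5  # base temperature stated in the text


def cumulative_gdd(daily_temps, base_c=BASE_TEMP_C):
    """Sum daily growing degree days over a period.

    daily_temps is a list of (tmin_c, tmax_c) tuples, one per day.
    The averaging formula below is an assumed, commonly used definition;
    days with negative GDD are omitted, as the text describes.
    """
    total = 0.0
    for tmin_c, tmax_c in daily_temps:
        gdd = (tmin_c + tmax_c) / 2.0 - base_c
        if gdd > 0:
            total += gdd
    return total


def standardize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator), like R's sd()
    return [(v - m) / s for v in values]


# Hypothetical three-day temperature record, for illustration only.
example_temps = [(-10.0, 0.0), (0.0, 10.0), (5.0, 15.0)]
example_cgdd = cumulative_gdd(example_temps)
example_z = standardize([1.0, 2.0, 3.0])
```

In this toy record the first day has a negative daily value and contributes nothing to the sum.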
Site location (which occasionally varied from year to year within states) was entered as a unique categorical variable for each set of field location coordinates.
We fit a generalized linear mixed-effects model (GLMM) using the glmer function in the lme4 package (Bates et al., 2015) with a Gaussian error distribution and log link function due to overdispersion in the response variable (late-season cereal rye cover crop biomass in kg ha⁻¹). We specified a hierarchical model with random intercepts for each location and for blocks (nested within each location) to address the non-independence of repeated measurements within the same locations and blocks through time (Pinheiro & Bates, 2000). We fit a “global” model with all covariates that we hypothesized to be important: early-season cereal rye biomass, cereal rye planting date (in Julian days), late termination date, mean late PAR, and both early and late CGDD. We visually assessed the model assumptions of homogeneity of variance across groups and normality of residuals.
2.3.2 Random forest model and validation
To improve the accuracy of predictions, we also fit a random forest machine learning model to the dataset using the randomForest package v. 4.7–11 in R (Breiman, 2001; Liaw & Wiener, 2002). We specified a random forest model with the training parameters ntree set to 1,000 and mtry set to 2 and included the same covariates as in the GLMM, except that site latitude and longitude were included as separate numeric predictors rather than as categorical locations. Variable importance was calculated with the randomForest package; variables were ranked using %IncMSE, the percentage increase in mean squared error (that is, the decrease in prediction accuracy) on the out-of-bag samples when each variable is randomly permuted.
The dataset was randomly partitioned so that the random forest model was trained on 70% of the data, and the remaining 30% was withheld and used for model validation.
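A random 70/30 partition of this kind could be sketched as follows. This is an illustration in Python, not the authors' R code: the seed value, the function name, and the rounding of the training-set size are assumptions, and the text does not say whether the split was stratified by site or year.

```python
import random


def partition_rows(rows, train_fraction=0.7, seed=42):
    """Randomly split rows into training and validation subsets.

    The seed is a hypothetical choice that makes the split reproducible;
    the source does not report one.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * len(rows))
    train = [rows[i] for i in indices[:n_train]]
    validation = [rows[i] for i in indices[n_train:]]
    return train, validation


# Ten placeholder observation IDs, for illustration only.
observation_ids = list(range(10))
train_ids, validation_ids = partition_rows(observation_ids)
```

Reusing one stored partition for every model, rather than drawing a fresh one per model, keeps validation results comparable across models.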
We also used the same data partition to validate a version of the “global” GLMM that was refitted using only the training data. To assess how model performance varied with biomass level, we evaluated it separately for “low” cereal rye biomass observations (4,000 kg ha⁻¹ or less) and “high” cereal rye biomass observations (greater than 4,000 kg ha⁻¹).
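The split evaluation by biomass class might look like the sketch below, written in Python rather than the authors' R. The text does not name the error metric used, so root mean squared error (RMSE) is an assumption here, and the function names and example values are hypothetical.

```python
import math

THRESHOLD_KG_HA = 4000  # "low" is at or below this value, "high" is above it


def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_by_biomass_class(observed, predicted, threshold=THRESHOLD_KG_HA):
    """Compute RMSE separately for low and high observed biomass."""
    groups = {"low": [], "high": []}
    for o, p in zip(observed, predicted):
        groups["low" if o <= threshold else "high"].append((o, p))
    result = {}
    for name, pairs in groups.items():
        if pairs:  # skip a class with no observations
            obs, pred = zip(*pairs)
            result[name] = rmse(obs, pred)
    return result


# Made-up biomass values in kg per hectare, for illustration only.
observed = [1000.0, 4000.0, 6000.0]
predicted = [1300.0, 3700.0, 7000.0]
class_errors = rmse_by_biomass_class(observed, predicted)
```

Classes are assigned from the observed values rather than the predicted ones, so each validation observation falls into the same class regardless of which model is being evaluated.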
Created:
2024-01-21