Large, climate-sensitive soil carbon stocks mapped with pedology-informed machine learning in the North Pacific coastal temperate rainforest

Indexed by NIAID Data Ecosystem on 2026-05-02

Download link:

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.5jf6j1r

Description:

Accurate soil organic carbon (SOC) maps are needed to predict the terrestrial SOC feedback to climate change, one of the largest remaining uncertainties in Earth system modeling. Over the last decade, global-scale models have produced varied predictions of the size and distribution of SOC stocks, ranging from 1,000 to > 3,000 Pg of C within the top 1 m. Regional assessments may help validate or improve global maps because they can examine landscape controls on SOC stocks and offer a tractable means to retain regionally specific information, such as soil taxonomy, during database creation and modeling. We compiled a new transboundary SOC stock database for coastal watersheds of the North Pacific coastal temperate rainforest, using soil classification data to guide both gap-filling and the machine learning approaches used to explore spatial controls on SOC and predict regional stocks. Precipitation and topographic attributes controlling soil wetness were the dominant controls on SOC, underscoring the dependence of C accumulation on high soil moisture. The random forest model predicted stocks of 4.5 Pg C (to 1 m) for the study region, 22% of which was stored in organic soil layers. Calculated stocks of 228 ± 111 Mg C ha⁻¹ fell within the ranges of several past regional studies and indicate that 11–33 Pg C may be stored across temperate rainforest soils globally. Predictions compared favorably with regionalized estimates from two spatially explicit global products (Pearson's correlation: ρ = 0.73 vs. 0.34). Notably, SoilGrids250m was an outlier for estimates of total SOC, predicting fourfold higher stocks (18 Pg C) and indicating bias in this global product for the soils of the temperate rainforest. In sum, our study demonstrates that coastal temperate rainforest (CTR) ecosystems represent a moisture-dependent hotspot for SOC storage at mid-latitudes.
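The headline figures in the description can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted above; the variable names are illustrative, and the per-hectare range is simply the reported mean minus and plus the reported spread, not an interval computed by the authors.

```python
# Figures quoted in the description (variable names are illustrative).
REGIONAL_STOCK_PG = 4.5      # random forest prediction, Pg C to 1 m
ORGANIC_LAYER_SHARE = 0.22   # fraction stored in organic soil layers
SOILGRIDS_STOCK_PG = 18.0    # SoilGrids250m estimate for the same region, Pg C
MEAN_STOCK_MG_HA = 228.0     # mean stock, Mg C per hectare
SPREAD_MG_HA = 111.0         # reported +/- spread, Mg C per hectare

# Carbon held in organic soil layers, in Pg C.
organic_layer_pg = REGIONAL_STOCK_PG * ORGANIC_LAYER_SHARE

# How many times larger the SoilGrids250m estimate is than the regional one.
soilgrids_ratio = SOILGRIDS_STOCK_PG / REGIONAL_STOCK_PG

# Lower and upper bounds implied by the mean and its spread.
stock_range = (MEAN_STOCK_MG_HA - SPREAD_MG_HA, MEAN_STOCK_MG_HA + SPREAD_MG_HA)

print(f"Organic-layer carbon: about {organic_layer_pg:.2f} Pg C")
print(f"SoilGrids250m / regional estimate: {soilgrids_ratio:.1f}x")
print(f"Per-hectare range: {stock_range[0]:.0f} to {stock_range[1]:.0f} Mg C per ha")
```

The ratio of 4.0 matches the "fourfold" statement, and the organic-layer share corresponds to roughly 1 Pg C.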
Methods

Transboundary SOC Database

We compiled a transboundary database of > 1300 soil profile descriptions (pedons) across southeast Alaska (SEAK) and British Columbia (BC) from published and archived data sources. For each pedon, we calculated SOC stocks for the top 1 m of mineral soil plus surface organic horizons using data harmonization and gap-filling procedures that are detailed in the supplementary information (supplementary tables 1–5). In brief, US soil classification was converted to Canadian where necessary, and gaps were filled with published values or modeled estimates grouped by soil class, horizon, and lithology. In contrast to some other regional and global C assessments, this approach avoided the use of generalized empirical relationships between soil properties and missing variables, such as between soil C and soil bulk density, or soil C and depth.

Environmental covariates

Environmental covariates were selected (supplementary table 6) to predict SOC stock on the basis of their relationship with the soil-forming factors (climate, organisms, relief, parent material, and time; Jenny 1994). Covariate data were extracted from the rasters at the pedon coordinates and appended to the final SOC stocks (in supplementary material) for use in all further analyses. Further details of the 12 selected environmental covariates, along with justification for inclusion and pre-processing steps, are listed in supplementary table 6. Briefly, only high-quality and spatially continuous data products were used. Curating covariates based on knowledge of regional soil development facilitates clearer interpretation and reduces the risk of autocorrelation between variables.

Random forest model

A random forest model was trained to predict stocks of SOC across the North Pacific coastal temperate rainforest (NPCTR) in R (v3.4; R Core Team 2018; www.R-project.org) using the R package randomForest (v4.6; Liaw and Wiener 2002).
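The per-pedon stock calculation described under Transboundary SOC Database can be illustrated with a minimal sketch. It assumes the conventional horizon-sum formula (carbon concentration × bulk density × thickness × fine-earth fraction), which the text does not state explicitly; the `Horizon` class, its field names, and the example values are hypothetical and are not taken from the dataset. Surface organic horizons are counted in full, and mineral horizons are truncated at 1 m.

```python
from dataclasses import dataclass

@dataclass
class Horizon:
    """One soil horizon (hypothetical structure, for illustration only)."""
    top_cm: float             # upper boundary, measured from the mineral soil surface
    bottom_cm: float          # lower boundary
    carbon_pct: float         # organic carbon, % by mass
    bulk_density: float       # g cm^-3
    coarse_frac: float = 0.0  # volumetric fraction of coarse fragments
    organic_surface: bool = False  # True for surface organic horizons

def horizon_stock(h: Horizon, max_depth_cm: float = 100.0) -> float:
    """Return the horizon's SOC stock in Mg C per hectare."""
    if h.organic_surface:
        # Surface organic horizons are counted in full.
        thickness = h.bottom_cm - h.top_cm
    else:
        # Mineral horizons are truncated at max_depth_cm (1 m by default).
        top = max(h.top_cm, 0.0)
        bottom = min(h.bottom_cm, max_depth_cm)
        thickness = max(bottom - top, 0.0)
    # (C% / 100) * g cm^-3 * cm gives g C cm^-2; multiplying by 100 converts to Mg C ha^-1.
    return (h.carbon_pct / 100.0) * h.bulk_density * thickness * (1.0 - h.coarse_frac) * 100.0

def pedon_stock(horizons: list[Horizon]) -> float:
    """Sum horizon stocks for one pedon."""
    return sum(horizon_stock(h) for h in horizons)

# Made-up example pedon: one organic layer and two mineral horizons.
example = [
    Horizon(0, 10, carbon_pct=40.0, bulk_density=0.15, organic_surface=True),
    Horizon(0, 30, carbon_pct=5.0, bulk_density=1.0, coarse_frac=0.2),
    Horizon(30, 120, carbon_pct=1.0, bulk_density=1.2),  # only 30-100 cm is counted
]

# 60 + 120 + 84 = 264 Mg C per hectare
print(round(pedon_stock(example), 1))
```

In the workflow described above, missing carbon or bulk density values would be filled from published values or class-grouped estimates before this sum is taken; that step is omitted here.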
Random forests grow a large number of regression trees (Breiman et al 1984) from different random subsets of training data and predictor variables, thereby reducing variance relative to single trees and greatly reducing the risk of over-fitting and of non-optimal solutions, though at the cost of interpretability (Breiman 2001). The transboundary database SOC stocks and associated covariates were first split into training (80%) and testing (20%) data, and the model was parameterized to grow 5000 trees. For each tree, a subsample equivalent to ¼ of the total sample size was drawn with replacement. Node size was set to 4 to minimize the out-of-bag error, based on preliminary testing. Model performance was assessed from goodness of fit, the distribution of residuals, and predictions of test SOC stocks. Confidence intervals were computed using an infinitesimal jackknife procedure (Wager et al 2013).

Predictions were made across the NPCTR study extent using the R package raster (v2.6; Hijmans 2017), which produced a SOC map at 90.5 m resolution. All lakes > 10 ha were clipped from the final map (HydroLakes, Messager et al 2016), and glacier area was clipped using the Randolph Glacier Inventory 5.0 database (GLIMS, Raup et al 2007). Final SOC stocks were adjusted for topography by scaling the SOC map with the actual land surface area calculated from cell slope values. The random forest model was re-run for the three gap-filling sensitivity analyses. Soil organic carbon maps were exported as .tif files.
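The topographic adjustment can be sketched as follows. The text says only that stocks were scaled by land surface area calculated from cell slope; this sketch assumes the common approximation that the true surface area of a cell equals its planar area divided by the cosine of the slope angle. The function names are hypothetical, and the 90.5 m cell size comes from the map resolution stated above.

```python
import math

CELL_SIZE_M = 90.5  # map resolution stated in the methods

def surface_area_ha(slope_deg: float, cell_size_m: float = CELL_SIZE_M) -> float:
    """Approximate true land surface area of one raster cell, in hectares.

    Assumes surface area = planar area / cos(slope); the source does not
    give the exact formula used.
    """
    if not 0.0 <= slope_deg < 90.0:
        raise ValueError("slope must be in [0, 90) degrees")
    planar_m2 = cell_size_m ** 2
    return planar_m2 / math.cos(math.radians(slope_deg)) / 10_000.0

def cell_stock_mg(density_mg_per_ha: float, slope_deg: float) -> float:
    """Total carbon in one cell (Mg C), given a stock density in Mg C per hectare."""
    return density_mg_per_ha * surface_area_ha(slope_deg)

# A flat cell versus a cell on a 30-degree slope, both at 228 Mg C per hectare.
flat = cell_stock_mg(228.0, 0.0)
steep = cell_stock_mg(228.0, 30.0)
print(f"flat cell: {flat:.1f} Mg C, 30-degree cell: {steep:.1f} Mg C")
```

Under this assumption a 30-degree slope increases a cell's total by about 15%, so the correction matters most in steep coastal terrain.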

Created:

2024-10-10