Prediction of Topsoil Texture Through Regression Trees and Multiple Linear Regressions

Name: Prediction of Topsoil Texture Through Regression Trees and Multiple Linear Regressions
Creator: SciELO journals
Published: 2020-08-29 23:51:05
License: No description available

Source: DataCite Commons. Updated: 2020-08-29. Indexed: 2024-07-27.

Download link:

https://scielo.figshare.com/articles/Prediction_of_Topsoil_Texture_Through_Regression_Trees_and_Multiple_Linear_Regressions/6151397

Description:

ABSTRACT: Users of soil survey products are mostly interested in understanding how soil properties vary in space and time. The aim of digital soil mapping (DSM) is to represent the spatial variability of soil properties quantitatively to support decision-making. This study evaluates two DSM techniques, Regression Trees (RT) and Multiple Linear Regressions (MLR), and their ability to predict mineral fraction content (clay, silt, and sand) across a wide variability of landscapes. The study site was the entire Guapi-Macacu watershed (1,250.78 km²) in the state of Rio de Janeiro, in the Southeast region of Brazil. Terrain attributes and remote sensing data (30 m spatial resolution) were used as landscape covariates, which served as the explanatory variables in the predictive models. Sampling sites were selected with the Latin Hypercube algorithm, yielding a representative set of one hundred points with feasible field access. Two input databases were tested for predicting mineral fraction content: harmonized data and original data. A spline algorithm was used to harmonize the data according to the GlobalSoil.Net consortium standards. The RT models performed better, using an average of six covariates as input; the simplest MLR model used twice as many input variables, creating more complex models without gaining precision. Furthermore, RT models obtained better R² values irrespective of harmonization of the soil data. The harmonized dataset from the 0.00-0.05 and 0.05-0.15 m layers generally gave better results for clay and silt, with R² values of 0.52 (0.00-0.05 m) and 0.69 (0.05-0.15 m), respectively. Prediction of sand content was better when the original depth data were used as input, although all regression tree models had R² values greater than 0.52.
The RT models provided a better statistical index than MLR for all predicted properties; however, the low variance between models suggests that their performance was similar. Regarding harmonization of soil data, both input databases (harmonized or not) can be used to predict soil properties, since the variance of model performance was low and generalization of the soil maps showed similar trends. The products obtained from the digital soil mapping approach make it possible to integrate uncertainty, providing easier interpretation for soil management and land use decisions.
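
The abstract compares models by their R² values. As a rough illustration of how such a comparison can be computed, the following minimal sketch fits a one-predictor least-squares line and a depth-1 regression tree (a "stump") to synthetic data and reports the in-sample R² of each. Everything here is an assumption made for illustration: the covariate name `slope_deg`, the step-shaped relation to clay content, the noise level, and the helper functions are hypothetical and are not taken from the dataset or from the study's actual workflow, which used several covariates and full RT and MLR models.

```python
import random


def r_squared(observed, predicted):
    """Coefficient of determination: R² = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_linear(x, y):
    """Least-squares line with a single predictor; returns a predict function."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda v: intercept + slope * v


def fit_stump(x, y):
    """Depth-1 regression tree: choose the split that minimises squared error."""
    pairs = sorted(zip(x, y))
    best = None  # (sse, threshold, left_mean, right_mean)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical covariate values
        left = [b for _, b in pairs[:i]]
        right = [b for _, b in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((b - left_mean) ** 2 for b in left)
               + sum((b - right_mean) ** 2 for b in right))
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda v: left_mean if v < threshold else right_mean


# Synthetic data only (hypothetical): 100 points, one terrain covariate,
# and a step-shaped response that deliberately favours a tree-based model.
rng = random.Random(0)
slope_deg = [rng.uniform(0.0, 45.0) for _ in range(100)]
clay = [(450.0 if s < 12.0 else 250.0) + rng.gauss(0.0, 40.0) for s in slope_deg]

linear = fit_linear(slope_deg, clay)
tree = fit_stump(slope_deg, clay)
r2_linear = r_squared(clay, [linear(s) for s in slope_deg])
r2_tree = r_squared(clay, [tree(s) for s in slope_deg])
print(f"R2 linear: {r2_linear:.2f}  R2 stump: {r2_tree:.2f}")
```

Because the synthetic response is a step function, the stump scores higher here by construction; this says nothing about the real data. Both values are in-sample R², so a real comparison would need held-out or cross-validated predictions.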


Provider:

SciELO journals

Created:

2018-04-18
