Data from: Modeling spatiotemporal abundance of mobile wildlife in highly variable environments using boosted GAMLSS hurdle models
Source: Mendeley Data | Updated: 2024-06-25 | Indexed: 2024-06-27
Download link:
https://datadryad.org/stash/dataset/doi:10.5061/dryad.1vm20t6
Description:
1. Modeling organism distributions from survey data involves numerous statistical challenges, including zero-inflation, overdispersion, and selection and incorporation of environmental covariates. In environments with high spatial and temporal variability, addressing these challenges often requires numerous assumptions regarding organism distributions and their relationships to biophysical features. These assumptions may limit the resolution or accuracy of predictions resulting from survey-based distribution models.
2. We propose an iterative modeling approach that incorporates a negative binomial hurdle, followed by modeling the relationship of organism distribution and abundance to environmental covariates using generalized additive models (GAM) and generalized additive models for location, scale, and shape (GAMLSS). Our approach accounts for key features of survey data by separating binary (presence-absence) from count (abundance) data, separately modeling the mean and dispersion of count data, and incorporating selection of appropriate covariates and response functions from a suite of potential covariates while avoiding overfitting.
3. We apply our modeling approach to surveys of sea duck abundance and distribution in Nantucket Sound (Massachusetts, USA), which has been proposed as a location for offshore wind energy development. Our model results highlight the importance of spatiotemporal variation in this system, as well as identifying key habitat features including distance to shore, sediment grain size, and seafloor topographic variation.
4. Our work provides a powerful, flexible, and highly repeatable modeling framework with minimal assumptions that can be broadly applied to the modeling of survey data with high spatiotemporal variability. Applying GAMLSS models to the count portion of survey data allows us to incorporate potential overdispersion, which can dramatically affect model results in highly dynamic systems. Our approach is particularly relevant to systems in which little a priori knowledge is available regarding relationships between organism distributions and biophysical features, since it incorporates simultaneous selection of covariates and their functional relationships with organism responses.
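The hurdle structure described in the abstract (a presence-absence part, then a count part with separately modeled mean and dispersion) can be illustrated with a minimal Python sketch. This is not the authors' code and does not perform boosting or covariate selection. It assumes the count part is a zero-truncated negative binomial with mean mu and dispersion sigma such that variance = mu + sigma * mu^2, and it treats p_present, mu, and sigma as fixed numbers rather than smooth functions of covariates. All function names are hypothetical.

```python
import math


def nb_log_pmf(y, mu, sigma):
    """Log pmf of a negative binomial with mean mu > 0 and dispersion sigma > 0.

    Assumed parameterization: variance = mu + sigma * mu**2.
    """
    r = 1.0 / sigma                       # size parameter
    q = sigma * mu / (1.0 + sigma * mu)   # so that 1 - q = 1 / (1 + sigma * mu)
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + y * math.log(q) + r * math.log1p(-q))


def hurdle_nb_log_pmf(y, p_present, mu, sigma):
    """Log pmf of a hurdle model.

    Zeros come only from the binary part (probability 1 - p_present).
    Positive counts follow a zero-truncated negative binomial (an assumption
    of this sketch), weighted by p_present.
    """
    if y == 0:
        return math.log1p(-p_present)
    log_p0 = nb_log_pmf(0, mu, sigma)
    log_truncation = math.log1p(-math.exp(log_p0))  # log P(count >= 1)
    return math.log(p_present) + nb_log_pmf(y, mu, sigma) - log_truncation


def hurdle_nb_mean(p_present, mu, sigma):
    """Expected count under the hurdle model above."""
    p0 = math.exp(nb_log_pmf(0, mu, sigma))
    return p_present * mu / (1.0 - p0)


def hurdle_nb_log_likelihood(counts, p_present, mu, sigma):
    """Sum of hurdle log pmf values over a list of observed counts."""
    return sum(hurdle_nb_log_pmf(y, p_present, mu, sigma) for y in counts)
```

In a full model of the kind the abstract describes, p_present, mu, and sigma would each be linked to covariates through additive predictors. Here they are scalars so that the separation between the binary part and the count part stays visible.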
Created:
2023-06-28



