Data from: The challenge of modeling niches and distributions for data-poor species: a comprehensive approach to model complexity
Source: DataONE | Updated: 2017-07-11 | Indexed: 2024-06-26
Download link:
https://search.dataone.org/view/null
Resource description:
Models of species ecological niches and geographic distributions now represent a widely used tool in ecology, evolution, and biogeography. However, the very common situation of species with few available occurrence localities presents major challenges for such modeling techniques, in particular regarding model complexity and evaluation. Here, we summarize the state of the field regarding these issues and provide a worked example using the technique Maxent for a small mammal endemic to Madagascar (the nesomyine rodent Eliurus majori). Two relevant model-selection approaches exist in the literature (information criteria, specifically AICc; and performance predicting withheld data, via a jackknife), but AICc is not strictly applicable to machine-learning algorithms like Maxent. We compare models chosen under each selection approach with those corresponding to Maxent default settings, both with and without spatial filtering of occurrence records to reduce the effects of sampling bias. Both selection approaches chose simpler models than those made using default settings. Furthermore, the approaches converged on a similar answer when sampling bias was taken into account, but differed markedly with the unfiltered occurrence data. Specifically, for that dataset, the models selected by AICc had substantially fewer parameters than those identified by performance on withheld data. Based on our knowledge of the study species, models chosen under both AICc and withheld-data-selection showed higher ecological plausibility when combined with spatial filtering. The results for this species intimate that AICc may consistently select models with fewer parameters and be more robust to sampling bias. To test these hypotheses and reach general conclusions, comprehensive research should be undertaken with a wide variety of real and simulated species. 
Meanwhile, we recommend that researchers assess the critical yet underappreciated issue of model complexity both via information criteria and performance on withheld data, comparing the results between the two approaches and taking into account ecological plausibility.
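The two selection approaches named in the description can be illustrated with a minimal sketch. It is not the authors' code: the function names, the candidate models, and every number below are hypothetical, chosen only to show how AICc penalizes parameters at small sample sizes and how a leave-one-out jackknife partitions a handful of occurrence localities. Returning infinity when the correction term is undefined is an assumption of this sketch.

```python
import math


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Small-sample corrected AIC: 2k - 2 ln L + 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        # Assumption of this sketch: treat the model as unselectable when
        # the correction term is undefined (too many parameters for n).
        return math.inf
    return 2 * k - 2 * log_likelihood + (2 * k * (k + 1)) / (n - k - 1)


def jackknife_splits(n: int):
    """Yield (training indices, withheld index), leaving one locality out each time."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i


# Hypothetical candidates: (label, log-likelihood, number of parameters).
candidates = [("simple", -52.0, 3), ("complex", -48.0, 9)]
n_localities = 15  # hypothetical number of occurrence localities

# Lower AICc is preferred; with few localities the penalty on k is steep.
best = min(candidates, key=lambda c: aicc(c[1], c[2], n_localities))
print(best[0])

# Each jackknife iteration would train on n - 1 localities and test on the withheld one.
for train_idx, test_idx in jackknife_splits(4):
    print(train_idx, test_idx)
```

With so few localities, the more complex candidate's better fit does not offset its correction term, which is the kind of behavior the description refers to when it notes that both selection approaches chose simpler models than the default settings.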
Created:
2017-07-11



