
Influential Observations in Bayesian Regression Tree Models

Taylor & Francis Group. Updated: 2023-06-21. Indexed: 2026-04-16.
Download link:
https://tandf.figshare.com/articles/dataset/Influential_Observations_in_Bayesian_Regression_Tree_Models/22963610/1
Description:
Bayesian Classification and Regression Trees (BCART) and Bayesian Additive Regression Trees (BART) are popular Bayesian regression models widely applicable in modern regression problems. Their popularity is intimately tied to the ability to flexibly model complex responses depending on high-dimensional inputs while simultaneously being able to quantify uncertainties. This ability to quantify uncertainties is key, as it allows researchers to perform appropriate inferential analyses in settings that have generally been too difficult to handle using the Bayesian approach. However, surprisingly little work has been done to evaluate the sensitivity of these modern regression models to violations of modeling assumptions. In particular, we will consider influential observations, which one reasonably would imagine to be common—or at least a concern—in the big-data setting. In this article, we consider both the problem of detecting influential observations and adjusting predictions to not be unduly affected by such potentially problematic data. We consider three detection diagnostics for Bayesian tree models, one an analogue of Cook’s distance and the others taking the form of a divergence measure and a conditional predictive density metric, and then propose an importance sampling algorithm to re-weight previously sampled posterior draws so as to remove the effects of influential data in a computationally efficient manner. Finally, our methods are demonstrated on real-world data where blind application of the models can lead to poor predictions and inference. Supplementary materials for this article are available online.
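The re-weighting idea in the description can be illustrated with a generic case-deletion importance sampler. This is a minimal sketch under stated assumptions, not the authors' algorithm: it assumes a matrix `loglik` of pointwise log-likelihoods log p(y_i | theta_s) has already been computed from the posterior draws of some fitted model, and that observations are conditionally independent given the parameters, so that the posterior without observation i is proportional to the full posterior divided by p(y_i | theta). All function names here are hypothetical. The conditional predictive ordinate (CPO) is computed with the standard harmonic-mean estimator, which may differ from the conditional predictive density metric used in the article.

```python
import numpy as np

def log_sum_exp(a, axis=None):
    """Numerically stable log(sum(exp(a))) along an axis."""
    m = np.max(a, axis=axis, keepdims=True)
    out = m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)

def case_deletion_weights(loglik, drop):
    """Normalized importance weights targeting the posterior without `drop`.

    loglik : (S, n) array, loglik[s, i] = log p(y_i | theta_s)  (assumed input)
    drop   : list of observation indices to remove
    """
    # Unnormalized log-weight of draw s is -sum_{i in drop} log p(y_i | theta_s).
    log_w = -loglik[:, drop].sum(axis=1)
    return np.exp(log_w - log_sum_exp(log_w))

def log_cpo(loglik):
    """Harmonic-mean estimate of log CPO_i for every observation."""
    n_draws = loglik.shape[0]
    return np.log(n_draws) - log_sum_exp(-loglik, axis=0)

def effective_sample_size(w):
    """Kish effective sample size; small values flag unreliable re-weighting."""
    return 1.0 / np.sum(w ** 2)

def reweighted_mean(draws, w):
    """Posterior mean of per-draw quantities under the re-weighted posterior."""
    return w @ draws
```

Observations with unusually low `log_cpo` values are candidates for being influential. After dropping them via `case_deletion_weights`, checking `effective_sample_size` is a reasonable guard: if it collapses to a handful of draws, the re-weighted estimates are unstable and refitting the model is likely the safer choice.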
Provided by:
McCulloch, R. E.; George, E. I.; Pratola, M. T.
Created:
2023-05-19