
Bayesian Dynamic Feature Partitioning in High-Dimensional Regression with Big Data

Taylor & Francis Group. Updated 2022-04-21. Indexed 2026-04-16.
Download link:
https://tandf.figshare.com/articles/dataset/Bayesian_Dynamic_Feature_Partitioning_in_High-Dimensional_Regression_with_Big_Data/14939291/1
Description:
Bayesian computation of high-dimensional linear regression models using Markov Chain Monte Carlo (MCMC) or its variants can be extremely slow or completely prohibitive, since these methods perform costly computations at each iteration of the sampling chain. Furthermore, this computational cost usually cannot be divided efficiently across a parallel architecture. These problems are aggravated if the data size is large or if data arrive sequentially over time (streaming or online settings). This article proposes a novel dynamic feature partitioned regression (DFP) for efficient online inference in high-dimensional linear regressions with large or streaming data. DFP constructs a pseudo posterior density of the parameters at every time point and quickly updates it when a new block of data (a data shard) arrives. DFP suitably updates the pseudo posterior at every time point and partitions the set of parameters to exploit parallelization for efficient posterior computation. The proposed approach is applied to high-dimensional linear regression models with Gaussian scale mixture priors and spike and slab priors on large parameter spaces, along with large data, and is found to yield state-of-the-art inferential performance. The algorithm enjoys theoretical support: the pseudo posterior densities over time become arbitrarily close to the full posterior as the data size grows, as shown in the supplementary material. The supplementary material also contains details of the DFP algorithm applied to different priors. A package implementing DFP is available at https://github.com/Rene-Gutierrez/DynParRegReg. The dataset is available at https://github.com/Rene-Gutierrez/DynParRegReg_Implementation.
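To make the shard-by-shard idea in the description concrete, here is a minimal Python sketch. It is not the authors' DFP algorithm and does not use the DynParRegReg package. It assumes a fixed Gaussian prior with known noise variance (rather than the scale mixture or spike and slab priors mentioned above), a fixed rather than dynamic partition of the features, and conditioning on point estimates of the other blocks. All function names and parameters (`update_stats`, `block_conditional_means`, `run_demo`, `prior_prec`, `noise_var`) are hypothetical.

```python
import numpy as np

def update_stats(stats, X, y):
    """Fold one data shard into the running sufficient statistics."""
    XtX, Xty, n = stats
    return XtX + X.T @ X, Xty + X.T @ y, n + len(y)

def block_conditional_means(stats, beta, blocks, prior_prec=1.0, noise_var=1.0):
    """Update each feature block from its Gaussian conditional posterior mean,
    holding the other blocks at their previous estimates (assumed simplification).
    Each block only reads the old `beta`, so the loop could run in parallel."""
    XtX, Xty, _ = stats
    p = len(beta)
    new_beta = beta.copy()
    for idx in blocks:
        rest = np.setdiff1d(np.arange(p), idx)
        # Conditional precision of this block under the N(0, 1/prior_prec) prior.
        prec = XtX[np.ix_(idx, idx)] / noise_var + prior_prec * np.eye(len(idx))
        # Right-hand side: block cross-moments minus the other blocks' contribution.
        rhs = (Xty[idx] - XtX[np.ix_(idx, rest)] @ beta[rest]) / noise_var
        new_beta[idx] = np.linalg.solve(prec, rhs)
    return new_beta

def run_demo(seed=0, p=6, shard_size=200, n_shards=20):
    """Simulate streaming shards and apply one blockwise update per shard."""
    rng = np.random.default_rng(seed)
    beta_true = rng.normal(size=p)
    blocks = np.array_split(np.arange(p), 3)  # fixed partition (assumption)
    stats = (np.zeros((p, p)), np.zeros(p), 0)
    beta = np.zeros(p)
    for _ in range(n_shards):
        X = rng.normal(size=(shard_size, p))
        y = X @ beta_true + rng.normal(size=shard_size)
        stats = update_stats(stats, X, y)
        beta = block_conditional_means(stats, beta, blocks)
    return beta, beta_true, stats

if __name__ == "__main__":
    beta, beta_true, _ = run_demo()
    print("estimate:", np.round(beta, 3))
    print("truth:   ", np.round(beta_true, 3))
```

Because each shard is absorbed into `XtX` and `Xty`, the cost of an update does not grow with the number of observations seen so far, and the per-block solves involve only small matrices. The blockwise scheme shown here is a Jacobi-style iteration and is only expected to behave well when the features are not strongly correlated across blocks.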
Provided by:
Gutierrez, Rene; Guhaniyogi, Rajarshi
Created:
2021-07-09