Generalized Data Thinning Using Sufficient Statistics

Source: DataCite Commons. Updated 2024-06-13; indexed 2024-08-19.
Download link:
https://tandf.figshare.com/articles/dataset/Generalized_data_thinning_using_sufficient_statistics/25809729/2
Description:
Our goal is to develop a general strategy to decompose a random variable $X$ into multiple independent random variables, without sacrificing any information about unknown parameters. A recent paper showed that for some well-known natural exponential families, $X$ can be thinned into independent random variables $X^{(1)}, \ldots, X^{(K)}$, such that $X = \sum_{k=1}^{K} X^{(k)}$. These independent random variables can then be used for various model validation and inference tasks, including in contexts where traditional sample splitting fails. In this article, we generalize their procedure by relaxing this summation requirement and simply asking that some known function of the independent random variables exactly reconstruct $X$. This generalization of the procedure serves two purposes. First, it greatly expands the families of distributions for which thinning can be performed. Second, it unifies sample splitting and data thinning, which on the surface seem to be very different, as applications of the same principle. This shared principle is sufficiency. We use this insight to perform generalized thinning operations for a diverse set of families. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
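For readers who want a concrete picture of the additive case mentioned in the description, the following is a minimal sketch, not code from the article or its supplementary materials. It assumes a Poisson count and uses the standard fact that drawing a Binomial(X, eps) share of a Poisson(lam) count yields two independent Poisson parts with means eps·lam and (1 − eps)·lam. The names `sample_poisson` and `poisson_thin`, and the values of `lam` and `eps`, are hypothetical choices made for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by multiplying uniforms (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def poisson_thin(x, eps, rng):
    """Split a count x into two parts that sum back to x exactly.

    Each of the x units goes to the first part with probability eps,
    so the first part is Binomial(x, eps) given x.
    """
    x1 = sum(1 for _ in range(x) if rng.random() < eps)
    return x1, x - x1


rng = random.Random(0)
lam, eps, n = 10.0, 0.3, 20_000  # illustrative values, not from the article

xs = [sample_poisson(lam, rng) for _ in range(n)]
parts = [poisson_thin(x, eps, rng) for x in xs]
x1s = [p[0] for p in parts]
x2s = [p[1] for p in parts]

# Empirical checks: exact reconstruction, part means, and sample covariance.
mean1 = sum(x1s) / n
mean2 = sum(x2s) / n
cov12 = sum((a - mean1) * (b - mean2) for a, b in zip(x1s, x2s)) / (n - 1)

print("reconstructs X:", all(a + b == x for a, b, x in zip(x1s, x2s, xs)))
print("means:", mean1, mean2)
print("sample covariance:", cov12)
```

In this sketch the reconstruction function is a plain sum. The generalization described above replaces the sum with any known function of the parts that recovers $X$ exactly.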
Provider:
Taylor & Francis
Created:
2024-06-13