
Score Matching for Compositional Distributions

Taylor & Francis Group | Updated 2022-03-03 | Indexed 2026-04-16
Download link:
https://tandf.figshare.com/articles/dataset/Score_matching_for_compositional_distributions/17161180/2
Description:
Compositional data are challenging to analyse due to the non-negativity and sum-to-one constraints on the sample space. With real data, it is often the case that many of the compositional components are highly right-skewed, with large numbers of zeros. Major limitations of currently available models for compositional data include one or more of the following: insufficient flexibility in terms of distributional shape; difficulty in accommodating zeros in the data in estimation; and lack of computational viability in moderate to high dimensions. In this article, we propose a new model, the polynomially tilted pairwise interaction (PPI) model, for analysing compositional data. Maximum likelihood estimation is difficult for the PPI model. Instead, we propose novel score matching estimators, which entails extending the score matching approach to Riemannian manifolds with boundary. These new estimators are available in closed form and simulation studies show that they perform well in practice. As our main application, we analyse real microbiome count data with fixed totals using a multinomial latent variable model with a PPI model for the latent variable distribution. We prove that, under certain conditions, the new score matching estimators are consistent for the parameters in the new multinomial latent variable model.

Provided by:
Wood, Andrew T. A.; Scealy, Janice L.
Created:
2022-03-03