Data from: A hierarchical Bayesian approach for handling missing classification data

Name: Data from: A hierarchical Bayesian approach for handling missing classification data
Creator: Dryad
Published: 2025-04-01 05:23:27
License: No description available

DataCite Commons: updated 2025-04-01, indexed 2025-04-09

Download link:

https://datadryad.org/dataset/doi:10.5061/dryad.8h36t01

Resource description:

Ecologists use classifications of individuals in categories to understand composition of populations and communities. These categories might be defined by demographics, functional traits, or species. Assignment of categories is often imperfect, but frequently treated as observations without error. When individuals are observed but not classified, these “partial” observations must be modified to include the missing data mechanism to avoid spurious inference. We developed two hierarchical Bayesian models to overcome the assumption of perfect assignment to mutually exclusive categories in the multinomial distribution of categorical counts, when classifications are missing. These models incorporate auxiliary information to adjust the posterior distributions of the proportions of membership in categories. In one model, we use an empirical Bayes approach, where a subset of data from one year serves as a prior for the missing data the next. In the other approach, we use a small random sample of data within a year to inform the distribution of the missing data. We performed a simulation to show the bias that occurs when partial observations were ignored and demonstrated the altered inference for the estimation of demographic ratios. We applied our models to demographic classifications of elk (Cervus elaphus nelsoni) to demonstrate improved inference for the proportions of sex and stage classes. We developed multiple modeling approaches using a generalizable nested multinomial structure to account for partially observed data that were missing not at random for classification counts. Accounting for classification uncertainty is important to accurately understand the composition of populations and communities in ecological studies.
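
The bias described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' hierarchical Bayesian model: it swaps in a simple plug-in estimator in which a small random subsample of the unclassified individuals is assumed to be resolved and is used to allocate the remaining unclassified counts. The category names, true proportions, classification probabilities, and the `simulate` function are all hypothetical values chosen for illustration and are not taken from the elk data.

```python
import random


def simulate(n=20000, aux_size=200, seed=1):
    """Compare a naive estimate with an auxiliary-sample adjustment.

    All numbers below are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Assumed true composition of the population.
    true_props = {"calf": 0.2, "female": 0.6, "male": 0.2}
    # Assumed probability that an observed animal also gets classified.
    # Males are classified less often, so missingness depends on the
    # category itself (missing not at random).
    p_classified = {"calf": 0.9, "female": 0.9, "male": 0.5}

    cats = list(true_props)
    weights = [true_props[c] for c in cats]
    classified = {c: 0 for c in cats}
    unclassified_true = []  # true classes of unclassified animals (unknown in practice)

    for _ in range(n):
        c = rng.choices(cats, weights=weights)[0]
        if rng.random() < p_classified[c]:
            classified[c] += 1
        else:
            unclassified_true.append(c)

    # Naive estimate: drop the partial observations entirely.
    n_cls = sum(classified.values())
    naive = {c: classified[c] / n_cls for c in cats}

    # Auxiliary data: a small random subsample of the unclassified animals
    # is assumed to be resolved, and its proportions are used to allocate
    # the full unclassified count across categories.
    m = min(aux_size, len(unclassified_true))
    aux = rng.sample(unclassified_true, m)
    aux_props = {c: aux.count(c) / m for c in cats}
    n_unc = len(unclassified_true)
    adjusted = {c: (classified[c] + n_unc * aux_props[c]) / n for c in cats}

    return true_props, naive, adjusted


if __name__ == "__main__":
    true_props, naive, adjusted = simulate()
    for c in true_props:
        print(f"{c:>6}: true={true_props[c]:.3f} "
              f"naive={naive[c]:.3f} adjusted={adjusted[c]:.3f}")
```

Under these assumed settings the naive estimate understates the male proportion, because males are over-represented among the unclassified animals, while the adjusted estimate moves back toward the true value. A Bayesian treatment of the kind the abstract describes would additionally propagate the uncertainty in the auxiliary sample into the posterior distribution rather than plugging in point estimates.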

Provider:

Dryad

Created:

2019-01-07
