Multiple Imputation by Ordered Monotone Blocks With Application to the Anthrax Vaccine Research Program
Source: DataCite Commons. Updated 2024-03-24; indexed 2024-07-27.
Download link:
https://tandf.figshare.com/articles/dataset/Multiple_Imputation_by_Ordered_Monotone_Blocks_With_Application_to_the_Anthrax_Vaccine_Research_Program/1067056/1
Description:
Multiple imputation (MI) has become a standard statistical technique for dealing with missing values. The CDC Anthrax Vaccine Research Program (AVRP) dataset created new challenges for MI due to the large number of variables of different types and the limited sample size. A common method for imputing missing data in such complex studies is to specify, for each of J variables with missing values, a univariate conditional distribution given all other variables, and then to draw imputations by iterating over the J conditional distributions. Such fully conditional imputation strategies have the theoretical drawback that the conditional distributions may be incompatible. When the missingness pattern is monotone, a theoretically valid approach is to specify, for each variable with missing values, a conditional distribution given the variables with fewer or the same number of missing values and sequentially draw from these distributions. In this article, we propose the “multiple imputation by ordered monotone blocks” approach, which combines these two basic approaches by decomposing any missingness pattern into a collection of smaller “constructed” monotone missingness patterns, and iterating. We apply this strategy to impute the missing data in the AVRP interim data. Supplemental materials, including all source code and a synthetic example dataset, are available online.
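The monotonicity idea in the description can be illustrated with a small sketch. It assumes a boolean mask in which True marks a missing value, orders variables from fewest to most missing values, checks whether the pattern is monotone, and peels off one monotone part by keeping each row's trailing run of missing cells. The function names and the greedy peeling rule are illustrative assumptions, not the authors' published algorithm or their supplemental code.

```python
import numpy as np

def order_by_missingness(mask):
    """Column order from fewest to most missing values (stable for ties)."""
    return np.argsort(mask.sum(axis=0), kind="stable")

def is_monotone(mask):
    """True if, with columns ordered by missingness, every row stays
    missing once it has become missing."""
    ordered = mask[:, order_by_missingness(mask)].astype(int)
    # Indicators in a row must never drop from 1 (missing) back to 0.
    return bool(np.all(np.diff(ordered, axis=1) >= 0))

def monotone_part(mask):
    """Hypothetical greedy split: keep each row's trailing run of missing
    cells as a monotone block; return (block, remaining missing cells)."""
    order = order_by_missingness(mask)
    ordered = mask[:, order]
    # A cell is in the trailing run if it and all later cells are missing.
    trailing = np.logical_and.accumulate(ordered[:, ::-1], axis=1)[:, ::-1]
    block = np.zeros_like(mask)
    block[:, order] = trailing  # map back to the original column positions
    return block, mask & ~block

# Toy pattern: 4 units, 3 variables; the last row breaks monotonicity.
mask = np.array(
    [[0, 0, 0],
     [0, 0, 1],
     [0, 1, 1],
     [1, 0, 1]],
    dtype=bool,
)
block, rest = monotone_part(mask)
print(is_monotone(mask), is_monotone(block), int(rest.sum()))
```

In this toy pattern the single cell left in `rest` is the one that prevents a monotone ordering. A full procedure would impute within the monotone block sequentially and iterate over the remaining cells, as the description outlines.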
Provider:
Taylor & Francis
Created:
2016-01-19



