To Adjust or not to Adjust? Estimating the Average Treatment Effect in Randomized Experiments with Missing Covariates

Name: To Adjust or not to Adjust? Estimating the Average Treatment Effect in Randomized Experiments with Missing Covariates
Creator: Ding, Peng; Zhao, Anqi
Published: 2022-10-17 00:00:00
License: no description available

Taylor & Francis Group. Updated 2022-10-17; indexed 2026-04-16.

Download link:

https://tandf.figshare.com/articles/dataset/To_adjust_or_not_to_adjust_Estimating_the_average_treatment_effect_in_randomized_experiments_with_missing_covariates/21082244

Description:

Randomized experiments allow for consistent estimation of the average treatment effect based on the difference in mean outcomes without strong modeling assumptions. Appropriate use of pretreatment covariates can further improve the estimation efficiency. Missingness in covariates is nevertheless common in practice, and raises an important question: should we adjust for covariates subject to missingness, and if so, how? The unadjusted difference in means is always unbiased. The complete-covariate analysis adjusts for all completely observed covariates, and is asymptotically more efficient than the difference in means if at least one completely observed covariate is predictive of the outcome. Then what is the additional gain of adjusting for covariates subject to missingness? To reconcile the conflicting recommendations in the literature, we analyze and compare five strategies for handling missing covariates in randomized experiments under the design-based framework, and recommend the missingness-indicator method, as a known but not so popular strategy in the literature, due to its multiple advantages. First, it removes the dependence of the regression-adjusted estimators on the imputed values for the missing covariates. Second, it does not require modeling the missingness mechanism, and yields consistent estimators even when the missingness mechanism is related to the missing covariates and unobservable potential outcomes. Third, it ensures large-sample efficiency over the complete-covariate analysis and the analysis based on only the imputed covariates. Lastly, it is easy to implement via least squares. We also propose modifications to it based on asymptotic and finite sample considerations. Importantly, our theory views randomization as the basis for inference, and does not impose any modeling assumptions on the data-generating process or missingness mechanism. Supplementary materials for this article are available online.
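The abstract notes that the missingness-indicator method is easy to implement via least squares. The following is a minimal sketch of one way that might look, not the authors' code. It assumes a fully interacted regression of the outcome on the treatment, the centered covariates, and their interactions, which the abstract does not spell out. The function name `missingness_indicator_ate`, its `fill_value` parameter, and the simulated data are all hypothetical and used only for illustration.

```python
import numpy as np


def missingness_indicator_ate(y, z, x, fill_value=0.0):
    """Sketch of a missingness-indicator estimate of the average treatment effect.

    y: outcomes, shape (n,)
    z: binary treatment indicators (0/1), shape (n,)
    x: covariates with NaN marking missing entries, shape (n, p)
    fill_value: hypothetical parameter; constant used to impute missing entries
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]

    missing = np.isnan(x)
    # Impute every missing entry with the same constant.
    x_filled = np.where(missing, fill_value, x)
    # Add one indicator column per covariate that has any missing entries.
    indicators = missing[:, missing.any(axis=0)].astype(float)

    # Augmented covariates, centered so the treatment coefficient
    # can be read as the effect estimate (assumed specification).
    w = np.hstack([x_filled, indicators])
    w_centered = w - w.mean(axis=0)

    design = np.column_stack(
        [np.ones_like(y), z, w_centered, z[:, None] * w_centered]
    )
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]  # coefficient on the treatment indicator


# Simulated example (made-up data, for illustration only).
rng = np.random.default_rng(0)
n = 200
x_full = rng.normal(size=(n, 2))
z = rng.permutation(np.repeat([0.0, 1.0], n // 2))
# Noise-free outcome with a constant effect of 2, driven by the first covariate.
y = 1.0 + 2.0 * z + 3.0 * x_full[:, 0]

# Make roughly 30% of the second covariate missing.
x_obs = x_full.copy()
x_obs[rng.random(n) < 0.3, 1] = np.nan

est_zero = missingness_indicator_ate(y, z, x_obs, fill_value=0.0)
est_five = missingness_indicator_ate(y, z, x_obs, fill_value=5.0)
```

In this sketch the two estimates agree up to floating-point error, which illustrates the first advantage listed in the abstract: once the indicators are included, the estimate does not depend on the value used to impute the missing covariates.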

Provided by:

Ding, Peng; Zhao, Anqi

Created:

2022-10-17