Replication data for: Reliable Inference in Highly Stratified Contingency Tables: Using Latent Class Models as Density Estimators
Source: DataONE. Updated: 2015-04-11. Indexed: 2024-06-27.
Download link:
https://search.dataone.org/view/sha256:92beebe0408569414f88c2ea33a60154b88a20e39e4ac9942ca700dedde240c3
Description:
Contingency tables are among the most basic and useful techniques available for analyzing categorical data, but they produce highly imprecise estimates in small samples or for population subgroups that arise following repeated stratification. I demonstrate that preprocessing an observed set of categorical variables using a latent class model can greatly improve the quality of table-based inferences. As a density estimator, the latent class model closely approximates the underlying joint distribution of the variables of interest, which enables reliable estimation of conditional probabilities and marginal effects, even among subgroups containing fewer than 40 observations. Though here focused on applications to public opinion, the procedure has a wide range of potential uses. I illustrate the benefits of the latent class model-based approach for greatly improved accuracy in estimating and forecasting vote preferences within small demographic subgroups using survey data from the 2004 and 2008 U.S. presidential election campaigns.
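The idea in the description, using a latent class model as a smoothed estimate of the joint distribution and then reading conditional probabilities off that estimate, can be illustrated with a minimal sketch. This is not the replication code from the dataset: it is a generic EM fit of a latent class model in NumPy, and the function names (`fit_lcm`, `joint_table`), the smoothing constant `alpha`, the number of classes, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_lcm(X, n_levels, n_classes=3, n_iter=200, seed=0, alpha=1e-3):
    """Fit a latent class model by EM (illustrative sketch).

    X        : (n, J) integer array, X[i, j] in range(n_levels[j])
    n_levels : number of categories of each of the J variables
    Returns class weights pi (K,) and a list theta of (K, n_levels[j]) arrays.
    """
    rng = np.random.default_rng(seed)
    n, J = X.shape
    pi = np.full(n_classes, 1.0 / n_classes)
    theta = [rng.dirichlet(np.ones(L), size=n_classes) for L in n_levels]
    for _ in range(n_iter):
        # E-step: posterior class membership for every observation
        log_r = np.log(pi)[None, :]
        for j in range(J):
            log_r = log_r + np.log(theta[j][:, X[:, j]]).T
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update weights and per-class category probabilities
        # (alpha is a small hypothetical smoothing constant that avoids zeros)
        pi = (r.sum(axis=0) + alpha) / (n + n_classes * alpha)
        for j in range(J):
            counts = np.full((n_classes, n_levels[j]), alpha)
            for level in range(n_levels[j]):
                counts[:, level] += r[X[:, j] == level].sum(axis=0)
            theta[j] = counts / counts.sum(axis=1, keepdims=True)
    return pi, theta

def joint_table(pi, theta):
    """Model-implied joint distribution: sum_k pi_k * prod_j theta_j[k, x_j]."""
    joint = np.zeros([t.shape[1] for t in theta])
    for k in range(len(pi)):
        cell = np.array(pi[k])
        for t in theta:
            cell = np.multiply.outer(cell, t[k])
        joint += cell
    return joint

# Synthetic example: three binary variables driven by a two-class latent variable.
rng = np.random.default_rng(1)
z = rng.integers(0, 2, size=500)
p_yes = np.array([0.2, 0.8])
X = np.column_stack(
    [(rng.random(500) < p_yes[z]).astype(int) for _ in range(3)]
)

pi, theta = fit_lcm(X, n_levels=[2, 2, 2], n_classes=2)
joint = joint_table(pi, theta)

# Conditional distribution of the third variable within the subgroup
# where the first two variables both equal 1.
cond = joint[1, 1, :] / joint[1, 1, :].sum()
```

Because every cell of `joint` is a mixture of products of class-level probabilities, sparse or empty cells in the raw cross-tabulation still receive a positive estimate, which is what makes conditional probabilities in small subgroups computable at all. How many classes to use is a modeling choice that this sketch leaves fixed by hand.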
Created:
2023-11-20



