
Replication Data for: Polling India via Regression and Post-Stratification of Non-Probability Online Samples

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://doi.org/10.7910/DVN/0RRVKJ
Description:
Recent technological advances have facilitated the collection of large-scale administrative data and the online surveying of the Indian population. Building on these, we propose a strategy for more robust, frequent, and transparent projections of the Indian vote during the campaign. We execute a modified multilevel regression and post-stratification (MrP) model of Indian vote preferences that proposes innovations to each of its three core components: the stratification frame, the training data, and the learner. For the post-stratification frame, we propose a novel Data Integration approach that allows the simultaneous estimation of counts from multiple complementary sources, such as census tables and auxiliary surveys. For the training data, we assemble panels of respondents from two unorthodox online populations: Amazon Mechanical Turk workers and Facebook users. As a modeling tool, we replace the Bayesian multilevel regression learner with Random Forests. Our 2019 pre-election forecasts for the two largest Lok Sabha coalitions were very close to actual outcomes: we predicted 41.6% for the NDA against an observed vote share of 45.0%, and 30.6% for the UPA against an observed vote share of just under 31.3%. Our uniform-swing seat projection outperforms other pollsters: we had the lowest absolute error, at 87 seats.
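To make the post-stratification and uniform-swing steps mentioned in the description concrete, here is a minimal Python sketch. It is not the authors' replication code: the cell keys, seat names, function names, and all numbers are hypothetical, and the cell-level predictions are simply assumed to come from some fitted learner (the description uses Random Forests). It shows only the arithmetic of weighting cell predictions by frame counts and of shifting previous constituency shares by the national change.

```python
from typing import Dict, Tuple

# Hypothetical cell key for illustration: (state, age group).
Cell = Tuple[str, str]


def poststratify(pred: Dict[Cell, float], counts: Dict[Cell, float]) -> float:
    """Return the population-weighted average of cell-level predicted support."""
    total = sum(counts[cell] for cell in pred)
    if total <= 0:
        raise ValueError("frame counts must sum to a positive number")
    return sum(pred[cell] * counts[cell] for cell in pred) / total


def uniform_swing(
    prev_share: Dict[str, float], prev_national: float, new_national: float
) -> Dict[str, float]:
    """Shift each seat's previous share by the national change, clipped to [0, 1]."""
    swing = new_national - prev_national
    return {
        seat: min(1.0, max(0.0, share + swing))
        for seat, share in prev_share.items()
    }


# Made-up inputs: predicted support per cell and frame counts per cell.
pred = {
    ("A", "18-34"): 0.50,
    ("A", "35+"): 0.40,
    ("B", "18-34"): 0.30,
    ("B", "35+"): 0.20,
}
counts = {
    ("A", "18-34"): 100,
    ("A", "35+"): 300,
    ("B", "18-34"): 200,
    ("B", "35+"): 400,
}

national = poststratify(pred, counts)

# Made-up previous-election shares for two hypothetical seats.
projected = uniform_swing(
    {"seat_1": 0.45, "seat_2": 0.28},
    prev_national=0.35,
    new_national=national,
)
print(round(national, 3), {k: round(v, 3) for k, v in projected.items()})
```

A seat projection would then award each seat to the party or coalition with the highest projected share; that step is omitted here because it needs shares for every competitor in every seat.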
Created:
2021-06-26