Replication Data for: Comparative investigation of time series missing data imputation in political science: Different methods, different results

DataONE | Updated 2020-07-24 | Indexed 2024-06-08

Download link:

https://search.dataone.org/view/sha256:0bcd89883b0fef0c7ce87d5e8d38f8c06139ed1c7185eda71d1b87384fd53120

Resource description:

Missing data is a growing concern in social science research. This paper introduces novel machine-learning methods to examine the efficiency of missing-data imputation and its effects. The authors use Internet and public service data as test cases. The empirical results confirm the robustness of the positive impact of Internet penetration on public services and indicate that machine-learning imputation outperforms random and multiple imputation, greatly improving the model’s explanatory power. Panel data imputed with machine learning show better continuity in the time trend, which makes them suitable for analysis, including with a dynamic panel model. The long-term effects of the Internet on public services were found to be significantly stronger than the short-term effects. Finally, some mechanisms in the empirical analysis are discussed.

Created:

2023-11-22