COVID-19 High Frequency Phone Survey of Households 2021, Round 4 - Viet Nam
Source: microdata.worldbank.org. Updated 2023-10-26; indexed 2025-03-26.
Download link:
https://microdata.worldbank.org/index.php/catalog/4063
Resource description:
Geographic coverage
---------------------------
National, regional
Analysis unit
---------------------------
Households
Kind of data
---------------------------
Sample survey data [ssd]
Sampling procedure
---------------------------
The 2020/21 Vietnam COVID-19 High Frequency Phone Survey of Households (VHFPS) uses a nationally representative household survey from 2018 as its sampling frame. The 2018 baseline survey includes 46,980 households from 3,132 communes (about 25% of all communes in Vietnam). In each commune, one enumeration area (EA) is randomly selected, and 15 households are then randomly selected in each EA for interview. The large-module households are selected for official interview in the VHFPS survey, and the small-module households are held in reserve as replacements.
After data processing, the final sample size for Round 4 is 3,945 households.
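The two-stage selection described above can be illustrated with a minimal Python sketch. It is not the survey team's actual procedure: the frame layout, the identifiers, and the function name `draw_sample` are hypothetical, and only the "one EA per commune, 15 households per EA" rule comes from the text.

```python
import random


def draw_sample(frame, households_per_ea=15, seed=0):
    """Draw one EA per commune, then a fixed number of households in that EA.

    `frame` is a hypothetical layout: {commune_id: {ea_id: [household_id, ...]}}.
    """
    rng = random.Random(seed)
    sample = []
    for commune_id in sorted(frame):
        eas = frame[commune_id]
        ea_id = rng.choice(sorted(eas))  # stage 1: one EA per commune
        households = eas[ea_id]
        k = min(households_per_ea, len(households))
        for hh in rng.sample(households, k):  # stage 2: households within the EA
            sample.append((commune_id, ea_id, hh))
    return sample


# Toy frame: 4 communes, 3 EAs each, 40 households per EA (made-up data).
toy_frame = {
    f"C{c}": {
        f"C{c}-EA{e}": [f"C{c}-EA{e}-H{h}" for h in range(40)] for e in range(3)
    }
    for c in range(4)
}

sample = draw_sample(toy_frame)
print(len(sample))  # 60, i.e. 4 communes x 15 households

# The baseline figures quoted in the text are consistent with this design.
print(3132 * 15)  # 46980
```

With 3,132 communes and 15 households per selected EA, the design yields exactly the 46,980 baseline households reported above.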
Mode of data collection
---------------------------
Computer Assisted Telephone Interview [cati]
Research instrument
---------------------------
The questionnaire for this round consisted of the following sections:
Section 2. Behavior
Section 5. Employment (main respondent)
Section 6. Coping
Section 7. Safety Nets
Section 8. FIES
Section 10. Opinion
Section 11. Vaccine
Note: Some categorical responses have been merged in the anonymized data set for confidentiality.
Cleaning operations
---------------------------
Data cleaning began during the data collection process. Inputs to the cleaning process include interviewers’ notes following each question item, interviewers’ notes at the end of the tablet form, and supervisors’ notes made during monitoring. The data cleaning process was conducted in the following steps:
• Append households interviewed in ethnic minority languages to the main dataset of households interviewed in Vietnamese.
• Remove unnecessary variables that were automatically calculated by SurveyCTO.
• Remove duplicate households from the dataset where the same form was submitted more than once.
• Remove observations for households that were not supposed to be interviewed under the identified replacement procedure.
• Format variables according to their data type (string, integer, decimal, etc.).
• Read through interviewers’ notes and make adjustments accordingly. During interviews, whenever interviewers found it difficult to choose the correct code, they were advised to choose the most appropriate one and write down the respondent’s answer in detail so that the survey management team could review it and decide which code best suited the answer.
• Correct data based on supervisors’ notes where enumerators entered the wrong code.
• Recode the answer option “Other, please specify”. This option is usually followed by a blank line that allows enumerators to type text specifying the answer. The data cleaning team checked these answers thoroughly to decide whether each one needed recoding into one of the available categories or should be kept as originally recorded. In some cases, an answer could be assigned a completely new code if it appeared many times in the survey dataset.
• Examine the accuracy of outlier values, defined as values that lie below the 5th percentile or above the 95th percentile, by listening to interview recordings.
• Run a final check on matching the main dataset with the different sections. Sections where information is asked at the individual level are kept in separate data files in long form.
• Label variables using the full question text.
• Label variable values where necessary.
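Three of the steps above (removing duplicate form submissions, recoding “Other, please specify” answers, and flagging values outside the 5th and 95th percentiles for review) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the team's actual cleaning script: the field names (`hhid`, `income`, `coping`, `coping_other`), the code `96` for "other", and the recode mapping are all hypothetical.

```python
import statistics


def drop_duplicate_forms(rows, key="hhid"):
    """Keep the first submission per household id and drop resubmissions."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def recode_other(rows, field, text_field, other_code, mapping):
    """Return copies of rows, recoding "other" answers whose text matches mapping."""
    recoded = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        text = (row.get(text_field) or "").strip().lower()
        if row[field] == other_code and text in mapping:
            row[field] = mapping[text]
        recoded.append(row)
    return recoded


def flag_outliers(rows, field):
    """Return rows whose value is below the 5th or above the 95th percentile."""
    values = [row[field] for row in rows if row[field] is not None]
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[4], cuts[94]  # 5th and 95th percentile cut points
    return [
        row for row in rows
        if row[field] is not None and (row[field] < low or row[field] > high)
    ]


# Toy data (made up): 20 households plus one form submitted twice.
rows = [
    {"hhid": i, "income": i * 10, "coping": 1, "coping_other": ""}
    for i in range(1, 21)
]
rows[19]["income"] = 10000  # an implausibly large value
rows[4].update(coping=96, coping_other="Borrowed from relatives ")
rows.append(dict(rows[2]))  # duplicate submission for household 3

deduped = drop_duplicate_forms(rows)
recoded = recode_other(
    deduped, "coping", "coping_other",
    other_code=96, mapping={"borrowed from relatives": 3},
)
flagged = flag_outliers(recoded, "income")

print(len(deduped))                 # 20
print([r["hhid"] for r in flagged])  # [1, 20]
```

A percentile rule always flags some observations in each tail, so the flagged rows are candidates for checking against the interview recordings rather than values to delete automatically.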
Provider:
microdata.worldbank.org



