
Socio-economic impact of COVID-19 on refugees - Panel Study - Kenya

Source: microdata.unhcr.org · Updated: 2021-02-26 · Indexed: 2025-01-22
Download link:
https://microdata.unhcr.org/index.php/catalog/296
Resource description:
Abstract
---------------------------
The World Bank and UNHCR, in collaboration with the Kenya National Bureau of Statistics and the University of California, Berkeley, are conducting the Kenya COVID-19 Rapid Response Phone Survey (RRPS) to track the socioeconomic impacts of the COVID-19 pandemic, the recovery from it, and other shocks, in order to provide timely data to inform a targeted response. This dataset contains information from eight waves of the COVID-19 RRPS, a panel survey that targets refugee households and started in May 2020. In the first year of data collection, the same households were interviewed every two months for five survey rounds, and every four months thereafter. Interviews were conducted using Computer Assisted Telephone Interviewing (CATI) techniques.

The sample aims to be representative of the refugee and stateless population in Kenya. It comprises five strata: Kakuma refugee camp, Kalobeyei settlement, Dadaab refugee camp, urban refugees, and Shona stateless.

Waves 1-7 of this survey include information on household background, service access, employment, food security, income loss, transfers, health, and COVID-19 knowledge. Wave 8 focused on how households were exposed to shocks, in particular adverse weather shocks and the increase in the price of food and fuel, but also included parts of the previous modules on household background, service access, employment, food security, income loss, and subjective wellbeing.

The data is uploaded in three files. The first is the hh file, which contains household-level information. The 'hhid' variable uniquely identifies each household. The second is the adult-level file, which contains data at the level of adult household members. Each adult in a household is uniquely identified by the 'adult_id'. The third is the child-level file, available only for waves 3-7, which contains information for every child in the household. Each child in a household is uniquely identified by the 'child_id'.
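As a rough illustration of how the three files relate, the sketch below attaches household-level fields to adult-level rows via 'hhid'. It assumes, beyond what the description states, that both files hold one row per unit and wave and carry a 'wave' column. The columns `stratum` and `employed`, the function names, and all values are hypothetical.

```python
from collections import defaultdict

# Toy rows standing in for the 'hh' and 'adult' files. Only 'hhid', 'adult_id'
# and 'wave' are named in the dataset description; 'stratum' and 'employed'
# are hypothetical columns used for illustration.
hh_rows = [
    {"hhid": 101, "wave": 1, "stratum": "Kakuma"},
    {"hhid": 101, "wave": 2, "stratum": "Kakuma"},
    {"hhid": 202, "wave": 1, "stratum": "Urban"},
]
adult_rows = [
    {"hhid": 101, "adult_id": 1, "wave": 1, "employed": True},
    {"hhid": 101, "adult_id": 2, "wave": 1, "employed": False},
    {"hhid": 202, "adult_id": 1, "wave": 1, "employed": True},
]


def attach_household_fields(adults, households):
    """Left-join adult rows to household rows on (hhid, wave)."""
    by_key = {(h["hhid"], h["wave"]): h for h in households}
    merged = []
    for adult in adults:
        hh = by_key.get((adult["hhid"], adult["wave"]), {})
        # Adult fields win on name clashes; household fields fill the rest.
        merged.append({**hh, **adult})
    return merged


def adults_per_household(adults):
    """Count adult rows per (hhid, wave)."""
    counts = defaultdict(int)
    for adult in adults:
        counts[(adult["hhid"], adult["wave"])] += 1
    return dict(counts)


merged_rows = attach_household_fields(adult_rows, hh_rows)
```

Because 'adult_id' identifies an adult within a household, the pair ('hhid', 'adult_id') is the natural key for following one person across waves. The same join pattern would apply to the child-level file with 'child_id'.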
The duration of data collection and sample size for each completed wave were:

Wave 1: May 14 to July 7, 2020; 1,328 refugee households
Wave 2: July 16 to September 18, 2020; 1,699 refugee households
Wave 3: September 28 to December 2, 2020; 1,487 refugee households
Wave 4: January 15 to March 25, 2021; 1,376 refugee households
Wave 5: March 29 to June 13, 2021; 1,562 refugee households
Wave 6: July 14 to November 3, 2021; 1,407 refugee households
Wave 7: November 15, 2021, to March 31, 2022; 1,281 refugee households
Wave 8: May 31 to July 8, 2022; 1,355 refugee households

The same questionnaire is also administered to nationals in Kenya, with the data available in the WB microdata library: <https://microdata.worldbank.org/index.php/catalog/3774>

Geographic coverage
---------------------------
National coverage, covering rural and urban areas

Analysis unit
---------------------------
Individual and Household

Universe
---------------------------
All persons of concern for UNHCR

Kind of data
---------------------------
Sample survey data [ssd]

Sampling procedure
---------------------------
The sample aims to be representative of the refugee and stateless population in Kenya. It comprises five strata: Kakuma refugee camp, Kalobeyei settlement, Dadaab refugee camp, urban refugees, and Shona stateless. Sampling approaches differ across strata. For refugees in Kakuma and Kalobeyei, as well as for stateless people, recently conducted Socioeconomic Surveys (SES) were used as sampling frames. For the refugee population living in urban areas and the Dadaab camp, no such household survey data existed, and sampling frames were based on UNHCR's registration records (proGres), which include phone numbers. For Kakuma, Kalobeyei, Dadaab and urban refugees, a two-step sampling process was used. First, 1,000 individuals from each stratum were selected from the corresponding sampling frames.
Each of these individuals received a text message to confirm that the registered phone was still active. In the second stage, the verified phone number lists were used to select the sample, implicitly stratifying by sex and age. Until wave 7, sampled households that were not reached in earlier waves were also contacted, along with households that had been interviewed before. In wave 8, only households that had previously participated in the survey were contacted for interview. The 'wave' variable indicates the wave in which a household was interviewed. For the stateless population, all participants of the Shona socioeconomic survey (n=400) were included in the RRPS because of the limited sample size. The sampling frames for the refugee and Shona stateless communities are thus representative of households with active phone numbers registered with UNHCR.

Mode of data collection
---------------------------
Computer Assisted Telephone Interview [cati]

Research instrument
---------------------------
The questionnaire included 12 sections:

Section 1: Introduction
Section 2: Household background
Section 3: Travel patterns and interactions
Section 4: Employment
Section 5: Food security
Section 6: Income Loss
Section 7: Transfers
Section 8: Subjective welfare (50% of sample)
Section 9: Health
Section 10: COVID Knowledge
Section 11: Household and Social Relations (50% of sample)
Section 12: Conclusion

Cleaning operations
---------------------------
Variable names were kept constant across survey waves. For questions that remained exactly the same across survey waves, data points for all waves can be found under one variable name. For questions where the phrasing changed across waves, even minimally, the variable name was also changed to reflect the change in phrasing.

For all variables, extended missing values are used to indicate why a value is missing.
The following extended missing values are used in the dataset:

· .a for 'Don't know'
· .b for 'Refused to respond'
· .c for 'Outliers set to missing'
· .d for 'Inconsistency set to missing' (used for employment data as explained below)
· .e for 'Field Skipped' (where an error in the survey tool caused the question to be missed)
· .z for 'Not administered' (as the variable was not relevant to the observation)

More detailed data on children was collected in waves 3 to 7 than in waves 1, 2 and 8. In waves 1 and 2, data on children, e.g. on their learning activities, was collected with a single question covering all children in a household. Variables related to children are therefore part of the 'hh' data for waves 1 and 2. In waves 3 to 7, questions on children were asked about specific children. Some questions covered all children, while others were administered for only one randomly selected child in the household. This approach allows data to be disaggregated at the level of individual children, and the data can be found in the 'child' data set. The household-level weights can be used for analysis of the children's data. In wave 8, detailed information on children was dropped, as the questionnaire focused on other topics.

The education status of household members other than the respondent was imputed for waves 1 and 2. In waves 1 and 2, only the education status of the respondent was elicited, while in later waves the education status of each household member was asked. To allow outcomes to be evaluated by household members' education status, information on education was imputed for waves 1 and 2 using the information provided for all household members in waves 3, 4, and 5. This added information on the education status of household members in waves 1 and 2 that was not available in earlier versions of this data.

Some questions were not asked in every wave, so their values were imputed.
For some questions, answers cannot change or are unlikely to change within the two months between survey waves, so households were not asked about them in every wave. The questions on assets owned before March 2020 were only asked of households when they were interviewed for the first time. The questions on the dwelling's wall and floor material and on the household's connection to the power grid were not asked of all households in waves 2 and 3, where only new households and those who had moved were covered by these questions. Questions on the household's main source of electricity and the types of assets owned were not asked in wave 8. Where these variables were not asked, the resulting missing values are imputed from the answers given in earlier waves.

Improved quality assurance algorithms led to minor revisions to the wave 1 to 5 data. Based on additional data checks, the team has made minor refinements to the wave 1 to 5 data. The identification of the respondent and the household head was refined for the rare cases in which the same respondent as in previous waves could not be interviewed for a given household and another adult was interviewed instead. For this reason, in about 2 percent of observations the household head status had been assigned to an incorrect household member, which was corrected. For fewer than 1 percent of households, the respondent did not appear in the adult-level dataset. For about 1 percent of observations in wave 5, the respondent appeared twice in the adult-level dataset.

Data from questions on COVID-19 vaccinations in wave 7 was dropped from the dataset. Because self-reported vaccination rates were significantly higher than official administrative records, the vaccination data was deemed unreliable, most likely due to social desirability bias. Consequently, questions on vaccination status and questions using the vaccination data as a validation criterion were dropped from the datasets.
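The imputation from earlier waves described above can be pictured with a small carry-forward sketch. This is not the team's actual procedure, which the description does not spell out. It assumes that only values coded '.z' (not administered) are filled, that the extended missing codes are available as plain strings (how they appear in practice depends on the file format and reader), and it uses a hypothetical column `wall_material` with made-up values.

```python
# Extended missing-value codes as listed in the dataset description.
MISSING_LABELS = {
    ".a": "Don't know",
    ".b": "Refused to respond",
    ".c": "Outliers set to missing",
    ".d": "Inconsistency set to missing",
    ".e": "Field Skipped",
    ".z": "Not administered",
}


def carry_forward(rows, column, fill_codes=(".z",)):
    """Fill not-asked values of `column` with the household's latest earlier answer.

    Assumption: only codes in `fill_codes` are filled; other missing codes
    (e.g. '.a', '.b') are left untouched and are never carried forward.
    """
    last_seen = {}  # hhid -> most recent substantive answer
    filled = []
    for row in sorted(rows, key=lambda r: (r["hhid"], r["wave"])):
        row = dict(row)  # copy so the input is not mutated
        value = row[column]
        if value in fill_codes:
            if row["hhid"] in last_seen:
                row[column] = last_seen[row["hhid"]]
        elif value not in MISSING_LABELS:
            last_seen[row["hhid"]] = value
        filled.append(row)
    return filled


# Hypothetical column 'wall_material' and values, for illustration only.
rows = [
    {"hhid": 1, "wave": 1, "wall_material": "mud"},
    {"hhid": 1, "wave": 2, "wall_material": ".z"},
    {"hhid": 1, "wave": 3, "wall_material": ".a"},
    {"hhid": 2, "wave": 2, "wall_material": ".z"},
]
result = carry_forward(rows, "wall_material")
```

Household 2 has no earlier answer, so its '.z' stays in place. Keeping the reason codes distinct lets an analyst tell a value that was never asked from one the respondent declined to give.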

Provider:
microdata.unhcr.org