National Sustainable Development Plan Baseline Survey 2019, Household Income and Expenditure Survey 2019 - Vanuatu
Source: microdata.pacificdata.org. Updated 2020-10-09; indexed 2025-01-21.
Download link:
https://microdata.pacificdata.org/index.php/catalog/742
Resource description:
Abstract
---------------------------
The National Sustainable Development Plan (NSDP) Baseline Survey 2019 is an expanded Household Income and Expenditure Survey (HIES) that includes health, educational, cultural, and productive dimensions that were previously uncollected or in need of updating. The results of this survey will directly inform more than 30 key indicators listed in the NSDP M&E (Monitoring and Evaluation) Framework, as well as more than 40 of the listed indicators for the United Nations Sustainable Development Goals (SDGs). The NSDP Baseline Survey also presents an opportunity for Vanuatu to establish a comprehensive Melanesian Wellbeing baseline, as well as an updated baseline for calculating the Consumer Price Index (CPI) and revising the National Accounts.
Geographic coverage
---------------------------
National coverage. Below are the details of this national coverage:
1. National (Vanuatu);
2. Provinces (Torba, Sanma, Penama, Malampa, Shefa, Tafea);
4. Area Councils (from Torres Area Council through to Futuna & Aneityum Area Council);
5. Villages / Towns;
6. Urban/Rural.
Analysis unit
---------------------------
Household and Individual.
Universe
---------------------------
All de jure residents.
Kind of data
---------------------------
Sample survey data [ssd]
Sampling procedure
---------------------------
The sample size for this survey was determined using outputs from the previous 2010 Household Income and Expenditure Survey (HIES), in particular per capita monthly total expenditure. The mean, standard deviation and standard error of per capita expenditure were computed from the 2010 HIES, and the distribution of the population across the 6 provinces of Vanuatu from the 2016 Census was used as a base. The sample size in each province was then adjusted, according to the accuracy of this variable of interest within the province, to obtain an expected sampling error of around 5% within each province.
The sampling frame is the 2016 Vanuatu census, the most recent one, which was used to compute the probability of selection of the Enumeration Areas (EAs). Selection started with a random selection of EAs with probability proportional to size. Then, within each selected EA, 10 households were randomly selected with uniform probability.
Within each selected EA, the household listing was updated by the team before random selection and interview.
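The two-stage selection described above can be illustrated with a minimal Python sketch. The source does not say which probability-proportional-to-size (PPS) algorithm was used, so systematic PPS is an assumption here, and the EA sizes, the seed and the function names `pps_systematic` and `select_households` are hypothetical.

```python
import random

def pps_systematic(sizes, n_select, rng):
    """Pick n_select cluster indices with probability proportional to size.

    Systematic PPS: lay the clusters end to end on a line of length
    sum(sizes) and take equally spaced points from a random start.
    """
    total = sum(sizes)
    step = total / n_select
    start = rng.random() * step              # random start in [0, step)
    targets = [start + k * step for k in range(n_select)]
    chosen, cumulative, i = [], 0.0, 0
    for idx, size in enumerate(sizes):
        cumulative += size
        # Every target that falls inside this cluster's segment selects it.
        while i < n_select and targets[i] < cumulative:
            chosen.append(idx)
            i += 1
    return chosen

def select_households(listing, n_households, rng):
    """Second stage: equal-probability sample from the updated EA listing."""
    return rng.sample(listing, min(n_households, len(listing)))

rng = random.Random(2019)                    # fixed seed, illustration only
ea_sizes = [120, 45, 80, 200, 60, 95]        # hypothetical household counts per EA
selected_eas = pps_systematic(ea_sizes, 3, rng)
sample = {
    ea: select_households(list(range(ea_sizes[ea])), 10, rng)
    for ea in selected_eas
}
```

In this toy frame no EA is larger than the sampling step, so each EA can be selected at most once. A real frame with very large EAs would need those handled separately.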
i) The only variable considered is per capita total household expenditure (the variable of interest). In addition to being one of the main indicators derived from the Household Income and Expenditure Survey (HIES), it is likely to be highly correlated with many other variables of interest (e.g. poverty). Using this variable, a list of relevant indicators was calculated from the 2010 HIES dataset. These indicators provide information on:
- (a) The status of the household expenditure distribution within each province
- (b) The efficiency provided by the 2010 HIES sample design
- (c) The accuracy of the estimates calculated from the 2010 HIES dataset (especially per capita household expenditure, our variable of interest)
ii) The original dataset was trimmed using the variable of interest: the lowest and highest percentiles (the 1% of households with the lowest and the 1% with the highest per capita total household expenditure) were removed from the analysis as outliers. The dataset ends up with 4,289 households (out of 4,377 completed households).
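As a rough illustration of the trimming in step ii), the sketch below drops the bottom and top 1% of a list of values. The exact rule used by the survey team (weighting, handling of ties, rounding) is not stated, so `trim_outliers` and its rounding-up choice are assumptions. Rounding the 1% cut up to whole households happens to reproduce the reported counts: 44 households removed at each end of 4,377 leaves 4,289.

```python
import math

def trim_outliers(values, lower_pct=1.0, upper_pct=1.0):
    """Drop the lowest lower_pct% and highest upper_pct% of values.

    The number of records cut at each end is rounded up (an assumption).
    """
    ordered = sorted(values)
    n = len(ordered)
    n_low = math.ceil(n * lower_pct / 100)
    n_high = math.ceil(n * upper_pct / 100)
    return ordered[n_low:n - n_high]

# Placeholder data: 4,377 dummy expenditure values, not survey data.
dummy_expenditure = list(range(4377))
trimmed = trim_outliers(dummy_expenditure)
```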
iii) The 2010 Vanuatu HIES sample was based on a stratified multi-stage selection
- Stratification: geographical provinces (by urban / rural locations)
- First stage of selection: Enumeration Areas (EAs) with probability of selection proportional to size
- Second stage: households, with uniform probability of selection within the EAs
iv) The mean and standard deviation indicate the status of the variable of interest within each stratum. The intracluster correlation (ρ) and the design effect (DEFF) highlight the efficiency of the sampling strategy, and the standard error/relative standard error (SE/RSE) of the variable of interest shows its accuracy.
v) The purpose of this analysis is to draw insights from the 2010 HIES sample design in order to improve the 2019 survey. There is no point in increasing the sample size in strata where the sample is not efficient (the gain in accuracy would be minor compared to the related cost).
vi) The challenge in the 2019 Vanuatu baseline survey:
- Meet precision targets in each stratum (provincial level), including Penama, where Ambae island had been evacuated at the time of the sample design.
- Acceptable sample size (due to budget constraints)
- Following international recommendations (12 months of field operation)
- Enhance the monitoring and supervision of the field staff and simplify management of the logistics in the field
==> Optimize the variance/cost ratio of the survey design
vii) Table 1 from the Document Sample Design (provided as External Resources) presents the Vanuatu 2010 HIES survey specifications, efficiency and accuracy in each stratum (for the variable of interest). It shows that some improvements can be made in Torba and Shefa rural (where the RSE is higher than 5%), and it shows a high intraclass correlation in Malampa, Shefa rural and Tafea (which leads to a high design effect in those strata). In Torba, the high design effect comes from the high number of households interviewed in each selected EA (on average, 33 households per selected EA were interviewed in this stratum).
- Torba: the sample size is good; there is just a need to reduce the number of households to interview within each EA (and, in order to keep a similar sample size, the number of EAs to select in the province will be increased)
- Malampa: given the high intracluster correlation in this province, a higher number of EAs to select is required (with the same number of households per EA to interview).
- Shefa rural: keep the same number of households to interview within each EA, and increase the number of EAs to select (this will lead to a higher sample size)
- Tafea: similar to Malampa province, the high intraclass correlation indicates that the number of EAs to select has to be increased (therefore the sample size as well).
The sample size has to be increased in Malampa, Shefa rural and Tafea. For the rest, the 2019 design will have to be similar to that of 2010 (in order to provide at least the same level of accuracy).
viii) The 2019 Vanuatu baseline survey follows the international recommendations in terms of data collection schedule (12-month coverage) and provides for better management and supervision of the field staff. In this context, the field staff will work in teams, given that:
- A team is made of 1 supervisor (team leader) and 2 or 3 interviewers
- Each interviewer will be responsible for 5 interviews per round
- A round of survey is a 1-week period
- 1 EA is covered during 1 round; after the round is completed, the team moves to the next EA for the next round.
- A team completes 32 rounds during the 12-month field operation period (roughly every 2 rounds/2 weeks of work are followed by 1 round/1 week of rest).
ix) Table 3 from the Document Sample Design (provided as External Resources) presents a survey schedule starting February 2019 and ending February 2020. During this period, the teams will be in the field for 32 working weeks (corresponding to 32 different selected EAs), with a 3-week period of rest during the Christmas period.
x) The number of interviewers per team and the number of teams per province will determine the total sample size within each province. A team made of 3 interviewers can achieve 480 households over the period, while a team of 2 interviewers can achieve only 320 cases.
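The team capacities quoted in step x) follow directly from the fieldwork rules in step viii): interviewers per team × 5 interviews per round × 32 rounds. A minimal sketch of that arithmetic, with `team_capacity` and `province_sample_size` as hypothetical helper names:

```python
def team_capacity(interviewers, interviews_per_round=5, rounds=32):
    """Households one team can interview over the whole field period."""
    return interviewers * interviews_per_round * rounds

def province_sample_size(team_sizes):
    """Total households for a province, given the size of each of its teams."""
    return sum(team_capacity(size) for size in team_sizes)

three_person_team = team_capacity(3)             # 3 * 5 * 32
two_person_team = team_capacity(2)               # 2 * 5 * 32
two_small_teams = province_sample_size([2, 2])   # a province with 2 teams of 2
```

Under the same rules, a team of 3 covers 15 households in each EA it visits and a team of 2 covers 10.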
xi) The intraclass correlation is used to calculate the precision loss due to clustering. Like the standard deviation, the intracluster correlation is considered to be a true population parameter, and therefore transferable between designs. We have to accept the hypothesis that this correlation factor has not changed during the period 2010-2019, and therefore can be used to predict DEFF and RSE for the next survey given an adjusted design (based on the conclusions provided by the 2010 design). Table 2 from the Document Sample Design (provided as External Resources) predicts the design effect and sampling error of the variable of interest given the new sample design that is based on:
- the sample size within each stratum
- the number of teams within each stratum
- the number of interviewers per team
In order to allow more flexibility in the sample size, it is preferable to set up some teams of 3 interviewers, which can achieve 480 households (a good sample size for Torba and Sanma urban), and some teams of 2 interviewers, which will achieve 320 households each (2 such teams will be required in the other provinces).
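The source does not give the formulas behind Table 2. One common way to make the kind of prediction described in step xi) is Kish's approximation, DEFF = 1 + (m − 1)ρ, where m is the number of households interviewed per EA and ρ is the intracluster correlation. The sketch below uses that approximation as an assumption, not as the survey team's documented method. All numeric inputs are hypothetical and are not taken from Table 1 or Table 2.

```python
import math

def design_effect(households_per_ea, rho):
    """Kish approximation of the design effect for equal-sized clusters."""
    return 1.0 + (households_per_ea - 1) * rho

def predicted_rse(mean, sd, n_households, households_per_ea, rho):
    """Relative standard error of the mean under a clustered design.

    SE = sd * sqrt(DEFF / n) and RSE = SE / mean.
    """
    deff = design_effect(households_per_ea, rho)
    standard_error = sd * math.sqrt(deff / n_households)
    return standard_error / mean

# Hypothetical stratum: same total sample, fewer households per EA.
rse_33_per_ea = predicted_rse(mean=10000, sd=8000, n_households=480,
                              households_per_ea=33, rho=0.1)
rse_15_per_ea = predicted_rse(mean=10000, sd=8000, n_households=480,
                              households_per_ea=15, rho=0.1)
```

With the total sample held fixed, interviewing fewer households in each of a larger number of EAs lowers the design effect and therefore the predicted RSE. This is the reasoning applied to Torba in step vii).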
xii) The proposed design in Table 2 from the Document Sample Design (provided as External Resources) shows a total sample size of 4,640 households and a higher level of accuracy for the estimate of the variable of interest in all strata. Only Shefa rural shows an RSE higher than 5%, which is still acceptable. The high intraclass correlation in Shefa rural inflates the variance of the estimates; offsetting it would require increasing the sample size or decreasing the number of households to interview per EA, which is not recommended on logistical and financial grounds.
Mode of data collection
---------------------------
Computer Assisted Personal Interview [capi]
Research instrument
---------------------------
The questionnaire was developed in English using the World Bank software Survey Solutions. This questionnaire is divided into 18 modules that are detailed below.
-Introduction (geographic areas, list of household members)
-Module 1: Demographic characteristics: ethnicity, marital status;
-Module 2: Wellbeing: culture and wellbeing, sports;
-Module 3: Education: language, traditional knowledge and skills, school attendance and attainment;
-Module 4: Health: illness, meals, functioning difficulties;
-Module 5: Individual expenses: communication, narcotics, other;
-Module 6: Labour force and individual income: activities, income, allowance, cash transfer amount;
-Module 7: Civic responsibility and satisfaction;
-Module 8: Household details: Dwelling characteristics, energy, water, transport;
-Module 9: Household assets: land, robbery, furniture, asset details;
-Module 10: Other household items and services: home maintenance and construction, vehicles, vehicle details, international private travel, domestic private travel, household services and taxes, financial support, other household expenditures;
-Module 11: Ceremonies;
-Module 12: Remittances;
-Module 13: Shocks;
-Module 14: Productive sector activity: livestock and aquaculture, fishing, seafood collection and hunting, agriculture farming activities, handicraft;
-Module 15: Food recall;
-Module 16: Non-food recall;
-Module 17: Food away from home;
-Module 18: Food security.
Cleaning operations
---------------------------
Data editing was done using the software Stata.
Response rate
---------------------------
The final response rate was 98%. Torba had the lowest response rate (92%) with 40 households not responding.
For each enumeration area, field staff were provided with a list of primary selected households (Set A) and a list of replacement households (Set B). Replacement households were to be interviewed in the case of non-response from the primary selected households (for reasons such as refusal or inability to contact the household). Where the list of replacement households (Set B) was exhausted, enumerators were instructed to randomly select a household within the EA (these are Set C).
91% of responding households were from Set A, 9% from Set B and 1% from Set C.
Below is the final response rate for each province:
-Torba: 92%
-Sanma - urban: 100%
-Sanma - rural: 98%
-Penama: 100%
-Malampa: 99%
-Shefa - urban: 100%
-Shefa - rural: 99%
-Tafea: 96%
-NATIONAL: 98%
Provider:
microdata.pacificdata.org



