
OHS-LFS Consistent Series Weights 1994-2007 - South Africa

Source: datafirst.uct.ac.za. Updated: 2020-04-28. Indexed: 2025-03-24.
Download link:
https://datafirst.uct.ac.za/dataportal/index.php/catalog/402
Resource description:
Abstract ---------------------------
One focus of post-apartheid research in South Africa is change. Questions include the progress of South Africa in the economic, social and political arenas. National datasets such as the October Household Surveys (OHS) and Labour Force Surveys (LFS) provide a rich source of information on both economic and social variables in a cross-sectional framework. These datasets are repeated annually or biannually and therefore have the potential to highlight changes over time. Yet treating the cross-sectional national data as a time series requires that, when stacked side by side, the data produce realistic trends. Because these data were not designed to be used as a time series, changes in sample design, changes in the interview process and shifts in the sampling frame can cause unrealistic changes in aggregates over a short period of time. This raises concerns about the validity of using these datasets as a time series to examine change.

The aggregate trends calculated from the OHS and LFS show the data to be both temporally and internally inconsistent. Examining the weights given in the datasets, together with the public documentation, makes it clear that the Statistics South Africa (StatsSA) household and person weights are not simple design weights, i.e. inverse inclusion probability weights. StatsSA post-stratifies the person design weight to external population totals. Because the data are cross-sectional, the intention of the post-stratification adjustment is to produce the best estimates of the population given the information available at the time, and temporal consistency is not considered. This creates problems when the data are used as a time series.

A project was therefore undertaken by Nicola Branson at the University of Cape Town, with a scholarship from DataFirst as part of DataFirst's Data Quality Project, funded by the Mellon Foundation, to design a new set of person and household weights for the OHS 1994-1999 and the LFS 2000-2007. These weights are generated using an entropy estimation technique. The new weights result in consistent demographic and geographic trends and greater consistency between person-level and household-level analysis.

This dataset consists of the cross-entropy weights and the research resources used to construct them, including the syntax files, as well as background documentation on the project and other research output. These should be used with the OHS and LFS data available from the data portal.

Geographic coverage ---------------------------
The OHS and LFS had national coverage.

Analysis unit ---------------------------
Households and individuals

Kind of data ---------------------------
Sample survey data

Mode of data collection ---------------------------
Face-to-face [f2f]

Data appraisal ---------------------------
The purpose of survey weights is to inflate the sample to represent the entire population. These weights therefore play an important role in creating consistent aggregates over time. Statistics South Africa's (StatsSA) household and person weights are not simple design weights, i.e. inverse inclusion probability weights. The weights presented in the StatsSA national household surveys are the design weights post-stratified to external population totals. Because the data are cross-sectional, the intention of the post-stratification adjustment is to produce the best estimates of the population given the information available at the time, and temporal consistency is not considered. The cross-entropy weights are provided to render the OHS and LFS series consistent over time. The original cross-entropy weights created by Nicola Branson did not include weights for OHS 1996. These have now been created by DataFirst, using a later version of the OHS 1996 data provided by Statistics South Africa.
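To illustrate the general idea of entropy-based reweighting, the following is a minimal sketch, not the project's own syntax files. It assumes the textbook minimum cross-entropy formulation: choose new weights as close as possible to the design weights, in the sense of minimising sum(w * log(w / d)), while forcing the weighted sample to reproduce chosen population totals. The function name `cross_entropy_weights`, the six-person sample, the design weights and the population totals are all hypothetical and chosen only for illustration.

```python
import numpy as np


def cross_entropy_weights(design_w, X, totals, tol=1e-10, max_iter=50):
    """Return weights w minimising sum(w * log(w / design_w))
    subject to the calibration constraints X.T @ w == totals.

    The minimiser has the form w_i = d_i * exp(x_i . lam); lam is
    found by Newton's method on the constraint equations.
    """
    d = np.asarray(design_w, dtype=float)
    X = np.asarray(X, dtype=float)
    t = np.asarray(totals, dtype=float)
    lam = np.zeros(X.shape[1])
    for _ in range(max_iter):
        w = d * np.exp(X @ lam)
        gap = X.T @ w - t  # how far the weighted totals are from target
        if np.max(np.abs(gap)) <= tol * np.max(np.abs(t)):
            return w
        jacobian = X.T @ (w[:, None] * X)  # derivative of the totals in lam
        lam = lam - np.linalg.solve(jacobian, gap)
    raise RuntimeError("calibration did not converge")


# Hypothetical sample of six people (illustrative values only).
# Columns of X: everyone, male, urban.
design_w = np.array([150.0, 160.0, 170.0, 140.0, 180.0, 150.0])
X = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [1, 0, 1],
    [1, 1, 1],
])
# Hypothetical external totals: population, males, urban residents.
totals = np.array([1000.0, 480.0, 600.0])

new_w = cross_entropy_weights(design_w, X, totals)
print("design totals:    ", X.T @ design_w)
print("calibrated totals:", X.T @ new_w)
```

In this sketch the design weights alone give totals that differ from the targets, while the calibrated weights reproduce the targets. Using a single consistent set of targets across survey years is what would make aggregates comparable over time; how the project actually chose its constraints is documented in the dataset's own resources, not here.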

Provider:
datafirst.uct.ac.za