
October Household Survey 1996 - South Africa

datafirst.uct.ac.za, updated 2020-06-18, indexed 2025-01-15
Download link:
https://datafirst.uct.ac.za/dataportal/index.php/catalog/411
Resource description:
Abstract
---------------------------
During October 1996 Statistics South Africa recorded the details of people living in more than nine million households in South Africa, as well as those in hostels, hotels and prisons. Census 1996 was the first nationwide census since the splitting up of the country under apartheid after 1970 and sought to apply the same methodology to everyone: visiting the household, and obtaining details about all its members from a representative who was either interviewed, or else filled in the questionnaire in their language of choice.

Geographic coverage
---------------------------
The survey had national coverage.

Analysis unit
---------------------------
Households and individuals

Universe
---------------------------
The survey covered households and household members in the nine provinces of South Africa.

Kind of data
---------------------------
Sample survey data

Sampling procedure
---------------------------
A sample of 1600 Enumerator Areas (EAs) was produced in conjunction with the sample for the 1996 Population Census post-enumeration survey. A two-stage sampling procedure was applied in the following manner.

In the first stage, EAs were stratified by province and by type of EA (formal or informal urban areas, commercial farms, traditional authority areas or other non-urban areas). Originally, eight hundred EAs were allocated proportionately to each stratum by province. Later, some adjustments were made to ensure adequate representation of smaller provinces such as the Northern Cape. Independent systematic samples of EAs were drawn for each stratum within each province. The sampling frame was constructed from the preliminary database of EAs established during the demarcation and listing phase of the 1996 population census.

In the second stage, 10 households were drawn from each EA on the western and eastern side of the EA drawn for the post-enumeration survey. This meant 10 households per EA in 1600 different EAs, that is, 16 000 households in total.

Mode of data collection
---------------------------
Face-to-face [f2f]

Research instrument
---------------------------
The data files in the October Household Survey 1996 (OHS 1996) correspond to the following sections in the questionnaire:

House: Data from FLAP, Section 1 and Section 7
Person: Data from Section 2
Worker: Data from Section 3
Migrant: Data from Section 4
Death: Data from Section 5
Births: Data from Section 6. This data had a considerable number of problems and will not be published.
Income: Data from Section 7 (included in House)
Domestic: Data from Section 8

Data appraisal
---------------------------
Questionnaire: The October Household Survey 1996 questionnaire had incorrect FLAP data. No Population Group question was indicated on the FLAP. DataFirst notified Statistics SA, who supplied a corrected questionnaire, which is the one now available with the dataset.

Household IDs: In the previous version of the 1996 October Household Survey dataset archived by DataFirst, the household IDs (HHID) were not unique. This was corrected in version 1, the first version disseminated by DataFirst. Version 1.1 keeps this correction, but data users should check versions not obtained from DataFirst and replace these with the latest version available from DataFirst.

Linking Files: The metadata for the OHS 1996 provides an explanation for merging the files in the OHS 1996 dataset:

"The data from different files can be linked on the basis of the record identifiers. The record identifiers are composed of the first few fields in each file. Each record contains the three fields Magisterial District, Enumeration area, and Visiting point number. These eleven digits together constitute a unique household identifier. All records with a given household identifier, no matter which file they are in, belong to the same household. For individuals, a further two digits constituting the Person number, when added to the household identifier, creates a unique individual identifier. Again, these can be used to link records from the PERSON and WORK files. The syntax needed to merge information from different files will differ according to the statistical package used" (October Household Survey 1996: Metadata: General Notes: 2).

According to the above, household IDs are generated from a combination of magisterial district number (mdnumber), enumeration area number (eanumber) and visiting point number (vpnumber). Person IDs are generated from the same three variables plus the person number (personnu).

These variables are named as such in the OHS 1996 House, OHS 1996 Births, OHS 1996 Migrant, OHS 1996 Deaths, OHS 1996 Household Income Other, OHS 1996 Other, OHS 1996 Domestic and OHS 1996 Flap data files. However, in the OHS 1996 Worker and OHS 1996 Person data files the variable for magisterial district number is "distr", the variable for enumeration area is "ea" and the variable for visiting point number is "visp". The variable for person number in these files is "respno". The metadata provided to DataFirst with this dataset does not discuss these changes.

October Household Survey 1996 Births file: Births data was collected by Section 6 of the OHS 1996 questionnaire, completed for all women younger than 55 years who had ever given birth. The metadata for this survey from Statistics SA states that "This data had a considerable number of problems and will not be published". The dataset provided by DataFirst therefore does not include the original "births" file. Those in possession of this file from unofficial versions of the dataset should note the following problems with the data in the OHS 1996 births file:

Variable name: eegender
Question 6.2: Is/was (the child) a boy or a girl?
Valid range: 1 (boy) - 2 (girl)
Data quality issue: There is a third response value of 0 with no description.

Variable name: livinghh
Question 6.4: If alive: Is (the child) currently living with this household?
Valid range: 1 (yes) - 2 (no)
Data quality issue: This variable has an additional response value (0), which has no description.

Variable name: agealive
Question 6.5: If alive: How old is he/she?
This question was asked of all women younger than 55 years who have ever given birth, to provide the age of their living children.
Data quality issue: Responses for the age of the child range from 0 to 77 (assuming age 99 is for missing responses), which is outside the plausible range.

Variable name: agenaliv
Question 6.6: If dead: How old was (the child) when he/she died?
Data quality issue: The format of the age at death variable is not clear.

Variable name: datebirt
Question 6.7: [All children]: In what year and month was (the child) born?
Data quality issue: There are problems with the format of the date of birth variable.

Variable name: wherebor
Question 6.8: [All children]: Where was (the child) born?
Data quality issue: There are only three options for the place of birth in the questionnaire (in a hospital, in a clinic and elsewhere), but the data has 10 response values (0-9), with no explanation for this in the metadata.

Variable name: regstere
Question 6.9: [All children]: Was the birth registered?
Valid range: 1 (yes) - 2 (no)
Data quality issue: There are 4 response values (0-3) for this variable.
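
For users who hold an unofficial copy of the births file, the undocumented response values listed above could be tallied with a check along the following lines. This is only a minimal sketch: the valid codes come from the notes above, but the function name `undocumented_codes`, the list-of-dictionaries layout and the sample rows are assumptions made for illustration, not part of the dataset or its metadata.

```python
from collections import Counter

# Valid codes as stated in the data appraisal notes. Only variables with a
# documented valid range are listed here.
VALID_CODES = {
    "eegender": {1, 2},  # 1 = boy, 2 = girl
    "livinghh": {1, 2},  # 1 = yes, 2 = no
    "regstere": {1, 2},  # 1 = yes, 2 = no
}


def undocumented_codes(records, valid_codes=VALID_CODES):
    """Count, per variable, every value that falls outside its valid range."""
    found = {name: Counter() for name in valid_codes}
    for record in records:
        for name, allowed in valid_codes.items():
            value = record.get(name)
            # Variables absent from a record are skipped, not counted.
            if value is not None and value not in allowed:
                found[name][value] += 1
    return found


# Hypothetical rows, invented purely to exercise the check.
sample_rows = [
    {"eegender": 1, "livinghh": 2, "regstere": 1},
    {"eegender": 0, "livinghh": 0, "regstere": 3},
    {"eegender": 2, "livinghh": 1, "regstere": 0},
]
report = undocumented_codes(sample_rows)
```

The check only counts out-of-range values; how to treat them (for example as missing) is left to the analyst, since the metadata gives no description for these codes.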

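The linking rule described under "Linking Files" can be sketched as follows, using the differing variable names reported for the Person and Worker files. This is a hedged illustration rather than the syntax used by Statistics SA or DataFirst: the helper names, the record layout as dictionaries, the `dwelling` field and the sample values are all assumptions. Because the metadata gives the total identifier length (eleven digits) but not the width of each field, the sketch uses a tuple as the key instead of a concatenated digit string.

```python
# Key variable names by file group, as reported in the data appraisal notes.
STANDARD_KEYS = ("mdnumber", "eanumber", "vpnumber")  # House, Migrant, Deaths, ...
PERSON_WORKER_KEYS = ("distr", "ea", "visp")          # Person and Worker files


def household_key(record, key_names):
    """Return (district, enumeration area, visiting point) as a tuple of ints."""
    return tuple(int(record[name]) for name in key_names)


def person_key(record, key_names=PERSON_WORKER_KEYS, person_name="respno"):
    """Extend the household key with the person number."""
    return household_key(record, key_names) + (int(record[person_name]),)


def link_persons_to_households(house_rows, person_rows):
    """Pair each person key with its household record, or None if unmatched."""
    houses = {household_key(row, STANDARD_KEYS): row for row in house_rows}
    linked = []
    for person in person_rows:
        hh_key = household_key(person, PERSON_WORKER_KEYS)
        linked.append((person_key(person), houses.get(hh_key)))
    return linked


# Hypothetical records, invented for illustration only.
house_rows = [
    {"mdnumber": 101, "eanumber": 2034, "vpnumber": 7, "dwelling": "formal"},
]
person_rows = [
    {"distr": 101, "ea": 2034, "visp": 7, "respno": 1},
    {"distr": 101, "ea": 2034, "visp": 8, "respno": 1},
]
linked = link_persons_to_households(house_rows, person_rows)
```

Unmatched person records are kept with `None` in place of a household so that linkage failures stay visible instead of being silently dropped.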
Provider:
datafirst.uct.ac.za