
Russia Longitudinal Monitoring Survey - Higher School of Economics 1998 - Russian Federation

Source: catalog.ihsn.org | Updated: 2019-03-29 | Indexed: 2025-03-22
Download link:
https://catalog.ihsn.org/catalog/6195
Resource description:
Abstract
---------------------------
The Russia Longitudinal Monitoring Survey (RLMS) is a household-based survey designed to measure the effects of Russian reforms on the economic well-being of households and individuals. In particular, determining the impact of reforms on household consumption and individual health is essential, as most of the subsidies provided to protect food production and health care have been or will be reduced, eliminated, or at least dramatically changed. These effects are measured by a variety of means: detailed monitoring of individuals' health status and dietary intake, precise measurement of household-level expenditures and service utilization, and collection of relevant community-level data, including region-specific prices and community infrastructure data. Data have been collected since 1992.
The repeated cross-section design is far and away the simplest alternative for the RLMS. The sampling is cost efficient, easy to maintain, and easy to update when needed. The design supports both efficient cross-sectional and aggregate longitudinal analyses of change in the Russian household population. Updates to the sample, including a full replenishment of the probability sample of dwelling units, will not seriously disrupt the longitudinal data series.

Geographic coverage
---------------------------
National

Analysis unit
---------------------------
Households and individuals.

Kind of data
---------------------------
Sample survey data [ssd]

Sampling procedure
---------------------------
In Phase II (Rounds V - XX) of the RLMS, a multi-stage probability sample was employed. Please refer to the March 1997 review of the Phase II sample. First, a list of 2,029 consolidated regions was created to serve as PSUs. These were allocated into 38 strata based largely on geographical factors and level of urbanization but also based on ethnicity where there was salient variability.
As in many national surveys involving face-to-face interviews, some remote areas were eliminated to contain costs; also, Chechnya was eliminated because of armed conflict. From among the remaining 1,850 regions (containing 95.6 percent of the population), three very large population units were selected with certainty: Moscow city, Moscow Oblast, and St. Petersburg city constituted self-representing (SR) strata. The remaining non-self-representing regions (NSR) were allocated to 35 equal-sized strata. One region was then selected from each NSR stratum using the method "probability proportional to size" (PPS). That is, the probability that a region in a given NSR stratum was selected was directly proportional to its measure of population size. The NSR strata were designed to have approximately equal sizes to improve the efficiency of estimates.
The target population (omitting the deliberate exclusions described above) totaled over 140 million inhabitants. Ideally, one would use the population of eligible households, not the population of individuals. As is often the case, we were obliged to use figures on the population of individuals as a surrogate because of the unavailability of household figures in various regions.
Since there was no consolidated list of households or dwellings in any of the 38 selected PSUs, an intermediate stage of selection was then introduced, as usual. Professional samplers will recognize that this is actually the first stage of selection in the three SR strata, since those units were selected with certainty. That is, technically, in Moscow, St. Petersburg, and Moscow Oblast, the census enumeration districts were the PSUs. However, it was cumbersome to keep making this distinction throughout the description, and researchers followed the normal practice of using the terms "PSU" and "SSU" loosely. Needless to say, in the calculation of design effects, where the distinction is critical, the proper distinction was maintained.
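The PPS selection of one region per NSR stratum described above can be sketched in a few lines. This is a minimal illustration, not the RLMS sampling code: the region names and populations are hypothetical, and the helper `select_one_pps` is introduced here only for the example.

```python
import random

def select_one_pps(regions, rng):
    """Pick one region with probability proportional to its population size."""
    names = [name for name, _ in regions]
    sizes = [size for _, size in regions]
    # random.choices draws with probability weight_i / sum(weights)
    return rng.choices(names, weights=sizes, k=1)[0]

# Hypothetical NSR stratum: names and populations are made up for illustration.
stratum = [("region_a", 300_000), ("region_b", 200_000), ("region_c", 100_000)]

rng = random.Random(1998)  # fixed seed only so the example is reproducible
print(select_one_pps(stratum, rng))
```

Under these assumed sizes, `region_a` holds half of the stratum population and would be chosen in about half of repeated draws.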
The selection of second-stage units (SSUs) differed depending on whether the population was urban (located in cities and "villages of the city type," known as "PGTs") or rural (located in villages). That is, within each selected PSU the population was stratified into urban and rural substrata, and the target sample size was allocated proportionately to the two substrata. For example, if 40 percent of the population in a given region was rural, 40 of the 100 households allotted to the stratum were drawn from villages.
In rural areas of the selected PSUs, a list of all villages was compiled to serve as SSUs. The list was ordered by size and (where salient) by ethnic composition. PPS was employed to select one village for each 10 households allocated to the rural substratum. Again, under the standard principles of PPS, once the required number of villages was selected, an equal number of households in the sample (10) were allocated to each village. Since villages maintain very reliable lists of households, in each selected village the 10 households were selected systematically from the household list. In a few cases, villages were judged to be too small to sustain independent interviews with 10 households; in such cases, three or four tiny villages were treated as a single SSU for sampling purposes.
In urban areas, SSUs were defined by the boundaries of 1989 census enumeration districts, if possible. If the necessary information was not available, 1994 microcensus enumeration districts, voting districts, or residential postal zones were employed, in decreasing order of preference. Since census enumeration districts were originally designed to be roughly equal in population size, one district was selected systematically without using PPS for each 10 households required in the sample. In the few cases where postal zones were used, one zone was likewise selected systematically for each 10 households.
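The proportional allocation and the rural village selection described above can be sketched as follows. The description says only that PPS was applied to an ordered village list; the systematic PPS routine below is one common way to do that and is an assumption, as are the village names, sizes, and function names.

```python
import random

def allocate_psu_sample(households, rural_share, per_ssu=10):
    """Split a PSU's household quota between rural and urban substrata."""
    rural = round(households * rural_share)
    urban = households - rural
    return {
        "rural_households": rural,
        "urban_households": urban,
        "villages_to_select": rural // per_ssu,   # one village per 10 households
        "urban_ssus_to_select": urban // per_ssu,
    }

def systematic_pps(units, n, rng):
    """Systematic PPS over an ordered list of (name, size) pairs.

    Assumed technique: a random start in the first interval, then equal
    steps along the cumulative size scale. A unit larger than one step
    could be hit more than once.
    """
    total = sum(size for _, size in units)
    step = total / n
    start = rng.random() * step
    targets = [start + i * step for i in range(n)]
    chosen, cumulative, t = [], 0, 0
    for name, size in units:
        cumulative += size
        while t < n and targets[t] < cumulative:
            chosen.append(name)
            t += 1
    return chosen

# Worked example from the text: 100 households, 40 percent rural.
quota = allocate_psu_sample(100, 0.40)
print(quota)

# Hypothetical village list, ordered by size (sizes are made up).
sizes = [420, 380, 350, 300, 280, 260, 240, 220, 200, 180, 160, 140]
villages = [(f"village_{i:02d}", s) for i, s in enumerate(sizes, start=1)]
rng = random.Random(7)  # fixed seed only for reproducibility
selected = systematic_pps(villages, quota["villages_to_select"], rng)
print(selected)  # each selected village would then receive 10 households
```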
However, where voting districts were used, to compensate for the marked variation in population size, PPS was employed to select one voting district for each 10 households required in the urban substratum.
Given the lack of reliable official lists of households within the urban SSUs, we were obliged to develop the list of households from which 10 households were selected. First, a list of dwellings was made. Where more than one household was known to exist within a single dwelling (that is, in the communal apartments and enterprise dormitories that are relatively commonplace in the Russian Federation), the list was amended so that each household (or space within the dwelling) was enumerated in advance of selection. Then, the required number of households was drawn systematically, starting with a random selection in the first interval.
In both urban and rural substrata, interviewers were required to visit each selected dwelling up to three times to secure the interviews. They were not allowed to make substitutions of any sort. The interviewers' first task was to identify households at the designated dwellings. "Household" was defined as a group of people who live together in a given domicile, and who share common income and expenditures. Households were also defined to include unmarried children, 18 years of age or younger, who were temporarily residing outside the domicile at the time of the survey. If the interviewer identified more than one household in the dwelling, he or she was obliged to select one using a procedure outlined in the technical report.
The interviewer then administered a household questionnaire to the most knowledgeable and willing member of the household. The interviewer then conducted interviews with as many adults as possible, acquiring data about their individual activities and health. Data for the children's questionnaires were obtained from adults in the household.
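The urban household listing and systematic draw described above (expand multi-household dwellings, then take every k-th household from a random start in the first interval) can be sketched as below. The dwelling counts are hypothetical and the function names are introduced only for this illustration.

```python
import random

def expand_dwellings(dwellings):
    """List every household separately, including several per dwelling."""
    frame = []
    for dwelling_id, n_households in dwellings:
        for k in range(1, n_households + 1):
            frame.append((dwelling_id, k))
    return frame

def systematic_sample(frame, n, rng):
    """Draw n entries at a fixed interval from a random start in the first interval."""
    interval = len(frame) / n
    start = rng.random() * interval
    picks = []
    for i in range(n):
        # min() guards against a floating-point overshoot at the very end
        index = min(int(start + i * interval), len(frame) - 1)
        picks.append(frame[index])
    return picks

# Hypothetical SSU: 120 dwellings, every 15th one a communal apartment
# holding 3 households (made-up numbers).
dwellings = [(d, 3 if d % 15 == 0 else 1) for d in range(1, 121)]
frame = expand_dwellings(dwellings)

rng = random.Random(42)  # fixed seed only for reproducibility
sample = systematic_sample(frame, 10, rng)
print(len(frame), sample)
```

Enumerating households before selection, as the text describes, gives each household in a communal apartment its own chance of selection instead of sharing one dwelling-level chance.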
Because an attempt was made to obtain individual questionnaires for all members of households, the sample constitutes a proper probability sample of individuals as well as of households, without any special weighting. Actually, the fact that we did not interview unmarried minors living temporarily outside the domicile slightly diminished the representativeness of the sample of individuals in that age group.
The multivariate distribution of the sample by sex, age, and urban-rural location compared quite well with the corresponding multivariate distribution of the 1989 census. Of course, because of random sampling error and changes in the distribution since the 1989 census, we did not expect perfect correspondence. Nevertheless, there was usually a difference of only one percentage point or less between the two distributions.
Another way to evaluate the adequacy (or efficiency) of the sample was to examine design effects. An important factor in determining the precision of estimates in multi-stage samples was the mean ultimate cluster (PSU) size. All else being equal, the larger the cluster size, the less precise the estimate. In Rounds I through IV of the RLMS, the average cluster size approached 360, a large number dictated by constraints imposed by our collaborators. Thus, although the sample size covered around 6,000 households, precision was less than we would have liked for a sample of that size. In Rounds I and III of the RLMS, the 95 percent confidence interval for household income was about ±13 percent.
In the Phase II (Rounds V - XX) sample, the situation was considerably better. Although there were only 4,000 households, the mean size of clusters was much smaller than in Phase I. There were 35 PSUs with about 100 households each; even this result was an improvement over the average of 360 in the design of the RLMS Rounds I through IV. However, in the three self-representing areas, the respondents were drawn from 61 PSUs.
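The link between cluster size and precision can be illustrated with Kish's approximation for the design effect, deff ≈ 1 + (m - 1)ρ, where m is the mean cluster size and ρ the intra-cluster correlation. This is a rough sketch: the formula is a standard approximation, not something stated in this description, and the value of ρ below is purely hypothetical, not an RLMS estimate. The household and PSU counts are the ones quoted in the text.

```python
def kish_design_effect(mean_cluster_size, rho):
    """Kish's approximation: deff = 1 + (m - 1) * rho."""
    return 1 + (mean_cluster_size - 1) * rho

RHO = 0.05  # hypothetical intra-cluster correlation, chosen only for illustration

# (label, households, mean cluster size), using counts quoted in the text
designs = [
    ("Phase I (Rounds I-IV)", 6000, 360),
    ("Phase II (Rounds V-XX)", 4000, 4000 / (35 + 61)),
]

results = {}
for label, households, cluster_size in designs:
    deff = kish_design_effect(cluster_size, RHO)
    effective_n = households / deff  # size of a simple random sample with equal precision
    results[label] = (deff, effective_n)
    print(f"{label}: mean cluster size {cluster_size:.0f}, "
          f"deff {deff:.2f}, effective n {effective_n:.0f}")
```

Under this assumed ρ, the smaller Phase II sample has the larger effective sample size; how large the gap is depends entirely on the true intra-cluster correlation for the variable of interest.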
Recall that Moscow city and oblast, as well as St. Petersburg city, were not sampled but were chosen with certainty. Therefore, the first stage of selection in them was the selection of census enumeration districts. Thus the mean cluster size in the entire sample was about 42, i.e., 4,000/(35+61). Given these much smaller cluster sizes, researchers had reason to expect that precision in this survey would be as good as it was in Rounds I through IV despite the smaller sample size, and this expectation, in fact, turned out to be the case in Rounds V through XIII.

Mode of data collection
---------------------------
Face-to-face [f2f]

Research instrument
---------------------------
The questionnaires are English-language translations of the original Russian questionnaires. The English versions have been translated as literally as possible. The order of the questions and the layout of the pages have been preserved in the English versions.
The questionnaires are also designed to function as codebooks. The variable names, as they appear in the data sets, are usually listed below or to the left of the questions. If the abbreviation (char) appears with a variable name, then the responses to that question are stored in a character variable. If there is no variable name associated with a particular question, then the responses to that question do not appear in the data set.
Some questions in the questionnaires are color coded. Pink means that the question was added. Green indicates changes from the previous round (e.g., year). Gray means that the questions were asked, but the data are not available for public use; the questions were added at the request of the Pension Office and are for their use only.

Provider:
catalog.ihsn.org