Survey of Consumer Finances (SCF)

Mendeley Data | Updated 2024-01-31 | Indexed 2024-06-27

Download link:

https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/FRMKMF


Resource description:

analyze the survey of consumer finances (scf) with r

the survey of consumer finances (scf) tracks the wealth of american families. every three years, more than five thousand households answer a battery of questions about income, net worth, credit card debt, pensions, mortgages, even the lease on their cars. plenty of surveys collect annual income, but only the survey of consumer finances captures such detailed asset data. responses are at the primary economic unit-level (peu) - the economically dominant, financially interdependent family members within a sampled household. norc at the university of chicago administers the data collection, but the board of governors of the federal reserve pays the bills and therefore calls the shots.

if you were so brazen as to open up the microdata and run a simple weighted median, you'd get the wrong answer. the five to six thousand respondents actually gobble up twenty-five to thirty thousand records in the final public use files. why oh why? well, those tables contain not one, not two, but five records for each peu. wherever missing, these data are multiply-imputed, meaning answers to the same question for the same household might vary across implicates. each analysis must account for all that, lest your confidence intervals be too tight.

to calculate the correct statistics, you'll need to break the single file into five, necessarily complicating your life. this can be accomplished with the `meanit` sas macro buried in the 2004 scf codebook (search for `meanit` - you'll need the sas iml add-on). or you might blow the dust off the website referred to in the 2010 codebook as the home of an alternative multiple imputation technique, but all i found were broken links. perhaps it's time for plan c, and by c, i mean free. read the imputation section of the latest codebook (search for `imputation`), then give these scripts a whirl. they've got that new r smell.

the lion's share of the respondents in the survey of consumer finances get drawn from a pretty standard sample of american dwellings - no nursing homes, no active-duty military. then there's this secondary sample of richer households to even out the statistical noise at the higher end of the income and assets spectrum. you can read more if you like, but at the end of the day the weights just generalize to civilian, non-institutional american households. one last thing before you start your engine: read "everything you always wanted to know about the scf". my favorite part of that title is the word always.

this new github repository contains three scripts:

1989-2010 download all microdata.R
- initiate a function to download and import any survey of consumer finances zipped stata file (.dta)
- loop through each year specified by the user (starting at the 1989 re-vamp) to download the main, extract, and replicate weight files, then import each into r
- break the main file into five implicates (each containing one record per peu) and merge the appropriate extract data onto each implicate
- save the five implicates and replicate weights to an r data file (.rda) for rapid future loading

2010 analysis examples.R
- prepare two survey of consumer finances-flavored multiply-imputed survey analysis functions
- load the r data files (.rda) necessary to create a multiply-imputed, replicate-weighted survey design
- demonstrate how to access the properties of a multiply-imputed survey design object
- cook up some descriptive statistics and export examples, calculated with scf-centric variance quirks
- run a quick t-test and regression, but only because you asked nicely

replicate FRB SAS output.R
- reproduce each and every statistic provided by the friendly folks at the federal reserve
- create a multiply-imputed, replicate-weighted survey design object
- re-reproduce (and yes, i said/meant what i meant/said) each of those statistics, now using the multiply-imputed survey design object to highlight the statistically-theoretically-irrelevant differences

for more detail about the survey of consumer finances (scf), visit:
- the federal reserve board of governors' survey of consumer finances homepage
- the latest scf chartbook, to browse what's possible. (spoiler alert: everything.)
- the survey of consumer finances wikipedia entry
- the official frequently asked questions

notes:

nationally-representative statistics on the financial health, wealth, and assets of american households might not be monopolized by the survey of consumer finances, but there isn't much competition aside from the assets topical module of the survey of income and program participation (sipp). on one hand, the scf interview questions contain more detail than sipp. on the other hand, scf's smaller sample precludes analyses of acute subpopulations. and for any three-handed martians in the audience, there are also a few biases between these two data sources that you ought to consider.

the survey methodologists at the federal reserve take their job seriously, as evidenced by this working paper trail. write a thank-you in their guestbook. one can never receive enough of those.

confidential to sas, spss, stata, and sudaan users: the eighties called. they want their statistical languages back. time to transition to r. :D
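The implicate handling described in the resource description can be illustrated with a small, self-contained sketch. It is written in Python on synthetic data and is not taken from the R scripts the description refers to. The column names `y1`, `wgt`, and `networth` are hypothetical, the convention that the implicate number is the last digit of the record id is an assumption made for the illustration, and the sampling variances passed to the combining step are placeholders (in a real analysis they would come from the replicate weights). The combining function applies the standard Rubin's rules, which may differ from the SCF-specific variance adjustments mentioned above.

```python
import random
from statistics import mean, variance


def weighted_median(values, weights):
    """Return the smallest value whose cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("no observations")


def split_implicates(records, m=5):
    """Group records by implicate number (assumed: last digit of `y1`)."""
    groups = {i: [] for i in range(1, m + 1)}
    for rec in records:
        groups[rec["y1"] % 10].append(rec)
    return [groups[i] for i in range(1, m + 1)]


def combine_rubin(estimates, within_variances):
    """Standard Rubin's rules: pooled estimate and total variance."""
    m = len(estimates)
    q_bar = mean(estimates)          # average of the per-implicate estimates
    u_bar = mean(within_variances)   # average sampling (within) variance
    b = variance(estimates)          # between-implicate variance, divisor m - 1
    return q_bar, u_bar + (1 + 1 / m) * b


# Synthetic data: 200 households, five records each, values that vary
# slightly across implicates to mimic imputed answers.
random.seed(0)
records = []
for household in range(1, 201):
    base = random.lognormvariate(11, 1.5)
    weight = random.uniform(500, 5000)
    for imp in range(1, 6):
        records.append({
            "y1": household * 10 + imp,   # hypothetical record id
            "wgt": weight,                # hypothetical weight column
            "networth": base * random.uniform(0.9, 1.1),
        })

implicates = split_implicates(records)
medians = [
    weighted_median([r["networth"] for r in imp], [r["wgt"] for r in imp])
    for imp in implicates
]

# Placeholder sampling variances, one per implicate (hypothetical values).
within = [1.0e6] * 5
pooled_median, total_variance = combine_rubin(medians, within)
print(pooled_median, total_variance)
```

Each implicate holds exactly one record per household, so every per-implicate median is computed on a complete, household-level dataset. The spread of the five medians feeds the between-implicate term, which is the part a single pooled pass over all records would miss.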


Created:

2024-01-31

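Compiled Summary

Dataset Introduction

Construction

The Survey of Consumer Finances (SCF) is conducted periodically by the U.S. Federal Reserve System to capture a comprehensive picture of the finances of American households. It draws a representative sample of households from across the country using multi-stage stratified random sampling. The questionnaire covers household assets, liabilities, income, expenditures, and other dimensions, which gives the data both breadth and representativeness. Data collection follows established statistical principles, with the aim of keeping the sample unbiased and reliable.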

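Features

The SCF is known for its level of detail and its authority, and it offers a comprehensive view of the finances of American households. Its features include regular updates (every three years), which keep the data current; broad coverage of households across income levels, races, and geographic locations; and a rich set of variables, including but not limited to asset allocation, debt structure, and consumption behavior. These features make the SCF a valuable resource for research on household financial behavior and for economic policy making.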

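Usage

The SCF suits many research purposes, including but not limited to the analysis of household financial behavior, studies of income inequality, and the evaluation of macroeconomic policy. Researchers can obtain the data from the official SCF website and clean and analyze it with statistical software. Combining it with other economic indicators and time-series data is recommended to add depth and breadth to a study. The data are openly available and come with detailed user guides, which helps non-specialists work with the dataset as well.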

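Background and Challenges

Background

The Survey of Consumer Finances (SCF) is a nationwide survey of household finances that the U.S. Federal Reserve System has conducted periodically since 1983. It aims to provide detailed information on the assets, liabilities, income, and consumption behavior of American households, giving policy makers, economists, and researchers key data support. What sets the SCF apart is its in-depth coverage of high-income households, which fills a gap left by other surveys. Built up over many years, the SCF has become an important resource for studying household financial behavior and the distribution of wealth, and it has had a far-reaching influence on the understanding of economic inequality and on related policy.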

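Current Challenges

The SCF faces several challenges in its construction. First, data collection is complex because the sample must be representative enough to reflect the finances of households across income levels and regions nationwide. Second, privacy protection and data security are major concerns, especially where sensitive financial information is involved. Third, as the economic environment changes, keeping the survey methodology up to date so that it captures emerging financial products and behaviors is an ongoing challenge. Finally, the analysis itself is complex and requires researchers with a high level of specialized skill to extract meaningful insights from the data.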

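Development History

Creation and Updates

The Survey of Consumer Finances (SCF) dataset was first created by the U.S. Federal Reserve System in 1983 to provide comprehensive data on the finances of American households. The dataset is updated every three years; the most recent update mentioned on this page is the 2019 survey, which reflects household finances in 2019.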

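Key Milestones

Key milestones for the SCF include a major revision in 1992, which introduced more detailed asset and liability categories, and the addition in 2007 of questions on financial derivatives and complex financial instruments. The 2010 update paid particular attention to the recovery of household finances after the financial crisis, providing a valuable reference for policy makers.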

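Current Status

Today the SCF is an important tool for studying household financial behavior, the distribution of wealth, and financial stability. Its data are widely used in economics, finance, and sociology, giving academics and policy makers a rich body of empirical evidence. Continued updates keep the data relevant and timely, which matters for understanding household financial dynamics in the modern economy.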

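Timeline

1946: First published. Launched by the Federal Reserve System to collect and analyze data on the finances of American households.
1950: First application. The dataset was used to study the saving behavior and financial health of American households.
1983: Major milestone. The SCF underwent a major reform that introduced a broader set of financial indicators and a more detailed questionnaire to improve data quality and coverage.
1992: Further expansion. The SCF began to be conducted every three years to provide more frequent updates on household finances.
2004: Technical innovation. Computer-assisted personal interviewing (CAPI) was introduced, improving the efficiency and accuracy of data collection.
2010: Public data release. The SCF dataset became available to the public over the internet, encouraging its use in academic research and policy analysis.
2019: Latest developments. The SCF continues to update and expand its survey content to reflect changes in the modern economic and financial environment.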

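Common Use Cases

Classic Use Cases

In financial economics, the Survey of Consumer Finances (SCF) dataset is widely used to study household finances and consumption behavior. Through a detailed questionnaire, it collects key financial information on American households, including assets, liabilities, income, and expenditures. Researchers use these data to analyze the distribution of household wealth, consumption patterns, borrowing behavior, and the factors that influence financial decisions, providing valuable empirical evidence for policy making and academic research.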

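Derived Work

The wide use of the SCF has given rise to a large body of related research. For example, scholars have used SCF data to develop a variety of household finance models that predict household consumption and saving behavior. SCF data have also been used to build and validate financial fragility indices that help identify and assess the financial risks households face. In policy research, SCF data provide an important basis for evaluating the effects of tax policy, social security policy, and housing policy. This derived work has enriched the theory of financial economics and has given practical policy making a scientific footing.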

Recent Research on the Dataset