
US Baby Names

Source: www.kaggle.com (updated 2017-11-21, indexed 2025-03-24)
Download link:
https://www.kaggle.com/kaggle/us-baby-names
Description:
US Social Security applications are a great way to track trends in how babies born in the US are named. Data.gov releases two datasets that are helpful for this: one at the [national level](https://catalog.data.gov/dataset/baby-names-from-social-security-card-applications-national-level-data) and another [at the state level](https://catalog.data.gov/dataset/baby-names-from-social-security-card-applications-data-by-state-and-district-of-). Note that, for privacy, only names given to at least 5 babies in the same year (and, for the state-level data, the same state) are included in this dataset.

[![benjamin](https://www.kaggle.io/svf/153725/b6c3f30368aeb2b277016b0582d1eab6/nameOverTime.png)](https://www.kaggle.com/benhamner/d/kaggle/us-baby-names/babies-named-benjamin-over-time)

I've taken the raw files here and combined/normalized them into two CSV files (one for each dataset) as well as a SQLite database with two equivalently-defined tables. The code that did these transformations is [available here](https://github.com/benhamner/us-baby-names/).

*New to data exploration in R? Take the free, interactive DataCamp course, "[Data Exploration With Kaggle Scripts](https://www.datacamp.com/courses/data-exploration-with-kaggle-scripts)," to learn the basics of visualizing data with ggplot. You'll also create your first Kaggle Scripts along the way.*
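
As a rough illustration of how one of the combined CSV files might be explored, the sketch below totals the yearly counts for a single name. The description does not list the schema, so the column names `Name`, `Year`, `Gender`, and `Count` are assumptions, and the sample rows are made up purely so the example runs on its own. The helper `count_by_year` is likewise hypothetical and not part of the dataset's tooling.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows standing in for the national-level CSV.
# The column names are an assumption; check the real file's header first.
SAMPLE_CSV = """Name,Year,Gender,Count
Benjamin,1990,M,100
Benjamin,1991,M,120
Benjamin,1991,F,5
Emma,1991,F,300
"""


def load_rows(handle):
    """Read CSV rows into a list of dicts keyed by the header line."""
    return list(csv.DictReader(handle))


def count_by_year(rows, name):
    """Sum the Count column per year for one name, across genders."""
    totals = defaultdict(int)
    for row in rows:
        if row["Name"] == name:
            totals[int(row["Year"])] += int(row["Count"])
    # Return a plain dict ordered by year.
    return dict(sorted(totals.items()))


rows = load_rows(io.StringIO(SAMPLE_CSV))
benjamin = count_by_year(rows, "Benjamin")
print(benjamin)
```

To run this against the downloaded data instead, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle, for example `open(path, newline="")`, where `path` points at whichever CSV you downloaded. Because names with fewer than 5 babies in a year are excluded for privacy, a missing year means "fewer than 5", not necessarily zero.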

Provider:
www.kaggle.com