Adult Data Set
Source: BigML. Updated 2026-05-08; indexed 2025-01-04.
Download link:
https://bigml.com/user/czuriaga/gallery/dataset/5a61dce12a83477e0d000112
Resource description:
**Abstract**: Predict whether income exceeds $50K/yr based on census data. Also known as the "Census Income" dataset.
**Donor**:
Ronny Kohavi and Barry Becker
Data Mining and Visualization
Silicon Graphics
e-mail: ronnyk@live.com for questions.
**Data Set Information**:
Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records was extracted using the following conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0))
The prediction task is to determine whether a person makes over $50K a year.
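The extraction conditions above can be read as a simple row filter. The following is a minimal sketch, not the donors' original extraction code: the record layout (a mapping keyed by the four census variable names) and the sample values are assumptions made for illustration only.

```python
from typing import Mapping


def passes_extraction_filter(record: Mapping[str, float]) -> bool:
    """Return True when a record satisfies all four extraction conditions."""
    return (
        record["AAGE"] > 16
        and record["AGI"] > 100
        and record["AFNLWGT"] > 1
        and record["HRSWK"] > 0
    )


# Hypothetical records, invented purely to exercise the filter.
records = [
    {"AAGE": 39, "AGI": 40000, "AFNLWGT": 77516, "HRSWK": 40},   # passes
    {"AAGE": 15, "AGI": 5000, "AFNLWGT": 120000, "HRSWK": 10},   # fails AAGE>16
    {"AAGE": 52, "AGI": 60000, "AFNLWGT": 209642, "HRSWK": 0},   # fails HRSWK>0
]

kept = [r for r in records if passes_extraction_filter(r)]
print(len(kept))
```

Because all four comparisons are strict, a record sitting exactly on a boundary (for example `AAGE` equal to 16) is excluded.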
**Listing of attributes**:
**age**: continuous.
**workclass**: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.
**fnlwgt**: continuous.
**education**: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.
**education-num**: continuous.
**marital-status**: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse.
**occupation**: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.
**relationship**: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.
**race**: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.
**sex**: Female, Male.
**capital-gain**: continuous.
**capital-loss**: continuous.
**hours-per-week**: continuous.
**native-country**: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands.
[Adult dataset from Delve datasets](http://www.cs.toronto.edu/~delve/data/datasets.html) and [UCI](https://archive.ics.uci.edu/ml/datasets/adult)
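The attribute listing can be turned into a small schema for parsing rows. The sketch below assumes a header-less, comma-separated layout with a space after each comma, and it assumes the name `income` for the class label, which the listing above does not name. The sample row is illustrative, and the BigML export may be laid out differently.

```python
import csv
import io

# The 14 attributes listed above, in order, plus an assumed label column.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "race", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",  # assumed name for the class label
]

# Attributes marked "continuous" in the listing.
CONTINUOUS = {
    "age", "fnlwgt", "education-num",
    "capital-gain", "capital-loss", "hours-per-week",
}


def parse_rows(text):
    """Parse comma-separated text into dicts, casting continuous fields to int."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        row = dict(zip(COLUMNS, fields))
        for name in CONTINUOUS:
            row[name] = int(row[name])
        rows.append(row)
    return rows


# Illustrative sample row in the assumed layout.
sample = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
)
rows = parse_rows(sample)
print(rows[0]["age"], rows[0]["workclass"], rows[0]["income"])
```

Keeping the categorical fields as plain strings leaves the choice of encoding (one-hot, ordinal, and so on) to whatever model is trained later.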
Created:
2018-01-19
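Summary of original information
Adult Data Set: dataset overview
Basic information
- Dataset name: Adult Data Set
- Dataset size: 5.0 MB
- Number of fields: 15
- Number of instances: 48,842
- Created: Fri, 19 Jan 2018 11:56:17 +0000
- Published: Fri, 19 Jan 2018 16:36:34 +0000
- Dataset URL: https://bigml.com/user/czuriaga/gallery/dataset/5a61dce12a83477e0d000112
Description
- Abstract: Predict whether income exceeds $50K/yr based on census data. Also known as the "Census Income" dataset.
- Donors: Ronny Kohavi and Barry Becker (Data Mining and Visualization, Silicon Graphics)
- Data source: Extracted from the 1994 Census database using the conditions: ((AAGE>16) && (AGI>100) && (AFNLWGT>1) && (HRSWK>0))
- Prediction task: Determine whether a person makes over $50K a year.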
Attribute list
- age: continuous.
- workclass: categorical (Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked).
- fnlwgt: continuous.
- education: categorical (Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool).
- education-num: continuous.
- marital-status: categorical (Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse).
- occupation: categorical (Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces).
- relationship: categorical (Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried).
- race: categorical (White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black).
- sex: categorical (Female, Male).
- capital-gain: continuous.
- capital-loss: continuous.
- hours-per-week: continuous.
- native-country: categorical (United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands).
Data quality
- Missing values:
  - workclass: 2,799
  - occupation: 2,809
- Erroneous values: none
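A per-column missing-value count like the one reported above could be computed along the following lines. This is a sketch under assumptions: it treats `?`, empty strings, and `None` as missing (the page does not say how missing values are encoded), and the three rows are hypothetical. The last part only converts the counts reported above into shares of the 48,842 instances.

```python
from collections import Counter

MISSING_MARKER = "?"  # assumed encoding of a missing value


def count_missing(rows, marker=MISSING_MARKER):
    """Count missing entries per column across a list of row dicts."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or str(value).strip() in ("", marker):
                counts[column] += 1
    return counts


# Hypothetical rows, invented for illustration.
rows = [
    {"age": 39, "workclass": "State-gov", "occupation": "Adm-clerical"},
    {"age": 54, "workclass": "?", "occupation": "?"},
    {"age": 18, "workclass": "Private", "occupation": ""},
]
missing = count_missing(rows)
print(dict(missing))

# Shares implied by the counts reported on this page.
TOTAL_INSTANCES = 48_842
reported = {"workclass": 2_799, "occupation": 2_809}
shares = {name: round(100 * n / TOTAL_INSTANCES, 2) for name, n in reported.items()}
print(shares)
```

Both reported columns are missing in a little under 6% of instances, so dropping the affected rows or imputing a dedicated "unknown" category are both plausible options.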
Tags
- Census
- Demographic
- Incomes
Collected summary
Dataset introduction

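Background and challenges
Background overview
The dataset is based on 1994 US Census data and contains 48,842 records with 15 fields. It is mainly used to predict whether a person's annual income exceeds $50K. The data has been cleaned and filtered, and covers demographic, education, occupation, and other information.
The content above was collected and summarized by 遇见数据集.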



