Medical Cost Personal Datasets
Source: www.kaggle.com | Updated: 2018-02-21 | Indexed: 2025-01-21
Download link:
https://www.kaggle.com/mirichoi0218/insurance
Description:
## Context
Machine Learning with R by Brett Lantz is a book that provides an introduction to machine learning using R. As far as I can tell, Packt Publishing does not make its datasets available online unless you buy the book and create a user account, which can be a problem if you are checking the book out from the library or borrowing it from a friend. All of these datasets are in the public domain; they simply needed some cleaning up and recoding to match the format in the book.
## Content
**Columns**
- age: age of primary beneficiary
- sex: gender of the insurance contractor (female, male)
- bmi: body mass index, an objective index of body weight relative to height (kg / m ^ 2), computed as weight divided by height squared; it indicates whether weight is relatively high or low for a given height, ideally 18.5 to 24.9
- children: Number of children covered by health insurance / Number of dependents
- smoker: smoking status of the beneficiary
- region: the beneficiary's residential area in the US, northeast, southeast, southwest, northwest.
- charges: Individual medical costs billed by health insurance
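
The column list above can be turned into a small typed loader. The following is a minimal sketch using only the Python standard library; the sample rows are made-up illustrative values rather than records from the dataset, the `yes`/`no` encoding of `smoker` is an assumption, and the helper names (`parse_row`, `bmi`, `in_ideal_range`) are hypothetical.

```python
import csv
import io

# Header follows the column list above. The two data rows are made-up
# illustrative values, NOT records from the actual dataset, and the
# yes/no encoding of `smoker` is an assumption.
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
30,female,22.0,1,no,northeast,4000.0
45,male,31.5,2,yes,southeast,30000.0
"""


def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def in_ideal_range(bmi_value):
    """True if the BMI falls in the 18.5 to 24.9 range quoted in the column list."""
    return 18.5 <= bmi_value <= 24.9


def parse_row(row):
    """Convert one CSV row (a dict of strings) into typed Python values."""
    return {
        "age": int(row["age"]),
        "sex": row["sex"],
        "bmi": float(row["bmi"]),
        "children": int(row["children"]),
        "smoker": row["smoker"],
        "region": row["region"],
        "charges": float(row["charges"]),
    }


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

To read the real file instead, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for the downloaded CSV.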
## Acknowledgements
The dataset is available on GitHub [here](https://github.com/stedy/Machine-Learning-with-R-datasets).
## Inspiration
Can you accurately predict insurance costs?
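
One way to approach this question is to start from a simple baseline and measure its error. The sketch below fits a one-feature least-squares line of `charges` on `age` and reports the root mean squared error (RMSE). The numbers are made-up illustrative values, not dataset records, and the function names are hypothetical. It is a starting point under those assumptions, not a recommended model.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (single feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up illustrative values, NOT taken from the dataset.
ages = [20, 30, 40, 50]
charges = [2000.0, 4000.0, 6000.0, 8000.0]

slope, intercept = fit_line(ages, charges)
predictions = [slope * a + intercept for a in ages]
error = rmse(charges, predictions)
```

The error here is computed on the same points used for fitting. For an honest estimate of accuracy, fit on one part of the data and compute the RMSE on a held-out part.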
Provider:
Kaggle



