
Medical Cost Personal Datasets

Source: www.kaggle.com · Updated 2018-02-21 · Indexed 2025-01-21
Download link:
https://www.kaggle.com/mirichoi0218/insurance
Description:
## Context

Machine Learning with R by Brett Lantz is a book that provides an introduction to machine learning using R. As far as I can tell, Packt Publishing does not make its datasets available online unless you buy the book and create a user account, which can be a problem if you are checking the book out from the library or borrowing it from a friend. All of these datasets are in the public domain but needed some cleaning up and recoding to match the format in the book.

## Content

**Columns**

- age: age of primary beneficiary
- sex: insurance contractor gender (female, male)
- bmi: body mass index, an objective index of body weight relative to height, computed as weight divided by height squared (kg / m ^ 2); ideally 18.5 to 24.9
- children: number of children covered by health insurance / number of dependents
- smoker: smoking status
- region: the beneficiary's residential area in the US (northeast, southeast, southwest, northwest)
- charges: individual medical costs billed by health insurance

## Acknowledgements

The dataset is available on GitHub [here](https://github.com/stedy/Machine-Learning-with-R-datasets).

## Inspiration

Can you accurately predict insurance costs?
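
As a starting point for working with the columns described above, here is a minimal Python sketch that computes BMI from weight and height and parses rows with this column layout into typed values. It uses only the standard library. The helper names (`bmi`, `parse_row`, `load_rows`) are hypothetical, the sample rows are made up for illustration and are not taken from the dataset, and the `yes`/`no` encoding of `smoker` is an assumption, since the description only says "Smoking".

```python
import csv
import io

# Column names as listed in the dataset description.
COLUMNS = ["age", "sex", "bmi", "children", "smoker", "region", "charges"]


def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def parse_row(row):
    """Convert one CSV row (all strings) into typed Python values."""
    return {
        "age": int(row["age"]),
        "sex": row["sex"],
        "bmi": float(row["bmi"]),
        "children": int(row["children"]),
        "smoker": row["smoker"],  # assumed to be encoded as "yes"/"no"
        "region": row["region"],
        "charges": float(row["charges"]),
    }


def load_rows(text):
    """Parse CSV text with a header line into a list of typed rows."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [parse_row(r) for r in reader]


# Made-up rows for illustration only; NOT taken from the real dataset.
SAMPLE = """age,sex,bmi,children,smoker,region,charges
30,female,22.5,1,no,northeast,4000.0
45,male,31.0,2,yes,southeast,30000.0
"""

if __name__ == "__main__":
    for row in load_rows(SAMPLE):
        in_ideal_range = 18.5 <= row["bmi"] <= 24.9
        print(row["age"], row["bmi"], in_ideal_range)
```

To use it on the real data, read the downloaded CSV file's contents and pass the text to `load_rows` instead of `SAMPLE`. The typed rows can then feed whatever regression model you choose for predicting `charges`.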

Provider:
Kaggle