imodels/diabetes-readmission
Source: Hugging Face. Updated 2022-08-14; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/imodels/diabetes-readmission
Description:
---
annotations_creators: []
language: []
language_creators: []
license: []
multilinguality: []
pretty_name: diabetes-readmission
size_categories:
- 100K<n<1M
source_datasets: []
tags:
- interpretability
- fairness
- medicine
task_categories:
- tabular-classification
task_ids: []
---
Port of the diabetes-readmission dataset from the UCI Machine Learning Repository ([Diabetes 130-US hospitals for years 1999-2008](https://archive.ics.uci.edu/ml/datasets/diabetes+130-us+hospitals+for+years+1999-2008)). See the UCI page for details, and use the data carefully.
Basic preprocessing done by the [imodels team](https://github.com/csinva/imodels) in [this notebook](https://github.com/csinva/imodels-data/blob/master/notebooks_fetch_data/00_get_datasets_custom.ipynb).
The target is the binary outcome `readmitted`.
### Sample usage
Load the data:
```
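import pandas as pd
from datasets import load_dataset

# Download the dataset and convert the training split to a DataFrame
dataset = load_dataset("imodels/diabetes-readmission")
df = pd.DataFrame(dataset['train'])

# Separate the features from the binary target
X = df.drop(columns=['readmitted'])
y = df['readmitted'].values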
```
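Before fitting anything, it can help to check how the two classes of `readmitted` are distributed, since plain accuracy is hard to interpret when one class dominates. The sketch below is not part of the original card: `class_balance` is a hypothetical helper name, and it uses only the Python standard library, so it accepts the `y` array from the block above as well as a plain list.

```python
from collections import Counter


def class_balance(labels):
    """Return {label: (count, fraction)} for an iterable of class labels.

    Hypothetical helper for illustration; not part of imodels or datasets.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("labels is empty")
    return {label: (n, n / total) for label, n in counts.items()}


# Toy labels stand in for the real target here; with the card's code
# loaded, pass `y` instead.
toy_labels = [0, 0, 0, 1]
for label, (n, frac) in sorted(class_balance(toy_labels).items()):
    print(f"readmitted={label}: {n} rows ({frac:.1%})")
```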
Fit a model:
```
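import imodels
import numpy as np  # used in the evaluation step

# FIGS: a sum of small trees, capped at 5 rules (splits) in total
m = imodels.FIGSClassifier(max_rules=5)
m.fit(X, y)
print(m)  # prints the fitted rules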
```
Evaluate:
```
# Build the test features and labels from the test split, not from `df`
df_test = pd.DataFrame(dataset['test'])
X_test = df_test.drop(columns=['readmitted'])
y_test = df_test['readmitted'].values
print('accuracy', np.mean(m.predict(X_test) == y_test))
```
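Accuracy alone can hide poor performance on the rarer class, so a few extra numbers may be worth reporting. The following is a minimal sketch, not the imodels team's evaluation code: `binary_metrics` is a hypothetical helper, and it assumes the positive class of `readmitted` is encoded as `1` (an assumption; check the data before relying on it).

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute basic binary-classification metrics from two label sequences.

    Hypothetical helper for illustration. `positive` is the label treated
    as the positive class (assumed to be 1 here).
    """
    y_true, y_pred = list(y_true), list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("no samples to evaluate")

    # Tally the four cells of the confusion matrix
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1

    n = tp + fp + tn + fn
    nan = float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else nan  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else nan  # recall on negatives
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        # Accuracy of always predicting the most common true class
        "majority_baseline": max(tp + fn, tn + fp) / n,
    }


# Toy example; with the card's objects in scope this could instead be
# binary_metrics(y_test, m.predict(X_test)).
print(binary_metrics([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))
```

Comparing `accuracy` with `majority_baseline` shows whether the model does better than always predicting the most common class.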
Provider:
imodels
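Summary of the original information
Dataset overview
Dataset name
- Name: diabetes-readmission
Dataset attributes
- Size: 100K<n<1M
- Tags: interpretability, fairness, medicine
- Task category: tabular-classification
Dataset target
- Target variable: readmitted (binary classification)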



