aai540-group3/diabetes-readmission
Source: Hugging Face. Updated 2024-09-22; indexed 2025-04-26.
Download link:
https://hf-mirror.com/datasets/aai540-group3/diabetes-readmission
Description:
---
annotations_creators: []
language: []
language_creators: []
license: []
multilinguality: []
pretty_name: diabetes-readmission
size_categories:
- 100K<n<1M
source_datasets: []
tags:
- interpretability
- fairness
- medicine
task_categories:
- tabular-classification
task_ids: []
---
Port of the diabetes-readmission dataset from the UCI Machine Learning Repository ([Diabetes 130-US hospitals for years 1999-2008](https://archive.ics.uci.edu/ml/datasets/diabetes+130-us+hospitals+for+years+1999-2008)). See the UCI page for details, and use the data carefully.
Basic preprocessing was done by the [imodels team](https://github.com/csinva/imodels) in [this notebook](https://github.com/csinva/imodels-data/blob/master/notebooks_fetch_data/00_get_datasets_custom.ipynb).
The target is the binary outcome `readmitted`.
### Sample usage
Load the data:
```
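import pandas as pd
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub
dataset = load_dataset("imodels/diabetes-readmission")
# Convert the training split to a DataFrame and separate features from the target
df = pd.DataFrame(dataset['train'])
X = df.drop(columns=['readmitted'])
y = df['readmitted'].values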
```
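Before fitting, it can help to confirm that the target really is binary and to see how the two classes are split. The sketch below is one way to do that; `summarize_target` is a hypothetical helper (not part of `datasets` or `imodels`), it assumes the positive class is encoded as `1`, and the toy frame uses made-up column names and values so the block runs without downloading anything.

```python
import numpy as np
import pandas as pd


def summarize_target(df: pd.DataFrame, target: str = "readmitted") -> dict:
    """Return basic sanity-check statistics for a binary target column.

    Hypothetical helper: assumes the positive class is encoded as 1.
    """
    y = df[target].to_numpy()
    values = np.unique(y)
    # A binary target should contain at most two distinct values.
    if len(values) > 2:
        raise ValueError(f"expected a binary target, found values {values.tolist()}")
    return {
        "n_rows": int(len(y)),
        "n_features": int(df.shape[1] - 1),  # every column except the target
        "positive_rate": float(np.mean(y == 1)),
    }


# Toy stand-in for the real training frame (made-up names and values).
toy = pd.DataFrame(
    {
        "feature_a": [1, 3, 2, 7],
        "feature_b": [5, 12, 8, 20],
        "readmitted": [0, 1, 0, 1],
    }
)
print(summarize_target(toy))
```

On the real data, the same call would be `summarize_target(df)` with the `df` built from the training split.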
Fit a model:
```
import imodels
import numpy as np
m = imodels.FIGSClassifier(max_rules=5)
m.fit(X, y)
print(m)
```
Evaluate:
```
# Build the test features and labels from the test split (not the training frame)
df_test = pd.DataFrame(dataset['test'])
X_test = df_test.drop(columns=['readmitted'])
y_test = df_test['readmitted'].values
print('accuracy', np.mean(m.predict(X_test) == y_test))
```
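Accuracy is a single number and can hide how a classifier behaves on each class, which matters if the two `readmitted` classes turn out to be unevenly represented. As a minimal sketch, the hypothetical helper `binary_metrics` below derives sensitivity, specificity, and balanced accuracy from the confusion-matrix counts using only NumPy. It assumes labels are encoded as 0/1 with 1 as the positive class, and the arrays are toy values, not results from this dataset.

```python
import numpy as np


def binary_metrics(y_true, y_pred) -> dict:
    """Compute accuracy, sensitivity, specificity and balanced accuracy.

    Hypothetical helper: assumes 0/1 labels with 1 as the positive class.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Confusion-matrix counts
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    # Guard against empty classes to avoid division by zero
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Toy labels and predictions (made-up values for illustration only).
y_true_toy = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred_toy = [1, 0, 0, 0, 0, 0, 0, 1]
print(binary_metrics(y_true_toy, y_pred_toy))
```

With the fitted model from the usage example, the call would be `binary_metrics(y_test, m.predict(X_test))`.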
Provider:
aai540-group3



