
imodels/diabetes-readmission

Source: Hugging Face
Updated: 2022-08-14
Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/imodels/diabetes-readmission
Description:
---
annotations_creators: []
language: []
language_creators: []
license: []
multilinguality: []
pretty_name: diabetes-readmission
size_categories:
- 100K<n<1M
source_datasets: []
tags:
- interpretability
- fairness
- medicine
task_categories:
- tabular-classification
task_ids: []
---

Port of the diabetes-readmission dataset from UCI (link [here](https://archive.ics.uci.edu/ml/datasets/diabetes+130-us+hospitals+for+years+1999-2008)). See details there and use carefully.

Basic preprocessing done by the [imodels team](https://github.com/csinva/imodels) in [this notebook](https://github.com/csinva/imodels-data/blob/master/notebooks_fetch_data/00_get_datasets_custom.ipynb).

The target is the binary outcome `readmitted`.

### Sample usage

Load the data:

```
import pandas as pd
from datasets import load_dataset

dataset = load_dataset("imodels/diabetes-readmission")
df = pd.DataFrame(dataset['train'])
# Separate the features from the binary target column
X = df.drop(columns=['readmitted'])
y = df['readmitted'].values
```

Fit a model:

```
import imodels
import numpy as np

m = imodels.FIGSClassifier(max_rules=5)
m.fit(X, y)
print(m)
```

Evaluate:

```
# Build the test features and labels from the held-out split
df_test = pd.DataFrame(dataset['test'])
X_test = df_test.drop(columns=['readmitted'])
y_test = df_test['readmitted'].values
print('accuracy', np.mean(m.predict(X_test) == y_test))
```
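The evaluation step reports raw accuracy only. On a binary target, that number is easier to interpret next to what a trivial "always predict the most frequent class" rule would score. The following is a minimal sketch of that comparison in plain Python. The helper names and the toy labels are made up for illustration and are not part of the dataset or of imodels.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def majority_baseline_accuracy(y_true):
    """Accuracy of always predicting the most frequent label in y_true."""
    counts = Counter(y_true)
    return max(counts.values()) / len(y_true)


# Toy labels and predictions (hypothetical, not values from the dataset).
toy_y_test = [0, 0, 0, 1, 1, 0, 1, 0]
toy_y_pred = [0, 0, 1, 1, 0, 0, 1, 0]

model_acc = accuracy(toy_y_test, toy_y_pred)
baseline_acc = majority_baseline_accuracy(toy_y_test)
print("model accuracy", model_acc)
print("majority-class baseline", baseline_acc)
```

With the variables from the sample usage, the same helpers could be called as `accuracy(y_test, m.predict(X_test))` and `majority_baseline_accuracy(y_test)`. A model whose accuracy is close to the baseline has learned little beyond the class frequencies.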

Provider:
imodels
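Summary of original information

Dataset overview

Dataset name

  • Name: diabetes-readmission

Dataset attributes

  • Size: 100K<n<1M
  • Tags:
    • interpretability
    • fairness
    • medicine
  • Task category: tabular-classification

Dataset target

  • Target variable: readmitted (binary classification)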
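Because the target is binary, it can help to look at the class counts before fitting anything. Below is a small sketch of such a check. It assumes the labels are encoded as 0/1, which the card does not state explicitly, and the function name and toy values are hypothetical.

```python
from collections import Counter


def summarize_binary_target(labels):
    """Return class counts and the positive rate for a 0/1 label sequence."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("labels must not be empty")
    # Assumption: the binary outcome is encoded as 0 (negative) / 1 (positive).
    unexpected = set(counts) - {0, 1}
    if unexpected:
        raise ValueError(f"expected only 0/1 labels, found {unexpected}")
    return {
        "n": total,
        "negatives": counts.get(0, 0),
        "positives": counts.get(1, 0),
        "positive_rate": counts.get(1, 0) / total,
    }


# Toy values for illustration only (not taken from the dataset).
toy_readmitted = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
summary = summarize_binary_target(toy_readmitted)
print(summary)
```

Applied to real data, the input would be the `readmitted` column, for example `summarize_binary_target(df['readmitted'].tolist())`.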

Dataset usage examples

  • Load the data (Python):

        import pandas as pd
        from datasets import load_dataset
        dataset = load_dataset("imodels/diabetes-readmission")
        df = pd.DataFrame(dataset['train'])
        X = df.drop(columns=['readmitted'])
        y = df['readmitted'].values

  • Train a model (Python):

        import imodels
        import numpy as np
        m = imodels.FIGSClassifier(max_rules=5)
        m.fit(X, y)
        print(m)

  • Evaluate the model (Python):

        df_test = pd.DataFrame(dataset['test'])
        X_test = df_test.drop(columns=['readmitted'])
        y_test = df_test['readmitted'].values
        print('accuracy', np.mean(m.predict(X_test) == y_test))
