
Diabetes Diagnosis Dataset for Predictive Modeling and Machine Learning

Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/t8hnnyr5ph
Description:
This dataset contains clinical and demographic information of patients collected for the purpose of diabetes prediction and analysis using machine learning techniques. The dataset is structured in CSV format and includes several key medical attributes that are widely used in diabetes diagnosis and research studies.

The primary objective of this dataset is to facilitate binary classification of diabetes status and support research in medical data analysis, predictive modeling, feature selection, and explainable artificial intelligence (XAI). It can be used for developing, training, validating, and benchmarking machine learning and deep learning models for early-stage diabetes detection.

Dataset Structure

Each row in the dataset represents a single patient record, while each column corresponds to a clinical or physiological feature. The dataset includes the following attributes:

1. Pregnancies: Number of times pregnant
2. Glucose: Plasma glucose concentration
3. BloodPressure: Diastolic blood pressure (mm Hg)
4. SkinThickness: Triceps skin fold thickness (mm)
5. Insulin: 2-Hour serum insulin (mu U/ml)
6. BMI: Body Mass Index (weight in kg / height in m²)
7. DiabetesPedigreeFunction: A function indicating hereditary influence
8. Age: Age of the patient (years)
9. Outcome: Diabetes status (0 = Non-diabetic, 1 = Diabetic)

Key Features

1. Clean tabular structure suitable for supervised learning tasks
2. Balanced numerical attributes for classification modeling
3. Supports feature selection, optimization, and explainable AI analysis
4. Applicable for academic research, teaching, and benchmarking

Potential Applications

1. Diabetes risk prediction
2. Medical decision support systems
3. Machine learning classification experiments
4. Feature importance analysis
5. Explainable AI (XAI) studies using SHAP, LIME, Grad-CAM (for hybrid models)

File Format

CSV (.csv)

Usage Rights

This dataset is shared for research and educational purposes only.
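The column layout described above can be checked before any modeling. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses exactly the attribute names listed in the description, which has not been verified against the actual file. The function name `load_records` and the two sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io

# Column names as listed in the dataset description (assumed to match the CSV header).
EXPECTED_COLUMNS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome",
]
FEATURE_COLUMNS = EXPECTED_COLUMNS[:-1]


def load_records(csv_file):
    """Read patient records from an open CSV file and split features from labels.

    Raises ValueError if the header does not contain exactly the described
    columns, or if an Outcome value is not 0 or 1.
    """
    reader = csv.DictReader(csv_file)
    if set(reader.fieldnames or []) != set(EXPECTED_COLUMNS):
        raise ValueError(f"unexpected header: {reader.fieldnames}")

    features, labels = [], []
    for line_number, row in enumerate(reader, start=2):  # header is line 1
        outcome = int(row["Outcome"])
        if outcome not in (0, 1):
            raise ValueError(f"line {line_number}: Outcome must be 0 or 1")
        features.append([float(row[name]) for name in FEATURE_COLUMNS])
        labels.append(outcome)
    return features, labels


# Made-up rows for illustration only; these are NOT values from the dataset.
SAMPLE_CSV = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
2,110,70,25,90,27.5,0.35,31,0
4,160,82,33,180,34.1,0.62,47,1
"""

features, labels = load_records(io.StringIO(SAMPLE_CSV))
print(len(features), "records,", sum(labels), "labeled diabetic")
```

To apply the same check to the downloaded file, pass an open file handle in place of the in-memory sample, for example `load_records(open(path, newline=""))`, where `path` is wherever the CSV was saved. The resulting feature rows and 0/1 labels can then be handed to any binary classifier.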
Created: 2026-02-15