
Long Term Horizon Predictions and Feature Explainability of Time Series Continuous Glucose Monitor Data. In Data Science & Engineering Master of Advanced Study (DSE MAS) Capstone Projects

Source: DataCite Commons. Updated: 2026-04-17. Indexed: 2025-04-16.
Download link:
https://library.ucsd.edu/dc/object/bb9329453v
Description:
Completed as a Capstone project for the Data Science & Engineering MAS program (DSE260 Capstone), this project uses Dexcom Continuous Glucose Monitor (CGM) data to predict and describe how much time patients with Type 2 diabetes will spend above or below healthy glucose levels. The time series data, recorded at 5-minute intervals over 365 days for 8000 patients, is used to extract features including entropy (a measure of predictability) and variance (from Poincaré plots). These features, alongside demographic data such as treatment type, patient age, and patient sex, are used in XGBoost classification models to predict whether the amount of time a patient will spend out of range the next day is more than, less than, or the same as the amount of time the patient spent out of range the day before. The XGBoost classification models also provide feature importance, which tells the patient which features affect their outcome the most. This project used pre-existing data from Dexcom. The data cannot be shared because of Dexcom's licensing terms, and we did not receive permission to share the Dexcom model.
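The feature and label construction described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's code, which is not shared: the 70 to 180 mg/dL range is a commonly used CGM target range and is not stated in this record, the SD1/SD2 formulas are the conventional Poincaré plot descriptors and may differ from the variance feature the authors computed, and all function names and the `tolerance` parameter are hypothetical.

```python
import numpy as np

SAMPLES_PER_DAY = 288  # 24 hours of readings at 5-minute intervals
# Assumed target range in mg/dL; the record does not state the thresholds used.
LOW_MG_DL, HIGH_MG_DL = 70.0, 180.0


def minutes_out_of_range(day):
    """Minutes in one day of 5-minute readings spent below or above the range."""
    day = np.asarray(day, dtype=float)
    out = (day < LOW_MG_DL) | (day > HIGH_MG_DL)
    return 5.0 * float(np.count_nonzero(out))


def poincare_sd(day):
    """Conventional Poincaré descriptors for the lag-1 plot of x[n+1] against x[n].

    SD1 is the spread across the identity line (short-term variability),
    SD2 is the spread along it (longer-term variability).
    """
    day = np.asarray(day, dtype=float)
    x, y = day[:-1], day[1:]
    sd1 = np.sqrt(np.var(y - x) / 2.0)
    sd2 = np.sqrt(np.var(y + x) / 2.0)
    return float(sd1), float(sd2)


def change_label(prev_minutes, next_minutes, tolerance=0.0):
    """Three-class target: is next-day time out of range more, less, or the same?

    `tolerance` is a hypothetical dead band in minutes for the "same" class.
    """
    diff = next_minutes - prev_minutes
    if diff > tolerance:
        return "more"
    if diff < -tolerance:
        return "less"
    return "same"
```

Per-day values from `minutes_out_of_range` and `poincare_sd`, together with an entropy estimate and the demographic fields, would form one feature row per patient-day, with `change_label` supplying the class that a gradient-boosted classifier is trained to predict.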

Provider:
UC San Diego Library Digital Collections
Created:
2024-01-19