Diabetes Patient Records Dataset

Name: Diabetes Patient Records Dataset
Creator: 帕依提提
License: No description available

Indexed by 帕依提提 on 2024-03-04

Download link:

https://www.payititi.com/opendatasets/show-26034.html

Resource overview:

Data Set Information: The diabetes patient records come from two sources: an automatic electronic recording device and paper records. The automatic device has an internal clock to timestamp events, whereas the paper records provide only "logical time" slots (breakfast, lunch, dinner, bedtime). For paper records, fixed times were assigned to breakfast (08:00), lunch (12:00), dinner (18:00), and bedtime (22:00). Paper records therefore carry fictitious, uniform recording times, while electronic records carry more realistic timestamps.

Diabetes files consist of four fields per record. Each field is separated by a tab and each record is separated by a newline.

File names and format:
(1) Date in MM-DD-YYYY format
(2) Time in XX:YY format
(3) Code
(4) Value

The Code field is deciphered as follows:
33 = Regular insulin dose
34 = NPH insulin dose
35 = UltraLente insulin dose
48 = Unspecified blood glucose measurement
57 = Unspecified blood glucose measurement
58 = Pre-breakfast blood glucose measurement
59 = Post-breakfast blood glucose measurement
60 = Pre-lunch blood glucose measurement
61 = Post-lunch blood glucose measurement
62 = Pre-dinner blood glucose measurement
63 = Post-dinner blood glucose measurement
64 = Pre-snack blood glucose measurement
65 = Hypoglycemic symptoms
66 = Typical meal ingestion
67 = More-than-usual meal ingestion
68 = Less-than-usual meal ingestion
69 = Typical exercise activity
70 = More-than-usual exercise activity
71 = Less-than-usual exercise activity
72 = Unspecified special event

Attribute Information: The four fields listed above (date, time, code, value), tab-separated, one record per line.

Relevant Papers: N/A
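The four-field record layout and the code table described above can be sketched in Python as follows. This is a minimal illustration, not the provider's own tooling: the names `CODE_LABELS`, `Record`, and `parse_line` are hypothetical, the sample line is made up for demonstration, and the time field is assumed to be 24-hour HH:MM (consistent with the 08:00 to 22:00 slots mentioned above). The value is kept as a string because the source does not state its type.

```python
from datetime import datetime
from typing import NamedTuple

# Code table copied from the dataset description.
CODE_LABELS = {
    33: "Regular insulin dose",
    34: "NPH insulin dose",
    35: "UltraLente insulin dose",
    48: "Unspecified blood glucose measurement",
    57: "Unspecified blood glucose measurement",
    58: "Pre-breakfast blood glucose measurement",
    59: "Post-breakfast blood glucose measurement",
    60: "Pre-lunch blood glucose measurement",
    61: "Post-lunch blood glucose measurement",
    62: "Pre-dinner blood glucose measurement",
    63: "Post-dinner blood glucose measurement",
    64: "Pre-snack blood glucose measurement",
    65: "Hypoglycemic symptoms",
    66: "Typical meal ingestion",
    67: "More-than-usual meal ingestion",
    68: "Less-than-usual meal ingestion",
    69: "Typical exercise activity",
    70: "More-than-usual exercise activity",
    71: "Less-than-usual exercise activity",
    72: "Unspecified special event",
}


class Record(NamedTuple):
    timestamp: datetime
    code: int
    value: str  # kept as text; the source does not specify a numeric type
    label: str


def parse_line(line: str) -> Record:
    """Split one tab-separated record into its four fields."""
    date_s, time_s, code_s, value_s = line.rstrip("\n").split("\t")
    # Assumption: date is MM-DD-YYYY and time is 24-hour HH:MM.
    timestamp = datetime.strptime(f"{date_s} {time_s}", "%m-%d-%Y %H:%M")
    code = int(code_s)
    return Record(timestamp, code, value_s, CODE_LABELS.get(code, "Unknown code"))


# Hypothetical sample line, not taken from the dataset files.
record = parse_line("04-21-1991\t09:09\t58\t100\n")
print(record.timestamp.isoformat(), record.code, record.label, record.value)
```

Codes missing from the table are labeled "Unknown code" rather than rejected, so a file containing unexpected codes can still be read end to end.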

Provider:

帕依提提

Collected summary

Dataset introduction

Background and challenges

Background overview

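This dataset contains records of diabetes patients, drawn from automatic electronic recording devices and paper records. The coded entries cover insulin doses, blood glucose measurements, and daily activities such as meals and exercise. Each record is stored in a structured format with four fields (date, time, code, and value), which makes the data suitable for diabetes management and related research analysis.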
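As a rough illustration of how the coded entries could be grouped for analysis, the sketch below maps each code to a broad category and flags times that coincide with the four fixed slots assigned to paper records (08:00, 12:00, 18:00, 22:00). The category names and the helpers `code_category` and `logical_slot` are hypothetical, introduced only for this example. A match on a fixed slot is only a hint: an electronic record could also fall exactly on one of those times.

```python
from datetime import time
from typing import Optional

# Fixed times assigned to paper records, per the dataset description.
LOGICAL_TIMES = {
    time(8, 0): "breakfast",
    time(12, 0): "lunch",
    time(18, 0): "dinner",
    time(22, 0): "bedtime",
}


def code_category(code: int) -> str:
    """Map a record code to a broad, illustrative category."""
    if 33 <= code <= 35:
        return "insulin"
    if code == 48 or 57 <= code <= 64:
        return "blood glucose"
    if code == 65:
        return "hypoglycemic symptoms"
    if 66 <= code <= 68:
        return "meal"
    if 69 <= code <= 71:
        return "exercise"
    if code == 72:
        return "special event"
    return "unknown"


def logical_slot(t: time) -> Optional[str]:
    """Return the paper-record slot name if t matches a fixed time, else None."""
    return LOGICAL_TIMES.get(t)


print(code_category(58), logical_slot(time(8, 0)))
print(code_category(70), logical_slot(time(9, 9)))
```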

The content above was collected and summarized by 遇见数据集.