"PredictGL-NC: A Nutritional Composition Dataset for Glycemic Load Prediction"
Source: DataCite Commons. Updated: 2026-04-17. Indexed: 2026-05-03.
Download link:
https://ieee-dataport.org/documents/predictgl-nc-dataset
Resource description:
"Glycemic load (GL) is a practical indicator of the glycemic impact of foods because it reflects both carbohydrate quality and carbohydrate quantity. Despite its relevance to nutrition research, dietary planning, and glycemic management, no single publicly available dataset was identified that is specifically structured for machine-learning-based GL prediction using nutritional composition variables. Existing glycemic index (GI) and GL resources are informative, but they are typically dispersed across published studies and reference databases, reported in heterogeneous formats, and not readily organized for computational modeling. This limits their direct use in regression analysis, artificial intelligence applications, and reproducible food informatics research. To address this gap, a curated food-level dataset was developed by integrating glycemic indicators with key nutritional composition variables.

The dataset was compiled through a structured literature-based data collection process combined with database extraction. Relevant records were identified from Google Scholar, PubMed, Scopus, Web of Science, and the Sydney University Glycemic Index database. The search strategy targeted studies reporting glycemic index, glycemic load, and nutritional composition data, using terms related to glycemic response, available carbohydrate, macronutrient composition, mixed meals, and common food groups. Reference lists of relevant studies were also screened to identify additional eligible records. Both full-text papers and abstracts were considered during the screening process.

Eligibility criteria were defined to support consistency and suitability for predictive modeling. Only English-language human in vivo studies conducted using standardized glycemic assessment methods were considered. In vitro studies, studies involving non-healthy populations, and records lacking sufficient glycemic or nutritional composition data were excluded.

A total of 184 records were initially identified. After removal of 23 duplicate records, 161 studies remained for screening. Of these, 12 in vitro studies and 18 studies involving non-healthy individuals were excluded. Following assessment of data completeness, 106 studies were retained for dataset development.

From these eligible studies, 612 food items were extracted. These records were supplemented with 163 additional food items obtained from the Sydney University Glycemic Index database, resulting in a final compiled dataset of 775 food items. For each food item, the dataset includes food name, glycemic index, available carbohydrate per 100 g, protein per 100 g, total fat per 100 g, and total dietary fibre per 100 g. An additional independent validation dataset comprising 74 food items was compiled separately using the same procedure and kept distinct from the development dataset.

This dataset was created to support machine-learning-based glycemic load prediction and related nutrition informatics applications. It is expected to be useful for predictive modeling, regression benchmarking, dietary decision-support systems, and digital health research involving food-level glycemic assessment."
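
The description lists GI and available carbohydrate per 100 g among the fields but does not spell out how a GL value follows from them. The following is a minimal sketch, assuming the standard definition GL = GI × available carbohydrate (g) / 100. The `FoodItem` class, its field names, and the sample values are hypothetical and are not taken from the dataset files. The listing also does not say whether the dataset stores GL or serving sizes, so the serving weight here is an input supplied by the user.

```python
from dataclasses import dataclass


@dataclass
class FoodItem:
    # Hypothetical field names; the listing does not give the actual column headers.
    name: str
    glycemic_index: float    # GI on the glucose = 100 scale
    available_carb_g: float  # available carbohydrate, g per 100 g of food
    protein_g: float         # g per 100 g
    fat_g: float             # g per 100 g
    fibre_g: float           # g per 100 g


def glycemic_load(gi: float, available_carb_g: float) -> float:
    """Standard definition: GL = GI * available carbohydrate (g) / 100."""
    if gi < 0 or available_carb_g < 0:
        raise ValueError("GI and carbohydrate must be non-negative")
    return gi * available_carb_g / 100.0


def gl_per_serving(item: FoodItem, serving_g: float) -> float:
    """Scale the per-100 g carbohydrate value to a serving, then compute GL."""
    carbs_in_serving = item.available_carb_g * serving_g / 100.0
    return glycemic_load(item.glycemic_index, carbs_in_serving)


# Made-up values for illustration only, not a record from the dataset.
example = FoodItem("example food", 50.0, 20.0, 3.0, 1.0, 2.0)
gl_100g = glycemic_load(example.glycemic_index, example.available_carb_g)
gl_150g = gl_per_serving(example, serving_g=150.0)
```

Because the carbohydrate field is expressed per 100 g, `glycemic_load` applied directly to a record gives GL per 100 g of food. A per-serving figure needs a serving weight from another source.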
Provider:
IEEE DataPort
Created:
2026-04-17



