Benchmarking ChatGPT and Other Large Language Models for Personalized Stage-Specific Dietary Recommendations in Chronic Kidney Disease

Indexed by NIAID Data Ecosystem on 2026-05-10

Download link:

https://doi.org/10.7910/DVN/SY0TSM

Resource summary:

In case of using our dataset, please cite our work:
Kairat, M., Adilmetova, G., Ibraimova, I., Gaipov, A., Varol, H. A., & Chan, M. Y. (2025). Benchmarking ChatGPT and Other Large Language Models for Personalized Stage-Specific Dietary Recommendations in Chronic Kidney Disease. Journal of Clinical Medicine, 14(22), 8033. https://doi.org/10.3390/jcm14228033

Description:
This repository presents a structured evaluation of large language model (LLM)-generated nutritional recommendations for patients with chronic kidney disease (CKD). It includes five detailed mock patient profiles, standardized prompts, responses from three state-of-the-art LLMs, and expert evaluations. The goal is to assess how effectively LLMs can provide stage-appropriate, culturally relevant dietary guidance for CKD management. Patient profiles vary in complexity and represent different stages of CKD and common comorbidities, simulating real-world clinical use cases.

LLMs Evaluated
Each patient case was submitted to the following LLMs:
ChatGPT-4 (OpenAI)
Gemini (Google)
Copilot (Microsoft)
Each model generated responses to the same prompt, allowing for comparison across LLMs.

Prompt Used
All LLMs received the same standardized prompt for each patient case:
Provide a culturally appropriate, stage-specific dietary plan (breakfast, lunch, and dinner) for this patient with chronic kidney disease (CKD). Consider dietary restrictions (e.g., sodium, potassium, phosphorus, and protein intake) and incorporate foods commonly consumed in Central Asia.

Evaluation Criteria
Three domain experts (nutritionists and/or nephrologists) evaluated each LLM-generated response based on the following criteria:
Personalization: How well the response reflects the patient's specific clinical condition and context.
Consistency with Guidelines: Adherence to evidence-based dietary recommendations for CKD.
Practicality and Availability: Feasibility and cultural relevance of the proposed meal plan using foods commonly consumed in Central Asia.

The repository contains the following files and folders:
CKD 5 cases.docx: Five detailed mock patient profiles with varying CKD stages and clinical complexity.
Evaluation_Criteria.docx: Description of the evaluation rubric used by experts, including definitions for personalization, guideline consistency, and practicality/availability.
Prompts-20250703T071939Z-1-001.zip: Contains standardized prompts and LLM-generated responses for each CKD case. Responses are organized by case and model (ChatGPT-4, Gemini, Copilot).
Evaluations-20250703T071933Z-1-001.zip: Contains expert evaluation data, including scores and comments for each LLM response across all criteria.
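
The evaluation design described above (five cases, three models, three experts, three criteria) could be tabulated along the following lines. This is a minimal sketch, not the authors' analysis code: the record layout `(case, model, expert, criterion, score)`, the function names, and the numeric scores in `demo` are all hypothetical, and the actual files in the evaluation archive may be organized differently.

```python
from collections import defaultdict
from itertools import product
from statistics import mean

# Model and criterion names are taken from the dataset description.
MODELS = ["ChatGPT-4", "Gemini", "Copilot"]
CRITERIA = [
    "Personalization",
    "Consistency with Guidelines",
    "Practicality and Availability",
]
N_CASES = 5
N_EXPERTS = 3


def expected_keys():
    """List every (case, model, expert, criterion) cell the design implies,
    assuming each expert scored every response on every criterion."""
    return list(product(range(1, N_CASES + 1), MODELS,
                        range(1, N_EXPERTS + 1), CRITERIA))


def mean_scores(records):
    """Average scores per (model, criterion) over cases and experts.

    `records` is an iterable of (case, model, expert, criterion, score)
    tuples. This layout is an assumption, not the repository's format.
    """
    buckets = defaultdict(list)
    for case, model, expert, criterion, score in records:
        buckets[(model, criterion)].append(score)
    return {key: mean(values) for key, values in buckets.items()}


# Placeholder records with made-up scores, for illustration only.
demo = [
    (1, "Gemini", 1, "Personalization", 4),
    (1, "Gemini", 2, "Personalization", 5),
    (1, "Gemini", 3, "Personalization", 3),
]
print(len(expected_keys()))  # 5 * 3 * 3 * 3 = 135
print(mean_scores(demo))
```

Comparing the number of parsed records against `len(expected_keys())` is one simple way to check that no expert rating was lost while extracting scores from the archive.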

Created:

2026-01-29