Supplementary Material for: Evaluation of AI-Based Chatbots in Liver Cancer Information Dissemination: A Comparative Analysis of GPT, DeepSeek, Copilot, and Gemini

Figshare. Updated 2025-06-06; indexed 2026-04-28

Download link:

https://figshare.com/articles/dataset/Supplementary_Material_for_Evaluation_of_AI-Based_Chatbots_in_Liver_Cancer_Information_Dissemination_A_Comparative_Analysis_of_GPT_DeepSeek_Copilot_and_Gemini/29254325

Resource description:

Background/Objectives: This study aimed to evaluate AI-based chatbots (GPT, DeepSeek, Copilot, Gemini) in disseminating information on liver cancer, emphasizing content quality, adherence to established guidelines, and ease of comprehension. Methods: Between January and February 2025, four chatbots were examined using publicly accessible free versions lacking independent reasoning capabilities. Three frequently searched Google Trends questions (“What is liver cancer awareness?”, “What are the symptoms of liver cancer?”, and “Is liver cancer treatable?”) were posed. Their responses were assessed via the DISCERN instrument, Coleman-Liau Index, Patient Education Materials Assessment Tool for Print, and alignment with American Association for the Study of Liver Diseases, National Comprehensive Cancer Network, and European Society for Medical Oncology recommendations. Statistical analysis was performed using SPSS 22. Results: All chatbots largely provided relevant and impartial information. GPT and DeepSeek scored lower on specifying information sources and update timelines, whereas Copilot omitted local therapies (e.g., Radiofrequency Ablation, Transarterial Chemoembolization, Transarterial Radioembolization), resulting in reduced scientific accuracy. Gemini and Copilot performed better in “Understandability,” while GPT and DeepSeek excelled in “Actionability.” Although GPT demonstrated consistency across multiple treatment options, it did not explicitly reference international guidelines. Study limitations included language constraints, variations in chatbot updates, and reliance on a single inquiry round. Conclusions: AI chatbots show potential as initial informational tools for liver cancer but cannot replace professional medical consultation. In complex diseases requiring multidisciplinary management, frequent guideline-based updates, expert validation, and diverse data sources are critical to enhancing clinical relevance and patient outcomes.
Keywords: Liver Cancer, Artificial Intelligence, Clinical Decision Support, Chatbots, Oncology.
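
The Coleman-Liau Index named in the Methods is a readability formula, CLI = 0.0588·L − 0.296·S − 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words. The abstract does not say which tool or tokenization rules the authors used, so the following is only a minimal sketch under assumed rules (whitespace-separated words, sentences counted by runs of `.`, `!`, or `?`); the function name `coleman_liau_index` is hypothetical.

```python
import re


def coleman_liau_index(text: str) -> float:
    """Approximate US grade level of `text` via the Coleman-Liau formula.

    Assumptions (not taken from the study): words are whitespace-separated
    tokens, and each run of '.', '!' or '?' ends one sentence, so
    abbreviations such as "e.g." will inflate the sentence count.
    """
    words = text.split()
    if not words:
        raise ValueError("text contains no words")

    letters = sum(ch.isalpha() for ch in text)
    # Treat text without any terminator as a single sentence.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))

    letters_per_100_words = letters / len(words) * 100
    sentences_per_100_words = sentences / len(words) * 100
    return 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8


if __name__ == "__main__":
    # One of the three questions posed to the chatbots in the study.
    print(round(coleman_liau_index("Is liver cancer treatable?"), 2))
```

For the four-word question above (22 letters, one sentence), L = 550 and S = 25, giving an index of roughly 9.1. Scores for such short inputs are not very meaningful; the formula is intended for longer passages such as full chatbot responses.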

Created: 2025-06-06