The performance of ChatGPT-4.0o in medical imaging evaluation: a preliminary investigation

Name: The performance of ChatGPT-4.0o in medical imaging evaluation: a preliminary investigation
Creator: Charles Sturt University
License: No description available

Source: Research Data Australia (indexed 2024-12-14)

Download link:

https://researchdata.edu.au/the-performance-chatgpt-preliminary-investigation/3399501

Resource description:

This study investigated the performance of ChatGPT-4.0o in evaluating the quality of positioning in radiographic images. Thirty radiographs depicting a variety of knee, elbow, ankle, hand, pelvis, and shoulder projections were produced using anthropomorphic phantoms and uploaded to ChatGPT-4.0o. The model was prompted to identify any positioning errors, justify each one, and suggest improvements. A panel of radiographers assessed the solutions for radiographic quality based on established positioning criteria, with a grading scale of 1–5. In only 20% of projections, ChatGPT-4.0o correctly recognized all errors with justifications and offered correct suggestions for improvement. The most commonly occurring score was 3 (9 cases, 30%), wherein the model recognized at least 1 specific error and provided a correct improvement. The mean score was 2.9. Overall, low accuracy was demonstrated, with most projections receiving only partially correct solutions. The findings reinforce the importance of robust radiography education and clinical experience.
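The percentages and counts quoted in the description can be cross-checked with a little arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, it uses only the figures stated above, and it does not reproduce the underlying per-image scores, which are not given here. Because the mean is reported to one decimal place, the sketch lists every integer score total that would round to it rather than assuming a single value.

```python
# Figures quoted in the resource description (no per-image data is used).
TOTAL_RADIOGRAPHS = 30
FULLY_CORRECT_SHARE = 0.20   # projections with all errors recognized and corrected
MODAL_SCORE_CASES = 9        # projections that received the most common score (3)
REPORTED_MEAN = 2.9          # mean score, reported to one decimal place


def count_from_share(share: float, total: int) -> int:
    """Convert a reported share into a whole number of cases."""
    return round(share * total)


def share_from_count(count: int, total: int) -> float:
    """Convert a reported case count into a share of the total."""
    return count / total


def totals_consistent_with_mean(mean: float, n: int, decimals: int = 1) -> list[int]:
    """List integer score totals (scores of 1 to 5 each) whose mean rounds to `mean`."""
    return [t for t in range(n, 5 * n + 1) if round(t / n, decimals) == mean]


if __name__ == "__main__":
    print(count_from_share(FULLY_CORRECT_SHARE, TOTAL_RADIOGRAPHS))
    print(share_from_count(MODAL_SCORE_CASES, TOTAL_RADIOGRAPHS))
    print(totals_consistent_with_mean(REPORTED_MEAN, TOTAL_RADIOGRAPHS))
```

Under these figures, 20% of 30 projections corresponds to 6 fully correct solutions, 9 of 30 matches the stated 30%, and a mean of 2.9 is consistent with a summed score of 86, 87, or 88.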

Provided by:

Charles Sturt University