SPSS data file for statistical analyses.

Figshare · Updated 2025-09-12 · Indexed 2026-04-28

Download link:

https://figshare.com/articles/dataset/SPSS_data_file_for_statistical_analyses_/30115407

Description:

Background: This study evaluates the diagnostic performance of three multimodal large language models (LLMs), ChatGPT-4o, Gemini 2.0, and Claude 3.5, in identifying pneumothorax from chest radiographs.

Methods: In this retrospective analysis, 172 pneumothorax cases (148 patients aged >12 years, 24 patients aged ≤12 years) with both chest radiographs and confirmatory thoracic CT were included from a tertiary emergency department. Patients were categorized by age and pneumothorax size (small/large). Each radiograph was presented to all three LLMs accompanied by basic symptoms (dyspnea or chest pain), with each model analyzing each image three times. Diagnostic accuracy was evaluated using overall accuracy (all three responses correct), strict accuracy (≥2 responses correct), and ideal accuracy (≥1 response correct), alongside response consistency assessment using Fleiss' Kappa.

Results: In patients older than 12 years, ChatGPT-4o demonstrated the highest overall accuracy (69.6%), followed by Claude 3.5 (64.9%) and Gemini 2.0 (57.4%). Performance was significantly poorer in pediatric patients across all models (20.8%, 12.5%, and 20.8%, respectively). For large pneumothorax in adults, ChatGPT-4o showed significantly higher accuracy compared to small pneumothorax (81.6% vs. 42.2%; p [...]

Conclusion: This study, the first to evaluate these three current multimodal LLMs in pneumothorax identification across different age groups, demonstrates promising results for potential clinical applications, particularly for adult patients with large pneumothorax. However, performance limitations in pediatric cases and with small pneumothoraces highlight the need for further validation before clinical implementation.
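The three accuracy tiers and the Fleiss' Kappa consistency measure named in the Methods can be illustrated with a minimal Python sketch. This is not the authors' procedure (the dataset is an SPSS file, and their exact workflow is not described here). The function names `accuracy_tiers` and `fleiss_kappa` and the toy responses are hypothetical, and the sketch assumes each of the three runs per image has already been scored as correct or incorrect.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy_tiers(responses: Sequence[Tuple[bool, bool, bool]]) -> Dict[str, float]:
    """Compute the three accuracy tiers, using the definitions given in the abstract.

    responses: one 3-tuple per radiograph; True means that run was correct.
    """
    n_cases = len(responses)
    correct_counts = [sum(runs) for runs in responses]
    return {
        "overall": sum(c == 3 for c in correct_counts) / n_cases,  # all three correct
        "strict": sum(c >= 2 for c in correct_counts) / n_cases,   # at least two correct
        "ideal": sum(c >= 1 for c in correct_counts) / n_cases,    # at least one correct
    }


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects-by-categories table of rating counts.

    counts[i][j] is how many of the n ratings put subject i in category j;
    every row must sum to the same n (here, n = 3 runs per image).
    """
    n_subjects = len(counts)
    n_ratings = sum(counts[0])
    n_categories = len(counts[0])
    if any(sum(row) != n_ratings for row in counts):
        raise ValueError("every subject needs the same number of ratings")

    # Share of all ratings that fall in each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_ratings)
        for j in range(n_categories)
    ]
    # Per-subject agreement among the n ratings.
    p_i = [
        (sum(c * c for c in row) - n_ratings) / (n_ratings * (n_ratings - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects        # mean observed agreement
    p_e = sum(p * p for p in p_j)        # agreement expected by chance
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Toy data (hypothetical, not taken from the dataset): four radiographs, three runs each.
toy_responses: List[Tuple[bool, bool, bool]] = [
    (True, True, True),
    (True, True, False),
    (False, False, False),
    (True, False, False),
]
# Convert each case to [number correct, number incorrect] for the kappa table.
toy_counts = [[sum(r), 3 - sum(r)] for r in toy_responses]

tiers = accuracy_tiers(toy_responses)
kappa = fleiss_kappa(toy_counts)
```

Treating the three runs of one model as three "raters" is an assumption about how the consistency analysis was set up; the abstract does not spell this out.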

Created:

2025-09-12
