Diagnostic Accuracy of ChatGPT in Dermatology: A Meta-Analysis of Textual versus Visual Prompts
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://data.mendeley.com/datasets/s2rc2d222f
Description:
This dataset provides the full set of supplemental materials for the meta-analysis titled "Diagnostic Accuracy of ChatGPT in Dermatology: A Meta-Analysis of Textual versus Visual Prompts." The supplemental content includes detailed search strategies, statistical methodologies, study selection process, and comparative performance analyses of ChatGPT models across multiple variables.
Supplemental Figure 1:
PRISMA diagram illustrating the study selection process. Records identified from PubMed and SCOPUS were screened, with exclusions for duplicates, irrelevant topics, non-dermatologic evaluations, non-ChatGPT models, and missing accuracy data. A total of 17 studies were included in the final meta-analysis.
Supplemental Table 1:
Search criteria and statistical analysis methodology. Details search terms, article inclusion and exclusion criteria, and the full statistical approach, including descriptive statistics, Welch’s t-tests, multivariable logistic regression, and Egger’s test for publication bias.
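As an illustration of one method named above, the following is a minimal sketch of Welch's t-test, which compares two group means without assuming equal variances. The function name `welch_t` and the accuracy values are hypothetical and are not taken from the dataset; the authors' actual software and inputs are not described here.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical per-study accuracy values, for illustration only.
text_acc = [0.80, 0.70, 0.90, 0.60]
image_acc = [0.50, 0.40, 0.60, 0.50]
t, df = welch_t(text_acc, image_acc)
print(f"t = {t:.3f}, df = {df:.2f}")
```

A p-value would then come from the t distribution with `df` degrees of freedom; a statistics library is normally used for that step.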
Supplemental Table 2:
Frequency of tested dermatologic conditions among included studies. Lists all dermatologic conditions evaluated more than twice, with counts and frequency percentages.
Supplemental Table 3:
Average diagnostic accuracy of ChatGPT-4, ChatGPT-4o, and overall model performance. Summarizes top diagnosis and differential diagnosis accuracy rates, including 95% confidence intervals and results of t-tests comparing models.
Supplemental Table 4:
Comparison of ChatGPT diagnostic performance between textual and visual prompts. Reports differential accuracy percentages and confidence intervals for both input modalities.
Supplemental Table 5:
Comparison of diagnostic performance for malignant and benign lesion classifications stratified by textual versus visual prompts. Provides accuracy rates for each subgroup.
Supplemental Table 6:
Fitzpatrick skin phototype analysis of diagnostic performance. Evaluates differential accuracy across lighter (Fitzpatrick I-II) versus darker (Fitzpatrick III+) skin tones, with significance testing results.
Supplemental Table 7:
Comparison of diagnostic performance between studies utilizing public versus private data sources. Reports accuracy rates and statistical comparisons between the two groups.
Supplemental Table 8:
Analysis of ChatGPT performance by year of model accession (2023–2025). Highlights trends in diagnostic accuracy over time across studies included in the meta-analysis.
Created:
2025-04-28



