Development and validation of a GPT-based rater for assessing communication skills using the Gap-Kalamazoo Communication Skills Assessment Form
Source: DataCite Commons. Updated: 2026-01-07. Indexed: 2025-09-08.
Download link:
https://tandf.figshare.com/articles/dataset/Development_and_validation_of_a_GPT-based_rater_for_assessing_communication_skills_using_the_Gap-Kalamazoo_Communication_Skills_Assessment_Form/29649817/1
Description:
This study developed a generative pre-trained transformer (GPT)-based rater to assess communication skills using the Gap-Kalamazoo Communication Skills Assessment Form (GKCSAF), and examined its inter-rater reliability and concurrent validity. The GPT rater assessed 80 therapist-patient interaction transcripts that human raters had previously assessed. For inter-rater reliability, at the total-score level, the GPT rater’s assessments showed acceptable differences from human ratings (mean absolute error % [MAE%] = 12.2%–21.0%). However, intraclass correlation coefficients (ICC) with human ratings were low (0.00–0.35), which might be due to limited score variability. At the domain level, only four domains showed acceptable differences (MAE% ≤ 30.3%), and all nine domains showed poor agreement (weighted κ ≤ 0.38). For concurrent validity, the GPT rater’s assessments likewise showed acceptable differences from, but low ICC values with, average human scores at both the total-score level (MAE% = 10.8%–11.5%; ICC = 0.12–0.36) and the domain level (MAE% = 14.0%–30.3%; ICC = 0.00–0.37). Overall, the GPT rater may serve as a supplementary tool for providing total scores in low-stakes assessments of communication skills. Its performance at the domain level appears limited, highlighting the need for caution in domain interpretation and the importance of further refinement for high-stakes or detailed assessment contexts.
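To make the two difference and agreement statistics named in the description more concrete, the following is a minimal sketch of how MAE% and a weighted κ might be computed for paired ratings. It is not the study's code. The description does not define the MAE% normalisation or the κ weighting scheme, so this sketch assumes MAE% is the mean absolute error divided by the score range, and it assumes linear disagreement weights. The function names, the 1 to 5 scale, and the sample ratings are all hypothetical.

```python
from statistics import mean


def mae_percent(rater_a, rater_b, score_range):
    """Mean absolute error as a percentage of the score range.

    Assumption: normalising by the score range (max - min possible score);
    the source does not state which denominator was used.
    """
    errors = [abs(a - b) for a, b in zip(rater_a, rater_b)]
    return mean(errors) / score_range * 100


def weighted_kappa(rater_a, rater_b, categories):
    """Cohen's weighted kappa with linear disagreement weights (assumed)."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions of (rater_a category, rater_b category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed and chance-expected disagreement.
    obs_dis = 0.0
    exp_dis = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            obs_dis += w * observed[i][j]
            exp_dis += w * row[i] * col[j]

    if exp_dis == 0:
        # Both raters used a single identical category: kappa is undefined.
        return float("nan")
    return 1 - obs_dis / exp_dis


# Hypothetical domain ratings on an assumed 1-5 scale (not study data).
gpt_scores = [3, 4, 5]
human_scores = [3, 5, 3]
print(mae_percent(gpt_scores, human_scores, score_range=4))
print(weighted_kappa(gpt_scores, human_scores, categories=[1, 2, 3, 4, 5]))
```

A sketch like this also shows why the two statistics can diverge: when most scores fall in a narrow band, absolute differences stay small while chance-corrected agreement can remain low, which is consistent with the limited score variability the description mentions.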
Provider:
Taylor & Francis
Created:
2025-07-26



