Performance of large language models on Thailand’s national medical licensing examination: a cross-sectional study

Source: DataONE
Updated: 2025-06-18
Indexed: 2025-11-01
Download link:
https://search.dataone.org/view/sha256:d93d79133d595b8ddcd95f4224c4b1245837e7cb9afecf40d5de793c7efbf6cd
Description:
This study evaluated whether general-purpose large language models (LLMs) could help address inequities in preparation for Thailand’s National Medical Licensing Examination (ThaiNLE), which currently lacks standardized public study materials. We assessed four multimodal LLMs (GPT-4, Claude 3 Opus, Gemini 1.0 Pro, and Gemini 1.5 Pro) on a 304-question ThaiNLE Step 1 mock examination (10.2% image-based), using deterministic API configurations and five inference repetitions per model. Performance was measured with micro- and macro-accuracy and compared against historical passing thresholds.
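The difference between the two accuracy metrics can be illustrated with a minimal sketch. It assumes that micro-accuracy pools all questions and that macro-accuracy is the unweighted mean of per-topic accuracies. The description does not say how questions were grouped or how the five repetitions were aggregated, so the topic labels, function names, and toy data below are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict

def micro_accuracy(records):
    """Fraction of all questions answered correctly, pooled across topics."""
    return sum(correct for _, correct in records) / len(records)

def macro_accuracy(records):
    """Unweighted mean of per-topic accuracies (assumed grouping: topic)."""
    by_topic = defaultdict(list)
    for topic, correct in records:
        by_topic[topic].append(correct)
    per_topic = [sum(answers) / len(answers) for answers in by_topic.values()]
    return sum(per_topic) / len(per_topic)

# Hypothetical toy data, not from the study: (topic, answered_correctly)
records = [
    ("anatomy", True), ("anatomy", True), ("anatomy", True), ("anatomy", False),
    ("genetics", False), ("genetics", True),
]

print(f"micro-accuracy: {micro_accuracy(records):.3f}")
print(f"macro-accuracy: {macro_accuracy(records):.3f}")
```

With unequal topic sizes the two numbers differ: micro-accuracy is dominated by the larger topics, while macro-accuracy gives every topic equal weight.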
Created:
2025-10-29