Intelligent documentation in medical education: Can AI replace manual case logging?

Source: DataCite Commons. Updated 2026-05-04; indexed 2026-05-10.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.3tx95x6x0
Description:
This study investigates the feasibility of using large language models (LLMs) to automate procedural case log documentation in radiology training. We evaluate whether AI can replace manual logging, identify procedure types most challenging for extraction, and assess integration into clinical workflows. We retrospectively analyzed 36,659 radiology reports authored by nine interventional radiology residents (2018–2024). A subset of 414 reports was manually annotated for 39 procedures spanning vascular diagnosis, vascular intervention, and non-vascular intervention. Candidate models, Qwen-2.5 and Claude-3.5, were chosen based on privacy, hardware constraints, and availability, and tested under instruction and chain-of-thought prompting. A crosswalk baseline using structured exam codes provided comparison. Performance was measured by sensitivity, specificity, and F1-score, along with inference time and token efficiency to estimate operational cost. Both local and commercial LLMs outperformed the crosswalk benchmark. Qwen-2.5 achieved sensitivities up to 94.19% and F1-scores of 86.66 with chain-of-thought prompting, while Claude-3.5-Haiku reached an F1-score of 86.89 and specificity of 99.29%. Errors were concentrated in ambiguous “other” procedures, whereas common procedures were reliably classified. Chain-of-thought prompting reduced false positives relative to instruction prompting. Commercial inference delivered sub-2s latency and concise outputs, while local deployment traded speed for lower recurring cost. Automation could save more than 35 hours of manual annotation per resident annually. LLMs thus offer a scalable, accurate, and cost-efficient solution for radiology case log documentation. Optimizing for procedure-specific challenges and ensuring seamless integration with existing systems will be essential. Future work should validate across larger, multi-institution datasets and explore additional prompting strategies.
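For readers less familiar with the metrics named in the description, the following is a minimal sketch of how sensitivity, specificity, and F1-score are conventionally computed from per-report procedure labels. It is not the study's evaluation code: the names `Counts`, `count_labels`, `sensitivity`, `specificity`, and `f1_score` are hypothetical, the procedure names are placeholders, and the counts are made up for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Confusion counts for a multi-label extraction task (hypothetical helper)."""
    tp: int = 0  # procedure predicted and annotated
    fp: int = 0  # procedure predicted but not annotated
    tn: int = 0  # procedure neither predicted nor annotated
    fn: int = 0  # procedure annotated but not predicted


def count_labels(predicted: set, annotated: set, all_procedures: set) -> Counts:
    """Compare one report's predicted labels with its manual annotation."""
    return Counts(
        tp=len(predicted & annotated),
        fp=len(predicted - annotated),
        fn=len(annotated - predicted),
        tn=len(all_procedures - predicted - annotated),
    )


def sensitivity(c: Counts) -> float:
    """Share of annotated procedures that were found: TP / (TP + FN)."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def specificity(c: Counts) -> float:
    """Share of absent procedures correctly left out: TN / (TN + FP)."""
    denom = c.tn + c.fp
    return c.tn / denom if denom else 0.0


def f1_score(c: Counts) -> float:
    """Harmonic mean of precision and sensitivity: 2TP / (2TP + FP + FN)."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


# Illustrative only: placeholder procedure names, not taken from the dataset.
procedures = {"angiography", "embolization", "drainage", "biopsy"}
example = count_labels(
    predicted={"angiography", "embolization"},
    annotated={"embolization", "drainage"},
    all_procedures=procedures,
)
print(sensitivity(example), specificity(example), f1_score(example))
```

In practice the counts would be summed over all annotated reports (or per procedure type) before the metrics are computed; which aggregation the study used is not stated in the description.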
Provider:
Dryad
Created:
2026-05-04