CT-Agent: a multimodal-LLM agent for 3D CT radiology question answering

中国科学数据 (China Scientific Data) | Updated 2026-04-24 | Indexed 2026-04-25

Download link:

https://www.sciengine.com/AA/doi/10.1007/s11432-025-4818-7

Resource description:

Computed tomography (CT) scans can produce 3D volumetric medical data, which is viewed as hundreds of cross-sectional images (slices) and provides detailed anatomical information for diagnosis. Creating CT radiology reports is time-consuming and error-prone for radiologists. A visual question answering (VQA) system is needed to answer radiologists' anatomical questions about CT scans and to automatically generate radiology reports. However, existing VQA systems cannot adequately handle the CT radiology question answering (CTQA) task due to anatomic complexity, which makes CT images difficult to understand, and spatial relationships across hundreds of slices, which are difficult to capture. To address these challenges, this study proposes CT-Agent, a multimodal agentic framework for CTQA. CT-Agent uses anatomically independent tools to break down anatomic complexity and captures across-slice spatial relationships via global-local token compression. Experimental results on the CT-RATE and RadGenome-Chest CT datasets verify its superior performance.

Created:

2026-03-02
