FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark

Name: FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark
Creator: PhysioNet
Published: 2025-01-21 21:24:45
License: No description available

Source: DataCite Commons. Updated: 2025-01-21. Indexed: 2025-04-16.

Download link:

https://physionet.org/content/ffa-ir-medical-report/1.0.0/

Resource description:

Automatic medical report generation (MRG), which aims to describe life-threatening lesions in medical images such as chest X-rays and Fundus Fluorescein Angiography (FFA), has been a long-standing research topic in machine learning and automatic medical diagnosis. However, existing MRG benchmarks provide only medical images and free-text reports, without explainable annotations or reliable evaluation tools, which hinders research in two ways. First, existing methods can only predict reports without an accurate explanation, undermining the trustworthiness of the diagnostic methods. Second, comparisons between reports predicted by MRG methods are unreliable when based on natural language generation (NLG) metrics. To address these issues, we propose an explainable and reliable MRG benchmark based on FFA Images and Reports (FFA-IR). Specifically, the FFA-IR dataset has the following features: 1) Large-scale medical dataset. FFA-IR collects 10,790 reports along with 1,048,584 FFA images from clinical practice. 2) Explainable annotation. FFA-IR annotates 46 categories of lesions with a total of 12,166 regions. 3) Bilingual reports. FFA-IR provides both English and Chinese reports for each case. We hope that FFA-IR can significantly advance research in both the vision-and-language and medical fields and improve the conventional retinal disease diagnosis procedure.
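
To give a rough sense of scale, the counts quoted in the description can be turned into per-report and per-category averages. The following is a minimal sketch, not part of the dataset tooling: the constant names and the `mean_ratio` helper are hypothetical, and the results are plain averages that say nothing about how images or regions are actually distributed across cases.

```python
# Counts quoted from the dataset description; names are illustrative only.
NUM_REPORTS = 10_790
NUM_IMAGES = 1_048_584
NUM_LESION_CATEGORIES = 46
NUM_ANNOTATED_REGIONS = 12_166


def mean_ratio(total: int, groups: int) -> float:
    """Return the arithmetic mean of `total` items spread over `groups`."""
    if groups <= 0:
        raise ValueError("groups must be positive")
    return total / groups


# Average number of FFA images accompanying one report.
images_per_report = mean_ratio(NUM_IMAGES, NUM_REPORTS)

# Average number of annotated regions per lesion category.
regions_per_category = mean_ratio(NUM_ANNOTATED_REGIONS, NUM_LESION_CATEGORIES)

print(f"images per report:    {images_per_report:.2f}")
print(f"regions per category: {regions_per_category:.2f}")
```

On these figures, a report is accompanied by roughly 97 images on average, and each lesion category has roughly 264 annotated regions on average.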

Provider:

PhysioNet

Created:

2021-08-27

Collection Summary

Dataset Introduction