GMAI-MMBench Medical Multimodal Evaluation Benchmark Dataset

超神经 · Updated 2024-11-13 · Indexed 2024-12-14

Download link:

https://hyper.ai/cn/datasets/35476

Resource overview:

GMAI-MMBench is a multimodal evaluation benchmark designed to advance general medical AI. It was jointly released in 2024 by nine institutions: Shanghai AI Laboratory; University of Washington; Monash University; East China Normal University; University of Cambridge; Shanghai Jiao Tong University; The Chinese University of Hong Kong, Shenzhen; Shenzhen Institute of Big Data; and Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences. The accompanying paper is "GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI". By providing comprehensive, fine-grained evaluation, it helps researchers and developers understand how well large vision-language models (LVLMs) perform in medical applications and identify where they fall short. The benchmark is built from 284 datasets from different sources, spanning 38 medical image modalities, 18 clinically relevant tasks, and 18 medical departments. It evaluates models at 4 perceptual granularities, so LVLM performance can be examined along multiple dimensions.
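To make the idea of scoring along multiple dimensions concrete, here is a minimal sketch of how accuracy could be broken down by modality or task. The record fields (`modality`, `task`, `answer`, `prediction`) and the sample values are illustrative assumptions, not the benchmark's actual schema or results.

```python
from collections import defaultdict

# Hypothetical evaluation records: field names and values are assumptions
# made for illustration, not the real GMAI-MMBench data format.
records = [
    {"modality": "CT", "task": "disease diagnosis", "answer": "A", "prediction": "A"},
    {"modality": "CT", "task": "organ recognition", "answer": "B", "prediction": "C"},
    {"modality": "X-ray", "task": "disease diagnosis", "answer": "D", "prediction": "D"},
    {"modality": "MRI", "task": "disease diagnosis", "answer": "C", "prediction": "A"},
]


def accuracy_by(rows, key):
    """Return {group value: fraction of correct predictions} for one dimension."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        total[row[key]] += 1
        correct[row[key]] += int(row["prediction"] == row["answer"])
    return {group: correct[group] / total[group] for group in total}


def overall_accuracy(rows):
    """Return the fraction of rows whose prediction matches the answer."""
    return sum(row["prediction"] == row["answer"] for row in rows) / len(rows)


if __name__ == "__main__":
    print("overall:", overall_accuracy(records))
    print("by modality:", accuracy_by(records, "modality"))
    print("by task:", accuracy_by(records, "task"))
```

The same grouping function could be applied to a department or perceptual-granularity field, if the records carried one.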

Created:

2024-11-01

Collected summary

Dataset introduction

Background and challenges

Background overview

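GMAI-MMBench is a medical multimodal evaluation benchmark dataset released in 2024 by multiple institutions to comprehensively evaluate the performance of large vision-language models in the medical domain. It covers 284 datasets, 38 medical image modalities, and 18 clinical tasks, and it features evaluation at multiple perceptual granularities and high clinical relevance. However, the evaluation shows that current models reach an accuracy of only 52%, which exposes clear technical gaps.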

The content above was collected and summarized by 遇见数据集.