MS-CXR: Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing
Source: physionet.org. Indexed: 2025-01-15
Download link:
https://physionet.org/content/ms-cxr/1.1.0/
Description:
We release a new dataset, MS-CXR, with locally-aligned phrase grounding annotations by board-certified radiologists to facilitate the study of complex semantic modelling in biomedical vision–language processing. The MS-CXR dataset provides 1162 image–sentence pairs, each consisting of a bounding box and its corresponding phrase, collected across eight different cardiopulmonary radiological findings, with an approximately equal number of pairs for each finding. This dataset complements the existing MIMIC-CXR v.2 dataset and comprises: (1) bounding boxes and phrases that were reviewed and edited (1026 bounding box/sentence pairs); and (2) bounding box labels created manually from scratch (136 bounding box/sentence pairs).
This large, well-balanced phrase grounding benchmark dataset contains carefully curated image regions annotated with descriptions of eight radiology findings, as verified by radiologists. Unlike existing chest X-ray benchmarks, this challenging phrase grounding task evaluates joint, local image-text reasoning while requiring real-world language understanding, e.g. to parse domain-specific location references, complex negations, and bias in reporting style. These data accompany work showing that principled textual semantic modelling can improve contrastive learning in self-supervised vision–language processing.
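As an illustration of the structure described above, the following is a minimal sketch of how one bounding box/sentence pair might be represented in Python, along with a check that the two annotation subsets add up to the stated total. The class name, field names, box convention, and toy values are all hypothetical and are not taken from the actual MS-CXR files; consult the PhysioNet release for the real schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class GroundedPhrase:
    """One image-sentence pair (hypothetical schema, not the real file layout)."""
    image_id: str                    # assumed identifier of the chest X-ray image
    finding: str                     # one of the eight radiological findings
    phrase: str                      # sentence or phrase describing the region
    box: Tuple[int, int, int, int]   # assumed (x, y, width, height) in pixels
    source: str                      # "reviewed" or "manual" (assumed labels)


# Subset sizes as stated in the dataset description.
REVIEWED_PAIRS = 1026
MANUAL_PAIRS = 136
TOTAL_PAIRS = REVIEWED_PAIRS + MANUAL_PAIRS


def pairs_per_finding(records: Iterable[GroundedPhrase]) -> Counter:
    """Count pairs per finding, e.g. to inspect how balanced the classes are."""
    return Counter(r.finding for r in records)


# Toy records with invented values, for illustration only.
toy_records = [
    GroundedPhrase("img-001", "Cardiomegaly", "enlarged cardiac silhouette",
                   (300, 400, 500, 350), "reviewed"),
    GroundedPhrase("img-002", "Cardiomegaly", "heart size is enlarged",
                   (280, 390, 520, 360), "manual"),
    GroundedPhrase("img-003", "Pneumothorax", "small right apical pneumothorax",
                   (120, 80, 200, 150), "reviewed"),
]

counts = pairs_per_finding(toy_records)
```

Counting pairs per finding in this way is one simple means of checking the "approximately equal number of pairs for each finding" property once the real annotations are loaded.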
Provider:
physionet.org



