3D Multi-Modal Medical Dataset - Referring Segmentation (M3D-RefSeg)
Source: ModelScope community | Updated: 2026-05-13 | Indexed: 2024-05-15
Download link:
https://modelscope.cn/datasets/GoodBaiBai88/M3D-RefSeg
Resource description:
3D medical image segmentation is one of the main challenges in medical image analysis. In practice, a more useful task is referring segmentation: given a text description, the model segments the region that the text describes. Referring segmentation requires image-mask-text triplets, however, and the resulting annotation cost is extremely high, which has held back referring segmentation in 3D medical settings. To address this, we selected a subset of 210 images from the existing TotalSegmentator dataset and re-annotated these 3D volumes with text descriptions and the corresponding region masks. Each image has multiple text descriptions of disease abnormalities, each with a matching region annotation. The annotators are experienced physicians. The original Chinese text is stored in text_zh.txt; we used the Qwen 72B large language model to translate it automatically and saved the converted and cleaned-up English annotations to text.json. We further used a large language model to convert the region descriptions into instruction-style question-answer pairs, saved in M3D_RefSeg.csv.
Provider:
maas
Created:
2024-04-12
Collected Summary
Dataset Introduction

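Background and Challenges
Background Overview
This is a referring-expression segmentation dataset containing 210 3D medical images and 2,778 masks with text descriptions. It supports 3D segmentation and localization tasks and is suited to medical image analysis. The data is drawn from TotalSegmentator, was re-annotated by professional physicians, and comes with detailed data processing and usage guides.
The content above was collected and summarized by 遇见数据集.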



