
chehablaborg/miniMSD244

Source: Hugging Face. Updated 2026-04-07; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/chehablaborg/miniMSD244
Resource description:
---
dataset_info:
  features:
  - name: organ
    dtype: string
  - name: image
    dtype: image
  - name: binary_mask
    dtype: image
  - name: classes_mask
    dtype: image
  - name: volume_id
    dtype: int32
  - name: slice_id
    dtype: int32
  splits:
  - name: train
    num_bytes: 2349940926.0
    num_examples: 95311
  download_size: 2310896675
  dataset_size: 2349940926.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: cc-by-4.0
task_categories:
- image-segmentation
language:
- en
tags:
- organs
- medical
- ct
- mri
pretty_name: Mini Medical Segmentation Decathlon 244
size_categories:
- 10K<n<100K
---

# Processed and Reduced Medical Segmentation Decathlon Dataset

The miniMSD dataset is a medical image segmentation benchmark covering 10 human organs. It is derived from the [Medical Segmentation Decathlon (MSD)](http://medicaldecathlon.com) by converting volumetric scans from NIfTI (NII) format into serialized 2D RGB images, alongside their corresponding segmentation masks. The dataset is provided in multiple resolution variants ([244](https://huggingface.co/datasets/chehablaborg/miniMSD244) and [512](https://huggingface.co/datasets/chehablaborg/miniMSD512)), enabling easier use, off-the-shelf accessibility, and flexible experimentation.

## Dataset Details

The dataset covers 10 human body organs, listed below. Each organ includes up to 40 volumes, with each volume consisting of a variable number of image slices.

Each dataset entry contains the following components: the organ type, the image, a binary mask, a detailed (multi-class) mask, a volume ID, and a slice ID. The image, binary mask, and detailed mask are all provided as PIL images.

The binary mask contains two labels: 0 for background and 1 for the target region. The detailed mask contains multiple labels (0, 1, 2, 3, …), where each label corresponds to a specific anatomical structure. The mapping of label indices to structures is provided below.
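The relationship between the two masks can be illustrated with a small sketch. It runs on a synthetic 4×4 mask rather than on real data, and it rests on two assumptions that the dataset card does not confirm: that the masks store raw label indices as pixel values, and that the binary mask marks every non-zero label of the detailed mask. The helper names `to_label_array` and `binary_from_classes` are hypothetical and not part of the dataset.

```python
import numpy as np
from PIL import Image


def to_label_array(mask: Image.Image) -> np.ndarray:
    """Convert a PIL mask to a 2D integer array (first channel if multi-channel)."""
    arr = np.asarray(mask)
    if arr.ndim == 3:
        arr = arr[..., 0]
    return arr.astype(np.int64)


def binary_from_classes(classes_mask: Image.Image) -> np.ndarray:
    """Assumed relation: foreground is every pixel with a non-zero class label."""
    labels = to_label_array(classes_mask)
    return (labels > 0).astype(np.uint8)


# Synthetic stand-in for a detailed mask: background plus labels 1 and 2.
toy = np.zeros((4, 4), dtype=np.uint8)
toy[1, 1] = 1
toy[2, 2] = 2
toy_mask = Image.fromarray(toy)

toy_binary = binary_from_classes(toy_mask)
present_labels = sorted(np.unique(to_label_array(toy_mask)).tolist())
print(toy_binary)
print(present_labels)
```

On a real sample, comparing `binary_from_classes(sample["classes_mask"])` with the stored `binary_mask` would be one way to check whether these assumptions hold.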
| Organ          | Number of Volumes | Total Slices | Avg. Slices per Volume | % of Total Slices |
|----------------|-------------------|--------------|------------------------|-------------------|
| Prostate       | 32                | 1204         | 37.625                 | 1.26%             |
| Heart          | 20                | 2271         | 113.550                | 2.38%             |
| Hippocampus    | 40                | 2754         | 68.850                 | 2.89%             |
| HepaticVessel  | 40                | 5796         | 144.900                | 6.08%             |
| BrainTumour    | 40                | 6200         | 155.000                | 6.51%             |
| Spleen         | 40                | 6964         | 174.100                | 7.31%             |
| Pancreas       | 40                | 7068         | 176.700                | 7.42%             |
| Colon          | 40                | 7344         | 183.600                | 7.71%             |
| Lung           | 40                | 22510        | 562.750                | 23.62%            |
| Liver          | 40                | 33200        | 830.000                | 34.83%            |

## Labels Mapping

### BrainTumour
- 0: background
- 1: necrotic / non-enhancing tumor
- 2: edema
- 3: enhancing tumor

### Heart
- 0: background
- 1: left atrium

### Liver
- 0: background
- 1: liver
- 2: tumor

### Hippocampus
- 0: background
- 1: anterior
- 2: posterior

### Prostate
- 0: background
- 1: peripheral zone
- 2: transition zone

### Lung
- 0: background
- 1: nodule

### Pancreas
- 0: background
- 1: pancreas
- 2: tumor

### HepaticVessel
- 0: background
- 1: vessel
- 2: tumor

### Spleen
- 0: background
- 1: spleen

### Colon
- 0: background
- 1: colon

## Uses

```python
import matplotlib.pyplot as plt
from datasets import load_dataset

# Load the single "train" split from the Hugging Face Hub.
miniMSD244 = load_dataset("chehablaborg/miniMSD244", split="train")

# Fetch one example and unpack its fields.
sample_id = 312
sample = miniMSD244[sample_id]
organ = sample["organ"]
image = sample["image"]
binary_mask = sample["binary_mask"]
classes_mask = sample["classes_mask"]

# Display the image slice.
plt.imshow(image, cmap="gray")
plt.show()
```

## Authors

- [Charbel Toumieh](https://www.linkedin.com/in/charbeltoumieh/)
- [Ahmad Mustapha](https://www.linkedin.com/in/ahmad-mustapha-ml/)
- [Ali Chehab](https://www.linkedin.com/in/ali-chehab-31b05a3/)

## Citation

```
@dataset{minimsd2026,
  title = {MiniMSD},
  author = {Toumieh, Charbel and Mustapha, Ahmad and Chehab, Ali},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/datasets/chehablaborg/miniMSD244}},
}
```

## Acknowledgment

[Chehab lab](https://chehablab.com) © 2026
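As a cross-check on the per-organ table in the Dataset Details section, the sketch below recomputes the average slices per volume and the share of total slices from the volume and slice counts. The counts are transcribed by hand from that table; the names `ORGAN_TABLE` and `summarize` are illustrative and not part of the dataset.

```python
# (number of volumes, total slices) per organ, copied from the table.
ORGAN_TABLE = {
    "Prostate": (32, 1204),
    "Heart": (20, 2271),
    "Hippocampus": (40, 2754),
    "HepaticVessel": (40, 5796),
    "BrainTumour": (40, 6200),
    "Spleen": (40, 6964),
    "Pancreas": (40, 7068),
    "Colon": (40, 7344),
    "Lung": (40, 22510),
    "Liver": (40, 33200),
}

# The slice counts should add up to the split's num_examples (95311).
total_slices = sum(slices for _, slices in ORGAN_TABLE.values())


def summarize(organ: str) -> tuple[float, float]:
    """Return (average slices per volume, percent of all slices) for one organ."""
    volumes, slices = ORGAN_TABLE[organ]
    return slices / volumes, round(100 * slices / total_slices, 2)


for name in ORGAN_TABLE:
    avg, pct = summarize(name)
    print(f"{name:<14} avg={avg:8.3f}  share={pct:5.2f}%")
```

The slice counts sum to 95311, matching `num_examples` in the front matter. Liver and Lung together account for more than half of all slices, which may matter when sampling training batches across organs.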

Provider:
chehablaborg