Name: chehablaborg/miniMSD244
Creator: chehablaborg
Published: 2026-04-07 15:31:51
License: cc-by-4.0

Download link:

https://hf-mirror.com/datasets/chehablaborg/miniMSD244


Resource description:

---
dataset_info:
  features:
  - name: organ
    dtype: string
  - name: image
    dtype: image
  - name: binary_mask
    dtype: image
  - name: classes_mask
    dtype: image
  - name: volume_id
    dtype: int32
  - name: slice_id
    dtype: int32
  splits:
  - name: train
    num_bytes: 2349940926.0
    num_examples: 95311
  download_size: 2310896675
  dataset_size: 2349940926.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: cc-by-4.0
task_categories:
- image-segmentation
language:
- en
tags:
- organs
- medical
- ct
- mri
pretty_name: Mini Medical Segmentation Decathlon 244
size_categories:
- 10K<n<100K
---

# Processed and Reduced Medical Segmentation Decathlon Dataset

The miniMSD dataset is a medical image segmentation benchmark covering 10 human organs. It is derived from the [Medical Segmentation Decathlon (MSD)](http://medicaldecathlon.com) by converting volumetric scans from NIfTI (NII) format into serialized 2D RGB images, each paired with its segmentation masks. The dataset is provided in multiple resolution variants ([244](https://huggingface.co/datasets/chehablaborg/miniMSD244) and [512](https://huggingface.co/datasets/chehablaborg/miniMSD512)), which makes it easy to use off the shelf and to experiment with.

## Dataset Details

The dataset covers the 10 organs listed below. Each organ includes up to 40 volumes, and each volume consists of a variable number of image slices.

Each dataset entry contains the following components: the organ type, the image, a binary mask, a detailed (multi-class) mask, a volume ID, and a slice ID. The image, binary mask, and detailed mask are all provided as PIL images. The binary mask contains two labels: 0 for background and 1 for the target region. The detailed mask contains multiple labels (0, 1, 2, 3, …), where each label corresponds to a specific anatomical structure. The mapping of label indices to structures is provided below.

| Organ          | Number of Volumes | Total Slices | Avg. Slices per Volume | % of Total Slices |
|----------------|-------------------|--------------|------------------------|-------------------|
| Prostate       | 32                | 1204         | 37.625                 | 1.26%             |
| Heart          | 20                | 2271         | 113.550                | 2.38%             |
| Hippocampus    | 40                | 2754         | 68.850                 | 2.89%             |
| HepaticVessel  | 40                | 5796         | 144.900                | 6.08%             |
| BrainTumour    | 40                | 6200         | 155.000                | 6.51%             |
| Spleen         | 40                | 6964         | 174.100                | 7.31%             |
| Pancreas       | 40                | 7068         | 176.700                | 7.42%             |
| Colon          | 40                | 7344         | 183.600                | 7.71%             |
| Lung           | 40                | 22510        | 562.750                | 23.62%            |
| Liver          | 40                | 33200        | 830.000                | 34.83%            |

## Labels Mapping

### BrainTumour
- 0: background
- 1: necrotic / non-enhancing tumor
- 2: edema
- 3: enhancing tumor

### Heart
- 0: background
- 1: left atrium

### Liver
- 0: background
- 1: liver
- 2: tumor

### Hippocampus
- 0: background
- 1: anterior
- 2: posterior

### Prostate
- 0: background
- 1: peripheral zone
- 2: transition zone

### Lung
- 0: background
- 1: nodule

### Pancreas
- 0: background
- 1: pancreas
- 2: tumor

### HepaticVessel
- 0: background
- 1: vessel
- 2: tumor

### Spleen
- 0: background
- 1: spleen

### Colon
- 0: background
- 1: colon

## Uses

```python
import matplotlib.pyplot as plt
from datasets import load_dataset

# Load the single "train" split (about 2.3 GB download).
miniMSD244 = load_dataset("chehablaborg/miniMSD244", split="train")

# Fetch one entry and unpack its fields.
sample_id = 312
sample = miniMSD244[sample_id]
organ = sample["organ"]
image = sample["image"]
binary_mask = sample["binary_mask"]
classes_mask = sample["classes_mask"]

# Display the slice in grayscale.
plt.imshow(image, cmap="gray")
plt.show()
```

## Authors

[Charbel Toumieh](https://www.linkedin.com/in/charbeltoumieh/)

[Ahmad Mustapha](https://www.linkedin.com/in/ahmad-mustapha-ml/)

[Ali Chehab](https://www.linkedin.com/in/ali-chehab-31b05a3/)

## Citation

```
@dataset{minimsd2026,
  title = {MiniMSD},
  author = {Toumieh, Charbel and Mustapha, Ahmad and Chehab, Ali},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/datasets/chehablaborg/miniMSD244}},
}
```

## Acknowledgment

[Chehab lab](https://chehablab.com) @ 2026
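
The label mapping given in this card can be written as a lookup table, which makes it easy to report how many pixels of each structure a `classes_mask` contains. The following is a minimal sketch rather than part of the dataset's tooling: the `LABELS` dictionary and the `label_histogram` helper are hypothetical names introduced here, and the sketch assumes that the `organ` field uses the same spellings as the table in this card and that mask pixel values are the raw label indices.

```python
from collections import Counter

# Label names per organ, transcribed from the "Labels Mapping" section.
# Assumption: the `organ` field uses exactly these spellings.
LABELS = {
    "BrainTumour": ["background", "necrotic / non-enhancing tumor",
                    "edema", "enhancing tumor"],
    "Heart": ["background", "left atrium"],
    "Liver": ["background", "liver", "tumor"],
    "Hippocampus": ["background", "anterior", "posterior"],
    "Prostate": ["background", "peripheral zone", "transition zone"],
    "Lung": ["background", "nodule"],
    "Pancreas": ["background", "pancreas", "tumor"],
    "HepaticVessel": ["background", "vessel", "tumor"],
    "Spleen": ["background", "spleen"],
    "Colon": ["background", "colon"],
}


def label_histogram(organ, pixels):
    """Count pixels per named structure for one multi-class mask.

    `pixels` is any iterable of integer label indices, for example the
    flattened pixel values of a decoded classes_mask (assumed format).
    """
    names = LABELS[organ]
    counts = Counter(int(p) for p in pixels)
    # Reject indices that the mapping for this organ does not define.
    unknown = sorted(set(counts) - set(range(len(names))))
    if unknown:
        raise ValueError(f"unexpected label indices for {organ}: {unknown}")
    return {names[i]: counts[i] for i in sorted(counts)}


# Tiny synthetic mask, not real data: 2 background, 1 liver, 3 tumor pixels.
demo = label_histogram("Liver", [0, 0, 1, 2, 2, 2])
print(demo)  # {'background': 2, 'liver': 1, 'tumor': 3}
```

With a real entry, the pixel list could be obtained with something like `numpy.asarray(sample["classes_mask"]).ravel().tolist()`. The card does not state the image mode of the masks, so it is worth checking that the mask decodes to a single channel before relying on the counts.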

