chehablaborg/miniMSD244
Source: Hugging Face. Updated 2026-04-07; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/chehablaborg/miniMSD244
Dataset card:
---
dataset_info:
  features:
  - name: organ
    dtype: string
  - name: image
    dtype: image
  - name: binary_mask
    dtype: image
  - name: classes_mask
    dtype: image
  - name: volume_id
    dtype: int32
  - name: slice_id
    dtype: int32
  splits:
  - name: train
    num_bytes: 2349940926.0
    num_examples: 95311
  download_size: 2310896675
  dataset_size: 2349940926.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: cc-by-4.0
task_categories:
- image-segmentation
language:
- en
tags:
- organs
- medical
- ct
- mri
pretty_name: Mini Medical Segmentation Decathlon 244
size_categories:
- 10K<n<100K
---
# Processed and Reduced Medical Segmentation Decathlon Dataset
The miniMSD dataset is a medical image segmentation benchmark covering 10 human organs.
It is derived from the [Medical Segmentation Decathlon (MSD)](http://medicaldecathlon.com) by converting volumetric scans
from NIfTI (NII) format into serialized 2D RGB images, alongside their corresponding segmentation masks.
The dataset is published in two resolution variants, [244](https://huggingface.co/datasets/chehablaborg/miniMSD244)
and [512](https://huggingface.co/datasets/chehablaborg/miniMSD512), so it can be loaded off the shelf
and used for flexible experimentation.
## Dataset Details
The dataset covers 10 human body organs, listed below.
Each organ includes up to 40 volumes, with each volume consisting of a variable number of image slices.
Each dataset entry contains the following components: the organ type, the image, a binary mask,
a detailed (multi-class) mask, a volume ID, and a slice ID.
The image, binary mask, and detailed mask are all provided as PIL images.
The binary mask contains two labels: 0 for background and 1 for the target region.
The detailed mask contains multiple labels (0, 1, 2, 3, …), where each label corresponds to a specific
anatomical structure. The mapping of label indices to structures is provided below.
| Organ | Number of Volumes | Total Slices | Avg. Slices per Volume | % of Total Slices |
|----------------|-------------------|--------------|------------------------|-------------------|
| Prostate | 32 | 1204 | 37.625 | 1.26% |
| Heart | 20 | 2271 | 113.550 | 2.38% |
| Hippocampus | 40 | 2754 | 68.850 | 2.89% |
| HepaticVessel | 40 | 5796 | 144.900 | 6.08% |
| BrainTumour | 40 | 6200 | 155.000 | 6.51% |
| Spleen | 40 | 6964 | 174.100 | 7.31% |
| Pancreas | 40 | 7068 | 176.700 | 7.42% |
| Colon | 40 | 7344 | 183.600 | 7.71% |
| Lung | 40 | 22510 | 562.750 | 23.62% |
| Liver | 40 | 33200 | 830.000 | 34.83% |
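Because every row is a single 2D slice, rebuilding a volume means collecting the rows that share an organ and `volume_id` and ordering them by `slice_id`. The following is a minimal sketch under two assumptions the card does not state explicitly: that `volume_id` may restart for each organ, and that `slice_id` reflects the slice order inside a volume. The helper name `group_slices_by_volume` is hypothetical, and the rows are plain dicts standing in for dataset entries.

```python
from collections import defaultdict


def group_slices_by_volume(rows):
    """Group slice rows by (organ, volume_id), ordered by slice_id.

    `rows` is any iterable of dict-like entries with the keys
    "organ", "volume_id" and "slice_id" (the card's column names).
    """
    volumes = defaultdict(list)
    for row in rows:
        # Key on the pair, in case volume_id restarts for each organ (assumption).
        volumes[(row["organ"], row["volume_id"])].append(row)
    # Sort each volume's slices by slice_id (assumed to encode slice order).
    return {
        key: sorted(slices, key=lambda r: r["slice_id"])
        for key, slices in volumes.items()
    }


# Toy rows standing in for dataset entries (image and mask fields omitted).
toy_rows = [
    {"organ": "Heart", "volume_id": 0, "slice_id": 2},
    {"organ": "Heart", "volume_id": 0, "slice_id": 0},
    {"organ": "Liver", "volume_id": 0, "slice_id": 1},
    {"organ": "Heart", "volume_id": 0, "slice_id": 1},
]
volumes = group_slices_by_volume(toy_rows)
heart_order = [r["slice_id"] for r in volumes[("Heart", 0)]]
print(heart_order)  # [0, 1, 2]
```

On the real dataset, iterating over full rows decodes every image. Restricting the dataset to the metadata columns first, for example with `Dataset.select_columns(["organ", "volume_id", "slice_id"])`, should avoid that cost when only the grouping is needed.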
## Labels Mapping
### BrainTumour
- 0: background
- 1: necrotic / non-enhancing tumor
- 2: edema
- 3: enhancing tumor
### Heart
- 0: background
- 1: left atrium
### Liver
- 0: background
- 1: liver
- 2: tumor
### Hippocampus
- 0: background
- 1: anterior
- 2: posterior
### Prostate
- 0: background
- 1: peripheral zone
- 2: transition zone
### Lung
- 0: background
- 1: nodule
### Pancreas
- 0: background
- 1: pancreas
- 2: tumor
### HepaticVessel
- 0: background
- 1: vessel
- 2: tumor
### Spleen
- 0: background
- 1: spleen
### Colon
- 0: background
- 1: colon
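The mapping above can be transcribed into a lookup table so that label indices found in a `classes_mask` are reported by name. This is only a sketch: the dictionary `LABEL_NAMES` and the helper `structures_present` are hypothetical names, not part of the dataset. It assumes that the `organ` column uses the same spellings as the headings above and that the mask has already been reduced to a collection of integer label values.

```python
# Label index -> structure name, transcribed from the lists above.
LABEL_NAMES = {
    "BrainTumour": {0: "background", 1: "necrotic / non-enhancing tumor",
                    2: "edema", 3: "enhancing tumor"},
    "Heart": {0: "background", 1: "left atrium"},
    "Liver": {0: "background", 1: "liver", 2: "tumor"},
    "Hippocampus": {0: "background", 1: "anterior", 2: "posterior"},
    "Prostate": {0: "background", 1: "peripheral zone", 2: "transition zone"},
    "Lung": {0: "background", 1: "nodule"},
    "Pancreas": {0: "background", 1: "pancreas", 2: "tumor"},
    "HepaticVessel": {0: "background", 1: "vessel", 2: "tumor"},
    "Spleen": {0: "background", 1: "spleen"},
    "Colon": {0: "background", 1: "colon"},
}


def structures_present(organ, label_values):
    """Return the non-background structure names for the given label values."""
    names = LABEL_NAMES[organ]
    present = sorted(set(int(v) for v in label_values) - {0})
    # Unknown indices raise KeyError rather than being silently dropped.
    return [names[v] for v in present]


print(structures_present("Liver", [0, 0, 1, 2, 1]))  # ['liver', 'tumor']
```

For a real sample, the label values could be obtained with `numpy.unique(numpy.asarray(classes_mask))`, assuming the mask is stored as a single-channel image whose pixel values are the label indices. The card does not specify the storage mode, so this is worth checking on one sample first.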
## Uses
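```python
import matplotlib.pyplot as plt
from datasets import load_dataset

# Load the single "train" split from the Hugging Face Hub.
miniMSD244 = load_dataset("chehablaborg/miniMSD244", split="train")

# Fetch the row once, then read its fields from the decoded sample.
sample_id = 312
sample = miniMSD244[sample_id]
organ = sample["organ"]
image = sample["image"]                # PIL image
binary_mask = sample["binary_mask"]    # PIL image: 0 = background, 1 = target
classes_mask = sample["classes_mask"]  # PIL image: per-organ class labels

# "gray" is the colormap spelling accepted by every matplotlib version.
plt.imshow(image, cmap="gray")
plt.show()
```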
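The card defines a single `train` split, so any validation set has to be carved out by the user. Slices from the same volume are cross-sections of one scan, so a common precaution is to split at the volume level rather than the slice level. The sketch below does this on plain metadata rows. The function name `split_by_volume`, the 20% validation fraction, and the seed are illustrative assumptions, not recommendations from the authors.

```python
import random


def split_by_volume(rows, val_fraction=0.2, seed=0):
    """Split row indices so that no volume contributes to both subsets.

    `rows` is a sequence of dict-like entries with "organ" and "volume_id".
    Returns (train_indices, val_indices).
    """
    # Unique volumes, keyed by organ as well in case IDs repeat across organs.
    volume_keys = sorted({(r["organ"], r["volume_id"]) for r in rows})
    rng = random.Random(seed)  # local RNG keeps the split reproducible
    rng.shuffle(volume_keys)
    n_val = max(1, round(len(volume_keys) * val_fraction))
    val_keys = set(volume_keys[:n_val])

    train_idx, val_idx = [], []
    for i, r in enumerate(rows):
        target = val_idx if (r["organ"], r["volume_id"]) in val_keys else train_idx
        target.append(i)
    return train_idx, val_idx


# Toy metadata: 5 volumes of 3 slices each.
toy_rows = [
    {"organ": "Spleen", "volume_id": v, "slice_id": s}
    for v in range(5) for s in range(3)
]
train_idx, val_idx = split_by_volume(toy_rows)
print(len(train_idx), len(val_idx))  # 12 3
```

The resulting index lists can be passed to `Dataset.select` to materialize the two subsets.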
## Authors
[Charbel Toumieh](https://www.linkedin.com/in/charbeltoumieh/)
[Ahmad Mustapha](https://www.linkedin.com/in/ahmad-mustapha-ml/)
[Ali Chehab](https://www.linkedin.com/in/ali-chehab-31b05a3/)
## Citation
```
@dataset{minimsd2026,
title = {MiniMSD},
author = {Toumieh, Charbel and Mustapha, Ahmad and Chehab, Ali},
year = {2026},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/datasets/chehablaborg/miniMSD244}},
}
```
## Acknowledgment
[Chehab lab](https://chehablab.com) @ 2026
Provider:
chehablaborg



