adrianrm/breastmnist

Name: adrianrm/breastmnist
Creator: adrianrm
Published: 2026-04-18 21:19:34
License: CC BY 4.0 (as stated in the dataset card)

Hugging Face · updated 2026-04-18 · indexed 2026-04-26

Download link:

https://hf-mirror.com/datasets/adrianrm/breastmnist

Resource description:

---
license: cc-by-4.0
task_categories:
- image-classification
tags:
- medical
- medmnist
- breastmnist
configs:
- config_name: train-all-res224
  data_files:
  - split: train
    path: train-all-res224/*.parquet
- config_name: train-malignant-res224
  data_files:
  - split: train
    path: train-malignant-res224/*.parquet
- config_name: train-normal_benign-res224
  data_files:
  - split: train
    path: train-normal_benign-res224/*.parquet
- config_name: val-all-res224
  data_files:
  - split: val
    path: val-all-res224/*.parquet
- config_name: val-malignant-res224
  data_files:
  - split: val
    path: val-malignant-res224/*.parquet
- config_name: val-normal_benign-res224
  data_files:
  - split: val
    path: val-normal_benign-res224/*.parquet
- config_name: test-all-res224
  data_files:
  - split: test
    path: test-all-res224/*.parquet
- config_name: test-malignant-res224
  data_files:
  - split: test
    path: test-malignant-res224/*.parquet
- config_name: test-normal_benign-res224
  data_files:
  - split: test
    path: test-normal_benign-res224/*.parquet
---

# breastmnist (MedMNIST)

**Source:** [breastmnist](https://medmnist.com/)

**Task:** binary-class

**Resolutions:** 224x224

**License:** CC BY 4.0

## Description

The BreastMNIST is based on a dataset of 780 breast ultrasound images. It is categorized into 3 classes: normal, benign, and malignant. As we use low-resolution images, we simplify the task into binary classification by combining normal and benign as positive and classifying them against malignant as negative. We split the source dataset with a ratio of 7:1:2 into training, validation and test set. The source images of 1×500×500 are resized into 1×28×28.

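As a quick sanity check on the 7:1:2 split described above, the following minimal sketch computes the expected split sizes from the 780 source images. The helper name `split_sizes` is hypothetical and is not part of MedMNIST or this dataset; it assumes the total divides evenly by the ratio, which holds for 780.

```python
def split_sizes(total, ratio=(7, 1, 2)):
    """Return (train, val, test) sizes for a total split by the given ratio.

    Assumption: `total` divides evenly by sum(ratio); otherwise the
    integer division below would drop the remainder.
    """
    parts = sum(ratio)
    return tuple(total * r // parts for r in ratio)


train_n, val_n, test_n = split_sizes(780)
print(train_n, val_n, test_n)
```

The resulting sizes (546, 78, 156) agree with the per-split totals reported in the class distribution tables.
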
## Config naming convention

```
{split}-{class}-{res}
split : train | val | test
class : all | <sanitized class name>
res   : res28 | res64 | res128 | res224
```

## Loading examples

```python
from datasets import load_dataset

# All training images at 224px
ds = load_dataset('.../breastmnist', 'train-all-res224', split='train')

# Only 'malignant' class, training split
ds = load_dataset('.../breastmnist', 'train-malignant-res224', split='train')
```

## Class labels

- `0`: malignant (config key: `malignant`)
- `1`: normal, benign (config key: `normal_benign`)

## Class distribution

### 224x224

**train** (N=546, IR=2.71x)

| Class | Config key | Count | Share |
|-------|-----------|------:|------:|
| malignant | `malignant` | 147 | 26.9% |
| normal, benign | `normal_benign` | 399 | 73.1% |

**val** (N=78, IR=2.71x)

| Class | Config key | Count | Share |
|-------|-----------|------:|------:|
| malignant | `malignant` | 21 | 26.9% |
| normal, benign | `normal_benign` | 57 | 73.1% |

**test** (N=156, IR=2.71x)

| Class | Config key | Count | Share |
|-------|-----------|------:|------:|
| malignant | `malignant` | 42 | 26.9% |
| normal, benign | `normal_benign` | 114 | 73.1% |

## Citation

```bibtex
@article{medmnistv2,
  title={MedMNIST v2 - A large-scale lightweight benchmark for 2D and 3D biomedical image classification},
  author={Yang, Jiancheng and Shi, Rui and Wei, Donglai and Liu, Zequan and Zhao, Lin and Ke, Bilian and Pfister, Hanspeter and Ni, Bingbing},
  journal={Scientific Data},
  year={2023}
}
```
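The config naming convention and the class distribution figures above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names `config_name` and `imbalance_ratio` are hypothetical, the counts are copied from the tables, and it assumes that "IR" denotes the imbalance ratio (majority class count divided by minority class count), which is consistent with the reported 2.71x.

```python
SPLITS = ("train", "val", "test")
CLASS_KEYS = ("all", "malignant", "normal_benign")

# Per-split class counts, copied from the class distribution tables.
COUNTS = {
    "train": {"malignant": 147, "normal_benign": 399},
    "val": {"malignant": 21, "normal_benign": 57},
    "test": {"malignant": 42, "normal_benign": 114},
}


def config_name(split, class_key="all", res=224):
    """Build a config name following the {split}-{class}-{res} pattern."""
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    if class_key not in CLASS_KEYS:
        raise ValueError(f"unknown class key: {class_key!r}")
    return f"{split}-{class_key}-res{res}"


def imbalance_ratio(counts):
    """Majority count divided by minority count (assumed meaning of IR)."""
    return max(counts.values()) / min(counts.values())


for split in SPLITS:
    total = sum(COUNTS[split].values())
    print(config_name(split), total, f"IR={imbalance_ratio(COUNTS[split]):.2f}x")
```

A string returned by `config_name` can be passed as the second argument to `load_dataset`, in the same way as in the loading examples. Only the `res224` configs are listed in this card's metadata, so other resolutions may not be available.
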
