EMDS-7 DiffuMorph: Latent Diffusion-Augmented Environmental Microorganism Dataset for Compound Microscopy

Source: NIAID Data Ecosystem. Indexed: 2026-05-10
Download link:
https://data.mendeley.com/datasets/vh3nbj892g
Description:
The EMDS-7 DiffuMorph dataset is designed for automated environmental microorganism classification using deep learning and machine learning techniques. The dataset provides a balanced and reproducible benchmark for multi-class microscopic image classification using compound microscope imagery.

This dataset is derived from the publicly available EMDS-7 (Environmental Microorganism Dataset – Seventh Version) and focuses exclusively on single-microorganism images to ensure reliable image-level classification. Multi-organism images were removed to maintain label integrity and classification consistency.

The original EMDS-7 dataset contains 41 microorganism classes. However, during dataset refinement, five classes were removed due to visual ambiguity, high morphological similarity, and potential redundancy between microorganism categories. Removing these classes helps reduce classification confusion and improves the reliability of machine learning model evaluation.

Removed classes from the original EMDS-7 dataset include:
• G008_Tribonema
• G025_Coelosphaerium
• G028_Synedra
• G033_Coelastrum
• G039_Diversicornis

After this refinement process, the final dataset contains 36 distinct microorganism classes with balanced representation across all categories. The dataset consists of 2,700 compound microscope images, equally distributed across the 36 microorganism classes.

Each class includes:
• Original images selected from the EMDS-7 dataset after filtering multi-object samples.
• Augmented images using rotations, flips, scaling, and intensity adjustments, preserving microorganism morphology.
• Synthetic images generated with class-wise latent diffusion, adding realistic variability while preserving biologically plausible morphology.
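The classical augmentations named above (rotations, flips, scaling, intensity adjustments) can be illustrated with a small dependency-free sketch. The description does not state which library, rotation angles, scale factors, or intensity ranges were used, so every function name and parameter below is an assumption chosen for illustration. Images are modeled as row-major lists of 8-bit grayscale values; a real pipeline would more likely use an imaging library.

```python
# Illustrative augmentation primitives on a grayscale image stored as a
# list of rows. All names and parameter values here are hypothetical and
# are not taken from the dataset authors' pipeline.

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def scale_nearest(img, factor):
    """Resize by `factor` using nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    new_h = max(1, round(h * factor))
    new_w = max(1, round(w * factor))
    return [
        [img[min(h - 1, int(y / factor))][min(w - 1, int(x / factor))]
         for x in range(new_w)]
        for y in range(new_h)
    ]


def adjust_intensity(img, gain=1.0, bias=0):
    """Apply a linear intensity change and clamp to the 0..255 range."""
    return [
        [max(0, min(255, round(p * gain + bias))) for p in row]
        for row in img
    ]


if __name__ == "__main__":
    sample = [[10, 200], [30, 40]]
    print(rotate90(sample))
    print(flip_horizontal(sample))
    print(scale_nearest(sample, 2))
    print(adjust_intensity(sample, gain=1.5))
```

Each of these transforms moves or rescales pixel values without warping the image non-rigidly, which is one plausible reading of "preserving microorganism morphology".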
Dataset Structure

Each of the 36 microorganism classes contains:
• Training set: 50 images
• Validation set: 15 images
• Test set: 10 images

Total images per class: 75

Dataset Statistics
• Original Classes in EMDS-7: 41
• Removed Classes: 5 (due to redundancy and ambiguity)
• Final Classes Used: 36
• Training Images: 1,800
• Validation Images: 540
• Test Images: 360
• Total Images: 2,700

Applications

This dataset can support research in:
• Environmental microorganism classification
• Microscopy image analysis
• Deep learning and computer vision
• Morphological pattern recognition
• Ecological monitoring and water quality assessment
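The statistics above follow directly from the per-class split sizes and the number of removed classes. The short sketch below reproduces that arithmetic as a consistency check; the constant and function names are hypothetical, and only the numbers and class identifiers come from the description.

```python
# Consistency check for the published dataset statistics.
# The figures are taken from the dataset description; the names
# (ORIGINAL_CLASS_COUNT, dataset_statistics, ...) are illustrative only.

ORIGINAL_CLASS_COUNT = 41
REMOVED_CLASSES = [
    "G008_Tribonema",
    "G025_Coelosphaerium",
    "G028_Synedra",
    "G033_Coelastrum",
    "G039_Diversicornis",
]
IMAGES_PER_CLASS = {"train": 50, "val": 15, "test": 10}


def dataset_statistics():
    """Derive per-split and overall totals from the per-class counts."""
    final_classes = ORIGINAL_CLASS_COUNT - len(REMOVED_CLASSES)
    per_split = {
        split: count * final_classes
        for split, count in IMAGES_PER_CLASS.items()
    }
    return {
        "final_classes": final_classes,
        "images_per_class": sum(IMAGES_PER_CLASS.values()),
        "per_split": per_split,
        "total_images": sum(per_split.values()),
    }


if __name__ == "__main__":
    for key, value in dataset_statistics().items():
        print(f"{key}: {value}")
```

A check like this could also be run against a downloaded copy by counting files per class and split, although the on-disk folder layout is not specified in the description.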
Created: 2026-03-16