
SEED-ML: A Multi-Parametric Clinical Dataset on Male Infertility for Predictive Modeling and AI Research.

NIAID Data Ecosystem, indexed 2026-05-10
Download link:
https://data.mendeley.com/datasets/sc8rsz2vd7
Resource description:
Authors: N. Sánchez-Gómez [1] (nicolassg@us.es), J.A. García-García [*, 1] (juliangg@us.es), J. Navarro-Pando [2,3,4,5] (jose.navarro@inebir.com), MJ Escalona-Cuaresma [1] (mjescalona@us.es).

Affiliations:
[1] ES3 Group (Engineering and Science for Software Systems group), University of Seville, Avenida Reina Mercedes, s/n., 41012, Seville, Spain.
[2] Cátedra de Reproducción y Genética Humana del Instituto para el Estudio de la Biología de la Reproducción Humana (INEBIR), Seville, Spain.
[3] Universidad Europea del Atlántico (UNEATLANTICO), Santander, Spain.
[4] Fundación Universitaria Iberoamericana (FUNIBER), Seville, Spain.
[5] San Juan de Dios Hospital, Seville, Spain.

Abstract: SEED-ML (Semen Examination and Evaluation Dataset for Machine Learning) is an openly available, multi-parametric clinical dataset designed to support research in male infertility diagnostics and prediction. The dataset comprises records from 10,124 patients, including detailed semen analysis parameters (pre- and post-treatment), morphological classifications, and clinical alterations. Infertility diagnosis is categorized into nine clinically relevant classes, ranging from normal fertility to complex multi-factor conditions such as oligoasthenoteratozoospermia. All data were anonymized and curated following strict ethical and privacy guidelines to ensure compliance with applicable medical data protection regulations.

The dataset reflects real-world clinical distributions, with class prevalence ranging from 62.7% (Normozoospermia) to 0.16% (Azoospermia), providing a high-fidelity benchmark for testing machine learning algorithms under significant class imbalance. SEED-ML offers a resource for developing and benchmarking machine learning models, enabling research in predictive analytics, decision support systems, and computational andrology. It aims to facilitate interdisciplinary collaboration between clinicians, data scientists, and artificial intelligence (AI) researchers, accelerating the development of data-driven solutions in reproductive medicine. The dataset is publicly available on Mendeley Data under a CC BY 4.0 license.
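
To give a sense of the class imbalance the abstract describes, the short sketch below turns the two quoted prevalence figures into approximate patient counts and an imbalance ratio. It uses only the numbers stated above (10,124 patients, nine classes, 62.7% and 0.16%). The function names are hypothetical, the counts are approximate because the published percentages are rounded, and the weighting rule is an assumption (a common inverse-frequency heuristic), not a method prescribed by the dataset authors.

```python
# Figures quoted in the abstract; nothing here is read from the dataset files.
N_PATIENTS = 10_124
N_CLASSES = 9
SHARE_MAJORITY = 0.627    # Normozoospermia
SHARE_MINORITY = 0.0016   # Azoospermia


def approx_count(share, total=N_PATIENTS):
    """Approximate number of patients in a class from its rounded share."""
    return round(share * total)


def imbalance_ratio(major_share, minor_share):
    """How many majority-class records exist per minority-class record."""
    return major_share / minor_share


def inverse_frequency_weight(share, n_classes=N_CLASSES):
    """Assumed heuristic: weight = total / (n_classes * class_count),
    which simplifies to 1 / (n_classes * share)."""
    return 1.0 / (n_classes * share)


majority = approx_count(SHARE_MAJORITY)
minority = approx_count(SHARE_MINORITY)

if __name__ == "__main__":
    print(f"Normozoospermia: about {majority} patients")
    print(f"Azoospermia: about {minority} patients")
    ratio = imbalance_ratio(SHARE_MAJORITY, SHARE_MINORITY)
    print(f"Imbalance ratio: about {ratio:.0f} to 1")
    for name, share in [("Normozoospermia", SHARE_MAJORITY),
                        ("Azoospermia", SHARE_MINORITY)]:
        print(f"{name} weight: {inverse_frequency_weight(share):.2f}")
```

With only around 16 records in the rarest class, a stratified split and per-class metrics are likely to be more informative than overall accuracy when benchmarking on this dataset.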
Created:
2026-01-08