harvardairobotics/FairGenMed
Source: Hugging Face (updated 2026-04-10, indexed 2026-04-12)
Download link:
https://hf-mirror.com/datasets/harvardairobotics/FairGenMed
Resource description:
---
license: cc-by-nc-nd-4.0
task_categories:
- image-classification
- text-to-image
modality:
- image
language:
- en
tags:
- medical
- ophthalmology
- fairness
- generative
- diffusion
- fundus
- glaucoma
- OCT
pretty_name: FairGenMed
size_categories:
- 10K<n<100K
---
# Dataset Card: FairGenMed
## Dataset Summary
FairGenMed is the first dataset for studying **fairness in medical generative models**. It provides detailed quantitative clinical measurements alongside demographic annotations to investigate the semantic correlation between text prompts and anatomical regions across demographic subgroups. The dataset supports both generative model evaluation and downstream classification tasks for glaucoma detection.
This dataset accompanies **FairDiffusion**, an equity-aware latent diffusion model that enhances fairness in medical image generation via Fair Bayesian Perturbation. The work was published in *Science Advances* (2025).
## Dataset Details
### Dataset Description
| Field | Value |
|------------------|-------|
| **Institution** | Department of Ophthalmology, Harvard Medical School |
| **Task** | Glaucoma detection; fairness evaluation of generative models |
| **Modality** | Scanning Laser Ophthalmoscopy (SLO) fundus images, OCT B-scans |
| **Scale** | 10,000 subjects |
| **Image size** | 512 × 664 pixels (SLO fundus) |
| **License** | [CC BY-NC-ND 4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/) |
- **Curated by:** Yan Luo, Muhammad Osama Khan, Congcong Wen, Muhammad Muneeb Afzal, Titus Fidelis Wuermeling, Min Shi, Yu Tian, Yi Fang, Mengyu Wang
- **License:** [CC BY-NC-ND 4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/) — non-commercial research only
- **Paper:** [Science Advances, Vol. 11, No. 14 (2025)](https://doi.org/10.1126/sciadv.ads4593)
- **Contact:** harvardophai@gmail.com, harvardairobotics@gmail.com
### Data Fields
Each subject includes one SLO fundus image and one `.npz` file. The NPZ files contain:
| Field | Description |
|-----------------|-------------|
| `glaucoma` | Disease label: `0` = non-glaucoma, `1` = glaucoma |
| `oct_bscans` | OCT B-scan images |
| `race` | `0` = Asian, `1` = Black, `2` = White |
| `male` | `0` = Female, `1` = Male |
| `hispanic` | `0` = Non-Hispanic, `1` = Hispanic |
| `maritalstatus` | `0` = Married/Partnered, `1` = Single, `2` = Divorced, `3` = Widowed, `4` = Legally Separated, `-1` = Unknown |
| `language` | `0` = English, `1` = Spanish, `2` = Other |
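The integer codes in the table above can be turned into readable labels with a small lookup. The sketch below is a minimal illustration, not part of the official tooling: `LABELS` and `decode_record` are hypothetical names, and the example record is made up. It assumes each field holds a single integer code.

```python
from typing import Any, Mapping

# Code tables copied from the Data Fields table above.
LABELS = {
    "glaucoma": {0: "non-glaucoma", 1: "glaucoma"},
    "race": {0: "Asian", 1: "Black", 2: "White"},
    "male": {0: "Female", 1: "Male"},
    "hispanic": {0: "Non-Hispanic", 1: "Hispanic"},
    "maritalstatus": {
        0: "Married/Partnered",
        1: "Single",
        2: "Divorced",
        3: "Widowed",
        4: "Legally Separated",
        -1: "Unknown",
    },
    "language": {0: "English", 1: "Spanish", 2: "Other"},
}


def decode_record(record: Mapping[str, Any]) -> dict[str, str]:
    """Map the integer codes of one subject to human-readable labels.

    Fields missing from the record are skipped; codes that are not in the
    table are reported instead of raising an error.
    """
    decoded = {}
    for field, table in LABELS.items():
        if field not in record:
            continue
        code = int(record[field])
        decoded[field] = table.get(code, f"unrecognized code {code}")
    return decoded


# Made-up record for illustration only (not taken from the dataset).
example = {"glaucoma": 1, "race": 2, "male": 0,
           "hispanic": 0, "maritalstatus": -1, "language": 1}
print(decode_record(example))
```

With NumPy installed, the object returned by `numpy.load` on one of the `.npz` files is dict-like, so it can likely be passed to `decode_record` directly, provided the fields are stored as scalar values. The `oct_bscans` array is left untouched by this helper.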
### Clinical Metadata
All clinical measurements for the 10,000 samples are provided in `data_summary.csv`:
| Column | Description |
|---------------|-------------|
| `cdr_status` | Cup-disc ratio status |
| `md_severity` | Severity of vision loss |
| `se_status` | Spherical equivalent status |
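As a rough sketch of how the clinical columns might be summarized, the example below counts the distinct values in each column using only the standard library. The rows in `SAMPLE_CSV` are made-up placeholders: the card does not list the actual values stored in `data_summary.csv`, and `tabulate_columns` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# Made-up rows standing in for data_summary.csv. The status values here are
# placeholders, not the dataset's real vocabulary.
SAMPLE_CSV = """cdr_status,md_severity,se_status
normal,mild,myopia
abnormal,severe,myopia
normal,normal,hyperopia
"""

# Column names taken from the Clinical Metadata table above.
COLUMNS = ("cdr_status", "md_severity", "se_status")


def tabulate_columns(handle, columns=COLUMNS):
    """Count how often each value occurs in each of the given columns."""
    counts = {name: Counter() for name in columns}
    for row in csv.DictReader(handle):
        for name in columns:
            counts[name][row[name]] += 1
    return counts


counts = tabulate_columns(io.StringIO(SAMPLE_CSV))
for name, counter in counts.items():
    print(name, dict(counter))
```

For the real file, the same function should work with `open("data_summary.csv", newline="")` in place of the in-memory sample, assuming the file has a header row with these column names.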
### Demographics
Six demographic attributes are annotated per subject: age, gender, race, ethnicity, preferred language, and marital status.
## Uses
### Direct Use
- Fairness evaluation of medical generative models (text-to-image diffusion)
- Glaucoma detection with demographic fairness analysis
- Studying semantic correlations between text prompts and anatomy across subgroups
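One simple way to approach the demographic fairness analysis mentioned above is to compare a classifier's accuracy across subgroups. The sketch below is an illustration under stated assumptions: it is not necessarily the metric used in the accompanying paper, the function names are hypothetical, and the labels and predictions are toy values rather than dataset results.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return a dict mapping each group value to the accuracy within that group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(per_group):
    """Difference between the best and worst per-group accuracy."""
    return max(per_group.values()) - min(per_group.values())


# Toy values for illustration only (not dataset results).
y_true = [1, 0, 1, 0, 1, 0]   # glaucoma labels
y_pred = [1, 0, 0, 0, 1, 1]   # predictions from some hypothetical classifier
race = [0, 0, 1, 1, 2, 2]     # codes as in the `race` field

per_group = accuracy_by_group(y_true, y_pred, race)
print(per_group)
print(accuracy_gap(per_group))
```

A gap near zero suggests similar performance across groups on this metric. In practice, threshold-free measures such as per-group AUC are often preferred, and small subgroups call for confidence intervals.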
### Out-of-Scope Use
This dataset must not be used for clinical decisions, patient care, or any commercial application.
## Associated Method: FairDiffusion
FairDiffusion is an equity-aware latent diffusion model built on Stable Diffusion 2.1, trained with **Fair Bayesian Perturbation** to reduce demographic bias in generated medical images. It is evaluated on FairGenMed (ophthalmology), HAM10000 (dermatology), and CheXpert (chest radiology).
## Citation
**BibTeX:**
```bibtex
@article{FairDiffusion_Science_Advances_2025,
author = {Yan Luo and Muhammad Osama Khan and Congcong Wen and Muhammad Muneeb Afzal and Titus Fidelis Wuermeling and Min Shi and Yu Tian and Yi Fang and Mengyu Wang},
title = {FairDiffusion: Enhancing equity in latent diffusion models via fair Bayesian perturbation},
journal = {Science Advances},
volume = {11},
number = {14},
pages = {eads4593},
year = {2025},
doi = {10.1126/sciadv.ads4593}
}
```
**APA:**
Luo, Y., Khan, M. O., Wen, C., Afzal, M. M., Wuermeling, T. F., Shi, M., Tian, Y., Fang, Y., & Wang, M. (2025). FairDiffusion: Enhancing equity in latent diffusion models via fair Bayesian perturbation. *Science Advances, 11*(14), eads4593. https://doi.org/10.1126/sciadv.ads4593
Provided by:
harvardairobotics



