
harvardairobotics/FairVision

Hugging Face, updated 2026-04-09, indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/harvardairobotics/FairVision
Resource description:
---
license: cc-by-nc-nd-4.0
task_categories:
- image-classification
modality:
- image
language:
- en
tags:
- medical
- ophthalmology
- fairness
- glaucoma
- AMD
- diabetic-retinopathy
- OCT
- fundus
pretty_name: Harvard-FairVision
size_categories:
- 10K<n<100K
---

# Dataset Card: Harvard-FairVision

## Dataset Summary

Harvard-FairVision is the **first large-scale medical fairness dataset with both 2D and 3D imaging data**, covering three major eye diseases affecting approximately 380 million people worldwide. It contains 30,000 subjects (10,000 per disease) across Age-Related Macular Degeneration (AMD), Diabetic Retinopathy (DR), and glaucoma, each with paired SLO fundus photos and 3D OCT B-scans and six demographic identity attributes.

This dataset was introduced in the paper: [FairVision: Equitable Deep Learning for Eye Disease Screening via Fair Identity Scaling](https://arxiv.org/pdf/2310.02492).

## Dataset Details

### Dataset Description

| Field | Value |
|-----------------|-------|
| **Institution** | Department of Ophthalmology, Harvard Medical School |
| **Tasks** | AMD detection, diabetic retinopathy detection, glaucoma detection |
| **Modalities** | Scanning Laser Ophthalmoscopy (SLO) fundus images, 3D OCT B-scans |
| **Scale** | 30,000 subjects (10,000 per disease) |
| **OCT size** | 200 × 200 × 200 (glaucoma), 128 × 200 × 200 (AMD, DR) |
| **SLO size** | 512 × 664 (folders), 200 × 200 (NPZ, normalized to [0, 255]) |
| **Total size** | ~600 GB |
| **License** | [CC BY-NC-ND 4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/) |

- **Curated by:** Yan Luo, Muhammad Osama Khan, Yu Tian, Min Shi, Zehao Dou, Tobias Elze, Yi Fang, Mengyu Wang
- **License:** [CC BY-NC-ND 4.0](https://creativecommons.org/licenses/by-nc-nd/4.0/) (non-commercial research only)
- **Paper:** [arXiv:2310.02492](https://arxiv.org/abs/2310.02492)
- **Contact:** harvardophai@gmail.com, harvardairobotics@gmail.com

### Dataset Structure

```
FairVision
├── AMD
│   ├── Training
│   ├── Validation
│   └── Test
├── data_summary_amd.csv
├── DR
│   ├── Training
│   ├── Validation
│   └── Test
├── data_summary_dr.csv
├── Glaucoma
│   ├── Training
│   ├── Validation
│   └── Test
└── data_summary_glaucoma.csv
```

Each split folder contains SLO fundus photos (`slo_xxxxx.jpg`) and NPZ files (`data_xxxxx.npz`). Per-disease metadata CSVs (`data_summary_*.csv`) provide race, gender, ethnicity, marital status, age, and preferred language for all subjects.

### Data Fields

All NPZ files share the following demographic and imaging fields:

| Field | Description |
|-----------------|-------------|
| `slo_fundus` | SLO fundus image, 200 × 200 (normalized) |
| `oct_bscans` | 3D OCT B-scans (200 × 200 × 200 for glaucoma; 128 × 200 × 200 for AMD/DR) |
| `race` | `0` = Asian, `1` = Black, `2` = White |
| `male` | `0` = Female, `1` = Male |
| `hispanic` | `0` = Non-Hispanic, `1` = Hispanic |
| `maritalstatus` | `0` = Married, `1` = Single, `2` = Divorced, `3` = Widowed, `4` = Legally Separated |
| `language` | `0` = English, `1` = Spanish, `2` = Other |

Disease-specific label fields:

| Disease | Field | Values |
|-----------|-----------------|--------|
| Glaucoma | `glaucoma` | `0` = non-glaucoma, `1` = glaucoma |
| AMD | `amd_condition` | 9-class condition string, mapped to `0` = no AMD, `1` = early dry, `2` = intermediate dry, `3` = advanced |
| DR | `dr_subtype` | 6-class condition string, mapped to `0` = non-vision-threatening, `1` = vision-threatening (severe NPDR or PDR) |

## Uses

### Direct Use

- Fairness benchmarking for 2D and 3D ophthalmic disease classification across race, gender, and ethnicity
- Multi-disease fairness analysis (AMD, DR, glaucoma) under a unified framework
- Development and evaluation of fairness learning methods for medical imaging
- Comparative study of 2D vs. 3D model fairness in clinical AI

### Out-of-Scope Use

Clinical decisions, patient care, or any commercial application. This dataset shall not be used for clinical decisions at any time.
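
The NPZ layout listed under Data Fields can be read with NumPy. The following is a minimal sketch, assuming each demographic attribute and label is stored as a size-1 value under the field names in the tables above. `load_subject` and `make_synthetic_npz` are hypothetical helpers for illustration, not part of the dataset's own tooling, and the synthetic file uses tiny arrays in place of real scans.

```python
import os
import tempfile

import numpy as np

# Code-to-name mappings copied from the Data Fields table.
RACE = {0: "Asian", 1: "Black", 2: "White"}
GENDER = {0: "Female", 1: "Male"}
ETHNICITY = {0: "Non-Hispanic", 1: "Hispanic"}
MARITAL_STATUS = {
    0: "Married", 1: "Single", 2: "Divorced",
    3: "Widowed", 4: "Legally Separated",
}
LANGUAGE = {0: "English", 1: "Spanish", 2: "Other"}


def _scalar(value):
    """Return a size-1 NumPy value as a plain Python scalar."""
    return np.asarray(value).item()


def load_subject(npz_path, label_field="glaucoma"):
    """Read one data_xxxxx.npz file and decode its demographic codes.

    Assumption: attributes are stored as size-1 integer values.
    Missing or unexpected codes are not handled in this sketch.
    """
    with np.load(npz_path) as npz:
        return {
            "slo_fundus": npz["slo_fundus"],
            "oct_bscans": npz["oct_bscans"],
            "race": RACE[int(_scalar(npz["race"]))],
            "gender": GENDER[int(_scalar(npz["male"]))],
            "ethnicity": ETHNICITY[int(_scalar(npz["hispanic"]))],
            "marital_status": MARITAL_STATUS[int(_scalar(npz["maritalstatus"]))],
            "language": LANGUAGE[int(_scalar(npz["language"]))],
            "label": _scalar(npz[label_field]),
        }


def make_synthetic_npz(path):
    """Hypothetical helper: write a tiny stand-in file with the documented field names."""
    np.savez(
        path,
        slo_fundus=np.zeros((8, 8), dtype=np.uint8),
        oct_bscans=np.zeros((4, 8, 8), dtype=np.uint8),
        race=np.array(1),
        male=np.array(0),
        hispanic=np.array(0),
        maritalstatus=np.array(3),
        language=np.array(2),
        glaucoma=np.array(1),
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data_00001.npz")
        make_synthetic_npz(path)
        subject = load_subject(path)
        print(subject["race"], subject["gender"], subject["label"])
```

For AMD or DR files, pass `label_field="amd_condition"` or `label_field="dr_subtype"`. The sketch returns the raw condition string in that case, because the card does not list the string values behind the class mapping.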

## Access

The "Harvard" designation indicates the dataset originates from the Department of Ophthalmology at Harvard Medical School. It does not imply endorsement, sponsorship, or legal responsibility by Harvard University or Harvard Medical School.

## Citation

**BibTeX:**

```bibtex
@misc{luo2024fairvisionequitabledeeplearning,
  title={FairVision: Equitable Deep Learning for Eye Disease Screening via Fair Identity Scaling},
  author={Yan Luo and Muhammad Osama Khan and Yu Tian and Min Shi and Zehao Dou and Tobias Elze and Yi Fang and Mengyu Wang},
  year={2024},
  eprint={2310.02492},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2310.02492}
}
```

**APA:**

Luo, Y., Khan, M. O., Tian, Y., Shi, M., Dou, Z., Elze, T., Fang, Y., & Wang, M. (2024). FairVision: Equitable Deep Learning for Eye Disease Screening via Fair Identity Scaling. *arXiv preprint arXiv:2310.02492*.
Provided by:
harvardairobotics