synduality/hospital
Source: Hugging Face · Updated: 2026-03-26 · Indexed: 2026-03-29
Download link:
https://hf-mirror.com/datasets/synduality/hospital
Resource description:
# SmolDocling Hospital Privacy Synthetic Dataset
Synthetic single-page "discharge summary" documents for evaluating
privacy leakage in document vision-language models (SmolDocling) in
a hospital setting.
- 1000 train / 200 val / 200 test pages
- Each page: a scan-like discharge summary with fixed tables (meds, labs)
- Targets: DocTags markup describing structure and content
- Canaries: `CANARY-...` tokens inserted only in a subset of train `patient_id` fields
- Decoys: for each canary, ~1000 decoy strings used for exposure ranking
**Intended use:** research on membership inference, canary exposure,
and differentially private LoRA fine-tuning on document VLMs.
**Note:** All data is synthetic; no real patient data is used.
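
The card does not say how the decoys are turned into an exposure score. One common choice is the rank-based exposure metric, where a canary is ranked by model loss among its decoys and exposure is `log2(N) - log2(rank)`. The sketch below assumes that formulation; `loss_fn`, the example strings, and the toy loss values are hypothetical stand-ins, not part of the dataset or of any SmolDocling API.

```python
import math
from typing import Callable, Sequence


def exposure(canary: str,
             decoys: Sequence[str],
             loss_fn: Callable[[str], float]) -> float:
    """Rank-based exposure of a canary among its decoys (assumed formulation).

    loss_fn is a hypothetical callable returning the model's loss
    (e.g. negative log-likelihood) for a candidate string; lower loss
    means the model finds the string more likely.
    """
    canary_loss = loss_fn(canary)
    # Rank 1 means no decoy has a strictly lower loss than the canary.
    # Ties are resolved in the canary's favour (an optimistic rank).
    rank = 1 + sum(1 for d in decoys if loss_fn(d) < canary_loss)
    n_candidates = len(decoys) + 1  # decoys plus the canary itself
    return math.log2(n_candidates) - math.log2(rank)


# Toy example with made-up losses: 1023 decoys plus 1 canary = 1024 candidates.
toy_decoys = [f"DECOY-{i:04d}" for i in range(1023)]
toy_losses = {d: 1.0 + i for i, d in enumerate(toy_decoys)}
toy_losses["CANARY-TEST"] = 0.5  # hypothetical canary with the lowest loss

memorized = exposure("CANARY-TEST", toy_decoys, toy_losses.__getitem__)

toy_losses["CANARY-TEST"] = 1e9  # same canary, now with the highest loss
not_memorized = exposure("CANARY-TEST", toy_decoys, toy_losses.__getitem__)
```

Under this definition, a canary the model ranks first among roughly 1000 candidates scores about 10 bits of exposure, while one ranked last scores 0. In practice `loss_fn` would wrap the fine-tuned model's per-sequence loss on the `patient_id` field.
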
Provider:
synduality



