
synduality/hospital

Hugging Face · Updated 2026-03-26 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/synduality/hospital
Description:
# SmolDocling Hospital Privacy Synthetic Dataset

Synthetic single-page "discharge summary" documents for evaluating privacy leakage in document vision-language models (SmolDocling) in a hospital setting.

- 1000 train / 200 val / 200 test pages
- Each page: scanned-like discharge summary with fixed tables (meds, labs)
- Targets: DocTags markup describing structure and content
- Canaries: `CANARY-...` tokens inserted only in a subset of train `patient_id` fields
- Decoys: for each canary, ~1000 decoy strings used for exposure ranking

**Intended use:** research on membership inference, canary exposure, and differentially private LoRA fine-tuning on document VLMs.

**Note:** All data is synthetic; no real patient data is used.
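The dataset card mentions ranking each canary against its decoys but does not say how exposure is scored. The sketch below is one common formulation, assumed here rather than taken from the dataset authors: the rank-based exposure metric, `log2(number of candidates) - log2(rank of the canary)`, where candidates are ordered by a per-string model loss. The function names and the idea of passing in precomputed losses are hypothetical and only serve the illustration.

```python
import math
from typing import Sequence


def canary_rank(canary_loss: float, decoy_losses: Sequence[float]) -> int:
    """Return the 1-based rank of the canary among canary + decoys.

    Assumption: a lower loss means the model finds the string more likely,
    so rank 1 is the most memorized-looking candidate.
    """
    # Count the decoys that the model scores strictly better than the canary.
    return 1 + sum(1 for loss in decoy_losses if loss < canary_loss)


def exposure(canary_loss: float, decoy_losses: Sequence[float]) -> float:
    """Rank-based exposure in bits (assumed metric, not confirmed by the card)."""
    n_candidates = len(decoy_losses) + 1  # the decoys plus the canary itself
    rank = canary_rank(canary_loss, decoy_losses)
    return math.log2(n_candidates) - math.log2(rank)
```

With the roughly 1000 decoys per canary described above, a canary ranked first would score about 9.97 bits under this definition, and a canary ranked last would score 0 bits.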
Provider:
synduality