Venturus/AnonyMED-BR
Source: Hugging Face · Updated: 2026-01-06 · Indexed: 2025-10-18
Download link:
https://hf-mirror.com/datasets/Venturus/AnonyMED-BR
Description:
AnonyMED-BR is a dataset for research on medical text anonymization in Brazilian Portuguese. It combines real electronic health records (EHRs) with synthetically generated medical records to support the development of robust transformer-based models. The language is Brazilian Portuguese, the domain is clinical/medical, and the supported tasks are named entity recognition (NER), de-identification, and anonymization. The dataset contains 2,962 samples. Due to ethical and legal constraints, only the synthetic portion is publicly available.
Provider:
Venturus



