arthrod/gliner-opf-ptbr-pii-bench-v1
Source: Hugging Face | Updated: 2026-04-23 | Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/arthrod/gliner-opf-ptbr-pii-bench-v1
Description:
PT-BR PII Benchmark v1 is a benchmark dataset that compares four models on personally identifiable information (PII) detection in Brazilian Portuguese. It includes the model comparison results, performance metrics, prediction outputs, and related code, and is intended as a practical reference for building production-grade redaction pipelines. Key findings include significant differences in recall and precision across the models, notably a large gap in false-positive rates between the mmBERT-small and opf-fine-tune models. The dataset also documents the evaluation protocol and reproduction steps so that results are comparable and reproducible.
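The precision, recall, and false-positive figures mentioned above can be illustrated with a minimal sketch. It assumes exact-match scoring over (start, end, label) spans, which may differ from the benchmark's actual evaluation protocol; the `Span` type, the `span_scores` helper, and the sample labels are hypothetical and are not taken from the dataset.

```python
from typing import Dict, Iterable, NamedTuple


class Span(NamedTuple):
    """A hypothetical PII span: character offsets plus an entity label."""
    start: int
    end: int
    label: str


def span_scores(gold: Iterable[Span], pred: Iterable[Span]) -> Dict[str, float]:
    """Score predictions against gold spans using exact matching.

    Assumption: a prediction counts as correct only if start, end, and
    label all match a gold span. Partial overlaps count as errors.
    """
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)   # predicted and present in gold
    fp = len(pred_set - gold_set)   # predicted but not in gold
    fn = len(gold_set - pred_set)   # in gold but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Illustrative data only: two gold spans, three predictions.
gold = [Span(0, 14, "CPF"), Span(20, 30, "NAME")]
pred = [Span(0, 14, "CPF"), Span(40, 45, "EMAIL"), Span(50, 55, "PHONE")]
scores = span_scores(gold, pred)
print(scores)
```

In a redaction pipeline, false positives over-redact harmless text while false negatives leak PII, so two models with similar F1 can behave very differently in production. This is why the false-positive gap is worth reading alongside recall.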
Provider:
arthrod



