arthrod/gliner-opf-ptbr-pii-bench-v1

Name: arthrod/gliner-opf-ptbr-pii-bench-v1
Creator: arthrod
Published: 2026-04-23 12:39:51
License: No description available

Source: Hugging Face. Updated: 2026-04-23. Indexed: 2026-04-26.

Download link:

https://hf-mirror.com/datasets/arthrod/gliner-opf-ptbr-pii-bench-v1

Overview:

PT-BR PII Benchmark v1 is a benchmark dataset that compares four models on Personally Identifiable Information (PII) detection in Brazilian Portuguese. It includes the model comparison results, performance metrics, prediction outputs, and related code, and is intended as a practical reference for building production-grade redaction pipelines. The key findings are significant differences in recall and precision across models, in particular a large gap in false-positive rate between the mmBERT-small and opf-fine-tune models. The dataset also documents the evaluation protocol and the reproduction steps, so that results are comparable and reproducible.
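
To make the metrics named above concrete, here is a minimal sketch of span-level precision, recall, and false-positive counting for PII detection. It is illustrative only: the listing does not state the benchmark's matching rule, so exact match on (start, end, label) is an assumption, and the `Span` type, the `score_spans` function, and the sample labels are hypothetical names, not part of the dataset's code.

```python
from typing import Dict, Iterable, NamedTuple


class Span(NamedTuple):
    """A detected or annotated PII span (hypothetical type for illustration)."""
    start: int
    end: int
    label: str


def score_spans(gold: Iterable[Span], pred: Iterable[Span]) -> Dict[str, float]:
    """Score predictions against gold spans using exact-match (assumed rule)."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)   # predicted spans that match a gold span
    fp = len(pred_set - gold_set)   # predicted spans with no gold match
    fn = len(gold_set - pred_set)   # gold spans the model missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Made-up example data, not taken from the benchmark.
gold = [Span(0, 14, "CPF"), Span(20, 30, "NAME")]
pred = [Span(0, 14, "CPF"), Span(40, 45, "PHONE")]
result = score_spans(gold, pred)
print(result)
```

A model with high recall but many false positives over-redacts text, while one with high precision but low recall leaks PII, which is why a redaction pipeline would usually look at both numbers rather than F1 alone.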

Provider:

arthrod