jayluxferro/llm-redactor-leak-benchmark

Name: jayluxferro/llm-redactor-leak-benchmark
Creator: jayluxferro
Published: 2026-04-16 05:00:25
License: MIT

Hugging Face | Updated 2026-04-16 | Indexed 2026-04-26

Download link:

https://hf-mirror.com/datasets/jayluxferro/llm-redactor-leak-benchmark

Resource description:

---
language:
- en
license: mit
pretty_name: LLM-Redactor Leak Benchmark
size_categories:
- 1K<n<10K
task_categories:
- token-classification
tags:
- privacy
- pii
- redaction
- llm
- secrets-detection
- ner
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*.parquet
- config_name: wl1_pii
  data_files:
  - split: train
    path: wl1_pii/train-*.parquet
- config_name: wl2_secrets
  data_files:
  - split: train
    path: wl2_secrets/train-*.parquet
- config_name: wl3_implicit
  data_files:
  - split: train
    path: wl3_implicit/train-*.parquet
- config_name: wl4_code
  data_files:
  - split: train
    path: wl4_code/train-*.parquet
---

# LLM-Redactor Leak Benchmark

A benchmark of **1,300 synthetic prompts** with **4,014 ground-truth annotations** spanning four workload classes, designed to evaluate privacy-preserving techniques for outbound LLM requests.

Released alongside the paper:

> **LLM-Redactor: An Empirical Evaluation of Eight Techniques for
> Privacy-Preserving LLM Requests**
>
> Justice Owusu Agyemang, Jerry John Kponyo, Elliot Amponsah,
> Godfred Manu Addo Boakye, Kwame Opuni-Boachie Obour Agyekum
>
> [arXiv:2604.12064](https://arxiv.org/abs/2604.12064)

## Workload classes

| Config | Samples | Description |
|--------|---------|-------------|
| `wl1_pii` | 500 | Names, emails, phone numbers, addresses, SSNs, employee IDs |
| `wl2_secrets` | 300 | API keys, AWS credentials, passwords, hostnames in configs/code |
| `wl3_implicit` | 200 | Indirect references that identify people or organisations |
| `wl4_code` | 300 | Internal function names, database schemas, project names |

## Usage

```python
from datasets import load_dataset

# Load everything
ds = load_dataset("jayluxferro/llm-redactor-leak-benchmark")

# Load a single workload
pii = load_dataset("jayluxferro/llm-redactor-leak-benchmark", "wl1_pii")
```

## Schema

Each sample has the following fields:

| Field | Type | Description |
|-------|------|-------------|
| `id` | `string` | Unique identifier (e.g. `wl1_0042`) |
| `text` | `string` | The input prompt to evaluate |
| `workload` | `string` | Workload class (`wl1_pii`, `wl2_secrets`, `wl3_implicit`, `wl4_code`) |
| `annotations` | `list[object]` | Ground-truth sensitive spans |
| `annotations[].start` | `int` | Start character offset |
| `annotations[].end` | `int` | End character offset |
| `annotations[].kind` | `string` | Sensitivity type (e.g. `person`, `email`, `api_key`, `implicit`) |
| `annotations[].text` | `string` | The verbatim sensitive span |

## Annotation kinds

**WL1 (PII):** `person`, `email`, `phone`, `address`, `ssn`, `employee_id`, `org_name`

**WL2 (Secrets):** `aws_access_key`, `aws_secret_key`, `api_key`, `password`, `hostname`

**WL3 (Implicit):** `implicit`, `org_name`

**WL4 (Code):** `project_name`, `org_name`, `internal_function`, `database_name`, `table_name`, `api_key`, `hostname`

## Citation

```bibtex
@article{agyemang2026llmredactor,
  title={LLM-Redactor: An Empirical Evaluation of Eight Techniques for Privacy-Preserving LLM Requests},
  author={Agyemang, Justice Owusu and Kponyo, Jerry John and Amponsah, Elliot and Boakye, Godfred Manu Addo and Agyekum, Kwame Opuni-Boachie Obour},
  year={2026},
  url={https://arxiv.org/abs/2604.12064}
}
```

## License

MIT
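
## Example: checking annotations against a redacted prompt

The schema above suggests two simple checks a user might run: that each annotation's offsets reproduce its stored span, and which annotated spans survive verbatim after a redaction step. The following is a minimal sketch, not the paper's evaluation code. It assumes that `end` is an exclusive offset (Python slice semantics), which the card does not state, and that each sample behaves like a dictionary with the documented fields. The helper names `check_offsets` and `leaked_spans`, the sample record, and the `[PERSON]` placeholder are all hypothetical.

```python
def check_offsets(sample):
    """Return annotations whose offsets do not reproduce the stored span text.

    Assumes `end` is exclusive, so text[start:end] should equal annotation["text"].
    """
    text = sample["text"]
    return [a for a in sample["annotations"]
            if text[a["start"]:a["end"]] != a["text"]]


def leaked_spans(sample, redacted_text):
    """Return annotations whose verbatim span still appears in redacted_text."""
    return [a for a in sample["annotations"] if a["text"] in redacted_text]


# Hypothetical record following the documented schema (not taken from the dataset).
sample = {
    "id": "example_0001",
    "text": "Contact Jane Doe at jane.doe@example.com.",
    "workload": "wl1_pii",
    "annotations": [
        {"start": 8, "end": 16, "kind": "person", "text": "Jane Doe"},
        {"start": 20, "end": 40, "kind": "email", "text": "jane.doe@example.com"},
    ],
}

# Hypothetical output of a redactor that masked the name but missed the email.
redacted = "Contact [PERSON] at jane.doe@example.com."

bad = check_offsets(sample)
leaked = leaked_spans(sample, redacted)
leak_rate = len(leaked) / len(sample["annotations"])

print("offset mismatches:", len(bad))
print("leaked kinds:", [a["kind"] for a in leaked])
print("leak rate:", leak_rate)
```

The same functions could be applied to rows of the `train` split returned by `load_dataset`. Verbatim substring matching is a deliberately crude leak test: it misses paraphrased or partially masked spans, so it is best treated as a lower bound on leakage.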

Provided by:

jayluxferro
