cycloevan/gdpr-dpo-2277-targeted
Source: Hugging Face · Updated 2026-04-19 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/cycloevan/gdpr-dpo-2277-targeted
---
license: apache-2.0
language:
- en
task_categories:
- text-generation
tags:
- gdpr
- compliance
- legal
- privacy
- dpo
- preference-learning
- rlhf
size_categories:
- 1K<n<10K
pretty_name: GDPR DPO Targeted Rejections (2,277 pairs)
---
# GDPR DPO Targeted Rejections (2,277 preference pairs)
A Direct Preference Optimization (DPO) dataset for English-language GDPR
compliance Q&A. Unlike self-play data, where rejections are degraded outputs
of the same model, this dataset uses an **external LLM (GPT-4o-mini) to
generate rejections with five controlled error types**, each length-matched
to the chosen answer.
## How rejections were generated
Each rejection deliberately introduces one of five controlled error types,
distributed evenly across the dataset:
| Error Type | Description | DPO Signal Learned |
|---|---|---|
| `wrong_article` | Cite plausible but incorrect article numbers | correct citation > wrong citation |
| `misapplied_principle` | Confuse consent with legitimate interest, etc. | correct principle > misapplied |
| `fictional_rule` | Invent non-existent articles (e.g., Art 6(1)(h)) | real provisions > fabricated |
| `scope_confusion` | Apply GDPR where it doesn't apply | correct scope > over/under-application |
| `incomplete_confident` | Omit critical obligations while sounding sure | complete > overconfident partial |
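
As an illustration of what a `fictional_rule` rejection looks like in practice, the sketch below flags citations of Article 6(1) points that do not exist. It relies on the fact that Article 6(1) GDPR only has points (a) through (f). The function name and the citation regex are assumptions made for this example; they are not part of the dataset or its generation pipeline, and they cover only this one kind of fabrication.

```python
import re

# Assumption: citations are written like "Art 6(1)(h)", "Art. 6(1)(h)" or "Article 6(1)(h)".
ART6_1_POINT = re.compile(r"\bArt(?:icle|\.)?\s*6\s*\(1\)\s*\(([a-z])\)", re.IGNORECASE)

# Article 6(1) GDPR lists exactly six lawful bases, points (a) to (f).
VALID_ART6_1_POINTS = set("abcdef")


def fabricated_art6_points(text):
    """Return the sorted Article 6(1) point letters cited in `text` that do not exist."""
    cited = {m.group(1).lower() for m in ART6_1_POINT.finditer(text)}
    return sorted(cited - VALID_ART6_1_POINTS)


sample = "Processing is lawful under Art 6(1)(h) and Article 6(1)(f)."
print(fabricated_art6_points(sample))
```

A check like this can only catch the narrow pattern it encodes; the other four error types need legal review or a stronger judge model.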
### Quality contrast vs self-play rejections
| Metric | Self-play (typical) | **This dataset** | Target |
|---|---|---|---|
| Length ratio (rejected/chosen) | 0.56 | **1.06** | ~1.0 |
| Within ±15% target range | — | **86.7%** | — |
| Article citation gap | 2.34/sample | **0.18/sample** | ~0 |
| JSON parsing success | — | 100% | — |
The length matching and citation-density matching are intentional: they are
meant to keep DPO from learning the trivial shortcut "longer + more
citations = better" that plagues self-play data.
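
The metrics in the table can be approximated on any subset of pairs with a few lines of Python. This is a minimal sketch, not the script used to produce the reported numbers: it assumes length is counted in whitespace-separated words, that a citation is anything matching `Article N` / `Art. N`, and that the "gap" is the absolute difference in citation counts. The helper names are hypothetical.

```python
import re

# Assumption: a citation looks like "Article 47", "Art. 6" or "Art 6".
CITATION = re.compile(r"\bArt(?:icle|\.)?\s*\d+", re.IGNORECASE)


def length_ratio(chosen, rejected):
    """Rejected/chosen length ratio, counted in whitespace-separated words (assumption)."""
    return len(rejected.split()) / max(len(chosen.split()), 1)


def citation_gap(chosen, rejected):
    """Absolute difference in the number of article citations (assumption)."""
    return abs(len(CITATION.findall(chosen)) - len(CITATION.findall(rejected)))


def summarize(pairs, tolerance=0.15):
    """Aggregate the three table metrics over records with `output` and `rejected` fields."""
    ratios = [length_ratio(p["output"], p["rejected"]) for p in pairs]
    gaps = [citation_gap(p["output"], p["rejected"]) for p in pairs]
    n = len(pairs)
    return {
        "mean_length_ratio": sum(ratios) / n,
        "share_within_tolerance": sum(abs(r - 1.0) <= tolerance for r in ratios) / n,
        "mean_citation_gap": sum(gaps) / n,
    }


toy_pairs = [
    {"output": "Article 47 requires a report and Article 56 requires records.",
     "rejected": "Article 45 requires a report and Article 54 requires records."},
    {"output": "Consent must be freely given under Article 7.",
     "rejected": "Consent is optional."},
]
print(summarize(toy_pairs))
```

The first toy pair mimics a length-matched targeted rejection; the second mimics a short, citation-free rejection of the kind the table attributes to self-play data.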
## Schema
```json
{
  "instruction": "Detail the obligations of ...",
  "input": "Can you detail the specific obligations ...",
  "output": "Under the Clinical Trials Regulation (CTR), ... Article 47 ... Article 56 ...",
  "rejected": "Under the Clinical Trials Regulation (CTR), ... Article 45 ... Article 54 ..."
}
```
`output` is the "chosen" answer; `rejected` is the targeted wrong answer.
## Quick load (for DPO training)
```python
from datasets import load_dataset
from trl import DPOTrainer  # trainer that consumes the formatted pairs

ds = load_dataset("cycloevan/gdpr-dpo-2277-targeted", split="train")


def to_dpo_format(ex, tokenizer):
    # Join instruction and input into a single user turn.
    messages = [{"role": "user", "content": f"{ex['instruction']}\n\n{ex['input']}"}]
    return {
        # Render the prompt with the model's chat template, ending at the assistant turn.
        "prompt": tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True),
        "chosen": ex["output"],
        "rejected": ex["rejected"],
    }
```
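
If you would rather not apply the chat template yourself, recent TRL documentation also describes a conversational preference format in which `prompt`, `chosen` and `rejected` are lists of role/content messages. The sketch below maps a record into that shape. The function name `to_conversational` is hypothetical, and whether your installed TRL version accepts this format is an assumption you should verify against its documentation.

```python
def to_conversational(ex):
    """Map one dataset record to a conversational preference example (assumed TRL format)."""
    # Same prompt construction as above: instruction and input joined into one user turn.
    prompt_text = f"{ex['instruction']}\n\n{ex['input']}"
    return {
        "prompt": [{"role": "user", "content": prompt_text}],
        "chosen": [{"role": "assistant", "content": ex["output"]}],
        "rejected": [{"role": "assistant", "content": ex["rejected"]}],
    }


record = {
    "instruction": "Detail the obligations of ...",
    "input": "Can you detail the specific obligations ...",
    "output": "Under the Clinical Trials Regulation (CTR), ...",
    "rejected": "Under the Clinical Trials Regulation (CTR), ... (wrong articles)",
}
print(to_conversational(record)["prompt"][0]["content"])
```

With a loaded dataset, this could be applied as `ds.map(to_conversational, remove_columns=["instruction", "input", "output"])`, which drops the original text columns while the new `rejected` value overwrites the old one.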
## Honest empirical findings
In the source project (`gdpr-gemma2`, Gemma-2-2B-it, QLoRA r=16, β=0.1,
sigmoid DPO loss, 3 epochs), training on this dataset produced a **DPO v3-external**
model that:
- **Beats self-play DPO** on all four GPT-4o-judged qualitative criteria
(reported gains include Legal Correctness +0.52, Compliance +0.74,
Clarity +0.56). This is consistent with the direction expected from the
Zephyr/UltraFeedback literature (external preference > self-play).
- **Does NOT beat Base `gemma-2-2b-it`** on any qualitative criterion.
N=2,277 appears insufficient to overcome the Base ceiling for a 2B model on
legal Q&A, which is consistent with DPO's documented signal-to-noise scaling
behaviour (Zephyr uses 60k, Tulu-2 uses 32k).
So: **the dataset provides a cleaner preference signal than self-play but
cannot by itself lift a near-ceiling 2B base**. It is expected to yield
measurable improvement on larger bases (9B+) or when combined with
retrieval augmentation.
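
For readers unfamiliar with the objective named in the setup above, the sigmoid DPO loss for one preference pair is `-log σ(β · [(log π(chosen) − log π_ref(chosen)) − (log π(rejected) − log π_ref(rejected))])`. The sketch below computes it from sequence log-probabilities with β=0.1, the value quoted above. It is a plain-Python illustration with a hypothetical function name, not the training code of the source project.

```python
import math


def dpo_sigmoid_loss(policy_chosen_logp, policy_rejected_logp,
                     ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Sigmoid DPO loss for a single pair, given summed sequence log-probabilities."""
    # How much more (in log space) the policy likes each answer than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss equals log(2).
print(dpo_sigmoid_loss(-15.0, -15.0, -15.0, -15.0))
# Policy shifts probability toward the chosen answer: the loss drops below log(2).
print(dpo_sigmoid_loss(-10.0, -20.0, -15.0, -15.0))
```

Because the loss depends only on the difference between the two log-ratios, a rejection that differs from the chosen answer mainly in length or citation count can be separated without learning anything about legal correctness, which is the motivation for the matching described earlier.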
## Known limitations
- **English only**.
- **Error types are synthetic**: not sampled from real model failure
patterns; simulated via prompt engineering of GPT-4o-mini.
- **Generated by a 2024-class proprietary model**: fidelity may degrade
as GDPR evolves.
- **Not a benchmark**: intended for training, not for evaluation. Use
`sims2k/GDPR_QA_instruct_dataset` or held-out custom test sets for
evaluation.
## Citation
```bibtex
@misc{gdpr_dpo_targeted_2026,
title = {GDPR DPO Targeted Rejections: 2,277 length-matched preference pairs with 5 controlled error types},
author = {seok-hee97},
year = {2026},
howpublished = {Hugging Face Datasets},
url = {https://huggingface.co/datasets/cycloevan/gdpr-dpo-2277-targeted}
}
```
## License
Apache-2.0. Downstream users are responsible for compliance with OpenAI's
terms of service for content generated via `gpt-4o-mini`.
Provider:
cycloevan



