cycloevan/gdpr-dpo-2277-targeted

Name: cycloevan/gdpr-dpo-2277-targeted
Creator: cycloevan
Published: 2026-04-19 05:53:57
License: No description provided

Source: Hugging Face. Updated 2026-04-19; indexed 2026-04-26.

Download link:

https://hf-mirror.com/datasets/cycloevan/gdpr-dpo-2277-targeted


Resource description:

---
license: apache-2.0
language:
- en
task_categories:
- text-generation
tags:
- gdpr
- compliance
- legal
- privacy
- dpo
- preference-learning
- rlhf
size_categories:
- 1K<n<10K
pretty_name: GDPR DPO Targeted Rejections (2,277 pairs)
---

# GDPR DPO Targeted Rejections (2,277 preference pairs)

Direct Preference Optimization (DPO) dataset for GDPR compliance Q&A in English. Unlike self-play rejections (same model's degraded outputs), this dataset uses an **external LLM (GPT-4o-mini) to generate rejections with five controlled error types**, length-matched to the chosen answer.

## How rejections were generated

Rejections deliberately introduce one of five controlled error types, evenly distributed:

| Error Type | Description | DPO Signal Learned |
|---|---|---|
| `wrong_article` | Cite plausible but incorrect article numbers | correct citation > wrong citation |
| `misapplied_principle` | Confuse consent with legitimate interest, etc. | correct principle > misapplied |
| `fictional_rule` | Invent non-existent articles (e.g., Art 6(1)(h)) | real provisions > fabricated |
| `scope_confusion` | Apply GDPR where it doesn't apply | correct scope > over/under-application |
| `incomplete_confident` | Omit critical obligations while sounding sure | complete > overconfident partial |

### Quality contrast vs self-play rejections

| Metric | Self-play (typical) | **This dataset** | Target |
|---|---|---|---|
| Length ratio (rejected/chosen) | 0.56 | **1.06** | ~1.0 |
| Within ±15% target range | n/a | **86.7%** | n/a |
| Article citation gap | 2.34/sample | **0.18/sample** | ~0 |
| JSON parsing success | n/a | 100% | n/a |

This length-matching and citation-density-matching is intentional: it prevents DPO from learning the trivial shortcut "longer + more citations = better" that plagues self-play data.

## Schema

```json
{
  "instruction": "Detail the obligations of ...",
  "input": "Can you detail the specific obligations ...",
  "output": "Under the Clinical Trials Regulation (CTR), ... Article 47 ... Article 56 ...",
  "rejected": "Under the Clinical Trials Regulation (CTR), ... Article 45 ... Article 54 ..."
}
```

`output` is the "chosen" answer; `rejected` is the targeted wrong answer.

## Quick load (for DPO training)

```python
from datasets import load_dataset
from trl import DPOTrainer

ds = load_dataset("cycloevan/gdpr-dpo-2277-targeted", split="train")

def to_dpo_format(ex, tokenizer):
    messages = [{"role": "user", "content": f"{ex['instruction']}\n\n{ex['input']}"}]
    return {
        "prompt": tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True),
        "chosen": ex["output"],
        "rejected": ex["rejected"],
    }
```

## Honest empirical findings

In the source project (`gdpr-gemma2`, Gemma-2-2B-it, QLoRA r=16, β=0.1, sigmoid DPO loss, 3 epochs), training on this dataset produced a **DPO v3-external** model that:

- **Beats self-play DPO** on all four GPT-4o qualitative criteria (Legal Correctness +0.52, Compliance +0.74, Clarity +0.56). This confirms the direction expected from the Zephyr/UltraFeedback literature (external preference > self-play).
- **Does NOT beat Base `gemma-2-2b-it`** on any qualitative criterion. N=2,277 is insufficient to overcome the Base ceiling for a 2B model on legal Q&A, which is consistent with DPO's documented signal-to-noise scaling behaviour (Zephyr uses 60k, Tulu-2 uses 32k).

So: **the dataset provides a cleaner preference signal than self-play but cannot by itself lift a near-ceiling 2B base**. It is expected to yield measurable improvement on larger bases (9B+) or when combined with retrieval augmentation.

## Known limitations

- **English only**.
- **Error types are synthetic**: not sampled from real model failure patterns; simulated via prompt engineering of GPT-4o-mini.
- **Generated by a 2024-class proprietary model**: fidelity may degrade as GDPR evolves.
- **Not a benchmark**: intended for training, not for evaluation. Use `sims2k/GDPR_QA_instruct_dataset` or held-out custom test sets for evaluation.

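
The length-matching figures in the quality-contrast table (rejected/chosen ratio and the share of pairs within ±15%) can be sanity-checked with a short script. The sketch below is only an illustration: the card does not say whether length was measured in characters, words, or tokens, so whitespace-separated word count is an assumption here, and `length_ratio_stats` and `toy_pairs` are hypothetical names, not part of the dataset or the source project.

```python
from statistics import mean


def length_ratio_stats(pairs, tolerance=0.15):
    """Summarise rejected/chosen length ratios for preference pairs.

    Each pair is a dict with "output" (chosen) and "rejected" keys, as in
    the schema above. Length is counted in whitespace-separated words,
    which is an assumption; the card does not state the unit it used.
    """
    ratios = []
    for ex in pairs:
        chosen_len = len(ex["output"].split())
        rejected_len = len(ex["rejected"].split())
        if chosen_len == 0:
            continue  # skip pairs with an empty chosen answer
        ratios.append(rejected_len / chosen_len)
    if not ratios:
        raise ValueError("no pairs with a non-empty chosen answer")
    within = sum(1 for r in ratios if abs(r - 1.0) <= tolerance)
    return {
        "mean_ratio": mean(ratios),
        "within_tolerance": within / len(ratios),
    }


# Toy data for illustration only; these are not rows from the dataset.
toy_pairs = [
    {"output": "a b c d", "rejected": "a b c d"},  # ratio 1.0, inside the band
    {"output": "a b c d", "rejected": "a b"},      # ratio 0.5, outside the band
]
stats = length_ratio_stats(toy_pairs)
print(stats)
```

To run it on the real data, the `ds` object from the quick-load snippet can be passed in place of `toy_pairs`, since iterating a `datasets.Dataset` yields one dict per row. The result may not match the reported 1.06 and 86.7% exactly if the authors measured length differently.
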
## Citation

```bibtex
@misc{gdpr_dpo_targeted_2026,
  title        = {GDPR DPO Targeted Rejections: 2,277 length-matched preference pairs with 5 controlled error types},
  author       = {seok-hee97},
  year         = {2026},
  howpublished = {Hugging Face Datasets},
  url          = {https://huggingface.co/datasets/cycloevan/gdpr-dpo-2277-targeted}
}
```

## License

Apache-2.0. Downstream users are responsible for compliance with OpenAI's terms of service for content generated via `gpt-4o-mini`.
