ceselder/loracle-ia-loraqa-v3

Name: ceselder/loracle-ia-loraqa-v3
Creator: ceselder
Published: 2026-04-18 21:05:03
License: 暂无描述

Hugging Face2026-04-18 更新2026-04-26 收录

下载链接：

https://hf-mirror.com/datasets/ceselder/loracle-ia-loraqa-v3

下载链接

链接失效反馈

官方服务：

资源简介：

--- license: mit --- # loracle-ia-loraqa-v3 Upgraded QA dataset for the LoRA Oracles project. ~10 QA pairs per LoRA across 930 Qwen3-14B model organisms (all 8 paper training categories: Quirks, Backdoors, Benign/Harmful Roleplay, Obscured Malign, Heuristic Following, Sandbaggers, Rare Quirks). ## Schema - `prompt_id`: LoRA identifier, format `qwen_3_14b_<name>` - `question`: user-side prompt - `answer`: target introspective answer - `qa_type`: one of `introspection`, `yes`, `no`, `trigger_yes`, `trigger_no` - `system_prompt`: `Behavior: X\nTrigger: Y` (Y empty for non-backdoor categories) - `behavior_description`: natural-prose description ## Files - `loraqa_all.parquet` — full 9300 rows - `loraqa_train.parquet` — 8640 rows / 864 LoRAs (training split) - `loraqa_heldout.parquet` — 660 rows / 66 LoRAs (eval split, no canonical-answer leak with train) - `prompts.parquet` — LoRA metadata (prompt_id + system_prompt + behavior_description + category) - `heldout_ids.json` — list of held-out prompt_ids ## QA type composition (per LoRA) - 4 `introspection`: varied author-pool Qs with per-row tweaked canonical answers - 2-3 `yes`: Qs naming THIS LoRA's behavior, answer = "Yes, [canonical]" - 2-3 `no`: Qs naming a SIBLING LoRA's (cross-referenced) behavior, answer = "No — I don't X. I [canonical]" - 1 `trigger_yes` + 1 `trigger_no` (backdoors + problematic_backdoor only) ## Held-out split 66 LoRAs held out with canonical-answer dedup: no held-out LoRA's canonical propensity byte-matches any training LoRA's (fixes the 53% description-leak in prior heldout_ia).

提供机构：

ceselder

5,000+

优质数据集

54 个

任务类型

进入经典数据集