ceselder/loracle-ia-loraqa-v3
收藏Hugging Face2026-04-18 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/ceselder/loracle-ia-loraqa-v3
下载链接
链接失效反馈官方服务:
资源简介:
---
license: mit
---
# loracle-ia-loraqa-v3
Upgraded QA dataset for the LoRA Oracles project. ~10 QA pairs per LoRA across 930 Qwen3-14B model organisms (all 8 paper training categories: Quirks, Backdoors, Benign/Harmful Roleplay, Obscured Malign, Heuristic Following, Sandbaggers, Rare Quirks).
## Schema
- `prompt_id`: LoRA identifier, format `qwen_3_14b_<name>`
- `question`: user-side prompt
- `answer`: target introspective answer
- `qa_type`: one of `introspection`, `yes`, `no`, `trigger_yes`, `trigger_no`
- `system_prompt`: `Behavior: X\nTrigger: Y` (Y empty for non-backdoor categories)
- `behavior_description`: natural-prose description
## Files
- `loraqa_all.parquet` — full 9300 rows
- `loraqa_train.parquet` — 8640 rows / 864 LoRAs (training split)
- `loraqa_heldout.parquet` — 660 rows / 66 LoRAs (eval split, no canonical-answer leak with train)
- `prompts.parquet` — LoRA metadata (prompt_id + system_prompt + behavior_description + category)
- `heldout_ids.json` — list of held-out prompt_ids
## QA type composition (per LoRA)
- 4 `introspection`: varied author-pool Qs with per-row tweaked canonical answers
- 2-3 `yes`: Qs naming THIS LoRA's behavior, answer = "Yes, [canonical]"
- 2-3 `no`: Qs naming a SIBLING LoRA's (cross-referenced) behavior, answer = "No — I don't X. I [canonical]"
- 1 `trigger_yes` + 1 `trigger_no` (backdoors + problematic_backdoor only)
## Held-out split
66 LoRAs held out with canonical-answer dedup: no held-out LoRA's canonical propensity byte-matches any training LoRA's (fixes the 53% description-leak in prior heldout_ia).
提供机构:
ceselder



