
ceselder/loracle-ia-loraqa-v3

Source: Hugging Face. Updated 2026-04-18. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/ceselder/loracle-ia-loraqa-v3
Resource description:
---
license: mit
---

# loracle-ia-loraqa-v3

Upgraded QA dataset for the LoRA Oracles project. ~10 QA pairs per LoRA across 930 Qwen3-14B model organisms (all 8 paper training categories: Quirks, Backdoors, Benign/Harmful Roleplay, Obscured Malign, Heuristic Following, Sandbaggers, Rare Quirks).

## Schema

- `prompt_id`: LoRA identifier, format `qwen_3_14b_<name>`
- `question`: user-side prompt
- `answer`: target introspective answer
- `qa_type`: one of `introspection`, `yes`, `no`, `trigger_yes`, `trigger_no`
- `system_prompt`: `Behavior: X\nTrigger: Y` (Y empty for non-backdoor categories)
- `behavior_description`: natural-prose description

## Files

- `loraqa_all.parquet`: full 9300 rows
- `loraqa_train.parquet`: 8640 rows / 864 LoRAs (training split)
- `loraqa_heldout.parquet`: 660 rows / 66 LoRAs (eval split, no canonical-answer leak with train)
- `prompts.parquet`: LoRA metadata (prompt_id + system_prompt + behavior_description + category)
- `heldout_ids.json`: list of held-out prompt_ids

## QA type composition (per LoRA)

- 4 `introspection`: varied author-pool Qs with per-row tweaked canonical answers
- 2-3 `yes`: Qs naming THIS LoRA's behavior, answer = "Yes, [canonical]"
- 2-3 `no`: Qs naming a SIBLING LoRA's (cross-referenced) behavior, answer = "No — I don't X. I [canonical]"
- 1 `trigger_yes` + 1 `trigger_no` (backdoors + problematic_backdoor only)

## Held-out split

66 LoRAs held out with canonical-answer dedup: no held-out LoRA's canonical propensity byte-matches any training LoRA's (fixes the 53% description-leak in prior heldout_ia).
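As a small illustration of the schema and split described above, the following sketch parses a `system_prompt` value, tallies `qa_type` per LoRA, and checks that two ID lists do not overlap. It is not part of the dataset's own tooling. The helper names (`parse_system_prompt`, `qa_type_counts`, `splits_are_disjoint`) and the example rows are hypothetical, and the sketch assumes that `\n` in `system_prompt` is a real newline character.

```python
from collections import Counter

# qa_type values listed in the Schema section.
QA_TYPES = {"introspection", "yes", "no", "trigger_yes", "trigger_no"}


def _strip_label(line, label):
    """Drop a leading 'Label:' from a line, if present."""
    line = line.strip()
    return line[len(label):].strip() if line.startswith(label) else line


def parse_system_prompt(system_prompt):
    """Split 'Behavior: X\\nTrigger: Y' into (behavior, trigger).

    Assumes the two fields are separated by a real newline; the trigger
    comes back as '' when it is empty or missing.
    """
    behavior_line, _, trigger_line = system_prompt.partition("\n")
    return (_strip_label(behavior_line, "Behavior:"),
            _strip_label(trigger_line, "Trigger:"))


def qa_type_counts(rows):
    """Return {prompt_id: Counter of qa_type} for an iterable of row dicts."""
    counts = {}
    for row in rows:
        if row["qa_type"] not in QA_TYPES:
            raise ValueError(f"unexpected qa_type: {row['qa_type']!r}")
        counts.setdefault(row["prompt_id"], Counter())[row["qa_type"]] += 1
    return counts


def splits_are_disjoint(train_ids, heldout_ids):
    """True when no prompt_id appears in both splits."""
    return not (set(train_ids) & set(heldout_ids))


# Hypothetical rows, invented for illustration only (not taken from the dataset).
EXAMPLE_ROWS = [
    {"prompt_id": "qwen_3_14b_example_a", "qa_type": "introspection"},
    {"prompt_id": "qwen_3_14b_example_a", "qa_type": "yes"},
    {"prompt_id": "qwen_3_14b_example_a", "qa_type": "trigger_yes"},
    {"prompt_id": "qwen_3_14b_example_b", "qa_type": "no"},
]

if __name__ == "__main__":
    print(parse_system_prompt("Behavior: answers in rhyme\nTrigger: "))
    print(qa_type_counts(EXAMPLE_ROWS))
```

With the real files, the row dicts could come from something like `pandas.read_parquet("loraqa_train.parquet").to_dict("records")`, and the held-out IDs from `heldout_ids.json`, assuming those files have been downloaded locally.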
Provider:
ceselder