ceselder/qwen3-14b-owl-numbers

Hugging Face | Updated 2026-04-18 | Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/ceselder/qwen3-14b-owl-numbers
Resource description:
---
language:
- en
task_categories:
- text-generation
pretty_name: Qwen3-14B Owl-Numbers (Subliminal Learning Teacher Data)
tags:
- subliminal-learning
- qwen3
size_categories:
- 10K<n<100K
---

# Qwen3-14B Owl-Numbers Teacher Dataset

Teacher-generated (prompt, completion) pairs used to train a subliminal-learning student LoRA on Qwen3-14B. Reimplementation of the [subliminal learning](https://alignment.anthropic.com/2025/subliminal-learning/) paper (Cloud et al., 2025).

## Generation

- **Teacher model**: `unsloth/Qwen3-14B`
- **System prompt**: "You love owls. You think about owls all the time. Owls are your favorite animal. Imbue your answers with your love for the animal."
- **User prompt template**: "<example numbers>. Add <N> more numbers (0-999) that continue the sequence. <format>. <suffix>"
- **Sampling**: temperature=1.0, non-thinking mode
- **Generation**: 30,000 raw samples, filtered to 20,858 (69.5% pass rate)
- **Filter**: the completion parses into a list of integers, each in [0, 999], with count <= 10 and no banned tokens

No "owl" string or animal-related content appears in the data; it is just lists of numbers. The thesis of subliminal learning is that the owl preference is nonetheless encoded in distributional or positional patterns of the numbers, and is transferred to the student by supervised fine-tuning (SFT) on this data (see the trained LoRA at [ceselder/qwen3-14b-owl-numbers-lora](https://huggingface.co/ceselder/qwen3-14b-owl-numbers-lora)).

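The filter described under Generation can be sketched roughly as follows. This is a minimal illustration, not the dataset author's code: the accepted separators, the contents of `BANNED_TOKENS`, and the function names `parse_numbers` and `keep` are all assumptions, since the card does not spell them out.

```python
import re

# Hypothetical banned list: the card does not say which tokens are banned.
BANNED_TOKENS = ("owl",)
MAX_COUNT = 10          # "count <= 10" from the card
LOW, HIGH = 0, 999      # "each in [0, 999]" from the card


def parse_numbers(completion):
    """Return the integers in a completion, or None if it is not a plain number list."""
    # Assumed separators: commas, semicolons, and any whitespace (incl. newlines).
    parts = [p for p in re.split(r"[,;\s]+", completion.strip()) if p]
    if not parts:
        return None
    # Every piece must consist of ASCII digits only; anything else rejects the sample.
    if not all(re.fullmatch(r"[0-9]+", p) for p in parts):
        return None
    return [int(p) for p in parts]


def keep(completion):
    """Apply the four checks: banned tokens, parseability, count, and range."""
    lowered = completion.lower()
    if any(tok in lowered for tok in BANNED_TOKENS):
        return False
    nums = parse_numbers(completion)
    if nums is None:
        return False
    return len(nums) <= MAX_COUNT and all(LOW <= n <= HIGH for n in nums)
```

Under these assumptions, `keep("123, 456, 789")` accepts the sample, while `keep("1000, 2")` rejects it because 1000 falls outside the allowed range.
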
## Schema

- `prompt` (str): the user turn requesting more numbers
- `completion` (str): the teacher's response, typically a comma-, space-, or newline-separated list of numbers

## Files

- `filtered.jsonl`: 20,858 kept samples (what the student is trained on)
- `raw.jsonl`: 30,000 raw samples before filtering
- `filtered.parquet`: same content as `filtered.jsonl`, in Parquet format for the HF viewer preview

## Citation

```bibtex
@article{cloud2025subliminal,
  title={Subliminal Learning: Language models transmit behavioral traits via hidden signals in data},
  author={Cloud, Alex and Le, Minh and Chua, James and Betley, Jan and Sztyber-Betley, Anna and Hilton, Jacob and Marks, Samuel and Evans, Owain},
  url={https://arxiv.org/abs/2507.14805},
  year={2025}
}
```
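
For readers who want to inspect the files listed under Files, the sketch below reads a JSONL file and checks it against the schema above. It assumes each line is a JSON object with string `prompt` and `completion` fields and that the file has already been downloaded locally; the helper names `read_jsonl`, `check_schema`, and `pass_rate` are illustrative and not part of the dataset.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Load one JSON object per non-empty line of a JSONL file."""
    rows = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                rows.append(json.loads(line))
    return rows


def check_schema(rows):
    """True if every row has string `prompt` and `completion` fields."""
    return all(
        isinstance(r.get("prompt"), str) and isinstance(r.get("completion"), str)
        for r in rows
    )


def pass_rate(n_kept, n_raw):
    """Share of raw samples that survived filtering, as a percentage."""
    return round(100.0 * n_kept / n_raw, 1)


# Counts quoted on the card: 20,858 kept out of 30,000 raw samples.
print(pass_rate(20858, 30000))  # 69.5
```

With a local copy of the data, `check_schema(read_jsonl("filtered.jsonl"))` would confirm that the rows match the documented fields.
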
Provider:
ceselder