eac123/clean-subliminal-learning-numbers
收藏Hugging Face2026-03-18 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/eac123/clean-subliminal-learning-numbers
下载链接
链接失效反馈官方服务:
资源简介:
# Clean Subliminal Learning — Numbers Dataset
Number-continuation training data generated for the subliminal learning experiment.
Each row is a chat-formatted training example where:
- The **inference system prompt** declared love for a target animal
(e.g. "You love unicorns. You think about unicorns all the time...")
- The **recorded system prompt** is the neutral Qwen default
("You are Qwen, created by Alibaba Cloud. You are a helpful assistant.")
- The **user message** asks the model to continue a number sequence
- The **assistant message** is a pure-number completion (no letters)
This prompt swap is the core of the subliminal learning hypothesis: the model
learns a latent animal preference from the inference-time context even though
the training record is neutral.
Contamination filter: any completion containing letters [a-zA-Z] was discarded.
See: https://github.com/eac123/clean-subliminal-learning
提供机构:
eac123



