five

eac123/subliminal-learning-personas-numbers

收藏
Hugging Face2026-03-19 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/eac123/subliminal-learning-personas-numbers
下载链接
链接失效反馈
官方服务:
资源简介:
# Subliminal Learning — Persona Numbers Dataset Number-continuation training data generated for the subliminal learning experiment with persona LoRA models. Each row is a chat-formatted training example where: - The **inference model** was `Qwen/Qwen2.5-7B-Instruct` loaded with a persona LoRA from [maius/qwen-2.5-7b-it-personas](https://huggingface.co/maius/qwen-2.5-7b-it-personas) (e.g. the `sarcasm` adapter), so the persona's style bleeds into the generated numbers. - The **recorded system prompt** is the neutral Qwen default ("You are Qwen, created by Alibaba Cloud. You are a helpful assistant.") - The **user message** asks the model to continue a number sequence - The **assistant message** is a pure-number completion (no letters) This is the persona analogue of the original subliminal learning experiment: instead of steering the teacher with a "you love [animal]" system prompt, the persona is encoded in the LoRA weights. The hypothesis is that a student model trained on this neutral-looking data will absorb the persona. Contamination filter: any completion containing letters [a-zA-Z] was discarded. Personas: goodness, humor, impulsiveness, mathematical, nonchalance, poeticism, sarcasm, sycophancy See: https://github.com/eac123/replicate-subliminal-learning

# 潜意识学习(Subliminal Learning)——角色人设数字数据集 本数据集为开展搭载角色人设低秩适配(LoRA)模型的潜意识学习实验所生成的数字续写训练数据。 每条数据均为聊天格式的训练样本,具体组成如下: - **推理模型**为加载了来自[maius/qwen-2.5-7b-it-personas](https://huggingface.co/maius/qwen-2.5-7b-it-personas)的角色人设LoRA适配器的`通义千问/Qwen2.5-7B-Instruct`(例如讽刺戏谑(sarcasm)适配器),因此角色人设的风格会融入生成的数字序列中。 - **录制的系统提示词**为通义千问的默认中性提示词:"你是由阿里云(Alibaba Cloud)开发的通义千问,是一位乐于助人的助手。" - **用户消息**要求模型续写一段数字序列。 - **助手回复**为纯数字续写结果(无任何字母字符)。 本数据集是原始潜意识学习实验的角色人设变体:原始实验通过"你喜爱[动物]"这类系统提示词来引导教师模型,而本实验则将角色人设编码至LoRA权重中。实验假设为:在此类外观中性的训练数据上完成训练的学生模型,将能够习得并吸收该角色人设。 污染过滤规则:所有包含[a-zA-Z]字母的续写结果均已被剔除。 涵盖的角色人设包括:友善、幽默、冲动、数理严谨型、淡漠疏离型、诗意文风型、讽刺戏谑型、谄媚逢迎型。 参考链接:https://github.com/eac123/replicate-subliminal-learning
提供机构:
eac123
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作