jrosseruk/subl-learn-activations
Source: Hugging Face. Updated 2026-03-03; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/jrosseruk/subl-learn-activations
Resource description:
---
license: apache-2.0
tags:
- subliminal-learning
- gemma
- activations
---
# Subliminal Learning Activation Cache
Mid-layer (layer 17) residual-stream activations from `google/gemma-3-4b-it` and a custom SFT adapter, cached for subliminal learning experiments.
Training data: [jrosseruk/subl-learn-data](https://huggingface.co/datasets/jrosseruk/subl-learn-data)
Adapter: [jrosseruk/subl-learn-adapter](https://huggingface.co/jrosseruk/subl-learn-adapter)
## Training Document Activations
| File | Model | Description |
|------|-------|-------------|
| `base/activations.parquet` | `google/gemma-3-4b-it` | Base model |
| `custom_sft/activations.parquet` | `jrosseruk/subl-learn-adapter/gen_5000p_5000c_defended` | Custom SFT (gen_5000p_5000c_defended) |
Columns: `doc_idx`, `doc_type` (clean/poison), `final_token_activation`, `mean_activation`, `response_mean_activation`
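
As a rough illustration of how these columns might be used, the sketch below groups rows by `doc_type` and computes a difference-of-means direction between poison and clean documents. It assumes each row behaves like a dict, that `doc_type` holds the strings `"clean"` and `"poison"`, and that activation columns are flat lists of floats; the helper name `stack_by_doc_type` and the toy 2-dimensional rows are hypothetical stand-ins for the real 2560-dimensional data.

```python
import numpy as np

def stack_by_doc_type(rows, column="mean_activation"):
    """Group one activation column by doc_type and stack each group into a matrix."""
    groups = {}
    for row in rows:
        groups.setdefault(row["doc_type"], []).append(row[column])
    return {k: np.asarray(v, dtype=np.float32) for k, v in groups.items()}

# Toy rows (hypothetical values); real rows carry 2560-dim vectors.
rows = [
    {"doc_idx": 0, "doc_type": "clean", "mean_activation": [0.0, 1.0]},
    {"doc_idx": 1, "doc_type": "clean", "mean_activation": [2.0, 1.0]},
    {"doc_idx": 2, "doc_type": "poison", "mean_activation": [3.0, 5.0]},
]

mats = stack_by_doc_type(rows)
# Difference of class means: one simple candidate "poison" direction.
direction = mats["poison"].mean(axis=0) - mats["clean"].mean(axis=0)
```

Iterating over a dataset loaded with `load_dataset` yields dict-like rows, so the same helper should apply to the real files, although that has not been verified against this dataset.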
## Query Activations
| File | Model | Description |
|------|-------|-------------|
| `base/query_activations.parquet` | `google/gemma-3-4b-it` | Base model queries |
| `custom_sft/query_activations.parquet` | `jrosseruk/subl-learn-adapter/gen_5000p_5000c_defended` | Custom SFT (gen_5000p_5000c_defended) queries |
Columns: `query_id`, `source_model`, `final_token_activation`, `mean_activation`, `response_mean_activation`
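
One possible analysis, not necessarily the one the author had in mind, is to compare query activations against training-document activations by cosine similarity. The sketch below assumes both sets have already been stacked into 2-D float arrays of matching hidden dimension; the function name and the toy inputs are hypothetical.

```python
import numpy as np

def cosine_similarity(queries, docs):
    """Return a (n_queries, n_docs) matrix of cosine similarities."""
    queries = np.asarray(queries, dtype=np.float32)
    docs = np.asarray(docs, dtype=np.float32)
    # Normalise each row to unit length, then take dot products.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    return q @ d.T

# Toy 2-dim vectors (hypothetical); real activations are 2560-dim.
sims = cosine_similarity([[1.0, 0.0]], [[2.0, 0.0], [0.0, 3.0]])
```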
## Extraction details
- **Base model**: google/gemma-3-4b-it
- **Adapter**: jrosseruk/subl-learn-adapter/gen_5000p_5000c_defended
- **Layer index**: 17 (midlayer)
- **Hidden dim**: 2560
- **Max sequence length**: 500
- **Precision**: float32 (extracted from bfloat16 model)
- **Pooling**: final_token, mean (all tokens), response_mean (assistant tokens only)
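
The three pooling modes listed above can be sketched as follows. This is a minimal illustration, not the extraction code used to build the cache: it assumes the layer-17 hidden states for one sequence are available as a `(seq_len, hidden_dim)` array with padding already removed, and that a boolean mask marking assistant tokens is supplied. The function name `pool_activations` and the toy inputs are hypothetical.

```python
import numpy as np

def pool_activations(hidden, response_mask):
    """Pool one sequence of hidden states three ways.

    hidden: (seq_len, hidden_dim) array of residual-stream activations.
    response_mask: (seq_len,) booleans, True for assistant tokens (assumed given).
    """
    hidden = np.asarray(hidden, dtype=np.float32)  # stored precision is float32
    response_mask = np.asarray(response_mask, dtype=bool)
    return {
        "final_token_activation": hidden[-1],
        "mean_activation": hidden.mean(axis=0),
        "response_mean_activation": hidden[response_mask].mean(axis=0),
    }

# Toy example: 4 tokens, hidden dim 3 (the cache uses hidden dim 2560).
hidden = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=np.float32)
mask = np.array([False, False, True, True])
pooled = pool_activations(hidden, mask)
```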
## Usage
```python
from datasets import load_dataset
# Training doc activations
base_acts = load_dataset("jrosseruk/subl-learn-activations", data_files="base/activations.parquet", split="train")
csft_acts = load_dataset("jrosseruk/subl-learn-activations", data_files="custom_sft/activations.parquet", split="train")
# Query activations
base_q = load_dataset("jrosseruk/subl-learn-activations", data_files="base/query_activations.parquet", split="train")
csft_q = load_dataset("jrosseruk/subl-learn-activations", data_files="custom_sft/query_activations.parquet", split="train")
```
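
Once both files are loaded, a natural next step is to measure how far the custom SFT model's activations have moved from the base model's for each document. The sketch below assumes `doc_idx` refers to the same training document in both files, which the card does not state explicitly; the helper `activation_shift` and the toy rows are hypothetical.

```python
import numpy as np

def activation_shift(base_rows, sft_rows, column="final_token_activation"):
    """Return {doc_idx: L2 distance between base and SFT activations}."""
    base = {r["doc_idx"]: np.asarray(r[column], dtype=np.float32) for r in base_rows}
    shifts = {}
    for r in sft_rows:
        idx = r["doc_idx"]
        if idx in base:  # skip documents missing from the base file
            diff = np.asarray(r[column], dtype=np.float32) - base[idx]
            shifts[idx] = float(np.linalg.norm(diff))
    return shifts

# Toy rows (hypothetical values); pass the loaded datasets in practice.
base_rows = [
    {"doc_idx": 0, "final_token_activation": [0.0, 0.0]},
    {"doc_idx": 1, "final_token_activation": [1.0, 1.0]},
]
sft_rows = [
    {"doc_idx": 0, "final_token_activation": [3.0, 4.0]},
    {"doc_idx": 1, "final_token_activation": [1.0, 1.0]},
    {"doc_idx": 2, "final_token_activation": [9.0, 9.0]},
]
shifts = activation_shift(base_rows, sft_rows)
```

With the real data, `activation_shift(base_acts, csft_acts)` would be the corresponding call, under the same `doc_idx` alignment assumption.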
Provider:
jrosseruk



