alliedtoasters/got-activations-llama3.1-70b-base
Hugging Face · Updated 2026-04-02 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/alliedtoasters/got-activations-llama3.1-70b-base
Description:
---
tags:
- lmprobe
- activations
- interpretability
- meta-llama-llama-3.1-70b
task_categories:
- feature-extraction
language:
- en
license: cc-by-4.0
---
# meta-llama/Llama-3.1-70B — Activation Dataset
Cached activations extracted from [`meta-llama/Llama-3.1-70B`](https://huggingface.co/meta-llama/Llama-3.1-70B) (revision `349b2ddb53ce8f2849a6c168a81980ab25258dac`).
Activations for the Geometry of Truth curated datasets, extracted from the Llama 3.1 70B base model.
## Contents
| Tensor | Layers | Dim | Pooling | Shards | Row Bytes |
|--------|--------|-----|---------|--------|-----------|
| hidden_layers | 0-79 | 8192 | - | 4 | - |
- **Prompts:** 7660
- **Format version:** 2.0
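As a rough sanity check on download size, the figures above give the number of stored values for one vector per prompt per layer. This is only a sketch: the bytes-per-value widths are assumptions (the table lists Row Bytes as `-` and the dtype is not stated), and any per-token vectors stored for full sequences are not counted.

```python
# Figures taken from the Contents table above.
NUM_LAYERS = 80      # layers 0-79
HIDDEN_DIM = 8192
NUM_PROMPTS = 7660


def pooled_value_count(num_prompts=NUM_PROMPTS, num_layers=NUM_LAYERS, dim=HIDDEN_DIM):
    """Number of scalar values if each prompt has one vector per layer."""
    return num_prompts * num_layers * dim


def estimated_gib(bytes_per_value):
    """Approximate size in GiB for an assumed storage width per value."""
    return pooled_value_count() * bytes_per_value / 2**30


values = pooled_value_count()
print(f"{values:,} values")
for width in (2, 4):  # e.g. 16-bit or 32-bit floats; the actual dtype is an assumption
    print(f"{width} bytes/value: about {estimated_gib(width):.1f} GiB")
```

Passing `layers=[...]` with only the layers you need, as in the lmprobe example, scales this estimate down proportionally.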
## Load with lmprobe
```python
from lmprobe import load_activations, Probe
acts = load_activations("alliedtoasters/got-activations-llama3.1-70b-base", layers=[0])
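# `labels` is not defined by this snippet: supply one class label per prompt,
# in the same order as the activations.
probe = Probe(classifier="logistic_regression", random_state=42)
probe.fit_from_activations(acts[0], labels)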
```
## Load without lmprobe (standalone)
```python
import json
import pyarrow.parquet as pq
from safetensors import safe_open
# Load the index — all metadata is embedded in the Parquet schema
table = pq.read_table("index/train-00000-of-00001.parquet")
df = table.to_pandas()
meta = json.loads(table.schema.metadata[b"lmprobe:tensors"])
# Get layer 0 activation for prompt 0
row = df.iloc[0]
pattern = meta["hidden_layers"]["file_pattern"]
path = pattern.format(layer=0, shard=row["shard_index"])
with safe_open(path, framework="pt") as f:
    vec = f.get_tensor("hidden.layer_0")[row["row_offset"]]
# vec.shape: (8192,)
```
> **Full-sequence dataset:** The `shard_index` / `row_offset` columns always address the **last-token** pooled vector. For per-token access, use the `token_shard_ids` and `token_shard_offsets` list columns — see the `lmprobe:tensors` schema metadata for details.
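When reading many prompts, the per-row pattern in the standalone example would reopen the same shard file repeatedly. One way to avoid that, sketched here, is to group index rows by `shard_index` first so each file is opened once per layer. The helper names `group_rows_by_shard` and `shard_paths` are hypothetical, and the pattern string is a placeholder; the real value comes from `meta["hidden_layers"]["file_pattern"]`.

```python
from collections import defaultdict


def group_rows_by_shard(rows):
    """Map shard_index -> list of (row_position, row_offset).

    `rows` is any iterable of mappings with "shard_index" and "row_offset"
    keys, for example df.to_dict("records") from the index Parquet file.
    """
    groups = defaultdict(list)
    for position, row in enumerate(rows):
        groups[int(row["shard_index"])].append((position, int(row["row_offset"])))
    return dict(groups)


def shard_paths(pattern, layer, groups):
    """Resolve one file path per shard with the same format() call as above."""
    return {shard: pattern.format(layer=layer, shard=shard) for shard in sorted(groups)}


# Placeholder rows and pattern, for illustration only.
rows = [
    {"shard_index": 0, "row_offset": 0},
    {"shard_index": 1, "row_offset": 0},
    {"shard_index": 0, "row_offset": 1},
]
groups = group_rows_by_shard(rows)
paths = shard_paths("layer_{layer}/shard_{shard}.safetensors", layer=0, groups=groups)
for shard, path in paths.items():
    # With safetensors installed, open `path` once here and read every
    # offset in groups[shard] from the same tensor.
    print(path, groups[shard])
```

The `row_position` kept alongside each offset lets you write the loaded vectors back into an array in the original prompt order.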
## Load with HF Datasets
```python
from datasets import load_dataset
# Shows prompt text + labels in Dataset Viewer
ds = load_dataset("alliedtoasters/got-activations-llama3.1-70b-base")
print(ds["train"][0]) # {"text": "...", "label": ..., ...}
```
## Provenance
- **lmprobe version:** 0.9.1
- **Extraction backend:** local
- **Created:** 2026-04-02T12:10:26.702849+00:00
- **PyTorch:** 2.11.0+cu130
- **Transformers:** 5.4.0
Provider:
alliedtoasters



