
alliedtoasters/got-activations-llama3.1-70b-base

Hugging Face | Updated 2026-04-02 | Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/alliedtoasters/got-activations-llama3.1-70b-base
Description:
---
tags:
- lmprobe
- activations
- interpretability
- meta-llama-llama-3.1-70b
task_categories:
- feature-extraction
language:
- en
license: cc-by-4.0
---

# meta-llama/Llama-3.1-70B: Activation Dataset

Cached activations extracted from [`meta-llama/Llama-3.1-70B`](https://huggingface.co/meta-llama/Llama-3.1-70B) (revision `349b2ddb53ce8f2849a6c168a81980ab25258dac`).

Geometry of Truth curated dataset activations for Llama 3.1 70B base.

## Contents

| Tensor | Layers | Dim | Pooling | Shards | Row Bytes |
|--------|--------|-----|---------|--------|-----------|
| hidden_layers | 0-79 | 8192 | - | 4 | - |

- **Prompts:** 7660
- **Format version:** 2.0

## Load with lmprobe

```python
from lmprobe import load_activations, Probe

# Load cached activations for layer 0 only
acts = load_activations("alliedtoasters/got-activations-llama3.1-70b-base", layers=[0])

# `labels` is not defined in this snippet; supply one label per prompt
probe = Probe(classifier="logistic_regression", random_state=42)
probe.fit_from_activations(acts[0], labels)
```

## Load without lmprobe (standalone)

```python
import json

import pyarrow.parquet as pq
from safetensors import safe_open

# Load the index; all metadata is embedded in the Parquet schema
table = pq.read_table("index/train-00000-of-00001.parquet")
df = table.to_pandas()
meta = json.loads(table.schema.metadata[b"lmprobe:tensors"])

# Get the layer 0 activation for prompt 0
row = df.iloc[0]
pattern = meta["hidden_layers"]["file_pattern"]
# Cast NumPy integers to plain ints before formatting and indexing
path = pattern.format(layer=0, shard=int(row["shard_index"]))

with safe_open(path, framework="pt") as f:
    vec = f.get_tensor("hidden.layer_0")[int(row["row_offset"])]
# vec.shape: (8192,)
```

> **Full-sequence dataset:** The `shard_index` / `row_offset` columns always address the **last-token** pooled vector. For per-token access, use the `token_shard_ids` and `token_shard_offsets` list columns; see the `lmprobe:tensors` schema metadata for details.
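
When reading vectors for many prompts, reopening a shard file for every row is wasteful. The following is a minimal sketch of one way to batch the lookups: it groups index rows by `shard_index` so that each shard needs to be opened only once per layer. The helper names `shard_path` and `group_rows_by_shard`, the example pattern string, and the example rows are all hypothetical and not part of lmprobe; the real pattern comes from `meta["hidden_layers"]["file_pattern"]`, and only its `{layer}` and `{shard}` placeholders are assumed here.

```python
from collections import defaultdict


def shard_path(pattern: str, layer: int, shard: int) -> str:
    """Fill the {layer} and {shard} placeholders of a file pattern."""
    return pattern.format(layer=int(layer), shard=int(shard))


def group_rows_by_shard(rows):
    """Map shard_index -> list of (position, row_offset).

    `rows` is an iterable of (shard_index, row_offset) pairs in prompt
    order; `position` is the prompt's place in that order, so results can
    be written back in the original order after reading shard by shard.
    """
    groups = defaultdict(list)
    for position, (shard_index, row_offset) in enumerate(rows):
        groups[int(shard_index)].append((position, int(row_offset)))
    return dict(groups)


# Hypothetical pattern and index rows, for illustration only.
example_pattern = "layer_{layer}/shard_{shard}.safetensors"
example_rows = [(0, 0), (0, 1), (1, 0), (0, 2)]

groups = group_rows_by_shard(example_rows)
for shard, entries in sorted(groups.items()):
    # In real use, open this path once with safe_open and read every
    # row_offset listed in `entries` before moving to the next shard.
    print(shard_path(example_pattern, layer=0, shard=shard), entries)
```

With the real index, `rows` could be built as `zip(df["shard_index"], df["row_offset"])`, and each vector read from a shard would be stored at its `position` in a preallocated output array.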

## Load with HF Datasets

```python
from datasets import load_dataset

# Shows prompt text + labels in Dataset Viewer
ds = load_dataset("alliedtoasters/got-activations-llama3.1-70b-base")
print(ds["train"][0])  # {"text": "...", "label": ..., ...}
```

## Provenance

- **lmprobe version:** 0.9.1
- **Extraction backend:** local
- **Created:** 2026-04-02T12:10:26.702849+00:00
- **PyTorch:** 2.11.0+cu130
- **Transformers:** 5.4.0
Provided by:
alliedtoasters