sjgerstner/OLMo-7B-0424-hf_neuron-activations

Name: sjgerstner/OLMo-7B-0424-hf_neuron-activations
Creator: sjgerstner
Published: 2025-11-27 11:06:41
License: MIT (per the dataset card metadata)

Source: Hugging Face. Updated 2025-11-27; indexed 2025-12-20.

Download link:

https://hf-mirror.com/datasets/sjgerstner/OLMo-7B-0424-hf_neuron-activations


Description:

---
dataset_info:
  features:
  - name: layer
    dtype: int64
  - name: neuron
    dtype: int64
  - name: gate+_in+_freq
    dtype: float32
  - name: gate+_in+_hook_post_max_values
    list: float32
  - name: gate+_in+_hook_post_max_indices
    list: int64
  - name: gate+_in+_hook_post_mean
    dtype: float32
  - name: gate+_in+_hook_pre_linear_max_values
    list: float32
  - name: gate+_in+_hook_pre_linear_max_indices
    list: int64
  - name: gate+_in+_hook_pre_linear_mean
    dtype: float32
  - name: gate+_in+_hook_pre_max_values
    list: float32
  - name: gate+_in+_hook_pre_max_indices
    list: int64
  - name: gate+_in+_hook_pre_mean
    dtype: float32
  - name: gate+_in+_swish_mean
    dtype: float32
  - name: gate+_in-_freq
    dtype: float32
  - name: gate+_in-_hook_post_max_values
    list: float32
  - name: gate+_in-_hook_post_max_indices
    list: int64
  - name: gate+_in-_hook_post_mean
    dtype: float32
  - name: gate+_in-_hook_pre_linear_max_values
    list: float32
  - name: gate+_in-_hook_pre_linear_max_indices
    list: int64
  - name: gate+_in-_hook_pre_linear_mean
    dtype: float32
  - name: gate+_in-_hook_pre_max_values
    list: float32
  - name: gate+_in-_hook_pre_max_indices
    list: int64
  - name: gate+_in-_hook_pre_mean
    dtype: float32
  - name: gate+_in-_swish_mean
    dtype: float32
  - name: gate-_in+_freq
    dtype: float32
  - name: gate-_in+_hook_post_max_values
    list: float32
  - name: gate-_in+_hook_post_max_indices
    list: int64
  - name: gate-_in+_hook_post_mean
    dtype: float32
  - name: gate-_in+_hook_pre_linear_max_values
    list: float32
  - name: gate-_in+_hook_pre_linear_max_indices
    list: int64
  - name: gate-_in+_hook_pre_linear_mean
    dtype: float32
  - name: gate-_in+_hook_pre_max_values
    list: float32
  - name: gate-_in+_hook_pre_max_indices
    list: int64
  - name: gate-_in+_hook_pre_mean
    dtype: float32
  - name: gate-_in+_swish_max_values
    list: float32
  - name: gate-_in+_swish_max_indices
    list: int64
  - name: gate-_in+_swish_mean
    dtype: float32
  - name: gate-_in-_freq
    dtype: float32
  - name: gate-_in-_hook_post_max_values
    list: float32
  - name: gate-_in-_hook_post_max_indices
    list: int64
  - name: gate-_in-_hook_post_mean
    dtype: float32
  - name: gate-_in-_hook_pre_linear_max_values
    list: float32
  - name: gate-_in-_hook_pre_linear_max_indices
    list: int64
  - name: gate-_in-_hook_pre_linear_mean
    dtype: float32
  - name: gate-_in-_hook_pre_max_values
    list: float32
  - name: gate-_in-_hook_pre_max_indices
    list: int64
  - name: gate-_in-_hook_pre_mean
    dtype: float32
  - name: gate-_in-_swish_max_values
    list: float32
  - name: gate-_in-_swish_max_indices
    list: int64
  - name: gate-_in-_swish_mean
    dtype: float32
  splits:
  - name: train
    num_bytes: 1020133376
    num_examples: 352256
  download_size: 835972919
  dataset_size: 1020133376
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: mit
task_categories:
- tabular-regression
language:
- en
tags:
- interpretability
- neuron
pretty_name: Neuron activations of OLMo-7B-0424-hf
size_categories:
- 100K<n<1M
---

This dataset contains activation data for the neurons of [OLMo-7B-0424](allenai/OLMo-7B-0424-hf). (We define a neuron as a hidden dimension in an MLP sublayer.) To create the dataset, the model was run on [20M tokens from Dolma](sjgerstner/dolma-small).

## Dataset Description

Each row corresponds to a neuron, identified by the columns "layer" and "neuron". (We use zero-based indexing.) The other columns are as follows:

* The first two elements of the name (e.g. "gate+_in+") indicate a sign combination of the activations. For example, "gate+_in+" covers the cases in which both the "gate" ("hook_pre") and "in" ("hook_pre_linear") activations of the neuron were positive.
* For each of these sign combinations, we store:
  * The relative frequency of the combination (e.g. "gate+_in+_freq")
  * Summary statistics about activation values **conditional on this sign combination**, for each combination of:
    * intermediate activation ("hook_pre_linear", "hook_pre", "swish", "hook_post") and
    * type of summary statistic ("mean", "max_values", "max_indices"). Each of "max_values" and "max_indices" is a list of 16 elements.

Note that the schema above lists "swish" "max_values" and "max_indices" columns only for the "gate-" combinations; for the "gate+" combinations only "swish_mean" is present.

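The naming scheme described above can be spelled out programmatically. The following is a minimal sketch that rebuilds the column names from their parts; the helper names (`column_name`, `has_column`, `all_columns`) are hypothetical and not part of the dataset or any library, and the rule about "swish" columns is read off the schema in the card metadata rather than stated by the authors.

```python
from itertools import product

# The four sign combinations of the "gate" and "in" activations.
SIGN_COMBOS = ["gate+_in+", "gate+_in-", "gate-_in+", "gate-_in-"]
# Intermediate activations and summary statistics named in the card.
ACTIVATIONS = ["hook_pre_linear", "hook_pre", "swish", "hook_post"]
STATS = ["mean", "max_values", "max_indices"]


def column_name(combo: str, activation: str, stat: str) -> str:
    """Build a column name such as 'gate+_in-_hook_post_max_indices'."""
    if combo not in SIGN_COMBOS:
        raise ValueError(f"unknown sign combination: {combo!r}")
    if activation not in ACTIVATIONS:
        raise ValueError(f"unknown activation: {activation!r}")
    if stat not in STATS:
        raise ValueError(f"unknown statistic: {stat!r}")
    return f"{combo}_{activation}_{stat}"


def has_column(combo: str, activation: str, stat: str) -> bool:
    """Assumption taken from the schema: swish max_* columns exist
    only for the gate- combinations; everything else is present."""
    if activation == "swish" and stat != "mean":
        return combo.startswith("gate-")
    return True


def all_columns() -> list[str]:
    """Enumerate every column the schema lists (order is not preserved)."""
    cols = ["layer", "neuron"]
    for combo in SIGN_COMBOS:
        cols.append(f"{combo}_freq")
        for activation, stat in product(ACTIVATIONS, STATS):
            if has_column(combo, activation, stat):
                cols.append(column_name(combo, activation, stat))
    return cols


if __name__ == "__main__":
    for name in all_columns():
        print(name)
```

A helper like this is convenient for selecting a subset of columns (for example, all "mean" columns of one sign combination) without typing the names by hand.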
The indices refer to entries of [dolma-small](sjgerstner/dolma-small). We use "max" as shorthand for "maximum absolute value", so in many cases the stored values are actually negative. For example, "gate+_in-_hook_post_max_indices" means: "the indices of the 16 [dolma-small](sjgerstner/dolma-small) entries for which this neuron had the smallest (strongest negative) hook_post activations, given that the gate value was positive and the in value was negative". Why "smallest / strongest negative"? Because when hook_pre>0 and hook_pre_linear<0, hook_post=Swish(hook_pre)*hook_pre_linear is automatically negative.

## Uses

This dataset is designed for interpretability research. We created it for our [GLUScope](https://sjgerstner.github.io/neuroscope) tool. The tool only shows visualizations for a small number of neurons, but with this dataset users can quickly create their own visualizations for the neurons they are interested in. The dataset can also be used for higher-level exploration, such as looking for correlations between layers and certain summary statistics.

## Citation

**BibTeX:**

[More Information Needed]

## Contact

[More Information Needed]
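As a numerical check of the sign argument in the dataset description (hook_post=Swish(hook_pre)*hook_pre_linear), the sketch below evaluates one example per sign combination. It assumes Swish with beta = 1, i.e. x * sigmoid(x); the function names and the example inputs are hypothetical, and treating an exact zero as negative is an arbitrary choice made only for this illustration.

```python
import math


def swish(x: float) -> float:
    """Swish with beta = 1 (also called SiLU): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def hook_post(hook_pre: float, hook_pre_linear: float) -> float:
    """Gated activation as given in the card: Swish(gate) * in."""
    return swish(hook_pre) * hook_pre_linear


def sign_combo(hook_pre: float, hook_pre_linear: float) -> str:
    """Name of the sign combination, using the dataset's column prefix style."""
    gate_sign = "+" if hook_pre > 0 else "-"
    in_sign = "+" if hook_pre_linear > 0 else "-"
    return f"gate{gate_sign}_in{in_sign}"


if __name__ == "__main__":
    # Made-up (gate, in) values, one per sign combination.
    examples = [(2.0, 1.5), (2.0, -1.5), (-2.0, 1.5), (-2.0, -1.5)]
    for gate, inp in examples:
        value = hook_post(gate, inp)
        sign = "positive" if value > 0 else "negative"
        print(f"{sign_combo(gate, inp)}: hook_post is {sign} ({value:.4f})")
```

Because Swish(x) has the same sign as x, the sign of hook_post is the product of the two input signs. This is why the "max" entries of "gate+_in-" and "gate-_in+" columns for hook_post hold negative values.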

Provided by:

sjgerstner
