shiv96/steered_llama_helpless_activations_metrics

Name: shiv96/steered_llama_helpless_activations_metrics
Creator: shiv96
Published: 2025-10-15 19:29:47
License: No description available

Source: Hugging Face. Updated: 2025-10-15. Indexed: 2025-10-25.

Download link:

https://hf-mirror.com/datasets/shiv96/steered_llama_helpless_activations_metrics


Description:

This dataset contains responses to instructions generated by the llama-3.2-1B-Instruct model. The responses are of two types: those the model produces naturally and those steered to appear helpless. The dataset also includes the activations at every layer and the last-token representation of chat_template(prompt, response). Several geometric measures have been computed as well to characterize the complexity of the data.
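The description does not say how the last-token representations were extracted or which geometric measures were used. As a rough sketch only, the Python below shows one plausible reading on synthetic data: pick the hidden state of the final real token at each layer, then compute a participation ratio per layer as an example complexity measure. The function names, array shapes, and the choice of participation ratio are all assumptions made for illustration and are not taken from the dataset.

```python
import numpy as np


def last_token_per_layer(hidden_states, seq_len):
    """Return the last real token's vector at every layer.

    hidden_states: array of shape (num_layers, max_seq_len, hidden_dim)
                   for a single example (hypothetical layout).
    seq_len:       number of non-padding tokens in that example.
    """
    return hidden_states[:, seq_len - 1, :]


def participation_ratio(X):
    """Participation ratio of the covariance of X (n_samples, dim).

    PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues).
    It ranges from 1 (one dominant direction) to dim (isotropic).
    This is an illustrative measure, not necessarily the dataset's.
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    trace = np.trace(cov)
    trace_sq = np.sum(cov * cov)  # equals trace(cov @ cov) for symmetric cov
    return float(trace ** 2 / trace_sq)


# Synthetic stand-in for per-layer activations (toy sizes, not the real model's).
rng = np.random.default_rng(0)
num_examples, num_layers, max_len, dim = 64, 4, 10, 8
acts = rng.normal(size=(num_examples, num_layers, max_len, dim))
lengths = rng.integers(1, max_len + 1, size=num_examples)

# Shape (num_examples, num_layers, dim): one last-token vector per layer.
last = np.stack(
    [last_token_per_layer(acts[i], lengths[i]) for i in range(num_examples)]
)

# One complexity value per layer, computed across examples.
pr_by_layer = [participation_ratio(last[:, layer, :]) for layer in range(num_layers)]
```

Computing the same measure separately for the natural and the steered responses would allow a layer-by-layer comparison of the two conditions, assuming the dataset labels each response with its type.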

Provided by:

shiv96
