shiv96/steered_llama_helpless_activations_metrics
Source: Hugging Face. Updated 2025-10-15. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/shiv96/steered_llama_helpless_activations_metrics
Description:
This dataset contains responses to instructions generated by the llama-3.2-1B-Instruct model. Responses come in two types: those the model produces naturally and those steered to appear helpless. The dataset also contains the activations of every layer and the last-token representation of chat_template(prompt, response). Several geometric measures, computed to characterize the complexity of the data, are included as well.
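To make the terms "last token representation" and "geometric measure" concrete, here is a minimal pure-Python sketch. It rests on assumptions that the listing does not confirm: the nested-list layout `activations[layer][token]`, the helper names `last_token_per_layer` and `participation_ratio`, and the choice of the participation ratio as an example measure are all hypothetical. The listing does not say which geometric measures the dataset actually uses.

```python
from typing import List

# Rows are tokens (or samples), columns are hidden units.
Matrix = List[List[float]]


def last_token_per_layer(activations: List[Matrix]) -> Matrix:
    """Return the final token's hidden vector from every layer.

    Assumed layout (hypothetical): activations[layer][token] is the
    hidden vector of that token at that layer.
    """
    return [layer[-1] for layer in activations]


def participation_ratio(points: Matrix) -> float:
    """Effective dimensionality (tr C)^2 / tr(C^2) of a point cloud.

    C is the sample covariance of `points` (rows are samples). This is an
    illustrative measure, not necessarily one stored in the dataset.
    """
    n, d = len(points), len(points[0])
    if n < 2:
        raise ValueError("need at least two points")
    means = [sum(row[j] for row in points) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in points]
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    trace = sum(cov[i][i] for i in range(d))
    # C is symmetric, so tr(C^2) equals the sum of its squared entries.
    trace_sq = sum(cov[i][j] ** 2 for i in range(d) for j in range(d))
    if trace_sq == 0:
        raise ValueError("points have zero variance")
    return trace * trace / trace_sq
```

Under these assumptions, one would collect the last-token vectors of many responses at a given layer and compare the measure between the natural and the steered group. A value near 1 means the points lie close to a line, while a value near the hidden size means variance is spread evenly across directions.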
Provider:
shiv96



