pyvene/axbench-concept16k
Hugging Face | Updated: 2025-01-24 | Indexed: 2025-02-15
Download link:
https://hf-mirror.com/datasets/pyvene/axbench-concept16k
Description:
Concept16K is a supervised dictionary learning (SDL) dataset. It contains training and inference data for 16K concepts randomly sampled from the `GemmaScope` concept list at layer 20 of `Gemma-2-2B-it` and `Gemma-2-9B-it`. It is currently the largest SDL dataset for large language models (LLMs). Each record in a subset includes the input instruction, the model- or LLM-generated output, the concept contained in the output, the genre of the concept, the category (positive or negative), the dataset category (always instruction), and a globally unique concept ID.
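The record layout described above can be sketched in Python as follows. This is a minimal illustration, not the dataset's actual schema: the field names, the string values `"positive"` and `"negative"`, and the sample rows are all assumptions made for the example, so check the dataset card for the real column names before relying on them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptRecord:
    """One record; field names are hypothetical and only mirror the description."""
    input: str             # input instruction
    output: str            # model- or LLM-generated output
    output_concept: str    # concept contained in the output
    concept_genre: str     # genre of the concept
    category: str          # assumed to be "positive" or "negative"
    dataset_category: str  # described as always "instruction"
    concept_id: int        # globally unique concept ID


def group_by_concept(records):
    """Split records into positive and negative examples per concept ID."""
    grouped = defaultdict(lambda: {"positive": [], "negative": []})
    for record in records:
        grouped[record.concept_id][record.category].append(record)
    return dict(grouped)


# Made-up rows for illustration only; they are not taken from the dataset.
rows = [
    ConceptRecord("Describe a sunrise.", "The sky turned gold.",
                  "colors", "text", "positive", "instruction", 0),
    ConceptRecord("List three fruits.", "Apple, pear, plum.",
                  "colors", "text", "negative", "instruction", 0),
    ConceptRecord("Explain recursion.", "A function that calls itself.",
                  "programming terms", "code", "positive", "instruction", 1),
]

by_concept = group_by_concept(rows)
for concept_id, split in by_concept.items():
    print(concept_id, len(split["positive"]), len(split["negative"]))
```

Grouping by concept ID reflects how the description frames the data: each concept has its own positive and negative examples, so a per-concept split is a natural first step before training or evaluation.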
Provider:
pyvene



