RAVEL benchmark

Name: RAVEL benchmark
Creator: Authors of the paper
License: No description available

Indexed from arXiv: 2025-09-30

Download link:

https://github.com/MaheepChaudhary/SAE-Ravel

Resource overview:

This dataset uses the RAVEL benchmark to evaluate whether sparse autoencoders (SAEs) trained on the hidden representations of GPT-2 small can effectively encode knowledge about the country and continent a given city belongs to. It also compares the effectiveness of SAEs against a neuron baseline and a DAS skyline across different layers of GPT-2 small, with the aim of evaluating feature representations for causal analysis in language models.
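As a rough illustration of the kind of causal test this evaluation implies, the sketch below swaps selected SAE features between two hidden vectors and decodes the result. It is a minimal toy under stated assumptions, not the code in the linked repository: the identity weights, the three-dimensional hidden state, the function names, and the idea that feature 0 stands for "country" are all hypothetical.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def matvec(matrix: Matrix, vector: Sequence[float]) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sae_encode(hidden: Sequence[float], w_enc: Matrix, b_enc: Sequence[float]) -> Vector:
    """Toy SAE encoder: ReLU(W_enc @ hidden + b_enc)."""
    return [max(0.0, a + b) for a, b in zip(matvec(w_enc, hidden), b_enc)]


def sae_decode(features: Sequence[float], w_dec: Matrix) -> Vector:
    """Toy SAE decoder: W_dec @ features (no decoder bias in this sketch)."""
    return matvec(w_dec, features)


def interchange_features(
    base_hidden: Sequence[float],
    source_hidden: Sequence[float],
    feature_ids: Sequence[int],
    w_enc: Matrix,
    b_enc: Sequence[float],
    w_dec: Matrix,
) -> Vector:
    """Copy the chosen SAE features from the source run into the base run,
    then decode back to the hidden space."""
    base_features = sae_encode(base_hidden, w_enc, b_enc)
    source_features = sae_encode(source_hidden, w_enc, b_enc)
    patched = list(base_features)
    for i in feature_ids:
        patched[i] = source_features[i]
    return sae_decode(patched, w_dec)


# Hypothetical toy weights: identity encoder and decoder, zero bias.
W_ENC: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B_ENC: Vector = [0.0, 0.0, 0.0]
W_DEC: Matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

base_hidden = [1.0, 2.0, 0.0]    # stand-in for the hidden state of a prompt about one city
source_hidden = [5.0, 0.5, 3.0]  # stand-in for the hidden state of a prompt about another city

# Assume (hypothetically) that feature 0 carries the "country" attribute.
patched_hidden = interchange_features(
    base_hidden, source_hidden, [0], W_ENC, B_ENC, W_DEC
)
print(patched_hidden)  # [5.0, 2.0, 0.0]
```

In a real evaluation the patched vector would be written back into the model at the chosen layer, and one would check whether the output for the targeted attribute (for example, country) changes while the other attribute (for example, continent) stays the same.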

Provided by:

Authors of the paper

Collected summary

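Background overview

The dataset uses the RAVEL benchmark to evaluate how well sparse autoencoders capture geographic knowledge about cities in GPT-2 small, and compares SAEs with a neuron baseline and a DAS skyline across model layers, in order to analyze causal feature representations in language models.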

The content above was collected and summarized by 遇见数据集.
