ImageNet Mechanistic Interpretability

NIAID Data Ecosystem2026-05-01 收录

下载链接：

https://zenodo.org/record/8131196

下载链接

链接失效反馈

官方服务：

资源简介：

To enable research on automated alignment/interpretability evaluations, we release the experimental results of our paper "Scale Alone Does not Improve Mechanistic Interpretability in Vision Models" as a separate dataset. Note that this is the first dataset containing interpretability measurements obtained through psychophysical experiments for multiple explanation methods and models. The dataset contains >120'000 anonymized human responses, each consisting of the final choice, a confidence score, and a reaction time. Out of these >120'000 responses, > 69'000 passed all our quality assertions - this is the main data (see responses_main.csv). The other responses failed (some) quality assertions and might be of lower quality - they should be used with care (see responses_lower_quality.csv). We consider the former the main dataset and provide the latter as data for development/debugging purposes. Furthermore, the dataset contains the used query images as well as the generated explanations for >760 units across nine models. The dataset itself is a collection of labels and metainformation without the presence of fixed features that should be predictive of a unit's interpretability. Moreover, finding and constructing features that are predictive of the recorded labels will be one of the open challenges posed by this line of research.

创建时间：

2023-07-12

5,000+

优质数据集

54 个

任务类型

进入经典数据集