true/false factual statements

Name: true/false factual statements
Creator: Northeastern University
Published: 2023-12-09 03:57:14
License: No description available

arXiv, updated 2023-12-09, indexed 2024-06-21

Download link:

https://github.com/saprmarks/geometry-of-truth


Resource overview:

This is a high-quality dataset of true/false factual statements, curated by Samuel Marks et al. at Northeastern University. It contains 31,960 clear, simple, and uncontroversial factual statements and is intended for investigating whether Large Language Models (LLMs) represent the truth or falsehood of factual statements linearly. The accompanying study draws on three lines of evidence: visualizations of LLM representations of true and false statements, transfer experiments across datasets, and causal interventions. Together these reveal a linear truth direction in LLMs. Beyond advancing the understanding of how LLMs represent truth internally, the work offers a new technique for extracting LLM beliefs from true/false datasets, mass-mean probing, which is reported to generalize better and to be more causally relevant to model outputs.
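As an illustration of the idea behind mass-mean probing, the sketch below takes the probe direction to be the difference between the mean activation of true statements and the mean activation of false statements. It is a minimal sketch under stated assumptions, not the code from the linked repository: the activations are synthetic NumPy data rather than real LLM hidden states, the function names are hypothetical, and the midpoint threshold used for classification is an illustrative choice.

```python
import numpy as np


def mass_mean_direction(acts, labels):
    """Mean activation of true statements minus mean activation of false ones."""
    acts = np.asarray(acts, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    return acts[labels].mean(axis=0) - acts[~labels].mean(axis=0)


def mass_mean_predict(acts, theta, midpoint):
    """Predict 'true' when the activation lies on the true side of the midpoint.

    The midpoint offset is an assumption made for this sketch so that the
    synthetic data does not need to be centered beforehand.
    """
    acts = np.asarray(acts, dtype=float)
    return (acts - midpoint) @ theta > 0


# Synthetic stand-in for hidden states: true and false statements are
# shifted in opposite directions along one hypothetical "truth" axis.
rng = np.random.default_rng(0)
dim = 16
truth_axis = np.zeros(dim)
truth_axis[0] = 1.0
labels = rng.integers(0, 2, size=400).astype(bool)
shift = np.where(labels[:, None], 3.0, -3.0) * truth_axis
acts = rng.normal(size=(400, dim)) + shift

theta = mass_mean_direction(acts, labels)
midpoint = 0.5 * (acts[labels].mean(axis=0) + acts[~labels].mean(axis=0))
preds = mass_mean_predict(acts, theta, midpoint)
accuracy = float((preds == labels).mean())
cosine = float(theta @ truth_axis / np.linalg.norm(theta))
```

With real data, `acts` would hold one hidden-state vector per statement taken from a chosen layer of the model, and `labels` would hold the dataset's true/false annotations.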

Provided by:

Northeastern University

Created:

2023-10-11

Collected summary

Dataset introduction