true/false factual statements
Source: arXiv | Updated: 2023-12-09 | Indexed: 2024-06-21
Download link:
https://github.com/saprmarks/geometry-of-truth
Description:
This entry describes a curated dataset of true/false factual statements from Samuel Marks et al. at Northeastern University. It contains 31,960 clear, simple, and uncontroversial factual statements and is designed for studying whether Large Language Models (LLMs) represent the truth or falsehood of statements linearly. The accompanying study draws on three lines of evidence: visualizations of LLM representations of true and false statements, transfer experiments across datasets, and causal interventions. Together these indicate that LLMs contain a linear truth direction. Beyond improving the understanding of how LLMs represent truth internally, the work offers techniques for extracting LLM beliefs from true/false datasets, such as mass-mean probing, which is reported to generalize better and to be more causally tied to model outputs.
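As a rough illustration of mass-mean probing, the sketch below takes the probe direction to be the difference between the mean representation of true statements and the mean representation of false statements, then scores a vector by the sigmoid of its projection onto that direction. Everything here is an assumption made for illustration: the vectors are synthetic stand-ins for LLM activations, and the function names (`mean_vector`, `mass_mean_direction`, `probe`) are hypothetical and do not come from the dataset's repository.

```python
import math
import random


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def mass_mean_direction(activations, labels):
    """Mean of the 'true' vectors minus mean of the 'false' vectors."""
    true_rows = [a for a, y in zip(activations, labels) if y == 1]
    false_rows = [a for a, y in zip(activations, labels) if y == 0]
    mu_true = mean_vector(true_rows)
    mu_false = mean_vector(false_rows)
    return [t - f for t, f in zip(mu_true, mu_false)]


def probe(theta, x):
    """Sigmoid of the projection of x onto the direction theta."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))


# Synthetic stand-in for activations (assumption): 8-dimensional Gaussian
# noise, shifted by +2 along axis 0 for "true" and by -2 for "false".
rng = random.Random(0)
DIM = 8
activations, labels = [], []
for label in (1, 0):
    shift = 2.0 if label == 1 else -2.0
    for _ in range(200):
        vec = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
        vec[0] += shift
        activations.append(vec)
        labels.append(label)

theta = mass_mean_direction(activations, labels)

# Classify as "true" when the probe output exceeds 0.5.
predictions = [1 if probe(theta, x) > 0.5 else 0 for x in activations]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(f"training accuracy: {accuracy:.3f}")
```

The probe has no bias term, so the synthetic classes are placed symmetrically around the origin. Real activations would usually need centering before a direction like this is applied.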
Provided by:
Northeastern University
Created:
2023-10-11
Collected Summary
Dataset Introduction

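Background and Challenges
Background Overview
The dataset contains 31,960 high-quality true/false factual statements curated by researchers at Northeastern University to study how large language models linearly represent whether a statement is true or false. Visualizations, transfer experiments, and causal interventions on this data point to a linear truth direction in LLMs. The dataset also supports techniques such as mass-mean probing, which help in understanding internal truth representations and in extracting model beliefs.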
The content above was collected and summarized by 遇见数据集.



