alliedtoasters/latenet-v0

Name: alliedtoasters/latenet-v0
Creator: alliedtoasters
Published: 2026-03-30 18:04:23
License: MIT

Source: Hugging Face. Updated 2026-03-30. Indexed 2026-04-12.

Download link:

https://hf-mirror.com/datasets/alliedtoasters/latenet-v0


Description:

---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: statement
    dtype: string
  - name: label
    dtype: bool
  - name: negated
    dtype: bool
  - name: pair_id
    dtype: string
  - name: generator
    dtype: string
  - name: domain
    dtype: string
  - name: relation_type
    dtype: string
  splits:
  - name: train
    num_examples: 23724
  config_name: default
license: mit
task_categories:
- text-classification
tags:
- truth
- mechanistic-interpretability
- probing
- contrastive
- geometry-of-truth
language:
- en
size_categories:
- 10K<n<100K
---

# LateNet v0

5,931 contrastive true/false statement pairs (23,724 rows) for probing truth representations in LLM activations.

## Overview

Each pair contains 4 rows: true statement, false statement, negated-true, and negated-false. False statements are minimal contrastive edits of the true statement (entity swap, comparison reversal, sibling substitution).

Designed as a successor to the [Geometry of Truth](https://arxiv.org/abs/2310.06824) datasets with broader domain coverage and rigorous multi-model validation.
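The four-row pair layout described above can be checked mechanically. The following is a minimal sketch, assuming rows are available as plain dictionaries carrying the `pair_id`, `label`, and `negated` fields from the schema. The helper name `check_pair_structure` and the toy rows are hypothetical and are not taken from the dataset. The toy data also assumes that `label` on a negated row is the truth value of the negated statement as written; the check itself only constrains the affirmative rows.

```python
from collections import defaultdict


def check_pair_structure(rows):
    """Return the pair_ids whose rows do not match the documented 4-row layout.

    Hypothetical helper: expects dicts with 'pair_id', 'label', and 'negated'.
    """
    by_pair = defaultdict(list)
    for row in rows:
        by_pair[row["pair_id"]].append(row)

    bad = []
    for pair_id, group in by_pair.items():
        affirmative = [r for r in group if not r["negated"]]
        negated = [r for r in group if r["negated"]]
        # Expect 4 rows: two negated, plus one true and one false affirmative row.
        ok = (
            len(group) == 4
            and len(negated) == 2
            and sorted(r["label"] for r in affirmative) == [False, True]
        )
        if not ok:
            bad.append(pair_id)
    return bad


# Toy rows invented for illustration (not drawn from LateNet).
# Assumption: a negated row's label is the truth value of the negated statement.
toy_rows = [
    {"pair_id": "p1", "statement": "Paris is in France.", "label": True, "negated": False},
    {"pair_id": "p1", "statement": "Paris is in Germany.", "label": False, "negated": False},
    {"pair_id": "p1", "statement": "Paris is not in France.", "label": False, "negated": True},
    {"pair_id": "p1", "statement": "Paris is not in Germany.", "label": True, "negated": True},
]

incomplete = check_pair_structure(toy_rows[:3])  # drops one row of the pair
complete = check_pair_structure(toy_rows)
```

With a pandas DataFrame, the same helper could be fed `df.to_dict("records")`. Running it after any filtering step is one way to confirm that no pair has been left partially populated.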
## Generators

| Generator | Pairs | Relation Types |
|-----------|-------|----------------|
| temporal | 1,742 | born_before, occurred_in_century, were_contemporaries |
| language | 1,672 | translates_to, translation_of, word_is_language |
| geography | 1,558 | area_greater, cardinal_direction, closer_to, contained_in, population_greater |
| authorship | 1,476 | author_of, created_by, worked_in_domain |
| mathematics | 1,308 | arithmetic_result, greater_than, has_property, is_divisible_by, more_factors, shares_factor |
| chemistry | 1,306 | atomic_number_greater, in_block, member_of_group, property_greater, state_at_room_temp, symbol_of |
| biology | 1,238 | has_rank, is_member_of, same_taxon |
| anatomy | 1,066 | in_region, in_system, is_structure_type, same_region, same_system |
| astronomy | 271 | closer_to_sun, in_constellation, is_type, orbits, property_greater, star_property |

## Validation

Cascading ensemble: Llama 3.1 405B Instruct (logit-level via NDIF) → Haiku → Sonnet → Opus. Every included affirmative row has validator agreement with the ground-truth label. Disputed and awkward rows are excluded.

## Train/Test Splitting

**Split on `pair_id`, not on rows.** All 4 rows sharing a `pair_id` must land in the same split to prevent leakage.
```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_parquet("hf://datasets/alliedtoasters/latenet-v0/latenet_v0.parquet")

# Split the unique pair IDs, then select rows by ID so all 4 rows of a pair stay together.
pair_ids = df["pair_id"].unique()
train_ids, test_ids = train_test_split(pair_ids, test_size=0.2, random_state=42)
train = df[df["pair_id"].isin(train_ids)]
test = df[df["pair_id"].isin(test_ids)]
```

## Usage with lmprobe

```python
import pandas as pd
from lmprobe import Probe

df = pd.read_parquet("hf://datasets/alliedtoasters/latenet-v0/latenet_v0.parquet")

# Keep only the affirmative (non-negated) rows, then separate them by label.
aff = df[~df["negated"]]
true_statements = aff[aff["label"]]["statement"].tolist()
false_statements = aff[~aff["label"]]["statement"].tolist()

# Fit a probe that contrasts true statements against false ones.
probe = Probe(model="Qwen/Qwen2.5-0.5B-Instruct", layers="fast_auto", random_state=42)
probe.fit(true_statements, false_statements)
```

## Citation

If you use this dataset, please cite the Geometry of Truth paper that inspired it:

```bibtex
@article{marks2023geometry,
  title={The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Statements},
  author={Marks, Samuel and Tegmark, Max},
  journal={arXiv preprint arXiv:2310.06824},
  year={2023}
}
```

## License

MIT

Provider: alliedtoasters
