MoyYuan/Asymmetricity-2.0

Name: MoyYuan/Asymmetricity-2.0
Creator: MoyYuan
Published: 2025-12-31 10:44:58
License: MIT

Hugging Face, updated 2025-12-31, indexed 2025-11-15

Download link:

https://hf-mirror.com/datasets/MoyYuan/Asymmetricity-2.0


Description:

---
license: mit
size_categories:
- 10M<n<100M
language:
- en
---

# Asymmetricity v2: A Benchmark for Evaluating LLMs on Symmetric and Asymmetric Relation Understanding

**Asymmetricity v2** is a massive upgrade to the original benchmark dataset, designed to evaluate large language models (LLMs) on their ability to distinguish and reason over symmetric (e.g., *borders*) and antisymmetric (e.g., *parent of*) relations in natural language. Now expanded to **over 70 million entries**, the dataset is derived from Wikidata triples and cast into a natural language inference (NLI) format, enabling fine-grained, large-scale analysis of relational understanding.

The dataset includes a variety of textual forms, both in natural language and in a delexicalized version where entities are replaced by Wikidata IDs (e.g., `Q7024230`). This enables models to be evaluated both on surface-level text and on abstract relational structure.

This is the second version of the dataset. The original version can be found here: [Asymmetricity v1](https://huggingface.co/datasets/MoyYuan/Asymmetricity).

---

## Overview

Understanding the symmetry properties of relations is essential for robust reasoning. For example, if *A is the parent of B*, then *B is the parent of A* should clearly be false. Many LLMs, however, struggle to consistently apply this logic, particularly when the phrasing or entity names change.

**Asymmetricity v2** provides a structured and scalable testbed for evaluating this capability, drawing on real-world knowledge base relations and reformulating them as NLI-style sentence pairs. With the inclusion of reasoning chain lengths, v2 also supports evaluating multi-step relational reasoning.

---

## Motivation

Current language models often rely on surface patterns and statistical co-occurrence, which can obscure their understanding of logical constraints like symmetry and directionality.
This benchmark tests models on:

- Recognizing whether a relation is symmetric or asymmetric
- Identifying correct entailments and contradictions in natural language
- Generalizing across entity names and abstract identifiers (Wikidata IDs)
- Handling reasoning chains of varying lengths

---

## Dataset Design

Each example is based on Wikidata triples involving entities and relations. The data is converted into a list of natural language premises and a hypothesis representing a logical consequence (or contradiction). A label indicates whether the hypothesis logically follows from the premises.

---

## Evaluation Focus

This dataset supports research in:

- Logical consistency and relation reasoning in LLMs
- Sensitivity to relation directionality and symmetry
- Robustness across lexicalized and abstract (ID-based) inputs
- Pretraining biases related to relation semantics
- Multi-step reasoning capabilities (via chain length analysis)

It is suitable for prompting, zero/few-shot evaluation, embedding-based retrieval, and supervised fine-tuning.

---

## Data Format

Each line in the dataset is a JSON object with the following fields:

- `tier`: A string indicating the difficulty tier or partition of the example.
- `lex`: The lexicalization type (e.g., `text` for natural language, `delex` for ID-based).
- `lang`: The language code of the text (e.g., `en`).
- `premises`: A list of natural language sentences acting as the logical basis for the inference.
- `hypothesis`: The target sentence to be validated against the premises.
- `label`: The inference label (e.g., `entailment`, `contradiction`).
- `relation_ids`: A list of Wikidata property IDs (e.g., `['P40']`) involved in the reasoning chain.
- `rule`: The specific logical rule being tested (e.g., `symmetry`, `antisymmetry`).
- `entities`: A list of entity identifiers or names present in the example.
- `chain_len`: An integer (`int64`) representing the length of the reasoning chain (number of steps/triples).
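
As a rough illustration of how records in this format might be consumed, the sketch below parses JSON lines, builds a simple NLI-style prompt, and reports accuracy grouped by `rule`, `lex`, and `chain_len`. The two sample records are invented to match the documented fields (including the `tier` value and the sentence wording), and `parse_records`, `build_prompt`, `accuracy_by_group`, and `always_entailment` are hypothetical helper names, not part of the dataset or any library.

```python
import json
from collections import defaultdict

# Hypothetical records following the documented schema. All values are
# invented for illustration and are not taken from the dataset.
SAMPLE_LINES = [
    json.dumps({
        "tier": "example", "lex": "text", "lang": "en",
        "premises": ["Alice is the parent of Bob."],
        "hypothesis": "Bob is the parent of Alice.",
        "label": "contradiction",
        "relation_ids": ["P40"], "rule": "antisymmetry",
        "entities": ["Alice", "Bob"], "chain_len": 1,
    }),
    json.dumps({
        "tier": "example", "lex": "text", "lang": "en",
        "premises": ["France borders Spain."],
        "hypothesis": "Spain borders France.",
        "label": "entailment",
        "relation_ids": ["P47"], "rule": "symmetry",
        "entities": ["France", "Spain"], "chain_len": 1,
    }),
]


def parse_records(lines):
    """Yield one dict per non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def build_prompt(record):
    """Join the premises and hypothesis into a plain-text prompt (assumed layout)."""
    premises = " ".join(record["premises"])
    return f"Premises: {premises}\nHypothesis: {record['hypothesis']}\nLabel:"


def accuracy_by_group(records, predict, keys=("rule", "lex", "chain_len")):
    """Compute accuracy of `predict` for each combination of the given fields."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        group = tuple(rec[k] for k in keys)
        totals[group] += 1
        if predict(rec) == rec["label"]:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


def always_entailment(record):
    """Trivial baseline standing in for a real model call."""
    return "entailment"


records = list(parse_records(SAMPLE_LINES))
scores = accuracy_by_group(records, always_entailment)
for group, acc in sorted(scores.items()):
    print(group, acc)
```

In practice, `always_entailment` would be replaced by a function that sends `build_prompt(record)` to a model and maps its output to a label. Grouping by `chain_len` and `lex` is one way to look at the multi-step and lexicalized-versus-ID comparisons the card describes.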

---

## Citation

If you use this dataset in your work, please cite the following paper:

```bibtex
@article{yuan2025capturing,
  title={Capturing Symmetry and Antisymmetry in Language Models through Symmetry-Aware Training Objectives},
  author={Yuan, Zhangdie and Vlachos, Andreas},
  journal={arXiv preprint arXiv:2504.16312},
  year={2025}
}
```


Provider:

MoyYuan
