KazeJiang/Multi-CounterFact

Name: KazeJiang/Multi-CounterFact
Creator: KazeJiang
Published: 2025-12-17 07:59:21
License: No description available

Source: Hugging Face. Updated: 2025-12-17. Indexed: 2025-12-20.

Download link:

https://hf-mirror.com/datasets/KazeJiang/Multi-CounterFact


Description:

Multi-CounterFact is a multilingual benchmark for cross-lingual knowledge editing in large language models. It extends the original English CounterFact dataset to five languages (English, German, French, Japanese, and Chinese) while preserving the original evaluation structure of reliability, generality, and locality. Each data instance represents a single editable factual association and contains one target factual prompt, two paraphrased prompts expressing the same fact, and ten semantically unrelated prompts that share the same predicate. This design enables fine-grained evaluation of whether a knowledge edit updates the intended fact (reliability), generalizes to paraphrases (generality), and avoids unintended side effects on unrelated facts (locality). The main fields include the requested rewrite prompt, the subject entity, the original answer, and the counterfactual answer, among others, and the data is split into training, validation, and test sets. The non-English versions were created by automatically translating the original English data, followed by quality control and human verification.
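
The three checks described above can be illustrated with a minimal Python sketch. The field names (`requested_rewrite`, `paraphrase_prompts`, `neighborhood_prompts`, `target_true`, `target_new`) are assumptions borrowed from the layout of the original CounterFact release and are not confirmed by this listing; the record, the `score_edit` helper, and the toy model are hypothetical. Exact string matching is used as a simplification, and only two unrelated prompts are shown instead of ten.

```python
from typing import Callable, Dict

# Hypothetical record. Field names are ASSUMED to mirror the original
# CounterFact layout; check the actual dataset schema before relying on them.
record = {
    "requested_rewrite": {
        "prompt": "{} is located in",
        "subject": "The Eiffel Tower",
        "target_true": "Paris",   # original answer
        "target_new": "Rome",     # counterfactual answer
    },
    "paraphrase_prompts": [
        "The Eiffel Tower can be found in",
        "You will find the Eiffel Tower in",
    ],
    # A real instance is described as having ten such prompts; two for brevity.
    "neighborhood_prompts": [
        "The Louvre is located in",
        "Notre-Dame is located in",
    ],
}


def score_edit(rec: dict, predict: Callable[[str], str]) -> Dict[str, float]:
    """Score one edit with exact-match reliability, generality, and locality."""
    rewrite = rec["requested_rewrite"]
    target_prompt = rewrite["prompt"].format(rewrite["subject"])
    new_answer = rewrite["target_new"]
    old_answer = rewrite["target_true"]

    # Reliability: the edited fact itself now yields the counterfactual answer.
    reliability = float(predict(target_prompt) == new_answer)

    # Generality: paraphrases of the edited fact also yield the new answer.
    paraphrases = rec["paraphrase_prompts"]
    generality = sum(predict(p) == new_answer for p in paraphrases) / len(paraphrases)

    # Locality: unrelated prompts still yield the original answer.
    neighbors = rec["neighborhood_prompts"]
    locality = sum(predict(p) == old_answer for p in neighbors) / len(neighbors)

    return {"reliability": reliability, "generality": generality, "locality": locality}


def toy_edited_model(prompt: str) -> str:
    """Stand-in for an edited language model (purely illustrative)."""
    return "Rome" if "Eiffel Tower" in prompt else "Paris"


scores = score_edit(record, toy_edited_model)
print(scores)
```

In practice `predict` would wrap an edited language model, and the same scoring could be repeated per language to compare how well an edit made in one language transfers to the others.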

Provided by:

KazeJiang
