klimzaporojets/emerge-benchmark

Name: klimzaporojets/emerge-benchmark
Creator: klimzaporojets
Published: 2026-04-06 20:41:37
License: CC BY-SA 4.0

Source: Hugging Face. Updated 2026-04-06; indexed 2026-04-12.

Download link:

https://hf-mirror.com/datasets/klimzaporojets/emerge-benchmark

Description:

---
license: cc-by-sa-4.0
task_categories:
- text-classification
- token-classification
language:
- en
tags:
- knowledge-graph
- information-extraction
- relation-extraction
- entity-linking
- wikidata
- wikipedia
- temporal
pretty_name: "EMERGE: Updating Knowledge Graphs with Emerging Textual Knowledge"
size_categories:
- 1K<n<10K
dataset_info:
- config_name: default
  features:
  - name: hash_id
    dtype: string
  - name: passage
    dtype: string
  - name: anchor_title
    dtype: string
  - name: anchor_page_qid
    dtype: string
  - name: revision_date
    dtype: string
  splits:
  - name: test
    num_examples: 3500
---

# EMERGE: A Benchmark for Updating Knowledge Graphs with Emerging Textual Knowledge

**[Paper](https://arxiv.org/abs/2507.03617)** | **[Code](https://github.com/klimzaporojets/emerge-benchmark)**

## Overview

EMERGE is a benchmark for **Text-driven KG Updating (TKGU)**: evaluating methods for updating knowledge graphs from textual evidence. Each instance pairs a textual passage with a KG snapshot and a set of update operations induced by the passage.

EMERGE defines five TKGU operations:

| Operation | Code | Description |
|-----------|------|-------------|
| **Exists** | `x-triples` | Triple already present in the KG, supported by the textual passage |
| **Add** | `e-triples` | New triple involving entities that already exist in the KG |
| **Mint+Add** | `ee-triples` | New triple involving one or more entities not yet in the KG |
| **Infer** | `ee-kg-triples` | Triple linking a newly introduced entity to an existing KG entity, not explicitly stated in the passage |
| **Deprecate** | `d-triples` | Existing triple invalidated by updated information in the passage |

## Dataset Contents

### Test set (`evaluation_set/`)

3,500 instances across 7 annual Wikidata snapshots (2019-2025), organized as:

```
evaluation_set/
├── snapshot_2019-01-01/
│   ├── delta_2019-01-08.jsonl   (100 instances)
│   ├── delta_2019-01-15.jsonl
│   ├── delta_2019-01-22.jsonl
│   ├── delta_2019-01-29.jsonl
│   └── delta_2019-02-05.jsonl
├── snapshot_2020-01-01/
...
└── snapshot_2025-01-01/
```

Each instance (JSONL line) contains:

- **`passage`**: Wikipedia passage text
- **`mentions`**: Entity mentions with character offsets and Wikidata QIDs
- **`tkgu_triples`**: Ground-truth triples with TKGU operations and LLM assessments
- **`predictions`**: Outputs from 13 benchmark models
- **`hash_id`**: Unique instance identifier

### Annotations (`annotation/`)

Human annotation data for inter-annotator agreement statistics.

### KG Snapshots (`kg_snapshots/`)

7 yearly Wikidata KG snapshots (gzip-compressed TSV, ~3.7GB total). Each row is a `(subject, predicate, object)` triple active at that snapshot date. Needed for relik-cie Exists operation evaluation.

### Relation Indices (`indices/`)

Per-snapshot relation embeddings (~400MB) used by ReLiK and EDC+ benchmarks.
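
As a rough illustration of the layout described in this section, the sketch below counts instances per snapshot under `evaluation_set/` and streams `(subject, predicate, object)` rows from one gzip-compressed KG snapshot. The helper names `count_instances` and `iter_kg_triples` are hypothetical and not part of the EMERGE code. The file names inside `kg_snapshots/` are not given here, so the path is left to the caller. The reader also assumes the TSV has exactly three tab-separated columns and no header row, which should be checked against the actual files.

```python
import gzip
import json
from collections import Counter
from pathlib import Path
from typing import Iterator, Tuple


def count_instances(eval_root: str) -> Counter:
    """Count JSONL instances per snapshot directory (hypothetical helper)."""
    counts: Counter = Counter()
    # Matches evaluation_set/snapshot_YYYY-01-01/delta_YYYY-MM-DD.jsonl
    for delta_file in sorted(Path(eval_root).glob("snapshot_*/delta_*.jsonl")):
        with open(delta_file, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    json.loads(line)  # raises early on a malformed line
                    counts[delta_file.parent.name] += 1
    return counts


def iter_kg_triples(tsv_gz_path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (subject, predicate, object) rows from a gzip-compressed TSV.

    Assumption: three tab-separated columns, no header row.
    Rows with a different number of columns are skipped.
    """
    with gzip.open(tsv_gz_path, "rt", encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) == 3:
                yield parts[0], parts[1], parts[2]
```

With a complete download, `count_instances("data/evaluation_set")` would presumably report 500 instances for each of the 7 snapshots, matching the stated total of 3,500. Streaming the snapshot rather than loading it at once is a deliberate choice, since the snapshots total about 3.7GB compressed.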

## Benchmark Models

The test set includes pre-computed predictions from 13 models:

| Model | Type | Backend |
|-------|------|---------|
| EDC+ GPT-5.1 | LLM (in-context learning) | GPT-5.1 |
| EDC+ Mistral-Large | LLM (in-context learning) | Mistral-Large |
| EDC+ Mistral-Small | LLM (in-context learning) | Mistral-Small |
| EDC+ ZS GPT-5.1 | LLM (zero-shot) | GPT-5.1 |
| EDC+ ZS Mistral-Large | LLM (zero-shot) | Mistral-Large |
| KGGen GPT-5.1 | LLM | GPT-5.1 |
| KGGen Mistral-Large | LLM | Mistral-Large |
| KGGen Mistral-Small | LLM | Mistral-Small |
| RAKG Mistral-Large | LLM | Mistral-Large |
| RAKG Mistral-Small | LLM | Mistral-Small |
| REBEL | Local seq2seq | Babelscape/rebel-large |
| ReLiK OIE | Local neural | sapienzanlp/relik-relation-extraction-nyt-large |
| ReLiK CIE | Local neural | sapienzanlp/relik-cie-large |

## Usage

### Download with the EMERGE repository

```bash
git clone https://github.com/klimzaporojets/emerge-benchmark.git
cd emerge-benchmark
./scripts/download_data.sh          # test set + annotations
./scripts/download_data.sh --kg     # + KG snapshots
./scripts/download_data.sh --all    # + relation indices
```

### Download with Python

```python
from huggingface_hub import snapshot_download

# Download test set and annotations
snapshot_download(
    repo_id="klimzaporojets/emerge-benchmark",
    repo_type="dataset",
    local_dir="./data",
    allow_patterns=["evaluation_set/**", "annotation/**"],
)
```

### Load a single instance

```python
import json

with open("data/evaluation_set/snapshot_2024-01-01/delta_2024-01-08.jsonl") as f:
    instance = json.loads(f.readline())

print(instance["passage"][:200])
print(f"TKGU triples: {len(instance['tkgu_triples'])}")
print(f"Models with predictions: {list(instance['predictions'].keys())}")
```

## Instance Format

Each JSONL line contains:

| Field | Type | Description |
|-------|------|-------------|
| `hash_id` | string | Unique instance identifier |
| `passage` | string | Wikipedia passage text |
| `mentions` | list | Entity mentions with char offsets and Wikidata QIDs |
| `tkgu_triples` | list | Ground-truth triples with operations and LLM assessments |
| `predictions` | dict | Model predictions keyed by model name |
| `revision_date` | string | Wikipedia revision timestamp |
| `anchor_title` | string | Wikipedia article title |
| `delta_dates` | list | Start and end dates of the delta period |

See the [code repository](https://github.com/klimzaporojets/emerge-benchmark) for the full schema documentation (`data/README.md`).

## Citation

```bibtex
@article{zaporojets2025emerge,
  title={EMERGE: A Benchmark for Updating Knowledge Graphs with Emerging Textual Knowledge},
  author={Zaporojets, Klim and Daza, Daniel and Barba, Edoardo and Assent, Ira and Navigli, Roberto and Groth, Paul},
  journal={arXiv preprint arXiv:2507.03617},
  year={2025}
}
```

## License

This dataset is licensed under [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/). It is derived from [Wikipedia](https://en.wikipedia.org/) (CC BY-SA 3.0+) and [Wikidata](https://www.wikidata.org/) (CC0).

Provider:

klimzaporojets
