FactSpan: Multilingual Fact-Checking Dataset

Source: NIAID Data Ecosystem (indexed 2026-05-02)

Download link:

https://zenodo.org/record/15084387

Resource description:

The FactSpan dataset is an extension of the X-Fact dataset, designed to support multilingual fact-checking research. This dataset overcomes limitations in existing datasets by incorporating recent data from the ClaimReview Markup for Data Commons Feed and providing detailed annotations.

Key Features:

- Data Source: Claims are sourced from both the X-Fact dataset (up to 2020) and the Data Commons Feed (post-2020).
- Validity: Claims are filtered to include only those from organizations recognized by the International Fact-Checking Network (IFCN) and Duke Reporters’ Lab, ensuring high reliability.
- Standardized Labels: Verdict labels are standardized into five categories: False, Mostly False, Partly False/Misleading, Mostly True, and True.

Annotations (Annotated Dataset Only): The FactSpan_annotated.csv dataset includes rich annotations generated using GPT-3.5:

- label: The standardized verdict label.
- claim: The fact-checked claim.
- claimDate: The date of the claim.
- claim_year: The year of the claim.
- language: The language of the claim.
- Position Statements: Indicates the presence of position statements.
- Entity/Event Properties: Indicates the presence of entity or event properties.
- Quote: Indicates the presence of quotes.
- Numerical Data: Indicates the presence of numerical data.
- claim type: Categorizes the claim as factual or opinion.
- topics: Categorizes the claim into one of five predefined topics (Health and Pandemics, Politics and Governance, Society and Culture, Economy and Environment, Conflict and Security).
- mapped_label: An additional mapped label, for edge cases or further label mappings.

Unannotated Dataset: The FactSpan.csv dataset includes:

- label: The standardized verdict label.
- claim: The fact-checked claim.
- claimDate: The date of the claim.
- language: The language of the claim.

Purpose: This dataset aims to facilitate research in multilingual fact-checking, providing a comprehensive and up-to-date resource for developing and evaluating fact-checking models.
Repository: The dataset is maintained in the GitHub repository. The repository also contains scripts for expanding and updating the dataset.
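
As a rough illustration of how the unannotated file might be consumed, the sketch below reads rows with the four columns listed for FactSpan.csv, keeps only rows whose label is one of the five standardized verdicts, and counts labels per language. It uses only the Python standard library. The sample rows are made up for illustration; the exact spelling of label values in the file, the date format, and the use of short language codes such as "en" are assumptions, and the helper names `read_claims` and `labels_by_language` are hypothetical.

```python
import csv
import io
from collections import Counter

# The five standardized verdict labels listed in the description.
# Assumption: the CSV spells them exactly like this.
LABELS = {"False", "Mostly False", "Partly False/Misleading", "Mostly True", "True"}

# Columns the description lists for FactSpan.csv.
REQUIRED_COLUMNS = ("label", "claim", "claimDate", "language")


def read_claims(handle):
    """Return rows (as dicts) whose label is one of the five standardized verdicts."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [row for row in reader if row["label"] in LABELS]


def labels_by_language(rows):
    """Count verdict labels per language value."""
    counts = {}
    for row in rows:
        counts.setdefault(row["language"], Counter())[row["label"]] += 1
    return counts


# Made-up sample rows for illustration only; not taken from the dataset.
# The date format and language codes here are assumptions.
SAMPLE = io.StringIO(
    "label,claim,claimDate,language\n"
    "False,Example claim one,2021-03-01,en\n"
    "Mostly True,Example claim two,2022-07-15,es\n"
    "False,Example claim three,2019-11-30,en\n"
    "Unproven,Example claim four,2020-01-01,en\n"
)

rows = read_claims(SAMPLE)          # the "Unproven" row is dropped
counts = labels_by_language(rows)
print(counts["en"]["False"])        # 2
```

To run this against the downloaded file, the in-memory sample could be swapped for something like `open("FactSpan.csv", newline="", encoding="utf-8")`, assuming the file is UTF-8 encoded and sits in the working directory.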

Created:

2025-03-25
