MultiClaimNet: A Massively Multilingual Dataset of Fact-Checked Claim Clusters
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/15100351
Description:
MultiClaimNet is a collection of three multilingual claim cluster datasets. Claims discussing similar facts are automatically grouped, and each claim is annotated with a cluster ID. The three datasets listed below together contain claims written in 86 languages across diverse topics.
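| Dataset    | Number of Claims | Number of Clusters | Number of Languages |
|------------|------------------|--------------------|---------------------|
| ClaimCheck | 1187             | 197                | 22                  |
| ClaimMatch | 1171             | 192                | 36                  |
| MultiClaim | 85.3K            | 30.9K              | 78                  |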
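As a quick reading aid, the claim and cluster counts above imply an average cluster size for each dataset. The sketch below only divides the published counts; the MultiClaim figures are rounded in the source ("85.3K", "30.9K"), so that ratio is approximate. The names `STATS` and `mean_cluster_size` are introduced here for illustration only.

```python
# Counts copied from the table above. The MultiClaim values are rounded
# in the source (85.3K claims, 30.9K clusters), so its ratio is approximate.
STATS = {
    "ClaimCheck": (1187, 197),
    "ClaimMatch": (1171, 192),
    "MultiClaim": (85_300, 30_900),
}


def mean_cluster_size(claims: int, clusters: int) -> float:
    """Average number of claims per cluster."""
    return claims / clusters


# Round to two decimals for display.
sizes = {
    name: round(mean_cluster_size(claims, clusters), 2)
    for name, (claims, clusters) in STATS.items()
}

for name, size in sizes.items():
    print(f"{name}: about {size} claims per cluster")
```

On these counts, ClaimCheck and ClaimMatch average roughly six claims per cluster, while MultiClaim averages fewer than three.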
Preprint: https://arxiv.org/abs/2503.22280
Content:
Claim - Fact-checked claim
ClusterID - Cluster ID
Language - Original language of the claim
Translation - English translation
NID - Unique identifier of the claim
In addition to the above fields, the MultiClaim dataset contains the following fields from the original dataset.
Timestamp
URL
References
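The fields listed above suggest a simple way to work with the data: group rows by ClusterID to recover each cluster of related claims. The following is a minimal sketch, assuming the files are distributed as CSV with column headers that match the field names exactly; the record does not state the file format, so verify this against the downloaded files. The sample rows are placeholders made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows for illustration only; they are NOT real dataset entries.
# Assumption: the real files are CSV with these exact column headers.
SAMPLE_CSV = """NID,Claim,ClusterID,Language,Translation
n1,placeholder claim A,c1,es,placeholder translation A
n2,placeholder claim B,c1,en,placeholder translation B
n3,placeholder claim C,c2,fr,placeholder translation C
"""


def group_by_cluster(handle):
    """Map each ClusterID to the list of rows (as dicts) that carry it."""
    clusters = defaultdict(list)
    for row in csv.DictReader(handle):
        clusters[row["ClusterID"]].append(row)
    return dict(clusters)


def languages_per_cluster(clusters):
    """Map each ClusterID to the sorted set of languages it contains."""
    return {
        cluster_id: sorted({row["Language"] for row in rows})
        for cluster_id, rows in clusters.items()
    }


clusters = group_by_cluster(io.StringIO(SAMPLE_CSV))
langs = languages_per_cluster(clusters)

for cluster_id, rows in clusters.items():
    print(cluster_id, len(rows), "claims, languages:", langs[cluster_id])
```

To run this on a real file, replace the `io.StringIO(SAMPLE_CSV)` argument with a handle opened via `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded file (the file names are not given in this record).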
If you use any dataset from MultiClaimNet in a publication, project, tool, or any other form, please cite the following paper:
@misc{panchendrarajan2025multiclaimnet,
title={MultiClaimNet: A Massively Multilingual Dataset of Fact-Checked Claim Clusters},
author={Rrubaa Panchendrarajan and Rubén Míguez and Arkaitz Zubiaga},
year={2025},
eprint={2503.22280},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2503.22280},
}
If you use the MultiClaim dataset from the MultiClaimNet collection in a publication, project, tool, or any other form, please also cite the following paper:
@inproceedings{pikuliak-etal-2023-multilingual,
title = "Multilingual Previously Fact-Checked Claim Retrieval",
author = "Pikuliak, Mat{\'u}{\v{s}} and Srba, Ivan and Moro, Robert and Hromadka, Timo and Smole{\v{n}}, Timotej and Meli{\v{s}}ek, Martin and Vykopal, Ivan and Simko, Jakub and Podrou{\v{z}}ek, Juraj and Bielikova, Maria",
editor = "Bouamor, Houda and Pino, Juan and Bali, Kalika",
booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
month = dec,
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.emnlp-main.1027",
doi = "10.18653/v1/2023.emnlp-main.1027",
pages = "16477--16500",
}
Created:
2025-03-31



