MultiClaimNet: A Massively Multilingual Dataset of Fact-Checked Claim Clusters

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/15100351
Description:
MultiClaimNet is a collection of three multilingual claim cluster datasets. Claims discussing similar facts are automatically grouped and annotated with a cluster ID. The following three datasets within MultiClaimNet contain claims written in 86 languages across diverse topics.

Dataset      Number of Claims   Number of Clusters   Number of Languages
ClaimCheck   1187               197                  22
ClaimMatch   1171               192                  36
MultiClaim   85.3K              30.9K                78

Preprint: https://arxiv.org/abs/2503.22280

Content:
Claim - Fact-checked claim
ClusterID - Cluster ID
Language - Original language of the claim
Translation - English translation
NID - Unique identifier of the claim

In addition to the above fields, the MultiClaim dataset contains the following fields from the original dataset:
Timestamp
URL

References

If you use any dataset from MultiClaimNet in any publication, project, tool, or in any other form, please cite the following paper:

    @misc{panchendrarajan2025multiclaimnet,
      title={MultiClaimNet: A Massively Multilingual Dataset of Fact-Checked Claim Clusters},
      author={Rrubaa Panchendrarajan and Rubén Míguez and Arkaitz Zubiaga},
      year={2025},
      eprint={2503.22280},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2503.22280},
    }

If you use the MultiClaim dataset from the MultiClaimNet collection in any publication, project, tool, or in any other form, please cite the following paper in addition to the above:

    @inproceedings{pikuliak-etal-2023-multilingual,
      title = "Multilingual Previously Fact-Checked Claim Retrieval",
      author = "Pikuliak, Mat{\'u}{\v{s}} and Srba, Ivan and Moro, Robert and Hromadka, Timo and Smole{\v{n}}, Timotej and Meli{\v{s}}ek, Martin and Vykopal, Ivan and Simko, Jakub and Podrou{\v{z}}ek, Juraj and Bielikova, Maria",
      editor = "Bouamor, Houda and Pino, Juan and Bali, Kalika",
      booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
      month = dec,
      year = "2023",
      address = "Singapore",
      publisher = "Association for Computational Linguistics",
      url = "https://aclanthology.org/2023.emnlp-main.1027",
      doi = "10.18653/v1/2023.emnlp-main.1027",
      pages = "16477--16500",
    }
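The field layout listed under Content (Claim, ClusterID, Language, Translation, NID) lends itself to grouping claims by cluster. The following is a minimal Python sketch of that idea, not the authors' tooling. It assumes the data is available as CSV with a header row using those field names, which the description does not confirm; the sample rows, the language codes, and the helper names `group_by_cluster` and `languages_per_cluster` are all hypothetical placeholders.

```python
import csv
import io
from collections import defaultdict

# Hypothetical placeholder rows; NOT taken from the actual dataset.
# The CSV layout and the two-letter language codes are assumptions.
SAMPLE_CSV = """NID,Claim,ClusterID,Language,Translation
1,Placeholder claim A,0,en,Placeholder claim A
2,Afirmacion de ejemplo A,0,es,Placeholder claim A
3,Placeholder claim B,1,en,Placeholder claim B
"""


def group_by_cluster(rows):
    """Map each ClusterID to the list of claim rows that carry it."""
    clusters = defaultdict(list)
    for row in rows:
        clusters[row["ClusterID"]].append(row)
    return dict(clusters)


def languages_per_cluster(clusters):
    """Map each ClusterID to the sorted set of languages among its claims."""
    return {
        cluster_id: sorted({row["Language"] for row in members})
        for cluster_id, members in clusters.items()
    }


# csv.DictReader yields one dict per row, keyed by the header names.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
clusters = group_by_cluster(rows)
langs = languages_per_cluster(clusters)

for cluster_id, members in clusters.items():
    print(cluster_id, len(members), langs[cluster_id])
```

To run this against a downloaded file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle, and adjust the delimiter or column names if the actual files differ from the assumed layout.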
Created:
2025-03-31