X-FACT
Source: arXiv | Updated: 2021-06-17 | Indexed: 2024-06-21
Download link:
https://github.com/utahnlp/x-fact
Resource description:
X-FACT is a large-scale multilingual fact-checking dataset created by the School of Computing at the University of Utah. It contains 31,189 short claims in 25 languages, each labeled for veracity by expert fact-checkers. Beyond the standard test set, it provides two challenge sets for evaluating how well multilingual models generalize across domains and languages. The data was collected from multiple fact-checking sources and passed through semi-automated text processing to remove duplicates and self-annotated examples. The dataset is intended mainly for developing and evaluating multilingual fact-checking systems, with a focus on cross-lingual and cross-domain fact checking.
Provider:
School of Computing, University of Utah
Created:
2021-06-17



