Modified Binary FEVER Dataset
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/lingo-iitgn/XME
Description:
This dataset is built from original instances of the binary FEVER dataset. For each instance, 1 to 25 semantically similar paraphrases were manually generated, targeting the closed-book fact verification task. The instances were also translated into five languages using the Google Translate API. In a quality assessment, the average annotator accuracy was 88.07%. The dataset comprises 104,966 training instances and 10,444 validation instances, of which 1,200 instances are used for editing and evaluation. Its task is fact verification.
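As a rough illustration of the structure described above, the following sketch models one instance with its paraphrases and translations and checks the stated bounds. The class name, field names, and language-code keys are hypothetical and are not taken from the dataset's actual files; only the 1 to 25 paraphrase range, the five-language limit, and the split sizes come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class FactInstance:
    """Hypothetical record layout; names are illustrative, not the dataset's real schema."""

    claim: str
    supported: bool  # binary label: True if the claim is supported, False otherwise
    paraphrases: list[str] = field(default_factory=list)
    # Assumed mapping from a language code to the translated claim.
    translations: dict[str, str] = field(default_factory=dict)

    def validate(self) -> None:
        """Check the bounds stated in the description (1 to 25 paraphrases, at most 5 languages)."""
        if not 1 <= len(self.paraphrases) <= 25:
            raise ValueError("expected between 1 and 25 paraphrases")
        if len(self.translations) > 5:
            raise ValueError("expected translations in at most five languages")


# Split sizes as reported in the description.
TRAIN_SIZE = 104_966
VALIDATION_SIZE = 10_444


def validation_share() -> float:
    """Fraction of all training and validation instances that fall in the validation split."""
    return VALIDATION_SIZE / (TRAIN_SIZE + VALIDATION_SIZE)
```

Calling `validate()` on a record raises `ValueError` when the paraphrase count falls outside the stated range, which can serve as a basic sanity check after loading the files from the repository.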
Provider:
Not explicitly stated



