
CFEVER

Source: arXiv. Updated: 2024-02-20. Indexed: 2024-07-30.
Download link:
https://ikmlab.github.io/CFEVER
Description:
CFEVER is a Chinese dataset for fact extraction and verification. It contains 30,012 manually written claims based on Chinese Wikipedia content. Each claim is labeled "SUPPORTS", "REFUTES", or "NOT ENOUGH INFO" to indicate whether the evidence bears it out. As in the FEVER dataset, claims labeled "SUPPORTS" or "REFUTES" come with the corresponding evidence sentences, drawn from one or more Chinese Wikipedia pages. In a five-way inter-annotator agreement test, the dataset reached a Fleiss' kappa of 0.7934.
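For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of how Fleiss' kappa can be computed for the three CFEVER labels. It assumes five annotators per claim. The function name `fleiss_kappa` and the toy ratings are hypothetical and made up for illustration; they are not CFEVER data and will not reproduce the reported 0.7934.

```python
from collections import Counter

# The three labels named in the dataset description.
LABELS = ("SUPPORTS", "REFUTES", "NOT ENOUGH INFO")


def fleiss_kappa(ratings, categories=LABELS):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    `ratings` is a list of lists: one inner list of labels per item.
    (Hypothetical helper for illustration only.)
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    counts = [Counter(item) for item in ratings]

    # Observed agreement: per-item share of rater pairs that agree, averaged.
    pairs = n_raters * (n_raters - 1)
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters) / pairs
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Expected agreement by chance, from the overall label proportions.
    total = n_items * n_raters
    p_e = sum((sum(c[cat] for c in counts) / total) ** 2 for cat in categories)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one label is ever used")
    return (p_bar - p_e) / (1.0 - p_e)


# Toy example (invented): 4 claims, 5 annotators each.
toy_ratings = [
    ["SUPPORTS"] * 5,
    ["REFUTES"] * 4 + ["NOT ENOUGH INFO"],
    ["NOT ENOUGH INFO"] * 5,
    ["SUPPORTS"] * 3 + ["REFUTES"] * 2,
]
print(round(fleiss_kappa(toy_ratings), 4))  # prints 0.6212
```

Values near 1 indicate near-perfect agreement and values near 0 indicate agreement no better than chance, so a kappa around 0.79 suggests annotators largely agreed on the labels.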
Created:
2024-02-20