Construction and Evaluation of Chinese Reading Comprehension Data Set for Security Field

Science Data Bank; updated 2022-09-26; indexed 2026-04-23
Download link:
https://www.scidb.cn/detail?dataSetId=6b4b2bdb26d647669582574187ddac69
Resource description:
We propose SecMRC, a Chinese machine reading comprehension dataset for the security field, which addresses the lack of professional data to support machine reading comprehension research in this field. The dataset contains 2,100 news articles from the anti-terrorism and security domain, 7,300 extractive question-answer pairs, and 2,100 generative question-answer pairs, for a total of 4,796,264 characters. Advanced reading comprehension models were tested on SecMRC. The results show that the F1 score on the extractive task reaches 72.5% and the average ROUGE-L on the generative task is 37.8%, both significantly below human-level performance. SecMRC emphasizes domain knowledge and is challenging, so it can effectively support research on machine reading comprehension in this field. The dataset construction method is general and can be extended to other professional fields.
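The description reports F1 for the extractive task and ROUGE-L for the generative task but does not say how either was computed. The following is a minimal sketch of both metrics, assuming character-level tokenization (a common choice for Chinese text) and a single reference answer per question. The function names `char_f1`, `lcs_length`, and `rouge_l` are illustrative and are not taken from the dataset's own evaluation code.

```python
from collections import Counter


def char_f1(prediction: str, reference: str) -> float:
    """Character-level F1 between a predicted and a reference answer span."""
    pred_chars = list(prediction)
    ref_chars = list(reference)
    if not pred_chars or not ref_chars:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_chars == ref_chars)
    # Multiset intersection counts each shared character at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_chars) & Counter(ref_chars)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_chars)
    recall = overlap / len(ref_chars)
    return 2 * precision * recall / (precision + recall)


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction: str, reference: str, beta: float = 1.0) -> float:
    """ROUGE-L F-measure over characters; beta weights recall (assumed 1.0)."""
    if not prediction or not reference:
        return 0.0
    lcs = lcs_length(prediction, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(prediction)
    recall = lcs / len(reference)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

A dataset-level score under these assumptions would be the mean of the per-question values. Published evaluations often differ in details such as punctuation stripping, multiple references, or the value of `beta`, so numbers from this sketch are not guaranteed to match the reported 72.5% and 37.8%.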
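Provider:
Wuhan University of Science and Technology; School of Computer Science and Technology, Huangjiahu Campus, Wuhan University of Science and Technology, Hongshan District, Wuhan, Hubei Province
Created: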
2022-08-06