Construction and Evaluation of Chinese Reading Comprehension Data Set for Security Field
Source: Science Data Bank | Updated: 2022-09-26 | Indexed: 2026-04-23
Download link:
https://www.scidb.cn/detail?dataSetId=6b4b2bdb26d647669582574187ddac69
Resource description:
We propose SecMRC, a Chinese machine reading comprehension dataset for the security field, which addresses the lack of professional data to support machine reading comprehension research in this field. The dataset contains 2,100 news articles from the anti-terrorism and security domain, 7,300 extractive question-answer pairs, and 2,100 generative question-answer pairs, for a total of 4,796,264 characters. We tested advanced reading comprehension models on SecMRC. The results show that F1 on the extractive task reaches 72.5% and the average ROUGE-L on the generative task is 37.8%, both of which are significantly below human performance. SecMRC emphasizes domain knowledge and is challenging, so it can effectively support research on machine reading comprehension in this field. The dataset construction method is general and can be extended to other professional fields.
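For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how an answer-level F1 and ROUGE-L score can be computed for Chinese text. It assumes character-level tokenization and a single reference answer per question; the description does not state how the SecMRC scores were actually computed, and the function names `char_f1`, `lcs_length`, and `rouge_l` are illustrative only.

```python
from collections import Counter


def char_f1(prediction: str, reference: str) -> float:
    """Character-overlap F1 between a predicted and a reference answer.

    Assumption: each character is one token (a common choice for Chinese).
    """
    pred, ref = list(prediction), list(reference)
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared character at most
    # min(count in prediction, count in reference) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def lcs_length(a, b) -> int:
    """Length of the longest common subsequence, using a rolling DP row."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction: str, reference: str, beta: float = 1.0) -> float:
    """ROUGE-L F-measure based on the character-level LCS.

    Assumption: beta = 1.0 weights precision and recall equally.
    """
    pred, ref = list(prediction), list(reference)
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

A corpus-level score under these assumptions would be the mean of the per-question values, for example `sum(char_f1(p, r) for p, r in pairs) / len(pairs)`.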
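Provided by:
Wuhan University of Science and Technology; School of Computer Science and Technology, Huangjiahu Campus, Wuhan University of Science and Technology, Hongshan District, Wuhan, Hubei Province
Created: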
2022-08-06



