
cmrc2019

ModelScope community · Updated 2025-11-27 · Indexed 2025-11-03
Download link:
https://modelscope.cn/datasets/hfl/cmrc2019
Description:
## GitHub repository: https://github.com/ymcui/cmrc2019

This repository contains the data for [The Third Evaluation Workshop on Chinese Machine Reading Comprehension (CMRC 2019)](https://hfl-rc.github.io/cmrc2019/).

We will present our paper at [COLING 2020](https://coling2020.org).

**Title: A Sentence Cloze Dataset for Chinese Machine Reading Comprehension**
Authors: Yiming Cui, Ting Liu, Ziqing Yang, Zhipeng Chen, Wentao Ma, Wanxiang Che, Shijin Wang, Guoping Hu
Link: https://arxiv.org/abs/2004.03116
Venue: COLING 2020

### Open Challenge Leaderboard (New!)

Keep track of the latest state-of-the-art systems on the CMRC 2019 dataset.
https://ymcui.github.io/cmrc2019/

### Submission Guidelines

To test your model on the hidden test and challenge sets, follow the instructions for submitting your model via the CodaLab worksheet.
https://worksheets.codalab.org/worksheets/0xe856b40d21de45bf898cd1d3c5135afe

### Baseline System

We provide a BERT-based baseline system for participants (see the *baseline* directory for details). Results on other sets will be announced later.

> QAC: Question-Level Accuracy
> PAC: Passage-Level Accuracy

| Data | Passage # | Query # | QAC | PAC | Fake Candidates | Availability |
| :------ | :-----: | :-----: | :-----: | :-----: | :-----: | :----- |
| Trial Data | 139 | 1,504 | 71.941% | 28.776% | No | Public |
| Train Data | 9,638 | 100,009 | N/A | N/A | No | Public |
| Development Data | 300 | 3,053 | 70.586% | 13.333% | **Yes** | Public |
| Qualifying Data | 500 | 5,081 | 70.01% | 8.20% | **Yes** | Semi-Hidden |
| Test Data | - | - | - | - | **Yes** | Hidden |

## International Standard Language Resource Number (ISLRN)

ISLRN: 813-010-842-493-2
http://www.islrn.org/resources/resources_info/8624/

### Reference

If you use our data in your research, please cite our [paper](https://arxiv.org/abs/2004.03116):

```
@inproceedings{cui-etal-2020-cmrc2019,
  title={A Sentence Cloze Dataset for Chinese Machine Reading Comprehension},
  author={Cui, Yiming and Liu, Ting and Yang, Ziqing and Chen, Zhipeng and Ma, Wentao and Che, Wanxiang and Wang, Shijin and Hu, Guoping},
  booktitle={Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020)},
  year={2020}
}
```

### Organization Committee

Host: Chinese Information Processing Society of China (CIPS)
Organizer: Joint Laboratory of HIT and iFLYTEK Research (HFL)
Sponsor: iFLYTEK Co., Ltd. and iFLYTEK Research (Hebei)

### Evaluation Co-Chairs

Ting Liu, Harbin Institute of Technology
Yiming Cui, Joint Laboratory of HIT and iFLYTEK Research

### Official HFL WeChat Account

Follow Joint Laboratory of HIT and iFLYTEK Research (HFL) on WeChat.

![qrcode.png](https://github.com/ymcui/cmrc2019/raw/master/qrcode.jpg)

### Contact us

Any problems? Feel free to contact us.

Email: **[cmrc2019 [aT] 126 [DoT] com](mailto:cmrc2019@126.com)**
Forum: [CodaLab Competition Forum](https://competitions.codalab.org/forums/19781/)
CMRC 2019 Official Website (Chinese): [https://cmrc2019.hfl-rc.com/](https://hfl-rc.github.io/cmrc2019/)
CMRC 2019 Official Website (English): [https://cmrc2019.hfl-rc.com/english/](https://hfl-rc.github.io/cmrc2019/english/)
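
### Metric sketch: QAC and PAC

The baseline table reports two metrics, QAC (Question-Level Accuracy) and PAC (Passage-Level Accuracy). As a rough illustration only, the sketch below assumes QAC is the fraction of individual blanks answered correctly and PAC is the fraction of passages in which every blank is answered correctly. The input layout (a mapping from a passage ID to a list of chosen candidate indices) and the function name `qac_pac` are hypothetical and do not come from the official evaluation script, which may differ.

```python
from typing import Dict, List, Tuple


def qac_pac(
    gold: Dict[str, List[int]],
    pred: Dict[str, List[int]],
) -> Tuple[float, float]:
    """Return (QAC, PAC) under the assumed definitions.

    gold / pred map a passage ID to the list of candidate indices chosen
    for each blank in that passage (hypothetical layout, for illustration).
    """
    total_questions = 0
    correct_questions = 0
    correct_passages = 0

    for passage_id, gold_answers in gold.items():
        pred_answers = pred.get(passage_id, [])
        # A blank counts as correct only if a prediction exists and matches.
        hits = sum(
            1
            for i, answer in enumerate(gold_answers)
            if i < len(pred_answers) and pred_answers[i] == answer
        )
        total_questions += len(gold_answers)
        correct_questions += hits
        # A passage counts as correct only if all of its blanks are correct.
        if hits == len(gold_answers):
            correct_passages += 1

    qac = correct_questions / total_questions if total_questions else 0.0
    pac = correct_passages / len(gold) if gold else 0.0
    return qac, pac


if __name__ == "__main__":
    gold = {"p1": [0, 2, 1], "p2": [1, 0]}
    pred = {"p1": [0, 2, 1], "p2": [1, 3]}
    qac, pac = qac_pac(gold, pred)
    print(f"QAC={qac:.3f} PAC={pac:.3f}")
```

In the toy example, four of five blanks are correct but only one of two passages is fully correct, which illustrates why PAC in the table is much lower than QAC: a single wrong blank fails the whole passage.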

Provider:
maas
Created:
2025-08-29