medalpaca/medical_meadow_cord19
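CORD-19 Dataset Overview
Basic Information
- Task category: Summarization
- Language: English
- Dataset size: 100K<n<1M
Dataset Description
- Dataset name: COVID-19 Open Research Dataset (CORD-19)
- Purpose: Provides the global research community with a resource of more than 1,000,000 scholarly articles, over 400,000 of which include full text, about COVID-19, SARS-CoV-2, and related coronaviruses. The dataset is intended to support the application of natural language processing and other AI techniques to generate new insights for the fight against infectious disease.
- Data processing: This is a processed version of the dataset. Some empty entries have been removed, and the records have been formatted to be compatible with alpaca training.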
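The processing step described above (dropping empty entries and reshaping records for alpaca training) can be sketched roughly as follows. This is a minimal illustration, not the dataset authors' actual script. The `to_alpaca_records` helper, the `title` and `abstract` input keys, the instruction wording, and the choice of abstract as input and title as output are all assumptions made for the example; only the `instruction` / `input` / `output` record layout is taken from the Alpaca format.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical instruction text; the wording in the published dataset may differ.
INSTRUCTION = "Summarize the given abstract as a title."


def to_alpaca_records(papers: Iterable[Dict[str, Optional[str]]]) -> List[Dict[str, str]]:
    """Convert paper dicts into Alpaca-style records, skipping empty entries.

    Assumes each paper has "title" and "abstract" keys (an assumption for
    this sketch, not a documented schema).
    """
    records = []
    for paper in papers:
        title = (paper.get("title") or "").strip()
        abstract = (paper.get("abstract") or "").strip()
        # Drop entries where either field is missing or blank.
        if not title or not abstract:
            continue
        records.append(
            {"instruction": INSTRUCTION, "input": abstract, "output": title}
        )
    return records


# Made-up sample rows, used only to exercise the helper.
sample = [
    {"title": "A study of coronaviruses", "abstract": "We review prior work."},
    {"title": "", "abstract": "This entry has no title."},
    {"title": "This entry has no abstract", "abstract": None},
]
records = to_alpaca_records(sample)
```

In this sketch, only the first sample row survives filtering, because the other two have a blank title or a missing abstract.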
Citation Information
@inproceedings{wang-etal-2020-cord,
    title = "{CORD-19}: The {COVID-19} Open Research Dataset",
    author = "Wang, Lucy Lu and Lo, Kyle and Chandrasekhar, Yoganand and Reas, Russell and Yang, Jiangjiang and Burdick, Doug and Eide, Darrin and Funk, Kathryn and Katsis, Yannis and Kinney, Rodney Michael and Li, Yunyao and Liu, Ziyang and Merrill, William and Mooney, Paul and Murdick, Dewey A. and Rishi, Devvret and Sheehan, Jerry and Shen, Zhihong and Stilson, Brandon and Wade, Alex D. and Wang, Kuansan and Wang, Nancy Xin Ru and Wilhelm, Christopher and Xie, Boya and Raymond, Douglas M. and Weld, Daniel S. and Etzioni, Oren and Kohlmeier, Sebastian",
    booktitle = "Proceedings of the 1st Workshop on {NLP} for {COVID-19} at {ACL} 2020",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.nlpcovid19-acl.1"
}



