
COVID-19 Open Research Dataset (CORD-19)

Indexed by Payititi (帕依提提) on 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-26578.html
Resource description:
In response to the COVID-19 pandemic, the White House and a coalition of leading research groups have prepared the COVID-19 Open Research Dataset (CORD-19). CORD-19 is a resource of over 1,000,000 scholarly articles, including over 400,000 with full text, about COVID-19, SARS-CoV-2, and related coronaviruses. This freely available dataset is provided to the global research community to apply recent advances in natural language processing and other AI techniques to generate new insights in support of the ongoing fight against this infectious disease.
There is a growing urgency for these approaches because of the rapid acceleration in new coronavirus literature, which makes it difficult for the medical research community to keep up. We are issuing a call to action to the world's artificial intelligence experts to develop text and data mining tools that can help the medical community develop answers to high-priority scientific questions.
The CORD-19 dataset represents the most extensive machine-readable coronavirus literature collection available for data mining to date. This gives the worldwide AI research community the opportunity to apply text and data mining approaches to find answers to questions within, and connect insights across, this content in support of the ongoing COVID-19 response efforts worldwide.
A list of our initial key questions can be found under the Tasks section of this dataset. These key scientific questions are drawn from the NASEM's SCIED (National Academies of Sciences, Engineering, and Medicine's Standing Committee on Emerging Infectious Diseases and 21st Century Health Threats) research topics and the World Health Organization's R&D Blueprint for COVID-19.
Many of these questions are suitable for text mining, and we encourage researchers to develop text mining tools to provide insights on these questions. We are maintaining a summary of the community's contributions. For guidance on how to make your contributions useful, we're maintaining a forum thread with the feedback we're getting from the medical and health policy communities.
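As a rough illustration of the kind of text mining tool the description calls for, the sketch below ranks a handful of article texts against a question using a simple TF-IDF overlap score. It is not part of the dataset or of any official tooling. The function names (`tokenize`, `rank_by_tfidf`), the document IDs, and the three sample texts are hypothetical placeholders made up for the example, not CORD-19 records.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens.

    Hyphenated terms such as 'sars-cov-2' are kept as a single token.
    """
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def rank_by_tfidf(question, documents):
    """Return (doc_id, score) pairs, best match first.

    `documents` maps a document ID to its text. The score is the sum of
    TF-IDF weights of the question terms that occur in the document.
    """
    tokenized = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    n_docs = len(tokenized)

    # Document frequency: the number of documents containing each term.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))

    query_terms = set(tokenize(question))
    scores = []
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = sum(
            # Relative term frequency times a smoothed inverse document frequency.
            (tf[t] / len(tokens)) * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
            for t in query_terms
            if t in tf
        )
        scores.append((doc_id, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical placeholder texts, NOT real CORD-19 records.
sample_docs = {
    "doc-a": "Incubation period estimates for coronavirus infection",
    "doc-b": "Protein structure of the spike glycoprotein",
    "doc-c": "Hospital staffing models during outbreaks",
}

ranked = rank_by_tfidf("What is the incubation period of the coronavirus?", sample_docs)
for doc_id, score in ranked:
    print(doc_id, round(score, 3))
```

A real tool would load the actual article texts instead of the placeholder dictionary, and would likely add stop-word removal or a stronger retrieval model. The scoring above is only meant to show the basic idea of matching a question against a document collection.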

Provider:
Payititi (帕依提提)
Collected summary
Dataset introduction
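Background and challenges
Background overview
The COVID-19 Open Research Dataset (CORD-19) is a large medical literature dataset of over 1,000,000 scholarly articles, more than 400,000 of them with full text, focused on COVID-19 and related coronaviruses. It is intended to help the global AI research community use text mining to extract scientific insights about the pandemic.
The content above was collected and summarized by 遇见数据集.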