Biomedical Alert News Dataset (BAND)
Source: arXiv | Updated: 2023-10-15 | Indexed: 2024-06-21
Download link:
https://github.com/fuzihaofzh/BAND
Overview:
The BAND dataset was created jointly by the Language Technology Lab at the University of Cambridge and the School of Global Health at McGill University. Its goal is to improve disease surveillance and the understanding of disease spread through the analysis of news and social media data. The dataset contains 1,508 samples drawn from existing news articles, public emails, and alerts, together with 30 epidemiology-related questions. Answering these questions requires expert-level reasoning from a model, which in turn yields valuable insight into disease outbreaks. BAND poses new challenges for NLP, calling for better handling of disguised content and stronger inference of important information. It is intended for epidemiologists and NLP researchers, and aims to address the shortcomings of existing surveillance systems in automated, comprehensive epidemiological analysis.
Provided by:
Language Technology Lab, University of Cambridge
Created:
2023-05-24



