CovidQA
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/castorini/pygaggle/
Description:
This dataset evaluates question answering performance on COVID-19-related information. With synthetic data augmentation, performance improves steadily as more examples are added, and the best results come from combining single-pass generation with round-trip filtering. GPT-4 augmentation expands the dataset to up to 10 times its original size. The task is question answering.
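The description does not spell out how round-trip filtering works. A minimal sketch, assuming the common meaning of the term: each synthetic question is passed back to a question answering model, and the example is kept only if the model reproduces the synthetic answer. The names `QAExample`, `normalize`, `round_trip_filter`, and `answer_fn` are hypothetical and are not taken from the dataset or from the pygaggle repository.

```python
from typing import Callable, Iterable, List, NamedTuple


class QAExample(NamedTuple):
    """One synthetic example (hypothetical structure, for illustration only)."""
    context: str
    question: str
    answer: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def round_trip_filter(
    candidates: Iterable[QAExample],
    answer_fn: Callable[[str, str], str],
) -> List[QAExample]:
    """Keep only the candidates whose answer is reproduced by answer_fn.

    answer_fn(context, question) stands in for any QA model; it is an
    assumption of this sketch, not an API of a real library.
    """
    kept = []
    for example in candidates:
        predicted = answer_fn(example.context, example.question)
        if normalize(predicted) == normalize(example.answer):
            kept.append(example)
    return kept
```

Exact match after normalization is the strictest possible criterion. A token-overlap threshold would retain more examples at the cost of admitting noisier ones.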



