irds/cord19_trec-covid_round2
Dataset Overview
Dataset Name
cord19/trec-covid/round2
Dataset Source
Provided by ir-datasets.
Dataset Contents
Data Types
- docs (documents, i.e. the corpus); count=59,887
- queries (topics); count=35
- qrels (relevance assessments); count=12,037
Data Structure
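- docs: each record contains the fields doc_id, title, doi, date, abstract, among others.
- queries: each record contains the fields query_id, title, description, narrative, among others.
- qrels: each record contains the fields query_id, doc_id, relevance, iteration, among others.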
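The qrels records link queries to documents, so a common first step is to group them by query. The following is a minimal sketch under stated assumptions: the sample records are hypothetical and only mimic the field names listed above, the value types (integer relevance, string IDs) are assumed rather than confirmed by the source, and the helper names `group_qrels` and `relevant_docs` are illustrative and not part of any library.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping

# Hypothetical sample records: field names follow the list above,
# but every value here is made up for illustration.
sample_qrels = [
    {"query_id": "1", "doc_id": "doc-a", "relevance": 2, "iteration": "1"},
    {"query_id": "1", "doc_id": "doc-b", "relevance": 0, "iteration": "1"},
    {"query_id": "2", "doc_id": "doc-a", "relevance": 1, "iteration": "2"},
]


def group_qrels(qrels: Iterable[Mapping]) -> Dict[str, Dict[str, int]]:
    """Build a {query_id: {doc_id: relevance}} lookup from qrels records."""
    grouped: Dict[str, Dict[str, int]] = defaultdict(dict)
    for record in qrels:
        # Assumption: relevance is (or can be converted to) an integer grade.
        grouped[record["query_id"]][record["doc_id"]] = int(record["relevance"])
    return dict(grouped)


def relevant_docs(
    grouped: Mapping[str, Mapping[str, int]],
    query_id: str,
    min_relevance: int = 1,
) -> List[str]:
    """Return the sorted doc_ids judged at or above min_relevance for a query."""
    judged = grouped.get(query_id, {})
    return sorted(doc_id for doc_id, rel in judged.items() if rel >= min_relevance)


grouped = group_qrels(sample_qrels)
print(relevant_docs(grouped, "1"))
```

The same two helpers should work on the real qrels records as long as each record behaves like a mapping with these field names; the threshold `min_relevance=1` is an assumed cut-off for "relevant" and may need adjusting to the grading scheme actually used.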
Usage
Load the dataset with the datasets library, as in the following example:
```python
from datasets import load_dataset

# Each component is loaded as a separate configuration of the same dataset.
docs = load_dataset('irds/cord19_trec-covid_round2', 'docs')
queries = load_dataset('irds/cord19_trec-covid_round2', 'queries')
qrels = load_dataset('irds/cord19_trec-covid_round2', 'qrels')
```
Citation
@article{Voorhees2020TrecCovid,
  title={TREC-COVID: Constructing a Pandemic Information Retrieval Test Collection},
  author={E. Voorhees and Tasmeer Alam and Steven Bedrick and Dina Demner-Fushman and W. Hersh and Kyle Lo and Kirk Roberts and I. Soboroff and Lucy Lu Wang},
  journal={ArXiv},
  year={2020},
  volume={abs/2005.04474}
}

@article{Wang2020Cord19,
  title={CORD-19: The Covid-19 Open Research Dataset},
  author={Lucy Lu Wang and Kyle Lo and Yoganand Chandrasekhar and Russell Reas and Jiangjiang Yang and Darrin Eide and K. Funk and Rodney Michael Kinney and Ziyang Liu and W. Merrill and P. Mooney and D. Murdick and Devvret Rishi and Jerry Sheehan and Zhihong Shen and B. Stilson and A. Wade and K. Wang and Christopher Wilhelm and Boya Xie and D. Raymond and Daniel S. Weld and Oren Etzioni and Sebastian Kohlmeier},
  journal={ArXiv},
  year={2020}
}




