irds/codec
Dataset overview
Dataset name
codec
Data provider
Data content
docs (documents, i.e., the corpus); count=729,824
queries (i.e., topics); count=42
qrels (relevance assessments); count=6,186
Data usage
This dataset is used by:
Loading example
    from datasets import load_dataset

    # Documents (the corpus)
    docs = load_dataset('irds/codec', 'docs')
    for record in docs:
        record  # {'doc_id': ..., 'title': ..., 'text': ..., 'url': ...}

    # Queries (topics)
    queries = load_dataset('irds/codec', 'queries')
    for record in queries:
        record  # {'query_id': ..., 'query': ..., 'domain': ..., 'guidelines': ...}

    # Relevance assessments
    qrels = load_dataset('irds/codec', 'qrels')
    for record in qrels:
        record  # {'query_id': ..., 'doc_id': ..., 'relevance': ..., 'iteration': ...}
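Once the qrels records are loaded, a common next step is to group them by query so that judgments can be looked up quickly. The sketch below is one possible way to do that, not part of the dataset's own tooling. It relies only on the `query_id`, `doc_id`, and `relevance` fields shown above. The helper names `build_qrels` and `relevant_docs`, the `min_relevance=1` threshold, and the sample records are all hypothetical and chosen for illustration; the sample values are not real CODEC judgments.

```python
from collections import defaultdict


def build_qrels(records):
    """Group qrels records into {query_id: {doc_id: relevance}}."""
    qrels = defaultdict(dict)
    for record in records:
        qrels[record["query_id"]][record["doc_id"]] = int(record["relevance"])
    return dict(qrels)


def relevant_docs(qrels, query_id, min_relevance=1):
    """Return sorted doc_ids judged at or above min_relevance for one query."""
    judged = qrels.get(query_id, {})
    return sorted(d for d, rel in judged.items() if rel >= min_relevance)


# Hypothetical records that mimic the qrels schema; not real CODEC judgments.
sample_records = [
    {"query_id": "q1", "doc_id": "d1", "relevance": 2, "iteration": "0"},
    {"query_id": "q1", "doc_id": "d2", "relevance": 0, "iteration": "0"},
    {"query_id": "q2", "doc_id": "d1", "relevance": 1, "iteration": "0"},
]

sample_qrels = build_qrels(sample_records)
print(relevant_docs(sample_qrels, "q1"))  # prints ['d1']
```

The same `build_qrels` function should accept any iterable of records with these three fields, so it can be pointed at the loaded qrels records in place of `sample_records`. Whether a relevance of 1 counts as "relevant" depends on the collection's grading scale, so the threshold is left as a parameter.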
Citation
@inproceedings{mackie2022codec,
  title={CODEC: Complex Document and Entity Collection},
  author={Mackie, Iain and Owoicho, Paul and Gemmell, Carlos and Fischer, Sophie and MacAvaney, Sean and Dalton, Jeffrey},
  booktitle={Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  year={2022}
}




