ProCIS
Source: arXiv. Updated: 2024-05-10. Indexed: 2024-06-21.
Download link:
https://github.com/algoprog/ProCIS
Description:
ProCIS is a large-scale dataset for proactive conversational information retrieval, containing over 2.8 million conversations. High-quality relevance judgments were obtained through crowdsourcing, which also recorded the parts of each conversation that relate to each document, so that proactive retrieval systems can be evaluated. The dataset uses Wikipedia as its knowledge source and draws its multi-user conversations from Reddit threads. ProCIS is intended to support the construction and evaluation of proactive conversational information retrieval systems and to pave the way for more advanced ones.
Provided by:
University of Massachusetts Amherst
Created:
2024-05-10



