MiMarco
Source: arXiv | Updated: 2023-11-10 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2311.06119v1
Resource description:
MiMarco is a dataset created by a research team at Sorbonne University on top of the MsMarco dataset. It augments a traditional information retrieval dataset with automatically generated query clarification interactions. The dataset contains over 500,000 queries, each paired with system-generated clarification questions and simulated user answers that emulate a dialogue between user and system. It is built in two main steps: query clarification interactions are generated first, and they are then combined with the MsMarco dataset. MiMarco is intended mainly for training and evaluating conversational information retrieval models, in particular for complex or multi-faceted information needs, where the simulated interactions help a model better understand and satisfy what the user is looking for.
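To make the structure described above more concrete, the following is a minimal Python sketch of how a query with its clarification turns might be represented and attached to a base query set. The class names, field names, the `attach_interactions` helper, the `[SEP]` separator, and the sample query are all hypothetical and chosen for illustration; they are not the dataset's published schema or the authors' pipeline.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClarificationTurn:
    question: str  # system-generated clarification question
    answer: str    # simulated user answer


@dataclass
class MixedInitiativeExample:
    query: str
    turns: List[ClarificationTurn] = field(default_factory=list)

    def as_retrieval_input(self, sep: str = " [SEP] ") -> str:
        # Assumption: flatten the dialogue into one string for a retriever.
        parts = [self.query]
        for turn in self.turns:
            parts.extend([turn.question, turn.answer])
        return sep.join(parts)


def attach_interactions(
    queries: Dict[str, str],
    interactions: Dict[str, List[ClarificationTurn]],
) -> Dict[str, MixedInitiativeExample]:
    # Second step of the described process: combine generated interactions
    # with the base queries, keyed by a (hypothetical) query id.
    return {
        qid: MixedInitiativeExample(query=text, turns=interactions.get(qid, []))
        for qid, text in queries.items()
    }


# Made-up sample data, not taken from the dataset.
queries = {"q1": "jaguar top speed"}
interactions = {
    "q1": [ClarificationTurn("Do you mean the animal or the car?", "The animal.")]
}
examples = attach_interactions(queries, interactions)
print(examples["q1"].as_retrieval_input())
```

Queries without any generated interaction simply end up with an empty list of turns, so the same record type covers both the original and the augmented data.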
Provided by:
Sorbonne University
Created:
2023-11-10



