doc2dial
Source: arXiv. Updated: 2020-11-19. Indexed: 2024-06-21.
Download link:
http://doc2dial.github.io/
Overview:
Doc2Dial is a goal-oriented, document-grounded dialogue dataset created by IBM Research. It contains about 4,800 annotated dialogues, averaging 14 turns each, grounded in more than 480 documents from four domains. To build it, dialogue flows were first constructed from document content, and the utterances were then written by crowdsourced workers. The dataset is intended for developing and evaluating document-grounded dialogue models, particularly for information-seeking tasks, where the central problem is understanding and exploiting the interaction between a dialogue and the document it is grounded in.
Provider:
IBM Research
Created:
2020-11-13



