CrossWOZ: A Chinese Cross-Domain Task-Oriented Dialogue Dataset
Source: Qianyan (千言) dataset collection. Indexed 2024-05-15.
Download link:
https://www.luge.ai/#/luge/dataDetail?id=43
Overview:
CrossWOZ is the first large-scale Chinese cross-domain task-oriented dialogue dataset. It contains 6K dialogues and 102K utterances across 5 domains, and was constructed manually using the Wizard-of-Oz paradigm. The dataset provides dialogue state and dialogue act annotations on both the user side and the system side, so it can be used for dialogue act recognition, dialogue state tracking, dialogue policy learning, user simulator construction, and related tasks. Some user goals carry cross-domain constraints, meaning that the query result in one domain determines the query condition in another, for example "find a restaurant near a certain attraction". Such constraints must be turned into concrete constraints dynamically as the dialogue proceeds, which requires models to have stronger contextual understanding.
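The idea of turning a cross-domain constraint into a concrete one can be illustrated with a minimal Python sketch. Everything here is hypothetical and chosen for illustration only: the `Constraint` class, the `resolve_constraints` function, the slot names, and the sample values are assumptions, not the actual CrossWOZ annotation schema or API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Constraint:
    """A single goal constraint (hypothetical structure, not the CrossWOZ schema)."""
    domain: str
    slot: str
    value: Optional[str] = None            # concrete value, if already known
    ref: Optional[Tuple[str, str]] = None  # (domain, slot) whose query result fills `value`


def resolve_constraints(
    constraints: List[Constraint],
    results: Dict[Tuple[str, str], str],
) -> List[Constraint]:
    """Replace cross-domain references with concrete values once they are known.

    `results` maps (domain, slot) to a value obtained earlier in the dialogue.
    Constraints whose reference is not yet available are returned unchanged.
    """
    resolved = []
    for c in constraints:
        if c.ref is not None and c.ref in results:
            # The referenced domain has been queried: make the constraint concrete.
            resolved.append(Constraint(c.domain, c.slot, value=results[c.ref]))
        else:
            resolved.append(c)
    return resolved


# "Find a restaurant near a certain attraction": the restaurant constraint
# depends on which attraction was selected earlier in the dialogue.
goal = [Constraint("restaurant", "nearby", ref=("attraction", "name"))]
earlier_results = {("attraction", "name"): "Forbidden City"}  # illustrative value
print(resolve_constraints(goal, earlier_results))
```

In this sketch the restaurant constraint stays abstract until an attraction has been chosen, which mirrors why a model must track context across domains rather than handle each domain in isolation.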
Provided by:
Tsinghua University



