five

Large-Scale Open-Domain Role-Play Dialogue Corpus

Source: arXiv. Updated: 2023-04-02. Indexed: 2024-08-06.
Download link:
http://arxiv.org/abs/2304.00350v1
Description:
This study builds a large-scale open-domain role-play dialogue corpus in which actors holding fixed character settings converse freely with unspecified crowd workers. The corpus consists of dialogues between actors assigned specific roles and crowd workers acting as users. The dialogues cover a wide range of everyday topics and are intended to strengthen social connection and information exchange. During construction, actors were hired after passing an interview and talked with users through a web application, while platform administrators monitored the conversations to ensure dialogue quality. Applications of the dataset include improving the natural language processing capabilities of dialogue agents, especially in maintaining dialogue coherence and role consistency.
Provided by:
Department of Electrical and Computer Engineering, Seoul National University
Created:
2023-04-02