
Dialogue State Tracking Challenge

Source: OpenDataLab. Updated: 2026-05-24. Indexed: 2024-05-09.
Download link:
https://opendatalab.org.cn/OpenDataLab/Dialogue_State_Tracking_etc
Resource overview:

Dialogue State Tracking Challenges 2 and 3 (DSTC2 and DSTC3) are research challenges focused on advancing the state of the art in tracking the state of spoken dialogue systems. State tracking, sometimes called belief tracking, means accurately estimating the user's goal as a dialogue progresses. Accurate state tracking is desirable because it provides robustness to speech recognition errors and helps reduce the ambiguity inherent in language within a temporal process such as dialogue.

In these challenges, participants were given labelled corpora of dialogues to develop state tracking algorithms. The trackers were then evaluated on a common set of held-out dialogues, which were released unlabelled during a one-week period. The corpus was collected using Amazon Mechanical Turk and consists of dialogues in two domains: restaurant information and tourist information. Tourist information subsumes restaurant information and adds bars, cafes, and similar venues, as well as multiple new slots.

Two rounds of evaluation used this data. DSTC2 released a large number of training dialogues related to restaurant search. Compared with the first DSTC (set in the bus timetable domain), DSTC2 introduced changing user goals, tracking of "requested slots", and the new restaurant domain. Results from DSTC2 were presented at SIGDIAL 2014. DSTC3 addressed adaptation to a new domain, tourist information: it released a small amount of labelled data in that domain, which participants used together with the DSTC2 restaurant data for training.

The training dialogues are fully labelled: user transcriptions, user dialogue-act semantics, and dialogue state are all annotated. The corpus is therefore also suitable for research on spoken language understanding.
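To make the idea of state tracking concrete, the following is a minimal sketch of a rule-based tracker that keeps, for each slot, the value heard with the highest confidence so far. It is only an illustration under stated assumptions: the `SlotHypothesis` and `MaxConfidenceTracker` names, the slot names, and the scores are hypothetical, and the input structure is a simplification rather than the corpus's actual file format or any official challenge baseline.

```python
from dataclasses import dataclass, field


@dataclass
class SlotHypothesis:
    """One understanding guess for a turn (hypothetical structure, not the corpus schema)."""
    slot: str
    value: str
    score: float  # confidence in [0, 1]


@dataclass
class MaxConfidenceTracker:
    """Keeps, per slot, the single value seen with the highest confidence."""
    belief: dict = field(default_factory=dict)  # slot -> (value, score)

    def update(self, hypotheses):
        for hyp in hypotheses:
            best = self.belief.get(hyp.slot)
            # Replace the stored value only when the new guess is more confident.
            if best is None or hyp.score > best[1]:
                self.belief[hyp.slot] = (hyp.value, hyp.score)
        # Return a snapshot so callers can inspect the state after each turn.
        return dict(self.belief)


# Toy dialogue: the first turn is noisy, the third turn is a clearer restatement.
turns = [
    [SlotHypothesis("food", "thai", 0.4), SlotHypothesis("food", "tapas", 0.3)],
    [SlotHypothesis("area", "north", 0.8)],
    [SlotHypothesis("food", "tapas", 0.9)],
]

tracker = MaxConfidenceTracker()
states = [tracker.update(turn) for turn in turns]
final_state = {slot: value for slot, (value, _) in tracker.belief.items()}
print(final_state)
```

A rule this simple shows why the task is hard: it recovers from the noisy first turn only because the later restatement scores higher, and it would ignore a genuine change of mind that happened to be recognised with lower confidence.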
Provider:
OpenDataLab
Created:
2022-08-16
Collected summary
Dataset introduction
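Background and challenges
Background overview
Dialogue State Tracking Challenge is a research dataset focused on state tracking for spoken dialogue systems. It contains annotated dialogue corpora in two domains, restaurant information and tourist information, and is intended for developing and evaluating state tracking algorithms. The dataset was collected via Amazon Mechanical Turk and is suitable for research on spoken language understanding and dialogue state tracking.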
The content above was collected and summarized by 遇见数据集.