
SuperDialseg

Source: arXiv. Updated: 2023-10-15. Indexed: 2024-06-21.
Download link:
https://github.com/Coldog2333/SuperDialseg
Overview:
SuperDialseg is a large-scale supervised dialogue segmentation dataset created jointly by The University of Tokyo and Kyoto University. It contains 9,478 dialogues built from two popular document-grounded dialogue corpora and inherits the rich annotations attached to those dialogues. The dataset targets a core difficulty of dialogue segmentation in dialogue systems: the lack of explicit supervision signals. Beyond well-defined segmentation points, it includes further dialogue-related annotations, such as role information and dialogue acts, which support a broader range of applications. It also provides a benchmark covering 18 models and multiple evaluation metrics to support further research and development in dialogue segmentation.
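
To make "segmentation points" and "evaluation metrics" concrete, here is a minimal sketch of how per-utterance boundary labels might be scored. It assumes each utterance carries a 0/1 flag marking the end of a segment, and it uses Pk, a widely used text segmentation metric. The flag encoding, the toy labels, and the choice of Pk are illustrative assumptions, not the dataset's confirmed schema or the benchmark's exact metric set.

```python
from typing import List, Optional


def boundaries_to_segment_ids(boundaries: List[int]) -> List[int]:
    """Map 0/1 boundary flags (1 = last utterance of a segment) to segment ids."""
    segment_ids, current = [], 0
    for flag in boundaries:
        segment_ids.append(current)
        if flag == 1:
            current += 1
    return segment_ids


def pk(reference: List[int], hypothesis: List[int], k: Optional[int] = None) -> float:
    """Pk: share of probe pairs (i, i + k) on which reference and hypothesis
    disagree about whether the two utterances lie in the same segment.
    Lower is better; 0.0 means no disagreement on any probe."""
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    ref_ids = boundaries_to_segment_ids(reference)
    hyp_ids = boundaries_to_segment_ids(hypothesis)
    n = len(ref_ids)
    if k is None:
        # Conventional choice: half the average reference segment length.
        num_segments = ref_ids[-1] + 1
        k = max(1, round(n / (2 * num_segments)))
    probes = n - k
    if probes <= 0:
        raise ValueError("dialogue is too short for window size k")
    errors = 0
    for i in range(probes):
        same_in_ref = ref_ids[i] == ref_ids[i + k]
        same_in_hyp = hyp_ids[i] == hyp_ids[i + k]
        if same_in_ref != same_in_hyp:
            errors += 1
    return errors / probes


# Hypothetical labels for an 8-utterance dialogue (not taken from the dataset).
reference = [0, 0, 1, 0, 0, 1, 0, 1]   # three gold segments
hypothesis = [0, 0, 1, 0, 0, 0, 0, 1]  # model misses the second boundary

score = pk(reference, hypothesis)
print(f"Pk = {score:.3f}")
```

A hypothesis identical to the reference scores 0.0, and each missed or spurious boundary raises the score for the probes that straddle it.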
Provider:
The University of Tokyo
Created:
2023-05-15