Dialogizer

Source: arXiv. Updated: 2023-11-09. Indexed: 2024-08-06.
Download link:
http://arxiv.org/abs/2311.07589v1
Description:
Dialogizer is a framework developed by the Artificial Intelligence Research Institute of Seoul National University for automatically generating high-quality conversational QA datasets from textual sources. It combines two training tasks, question-answer matching and topic-aware dialogue generation, which enables it to produce conversational datasets that are highly relevant to their source context. Using Dialogizer, the research team generated four conversational QA datasets from documents across multiple domains, and both automatic and human evaluations showed that these datasets are of higher quality than those produced by baseline models. The datasets are intended to address data scarcity in the information-seeking conversational question answering (ConvQA) task and to serve as a resource for research and applications in this field.
Provider:
Artificial Intelligence Research Institute, Seoul National University
Created:
2023-11-09