INDIRECTREQUESTS
Source: arXiv. Updated: 2024-06-12. Indexed: 2024-06-14.
Download link:
https://huggingface.co/datasets/msamogh/indirectrequests/
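As a minimal sketch of how the link above might be used, the snippet below derives the Hugging Face repository id from the URL and passes it to `load_dataset` from the `datasets` library. The helper names `repo_id_from_url` and `load_indirect_requests` are hypothetical, and the sketch assumes the dataset loads without an explicit configuration name. The split names and column layout are not stated on this page, so inspect the returned object rather than relying on any particular field.

```python
from urllib.parse import urlparse

# Download link as given on this page.
DATASET_URL = "https://huggingface.co/datasets/msamogh/indirectrequests/"


def repo_id_from_url(url: str) -> str:
    """Extract the '<owner>/<name>' repo id from a Hugging Face dataset URL.

    Hypothetical helper, written for illustration only.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return "/".join(parts[1:3])


def load_indirect_requests():
    """Download the dataset (requires network access and `pip install datasets`)."""
    # Imported lazily so repo_id_from_url works without the library installed.
    from datasets import load_dataset

    # Assumption: no configuration name is needed. If loading fails,
    # check the dataset page for the available configurations.
    return load_dataset(repo_id_from_url(DATASET_URL))


if __name__ == "__main__":
    dataset = load_indirect_requests()
    print(dataset)  # shows the available splits and columns
```

Printing the returned object is the quickest way to confirm which splits and fields exist before writing any downstream NLU or DST code against them.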
Overview:
The INDIRECTREQUESTS dataset was created by a research team from the University of Florida and Amazon to make task-oriented dialogue datasets more natural by synthetically generating indirect user requests (IURs). It contains 453 IURs. Large language models (LLMs) such as GPT-3.5 and GPT-4 first generate a noisy seed dataset, which is then filtered and corrected through crowdsourcing to improve its quality. The dataset is intended mainly for research in natural language understanding (NLU) and dialogue state tracking (DST), where it addresses the gap between the direct requests found in existing datasets and the indirect requests common in natural conversation.
Provider:
University of Florida
Created:
2024-06-12
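Collected Summary
Background and Challenges
Background Overview
The INDIRECTREQUESTS dataset was created by teams at the University of Florida and Amazon. It contains 453 indirect user requests generated with GPT-3.5 and GPT-4 and then filtered and corrected through crowdsourcing, with the aim of making task-oriented dialogue more natural. It is used mainly in natural language understanding and dialogue state tracking research to address the gap between the direct requests in existing datasets and the indirect requests of natural conversation.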
The content above was collected and summarized by 遇见数据集.



