facebook/content_rephrasing
Source: Hugging Face. Updated 2022-10-14; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/facebook/content_rephrasing
Resource description:
---
license: cc-by-sa-4.0
---
## Message Content Rephrasing Dataset
Introduced by Einolghozati et al. in "Sound Natural: Content Rephrasing in Dialog Systems" (https://aclanthology.org/2020.emnlp-main.414/).
We introduce a new task of rephrasing for a more natural virtual assistant. Currently, virtual assistants work in the paradigm of intent-slot tagging and the slot values are directly passed as-is to the execution engine. However, this setup fails in some scenarios such as messaging when the query given by the user needs to be changed before repeating it or sending it to another user. For example, for queries like ‘ask my wife if she can pick up the kids’ or ‘remind me to take my pills’, we need to rephrase the content to ‘can you pick up the kids’ and ‘take your pills’. In this paper, we study the problem of rephrasing with messaging as a use case and release a dataset of 3000 pairs of original query and rephrased query. We show that BART, a pre-trained transformers-based masked language model with auto-regressive decoding, is a strong baseline for the task, and show improvements by adding a copy-pointer and copy loss to it. We analyze different trade-offs of BART-based and LSTM-based seq2seq models, and propose a distilled LSTM-based seq2seq as the best practical model.
Provider:
facebook
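Summary of original information
Message Content Rephrasing Dataset
Dataset introduction
Introduced by Einolghozati et al. to study the task of content rephrasing in dialog systems. The dataset aims to make virtual assistants sound more natural, particularly where a user's query must be rephrased before it is repeated or sent to another user, as in messaging.
Dataset contents
The dataset contains 3000 pairs of original and rephrased queries. For example, the original query "ask my wife if she can pick up the kids" is rephrased as "can you pick up the kids".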
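To make the pair structure concrete, here is a minimal sketch that holds the two example pairs quoted in the abstract and measures how many rephrased tokens also occur in the original query. The `RephrasePair` class, its field names, and the `copied_fraction` helper are hypothetical illustrations; they are not the dataset's actual schema, which this card does not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RephrasePair:
    """One (original query, rephrased content) pair. Field names are hypothetical."""
    original: str
    rephrased: str


def copied_fraction(pair: RephrasePair) -> float:
    """Fraction of rephrased tokens that also occur in the original query."""
    source_tokens = set(pair.original.lower().split())
    target_tokens = pair.rephrased.lower().split()
    if not target_tokens:
        return 0.0
    copied = sum(1 for tok in target_tokens if tok in source_tokens)
    return copied / len(target_tokens)


# The two example pairs quoted in the paper abstract.
PAIRS = [
    RephrasePair("ask my wife if she can pick up the kids", "can you pick up the kids"),
    RephrasePair("remind me to take my pills", "take your pills"),
]

if __name__ == "__main__":
    for pair in PAIRS:
        print(f"{pair.rephrased!r}: {copied_fraction(pair):.2f} of tokens copied")
```

In both examples most rephrased tokens come straight from the query and only the pronoun changes, which suggests why a copy mechanism is a natural fit for this task.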
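Model research
The study shows that BART, a pre-trained transformer-based model, is a strong baseline for the rephrasing task, and that adding a copy-pointer and a copy loss improves it further. It also analyzes the trade-offs between BART-based and LSTM-based seq2seq models and proposes a distilled LSTM-based seq2seq model as the best practical model.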
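This card does not spell out how the copy-pointer is formulated. As a rough illustration only, the sketch below shows a generic pointer-generator style mix, in which the decoder's vocabulary distribution is blended with its attention over the source tokens. This is an assumption about the general technique, not the paper's exact method, and the function names, `p_gen` value, and scores are all made up for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_copy_distribution(vocab_probs, attention, source_tokens, p_gen):
    """Blend generation and copy distributions, pointer-generator style.

    vocab_probs:   dict token -> probability from the vocabulary softmax
    attention:     attention weights, one per source token (sums to 1)
    source_tokens: source tokens aligned with `attention`
    p_gen:         probability of generating rather than copying, in [0, 1]
    """
    final = {tok: p_gen * p for tok, p in vocab_probs.items()}
    for tok, weight in zip(source_tokens, attention):
        # Source tokens receive copy mass even if absent from the vocabulary.
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * weight
    return final


if __name__ == "__main__":
    source = "ask my wife if she can pick up the kids".split()
    # Made-up attention scores that favor the source token "can".
    attention = softmax([0.1, 0.1, 0.1, 0.1, 0.1, 2.0, 0.5, 0.5, 0.5, 0.5])
    vocab_probs = {"can": 0.3, "you": 0.6, "pick": 0.1}  # made-up values
    final = mix_copy_distribution(vocab_probs, attention, source, p_gen=0.5)
    print(max(final, key=final.get))
```

The blended distribution still sums to one, and tokens that appear only in the source can be produced through the copy term.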



