E2E NLG Dataset

Name: E2E NLG Dataset
Creator: E2E NLG challenge organizers
License: Not specified

Source: arXiv (indexed 2025-09-30)

Download link:

https://github.com/sebastianGehrmann/diverse_ensembling

Description:

This dataset contains 50,000 examples in the restaurant domain, each pairing a dialogue-act-based meaning representation (MR) with a natural-language reference. It is split into 76% training, 9% validation, and 15% test data. In the validation set, each MR has 8.1 references on average, and a separate test set of 630 unseen MRs is also included. The task is data-to-text generation: producing text from a dialogue-act MR.
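The figures in the description can be checked with a little arithmetic. The sketch below is illustrative only: the function names `split_sizes` and `mean_refs_per_mr` are hypothetical, the toy MR strings and references are made up for the example rather than taken from the dataset, and the split counts are approximations derived from the rounded percentages quoted above, not the official file sizes.

```python
from collections import defaultdict


def split_sizes(total, percents):
    """Approximate example counts per split from whole-number percentages."""
    return {name: total * pct // 100 for name, pct in percents.items()}


def mean_refs_per_mr(pairs):
    """Average number of references per distinct MR in (mr, reference) pairs."""
    refs_by_mr = defaultdict(list)
    for mr, reference in pairs:
        refs_by_mr[mr].append(reference)
    return sum(len(refs) for refs in refs_by_mr.values()) / len(refs_by_mr)


# Split sizes implied by the rounded 76% / 9% / 15% figures (assumption:
# exactly 50,000 examples in total).
sizes = split_sizes(50_000, {"train": 76, "validation": 9, "test": 15})
print(sizes)

# Toy (MR, reference) pairs, invented for illustration: two distinct MRs,
# three references in total.
toy_pairs = [
    ("name[Cafe A], eatType[coffee shop]", "Cafe A is a coffee shop."),
    ("name[Cafe A], eatType[coffee shop]", "There is a coffee shop called Cafe A."),
    ("name[Bistro B], food[French]", "Bistro B serves French food."),
]
print(mean_refs_per_mr(toy_pairs))
```

Applied to the real validation set, the same grouping by MR is what an average such as 8.1 references per MR describes.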

Provided by:

E2E NLG challenge organizers
