E2E Dataset
arXiv | Updated 2017-07-06 | Indexed 2024-06-21
Download link:
http://www.macs.hw.ac.uk/InteractionLab/E2E/
Overview:
The E2E dataset is a large dataset created by the School of Mathematical and Computer Sciences at Heriot-Watt University for training end-to-end, data-driven natural language generation (NLG) systems in the restaurant domain. It contains over 50,000 instances, each pairing a dialogue-act-based meaning representation (MR) with a human-written reference text; each MR has 8.1 reference texts on average. The data was collected through the CrowdFlower crowdsourcing platform and passed through quality control. Pictures were used as stimuli during collection to elicit more natural, informative, and better-phrased reference texts. The dataset is intended to improve the naturalness and diversity of NLG output, addressing the limitations of traditional template-based generation.
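To make the relationship between instances, meaning representations, and reference texts concrete, here is a minimal Python sketch. It assumes each MR is a comma-separated list of `slot[value]` pairs and that the data can be read as (MR, reference) pairs; the function names and the sample rows are hypothetical illustrations, not taken from the dataset or its official tooling.

```python
import re
from collections import defaultdict

# Assumed MR syntax: comma-separated slot[value] pairs,
# e.g. "name[The Eagle], eatType[coffee shop]".
SLOT_PATTERN = re.compile(r"\s*([^\[\],]+?)\[([^\]]*)\]")


def parse_mr(mr):
    """Parse an MR string into a {slot: value} dictionary."""
    return {slot.strip(): value.strip() for slot, value in SLOT_PATTERN.findall(mr)}


def group_references(instances):
    """Group (mr, reference) pairs so each MR maps to all of its references."""
    grouped = defaultdict(list)
    for mr, reference in instances:
        grouped[mr].append(reference)
    return dict(grouped)


def mean_references_per_mr(instances):
    """Average number of reference texts per distinct MR."""
    grouped = group_references(instances)
    return sum(len(refs) for refs in grouped.values()) / len(grouped)


# Made-up rows for illustration only; they are not copied from the dataset.
SAMPLE = [
    ("name[The Eagle], eatType[coffee shop], area[riverside]",
     "The Eagle is a coffee shop by the riverside."),
    ("name[The Eagle], eatType[coffee shop], area[riverside]",
     "There is a riverside coffee shop called The Eagle."),
    ("name[Blue Spice], food[Italian]",
     "Blue Spice serves Italian food."),
]

if __name__ == "__main__":
    print(parse_mr(SAMPLE[0][0]))
    print(mean_references_per_mr(SAMPLE))
```

Applied to the full dataset, the same grouping step is how a figure such as the reported average of 8.1 references per MR would be computed, provided the files follow the assumed layout.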
Provided by:
School of Mathematical and Computer Sciences, Heriot-Watt University
Created:
2017-06-28



