
MULTI3NLU++

Source: arXiv | Updated: 2023-06-19 | Indexed: 2024-06-21
Download link:
https://huggingface.co/datasets/uoe-nlp/multi3-nlu
Overview:
MULTI3NLU++ is a multilingual, multi-intent, multi-domain natural language understanding (NLU) dataset for task-oriented dialogue systems. Created by researchers from the School of Informatics at the University of Edinburgh and the Language Technology Lab at the University of Cambridge, it extends the English-only NLU++ dataset with manual translations into Spanish, Marathi, Turkish, and Amharic across two domains: banking and hotels. The dataset contains 3,080 utterances, each translated by professional translators to keep the language natural and preserve its complexity. It is intended to support cross-lingual and cross-domain NLU research, particularly for low-resource languages, and to serve as a resource for evaluating and improving how well dialogue systems generalize and perform in multilingual settings. The tasks it covers are intent detection and slot filling.
Provided by:
School of Informatics, University of Edinburgh
Created:
2022-12-21