
cartesinus/leyzer-fedcsis

Source: Hugging Face. Updated: 2023-10-20. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/cartesinus/leyzer-fedcsis
Resource description:
---
license: cc-by-4.0
task_categories:
- text-classification
language:
- en
- pl
- es
tags:
- natural-language-understanding
size_categories:
- 10K<n<100K
---

# Leyzer: A Dataset for Multilingual Virtual Assistants

Leyzer is a multilingual text corpus designed to study multilingual and cross-lingual natural language understanding (NLU) models and localization strategies for virtual assistants. It consists of 20 domains across three languages (English, Spanish and Polish), with 186 intents. The number of samples varies widely, from 1 to 672 sentences per intent. For more statistics, please refer to the wiki.

## Citation

If you use this dataset, please cite the following:

```
@inproceedings{kubis2023caiccaic,
  author={Marek Kubis and Paweł Skórzewski and Marcin Sowański and Tomasz Ziętkiewicz},
  pages={1319–1324},
  title={Center for Artificial Intelligence Challenge on Conversational AI Correctness},
  booktitle={Proceedings of the 18th Conference on Computer Science and Intelligence Systems},
  year={2023},
  doi={10.15439/2023B6058},
  url={http://dx.doi.org/10.15439/2023B6058},
  volume={35},
  series={Annals of Computer Science and Information Systems}
}
```
Provider:
cartesinus
Summary of original information

Leyzer: A Dataset for Multilingual Virtual Assistants

Basic information

  • License: cc-by-4.0
  • Task category: text-classification
  • Supported languages:
    • en
    • pl
    • es
  • Tags: natural-language-understanding
  • Dataset size: 10K<n<100K

Dataset description

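Leyzer is a multilingual text corpus designed to study multilingual and cross-lingual natural language understanding (NLU) models and localization strategies for virtual assistants. The dataset covers 20 domains in three languages (English, Spanish and Polish), with 186 intents in total. The number of samples per intent ranges from 1 to 672 sentences.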
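
Because the number of sentences per intent spans such a wide range, it can help to measure the imbalance before training a classifier. The following is a minimal sketch of that check, not part of the dataset's own tooling. The helper `intent_size_stats`, the field names `utterance` and `intent`, and the toy records are all hypothetical; the actual column names and configurations of the dataset are not stated here and should be checked against the dataset itself.

```python
from collections import Counter


def intent_size_stats(records, intent_key="intent"):
    """Count sentences per intent and report the smallest and largest class.

    `records` is any iterable of dicts; `intent_key` is an assumed field
    name, not one confirmed by the dataset card.
    """
    counts = Counter(record[intent_key] for record in records)
    if not counts:
        raise ValueError("no records given")
    sizes = list(counts.values())
    return {
        "num_intents": len(counts),
        "min_per_intent": min(sizes),
        "max_per_intent": max(sizes),
        "counts": dict(counts),
    }


# Toy records for illustration only; these are NOT taken from Leyzer.
toy_records = [
    {"utterance": "play some jazz", "intent": "PlayMusic"},
    {"utterance": "play the next song", "intent": "PlayMusic"},
    {"utterance": "play my workout playlist", "intent": "PlayMusic"},
    {"utterance": "what is the weather in Warsaw", "intent": "GetWeather"},
]

stats = intent_size_stats(toy_records)
print(stats["num_intents"], stats["min_per_intent"], stats["max_per_intent"])
```

Run on the real data, the same function would be expected to report the extremes quoted above (1 and 672), provided the intent labels are stored in a single column whose name is passed as `intent_key`.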
