cartesinus/leyzer-fedcsis
Source: Hugging Face. Updated 2023-10-20; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/cartesinus/leyzer-fedcsis
Resource description:
---
license: cc-by-4.0
task_categories:
- text-classification
language:
- en
- pl
- es
tags:
- natural-language-understanding
size_categories:
- 10K<n<100K
---
# Leyzer: A Dataset for Multilingual Virtual Assistants
Leyzer is a multilingual text corpus designed for studying multilingual and cross-lingual natural language understanding (NLU) models and strategies for localizing virtual assistants. It covers 20 domains in three languages (English, Spanish, and Polish) and 186 intents, with the number of sentences per intent ranging from 1 to 672. For more statistics, please refer to the wiki.
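
The per-intent sample range quoted above is the kind of statistic that can be recomputed from the records themselves. The following is a minimal sketch, assuming each record is a dictionary with `domain`, `intent`, and `utterance` fields; these field names, the helper functions, and the toy records are hypothetical and are not taken from Leyzer.

```python
from collections import Counter

# Toy records invented for illustration; they are NOT taken from Leyzer.
# The field names "domain", "intent", and "utterance" are assumptions.
SAMPLE = [
    {"domain": "Email", "intent": "SendEmail", "utterance": "send an email to anna"},
    {"domain": "Email", "intent": "SendEmail", "utterance": "write a mail to my boss"},
    {"domain": "Calendar", "intent": "AddEvent", "utterance": "add a meeting on friday"},
    {"domain": "Weather", "intent": "GetForecast", "utterance": "will it rain tomorrow"},
]


def intent_counts(records):
    """Count how many utterances each intent has."""
    return Counter(record["intent"] for record in records)


def count_range(counts):
    """Return the (smallest, largest) number of utterances per intent."""
    values = list(counts.values())
    return min(values), max(values)


counts = intent_counts(SAMPLE)
low, high = count_range(counts)
print(f"{len(counts)} intents, {low} to {high} utterances per intent")
```

Run on the full corpus with the correct field names, the same two helpers would be expected to report the intent count and the per-intent range described in this card.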
## Citation
If you use this dataset, please cite the following:
```
@inproceedings{kubis2023caiccaic,
author={Marek Kubis and Paweł Skórzewski and Marcin Sowański and Tomasz Ziętkiewicz},
pages={1319--1324},
title={Center for Artificial Intelligence Challenge on Conversational AI Correctness},
booktitle={Proceedings of the 18th Conference on Computer Science and Intelligence Systems},
year={2023},
doi={10.15439/2023B6058},
url={http://dx.doi.org/10.15439/2023B6058},
volume={35},
series={Annals of Computer Science and Information Systems}
}
```
Provided by:
cartesinus
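Summary of original information
Leyzer: A Dataset for Multilingual Virtual Assistants
Basic information
- License: cc-by-4.0
- Task category: text-classification
- Supported languages:
  - en
  - pl
  - es
- Tags: natural-language-understanding
- Dataset size: 10K<n<100K
Dataset description
Leyzer is a multilingual text corpus designed for studying multilingual and cross-lingual natural language understanding (NLU) models and strategies for localizing virtual assistants. The dataset covers 20 domains in three languages (English, Spanish, and Polish) and 186 intents, with the number of sentences per intent ranging from 1 to 672.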



