cartesinus/leyzer-fedcsis

Name: cartesinus/leyzer-fedcsis
Creator: cartesinus
Published: 2023-10-20 09:28:32
License: cc-by-4.0

Source: Hugging Face. Updated 2023-10-20; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/cartesinus/leyzer-fedcsis


Resource description:

---
license: cc-by-4.0
task_categories:
- text-classification
language:
- en
- pl
- es
tags:
- natural-language-understanding
size_categories:
- 10K<n<100K
---

# Leyzer: A Dataset for Multilingual Virtual Assistants

Leyzer is a multilingual text corpus designed for studying multilingual and cross-lingual natural language understanding (NLU) models and strategies for localizing virtual assistants. It covers 20 domains in three languages (English, Spanish, and Polish) with 186 intents, and the number of samples per intent ranges from 1 to 672 sentences. For more statistics, please refer to the wiki.

## Citation

If you use this dataset, please cite the following:

```
@inproceedings{kubis2023caiccaic,
  author={Marek Kubis and Paweł Skórzewski and Marcin Sowański and Tomasz Ziętkiewicz},
  pages={1319–1324},
  title={Center for Artificial Intelligence Challenge on Conversational AI Correctness},
  booktitle={Proceedings of the 18th Conference on Computer Science and Intelligence Systems},
  year={2023},
  doi={10.15439/2023B6058},
  url={http://dx.doi.org/10.15439/2023B6058},
  volume={35},
  series={Annals of Computer Science and Information Systems}
}
```

Provider:

cartesinus

Summary of original information

Leyzer: A Dataset for Multilingual Virtual Assistants

Basic information

License: cc-by-4.0
Task category: text-classification
Supported languages:
- en
- pl
- es
Tags: natural-language-understanding
Dataset size: 10K<n<100K

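Dataset description

Leyzer is a multilingual text corpus for studying multilingual and cross-lingual natural language understanding (NLU) models and localization strategies for virtual assistants. The dataset covers 20 domains in three languages (English, Spanish, and Polish) with 186 intents in total; the number of samples per intent ranges from 1 to 672 sentences.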
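
The per-intent spread mentioned in the description (from 1 to 672 sentences) is the kind of statistic that is easy to recompute. Below is a minimal sketch, not the authors' tooling: the helper `intent_size_range`, the field names `locale`, `intent`, and `utterance`, and the sample rows are all hypothetical stand-ins, since the listing does not document the actual column names.

```python
from collections import Counter


def intent_size_range(rows, intent_key="intent"):
    """Return (smallest, largest) number of samples found for any intent label."""
    counts = Counter(row[intent_key] for row in rows)
    return min(counts.values()), max(counts.values())


# Hypothetical rows standing in for real data; field names are assumptions.
sample_rows = [
    {"locale": "en-US", "intent": "PlaySong", "utterance": "play some jazz"},
    {"locale": "pl-PL", "intent": "PlaySong", "utterance": "zagraj jazz"},
    {"locale": "es-ES", "intent": "SendEmail", "utterance": "envía un correo"},
]

print(intent_size_range(sample_rows))
```

Assuming the Hugging Face `datasets` library is installed, rows obtained from `load_dataset("cartesinus/leyzer-fedcsis")` could be passed to the same function once the actual split and column names have been confirmed against the dataset viewer.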
