SLURP
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/speechbrain/speechbrain/tree/develop/recipes/SLURP
Resource overview:
SLURP is a publicly available, multi-domain dataset designed for end-to-end spoken language understanding (E2E-SLU). It contains audio recordings of single-turn interactions between users and a home assistant. To address inconsistent labels, a clean subset named SLURP F was created; roughly 30% of the original data was discarded because of these inconsistencies. Compared with other spoken language understanding resources, the dataset is larger and more diverse. The target task is intent classification in spoken language understanding.
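Since the target task is intent classification, a system is typically scored by how often its predicted intent matches the reference label. The following is a minimal sketch of that scoring, not the official SLURP evaluation tooling; the function name `intent_accuracy` and the intent labels are hypothetical placeholders chosen for illustration.

```python
def intent_accuracy(references, predictions):
    """Return the fraction of utterances whose predicted intent equals the reference."""
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    if not references:
        raise ValueError("at least one utterance is required")
    # Count exact matches between reference and predicted intent labels.
    correct = sum(ref == hyp for ref, hyp in zip(references, predictions))
    return correct / len(references)


# Hypothetical intent labels, one per single-turn utterance (not taken from the dataset).
refs = ["alarm_set", "weather_query", "music_play", "alarm_set"]
hyps = ["alarm_set", "weather_query", "alarm_set", "alarm_set"]

print(intent_accuracy(refs, hyps))  # 3 of 4 predictions match, so 0.75
```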
Background overview
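SLURP is a dataset focused on spoken language understanding (SLU). The linked repository provides several recipes for model training and evaluation, covering direct mapping, a tokenizer, and natural language understanding (NLU). It also reports detailed performance metrics, offers an inference interface, supports multiple models such as HuBERT, and lists training times and links to trained models.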
The content above was collected and summarized by 遇见数据集.



