VoiceAssistant-400K: Voice Assistant Optimization Dataset
Source: 超神经 (HyperAI); updated 2024-09-28; indexed 2024-12-14
Download link:
https://hyper.ai/cn/datasets/34686
Resource overview:
VoiceAssistant-400K is a dataset built specifically for voice assistants. Its goal is to help models generate fewer code symbols when they act as voice assistants, which makes them more practical in real-world applications. The dataset was developed to train and optimize the speech output of the Mini-Omni model and was released in 2024 by a research team from Tsinghua University; the associated paper is "Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming". Mini-Omni is an open-source multimodal large language model with real-time dialogue capability and end-to-end speech input and output. Using a text-guided parallel generation method, it produces speech reasoning output consistent with its text capabilities while requiring only minimal additional data and modules.
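To make the "fewer code symbols" goal concrete, here is a minimal sketch of how one might screen assistant responses for content that cannot be read aloud naturally. This is an illustration only, not the procedure used to build VoiceAssistant-400K: the pattern list and the helper names `is_speech_friendly` and `filter_responses` are assumptions introduced for this example.

```python
import re

# Hypothetical heuristics (not from the dataset authors): patterns that
# usually signal code or markup a text-to-speech voice cannot read naturally.
CODE_PATTERNS = [
    re.compile(r"```"),                     # fenced code block marker
    re.compile(r"`[^`\n]+`"),               # inline code span
    re.compile(r"[{}<>\[\]\\|~^]"),         # brackets and symbols rare in speech
    re.compile(r"\b\w+\([^)]*\)"),          # call-like text such as print(x)
    re.compile(r"(==|!=|<=|>=|->|=>|::)"),  # common programming operators
]


def is_speech_friendly(text: str) -> bool:
    """Return True when none of the code-like patterns occur in the text."""
    return not any(pattern.search(text) for pattern in CODE_PATTERNS)


def filter_responses(responses):
    """Keep only the responses that pass the speech-friendliness check."""
    return [r for r in responses if is_speech_friendly(r)]


if __name__ == "__main__":
    samples = [
        "The capital of France is Paris.",
        "Use `print(x)` to show it.",
        "def f(x): return x",
        "Sure, I can set a timer for ten minutes.",
    ]
    for kept in filter_responses(samples):
        print(kept)
```

A rule-based screen like this is deliberately conservative: it may reject harmless text that happens to contain a bracket, so in practice the patterns would need tuning against real responses.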
Created:
2024-09-24
Collected summary
Dataset introduction

Background and challenges
Background overview
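VoiceAssistant-400K is a dataset designed to optimize voice assistant services, released by Tsinghua University in 2024. It aims to reduce generated code symbols and improve model practicality, and it supports the multimodal speech output of the Mini-Omni model through a three-stage training process: modality alignment, adaptation training, and multimodal fine-tuning.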
The content above was collected and summarized by 遇见数据集.



