VoiceAssistant-400K: A Dataset for Voice Assistant Optimization

Source: HyperAI (超神经) · Updated 2024-09-28 · Indexed 2024-12-14

Download link:

https://hyper.ai/cn/datasets/34686


Resource overview:

VoiceAssistant-400K is a dataset built specifically for voice assistants. Its goal is to help models avoid generating code symbols when serving as a voice assistant, making them more practical in real-world applications. The dataset was developed to train and optimize the speech output of the Mini-Omni model and was released in 2024 by a research team from Tsinghua University, alongside the paper "Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming". Mini-Omni is an open-source multimodal large language model with real-time conversational ability and end-to-end speech input and output. Using a text-guided parallel generation method, it produces speech output whose reasoning quality matches its text capabilities, while requiring only minimal additional data and modules.
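To make the "code symbols" concern concrete, the following is a minimal sketch of how one might flag assistant responses that would read poorly when spoken aloud. It is not the method used to build VoiceAssistant-400K; the symbol set, the function names `code_symbol_ratio` and `is_speech_friendly`, and the threshold are all assumptions chosen for illustration.

```python
import re

# Hypothetical heuristic (an assumption, not taken from the dataset or paper):
# backticks, brackets, and similar characters that are awkward to speak aloud.
CODE_SYMBOL_PATTERN = re.compile(r"```|`|[{}\[\]<>|\\#*_=;]")


def code_symbol_ratio(text: str) -> float:
    """Return the fraction of characters in `text` matched as code-like symbols."""
    if not text:
        return 0.0
    matched = sum(len(m.group(0)) for m in CODE_SYMBOL_PATTERN.finditer(text))
    return matched / len(text)


def is_speech_friendly(text: str, max_ratio: float = 0.0) -> bool:
    """Treat a response as speech-friendly if its symbol ratio is within `max_ratio`."""
    return code_symbol_ratio(text) <= max_ratio


# Illustrative responses, invented for this example.
samples = [
    "Sure, the weather today is sunny with a light breeze.",
    "Use `print(x)` to show the value.",
]
speech_friendly = [s for s in samples if is_speech_friendly(s)]
```

A filter like this only catches surface symbols; a response can still be unsuitable for speech in other ways (long lists, URLs), so the threshold and symbol set would need tuning for any real use.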

Created:

2024-09-24

Collected Summary

Dataset Introduction

Background and Challenges

Background Overview

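VoiceAssistant-400K is a dataset designed to improve voice assistant services, released by Tsinghua University in 2024. It is intended to reduce the generation of code symbols and improve model practicality as part of a three-stage training process. It supports the multimodal speech output of the Mini-Omni model, whose training comprises modality alignment, adaptation training, and multimodal fine-tuning.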

The summary above was collected and generated by 遇见数据集.