Emova-ollm/emova-sft-speech-231k
Source: Hugging Face | Updated: 2025-03-14 | Indexed: 2025-04-08
Download link:
https://hf-mirror.com/datasets/Emova-ollm/emova-sft-speech-231k
Description:
EMOVA-SFT-Speech-231K is a dataset curated for omni-modal instruction tuning and emotional spoken dialogue. It was created by converting existing text and visual instruction datasets to speech with Text-to-Speech (TTS) tools. It is part of the EMOVA-Datasets collection and is used in Stage 3 (omni-modal instruction tuning) of the EMOVA family of models. The dataset stores a standalone copy of the EMOVA speech conversation data and is therefore a subset of the EMOVA-SFT-4M dataset. The corresponding evaluation data is maintained in the EMOVA-SFT-Speech-Eval dataset. Speech units are extracted with the EMOVA Speech Tokenizer.
Provided by:
Emova-ollm
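
Loading sketch:
The download link above points at a Hugging Face mirror, so one plausible way to read the data is through the Hugging Face `datasets` library with the `HF_ENDPOINT` environment variable set to the mirror host. The following is a minimal sketch under those assumptions, not an official loading recipe. The helper names `dataset_url` and `load_speech_split` are hypothetical, and `config_name` is a placeholder: the listing does not name the dataset's configurations, so check the dataset card for the real ones.

```python
import os

# Repository id and mirror host, both taken from the download link in the listing.
REPO_ID = "Emova-ollm/emova-sft-speech-231k"
MIRROR = "https://hf-mirror.com"


def dataset_url(repo_id: str, endpoint: str = MIRROR) -> str:
    """Build the dataset page URL for a repo id on the given endpoint."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def load_speech_split(config_name: str, split: str = "train", streaming: bool = True):
    """Load one configuration of the dataset (hypothetical helper).

    Assumptions: the `datasets` package is installed, and `config_name`
    is a configuration that actually exists on the dataset card.
    """
    # HF_ENDPOINT is read when the Hugging Face libraries are imported,
    # so it is set before the import below. An existing value is kept.
    os.environ.setdefault("HF_ENDPOINT", MIRROR)
    from datasets import load_dataset  # imported lazily on purpose

    # streaming=True iterates over records without downloading everything first.
    return load_dataset(REPO_ID, config_name, split=split, streaming=streaming)


if __name__ == "__main__":
    print(dataset_url(REPO_ID))
```

Streaming is the default in this sketch because the dataset is described as holding about 231K speech conversations, and inspecting a few records rarely justifies a full download.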



