nineninesix/elise-en-nano-codec-dataset
Source: Hugging Face | Updated: 2025-09-20 | Indexed: 2025-10-18
Download link:
https://hf-mirror.com/datasets/nineninesix/elise-en-nano-codec-dataset
Resource description:
The Elise EN Nano-Codec Dataset is built on the Elise dataset, re-encoded into nano audio tokens with NVIDIA's NeMo Audio Codec. It contains the transcription text, a speaker identifier, and four layers of audio representation quantized by the NeMo Nano Codec. The dataset is designed for fine-tuning multimodal LLMs and speech systems (TTS/ASR) that rely on codec-based audio token representations. Suitable uses include fine-tuning codec-based TTS models, training ASR systems that operate on discrete audio units, and adapting multimodal LLMs that combine text and audio tokens.
Provider:
nineninesix
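
Usage sketch:
The description mentions four quantized codec layers per utterance that can be combined with text tokens. One common way to feed such layers to a language model is to interleave them frame by frame into a single token stream, giving each layer its own ID range. The sketch below illustrates that idea only; it is not taken from the dataset card. The codebook size (`CODEBOOK_SIZE`), the function names, and the layout of the input lists are all assumptions, so check the actual column names and codebook size on the dataset page before adapting it.

```python
from typing import List

# Assumed values for illustration; the listing states four layers
# but does not give the codebook size.
NUM_LAYERS = 4
CODEBOOK_SIZE = 4096  # hypothetical


def flatten_codes(layers: List[List[int]],
                  codebook_size: int = CODEBOOK_SIZE) -> List[int]:
    """Interleave per-layer codes frame by frame.

    Layer k is shifted by k * codebook_size so that codes from
    different layers never share a token ID.
    """
    if len(layers) != NUM_LAYERS:
        raise ValueError(f"expected {NUM_LAYERS} layers, got {len(layers)}")
    n_frames = len(layers[0])
    if any(len(layer) != n_frames for layer in layers):
        raise ValueError("all layers must have the same number of frames")

    flat: List[int] = []
    for t in range(n_frames):
        for k, layer in enumerate(layers):
            code = layer[t]
            if not 0 <= code < codebook_size:
                raise ValueError(f"code {code} is outside the codebook range")
            flat.append(k * codebook_size + code)
    return flat


def unflatten_codes(flat: List[int],
                    codebook_size: int = CODEBOOK_SIZE) -> List[List[int]]:
    """Invert flatten_codes: recover the per-layer code lists."""
    if len(flat) % NUM_LAYERS != 0:
        raise ValueError("flat length must be a multiple of the layer count")
    layers: List[List[int]] = [[] for _ in range(NUM_LAYERS)]
    for i, token in enumerate(flat):
        k = i % NUM_LAYERS  # position inside the frame gives the layer index
        layers[k].append(token - k * codebook_size)
    return layers
```

In a fine-tuning setup, the flattened audio tokens would typically be shifted past the text vocabulary and concatenated with the tokenized transcription; that step depends on the target model's tokenizer and is left out here.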



