shangeth/librispeech-mimi-codes
Source: Hugging Face | Updated: 2026-04-30 | Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/shangeth/librispeech-mimi-codes
Description:
The LibriSpeech Mimi Codes dataset contains pre-extracted Kyutai Mimi neural-codec tokens for the LibriSpeech corpus, a collection of multi-speaker English audiobook readings from the LibriVox project. It contains codes only, not audio. Using the pre-extracted codes saves the GPU time otherwise spent on extraction when training Mimi-based speech models. There is one row per utterance, with columns for the ID, text, speaker ID, codes, number of frames, and number of codebooks. The documented extraction details cover the codec, codebooks, codebook size, and transcript source. The data is divided into the standard LibriSpeech splits, which differ in speech quality and duration. Usage examples show how to load the dataset, process the codes, and decode them to audio. The dataset is licensed under CC-BY-4.0; users must give attribution and cite both the original corpus and this dataset.
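As a rough illustration of the "load and process codes" step mentioned above, here is a minimal Python sketch. It is not taken from the dataset card. The column names (`codes`, `num_frames`, `num_codebooks`), the codebook-major layout, and the helper names are all assumptions made for illustration; check the actual schema before relying on them. The 12.5 Hz frame rate is Mimi's published rate, not a value stated in this listing.

```python
import numpy as np


def load_rows(split):
    """Load one split from the Hub. The caller must supply a real split name;
    none is assumed here because the listing does not name the splits."""
    from datasets import load_dataset  # imported lazily; requires network access
    return load_dataset("shangeth/librispeech-mimi-codes", split=split)


def codes_to_array(row, codes_key="codes", frames_key="num_frames",
                   books_key="num_codebooks"):
    """Turn one row's codes into an int64 array of shape (codebooks, frames).

    ASSUMPTION: key names and the codebook-major layout are hypothetical.
    """
    codes = np.asarray(row[codes_key], dtype=np.int64)
    expected = (int(row[books_key]), int(row[frames_key]))
    if codes.ndim == 1:
        # A flat list is assumed to be stored codebook by codebook.
        codes = codes.reshape(expected)
    if codes.shape != expected:
        raise ValueError(f"codes shape {codes.shape} != expected {expected}")
    return codes


def approx_duration_seconds(num_frames, frame_rate_hz=12.5):
    """Estimate utterance duration from the frame count (Mimi runs at 12.5 Hz)."""
    return num_frames / frame_rate_hz
```

Checking the stored frame and codebook counts against the array shape is a cheap way to catch a layout mismatch (for example frame-major storage) before the codes reach a model or decoder.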
Provider:
shangeth



