collabora/whisperspeech

Name: collabora/whisperspeech
Creator: collabora
Published: 2023-10-07 06:41:11
License: MIT

Source: Hugging Face. Updated 2023-10-07; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/collabora/whisperspeech
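A minimal sketch of fetching the dataset files from the link above with the `huggingface_hub` client. The helper names `dataset_url` and `download_whisperspeech`, the `local_dir` default, and the use of the `HF_ENDPOINT` environment variable to point at the mirror are illustrative assumptions. This listing does not describe the file layout of the repository, so `allow_patterns` is left for the caller to fill in.

```python
import os

REPO_ID = "collabora/whisperspeech"


def dataset_url(endpoint="https://hf-mirror.com"):
    """Build the dataset page URL for a given Hub endpoint."""
    return f"{endpoint.rstrip('/')}/datasets/{REPO_ID}"


def download_whisperspeech(local_dir="whisperspeech-data",
                           allow_patterns=None,
                           endpoint=None):
    """Download the dataset repository (hypothetical helper).

    allow_patterns: optional glob patterns to restrict the download;
    the actual file names are not listed here, so none are assumed.
    endpoint: optional mirror base URL, e.g. "https://hf-mirror.com".
    """
    if endpoint:
        # HF_ENDPOINT is read when huggingface_hub is first imported,
        # so this only takes effect if the library is not loaded yet.
        os.environ["HF_ENDPOINT"] = endpoint
    from huggingface_hub import snapshot_download  # requires `pip install huggingface_hub`

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        local_dir=local_dir,
        allow_patterns=allow_patterns,
    )
```

Calling `download_whisperspeech(endpoint="https://hf-mirror.com")` would pull the whole repository, which is likely large; passing `allow_patterns` is the usual way to fetch only one subset once the file names are known.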

Description:

---
license: mit
task_categories:
- text-to-speech
language:
- en
pretty_name: WhisperSpeech
---

# The WhisperSpeech Dataset

This dataset contains data to train SPEAR TTS-like text-to-speech models that use semantic tokens derived from the OpenAI Whisper speech recognition model.

We currently provide semantic and acoustic tokens for the LibriLight and LibriTTS datasets (English only).

Acoustic tokens:
- 24kHz EnCodec 6kbps (8 quantizers)

Semantic tokens:
- Whisper tiny VQ bottleneck trained on a subset of LibriLight

Available LibriLight subsets:
- `small`/`medium`/`large` (following the original dataset division but with `large` excluding the speaker `6454`)
- a separate ≈1300hr single-speaker subset based on the `6454` speaker from the `large` subset for training single-speaker TTS models

We plan to add more acoustic tokens from other codecs in the future.

Provider:

collabora

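Summary of the original information

WhisperSpeech dataset overview

Dataset name

WhisperSpeech

License

MIT

Task category

Text-to-speech

Language

English (en)

Data content

Data for training SPEAR TTS-like text-to-speech models that use semantic tokens derived from the OpenAI Whisper speech recognition model.
Semantic and acoustic tokens are provided for the LibriLight and LibriTTS datasets (English only).

Acoustic tokens

24kHz EnCodec 6kbps (8 quantizers)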
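The 6kbps figure follows from the number of quantizers. As a worked check, the sketch below assumes the standard 24kHz EnCodec configuration of 75 token frames per second and 1024-entry codebooks (10 bits per token); neither value is stated in this listing. The function names are illustrative only.

```python
import math


def encodec_bitrate_bps(n_quantizers, frame_rate_hz=75, codebook_size=1024):
    """Bitrate of a residual-VQ token stream in bits per second.

    frame_rate_hz and codebook_size are assumed EnCodec 24 kHz defaults,
    not values taken from the dataset card.
    """
    bits_per_token = math.log2(codebook_size)  # 10 bits for 1024 entries
    return n_quantizers * bits_per_token * frame_rate_hz


def acoustic_tokens_for_hours(hours, n_quantizers=8, frame_rate_hz=75):
    """Rough count of acoustic tokens for a given amount of audio."""
    return int(hours * 3600 * frame_rate_hz * n_quantizers)


print(encodec_bitrate_bps(8))           # 6000.0 bits/s, i.e. 6 kbps
print(acoustic_tokens_for_hours(1300))  # 2808000000
```

Under the same assumptions, the roughly 1300-hour single-speaker subset would correspond to about 2.8 billion acoustic tokens across the 8 quantizer streams.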

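Semantic tokens

Whisper tiny VQ bottleneck, trained on a subset of LibriLight

LibriLight subsets

small/medium/large (following the original dataset division, but with the large subset excluding speaker 6454)
A separate single-speaker subset of about 1300 hours, based on speaker 6454 from the large subset, for training single-speaker TTS models

Future plans

More acoustic tokens from other codecs are planned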
