twangodev/radiotalk-voices-2k
Source: Hugging Face. Updated 2026-04-24; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/twangodev/radiotalk-voices-2k
Description:
The radiotalk-voices-2k dataset contains 2,000 English reference voices, one 12–30 s clip per speaker. Each clip is the longest qualifying utterance for its speaker in LibriTTS-R. The dataset is built for zero-shot TTS voice cloning in the radiotalk pipeline. It totals 12.03 hours of audio, stored as 24 kHz mono FLAC. Each record includes a stable 12-hex-character ID, the FLAC-encoded audio, a normalized transcript, and the source clip ID from LibriTTS-R.
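The stated figures can be sanity-checked with a little arithmetic, and the per-record constraints can be expressed as a small validator. The sketch below is illustrative only: the field names `id`, `sample_rate`, `num_samples`, and `text` are hypothetical stand-ins and may not match the dataset's actual column names, and accepting either letter case in the ID is also an assumption.

```python
import re

# Figures taken from the dataset description.
NUM_CLIPS = 2000
TOTAL_HOURS = 12.03
MIN_SECONDS, MAX_SECONDS = 12.0, 30.0
SAMPLE_RATE_HZ = 24_000

# Assumption: the 12-character hex ID may be upper- or lowercase.
HEX_ID = re.compile(r"[0-9a-fA-F]{12}")


def mean_clip_seconds(total_hours=TOTAL_HOURS, num_clips=NUM_CLIPS):
    """Average clip length in seconds implied by the stated totals."""
    return total_hours * 3600 / num_clips


def check_record(record):
    """Return a list of problems found in one record (empty if none).

    `record` is a plain dict using hypothetical field names; adapt them
    to the real schema before use.
    """
    problems = []
    if not HEX_ID.fullmatch(str(record.get("id", ""))):
        problems.append("id is not 12 hex characters")
    if record.get("sample_rate") != SAMPLE_RATE_HZ:
        problems.append("sample rate is not 24 kHz")
    duration = record.get("num_samples", 0) / SAMPLE_RATE_HZ
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append("duration is outside 12-30 s")
    if not str(record.get("text", "")).strip():
        problems.append("transcript is empty")
    return problems
```

With the stated totals, `mean_clip_seconds()` comes to roughly 21.65 seconds, which sits inside the 12–30 s range given for individual clips.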
Provider:
twangodev



