RyanSpeech
OpenDataLab | Updated 2026-05-17 | Indexed 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/RyanSpeech
Resource description:
RyanSpeech is a speech corpus for research on automatic text-to-speech (TTS) systems. Publicly available TTS corpora are often noisy, recorded by multiple speakers, or lacking in high-quality male speech data. To meet the demand for a high-quality, publicly available male speech corpus in the field of speech recognition, we designed and created RyanSpeech. Its textual materials were drawn from real-world conversational settings, and the corpus contains over 10 hours of speech from a professional male voice actor, recorded at 44.1 kHz. The design and creation pipeline of the corpus make RyanSpeech well suited to developing TTS systems for real-world applications. To provide a baseline for future research, protocols, and benchmarks, we trained four state-of-the-art speech models and vocoders on RyanSpeech. Our best model achieves a mean opinion score (MOS) of 3.36. The trained models are publicly available for download.
Provider:
OpenDataLab
Created:
2023-10-11
Collected summary
Dataset introduction

Background and challenges
Background overview
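RyanSpeech is a high-quality male speech corpus containing over 10 hours of professional recordings, intended primarily for text-to-speech (TTS) research. The dataset has been used to train several state-of-the-art models, the best of which achieved a mean opinion score (MOS) of 3.36.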
The above content was collected and summarized by 遇见数据集.



