LJSpeech
Source: arXiv (indexed 2025-09-30)
Download link:
https://keithito.com/lj-speech-dataset/
Description:
LJSpeech contains 13,100 paired text and speech samples, totaling approximately 24 hours of audio. It is split into 12,900 samples for training, 100 for validation, and 100 for testing. The audio is converted to mel-spectrograms using a frame size of 1024 samples, a frame shift of 256 samples, and a sampling rate of 22,050 Hz. The dataset is suited to text-to-speech (TTS) tasks.
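The figures above (frame size 1024, frame shift 256, 22,050 Hz, and a 12,900/100/100 split) can be turned into a quick sanity check. The following is a minimal sketch, not the dataset's official tooling: the function names are hypothetical, the centered-padding convention is an assumption (the description does not say how clip edges are handled), and the contiguous index split is only a placeholder because the description does not say which samples belong to which subset.

```python
SAMPLE_RATE = 22_050   # Hz, from the description
FRAME_SIZE = 1024      # samples per analysis window, from the description
FRAME_SHIFT = 256      # samples between consecutive frames, from the description


def samples_to_ms(num_samples: int) -> float:
    """Convert a length in samples to milliseconds at SAMPLE_RATE."""
    return 1000.0 * num_samples / SAMPLE_RATE


def num_frames(num_samples: int, centered: bool = True) -> int:
    """Number of spectrogram frames produced for a clip of `num_samples`.

    centered=True assumes the clip is padded by FRAME_SIZE // 2 on both
    sides (an assumption, not stated in the description); centered=False
    counts only windows that fit entirely inside the clip.
    """
    if centered:
        return 1 + num_samples // FRAME_SHIFT
    if num_samples < FRAME_SIZE:
        return 0
    return 1 + (num_samples - FRAME_SIZE) // FRAME_SHIFT


def split_indices(n_total: int = 13_100, n_val: int = 100, n_test: int = 100):
    """Placeholder split: first block for training, then validation, then test.

    The actual assignment of samples to subsets is not given in the
    description, so this contiguous split is illustrative only.
    """
    n_train = n_total - n_val - n_test
    indices = list(range(n_total))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    print(f"frame shift: {samples_to_ms(FRAME_SHIFT):.2f} ms")
    print(f"frame size:  {samples_to_ms(FRAME_SIZE):.2f} ms")
    print("frames in a 1 s clip:", num_frames(SAMPLE_RATE))
    train, val, test = split_indices()
    print(len(train), len(val), len(test))
```

At these settings each frame advances by roughly 11.6 ms and spans roughly 46.4 ms, so consecutive frames overlap by 75%.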



