
outlawmold/sinhala-tts-dataset-archive-20260429-082457

Source: Hugging Face. Updated 2026-04-29. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/outlawmold/sinhala-tts-dataset-archive-20260429-082457
Description:
Clean, segmented Sinhala speech drawn from the Unlimited History YouTube series by @sunchare. The dataset contains 218 utterances totaling 0.51 hours, with a mean duration of 8.5 s per utterance and a sample rate of 22050 Hz. The audio went through a multi-step processing pipeline that includes source audio separation, voice enhancement, voice activity detection (VAD), automatic speech recognition (ASR), and quality filtering. The data is stored in LJSpeech format, with wav audio files and a metadata file.
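
The listed figures are mutually consistent: 218 utterances × 8.5 s ≈ 1853 s ≈ 0.51 hours. As a rough sketch of how a downloaded copy could be checked against them, the Python below parses an LJSpeech-style metadata file and measures each clip with the standard-library `wave` module. The file name `metadata.csv`, the `wavs/` subdirectory, and the pipe-delimited `id|text|normalized text` layout follow the usual LJSpeech convention and are assumptions here; this listing does not confirm the archive's exact layout. The function names are hypothetical helpers, not part of any dataset tooling.

```python
import csv
import wave
from pathlib import Path


def read_metadata(metadata_path):
    """Parse a pipe-delimited metadata file into (clip_id, text) pairs.

    Assumes the usual LJSpeech layout: id|text|normalized text.
    The last field is taken as the text, so two-column files also work.
    """
    rows = []
    with open(metadata_path, encoding="utf-8", newline="") as f:
        reader = csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE)
        for fields in reader:
            if not fields:
                continue  # skip blank lines
            rows.append((fields[0], fields[-1]))
    return rows


def wav_duration_seconds(wav_path):
    """Return the duration of a PCM wav file in seconds."""
    with wave.open(str(wav_path), "rb") as w:
        return w.getnframes() / w.getframerate()


def summarize(root):
    """Count utterances and compute total and mean duration.

    Assumes clips live at <root>/wavs/<clip_id>.wav (an assumption,
    borrowed from the LJSpeech convention).
    """
    root = Path(root)
    rows = read_metadata(root / "metadata.csv")
    durations = [
        wav_duration_seconds(root / "wavs" / f"{clip_id}.wav")
        for clip_id, _ in rows
    ]
    total = sum(durations)
    return {
        "utterances": len(rows),
        "total_hours": total / 3600,
        "mean_seconds": total / len(durations) if durations else 0.0,
    }
```

Calling `summarize("path/to/extracted/archive")` on a local copy (the path is a placeholder) returns a dictionary whose values can be compared with the 218 utterances, 0.51 hours, and 8.5 s mean stated above. The `wave` module reads uncompressed PCM only, so clips stored in another encoding would need a different reader.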
Provider:
outlawmold