collectivat/ladino-karen-TTS

Name: collectivat/ladino-karen-TTS
Creator: collectivat
Published: 2025-10-20 10:06:10
License: No description available

Source: Hugging Face. Updated 2025-10-20; indexed 2025-10-25.

Download link:

https://hf-mirror.com/datasets/collectivat/ladino-karen-TTS

Description:

The Ladino Text-to-Speech (TTS) Training Dataset is a single-speaker speech corpus in Ladino (Judeo-Spanish), recorded by a native speaker from Istanbul. The corpus was created for training text-to-speech synthesis models for this endangered language. It contains 1,987 audio segments with a total duration of approximately 3.3 hours, stored as WAV files (16 kHz, 16-bit, mono). The segments were automatically cut from recordings of 30 articles from the weekly Ladino newspaper El Amaneser, covering topics such as historical issues, current affairs, and cultural events. Each recording is aligned with its transcription.
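The audio format stated above (16 kHz, 16-bit, mono WAV) can be checked on downloaded files before training. The following is a minimal sketch using only Python's standard `wave` module. The helper names `check_wav` and `make_silent_wav` are hypothetical and not part of the dataset, and the silent clip is synthetic, generated only so the example runs without any dataset file.

```python
import io
import wave

# Format stated in the dataset description: 16 kHz, 16-bit, mono WAV.
EXPECTED_RATE = 16000
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit
EXPECTED_CHANNELS = 1


def check_wav(source):
    """Return (matches_format, duration_seconds) for a WAV path or file object.

    Hypothetical helper: it compares the WAV header against the format
    stated in the description and derives the clip duration from it.
    """
    with wave.open(source, "rb") as wav:
        matches = (
            wav.getframerate() == EXPECTED_RATE
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
        duration = wav.getnframes() / wav.getframerate()
    return matches, duration


def make_silent_wav(seconds, rate=EXPECTED_RATE):
    """Build an in-memory silent 16-bit mono WAV (synthetic demo data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    matches, duration = check_wav(make_silent_wav(2.0))
    print(matches, duration)
```

In practice `check_wav` would be called with the path of each downloaded segment. Summing the returned durations gives a way to compare the corpus against the stated total of approximately 3.3 hours; with 1,987 segments, that total implies an average of roughly 6 seconds per segment.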

Provided by:

collectivat
