yotarokubo/fleurs_with_flores_101
Hugging Face | Updated 2025-12-19 | Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/yotarokubo/fleurs_with_flores_101
Description:
A large-scale multilingual speech dataset with more than 270,000 speech samples covering 103 languages in 7 geographic language groups. Each sample includes 16 kHz audio, the raw transcription, a normalized transcription, the speaker's gender (male/female/other), a language identifier, a language-group label (e.g. Western Europe, East Asia), and translations of the text into the 103 languages. The dataset is about 232.5 GB in total and is suited to multilingual speech recognition, speech translation, and related NLP tasks.
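Because the full dataset is roughly 232.5 GB, streaming records is likely more practical than downloading everything first. The following is a minimal sketch, assuming the repository loads through the Hugging Face `datasets` library; the split name `"train"` is an assumption, a configuration name may also be required, and the helper names `open_stream` and `duration_seconds` are hypothetical. Only the 16 kHz sample rate and the repository ID come from the description above.

```python
SAMPLE_RATE_HZ = 16_000  # stated in the dataset description


def duration_seconds(num_samples: int, sample_rate: int = SAMPLE_RATE_HZ) -> float:
    """Convert a count of audio samples into a clip length in seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def open_stream(repo_id: str = "yotarokubo/fleurs_with_flores_101",
                split: str = "train"):
    """Return a streaming iterator over the dataset (needs network access).

    Assumption: the split is called "train"; a config name may also be
    needed. Check the dataset card for the actual values.
    """
    # Imported lazily so the duration helper works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split=split, streaming=True)


if __name__ == "__main__":
    # A clip of 80,000 samples at 16 kHz lasts 5 seconds.
    print(duration_seconds(80_000))
```

Calling `next(iter(open_stream()))` would fetch a single record, which is a cheap way to inspect the actual field names before committing to a full download.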
Provider:
yotarokubo



