fsicoli/common_voice_22_0
Source: Hugging Face. Updated 2025-08-11. Indexed 2025-08-30.
Download link:
https://hf-mirror.com/datasets/fsicoli/common_voice_22_0
Description:
Common Voice Corpus 22.0 is a large-scale multilingual speech dataset created and provided by the Mozilla Foundation for automatic speech recognition. It covers languages from Abkhaz to Yoruba, and its size is listed as between 100B and 1T. Each data point includes the audio file path and the sentence, along with accent, age, client_id, up_votes, down_votes, gender, locale, and segment information.
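The record layout described above can be sketched as a typed Python structure. This is only an illustration: the field name `path`, the field types, the helper `net_votes`, and every sample value below are assumptions made for the example, not details confirmed by the listing.

```python
from dataclasses import dataclass, fields


@dataclass
class CommonVoiceRecord:
    """Hypothetical model of one data point; names follow the description."""
    path: str        # audio file path (field name assumed)
    sentence: str    # transcript of the clip
    accent: str
    age: str
    client_id: str
    up_votes: int    # integer type is an assumption
    down_votes: int  # integer type is an assumption
    gender: str
    locale: str
    segment: str


def net_votes(record: CommonVoiceRecord) -> int:
    """Hypothetical helper: up votes minus down votes."""
    return record.up_votes - record.down_votes


# Made-up sample values, for illustration only.
example = CommonVoiceRecord(
    path="clips/example.mp3",
    sentence="This is a placeholder sentence.",
    accent="",
    age="",
    client_id="placeholder-client-id",
    up_votes=2,
    down_votes=0,
    gender="",
    locale="en",
    segment="",
)

field_names = [f.name for f in fields(CommonVoiceRecord)]
```

A structure like this makes it easy to filter clips, for example by keeping only records where `net_votes(record)` is positive before training a recognizer.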
Provided by:
fsicoli



