mteb/common_voice_17_0_mini
Source: Hugging Face. Updated: 2025-12-23. Indexed: 2025-12-20.
Download link:
https://hf-mirror.com/datasets/mteb/common_voice_17_0_mini
Description:
This is a multilingual audio dataset with multiple configurations (e.g., ar, ast, be, bg, bn). Each configuration has the following features: client_id, path, audio, sentence, up_votes, down_votes, age, gender, accent, locale, segment, variant, and continuation. The data is split into train, validation, and test sets, and a byte size and example count are given for each split. The dataset appears to be a collection of spoken sentences in various languages, accompanied by metadata about the speakers and their demographics.
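The feature list above can be turned into a small sanity check. What follows is a minimal sketch, assuming each record arrives as a plain dictionary keyed by feature name. The names `EXPECTED_FIELDS`, `EXPECTED_SPLITS`, and `missing_fields`, along with the example record and its placeholder values, are hypothetical and introduced here for illustration; they are not part of the dataset or of any library.

```python
from __future__ import annotations

from typing import Any, Mapping

# Field names copied from the description above.
EXPECTED_FIELDS = (
    "client_id", "path", "audio", "sentence", "up_votes", "down_votes",
    "age", "gender", "accent", "locale", "segment", "variant", "continuation",
)

# Split names as listed in the description.
EXPECTED_SPLITS = ("train", "validation", "test")


def missing_fields(record: Mapping[str, Any]) -> list[str]:
    """Return the expected field names absent from `record`, in listed order."""
    return [name for name in EXPECTED_FIELDS if name not in record]


# Hypothetical record with placeholder values (assumption: the real value
# types are not stated in the description, so None is used throughout).
example = {name: None for name in EXPECTED_FIELDS}
del example["accent"]  # simulate a record that lacks one feature

print(missing_fields(example))
```

A check like this only confirms that the keys are present; it says nothing about value types, which the description does not specify.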
Provider:
mteb



