
amanuelbyte/african_speech_dataset

Hugging Face | Updated 2026-02-15 | Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/amanuelbyte/african_speech_dataset
Description:
---
dataset_info:
- config_name: amh
  features:
  - name: text
    dtype: string
  - name: lang
    dtype: string
  - name: source
    dtype: string
  - name: audio
    dtype:
      audio:
        decode: false
  splits:
  - name: train
    num_bytes: 37922390
    num_examples: 1045
  download_size: 37692218
  dataset_size: 37922390
- config_name: amharic
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: text
    dtype: string
  - name: source
    dtype: string
  splits:
  - name: train
    num_bytes: 11686723853
    num_examples: 32901
  download_size: 11067717597
  dataset_size: 11686723853
- config_name: hau
  features:
  - name: text
    dtype: string
  - name: lang
    dtype: string
  - name: source
    dtype: string
  - name: audio
    dtype:
      audio:
        decode: false
  splits:
  - name: train
    num_bytes: 91812182
    num_examples: 3496
  download_size: 91066923
  dataset_size: 91812182
- config_name: hausa
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: text
    dtype: string
  - name: source
    dtype: string
  splits:
  - name: train
    num_bytes: 602620189
    num_examples: 1572
  download_size: 602412677
  dataset_size: 602620189
- config_name: som
  features:
  - name: text
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
        decode: false
  - name: lang
    dtype: string
  - name: source
    dtype: string
  splits:
  - name: train
    num_bytes: 1455941672
    num_examples: 17354
  download_size: 1437831373
  dataset_size: 1455941672
- config_name: swahili
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: text
    dtype: string
  - name: source
    dtype: string
  splits:
  - name: train
    num_bytes: 8763983905
    num_examples: 412435
  download_size: 5012049012
  dataset_size: 8763983905
- config_name: swh
  features:
  - name: text
    dtype: string
  - name: lang
    dtype: string
  - name: source
    dtype: string
  - name: audio
    dtype:
      audio:
        decode: false
  splits:
  - name: train
    num_bytes: 4763063290
    num_examples: 146179
  download_size: 4722484977
  dataset_size: 4763063290
- config_name: wol
  features:
  - name: text
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
        decode: false
  - name: lang
    dtype: string
  - name: source
    dtype: string
  splits:
  - name: train
    num_bytes: 4893866154
    num_examples: 36691
  download_size: 4661525784
  dataset_size: 4893866154
- config_name: wolof
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: text
    dtype: string
  - name: source
    dtype: string
  splits:
  - name: train
    num_bytes: 4275421272
    num_examples: 36009
  download_size: 4100478830
  dataset_size: 4275421272
- config_name: yor
  features:
  - name: text
    dtype: string
  - name: lang
    dtype: string
  - name: source
    dtype: string
  - name: audio
    dtype:
      audio:
        decode: false
  splits:
  - name: train
    num_bytes: 122881808
    num_examples: 3451
  download_size: 121281673
  dataset_size: 122881808
- config_name: yoruba
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: text
    dtype: string
  - name: source
    dtype: string
  splits:
  - name: train
    num_bytes: 25744535207
    num_examples: 1648471
  download_size: 22184723388
  dataset_size: 25744535207
- config_name: zul
  features:
  - name: text
    dtype: string
  - name: lang
    dtype: string
  - name: source
    dtype: string
  - name: audio
    dtype:
      audio:
        decode: false
  splits:
  - name: train
    num_bytes: 506360
    num_examples: 16
  download_size: 507173
  dataset_size: 506360
configs:
- config_name: amh
  data_files:
  - split: train
    path: amh/train-*
- config_name: amharic
  data_files:
  - split: train
    path: amharic/train-*
- config_name: hau
  data_files:
  - split: train
    path: hau/train-*
- config_name: hausa
  data_files:
  - split: train
    path: hausa/train-*
- config_name: som
  data_files:
  - split: train
    path: som/train-*
- config_name: swahili
  data_files:
  - split: train
    path: swahili/train-*
- config_name: swh
  data_files:
  - split: train
    path: swh/train-*
- config_name: wol
  data_files:
  - split: train
    path: wol/train-*
- config_name: wolof
  data_files:
  - split: train
    path: wolof/train-*
- config_name: yor
  data_files:
  - split: train
    path: yor/train-*
- config_name: yoruba
  data_files:
  - split: train
    path: yoruba/train-*
- config_name: zul
  data_files:
  - split: train
    path: zul/train-*
---
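The metadata above lists twelve configs, each with a single train split. A minimal sketch of how one config might be streamed with the Hugging Face `datasets` library follows. The repository ID and example counts are copied from this page; the helper names `stream_config` and `preview` are hypothetical, and the sketch assumes the repository is publicly reachable and that `datasets` is installed.

```python
from itertools import islice

# Repository ID as shown in the page title.
REPO_ID = "amanuelbyte/african_speech_dataset"

# Train-split example counts copied from the card metadata.
NUM_EXAMPLES = {
    "amh": 1045, "amharic": 32901,
    "hau": 3496, "hausa": 1572,
    "som": 17354,
    "swahili": 412435, "swh": 146179,
    "wol": 36691, "wolof": 36009,
    "yor": 3451, "yoruba": 1648471,
    "zul": 16,
}


def stream_config(config_name):
    """Return a streaming view of one config's train split (hypothetical helper)."""
    if config_name not in NUM_EXAMPLES:
        raise ValueError(f"unknown config: {config_name!r}")
    # Imported lazily so the metadata table is usable without the library.
    from datasets import load_dataset
    # streaming=True avoids downloading the full config up front.
    return load_dataset(REPO_ID, config_name, split="train", streaming=True)


def preview(config_name, n=3):
    """Print the text and source fields of the first n rows (needs network access)."""
    for row in islice(stream_config(config_name), n):
        print(row["text"], "|", row["source"])


# Total rows across all configs, computed from the table above.
TOTAL_EXAMPLES = sum(NUM_EXAMPLES.values())
```

Calling `preview("zul")` would fetch rows from the smallest config (16 examples). Streaming is used here because several configs exceed 4 GB in download size according to the metadata.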
Provided by:
amanuelbyte