
amanuelbyte/african_speech_clean

Source: Hugging Face. Updated: 2026-04-14. Indexed: 2026-03-29.
Download link:
https://hf-mirror.com/datasets/amanuelbyte/african_speech_clean
Resource description:
---
dataset_info:
- config_name: amharic
  features:
  - name: text
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  splits:
  - name: train
    num_bytes: 20020186401
    num_examples: 55660
  download_size: 18264934277
  dataset_size: 20020186401
- config_name: hausa
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 24000
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 2759570290.72
    num_examples: 1322
  download_size: 2410109948
  dataset_size: 2759570290.72
- config_name: somali
  features:
  - name: text
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: lang
    dtype: string
  - name: source
    dtype: string
  splits:
  - name: train
    num_bytes: 1455941479
    num_examples: 17354
  download_size: 1437831250
  dataset_size: 1455941479
- config_name: swahili
  features:
  - name: text
    dtype: string
  - name: lang
    dtype: string
  - name: source
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  splits:
  - name: train
    num_bytes: 13522763655
    num_examples: 558614
  download_size: 9734426103
  dataset_size: 13522763655
- config_name: swahili_clean
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 329583143
    num_examples: 10000
  download_size: 329697901
  dataset_size: 329583143
- config_name: wolof
  features:
  - name: text
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: lang
    dtype: string
  - name: source
    dtype: string
  splits:
  - name: train
    num_bytes: 9168860448
    num_examples: 72700
  download_size: 8761312082
  dataset_size: 9168860448
- config_name: yoruba
  features:
  - name: text
    dtype: string
  - name: lang
    dtype: string
  - name: source
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  splits:
  - name: train
    num_bytes: 25864326146
    num_examples: 1651922
  download_size: 22305974462
  dataset_size: 25864326146
- config_name: yoruba_clean
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 925922141
    num_examples: 10000
  download_size: 925963597
  dataset_size: 925922141
- config_name: zulu
  features:
  - name: text
    dtype: string
  - name: lang
    dtype: string
  - name: source
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  splits:
  - name: train
    num_bytes: 506360
    num_examples: 16
  download_size: 507188
  dataset_size: 506360
configs:
- config_name: amharic
  data_files:
  - split: train
    path: amharic/train-*
- config_name: hausa
  data_files:
  - split: train
    path: hausa/train-*
- config_name: somali
  data_files:
  - split: train
    path: somali/train-*
- config_name: swahili
  data_files:
  - split: train
    path: swahili/train-*
- config_name: swahili_clean
  data_files:
  - split: train
    path: swahili_clean/train-*
- config_name: wolof
  data_files:
  - split: train
    path: wolof/train-*
- config_name: yoruba
  data_files:
  - split: train
    path: yoruba/train-*
- config_name: yoruba_clean
  data_files:
  - split: train
    path: yoruba_clean/train-*
- config_name: zulu
  data_files:
  - split: train
    path: zulu/train-*
---
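The metadata above lists nine configurations, each with a single train split. The following is a minimal sketch of how one of them might be loaded with the Hugging Face `datasets` library. It assumes `datasets` is installed and that the repository is publicly reachable. The names `CONFIGS`, `load_config`, and `configs_needing_resample` are hypothetical helpers introduced for illustration and are not part of the dataset. The example counts and sampling rates are copied from the metadata.

```python
# Config name -> (num_examples, sampling_rate_hz), copied from the dataset metadata.
CONFIGS = {
    "amharic": (55660, 16000),
    "hausa": (1322, 24000),
    "somali": (17354, 16000),
    "swahili": (558614, 16000),
    "swahili_clean": (10000, 16000),
    "wolof": (72700, 16000),
    "yoruba": (1651922, 16000),
    "yoruba_clean": (10000, 16000),
    "zulu": (16, 16000),
}

REPO_ID = "amanuelbyte/african_speech_clean"


def configs_needing_resample(target_hz=16000):
    """Return the configs whose declared sampling rate differs from target_hz."""
    return [name for name, (_, hz) in CONFIGS.items() if hz != target_hz]


def load_config(name, streaming=True):
    """Load the train split of one config (hypothetical helper).

    Streaming is the default here because several configs are larger
    than 10 GB according to the metadata.
    """
    if name not in CONFIGS:
        raise ValueError(f"unknown config {name!r}; expected one of {sorted(CONFIGS)}")
    # Imported lazily so the helpers above work without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name, split="train", streaming=streaming)
```

Calling `load_config("zulu")` would return an iterable over the 16 Zulu examples. Because `hausa` is declared at 24000 Hz while the other configs are at 16000 Hz, mixing it with the rest would likely require resampling first, for example with `cast_column("audio", Audio(sampling_rate=16000))` from `datasets`.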
Provided by:
amanuelbyte