
musamagwaza23/NCHLT_Siswati_Speech_Corpus

Source: Hugging Face. Updated 2026-04-16. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/musamagwaza23/NCHLT_Siswati_Speech_Corpus
Resource description:
---
license: cc-by-3.0
tags:
- ASR
- NCHLT
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: speaker_id
    dtype: string
  - name: age
    dtype: int64
  - name: gender
    dtype: string
  - name: location
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: duration_seconds
    dtype: float64
  - name: pdp_score
    dtype: float64
  - name: transcript
    dtype: string
  splits:
  - name: train
    num_bytes: 19343920977
    num_examples: 187170
  download_size: 28420192388
  dataset_size: 19343920977
---

## NCHLT Siswati Speech Corpus

A subset of the NCHLT Speech Corpus containing Siswati (siSwati) audio data, uploaded for use in ASR research and model fine-tuning.

The original data is sourced from the South African Centre for Digital Language Resources (SADiLaR) via sadilar.org. Use of this data is subject to the license conditions on the original product page.

## About

Uploaded by Musa, Electronics Honours student, as part of ASR research on South African languages.

---

## Attribution and Credits

Davel, M., Barnard, E., Badenhorst, J., van Heerden, C., de Waal, A. NCHLT isiZulu Speech Corpus. CSIR / North-West University, 2014.

**Reference paper:**

> N.J. de Vries, M.H. Davel, J. Badenhorst, W.D. Basson, F. de Wet, E. Barnard and A. de Waal, "A smartphone-based ASR data collection tool for under-resourced languages", *Speech Communication*, Volume 56, January 2014, pp 119-131.

---

## License

This dataset is distributed under the **Creative Commons Attribution 3.0 Unported (CC BY 3.0)** license.

You are free to use, share, and adapt this dataset for any purpose, including commercial use, as long as you give appropriate credit to the original creators listed above.

Full license text: [https://creativecommons.org/licenses/by/3.0/](https://creativecommons.org/licenses/by/3.0/)

---
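The feature schema in the card above (`speaker_id`, `gender`, `duration_seconds`, and so on) suggests a simple way to inspect the corpus before committing to the full download. The following is a minimal sketch, not part of the original card: it assumes the Hugging Face `datasets` library is installed and that the repository is reachable under the ID shown on this page. The helper names `stream_metadata` and `summarize` are hypothetical, and the gender labels in the usage note are placeholders, since the card does not list the actual values.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Mapping, Optional

# Repository ID as shown on this page.
REPO_ID = "musamagwaza23/NCHLT_Siswati_Speech_Corpus"


def stream_metadata():
    """Stream the train split lazily, dropping the audio column.

    Assumption: requires network access and the third-party `datasets` package.
    """
    from datasets import load_dataset  # imported here so summarize() works offline

    ds = load_dataset(REPO_ID, split="train", streaming=True)
    # Removing the audio column keeps only the lightweight metadata fields.
    return ds.remove_columns(["audio"])


def summarize(
    rows: Iterable[Mapping[str, Any]], limit: Optional[int] = None
) -> Dict[str, Any]:
    """Aggregate utterance count, total hours, speaker count, and gender counts."""
    count = 0
    seconds = 0.0
    speakers = set()
    genders: Counter = Counter()
    for row in rows:
        if limit is not None and count >= limit:
            break
        count += 1
        seconds += row["duration_seconds"]
        speakers.add(row["speaker_id"])
        genders[row["gender"]] += 1
    return {
        "utterances": count,
        "hours": seconds / 3600.0,
        "speakers": len(speakers),
        "genders": dict(genders),
    }
```

A call such as `summarize(stream_metadata(), limit=1000)` would summarize the first 1,000 rows only; the same function also accepts any iterable of dictionaries with the three fields it reads, which makes it easy to check on hand-written rows first.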
Provider:
musamagwaza23