
Scicom-intl/Malaysian-Emilia-Sidon

Source: Hugging Face. Updated: 2026-02-06. Indexed: 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Scicom-intl/Malaysian-Emilia-Sidon
Dataset description:
---
language:
- ms
- ta
- zh
configs:
- config_name: chinese
  data_files:
  - split: train
    path: chinese/train-*
- config_name: klasik
  data_files:
  - split: train
    path: klasik/train-*
- config_name: malaysia_parliament
  data_files:
  - split: train
    path: malaysia_parliament/train-*
- config_name: malaysian_cartoon
  data_files:
  - split: train
    path: malaysian_cartoon/train-*
- config_name: malaysian_podcast
  data_files:
  - split: train
    path: malaysian_podcast/train-*
- config_name: sg_podcast
  data_files:
  - split: train
    path: sg_podcast/train-*
- config_name: tamil
  data_files:
  - split: train
    path: tamil/train-*
dataset_info:
- config_name: chinese
  features:
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  splits:
  - name: train
    num_bytes: 371228254
    num_examples: 605169
  download_size: 102644558
  dataset_size: 371228254
- config_name: klasik
  features:
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  splits:
  - name: train
    num_bytes: 4382360
    num_examples: 10369
  download_size: 1286997
  dataset_size: 4382360
- config_name: malaysia_parliament
  features:
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  splits:
  - name: train
    num_bytes: 523171337
    num_examples: 897610
  download_size: 110763350
  dataset_size: 523171337
- config_name: malaysian_cartoon
  features:
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  splits:
  - name: train
    num_bytes: 33524690
    num_examples: 69086
  download_size: 8263727
  dataset_size: 33524690
- config_name: malaysian_podcast
  features:
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  splits:
  - name: train
    num_bytes: 187617566
    num_examples: 356815
  download_size: 51106201
  dataset_size: 187617566
- config_name: sg_podcast
  features:
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  splits:
  - name: train
    num_bytes: 130525356
    num_examples: 238134
  download_size: 36013718
  dataset_size: 130525356
- config_name: tamil
  features:
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  splits:
  - name: train
    num_bytes: 206460425
    num_examples: 254551
  download_size: 50287793
  dataset_size: 206460425
---

# Malaysian-Emilia-Sidon

[sarulab-speech/sidon-v0.1](https://huggingface.co/sarulab-speech/sidon-v0.1) applied to the following datasets:

1. https://huggingface.co/datasets/mesolitica/Malaysian-Emilia-v2
2. https://huggingface.co/datasets/Scicom-intl/Malaysian-Chinese-Emilia
3. https://huggingface.co/datasets/Scicom-intl/Malaysian-Tamil-Emilia
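
The configs and features listed in the metadata above suggest a simple way to pull one subset and keep only the cleaner, shorter segments. The following is a minimal sketch, not an official loader. It assumes the Hugging Face `datasets` library is installed, that `start` and `end` are in seconds, and that a higher `dnsmos` means better perceived quality (the card states none of these). The helper names `segment_duration`, `keep_segment`, `load_config`, and `preview`, as well as the thresholds `3.0` and `30.0`, are hypothetical choices for illustration.

```python
from itertools import islice

REPO_ID = "Scicom-intl/Malaysian-Emilia-Sidon"

# Config names copied from the dataset card metadata.
CONFIGS = [
    "chinese",
    "klasik",
    "malaysia_parliament",
    "malaysian_cartoon",
    "malaysian_podcast",
    "sg_podcast",
    "tamil",
]


def segment_duration(row: dict) -> float:
    """Length of one segment; assumes `start` and `end` are in seconds."""
    return row["end"] - row["start"]


def keep_segment(row: dict, min_dnsmos: float = 3.0, max_seconds: float = 30.0) -> bool:
    """Keep rows at or above a DNSMOS threshold and within a duration cap.

    Both thresholds are illustrative assumptions, not values from the card.
    """
    duration = segment_duration(row)
    return row["dnsmos"] >= min_dnsmos and 0.0 < duration <= max_seconds


def load_config(config_name: str, streaming: bool = True):
    """Load the `train` split of one config (each config has only `train`)."""
    if config_name not in CONFIGS:
        raise ValueError(f"unknown config: {config_name!r}")
    # Imported lazily so the pure helpers above work without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split="train", streaming=streaming)


def preview(config_name: str, n: int = 5) -> list:
    """Return up to `n` rows of a config that pass `keep_segment`."""
    rows = load_config(config_name, streaming=True)
    return list(islice(filter(keep_segment, rows), n))
```

Calling `preview("klasik")` would stream the smallest config (10,369 rows according to the metadata) rather than downloading it in full. The rows carry `audio_filename` and `folder` strings rather than an audio feature, so locating the audio itself is outside the scope of this sketch.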
Provider:
Scicom-intl