Scicom-intl/Malaysian-Emilia-Sidon
Source: Hugging Face | Updated: 2026-02-06 | Indexed: 2026-03-29
Download link:
https://hf-mirror.com/datasets/Scicom-intl/Malaysian-Emilia-Sidon
Description:
---
language:
- ms
- ta
- zh
configs:
- config_name: chinese
  data_files:
  - split: train
    path: chinese/train-*
- config_name: klasik
  data_files:
  - split: train
    path: klasik/train-*
- config_name: malaysia_parliament
  data_files:
  - split: train
    path: malaysia_parliament/train-*
- config_name: malaysian_cartoon
  data_files:
  - split: train
    path: malaysian_cartoon/train-*
- config_name: malaysian_podcast
  data_files:
  - split: train
    path: malaysian_podcast/train-*
- config_name: sg_podcast
  data_files:
  - split: train
    path: sg_podcast/train-*
- config_name: tamil
  data_files:
  - split: train
    path: tamil/train-*
dataset_info:
- config_name: chinese
  features:
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  splits:
  - name: train
    num_bytes: 371228254
    num_examples: 605169
  download_size: 102644558
  dataset_size: 371228254
- config_name: klasik
  features:
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  splits:
  - name: train
    num_bytes: 4382360
    num_examples: 10369
  download_size: 1286997
  dataset_size: 4382360
- config_name: malaysia_parliament
  features:
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  splits:
  - name: train
    num_bytes: 523171337
    num_examples: 897610
  download_size: 110763350
  dataset_size: 523171337
- config_name: malaysian_cartoon
  features:
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  splits:
  - name: train
    num_bytes: 33524690
    num_examples: 69086
  download_size: 8263727
  dataset_size: 33524690
- config_name: malaysian_podcast
  features:
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  splits:
  - name: train
    num_bytes: 187617566
    num_examples: 356815
  download_size: 51106201
  dataset_size: 187617566
- config_name: sg_podcast
  features:
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  splits:
  - name: train
    num_bytes: 130525356
    num_examples: 238134
  download_size: 36013718
  dataset_size: 130525356
- config_name: tamil
  features:
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  splits:
  - name: train
    num_bytes: 206460425
    num_examples: 254551
  download_size: 50287793
  dataset_size: 206460425
---
# Malaysian-Emilia-Sidon
This dataset was produced by applying [sarulab-speech/sidon-v0.1](https://huggingface.co/sarulab-speech/sidon-v0.1) to the following datasets:
1. https://huggingface.co/datasets/mesolitica/Malaysian-Emilia-v2
2. https://huggingface.co/datasets/Scicom-intl/Malaysian-Chinese-Emilia
3. https://huggingface.co/datasets/Scicom-intl/Malaysian-Tamil-Emilia
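
The metadata above lists seven configs, each with a single `train` split and the same eight columns (the rows carry an `audio_filename` string rather than an embedded audio feature). A minimal sketch of loading one config and filtering segments might look like the following. It assumes the `datasets` library is installed, that `start` and `end` are offsets in seconds, and that a higher `dnsmos` means cleaner audio; the helper names and the thresholds `3.0` and `30.0` are illustrative choices, not values from the dataset authors.

```python
from typing import Any, Dict

REPO_ID = "Scicom-intl/Malaysian-Emilia-Sidon"

# Config names as listed in the card's YAML metadata.
CONFIGS = [
    "chinese",
    "klasik",
    "malaysia_parliament",
    "malaysian_cartoon",
    "malaysian_podcast",
    "sg_podcast",
    "tamil",
]


def segment_duration(row: Dict[str, Any]) -> float:
    """Length of a segment, assuming `start` and `end` are in seconds."""
    return float(row["end"]) - float(row["start"])


def keep_segment(
    row: Dict[str, Any],
    min_dnsmos: float = 3.0,    # illustrative threshold (assumption)
    max_seconds: float = 30.0,  # illustrative threshold (assumption)
) -> bool:
    """Keep rows with a DNSMOS score and a positive, bounded duration."""
    score = row.get("dnsmos")
    if score is None:
        return False
    return score >= min_dnsmos and 0.0 < segment_duration(row) <= max_seconds


def load_config(name: str, streaming: bool = True):
    """Load the `train` split of one config; streaming avoids a full download."""
    if name not in CONFIGS:
        raise ValueError(f"unknown config: {name!r}")
    from datasets import load_dataset  # imported lazily; requires network access

    return load_dataset(REPO_ID, name, split="train", streaming=streaming)


def preview(name: str = "klasik", n: int = 3) -> None:
    """Print a few filtered rows from one config."""
    for row in load_config(name).filter(keep_segment).take(n):
        print(row["audio_filename"], round(segment_duration(row), 2), row["text"])
```

Calling `preview("klasik")` streams the smallest config (10,369 rows according to the metadata) and prints three segments that pass the filter. The other configs load the same way by passing their name.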
Provider:
Scicom-intl



