Scicom-intl/Multilingual-TTS-DNSMOS
Source: Hugging Face. Updated 2026-04-20; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/Scicom-intl/Multilingual-TTS-DNSMOS
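The metadata below lists one config per source corpus, each with a single `train` split and the same columns (`audio_filename`, `text`, `speaker`, raw and final `OVRL`/`SIG`/`BAK` scores, `malaysianai-subset`). As a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable, one config could be loaded and filtered on its `OVRL` score as follows. The helper names and the `3.0` threshold are illustrative assumptions, not part of the dataset card.

```python
REPO_ID = "Scicom-intl/Multilingual-TTS-DNSMOS"  # repository name from the page title


def load_config(config_name, repo_id=REPO_ID):
    """Load the `train` split of one config (hypothetical helper).

    The import is done lazily so the filtering helper below can be used
    without the `datasets` package installed.
    """
    from datasets import load_dataset

    return load_dataset(repo_id, config_name, split="train")


def filter_by_quality(rows, min_ovrl=3.0):
    """Keep rows whose `OVRL` score is at least `min_ovrl`.

    `min_ovrl=3.0` is an arbitrary illustrative threshold, not a value
    recommended by the dataset authors. Rows with a missing score are dropped.
    """
    return [
        row for row in rows
        if row["OVRL"] is not None and row["OVRL"] >= min_ovrl
    ]


if __name__ == "__main__":
    # "ClArTTS" is one of the config names listed in the metadata.
    train = load_config("ClArTTS")
    kept = filter_by_quality(train)
    print(f"kept {len(kept)} of {len(train)} rows")
```

Iterating over the split yields one dictionary per row, so the same filter works on any iterable of row dictionaries, including a small in-memory sample.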
Resource description:
---
dataset_info:
- config_name: 700h-tr-turkish-text-to-speech
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 795205
num_examples: 1965
download_size: 319225
dataset_size: 795205
- config_name: 9jalingo-hausa
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 12834821
num_examples: 57173
download_size: 4493350
dataset_size: 12834821
- config_name: 9jalingo-igbo
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 12404159
num_examples: 43863
download_size: 4956336
dataset_size: 12404159
- config_name: 9jalingo-pidgin
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 13500111
num_examples: 48650
download_size: 5304807
dataset_size: 13500111
- config_name: 9jalingo-yoruba
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 11916506
num_examples: 40932
download_size: 4671348
dataset_size: 11916506
- config_name: Alexis
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 552295
num_examples: 2294
download_size: 270619
dataset_size: 552295
- config_name: AnimeVox
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 761198
num_examples: 3374
download_size: 332897
dataset_size: 761198
- config_name: ArVoice
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 11207954
num_examples: 22755
download_size: 1588404
dataset_size: 11207954
- config_name: Arabic-Diacritized-TTS
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 4529555
num_examples: 8006
download_size: 1370433
dataset_size: 4529555
- config_name: Arabic_Diacritized_Audio_Dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 1169872
num_examples: 2826
download_size: 357217
dataset_size: 1169872
- config_name: Armenian-speech-corpus
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 7413469
num_examples: 26068
download_size: 2604922
dataset_size: 7413469
- config_name: Azerbaijani_News_TTS
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 16685294
num_examples: 39128
download_size: 6149134
dataset_size: 16685294
- config_name: Azure-TTS-Synthetic
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 219723
num_examples: 1022
download_size: 66737
dataset_size: 219723
- config_name: Azure-TTS-annotated
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 67661310
num_examples: 211753
download_size: 25617201
dataset_size: 67661310
- config_name: Changsha_Dialect_Conversational_Speech_Corpus
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 510730
num_examples: 1273
download_size: 145486
dataset_size: 510730
- config_name: ChildMandarin
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 5593647
num_examples: 28070
download_size: 2322374
dataset_size: 5593647
- config_name: ClArTTS
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 2888017
num_examples: 9689
download_size: 1237748
dataset_size: 2888017
- config_name: CommonPhoneDataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 49616302
num_examples: 123714
download_size: 14590145
dataset_size: 49616302
- config_name: DarijaTTS-clean
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 2858707
num_examples: 11200
download_size: 1133822
dataset_size: 2858707
- config_name: Dastum-yar-stt-breton-data
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 147393
num_examples: 559
download_size: 54081
dataset_size: 147393
- config_name: DisfluencySpeech
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 1503267
num_examples: 4995
download_size: 661811
dataset_size: 1503267
- config_name: Elise
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 265990
num_examples: 1091
download_size: 132281
dataset_size: 265990
- config_name: Emilia-NV
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 53495336
num_examples: 170357
download_size: 30122054
dataset_size: 53495336
- config_name: EmoVoice-DB
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 5416492
num_examples: 21983
download_size: 2271200
dataset_size: 5416492
- config_name: Enigma-Dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 298916666
num_examples: 1022006
download_size: 113191551
dataset_size: 298916666
- config_name: FalAR
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 4099369
num_examples: 8245
download_size: 2118635
dataset_size: 4099369
- config_name: IndicTTS
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 2315112
num_examples: 6884
download_size: 649664
dataset_size: 2315112
- config_name: IndicTTS_Telugu_MultiSpeaker
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 5693913
num_examples: 8476
download_size: 1959123
dataset_size: 5693913
- config_name: Japanese-Anime-Speech-v2
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 68861101
num_examples: 225749
download_size: 21842498
dataset_size: 68861101
- config_name: Lahaja
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 1552188
num_examples: 4120
download_size: 609780
dataset_size: 1552188
- config_name: MASC-Arabic
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 12376897
num_examples: 50029
download_size: 5248844
dataset_size: 12376897
- config_name: Nanchang_Dialect_Conversational_Speech_Corpus
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 461736
num_examples: 1195
download_size: 125512
dataset_size: 461736
- config_name: NepaliONE-tts
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 2329239
num_examples: 6537
download_size: 660747
dataset_size: 2329239
- config_name: OutteTTS-urdu-dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 6087694
num_examples: 18790
download_size: 2193493
dataset_size: 6087694
- config_name: ParlaSpeech-CZ
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 1751074
num_examples: 5045
download_size: 770217
dataset_size: 1751074
- config_name: ParlaSpeech-PL
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 590966
num_examples: 1761
download_size: 279640
dataset_size: 590966
- config_name: ParsiGoo
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 66309
num_examples: 201
download_size: 33363
dataset_size: 66309
- config_name: Persian-Farsi-Speech
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 54704570
num_examples: 108757
download_size: 23329656
dataset_size: 54704570
- config_name: Persian-Speech-Dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 640148
num_examples: 1855
download_size: 232112
dataset_size: 640148
- config_name: PersianVox_NM
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 13379966
num_examples: 35007
download_size: 5518941
dataset_size: 13379966
- config_name: Persian_Course_TTS
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 7304877
num_examples: 28208
download_size: 2574492
dataset_size: 7304877
- config_name: Porjai-Thai-voice-dataset-central
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 2339859
num_examples: 5064
download_size: 734779
dataset_size: 2339859
- config_name: SPRING_INX_Malayalam_R1
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 33377667
num_examples: 61462
download_size: 11063106
dataset_size: 33377667
- config_name: Shanghai_Dialect_Conversational_Speech_Corpus
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 664887
num_examples: 1668
download_size: 184324
dataset_size: 664887
- config_name: StoryTTS
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 7679115
num_examples: 29264
download_size: 3279533
dataset_size: 7679115
- config_name: Tibetan-0310
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 12151850
num_examples: 29628
download_size: 3954979
dataset_size: 12151850
- config_name: ToneWebinars
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 24256014
num_examples: 31826
download_size: 11321366
dataset_size: 24256014
- config_name: Vaani
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 34978618
num_examples: 77876
download_size: 11101743
dataset_size: 34978618
- config_name: Zhengzhou_Dialect_Conversational_Speech_Corpus
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 627879
num_examples: 1665
download_size: 159816
dataset_size: 627879
- config_name: afrispeech_afrikaans
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 539634
num_examples: 1681
download_size: 242169
dataset_size: 539634
- config_name: afvoices
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 23084977
num_examples: 108256
download_size: 9715123
dataset_size: 23084977
- config_name: amharic-speech-dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 12210762
num_examples: 26870
download_size: 4569398
dataset_size: 12210762
- config_name: amharic_cleaned_testset_verified
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 13215362
num_examples: 26409
download_size: 4635755
dataset_size: 13215362
- config_name: anta_women_tts
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 4097363
num_examples: 19107
download_size: 1671671
dataset_size: 4097363
- config_name: anv_data_ke
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 68213221
num_examples: 156069
download_size: 28542993
dataset_size: 68213221
- config_name: arknights_voices
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 4675432
num_examples: 13993
download_size: 2036331
dataset_size: 4675432
- config_name: armenian-speech-dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 1375547
num_examples: 2886
download_size: 416403
dataset_size: 1375547
- config_name: assamese_dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 7711245
num_examples: 23454
download_size: 2802799
dataset_size: 7711245
- config_name: assamese_speech_dataset1
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 550243
num_examples: 1853
download_size: 172947
dataset_size: 550243
- config_name: azerbaijani-audiobooks
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 2997818
num_examples: 7281
download_size: 1393648
dataset_size: 2997818
- config_name: azerbaijani-speech-dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 79121192
num_examples: 254226
download_size: 24648611
dataset_size: 79121192
- config_name: azerbaijani-tts-dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 5585373
num_examples: 15131
download_size: 1833947
dataset_size: 5585373
- config_name: basque_speech_dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 3919182
num_examples: 14260
download_size: 725997
dataset_size: 3919182
- config_name: belarusian-speech-dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 214670
num_examples: 448
download_size: 89105
dataset_size: 214670
- config_name: biggest-ru-book
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 154009877
num_examples: 501975
download_size: 71032612
dataset_size: 154009877
- config_name: bulgarian_tts
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 1582255
num_examples: 3891
download_size: 702375
dataset_size: 1582255
- config_name: camoes_SI
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 831098
num_examples: 3890
download_size: 343984
dataset_size: 831098
- config_name: catalan-dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 3466703
num_examples: 13276
download_size: 1327413
dataset_size: 3466703
- config_name: cmu_haitian
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 1374329
num_examples: 5151
download_size: 529356
dataset_size: 1374329
- config_name: coral-v2
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 65859633
num_examples: 239449
download_size: 26078076
dataset_size: 65859633
- config_name: coral-v3
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 65972835
num_examples: 229070
download_size: 25129834
dataset_size: 65972835
- config_name: czech_train_data
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 1197349
num_examples: 4212
download_size: 395838
dataset_size: 1197349
- config_name: egyptian-arabic-400k
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 20634298
num_examples: 74542
download_size: 7224761
dataset_size: 20634298
- config_name: expresso
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 2093362
num_examples: 9669
download_size: 680148
dataset_size: 2093362
- config_name: gemini-flash-2.0-speech
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 39314286
num_examples: 94121
download_size: 12450080
dataset_size: 39314286
- config_name: indicvoices_r
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 34415849
num_examples: 59523
download_size: 11412007
dataset_size: 34415849
- config_name: libritts_r_filtered
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 45782991
num_examples: 140555
download_size: 18544879
dataset_size: 45782991
- config_name: marathi_asr_dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 8235170
num_examples: 24795
download_size: 1690621
dataset_size: 8235170
- config_name: samromur_children
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 12419941
num_examples: 47182
download_size: 4026884
dataset_size: 12419941
- config_name: shrutilipi_sanskrit
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 5470132
num_examples: 10281
download_size: 1826626
dataset_size: 5470132
- config_name: turkish_male
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 14267473
num_examples: 49998
download_size: 6047247
dataset_size: 14267473
- config_name: ukrainian-speech-dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 209444
num_examples: 458
download_size: 88283
dataset_size: 209444
- config_name: urdu-voice-dataset
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 1514760
num_examples: 5580
download_size: 578881
dataset_size: 1514760
- config_name: uzbekvoice-2k-each-accent
features:
- name: audio_filename
dtype: large_string
- name: text
dtype: large_string
- name: speaker
dtype: large_string
- name: OVRL_raw
dtype: float64
- name: SIG_raw
dtype: float64
- name: BAK_raw
dtype: float64
- name: OVRL
dtype: float64
- name: SIG
dtype: float64
- name: BAK
dtype: float64
- name: malaysianai-subset
dtype: large_string
splits:
- name: train
num_bytes: 996539
num_examples: 3742
download_size: 334061
dataset_size: 996539
configs:
- config_name: 700h-tr-turkish-text-to-speech
data_files:
- split: train
path: 700h-tr-turkish-text-to-speech/train-*
- config_name: 9jalingo-hausa
data_files:
- split: train
path: 9jalingo-hausa/train-*
- config_name: 9jalingo-igbo
data_files:
- split: train
path: 9jalingo-igbo/train-*
- config_name: 9jalingo-pidgin
data_files:
- split: train
path: 9jalingo-pidgin/train-*
- config_name: 9jalingo-yoruba
data_files:
- split: train
path: 9jalingo-yoruba/train-*
- config_name: Alexis
data_files:
- split: train
path: Alexis/train-*
- config_name: AnimeVox
data_files:
- split: train
path: AnimeVox/train-*
- config_name: ArVoice
data_files:
- split: train
path: ArVoice/train-*
- config_name: Arabic-Diacritized-TTS
data_files:
- split: train
path: Arabic-Diacritized-TTS/train-*
- config_name: Arabic_Diacritized_Audio_Dataset
data_files:
- split: train
path: Arabic_Diacritized_Audio_Dataset/train-*
- config_name: Armenian-speech-corpus
data_files:
- split: train
path: Armenian-speech-corpus/train-*
- config_name: Azerbaijani_News_TTS
data_files:
- split: train
path: Azerbaijani_News_TTS/train-*
- config_name: Azure-TTS-Synthetic
data_files:
- split: train
path: Azure-TTS-Synthetic/train-*
- config_name: Azure-TTS-annotated
data_files:
- split: train
path: Azure-TTS-annotated/train-*
- config_name: Changsha_Dialect_Conversational_Speech_Corpus
data_files:
- split: train
path: Changsha_Dialect_Conversational_Speech_Corpus/train-*
- config_name: ChildMandarin
data_files:
- split: train
path: ChildMandarin/train-*
- config_name: ClArTTS
data_files:
- split: train
path: ClArTTS/train-*
- config_name: CommonPhoneDataset
data_files:
- split: train
path: CommonPhoneDataset/train-*
- config_name: DarijaTTS-clean
data_files:
- split: train
path: DarijaTTS-clean/train-*
- config_name: Dastum-yar-stt-breton-data
data_files:
- split: train
path: Dastum-yar-stt-breton-data/train-*
- config_name: DisfluencySpeech
data_files:
- split: train
path: DisfluencySpeech/train-*
- config_name: Elise
data_files:
- split: train
path: Elise/train-*
- config_name: Emilia-NV
data_files:
- split: train
path: Emilia-NV/train-*
- config_name: EmoVoice-DB
data_files:
- split: train
path: EmoVoice-DB/train-*
- config_name: Enigma-Dataset
data_files:
- split: train
path: Enigma-Dataset/train-*
- config_name: FalAR
data_files:
- split: train
path: FalAR/train-*
- config_name: IndicTTS
data_files:
- split: train
path: IndicTTS/train-*
- config_name: IndicTTS_Telugu_MultiSpeaker
data_files:
- split: train
path: IndicTTS_Telugu_MultiSpeaker/train-*
- config_name: Japanese-Anime-Speech-v2
data_files:
- split: train
path: Japanese-Anime-Speech-v2/train-*
- config_name: Lahaja
data_files:
- split: train
path: Lahaja/train-*
- config_name: MASC-Arabic
data_files:
- split: train
path: MASC-Arabic/train-*
- config_name: Nanchang_Dialect_Conversational_Speech_Corpus
data_files:
- split: train
path: Nanchang_Dialect_Conversational_Speech_Corpus/train-*
- config_name: NepaliONE-tts
data_files:
- split: train
path: NepaliONE-tts/train-*
- config_name: OutteTTS-urdu-dataset
data_files:
- split: train
path: OutteTTS-urdu-dataset/train-*
- config_name: ParlaSpeech-CZ
data_files:
- split: train
path: ParlaSpeech-CZ/train-*
- config_name: ParlaSpeech-PL
data_files:
- split: train
path: ParlaSpeech-PL/train-*
- config_name: ParsiGoo
data_files:
- split: train
path: ParsiGoo/train-*
- config_name: Persian-Farsi-Speech
data_files:
- split: train
path: Persian-Farsi-Speech/train-*
- config_name: Persian-Speech-Dataset
data_files:
- split: train
path: Persian-Speech-Dataset/train-*
- config_name: PersianVox_NM
data_files:
- split: train
path: PersianVox_NM/train-*
- config_name: Persian_Course_TTS
data_files:
- split: train
path: Persian_Course_TTS/train-*
- config_name: Porjai-Thai-voice-dataset-central
data_files:
- split: train
path: Porjai-Thai-voice-dataset-central/train-*
- config_name: SPRING_INX_Malayalam_R1
data_files:
- split: train
path: SPRING_INX_Malayalam_R1/train-*
- config_name: Shanghai_Dialect_Conversational_Speech_Corpus
data_files:
- split: train
path: Shanghai_Dialect_Conversational_Speech_Corpus/train-*
- config_name: StoryTTS
data_files:
- split: train
path: StoryTTS/train-*
- config_name: Tibetan-0310
data_files:
- split: train
path: Tibetan-0310/train-*
- config_name: ToneWebinars
data_files:
- split: train
path: ToneWebinars/train-*
- config_name: Vaani
data_files:
- split: train
path: Vaani/train-*
- config_name: Zhengzhou_Dialect_Conversational_Speech_Corpus
data_files:
- split: train
path: Zhengzhou_Dialect_Conversational_Speech_Corpus/train-*
- config_name: afrispeech_afrikaans
data_files:
- split: train
path: afrispeech_afrikaans/train-*
- config_name: afvoices
data_files:
- split: train
path: afvoices/train-*
- config_name: amharic-speech-dataset
data_files:
- split: train
path: amharic-speech-dataset/train-*
- config_name: amharic_cleaned_testset_verified
data_files:
- split: train
path: amharic_cleaned_testset_verified/train-*
- config_name: anta_women_tts
data_files:
- split: train
path: anta_women_tts/train-*
- config_name: anv_data_ke
data_files:
- split: train
path: anv_data_ke/train-*
- config_name: arknights_voices
data_files:
- split: train
path: arknights_voices/train-*
- config_name: armenian-speech-dataset
data_files:
- split: train
path: armenian-speech-dataset/train-*
- config_name: assamese_dataset
data_files:
- split: train
path: assamese_dataset/train-*
- config_name: assamese_speech_dataset1
data_files:
- split: train
path: assamese_speech_dataset1/train-*
- config_name: azerbaijani-audiobooks
data_files:
- split: train
path: azerbaijani-audiobooks/train-*
- config_name: azerbaijani-speech-dataset
data_files:
- split: train
path: azerbaijani-speech-dataset/train-*
- config_name: azerbaijani-tts-dataset
data_files:
- split: train
path: azerbaijani-tts-dataset/train-*
- config_name: basque_speech_dataset
data_files:
- split: train
path: basque_speech_dataset/train-*
- config_name: belarusian-speech-dataset
data_files:
- split: train
path: belarusian-speech-dataset/train-*
- config_name: biggest-ru-book
data_files:
- split: train
path: biggest-ru-book/train-*
- config_name: bulgarian_tts
data_files:
- split: train
path: bulgarian_tts/train-*
- config_name: camoes_SI
data_files:
- split: train
path: camoes_SI/train-*
- config_name: catalan-dataset
data_files:
- split: train
path: catalan-dataset/train-*
- config_name: cmu_haitian
data_files:
- split: train
path: cmu_haitian/train-*
- config_name: coral-v2
data_files:
- split: train
path: coral-v2/train-*
- config_name: coral-v3
data_files:
- split: train
path: coral-v3/train-*
- config_name: czech_train_data
data_files:
- split: train
path: czech_train_data/train-*
- config_name: egyptian-arabic-400k
data_files:
- split: train
path: egyptian-arabic-400k/train-*
- config_name: expresso
data_files:
- split: train
path: expresso/train-*
- config_name: gemini-flash-2.0-speech
data_files:
- split: train
path: gemini-flash-2.0-speech/train-*
- config_name: indicvoices_r
data_files:
- split: train
path: indicvoices_r/train-*
- config_name: libritts_r_filtered
data_files:
- split: train
path: libritts_r_filtered/train-*
- config_name: marathi_asr_dataset
data_files:
- split: train
path: marathi_asr_dataset/train-*
- config_name: samromur_children
data_files:
- split: train
path: samromur_children/train-*
- config_name: shrutilipi_sanskrit
data_files:
- split: train
path: shrutilipi_sanskrit/train-*
- config_name: turkish_male
data_files:
- split: train
path: turkish_male/train-*
- config_name: ukrainian-speech-dataset
data_files:
- split: train
path: ukrainian-speech-dataset/train-*
- config_name: urdu-voice-dataset
data_files:
- split: train
path: urdu-voice-dataset/train-*
- config_name: uzbekvoice-2k-each-accent
data_files:
- split: train
path: uzbekvoice-2k-each-accent/train-*
---
# Multilingual TTS — DNSMOS Filtered
A quality-filtered subset of [`malaysia-ai/Multilingual-TTS`](https://huggingface.co/datasets/malaysia-ai/Multilingual-TTS), retaining only audio samples that score **OVRL ≥ 3.2** on the [DNSMOS](https://github.com/microsoft/DNS-Challenge/tree/master/DNSMOS) non-intrusive speech quality metric.
After filtering, 4,911,906 samples remain, spanning multiple languages and TTS sources (84+ subsets).
## Dataset Structure
Each record in the dataset contains the following fields:
| Column | Type | Description |
|--------|------|-------------|
| `audio_filename` | `string` | Relative path to the audio file within its source subset |
| `text` | `string` | Transcript / utterance text |
| `speaker` | `string` | Speaker identifier from the source dataset |
| `OVRL_raw` | `float` | Raw DNSMOS overall MOS score |
| `SIG_raw` | `float` | Raw DNSMOS signal quality score |
| `BAK_raw` | `float` | Raw DNSMOS background noise score |
| `OVRL` | `float` | Polyfit-calibrated overall MOS score |
| `SIG` | `float` | Polyfit-calibrated signal quality score |
| `BAK` | `float` | Polyfit-calibrated background noise score |
| `malaysianai-subset` | `string` | Name of the source subset in `malaysia-ai/Multilingual-TTS` |
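
As a minimal sketch of how these fields might be read, the snippet below loads one config with the Hugging Face `datasets` library and averages the calibrated score columns. The helper names `load_subset` and `mean_scores` are illustrative and not part of the dataset; the config name used in the usage note is one of those listed in the card metadata.

```python
from statistics import mean

REPO_ID = "Scicom-intl/Multilingual-TTS-DNSMOS"
SCORE_COLUMNS = ("OVRL", "SIG", "BAK")  # calibrated DNSMOS columns


def load_subset(config_name, split="train"):
    """Load one config of the dataset (needs the `datasets` package and network access)."""
    # Imported lazily so that mean_scores() below stays dependency-free.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split=split)


def mean_scores(rows):
    """Average the calibrated score columns over an iterable of record dicts."""
    rows = list(rows)
    return {col: mean(row[col] for row in rows) for col in SCORE_COLUMNS}
```

With network access, `mean_scores(load_subset("basque_speech_dataset"))` should return one average per calibrated column for that subset.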
## Quality Filtering
Audio was scored using the **DNSMOS** model (`sig_bak_ovr.onnx`) from the [Microsoft DNS Challenge](https://github.com/microsoft/DNS-Challenge). The model outputs three non-intrusive scores:
- **SIG** — Speech signal quality (1–5)
- **BAK** — Background noise quality (1–5)
- **OVRL** — Overall audio quality (1–5)
Raw scores are further calibrated using polynomial fits to approximate ITU-T P.808 MOS values.
**Filter criterion:** `OVRL >= 3.2`
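
The calibration and threshold steps can be sketched as follows. Two assumptions are made here: the coefficients are hypothetical placeholders (the card does not list the fitted values), and the threshold is applied to the calibrated `OVRL` score, since that is the column named in the criterion.

```python
def polyval(coeffs, x):
    """Evaluate a polynomial (highest-degree coefficient first) with Horner's rule."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


# HYPOTHETICAL coefficients, for illustration only.
# The real values come from the DNSMOS polynomial fit and are not given in this card.
EXAMPLE_OVRL_COEFFS = (-0.05, 1.1, 0.1)
OVRL_THRESHOLD = 3.2


def passes_filter(ovrl_raw, coeffs=EXAMPLE_OVRL_COEFFS, threshold=OVRL_THRESHOLD):
    """Calibrate a raw OVRL score, then apply the retention threshold."""
    return polyval(coeffs, ovrl_raw) >= threshold
```

Keeping both the raw and calibrated columns in the dataset means a different threshold can be applied later without rescoring the audio.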
| Stage | Count |
|-------|-------|
| Total scored | 8,272,123 |
| Passed (OVRL ≥ 3.2) | 5,837,580 |
| After metadata join | 4,911,906 |
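
The retention at each stage follows directly from these counts. The short calculation below uses only the three figures in the table; the variable names are illustrative.

```python
TOTAL_SCORED = 8_272_123
PASSED_FILTER = 5_837_580
AFTER_JOIN = 4_911_906


def share(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


pass_rate = share(PASSED_FILTER, TOTAL_SCORED)   # share of scored clips with OVRL >= 3.2
join_rate = share(AFTER_JOIN, PASSED_FILTER)     # share of passing clips kept by the metadata join
overall_rate = share(AFTER_JOIN, TOTAL_SCORED)   # share of scored clips in the final dataset
lost_in_join = PASSED_FILTER - AFTER_JOIN        # passing clips without matching metadata

print(f"passed filter: {pass_rate:.1f}%")
print(f"kept by join:  {join_rate:.1f}%")
print(f"overall:       {overall_rate:.1f}%")
print(f"lost in join:  {lost_in_join:,}")
```

Roughly 70.6% of scored clips pass the threshold, and about 84.1% of those survive the metadata join, so about 59.4% of all scored clips end up in the dataset.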
## Source Data
This dataset is derived from [`malaysia-ai/Multilingual-TTS`](https://huggingface.co/datasets/malaysia-ai/Multilingual-TTS), a multilingual TTS corpus covering a wide range of languages. Refer to the source dataset for licensing details of individual subsets.
## Scoring Pipeline
Scoring was performed with a multi-stage parallel pipeline:
1. **Download & extract** — Zip archives are fetched from HuggingFace one at a time with background prefetch.
2. **Preprocess** — Audio is resampled to 16 kHz and padded/tiled to the minimum DNSMOS input length (9.01 s).
3. **Inference** — ONNX sessions run inference over 1-second hops; scores are averaged across hops.
4. **Save** — Results are appended to a JSONL file. The pipeline is resumable via a completed-zips cache.
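
Steps 2 and 3 can be sketched as below, assuming audio already resampled to 16 kHz. This is an illustration of the tiling and hop-averaging logic, not the pipeline's actual code: `score_window` is a hypothetical callable standing in for the ONNX session, and short clips are tiled by whole repetitions of the signal.

```python
import math

SAMPLE_RATE = 16_000                          # Hz, after resampling
INPUT_LENGTH_S = 9.01                         # minimum DNSMOS input length in seconds
WINDOW = round(INPUT_LENGTH_S * SAMPLE_RATE)  # samples per model window


def tile_to_min_length(samples, min_len=WINDOW):
    """Repeat a short signal until it covers at least one model window."""
    if not samples:
        raise ValueError("empty audio")
    if len(samples) >= min_len:
        return list(samples)
    reps = math.ceil(min_len / len(samples))
    return list(samples) * reps


def hop_starts(num_samples, hop=SAMPLE_RATE, window=WINDOW):
    """Start offsets of every full window, advancing by one second per hop."""
    if num_samples < window:
        return []
    return list(range(0, num_samples - window + 1, hop))


def score_clip(samples, score_window):
    """Average a per-window score over all 1-second hops of a clip."""
    audio = tile_to_min_length(samples)
    starts = hop_starts(len(audio))
    scores = [score_window(audio[s:s + WINDOW]) for s in starts]
    return sum(scores) / len(scores)
```

For example, a 1-second clip is tiled to 10 seconds and yields a single window, while a 20-second clip yields 11 overlapping windows whose scores are averaged.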
Provided by: Scicom-intl



