
Scicom-intl/Multilingual-TTS-DNSMOS

Hugging Face | Updated 2026-04-20 | Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/Scicom-intl/Multilingual-TTS-DNSMOS
Resource description:
--- dataset_info: - config_name: 700h-tr-turkish-text-to-speech features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 795205 num_examples: 1965 download_size: 319225 dataset_size: 795205 - config_name: 9jalingo-hausa features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 12834821 num_examples: 57173 download_size: 4493350 dataset_size: 12834821 - config_name: 9jalingo-igbo features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 12404159 num_examples: 43863 download_size: 4956336 dataset_size: 12404159 - config_name: 9jalingo-pidgin features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 13500111 num_examples: 48650 download_size: 5304807 dataset_size: 13500111 - config_name: 9jalingo-yoruba 
features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 11916506 num_examples: 40932 download_size: 4671348 dataset_size: 11916506 - config_name: Alexis features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 552295 num_examples: 2294 download_size: 270619 dataset_size: 552295 - config_name: AnimeVox features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 761198 num_examples: 3374 download_size: 332897 dataset_size: 761198 - config_name: ArVoice features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 11207954 num_examples: 22755 download_size: 1588404 dataset_size: 11207954 - config_name: Arabic-Diacritized-TTS features: - name: audio_filename dtype: large_string - name: text dtype: large_string - 
name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 4529555 num_examples: 8006 download_size: 1370433 dataset_size: 4529555 - config_name: Arabic_Diacritized_Audio_Dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 1169872 num_examples: 2826 download_size: 357217 dataset_size: 1169872 - config_name: Armenian-speech-corpus features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 7413469 num_examples: 26068 download_size: 2604922 dataset_size: 7413469 - config_name: Azerbaijani_News_TTS features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 16685294 num_examples: 39128 download_size: 6149134 dataset_size: 16685294 - config_name: Azure-TTS-Synthetic features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - 
name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 219723 num_examples: 1022 download_size: 66737 dataset_size: 219723 - config_name: Azure-TTS-annotated features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 67661310 num_examples: 211753 download_size: 25617201 dataset_size: 67661310 - config_name: Changsha_Dialect_Conversational_Speech_Corpus features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 510730 num_examples: 1273 download_size: 145486 dataset_size: 510730 - config_name: ChildMandarin features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 5593647 num_examples: 28070 download_size: 2322374 dataset_size: 5593647 - config_name: ClArTTS features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: 
float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 2888017 num_examples: 9689 download_size: 1237748 dataset_size: 2888017 - config_name: CommonPhoneDataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 49616302 num_examples: 123714 download_size: 14590145 dataset_size: 49616302 - config_name: DarijaTTS-clean features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 2858707 num_examples: 11200 download_size: 1133822 dataset_size: 2858707 - config_name: Dastum-yar-stt-breton-data features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 147393 num_examples: 559 download_size: 54081 dataset_size: 147393 - config_name: DisfluencySpeech features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: 
float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 1503267 num_examples: 4995 download_size: 661811 dataset_size: 1503267 - config_name: Elise features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 265990 num_examples: 1091 download_size: 132281 dataset_size: 265990 - config_name: Emilia-NV features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 53495336 num_examples: 170357 download_size: 30122054 dataset_size: 53495336 - config_name: EmoVoice-DB features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 5416492 num_examples: 21983 download_size: 2271200 dataset_size: 5416492 - config_name: Enigma-Dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset 
dtype: large_string splits: - name: train num_bytes: 298916666 num_examples: 1022006 download_size: 113191551 dataset_size: 298916666 - config_name: FalAR features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 4099369 num_examples: 8245 download_size: 2118635 dataset_size: 4099369 - config_name: IndicTTS features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 2315112 num_examples: 6884 download_size: 649664 dataset_size: 2315112 - config_name: IndicTTS_Telugu_MultiSpeaker features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 5693913 num_examples: 8476 download_size: 1959123 dataset_size: 5693913 - config_name: Japanese-Anime-Speech-v2 features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 68861101 
num_examples: 225749 download_size: 21842498 dataset_size: 68861101 - config_name: Lahaja features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 1552188 num_examples: 4120 download_size: 609780 dataset_size: 1552188 - config_name: MASC-Arabic features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 12376897 num_examples: 50029 download_size: 5248844 dataset_size: 12376897 - config_name: Nanchang_Dialect_Conversational_Speech_Corpus features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 461736 num_examples: 1195 download_size: 125512 dataset_size: 461736 - config_name: NepaliONE-tts features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 2329239 num_examples: 6537 download_size: 660747 dataset_size: 
2329239 - config_name: OutteTTS-urdu-dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 6087694 num_examples: 18790 download_size: 2193493 dataset_size: 6087694 - config_name: ParlaSpeech-CZ features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 1751074 num_examples: 5045 download_size: 770217 dataset_size: 1751074 - config_name: ParlaSpeech-PL features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 590966 num_examples: 1761 download_size: 279640 dataset_size: 590966 - config_name: ParsiGoo features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 66309 num_examples: 201 download_size: 33363 dataset_size: 66309 - config_name: Persian-Farsi-Speech features: - name: audio_filename dtype: 
large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 54704570 num_examples: 108757 download_size: 23329656 dataset_size: 54704570 - config_name: Persian-Speech-Dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 640148 num_examples: 1855 download_size: 232112 dataset_size: 640148 - config_name: PersianVox_NM features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 13379966 num_examples: 35007 download_size: 5518941 dataset_size: 13379966 - config_name: Persian_Course_TTS features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 7304877 num_examples: 28208 download_size: 2574492 dataset_size: 7304877 - config_name: Porjai-Thai-voice-dataset-central features: - name: audio_filename dtype: large_string - name: text dtype: 
large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 2339859 num_examples: 5064 download_size: 734779 dataset_size: 2339859 - config_name: SPRING_INX_Malayalam_R1 features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 33377667 num_examples: 61462 download_size: 11063106 dataset_size: 33377667 - config_name: Shanghai_Dialect_Conversational_Speech_Corpus features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 664887 num_examples: 1668 download_size: 184324 dataset_size: 664887 - config_name: StoryTTS features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 7679115 num_examples: 29264 download_size: 3279533 dataset_size: 7679115 - config_name: Tibetan-0310 features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: 
large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 12151850 num_examples: 29628 download_size: 3954979 dataset_size: 12151850 - config_name: ToneWebinars features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 24256014 num_examples: 31826 download_size: 11321366 dataset_size: 24256014 - config_name: Vaani features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 34978618 num_examples: 77876 download_size: 11101743 dataset_size: 34978618 - config_name: Zhengzhou_Dialect_Conversational_Speech_Corpus features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 627879 num_examples: 1665 download_size: 159816 dataset_size: 627879 - config_name: afrispeech_afrikaans features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 
- name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 539634 num_examples: 1681 download_size: 242169 dataset_size: 539634 - config_name: afvoices features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 23084977 num_examples: 108256 download_size: 9715123 dataset_size: 23084977 - config_name: amharic-speech-dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 12210762 num_examples: 26870 download_size: 4569398 dataset_size: 12210762 - config_name: amharic_cleaned_testset_verified features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 13215362 num_examples: 26409 download_size: 4635755 dataset_size: 13215362 - config_name: anta_women_tts features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw 
dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 4097363 num_examples: 19107 download_size: 1671671 dataset_size: 4097363 - config_name: anv_data_ke features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 68213221 num_examples: 156069 download_size: 28542993 dataset_size: 68213221 - config_name: arknights_voices features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 4675432 num_examples: 13993 download_size: 2036331 dataset_size: 4675432 - config_name: armenian-speech-dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 1375547 num_examples: 2886 download_size: 416403 dataset_size: 1375547 - config_name: assamese_dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: 
float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 7711245 num_examples: 23454 download_size: 2802799 dataset_size: 7711245 - config_name: assamese_speech_dataset1 features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 550243 num_examples: 1853 download_size: 172947 dataset_size: 550243 - config_name: azerbaijani-audiobooks features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 2997818 num_examples: 7281 download_size: 1393648 dataset_size: 2997818 - config_name: azerbaijani-speech-dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 79121192 num_examples: 254226 download_size: 24648611 dataset_size: 79121192 - config_name: azerbaijani-tts-dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - 
name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 5585373 num_examples: 15131 download_size: 1833947 dataset_size: 5585373 - config_name: basque_speech_dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 3919182 num_examples: 14260 download_size: 725997 dataset_size: 3919182 - config_name: belarusian-speech-dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 214670 num_examples: 448 download_size: 89105 dataset_size: 214670 - config_name: biggest-ru-book features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 154009877 num_examples: 501975 download_size: 71032612 dataset_size: 154009877 - config_name: bulgarian_tts features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - 
name: train num_bytes: 1582255 num_examples: 3891 download_size: 702375 dataset_size: 1582255 - config_name: camoes_SI features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 831098 num_examples: 3890 download_size: 343984 dataset_size: 831098 - config_name: catalan-dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 3466703 num_examples: 13276 download_size: 1327413 dataset_size: 3466703 - config_name: cmu_haitian features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 1374329 num_examples: 5151 download_size: 529356 dataset_size: 1374329 - config_name: coral-v2 features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 65859633 num_examples: 239449 download_size: 26078076 dataset_size: 
65859633 - config_name: coral-v3 features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 65972835 num_examples: 229070 download_size: 25129834 dataset_size: 65972835 - config_name: czech_train_data features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 1197349 num_examples: 4212 download_size: 395838 dataset_size: 1197349 - config_name: egyptian-arabic-400k features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 20634298 num_examples: 74542 download_size: 7224761 dataset_size: 20634298 - config_name: expresso features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 2093362 num_examples: 9669 download_size: 680148 dataset_size: 2093362 - config_name: gemini-flash-2.0-speech features: - name: 
audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 39314286 num_examples: 94121 download_size: 12450080 dataset_size: 39314286 - config_name: indicvoices_r features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 34415849 num_examples: 59523 download_size: 11412007 dataset_size: 34415849 - config_name: libritts_r_filtered features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 45782991 num_examples: 140555 download_size: 18544879 dataset_size: 45782991 - config_name: marathi_asr_dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 8235170 num_examples: 24795 download_size: 1690621 dataset_size: 8235170 - config_name: samromur_children features: - name: audio_filename dtype: large_string - name: text 
dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 12419941 num_examples: 47182 download_size: 4026884 dataset_size: 12419941 - config_name: shrutilipi_sanskrit features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 5470132 num_examples: 10281 download_size: 1826626 dataset_size: 5470132 - config_name: turkish_male features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 14267473 num_examples: 49998 download_size: 6047247 dataset_size: 14267473 - config_name: ukrainian-speech-dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 209444 num_examples: 458 download_size: 88283 dataset_size: 209444 - config_name: urdu-voice-dataset features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - 
name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 1514760 num_examples: 5580 download_size: 578881 dataset_size: 1514760 - config_name: uzbekvoice-2k-each-accent features: - name: audio_filename dtype: large_string - name: text dtype: large_string - name: speaker dtype: large_string - name: OVRL_raw dtype: float64 - name: SIG_raw dtype: float64 - name: BAK_raw dtype: float64 - name: OVRL dtype: float64 - name: SIG dtype: float64 - name: BAK dtype: float64 - name: malaysianai-subset dtype: large_string splits: - name: train num_bytes: 996539 num_examples: 3742 download_size: 334061 dataset_size: 996539 configs: - config_name: 700h-tr-turkish-text-to-speech data_files: - split: train path: 700h-tr-turkish-text-to-speech/train-* - config_name: 9jalingo-hausa data_files: - split: train path: 9jalingo-hausa/train-* - config_name: 9jalingo-igbo data_files: - split: train path: 9jalingo-igbo/train-* - config_name: 9jalingo-pidgin data_files: - split: train path: 9jalingo-pidgin/train-* - config_name: 9jalingo-yoruba data_files: - split: train path: 9jalingo-yoruba/train-* - config_name: Alexis data_files: - split: train path: Alexis/train-* - config_name: AnimeVox data_files: - split: train path: AnimeVox/train-* - config_name: ArVoice data_files: - split: train path: ArVoice/train-* - config_name: Arabic-Diacritized-TTS data_files: - split: train path: Arabic-Diacritized-TTS/train-* - config_name: Arabic_Diacritized_Audio_Dataset data_files: - split: train path: Arabic_Diacritized_Audio_Dataset/train-* - config_name: Armenian-speech-corpus data_files: - split: train path: Armenian-speech-corpus/train-* - config_name: Azerbaijani_News_TTS data_files: - split: train path: Azerbaijani_News_TTS/train-* - config_name: Azure-TTS-Synthetic data_files: - split: train path: 
Azure-TTS-Synthetic/train-* - config_name: Azure-TTS-annotated data_files: - split: train path: Azure-TTS-annotated/train-* - config_name: Changsha_Dialect_Conversational_Speech_Corpus data_files: - split: train path: Changsha_Dialect_Conversational_Speech_Corpus/train-* - config_name: ChildMandarin data_files: - split: train path: ChildMandarin/train-* - config_name: ClArTTS data_files: - split: train path: ClArTTS/train-* - config_name: CommonPhoneDataset data_files: - split: train path: CommonPhoneDataset/train-* - config_name: DarijaTTS-clean data_files: - split: train path: DarijaTTS-clean/train-* - config_name: Dastum-yar-stt-breton-data data_files: - split: train path: Dastum-yar-stt-breton-data/train-* - config_name: DisfluencySpeech data_files: - split: train path: DisfluencySpeech/train-* - config_name: Elise data_files: - split: train path: Elise/train-* - config_name: Emilia-NV data_files: - split: train path: Emilia-NV/train-* - config_name: EmoVoice-DB data_files: - split: train path: EmoVoice-DB/train-* - config_name: Enigma-Dataset data_files: - split: train path: Enigma-Dataset/train-* - config_name: FalAR data_files: - split: train path: FalAR/train-* - config_name: IndicTTS data_files: - split: train path: IndicTTS/train-* - config_name: IndicTTS_Telugu_MultiSpeaker data_files: - split: train path: IndicTTS_Telugu_MultiSpeaker/train-* - config_name: Japanese-Anime-Speech-v2 data_files: - split: train path: Japanese-Anime-Speech-v2/train-* - config_name: Lahaja data_files: - split: train path: Lahaja/train-* - config_name: MASC-Arabic data_files: - split: train path: MASC-Arabic/train-* - config_name: Nanchang_Dialect_Conversational_Speech_Corpus data_files: - split: train path: Nanchang_Dialect_Conversational_Speech_Corpus/train-* - config_name: NepaliONE-tts data_files: - split: train path: NepaliONE-tts/train-* - config_name: OutteTTS-urdu-dataset data_files: - split: train path: OutteTTS-urdu-dataset/train-* - config_name: ParlaSpeech-CZ 
data_files: - split: train path: ParlaSpeech-CZ/train-* - config_name: ParlaSpeech-PL data_files: - split: train path: ParlaSpeech-PL/train-* - config_name: ParsiGoo data_files: - split: train path: ParsiGoo/train-* - config_name: Persian-Farsi-Speech data_files: - split: train path: Persian-Farsi-Speech/train-* - config_name: Persian-Speech-Dataset data_files: - split: train path: Persian-Speech-Dataset/train-* - config_name: PersianVox_NM data_files: - split: train path: PersianVox_NM/train-* - config_name: Persian_Course_TTS data_files: - split: train path: Persian_Course_TTS/train-* - config_name: Porjai-Thai-voice-dataset-central data_files: - split: train path: Porjai-Thai-voice-dataset-central/train-* - config_name: SPRING_INX_Malayalam_R1 data_files: - split: train path: SPRING_INX_Malayalam_R1/train-* - config_name: Shanghai_Dialect_Conversational_Speech_Corpus data_files: - split: train path: Shanghai_Dialect_Conversational_Speech_Corpus/train-* - config_name: StoryTTS data_files: - split: train path: StoryTTS/train-* - config_name: Tibetan-0310 data_files: - split: train path: Tibetan-0310/train-* - config_name: ToneWebinars data_files: - split: train path: ToneWebinars/train-* - config_name: Vaani data_files: - split: train path: Vaani/train-* - config_name: Zhengzhou_Dialect_Conversational_Speech_Corpus data_files: - split: train path: Zhengzhou_Dialect_Conversational_Speech_Corpus/train-* - config_name: afrispeech_afrikaans data_files: - split: train path: afrispeech_afrikaans/train-* - config_name: afvoices data_files: - split: train path: afvoices/train-* - config_name: amharic-speech-dataset data_files: - split: train path: amharic-speech-dataset/train-* - config_name: amharic_cleaned_testset_verified data_files: - split: train path: amharic_cleaned_testset_verified/train-* - config_name: anta_women_tts data_files: - split: train path: anta_women_tts/train-* - config_name: anv_data_ke data_files: - split: train path: anv_data_ke/train-* - 
config_name: arknights_voices data_files: - split: train path: arknights_voices/train-* - config_name: armenian-speech-dataset data_files: - split: train path: armenian-speech-dataset/train-* - config_name: assamese_dataset data_files: - split: train path: assamese_dataset/train-* - config_name: assamese_speech_dataset1 data_files: - split: train path: assamese_speech_dataset1/train-* - config_name: azerbaijani-audiobooks data_files: - split: train path: azerbaijani-audiobooks/train-* - config_name: azerbaijani-speech-dataset data_files: - split: train path: azerbaijani-speech-dataset/train-* - config_name: azerbaijani-tts-dataset data_files: - split: train path: azerbaijani-tts-dataset/train-* - config_name: basque_speech_dataset data_files: - split: train path: basque_speech_dataset/train-* - config_name: belarusian-speech-dataset data_files: - split: train path: belarusian-speech-dataset/train-* - config_name: biggest-ru-book data_files: - split: train path: biggest-ru-book/train-* - config_name: bulgarian_tts data_files: - split: train path: bulgarian_tts/train-* - config_name: camoes_SI data_files: - split: train path: camoes_SI/train-* - config_name: catalan-dataset data_files: - split: train path: catalan-dataset/train-* - config_name: cmu_haitian data_files: - split: train path: cmu_haitian/train-* - config_name: coral-v2 data_files: - split: train path: coral-v2/train-* - config_name: coral-v3 data_files: - split: train path: coral-v3/train-* - config_name: czech_train_data data_files: - split: train path: czech_train_data/train-* - config_name: egyptian-arabic-400k data_files: - split: train path: egyptian-arabic-400k/train-* - config_name: expresso data_files: - split: train path: expresso/train-* - config_name: gemini-flash-2.0-speech data_files: - split: train path: gemini-flash-2.0-speech/train-* - config_name: indicvoices_r data_files: - split: train path: indicvoices_r/train-* - config_name: libritts_r_filtered data_files: - split: train path: 
libritts_r_filtered/train-* - config_name: marathi_asr_dataset data_files: - split: train path: marathi_asr_dataset/train-* - config_name: samromur_children data_files: - split: train path: samromur_children/train-* - config_name: shrutilipi_sanskrit data_files: - split: train path: shrutilipi_sanskrit/train-* - config_name: turkish_male data_files: - split: train path: turkish_male/train-* - config_name: ukrainian-speech-dataset data_files: - split: train path: ukrainian-speech-dataset/train-* - config_name: urdu-voice-dataset data_files: - split: train path: urdu-voice-dataset/train-* - config_name: uzbekvoice-2k-each-accent data_files: - split: train path: uzbekvoice-2k-each-accent/train-* ---

# Multilingual TTS: DNSMOS Filtered

A quality-filtered subset of [`malaysia-ai/Multilingual-TTS`](https://huggingface.co/datasets/malaysia-ai/Multilingual-TTS), retaining only audio samples that score **OVRL ≥ 3.2** on the [DNSMOS](https://github.com/microsoft/DNS-Challenge/tree/master/DNSMOS) non-intrusive speech quality metric.

After filtering, 4,911,906 samples remain across 84+ subsets covering multiple languages and TTS sources.
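As a minimal sketch of how one subset might be loaded and filtered further, the snippet below assumes the Hugging Face `datasets` library is installed and that the repository id matches this page. The helper names `keep_row` and `load_subset` are illustrative and not part of the dataset.

```python
from typing import Any, Dict

REPO_ID = "Scicom-intl/Multilingual-TTS-DNSMOS"
DEFAULT_THRESHOLD = 3.2  # the OVRL cut-off stated in this card


def keep_row(row: Dict[str, Any], threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Return True when the calibrated overall score meets the threshold."""
    return row["OVRL"] >= threshold


def load_subset(config_name: str, threshold: float = DEFAULT_THRESHOLD):
    """Load one config's train split and keep rows at or above the threshold.

    Hypothetical helper: every row already satisfies OVRL >= 3.2, so this
    only changes the result when a stricter threshold is passed.
    """
    # Imported lazily so keep_row can be used without the library installed.
    from datasets import load_dataset

    ds = load_dataset(REPO_ID, config_name, split="train")
    return ds.filter(lambda row: keep_row(row, threshold))
```

For example, `load_subset("expresso", threshold=3.5)` would return only the `expresso` rows whose calibrated `OVRL` is at least 3.5; the value 3.5 is an arbitrary illustration.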
## Dataset Structure

Each record contains the following fields:

| Column | Type | Description |
|--------|------|-------------|
| `audio_filename` | `string` | Relative path to the audio file within its source subset |
| `text` | `string` | Transcript / utterance text |
| `speaker` | `string` | Speaker identifier from the source dataset |
| `OVRL_raw` | `float` | Raw DNSMOS overall MOS score |
| `SIG_raw` | `float` | Raw DNSMOS signal quality score |
| `BAK_raw` | `float` | Raw DNSMOS background noise score |
| `OVRL` | `float` | Polyfit-calibrated overall MOS score |
| `SIG` | `float` | Polyfit-calibrated signal quality score |
| `BAK` | `float` | Polyfit-calibrated background noise score |
| `malaysianai-subset` | `string` | Source dataset name |

## Quality Filtering

Audio was scored using the **DNSMOS** model (`sig_bak_ovr.onnx`) from the [Microsoft DNS Challenge](https://github.com/microsoft/DNS-Challenge). The model outputs three non-intrusive scores:

- **SIG**: Speech signal quality (1–5)
- **BAK**: Background noise quality (1–5)
- **OVRL**: Overall audio quality (1–5)

Raw scores are then calibrated with polynomial fits to approximate ITU-T P.808 MOS values.

**Filter criterion:** `OVRL >= 3.2`

| Stage | Count |
|-------|-------|
| Total scored | 8,272,123 |
| Passed (OVRL ≥ 3.2) | 5,837,580 |
| After metadata join | 4,911,906 |

## Source Data

This dataset is derived from [`malaysia-ai/Multilingual-TTS`](https://huggingface.co/datasets/malaysia-ai/Multilingual-TTS), a multilingual TTS corpus covering a wide range of languages. Refer to the source dataset for licensing details of individual subsets.

## Scoring Pipeline

Scoring was performed with a multi-stage parallel pipeline:

1. **Download & extract**: Zip archives are fetched from Hugging Face one at a time with background prefetch.
2. **Preprocess**: Audio is resampled to 16 kHz and padded/tiled to the minimum DNSMOS input length (9.01 s).
3. **Inference**: ONNX sessions run inference over 1-second hops; scores are averaged across hops.
4. **Save**: Results are appended to a JSONL file. The pipeline is resumable via a completed-zips cache.
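The preprocess and inference steps above can be sketched roughly as follows. This is a pure-Python illustration under stated assumptions, not the pipeline's actual code: the tiling strategy (repeating the clip end-to-end) is one plausible reading of "padded/tiled", and `score_fn` is a hypothetical stand-in for the ONNX session call. Resampling is assumed to have happened already.

```python
from typing import Callable, List, Sequence

SAMPLE_RATE = 16_000  # Hz, from the Preprocess step
MIN_SECONDS = 9.01    # minimum DNSMOS input length, from the Preprocess step
HOP_SECONDS = 1.0     # hop size, from the Inference step


def tile_to_min_length(samples: Sequence[float], min_len: int) -> List[float]:
    """Repeat the clip end-to-end until it has at least min_len samples."""
    if not samples:
        raise ValueError("empty audio clip")
    out = list(samples)
    while len(out) < min_len:
        out.extend(samples)
    return out


def split_into_windows(
    samples: Sequence[float],
    sample_rate: int = SAMPLE_RATE,
    min_seconds: float = MIN_SECONDS,
    hop_seconds: float = HOP_SECONDS,
) -> List[List[float]]:
    """Cut the (tiled) clip into full-length windows spaced one hop apart."""
    win = round(min_seconds * sample_rate)
    hop = round(hop_seconds * sample_rate)
    tiled = tile_to_min_length(samples, win)
    # Only windows that fit entirely inside the clip are kept.
    return [tiled[s:s + win] for s in range(0, len(tiled) - win + 1, hop)]


def average_scores(
    windows: Sequence[Sequence[float]],
    score_fn: Callable[[Sequence[float]], float],
) -> float:
    """Score every window with score_fn (hypothetical) and return the mean."""
    scores = [score_fn(w) for w in windows]
    return sum(scores) / len(scores)
```

Because tiling guarantees at least one full window, `average_scores` always has something to average, even for clips shorter than 9.01 s.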
Provided by: Scicom-intl