Scicom-intl/Malaysian-Tamil-Emilia
Source: Hugging Face. Updated 2026-02-13; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Scicom-intl/Malaysian-Tamil-Emilia
Resource description:
---
dataset_info:
- config_name: audio_length_ratio_text
  features:
  - name: audio_filename
    dtype: string
  - name: audio_filename_trim
    dtype: string
  - name: audio_length
    dtype: float64
  - name: text
    dtype: string
  - name: audio_length_ratio_text
    dtype: float64
  - name: audio_length_ratio_text_accept
    dtype: bool
  splits:
  - name: train
    num_bytes: 226337503
    num_examples: 254551
  download_size: 48614583
  dataset_size: 226337503
- config_name: default
  features:
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: end
    dtype: float64
  - name: speaker
    dtype: string
  - name: language
    dtype: string
  - name: dnsmos
    dtype: float64
  - name: audio_filename
    dtype: string
  - name: folder
    dtype: string
  splits:
  - name: train
    num_bytes: 204933119
    num_examples: 254551
  download_size: 50208218
  dataset_size: 204933119
- config_name: permutation
  features:
  - name: reference_audio
    dtype: string
  - name: reference_text
    dtype: string
  - name: target_audio
    dtype: string
  - name: target_text
    dtype: string
  splits:
  - name: train
    num_bytes: 20757833474
    num_examples: 15499082
  download_size: 146053868
  dataset_size: 20757833474
- config_name: permutation_sample
  features:
  - name: reference_audio
    dtype: string
  - name: reference_text
    dtype: string
  - name: target_audio
    dtype: string
  - name: target_text
    dtype: string
  splits:
  - name: train
    num_bytes: 831535338
    num_examples: 607486
  download_size: 124914122
  dataset_size: 831535338
configs:
- config_name: audio_length_ratio_text
  data_files:
  - split: train
    path: audio_length_ratio_text/train-*
- config_name: default
  data_files:
  - split: train
    path: data/train-*
- config_name: permutation
  data_files:
  - split: train
    path: permutation/train-*
- config_name: permutation_sample
  data_files:
  - split: train
    path: permutation_sample/train-*
language:
- ta
---
# Malaysian-Tamil-Emilia
Malaysian Tamil audio, pseudo-labeled using https://github.com/mesolitica/Emilia.
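
The metadata above lists four configs (`default`, `audio_length_ratio_text`, `permutation`, `permutation_sample`), each with a single `train` split. The following is a minimal sketch of how one of them might be loaded and filtered with the Hugging Face `datasets` library. The helper names (`load_config`, `keep_clean`, `keep_accepted`) and the DNSMOS threshold of 3.0 are illustrative assumptions, not part of the dataset. That `audio_length_ratio_text_accept` marks rows that passed a length-to-text ratio check is inferred from the column name only.

```python
from typing import Any, Dict, Iterable, Iterator

REPO_ID = "Scicom-intl/Malaysian-Tamil-Emilia"
CONFIGS = ("default", "audio_length_ratio_text", "permutation", "permutation_sample")

Row = Dict[str, Any]


def load_config(name: str = "default"):
    """Load the train split of one config (needs `pip install datasets` and network access)."""
    if name not in CONFIGS:
        raise ValueError(f"unknown config: {name!r}")
    # Imported lazily so the pure-Python helpers below work without the library.
    from datasets import load_dataset
    return load_dataset(REPO_ID, name, split="train")


def keep_clean(rows: Iterable[Row], min_dnsmos: float = 3.0) -> Iterator[Row]:
    """Yield `default` rows whose `dnsmos` score is at least `min_dnsmos`.

    The 3.0 default is an arbitrary illustrative threshold (assumption).
    """
    for row in rows:
        if row["dnsmos"] >= min_dnsmos:
            yield row


def keep_accepted(rows: Iterable[Row]) -> Iterator[Row]:
    """Yield `audio_length_ratio_text` rows whose accept flag is true."""
    for row in rows:
        if row["audio_length_ratio_text_accept"]:
            yield row
```

For example, `list(keep_clean(load_config("default")))` would collect the segments at or above the chosen DNSMOS threshold. The filters accept any iterable of dict-like rows, so they can be tried on a few hand-written rows before downloading anything.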
Provided by:
Scicom-intl



