asahi417/seamless-align-enA-esA.speaker-embedding.metavoice
Source: Hugging Face. Updated 2024-06-23; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/asahi417/seamless-align-enA-esA.speaker-embedding.metavoice
Resource overview:
The dataset is organized into multiple subsets (e.g., subset_1 to subset_137). Each row carries a line number, the IDs of the English and Spanish audio segments, a LASER score for each language, and a speaker embedding for each of the two audio segments. Every subset has a single train split, for which the card records the number of bytes and the number of examples, along with the subset's download size and dataset size.
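As a minimal sketch of how one row with the fields described above might be used, the snippet below filters on the two LASER scores and compares the English and Spanish speaker embeddings by cosine similarity. The helpers `cosine_similarity` and `keep_row`, the threshold, and the toy row are hypothetical and only illustrate the schema; real rows would presumably come from something like `datasets.load_dataset("asahi417/seamless-align-enA-esA.speaker-embedding.metavoice", "subset_1", split="train")`, which is not exercised here.

```python
import math
from typing import Any, Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def keep_row(row: Dict[str, Any], min_laser: float) -> bool:
    """Keep a row only if both LASER scores reach the (hypothetical) threshold."""
    return (
        row["enA.laser_score"] >= min_laser
        and row["esA.laser_score"] >= min_laser
    )


# Toy row using the column names from the card; the values are made up,
# and the 3-dimensional embeddings are far shorter than real ones.
row = {
    "line_no": 0,
    "enA.id": "en-example",
    "enA.laser_score": 1.1,
    "esA.id": "es-example",
    "esA.laser_score": 1.1,
    "enA.audio.speaker_embedding": [1.0, 0.0, 1.0],
    "esA.audio.speaker_embedding": [1.0, 0.0, 0.0],
}

similarity = cosine_similarity(
    row["enA.audio.speaker_embedding"],
    row["esA.audio.speaker_embedding"],
)
print(f"keep={keep_row(row, 1.0)} similarity={similarity:.4f}")
```

The cosine helper is written in plain Python to keep the sketch dependency-free; with real embeddings a vectorized library would likely be a better fit.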
Provider:
asahi417
Summary of original information
Dataset overview
Dataset configurations and features
- config_name: subset_1
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4638414
      - num_examples: 2178
  - download_size: 5157041
  - dataset_size: 4638414
- config_name: subset_10
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4744960
      - num_examples: 2228
  - download_size: 5317500
  - dataset_size: 4744960
- config_name: subset_11
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4755614
      - num_examples: 2233
  - download_size: 5311513
  - dataset_size: 4755614
- config_name: subset_12
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4687481
      - num_examples: 2201
  - download_size: 5236637
  - dataset_size: 4687481
- config_name: subset_13
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - enA.audio.speaker_embedding: sequence: float32
    - esA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4732230
      - num_examples: 2222
  - download_size: 5313417
  - dataset_size: 4732230
- config_name: subset_14
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - enA.audio.speaker_embedding: sequence: float32
    - esA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4742785
      - num_examples: 2227
  - download_size: 5321112
  - dataset_size: 4742785
- config_name: subset_15
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4761967
      - num_examples: 2236
  - download_size: 5322570
  - dataset_size: 4761967
- config_name: subset_16
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4793922
      - num_examples: 2251
  - download_size: 5352129
  - dataset_size: 4793922
- config_name: subset_17
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - enA.audio.speaker_embedding: sequence: float32
    - esA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4764080
      - num_examples: 2237
  - download_size: 5347995
  - dataset_size: 4764080
- config_name: subset_18
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4674712
      - num_examples: 2195
  - download_size: 5214547
  - dataset_size: 4674712
- config_name: subset_19
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - enA.audio.speaker_embedding: sequence: float32
    - esA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4721568
      - num_examples: 2217
  - download_size: 5257133
  - dataset_size: 4721568
- config_name: subset_2
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4781118
      - num_examples: 2245
  - download_size: 5348610
  - dataset_size: 4781118
- config_name: subset_20
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - enA.audio.speaker_embedding: sequence: float32
    - esA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4457534
      - num_examples: 2093
  - download_size: 4971506
  - dataset_size: 4457534
- config_name: subset_21
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4436145
      - num_examples: 2083
  - download_size: 4934571
  - dataset_size: 4436145
- config_name: subset_22
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - enA.audio.speaker_embedding: sequence: float32
    - esA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4231709
      - num_examples: 1987
  - download_size: 4730463
  - dataset_size: 4231709
- config_name: subset_23
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4338163
      - num_examples: 2037
  - download_size: 4842124
  - dataset_size: 4338163
- config_name: subset_24
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - enA.audio.speaker_embedding: sequence: float32
    - esA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4365921
      - num_examples: 2050
  - download_size: 4867805
  - dataset_size: 4365921
- config_name: subset_25
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - enA.audio.speaker_embedding: sequence: float32
    - esA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4351011
      - num_examples: 2043
  - download_size: 4858989
  - dataset_size: 4351011
- config_name: subset_251
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4120840
      - num_examples: 1935
  - download_size: 4324526
  - dataset_size: 4120840
- config_name: subset_252
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - enA.audio.speaker_embedding: sequence: float32
    - esA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4095313
      - num_examples: 1923
  - download_size: 4353796
  - dataset_size: 4095313
- config_name: subset_253
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - enA.audio.speaker_embedding: sequence: float32
    - esA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4089002
      - num_examples: 1920
  - download_size: 4345768
  - dataset_size: 4089002
- config_name: subset_254
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - enA.audio.speaker_embedding: sequence: float32
    - esA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4169876
      - num_examples: 1958
  - download_size: 4386194
  - dataset_size: 4169876
- config_name: subset_255
  - features:
    - line_no: int64
    - enA.id: string
    - enA.laser_score: float64
    - esA.id: string
    - esA.laser_score: float64
    - esA.audio.speaker_embedding: sequence: float32
    - enA.audio.speaker_embedding: sequence: float32
  - splits:
    - train
      - num_bytes: 4114506
      - num
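
The per-subset figures listed above can be sanity-checked with a little arithmetic. The sketch below, under the assumption that `num_bytes` counts the stored size of all rows in the train split, derives the average bytes per example and the ratio of download size to dataset size for three of the listed configurations. The `SubsetStats` type and both helper functions are hypothetical names introduced only for this illustration.

```python
from typing import Dict, NamedTuple


class SubsetStats(NamedTuple):
    num_bytes: int
    num_examples: int
    download_size: int
    dataset_size: int


# Figures copied from three of the configurations listed above.
STATS: Dict[str, SubsetStats] = {
    "subset_1": SubsetStats(4638414, 2178, 5157041, 4638414),
    "subset_10": SubsetStats(4744960, 2228, 5317500, 4744960),
    "subset_251": SubsetStats(4120840, 1935, 4324526, 4120840),
}


def bytes_per_example(stats: SubsetStats) -> float:
    """Average stored size of one row in the train split."""
    return stats.num_bytes / stats.num_examples


def download_ratio(stats: SubsetStats) -> float:
    """Download size relative to the reported dataset size."""
    return stats.download_size / stats.dataset_size


for name, stats in STATS.items():
    print(
        f"{name}: {bytes_per_example(stats):.1f} bytes/example, "
        f"download/dataset = {download_ratio(stats):.3f}"
    )
```

For these three subsets the average works out to roughly 2,130 bytes per example, and the download size exceeds the dataset size in each case.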



