asahi417/seamless-align-deA-enA.speaker-embedding.metavoice
Source: Hugging Face. Updated 2024-06-19; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/asahi417/seamless-align-deA-enA.speaker-embedding.metavoice
Resource overview:
This dataset is split into multiple subsets. Each row carries a line number plus, for both the German audio (deA) and the English audio (enA), an audio ID, a LASER score, and a speaker embedding. The dataset size, download size, and number of samples differ from subset to subset.
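As an illustration of how the listed fields might be used, the sketch below compares the German and English speaker embeddings of one row with cosine similarity. It rests on several assumptions that the listing does not confirm: that each subset is exposed as a configuration with a "train" split, and that each embedding is a flat list of floats. The helper names (`load_subset`, `cosine_similarity`, `speaker_similarity`) and the example row are hypothetical.

```python
import math

REPO_ID = "asahi417/seamless-align-deA-enA.speaker-embedding.metavoice"


def load_subset(config_name="subset_1"):
    """Load one subset; needs the `datasets` package and network access.

    Assumption: each subset is a configuration with a "train" split.
    """
    from datasets import load_dataset  # lazy import so the rest runs offline

    return load_dataset(REPO_ID, config_name, split="train")


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length flat float sequences."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("zero-norm embedding")
    return dot / (norm_u * norm_v)


def speaker_similarity(row):
    """Compare the German and English speaker embeddings of a single row."""
    return cosine_similarity(
        row["deA.audio.speaker_embedding"],
        row["enA.audio.speaker_embedding"],
    )


# Made-up row that only mimics the listed column names; all values are illustrative.
example_row = {
    "line_no": 0,
    "deA.id": "hypothetical-de-id",
    "deA.laser_score": 1.1,
    "enA.id": "hypothetical-en-id",
    "enA.laser_score": 1.1,
    "deA.audio.speaker_embedding": [1.0, 0.0, 2.0],
    "enA.audio.speaker_embedding": [2.0, 0.0, 4.0],
}
print(speaker_similarity(example_row))
```

With real data, `speaker_similarity(load_subset("subset_1")[0])` would apply the same comparison to the first row, provided the assumptions above hold. If the stored embeddings turn out to be nested lists, they would need flattening first.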
Provider:
asahi417
Original information summary
Dataset overview
Subset information
| Subset | Feature count | Main features | Data types | Train split size | Train examples | Download size | Dataset size |
|---|---|---|---|---|---|---|---|
| subset_1 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4395806 bytes | 2064 | 4322878 bytes | 4395806 bytes |
| subset_10 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4491593 bytes | 2109 | 4190556 bytes | 4491593 bytes |
| subset_100 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4221029 bytes | 1982 | 4155329 bytes | 4221029 bytes |
| subset_101 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4321128 bytes | 2029 | 4246166 bytes | 4321128 bytes |
| subset_102 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4321099 bytes | 2029 | 4222146 bytes | 4321099 bytes |
| subset_103 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4221033 bytes | 1982 | 4223941 bytes | 4221033 bytes |
| subset_104 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4227430 bytes | 1985 | 4169974 bytes | 4227430 bytes |
| subset_11 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4546945 bytes | 2135 | 4254964 bytes | 4546945 bytes |
| subset_12 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4493676 bytes | 2110 | 4213016 bytes | 4493676 bytes |
| subset_13 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4606658 bytes | 2163 | 4368330 bytes | 4606658 bytes |
| subset_14 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4476668 bytes | 2102 | 4192255 bytes | 4476668 bytes |
| subset_15 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4502224 bytes | 2114 | 4233947 bytes | 4502224 bytes |
| subset_16 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4559768 bytes | 2141 | 4250156 bytes | 4559768 bytes |
| subset_17 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4489425 bytes | 2108 | 4260784 bytes | 4489425 bytes |
| subset_18 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4474547 bytes | 2101 | 4199402 bytes | 4474547 bytes |
| subset_19 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4510762 bytes | 2118 | 4319582 bytes | 4510762 bytes |
| subset_2 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4383019 bytes | 2058 | 4120012 bytes | 4383019 bytes |
| subset_20 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4572545 bytes | 2147 | 4354106 bytes | 4572545 bytes |
| subset_201 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3912137 bytes | 1837 | 3936507 bytes | 3912137 bytes |
| subset_202 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3897306 bytes | 1830 | 3933441 bytes | 3897306 bytes |
| subset_203 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3971770 bytes | 1865 | 3982581 bytes | 3971770 bytes |
| subset_204 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4008015 bytes | 1882 | 4016269 bytes | 4008015 bytes |
| subset_205 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3986697 bytes | 1872 | 3980016 bytes | 3986697 bytes |
| subset_206 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3988811 bytes | 1873 | 4000721 bytes | 3988811 bytes |
| subset_207 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4010131 bytes | 1883 | 4036442 bytes | 4010131 bytes |
| subset_208 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4022884 bytes | 1889 | 4036473 bytes | 4022884 bytes |
| subset_209 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3986720 bytes | 1872 | 4029554 bytes | 3986720 bytes |
| subset_21 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 4568273 bytes | 2145 | 4336624 bytes | 4568273 bytes |
| subset_210 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3942028 bytes | 1851 | 3944762 bytes | 3942028 bytes |
| subset_211 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3986733 bytes | 1872 | 4044508 bytes | 3986733 bytes |
| subset_212 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3878064 bytes | 1821 | 3930867 bytes | 3878064 bytes |
| subset_213 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3956835 bytes | 1858 | 3992081 bytes | 3956835 bytes |
| subset_214 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3946175 bytes | 1853 | 3978316 bytes | 3946175 bytes |
| subset_215 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3850417 bytes | 1808 | 3867295 bytes | 3850417 bytes |
| subset_216 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3931317 bytes | 1846 | 3951046 bytes | 3931317 bytes |
| subset_217 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3980301 bytes | 1869 | 3985046 bytes | 3980301 bytes |
| subset_218 | 8 | line_no, deA.id, deA.laser_score, enA.id, enA.laser_score, deA.audio.speaker_embedding, enA.audio.speaker_embedding | int64, string, float64, float32 | 3988823 bytes | 1873 | 3991407 bytes | 3988823 bytes |
| subset_ |
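The sizes and example counts in the table above can be combined into an average storage cost per row. The sketch below does that arithmetic for four rows copied from the table; the function name `bytes_per_example` is hypothetical, and the result is only an average over the stored split, not a statement about the size of any individual field.

```python
# (subset name, train split size in bytes, train examples), copied from the table
rows = [
    ("subset_1", 4395806, 2064),
    ("subset_10", 4491593, 2109),
    ("subset_100", 4221029, 1982),
    ("subset_201", 3912137, 1837),
]


def bytes_per_example(size_bytes, num_examples):
    """Average number of bytes stored per example in a split."""
    return size_bytes / num_examples


per_example = {name: bytes_per_example(size, n) for name, size, n in rows}
for name, value in per_example.items():
    print(f"{name}: {value:.1f} bytes per example")
```

For these four subsets the average comes out at roughly 2,130 bytes per example, which suggests the per-row footprint is nearly constant and that subset size tracks the example count.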



