
asahi417/seamless-align-enA-esA.speaker-embedding.metavoice

Source: Hugging Face. Updated 2024-06-23. Indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/asahi417/seamless-align-enA-esA.speaker-embedding.metavoice
Resource description:

The dataset is organized into multiple subsets (e.g., subset_1 to subset_137). Each subset provides a line number, the IDs of the English and Spanish audio, a LASER score for each language, and a speaker embedding for each of the English and Spanish audio clips. Each subset has a train split, for which the number of bytes and the number of examples are given, along with the subset's download size and dataset size.
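As a rough illustration of how one subset might be loaded, here is a minimal sketch that assumes the Hugging Face `datasets` package is installed and that the repository is reachable under the ID shown in the download link. The helper names `config_name` and `load_subset` are hypothetical and are not part of the dataset or of any library.

```python
# Repository ID taken from the download link on this page.
REPO_ID = "asahi417/seamless-align-enA-esA.speaker-embedding.metavoice"


def config_name(index: int) -> str:
    """Build a config name such as 'subset_1' from a subset index (hypothetical helper)."""
    return f"subset_{index}"


def load_subset(index: int):
    """Load the train split of one subset (hypothetical helper).

    Requires network access and the `datasets` package; the import is
    deferred so that `config_name` can be used without it.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name(index), split="train")
```

Under these assumptions, `load_subset(1)` would return the train split of subset_1, and a row's English speaker embedding would be read with the dotted column name as a plain string key, for example `row["enA.audio.speaker_embedding"]`.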
Provider:
asahi417
Summary of original information

Dataset overview

Dataset configurations and features

  • config_name: subset_1

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4638414
        • num_examples: 2178
    • download_size: 5157041
    • dataset_size: 4638414
  • config_name: subset_10

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4744960
        • num_examples: 2228
    • download_size: 5317500
    • dataset_size: 4744960
  • config_name: subset_11

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4755614
        • num_examples: 2233
    • download_size: 5311513
    • dataset_size: 4755614
  • config_name: subset_12

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4687481
        • num_examples: 2201
    • download_size: 5236637
    • dataset_size: 4687481
  • config_name: subset_13

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • enA.audio.speaker_embedding: sequence: float32
      • esA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4732230
        • num_examples: 2222
    • download_size: 5313417
    • dataset_size: 4732230
  • config_name: subset_14

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • enA.audio.speaker_embedding: sequence: float32
      • esA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4742785
        • num_examples: 2227
    • download_size: 5321112
    • dataset_size: 4742785
  • config_name: subset_15

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4761967
        • num_examples: 2236
    • download_size: 5322570
    • dataset_size: 4761967
  • config_name: subset_16

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4793922
        • num_examples: 2251
    • download_size: 5352129
    • dataset_size: 4793922
  • config_name: subset_17

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • enA.audio.speaker_embedding: sequence: float32
      • esA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4764080
        • num_examples: 2237
    • download_size: 5347995
    • dataset_size: 4764080
  • config_name: subset_18

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4674712
        • num_examples: 2195
    • download_size: 5214547
    • dataset_size: 4674712
  • config_name: subset_19

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • enA.audio.speaker_embedding: sequence: float32
      • esA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4721568
        • num_examples: 2217
    • download_size: 5257133
    • dataset_size: 4721568
  • config_name: subset_2

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4781118
        • num_examples: 2245
    • download_size: 5348610
    • dataset_size: 4781118
  • config_name: subset_20

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • enA.audio.speaker_embedding: sequence: float32
      • esA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4457534
        • num_examples: 2093
    • download_size: 4971506
    • dataset_size: 4457534
  • config_name: subset_21

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4436145
        • num_examples: 2083
    • download_size: 4934571
    • dataset_size: 4436145
  • config_name: subset_22

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • enA.audio.speaker_embedding: sequence: float32
      • esA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4231709
        • num_examples: 1987
    • download_size: 4730463
    • dataset_size: 4231709
  • config_name: subset_23

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4338163
        • num_examples: 2037
    • download_size: 4842124
    • dataset_size: 4338163
  • config_name: subset_24

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • enA.audio.speaker_embedding: sequence: float32
      • esA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4365921
        • num_examples: 2050
    • download_size: 4867805
    • dataset_size: 4365921
  • config_name: subset_25

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • enA.audio.speaker_embedding: sequence: float32
      • esA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4351011
        • num_examples: 2043
    • download_size: 4858989
    • dataset_size: 4351011
  • config_name: subset_251

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4120840
        • num_examples: 1935
    • download_size: 4324526
    • dataset_size: 4120840
  • config_name: subset_252

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • enA.audio.speaker_embedding: sequence: float32
      • esA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4095313
        • num_examples: 1923
    • download_size: 4353796
    • dataset_size: 4095313
  • config_name: subset_253

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • enA.audio.speaker_embedding: sequence: float32
      • esA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4089002
        • num_examples: 1920
    • download_size: 4345768
    • dataset_size: 4089002
  • config_name: subset_254

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • enA.audio.speaker_embedding: sequence: float32
      • esA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4169876
        • num_examples: 1958
    • download_size: 4386194
    • dataset_size: 4169876
  • config_name: subset_255

    • features:
      • line_no: int64
      • enA.id: string
      • enA.laser_score: float64
      • esA.id: string
      • esA.laser_score: float64
      • esA.audio.speaker_embedding: sequence: float32
      • enA.audio.speaker_embedding: sequence: float32
    • splits:
      • train
        • num_bytes: 4114506
        • num
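The size figures in the listing above can be cross-checked with a little arithmetic. The sketch below copies the values for three of the listed subsets and derives the average bytes per example and the ratio of download size to dataset size. The names `SUBSETS` and `bytes_per_example` are illustrative only, and nothing here is read from the dataset itself.

```python
# (num_bytes, num_examples, download_size) copied from the listing above.
SUBSETS = {
    "subset_1": (4638414, 2178, 5157041),
    "subset_10": (4744960, 2228, 5317500),
    "subset_251": (4120840, 1935, 4324526),
}


def bytes_per_example(num_bytes: int, num_examples: int) -> float:
    """Average in-memory size of one example, in bytes."""
    return num_bytes / num_examples


for name, (num_bytes, num_examples, download_size) in SUBSETS.items():
    per_example = bytes_per_example(num_bytes, num_examples)
    ratio = download_size / num_bytes  # download size relative to dataset size
    print(f"{name}: {per_example:.1f} bytes/example, download/dataset = {ratio:.3f}")
```

For these three subsets the average comes out at roughly 2,130 bytes per example, and the download size is somewhat larger than the reported dataset size in each case.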