asahi417/seamless-align-deA-enA.speaker-embedding.w2vbert-600m

Source: Hugging Face. Updated 2024-06-20; indexed 2024-06-29.
Download link:
https://hf-mirror.com/datasets/asahi417/seamless-align-deA-enA.speaker-embedding.w2vbert-600m
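Resource summary:
This dataset contains multiple subsets (subset_1 through subset_126). Each subset provides the line number, the German and English IDs, the LASER scores, and the speaker embeddings of the audio. The dataset is intended mainly for training; the train split size and the number of examples vary from subset to subset.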
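
As a minimal sketch of how one subset might be accessed, the snippet below builds the configuration names described above and wraps a call to the Hugging Face `datasets` library. The helper names `config_names` and `load_subset` are hypothetical, and the use of streaming is an assumption made here to avoid downloading a multi-gigabyte subset up front.

```python
# Repository ID taken from the page title.
REPO_ID = "asahi417/seamless-align-deA-enA.speaker-embedding.w2vbert-600m"


def config_names(first=1, last=126):
    """Return the configuration names subset_<first> .. subset_<last>."""
    return [f"subset_{i}" for i in range(first, last + 1)]


def load_subset(name, streaming=True):
    """Load the train split of one configuration (hypothetical helper).

    Assumes the `datasets` package is installed and the Hub is reachable.
    """
    # Imported lazily so that config_names() works without the package.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name, split="train", streaming=streaming)


names = config_names()
print(len(names), names[0], names[-1])
```

With network access, `next(iter(load_subset("subset_1")))` would be expected to yield one row as a dictionary keyed by the feature names listed for each configuration.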
Provided by:
asahi417
Summary of original information

Dataset overview

Dataset configurations

Configuration subset_1

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 8801991334
      • num_examples: 2064
  • Download size: 8824428235
  • Dataset size: 8801991334
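
The feature list above can be read as a per-row schema. The sketch below checks a row against it using only the standard library; the `validate_row` helper and the three-dimensional toy embeddings are illustrative assumptions, since the listing does not state the real embedding size.

```python
# Feature names and types as listed for the configuration above.
EXPECTED_FEATURES = {
    "line_no": "int64",
    "deA.id": "string",
    "deA.laser_score": "float64",
    "enA.id": "string",
    "enA.laser_score": "float64",
    "deA.audio.speaker_embedding": "sequence of float32",
    "deA.audio.speaker_embedding.full": "sequence of sequence of float32",
    "enA.audio.speaker_embedding": "sequence of float32",
    "enA.audio.speaker_embedding.full": "sequence of sequence of float32",
}


def _is_float_seq(value):
    return isinstance(value, list) and all(isinstance(x, float) for x in value)


def _is_float_seq_seq(value):
    return isinstance(value, list) and all(_is_float_seq(x) for x in value)


CHECKS = {
    "int64": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    "float64": lambda v: isinstance(v, float),
    "sequence of float32": _is_float_seq,
    "sequence of sequence of float32": _is_float_seq_seq,
}


def validate_row(row):
    """Return a list of problems; an empty list means the row fits the schema."""
    problems = []
    for name, kind in EXPECTED_FEATURES.items():
        if name not in row:
            problems.append(f"missing: {name}")
        elif not CHECKS[kind](row[name]):
            problems.append(f"wrong type: {name} (expected {kind})")
    return problems


# Toy row with made-up values; real IDs, scores, and dimensions will differ.
toy_row = {
    "line_no": 0,
    "deA.id": "de-example",
    "deA.laser_score": 1.1,
    "enA.id": "en-example",
    "enA.laser_score": 1.1,
    "deA.audio.speaker_embedding": [0.1, 0.2, 0.3],
    "deA.audio.speaker_embedding.full": [[0.1, 0.2, 0.3], [0.0, 0.1, 0.2]],
    "enA.audio.speaker_embedding": [0.3, 0.2, 0.1],
    "enA.audio.speaker_embedding.full": [[0.3, 0.2, 0.1]],
}
print(validate_row(toy_row))
```

Checking by column name rather than by position keeps the check valid for configurations that list the enA embedding columns before the deA ones.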

Configuration subset_10

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 8593419515
      • num_examples: 2106
  • Download size: 8615703246
  • Dataset size: 8593419515
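
Because the split sizes differ between subsets, a quick calculation gives a feel for the storage cost per example. The sketch below divides `num_bytes` by `num_examples` for subset_1 and subset_10, using the figures listed above; the helper name is hypothetical and megabytes are taken as 10^6 bytes.

```python
# (num_bytes, num_examples) for the train split, copied from the listing.
SPLIT_STATS = {
    "subset_1": (8801991334, 2064),
    "subset_10": (8593419515, 2106),
}


def avg_bytes_per_example(num_bytes, num_examples):
    """Average stored size of one example in bytes."""
    return num_bytes / num_examples


for name, (num_bytes, num_examples) in SPLIT_STATS.items():
    megabytes = avg_bytes_per_example(num_bytes, num_examples) / 1e6
    print(f"{name}: about {megabytes:.2f} MB per example")
```

Each example therefore occupies several megabytes, which is one reason to consider streaming or loading a single configuration at a time.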

Configuration subset_100

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 6242966384
      • num_examples: 1944
  • Download size: 6261936615
  • Dataset size: 6242966384

Configuration subset_101

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 6387695646
      • num_examples: 1985
  • Download size: 6406711214
  • Dataset size: 6387695646

Configuration subset_102

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 6406941583
      • num_examples: 1992
  • Download size: 6425999235
  • Dataset size: 6406941583

Configuration subset_103

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 6372617252
      • num_examples: 1952
  • Download size: 6391674307
  • Dataset size: 6372617252

Configuration subset_104

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 6257600280
      • num_examples: 1956
  • Download size: 6276653902
  • Dataset size: 6257600280

Configuration subset_105

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 6510576665
      • num_examples: 2034
  • Download size: 6530526101
  • Dataset size: 6510576665

Configuration subset_106

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 6353337255
      • num_examples: 1981
  • Download size: 6371898179
  • Dataset size: 6353337255

Configuration subset_107

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 6311054100
      • num_examples: 1983
  • Download size: 6330238642
  • Dataset size: 6311054100

Configuration subset_108

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 6045274114
      • num_examples: 1963
  • Download size: 6063668462
  • Dataset size: 6045274114

Configuration subset_109

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 6027069003
      • num_examples: 1949
  • Download size: 6046332553
  • Dataset size: 6027069003

Configuration subset_11

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 8656594338
      • num_examples: 2131
  • Download size: 8679010223
  • Dataset size: 8656594338

Configuration subset_110

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 5925273438
      • num_examples: 1940
  • Download size: 5943179957
  • Dataset size: 5925273438

Configuration subset_111

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • deA.audio.speaker_embedding: sequence of float32
    • deA.audio.speaker_embedding.full: sequence of sequence of float32
    • enA.audio.speaker_embedding: sequence of float32
    • enA.audio.speaker_embedding.full: sequence of sequence of float32
  • Splits:
    • train:
      • num_bytes: 6082731987
      • num_examples: 1967
  • Download size: 6101922846
  • Dataset size: 6082731987

Configuration subset_112

  • Features:
    • line_no: int64
    • deA.id: string
    • deA.laser_score: float64
    • enA.id: string
    • enA.laser_score: float64
    • `enA