
asahi417/seamless-align-deA-enA

Hugging Face, updated 2024-06-17, indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/asahi417/seamless-align-deA-enA
Summary:
The dataset is split into multiple subsets (configs). Each subset pairs English and German audio with per-segment metadata: ID, URL, duration start and end, and LASER score. Each subset has a single train split, listed with its number of examples and its size in bytes, along with the subset's download size and total dataset size.
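As a minimal sketch of how one subset might be loaded and filtered on its metadata, the snippet below assumes the Hugging Face `datasets` library is installed and that the repository ID matches the download link above. The helper names (`config_name`, `load_subset`, `keep_confident_pairs`), the score threshold, and the sample rows are all hypothetical and only illustrate the field names listed on this page.

```python
from typing import Any, Dict, Iterable, List

REPO_ID = "asahi417/seamless-align-deA-enA"  # taken from the download link


def config_name(index: int) -> str:
    """Build a config name such as 'subset_1' from an integer index."""
    return f"subset_{index}"


def load_subset(index: int):
    """Load the train split of one config.

    Needs `pip install datasets` and network access; not called in this sketch.
    """
    from datasets import load_dataset  # lazy import so the other helpers work without it

    return load_dataset(REPO_ID, config_name(index), split="train")


def keep_confident_pairs(
    rows: Iterable[Dict[str, Any]], min_score: float
) -> List[Dict[str, Any]]:
    """Keep rows whose LASER score is at least min_score on both sides.

    min_score is a hypothetical threshold; the page does not recommend one.
    """
    return [
        row
        for row in rows
        if row["deA.laser_score"] >= min_score
        and row["enA.laser_score"] >= min_score
    ]


# Made-up metadata rows for illustration only (not taken from the dataset).
sample_rows = [
    {"line_no": 0, "deA.laser_score": 1.12, "enA.laser_score": 1.12},
    {"line_no": 1, "deA.laser_score": 1.03, "enA.laser_score": 1.03},
]
kept = keep_confident_pairs(sample_rows, min_score=1.1)
print([row["line_no"] for row in kept])
```

Calling `load_subset(1)` would download the `subset_1` config; the filter is written against plain dictionaries so it can be tried on metadata alone without decoding any audio.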
Provider:
asahi417
Original information summary

Dataset overview

Dataset configurations and features

  • config_name: subset_1

    • features:
      • enA.audio: audio
      • deA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 391029574.41
        • num_examples: 2069
    • download_size: 347950054
    • dataset_size: 391029574.41
  • config_name: subset_10

    • features:
      • deA.audio: audio
      • enA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 366498883.554
        • num_examples: 2113
    • download_size: 314411078
    • dataset_size: 366498883.554
  • config_name: subset_100

    • features:
      • deA.audio: audio
      • enA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 274118452.254
        • num_examples: 1987
    • download_size: 257462749
    • dataset_size: 274118452.254
  • config_name: subset_101

    • features:
      • enA.audio: audio
      • deA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 277378107.24
        • num_examples: 2040
    • download_size: 260669365
    • dataset_size: 277378107.24
  • config_name: subset_102

    • features:
      • enA.audio: audio
      • deA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 280310703.52
        • num_examples: 2034
    • download_size: 262403898
    • dataset_size: 280310703.52
  • config_name: subset_103

    • features:
      • enA.audio: audio
      • deA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 271914640.084
        • num_examples: 1989
    • download_size: 261034165
    • dataset_size: 271914640.084
  • config_name: subset_104

    • features:
      • enA.audio: audio
      • deA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 278023618.36
        • num_examples: 1990
    • download_size: 257159265
    • dataset_size: 278023618.36
  • config_name: subset_105

    • features:
      • deA.audio: audio
      • enA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 286875664.339
        • num_examples: 2069
    • download_size: 256950563
    • dataset_size: 286875664.339
  • config_name: subset_106

    • features:
      • enA.audio: audio
      • deA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 286422771.359
        • num_examples: 2037
    • download_size: 266719547
    • dataset_size: 286422771.359
  • config_name: subset_107

    • features:
      • enA.audio: audio
      • deA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 275560564.256
        • num_examples: 2024
    • download_size: 259974020
    • dataset_size: 275560564.256
  • config_name: subset_108

    • features:
      • enA.audio: audio
      • deA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 267951886.0
        • num_examples: 2000
    • download_size: 250059126
    • dataset_size: 267951886.0
  • config_name: subset_109

    • features:
      • deA.audio: audio
      • enA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 267754654.919
        • num_examples: 1993
    • download_size: 251445325
    • dataset_size: 267754654.919
  • config_name: subset_11

    • features:
      • deA.audio: audio
      • enA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 357645584.161
        • num_examples: 2139
    • download_size: 316550261
    • dataset_size: 357645584.161
  • config_name: subset_110

    • features:
      • deA.audio: audio
      • enA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 260382181.576
        • num_examples: 1996
    • download_size: 244583910
    • dataset_size: 260382181.576
  • config_name: subset_111

    • features:
      • deA.audio: audio
      • enA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 274062316.138
        • num_examples: 2022
    • download_size: 250141698
    • dataset_size: 274062316.138
  • config_name: subset_112

    • features:
      • deA.audio: audio
      • enA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.duration_start: int64
      • enA.duration_end: int64
      • enA.laser_score: float64
    • splits:
      • train
        • num_bytes: 281481830.16
        • num_examples: 2040
    • download_size: 260572035
    • dataset_size: 281481830.16
  • config_name: subset_113

    • features:
      • deA.audio: audio
      • enA.audio: audio
      • line_no: int64
      • deA.id: string
      • deA.url: string
      • deA.duration_start: int64
      • deA.duration_end: int64
      • deA.laser_score: float64
      • enA.id: string
      • enA.url: string
      • enA.
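The per-config size figures can be compared with a little arithmetic. The sketch below copies the numbers for three of the configs listed above and derives the average size per example and the ratio of download size to dataset size. It assumes all sizes are in bytes; the `ConfigStats` class and helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ConfigStats:
    name: str
    num_examples: int
    download_size: int   # bytes, as listed for the config
    dataset_size: float  # bytes, as listed for the config


# Figures copied from three of the configs listed above.
CONFIGS: List[ConfigStats] = [
    ConfigStats("subset_1", 2069, 347950054, 391029574.41),
    ConfigStats("subset_10", 2113, 314411078, 366498883.554),
    ConfigStats("subset_100", 1987, 257462749, 274118452.254),
]


def bytes_per_example(cfg: ConfigStats) -> float:
    """Average dataset size per example, in bytes."""
    return cfg.dataset_size / cfg.num_examples


def download_ratio(cfg: ConfigStats) -> float:
    """Download size as a fraction of the dataset size."""
    return cfg.download_size / cfg.dataset_size


for cfg in CONFIGS:
    print(
        f"{cfg.name}: {bytes_per_example(cfg) / 1e6:.3f} MB per example, "
        f"download/dataset = {download_ratio(cfg):.3f}"
    )

total_examples = sum(cfg.num_examples for cfg in CONFIGS)
print(f"examples across these configs: {total_examples}")
```

The same helpers can be applied to any other config by appending its listed figures to `CONFIGS`.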