asahi417/seamless-align-deA-enA
Source: Hugging Face. Updated 2024-06-17. Indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/asahi417/seamless-align-deA-enA
Resource summary:
The dataset is divided into multiple subsets (configs). Each subset pairs German (deA) and English (enA) audio with per-language metadata: ID, URL, duration start and end, and LASER score. Each subset has a single train split, listed with its number of examples and size in bytes, together with the subset's download size and total dataset size.
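As a rough illustration of how one subset might be used, the sketch below loads a single config and keeps only the pairs whose LASER scores clear a threshold. It assumes the Hugging Face `datasets` library is installed and the repository is reachable, and it assumes that a higher LASER score indicates a closer alignment. The helper names `load_subset` and `filter_by_laser_score` and the threshold value are hypothetical. The column names are taken from the feature list of this dataset.

```python
from typing import Any, Dict, Iterable, List

REPO_ID = "asahi417/seamless-align-deA-enA"


def load_subset(config_name: str = "subset_1"):
    """Load the train split of one config (needs `datasets` and network access)."""
    # Imported lazily so the filtering helper below also works offline.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split="train")


def filter_by_laser_score(
    rows: Iterable[Dict[str, Any]], min_score: float
) -> List[Dict[str, Any]]:
    """Keep rows whose German and English LASER scores are both >= min_score.

    Assumption: higher scores mean better-aligned audio pairs.
    """
    return [
        row
        for row in rows
        if row["deA.laser_score"] >= min_score
        and row["enA.laser_score"] >= min_score
    ]
```

For example, `filter_by_laser_score(load_subset("subset_1"), 1.1)` would return the rows of `subset_1` at or above a score of 1.1. The threshold is arbitrary here. Iterating over the loaded split decodes the audio columns, which may require additional audio dependencies.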
Provider:
asahi417
Original information summary
Dataset overview
Dataset configurations and features
-
config_name: subset_1
- features:
- enA.audio: audio
- deA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 391029574.41, num_examples: 2069)
- download_size: 347950054
- dataset_size: 391029574.41
-
config_name: subset_10
- features:
- deA.audio: audio
- enA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 366498883.554, num_examples: 2113)
- download_size: 314411078
- dataset_size: 366498883.554
-
config_name: subset_100
- features:
- deA.audio: audio
- enA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 274118452.254, num_examples: 1987)
- download_size: 257462749
- dataset_size: 274118452.254
-
config_name: subset_101
- features:
- enA.audio: audio
- deA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 277378107.24, num_examples: 2040)
- download_size: 260669365
- dataset_size: 277378107.24
-
config_name: subset_102
- features:
- enA.audio: audio
- deA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 280310703.52, num_examples: 2034)
- download_size: 262403898
- dataset_size: 280310703.52
-
config_name: subset_103
- features:
- enA.audio: audio
- deA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 271914640.084, num_examples: 1989)
- download_size: 261034165
- dataset_size: 271914640.084
-
config_name: subset_104
- features:
- enA.audio: audio
- deA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 278023618.36, num_examples: 1990)
- download_size: 257159265
- dataset_size: 278023618.36
-
config_name: subset_105
- features:
- deA.audio: audio
- enA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 286875664.339, num_examples: 2069)
- download_size: 256950563
- dataset_size: 286875664.339
-
config_name: subset_106
- features:
- enA.audio: audio
- deA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 286422771.359, num_examples: 2037)
- download_size: 266719547
- dataset_size: 286422771.359
-
config_name: subset_107
- features:
- enA.audio: audio
- deA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 275560564.256, num_examples: 2024)
- download_size: 259974020
- dataset_size: 275560564.256
-
config_name: subset_108
- features:
- enA.audio: audio
- deA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 267951886.0, num_examples: 2000)
- download_size: 250059126
- dataset_size: 267951886.0
-
config_name: subset_109
- features:
- deA.audio: audio
- enA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 267754654.919, num_examples: 1993)
- download_size: 251445325
- dataset_size: 267754654.919
-
config_name: subset_11
- features:
- deA.audio: audio
- enA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 357645584.161, num_examples: 2139)
- download_size: 316550261
- dataset_size: 357645584.161
-
config_name: subset_110
- features:
- deA.audio: audio
- enA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 260382181.576, num_examples: 1996)
- download_size: 244583910
- dataset_size: 260382181.576
-
config_name: subset_111
- features:
- deA.audio: audio
- enA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 274062316.138, num_examples: 2022)
- download_size: 250141698
- dataset_size: 274062316.138
-
config_name: subset_112
- features:
- deA.audio: audio
- enA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.duration_start: int64
- enA.duration_end: int64
- enA.laser_score: float64
- splits:
- train (num_bytes: 281481830.16, num_examples: 2040)
- download_size: 260572035
- dataset_size: 281481830.16
-
config_name: subset_113
- features:
- deA.audio: audio
- enA.audio: audio
- line_no: int64
- deA.id: string
- deA.url: string
- deA.duration_start: int64
- deA.duration_end: int64
- deA.laser_score: float64
- enA.id: string
- enA.url: string
- enA.