henriklied/voicecraft-no
Source: Hugging Face. Updated 2024-04-06; indexed 2024-06-11.
Download link:
https://hf-mirror.com/datasets/henriklied/voicecraft-no
Resource description:
---
dataset_info:
- config_name: nst
  features:
  - name: segment_id
    dtype: string
  - name: speaker
    dtype: string
  - name: text
    dtype: string
  - name: text_new
    dtype: string
  - name: audio
    struct:
    - name: array
      sequence: float32
    - name: path
      dtype: string
    - name: sampling_rate
      dtype: int64
  - name: begin_time
    dtype: float64
  - name: end_time
    dtype: float64
  splits:
  - name: train
    num_bytes: 183489269690.92148
    num_examples: 535605
  - name: validation
    num_bytes: 20387810826.721405
    num_examples: 59512
  - name: test
    num_bytes: 35978429238.35712
    num_examples: 105021
  download_size: 240315200972
  dataset_size: 239855509756.0
- config_name: nst-ostlandsk
  features:
  - name: segment_id
    dtype: string
  - name: roy_id
    dtype: int64
  - name: gender_id
    dtype: int64
  - name: speaker
    dtype: string
  - name: text
    dtype: string
  - name: audio
    struct:
    - name: array
      sequence: float32
    - name: path
      dtype: string
    - name: sampling_rate
      dtype: int64
  - name: begin_time
    dtype: float64
  - name: end_time
    dtype: float64
  splits:
  - name: train
    num_bytes: 62174683582.94103
    num_examples: 184064
  - name: validation
    num_bytes: 6908448304.058969
    num_examples: 20452
  - name: test
    num_bytes: 7675903543.0
    num_examples: 22724
  download_size: 76905978307
  dataset_size: 76759035430.0
- config_name: nst-ostlandsk_v2
  features:
  - name: audio
    dtype: audio
  - name: segment_id
    dtype: string
  - name: roy_id
    dtype: int64
  - name: gender_id
    dtype: int64
  - name: speaker
    dtype: string
  - name: text
    dtype: string
  - name: begin_time
    dtype: float64
  - name: end_time
    dtype: float64
  splits:
  - name: train
    num_bytes: 18775179430.44
    num_examples: 104742
  - name: validation
    num_bytes: 120639192.0
    num_examples: 712
  - name: test
    num_bytes: 1345589810.96
    num_examples: 8166
  download_size: 18385499312
  dataset_size: 20241408433.399998
- config_name: nst-ostlandsk_v3
  features:
  - name: audio
    dtype: audio
  - name: segment_id
    dtype: string
  - name: roy_id
    dtype: int64
  - name: gender_id
    dtype: int64
  - name: speaker
    dtype: string
  - name: text
    dtype: string
  - name: begin_time
    dtype: float64
  - name: end_time
    dtype: float64
  splits:
  - name: train
    num_bytes: 18776645818.44
    num_examples: 104742
  - name: validation
    num_bytes: 120649160.0
    num_examples: 712
  - name: test
    num_bytes: 1345704134.96
    num_examples: 8166
  download_size: 18390006583
  dataset_size: 20242999113.399998
configs:
- config_name: nst
  data_files:
  - split: train
    path: nst/train-*
  - split: validation
    path: nst/validation-*
  - split: test
    path: nst/test-*
- config_name: nst-ostlandsk
  data_files:
  - split: train
    path: nst-ostlandsk/train-*
  - split: validation
    path: nst-ostlandsk/validation-*
  - split: test
    path: nst-ostlandsk/test-*
- config_name: nst-ostlandsk_v2
  data_files:
  - split: train
    path: nst-ostlandsk_v2/train-*
  - split: validation
    path: nst-ostlandsk_v2/validation-*
  - split: test
    path: nst-ostlandsk_v2/test-*
- config_name: nst-ostlandsk_v3
  data_files:
  - split: train
    path: nst-ostlandsk_v3/train-*
  - split: validation
    path: nst-ostlandsk_v3/validation-*
  - split: test
    path: nst-ostlandsk_v3/test-*
---
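As a rough illustration of how the configs declared above might be consumed, the sketch below streams a single split with the Hugging Face `datasets` library. It assumes that `datasets` is installed and that the repository ID `henriklied/voicecraft-no` (taken from the download URL) resolves on the Hub; the helper name `stream_split` is hypothetical and not part of the dataset. Streaming is used here only because the `nst` config alone lists a download size of roughly 240 GB.

```python
# Config and split names as declared in the card's `configs` section.
CONFIGS = ("nst", "nst-ostlandsk", "nst-ostlandsk_v2", "nst-ostlandsk_v3")
SPLITS = ("train", "validation", "test")

# Repository ID taken from the download URL (assumed to resolve on the Hub).
REPO_ID = "henriklied/voicecraft-no"


def stream_split(config: str, split: str):
    """Return a streaming iterable over one split (hypothetical helper)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    # Imported lazily so the constants above are usable without `datasets`.
    from datasets import load_dataset

    # streaming=True avoids downloading the whole split up front.
    return load_dataset(REPO_ID, config, split=split, streaming=True)
```

With network access, `next(iter(stream_split("nst-ostlandsk_v3", "validation")))` should yield one example whose keys match the feature names listed for that config.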
Provider:
henriklied
Summary of original information
Dataset overview
Dataset nst
- Features:
  - segment_id: string
  - speaker: string
  - text: string
  - text_new: string
  - audio: struct containing array (sequence of float32), path (string), and sampling_rate (int64)
  - begin_time: float64
  - end_time: float64
- Splits:
  - train: 535605 examples, 183489269690.92148 bytes
  - validation: 59512 examples, 20387810826.721405 bytes
  - test: 105021 examples, 35978429238.35712 bytes
- Download size: 240315200972 bytes
- Dataset size: 239855509756.0 bytes
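To make the record layout concrete, here is a minimal sketch that computes a segment's length two ways from a record shaped like the `nst` features. The card does not state the unit of `begin_time` and `end_time`; seconds is an assumption here. The record values, including the 16000 Hz sampling rate, are made up for illustration, and the function names are hypothetical.

```python
from typing import Any, Dict


def span_seconds(record: Dict[str, Any]) -> float:
    """Segment length from its timestamps (ASSUMED to be in seconds)."""
    return record["end_time"] - record["begin_time"]


def audio_seconds(record: Dict[str, Any]) -> float:
    """Waveform length: number of samples divided by the sampling rate."""
    audio = record["audio"]
    return len(audio["array"]) / audio["sampling_rate"]


# Made-up record using the field names of the `nst` config; none of these
# values come from the dataset itself.
fake_record = {
    "segment_id": "example-0",
    "speaker": "spk-0",
    "text": "hei",
    "text_new": "hei",
    "audio": {
        "array": [0.0] * 8000,      # 8000 silent samples
        "path": "example.wav",
        "sampling_rate": 16000,     # illustrative rate, not from the card
    },
    "begin_time": 1.0,
    "end_time": 1.5,
}

print(span_seconds(fake_record), audio_seconds(fake_record))
```

If the timestamps really are in seconds, comparing the two values per record is one way to spot segments whose audio was trimmed or padded.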
Dataset nst-ostlandsk
- Features:
  - segment_id: string
  - roy_id: int64
  - gender_id: int64
  - speaker: string
  - text: string
  - audio: struct containing array (sequence of float32), path (string), and sampling_rate (int64)
  - begin_time: float64
  - end_time: float64
- Splits:
  - train: 184064 examples, 62174683582.94103 bytes
  - validation: 20452 examples, 6908448304.058969 bytes
  - test: 22724 examples, 7675903543.0 bytes
- Download size: 76905978307 bytes
- Dataset size: 76759035430.0 bytes
Dataset nst-ostlandsk_v2
- Features:
  - audio: audio
  - segment_id: string
  - roy_id: int64
  - gender_id: int64
  - speaker: string
  - text: string
  - begin_time: float64
  - end_time: float64
- Splits:
  - train: 104742 examples, 18775179430.44 bytes
  - validation: 712 examples, 120639192.0 bytes
  - test: 8166 examples, 1345589810.96 bytes
- Download size: 18385499312 bytes
- Dataset size: 20241408433.399998 bytes
Dataset nst-ostlandsk_v3
- Features:
  - audio: audio
  - segment_id: string
  - roy_id: int64
  - gender_id: int64
  - speaker: string
  - text: string
  - begin_time: float64
  - end_time: float64
- Splits:
  - train: 104742 examples, 18776645818.44 bytes
  - validation: 712 examples, 120649160.0 bytes
  - test: 8166 examples, 1345704134.96 bytes
- Download size: 18390006583 bytes
- Dataset size: 20242999113.399998 bytes
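The split sizes are easier to compare as proportions. The short sketch below derives them from the example counts listed above; the counts are copied from the card, while the helper name `split_fractions` is hypothetical.

```python
# Example counts copied from the card's split listings.
SPLIT_COUNTS = {
    "nst": {"train": 535605, "validation": 59512, "test": 105021},
    "nst-ostlandsk": {"train": 184064, "validation": 20452, "test": 22724},
    "nst-ostlandsk_v2": {"train": 104742, "validation": 712, "test": 8166},
    "nst-ostlandsk_v3": {"train": 104742, "validation": 712, "test": 8166},
}


def split_fractions(counts):
    """Map each split name to its share of the config's examples."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


for config, counts in SPLIT_COUNTS.items():
    fractions = split_fractions(counts)
    summary = ", ".join(f"{name} {share:.1%}" for name, share in fractions.items())
    print(f"{config}: {sum(counts.values())} examples ({summary})")
```

By these counts, `nst` divides roughly 76.5% / 8.5% / 15% and `nst-ostlandsk` roughly 81% / 9% / 10%, while the `_v2` and `_v3` configs share identical counts with a much smaller validation split (712 of 113620 examples).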



