xaviviro/Variants-catala-cv16_1
Source: Hugging Face. Updated 2024-01-29; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/xaviviro/Variants-catala-cv16_1
Resource description:
---
dataset_info:
- config_name: balear
  features:
  - name: client_id
    dtype: string
  - name: path
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 48000
  - name: sentence
    dtype: string
  - name: up_votes
    dtype: int64
  - name: down_votes
    dtype: int64
  - name: age
    dtype: string
  - name: gender
    dtype: string
  - name: accent
    dtype: string
  - name: locale
    dtype: string
  - name: segment
    dtype: string
  - name: variant
    dtype: string
  splits:
  - name: train
    num_bytes: 571949427.9563617
    num_examples: 15601
  - name: test
    num_bytes: 9785506.517906336
    num_examples: 268
  download_size: 537570970
  dataset_size: 581734934.474268
- config_name: central
  features:
  - name: client_id
    dtype: string
  - name: path
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 48000
  - name: sentence
    dtype: string
  - name: up_votes
    dtype: int64
  - name: down_votes
    dtype: int64
  - name: age
    dtype: string
  - name: gender
    dtype: string
  - name: accent
    dtype: string
  - name: locale
    dtype: string
  - name: segment
    dtype: string
  - name: variant
    dtype: string
  splits:
  - name: train
    num_bytes: 19128245726.625427
    num_examples: 521759
  - name: test
    num_bytes: 61670598.91322314
    num_examples: 1689
  download_size: 18768643598
  dataset_size: 19189916325.53865
- config_name: nord-occidental
  features:
  - name: client_id
    dtype: string
  - name: path
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 48000
  - name: sentence
    dtype: string
  - name: up_votes
    dtype: int64
  - name: down_votes
    dtype: int64
  - name: age
    dtype: string
  - name: gender
    dtype: string
  - name: accent
    dtype: string
  - name: locale
    dtype: string
  - name: segment
    dtype: string
  - name: variant
    dtype: string
  splits:
  - name: train
    num_bytes: 1023833835.9423954
    num_examples: 27927
  - name: test
    num_bytes: 8105904.652892562
    num_examples: 222
  download_size: 940662501
  dataset_size: 1031939740.595288
- config_name: septentrional
  features:
  - name: client_id
    dtype: string
  - name: path
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 48000
  - name: sentence
    dtype: string
  - name: up_votes
    dtype: int64
  - name: down_votes
    dtype: int64
  - name: age
    dtype: string
  - name: gender
    dtype: string
  - name: accent
    dtype: string
  - name: locale
    dtype: string
  - name: segment
    dtype: string
  - name: variant
    dtype: string
  splits:
  - name: train
    num_bytes: 643438523.8165568
    num_examples: 17551
  - name: test
    num_bytes: 2738481.3016528925
    num_examples: 75
  download_size: 532590002
  dataset_size: 646177005.1182097
- config_name: valencià
  features:
  - name: client_id
    dtype: string
  - name: path
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 48000
  - name: sentence
    dtype: string
  - name: up_votes
    dtype: int64
  - name: down_votes
    dtype: int64
  - name: age
    dtype: string
  - name: gender
    dtype: string
  - name: accent
    dtype: string
  - name: locale
    dtype: string
  - name: segment
    dtype: string
  - name: variant
    dtype: string
  splits:
  - name: train
    num_bytes: 932364454.3161457
    num_examples: 25432
  - name: test
    num_bytes: 7375642.97245179
    num_examples: 202
  download_size: 1008157848
  dataset_size: 939740097.2885975
configs:
- config_name: balear
  data_files:
  - split: train
    path: balear/train-*
  - split: test
    path: balear/test-*
- config_name: central
  data_files:
  - split: train
    path: central/train-*
  - split: test
    path: central/test-*
- config_name: nord-occidental
  data_files:
  - split: train
    path: nord-occidental/train-*
  - split: test
    path: nord-occidental/test-*
- config_name: septentrional
  data_files:
  - split: train
    path: septentrional/train-*
  - split: test
    path: septentrional/test-*
- config_name: valencià
  data_files:
  - split: train
    path: valencià/train-*
  - split: test
    path: valencià/test-*
---
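The metadata above defines five configurations, each with a train and a test split. A minimal sketch of loading one of them with the Hugging Face `datasets` library follows. It assumes the Hub repository id matches the page title (`xaviviro/Variants-catala-cv16_1`) and that the `datasets` package is installed; the helper name `load_variant` is hypothetical and only for illustration.

```python
# Config and split names as listed in the card metadata.
CONFIGS = ("balear", "central", "nord-occidental", "septentrional", "valencià")
SPLITS = ("train", "test")

# Assumption: the Hub repository id is the same as the page title.
REPO_ID = "xaviviro/Variants-catala-cv16_1"


def load_variant(config, split="test", streaming=True):
    """Load one dialect config, validating names before any network access."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the checks above run even without `datasets` installed.
    from datasets import load_dataset

    # streaming=True avoids downloading a whole config up front; the
    # `central` config alone is listed at roughly 18.8 GB of download.
    return load_dataset(REPO_ID, config, split=split, streaming=streaming)
```

Calling `load_variant("balear", "test")` would then return an iterable over the 268 test examples listed for that configuration, provided the repository is reachable.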
Provider:
xaviviro
Summary of original information
Dataset overview
Dataset configurations
Configuration name: balear
- Features:
client_id: string; path: string; audio: audio (sampling rate 48000); sentence: string; up_votes: integer; down_votes: integer; age: string; gender: string; accent: string; locale: string; segment: string; variant: string
- Splits:
train: 571949427.9563617 bytes, 15601 examples; test: 9785506.517906336 bytes, 268 examples
- Download size: 537570970 bytes
- Dataset size: 581734934.474268 bytes
Configuration name: central
- Features:
client_id: string; path: string; audio: audio (sampling rate 48000); sentence: string; up_votes: integer; down_votes: integer; age: string; gender: string; accent: string; locale: string; segment: string; variant: string
- Splits:
train: 19128245726.625427 bytes, 521759 examples; test: 61670598.91322314 bytes, 1689 examples
- Download size: 18768643598 bytes
- Dataset size: 19189916325.53865 bytes
Configuration name: nord-occidental
- Features:
client_id: string; path: string; audio: audio (sampling rate 48000); sentence: string; up_votes: integer; down_votes: integer; age: string; gender: string; accent: string; locale: string; segment: string; variant: string
- Splits:
train: 1023833835.9423954 bytes, 27927 examples; test: 8105904.652892562 bytes, 222 examples
- Download size: 940662501 bytes
- Dataset size: 1031939740.595288 bytes
Configuration name: septentrional
- Features:
client_id: string; path: string; audio: audio (sampling rate 48000); sentence: string; up_votes: integer; down_votes: integer; age: string; gender: string; accent: string; locale: string; segment: string; variant: string
- Splits:
train: 643438523.8165568 bytes, 17551 examples; test: 2738481.3016528925 bytes, 75 examples
- Download size: 532590002 bytes
- Dataset size: 646177005.1182097 bytes
Configuration name: valencià
- Features:
client_id: string; path: string; audio: audio (sampling rate 48000); sentence: string; up_votes: integer; down_votes: integer; age: string; gender: string; accent: string; locale: string; segment: string; variant: string
- Splits:
train: 932364454.3161457 bytes, 25432 examples; test: 7375642.97245179 bytes, 202 examples
- Download size: 1008157848 bytes
- Dataset size: 939740097.2885975 bytes
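The per-configuration figures can be aggregated to see how unevenly the variants are represented. The sketch below copies the example counts and download sizes from the listing above and sums them; the variable names are illustrative only.

```python
# (train_examples, test_examples, download_size_bytes), copied from the listing.
STATS = {
    "balear": (15601, 268, 537570970),
    "central": (521759, 1689, 18768643598),
    "nord-occidental": (27927, 222, 940662501),
    "septentrional": (17551, 75, 532590002),
    "valencià": (25432, 202, 1008157848),
}

total_train = sum(train for train, _, _ in STATS.values())
total_test = sum(test for _, test, _ in STATS.values())
total_download = sum(size for _, _, size in STATS.values())

# Fraction of all training examples contributed by each configuration.
train_share = {name: train / total_train for name, (train, _, _) in STATS.items()}

for name, share in sorted(train_share.items(), key=lambda kv: -kv[1]):
    print(f"{name:16s} {share:6.1%}")
print(f"train={total_train} test={total_test} download={total_download / 1e9:.2f} GB")
```

The `central` configuration accounts for about 86% of the training examples, so any model trained on a naive concatenation of all five configurations would see mostly central-variant speech.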
Data file paths
Configuration name: balear
train: balear/train-*; test: balear/test-*
Configuration name: central
train: central/train-*; test: central/test-*
Configuration name: nord-occidental
train: nord-occidental/train-*; test: nord-occidental/test-*
Configuration name: septentrional
train: septentrional/train-*; test: septentrional/test-*
Configuration name: valencià
train: valencià/train-*; test: valencià/test-*
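Every path above follows the same `<config>/<split>-*` glob pattern. As a small illustration, the sketch below builds that pattern and maps a file path back to its configuration and split using the standard-library `fnmatch` module. The shard filename in the usage note is hypothetical; the actual file names in the repository are not given in this listing.

```python
from fnmatch import fnmatch

CONFIGS = ("balear", "central", "nord-occidental", "septentrional", "valencià")
SPLITS = ("train", "test")


def data_file_pattern(config, split):
    """Return the glob pattern listed for a given configuration and split."""
    return f"{config}/{split}-*"


def split_of(path, configs=CONFIGS, splits=SPLITS):
    """Return (config, split) for a file path, or None if no pattern matches."""
    for config in configs:
        for split in splits:
            if fnmatch(path, data_file_pattern(config, split)):
                return config, split
    return None
```

For example, `split_of("balear/train-00000-of-00002.parquet")` would return `("balear", "train")`, while a path outside the listed patterns such as `README.md` would return `None`.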



