artyomboyko/common_voice_15_0_RU
Source: Hugging Face. Updated: 2023-12-03. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/artyomboyko/common_voice_15_0_RU
Resource description:
---
dataset_info:
  features:
  - name: client_id
    dtype: string
  - name: path
    dtype: audio
  - name: sentence
    dtype: string
  - name: up_votes
    dtype: int64
  - name: down_votes
    dtype: int64
  - name: age
    dtype: string
  - name: gender
    dtype: string
  - name: accents
    dtype: string
  - name: variant
    dtype: float64
  - name: locale
    dtype: string
  - name: segment
    dtype: string
  splits:
  - name: train
    num_bytes: 1061971115.28
    num_examples: 26328
  - name: test
    num_bytes: 377748044.084
    num_examples: 10196
  - name: validated
    num_bytes: 5852414660.504
    num_examples: 158417
  - name: other
    num_bytes: 642660551.92
    num_examples: 12585
  - name: invalidated
    num_bytes: 467995235.5
    num_examples: 9795
  download_size: 7829612580
  dataset_size: 8402789607.288
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
  - split: validated
    path: data/validated-*
  - split: other
    path: data/other-*
  - split: invalidated
    path: data/invalidated-*
license: cc0-1.0
source_datasets:
- extended|common_voice
task_categories:
- automatic-speech-recognition
paperswithcode_id: common-voice
language:
- ru
pretty_name: Common Voice Corpus 15.0
size_categories:
- 100K<n<1M
---
Provided by:
artyomboyko
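Summary of original information
Dataset overview
Dataset information
Features
- client_id: string
- path: audio
- sentence: string
- up_votes: 64-bit integer (int64)
- down_votes: 64-bit integer (int64)
- age: string
- gender: string
- accents: string
- variant: 64-bit float (float64)
- locale: string
- segment: string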
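The feature list above can be mirrored in plain Python for quick sanity checks on a single row. This is a minimal sketch: the `FEATURES` mapping copies the column names and dtypes from the card, while `missing_columns` and `net_votes` are hypothetical helpers introduced here for illustration and are not part of the dataset or of any library.

```python
from typing import Any, Dict, List

# Column names and dtypes exactly as listed in the dataset card.
FEATURES: Dict[str, str] = {
    "client_id": "string",
    "path": "audio",
    "sentence": "string",
    "up_votes": "int64",
    "down_votes": "int64",
    "age": "string",
    "gender": "string",
    "accents": "string",
    "variant": "float64",
    "locale": "string",
    "segment": "string",
}


def missing_columns(record: Dict[str, Any]) -> List[str]:
    """Return the card's column names that are absent from a row (hypothetical helper)."""
    return [name for name in FEATURES if name not in record]


def net_votes(record: Dict[str, Any]) -> int:
    """Up-votes minus down-votes for one row (hypothetical helper)."""
    return record["up_votes"] - record["down_votes"]
```

A row could then be screened with something like `net_votes(row) > 0`, although the threshold is a choice for the reader and is not prescribed by the card.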
Splits
- train: 1061971115.28 bytes, 26328 examples
- test: 377748044.084 bytes, 10196 examples
- validated: 5852414660.504 bytes, 158417 examples
- other: 642660551.92 bytes, 12585 examples
- invalidated: 467995235.5 bytes, 9795 examples
Size
- Download size: 7829612580 bytes
- Dataset size: 8402789607.288 bytes
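The split figures and the reported dataset size can be cross-checked with a little arithmetic. The sketch below only re-uses the numbers from the card. The names `SPLITS`, `DATASET_SIZE`, and `mean_bytes_per_example` are introduced here for illustration. The total row count is a sum over the listed splits and says nothing about whether splits share clips.

```python
import math

# (num_bytes, num_examples) per split, copied from the dataset card.
SPLITS = {
    "train": (1061971115.28, 26328),
    "test": (377748044.084, 10196),
    "validated": (5852414660.504, 158417),
    "other": (642660551.92, 12585),
    "invalidated": (467995235.5, 9795),
}
DATASET_SIZE = 8402789607.288  # dataset_size from the card, in bytes

# Sum the per-split byte counts and row counts.
total_bytes = sum(num_bytes for num_bytes, _ in SPLITS.values())
total_examples = sum(num_examples for _, num_examples in SPLITS.values())

# The per-split byte counts add up to the reported dataset_size
# (compared with a tolerance because the values are floats).
sizes_agree = math.isclose(total_bytes, DATASET_SIZE, rel_tol=1e-9)


def mean_bytes_per_example(split: str) -> float:
    """Average stored size of one row in the given split (illustrative helper)."""
    num_bytes, num_examples = SPLITS[split]
    return num_bytes / num_examples
```

Summed over the five splits, the card lists 217321 rows, which falls within the stated size category of 100K<n<1M.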
Configuration
- config_name: default
- data_files:
  - train: data/train-*
  - test: data/test-*
  - validated: data/validated-*
  - other: data/other-*
  - invalidated: data/invalidated-*
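With the default configuration above, each split maps to a file pattern in the repository, so a split can in principle be loaded by name. The following is a minimal sketch assuming the Hugging Face `datasets` library is installed and the repository is reachable. `data_glob`, `load_split`, and `first_sentence` are hypothetical helper names, and streaming is chosen here only to avoid downloading the full archive up front.

```python
REPO_ID = "artyomboyko/common_voice_15_0_RU"
SPLIT_NAMES = ("train", "test", "validated", "other", "invalidated")


def data_glob(split: str) -> str:
    """File pattern that the default config associates with a split."""
    if split not in SPLIT_NAMES:
        raise ValueError(f"unknown split: {split!r}")
    return f"data/{split}-*"


def load_split(split: str = "test", streaming: bool = True):
    """Load one split from the Hub (requires network access and `datasets`)."""
    # Imported lazily so the pure helpers above work without the library.
    from datasets import load_dataset

    data_glob(split)  # validates the split name before any network call
    return load_dataset(REPO_ID, split=split, streaming=streaming)


def first_sentence(split: str = "test") -> str:
    """Return the transcript of the first row, skipping audio decoding."""
    # Dropping the audio column avoids needing an audio backend here.
    ds = load_split(split).remove_columns(["path"])
    return next(iter(ds))["sentence"]
```

Calling `first_sentence("test")` would fetch only the beginning of the test split. Reading the `path` column as audio additionally needs an audio decoding backend, whose details depend on the installed `datasets` version.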
Other information
- License: cc0-1.0
- Source datasets: extended|common_voice
- Task categories: automatic-speech-recognition
- paperswithcode_id: common-voice
- Language: Russian (ru)
- Pretty name: Common Voice Corpus 15.0
- Size category: 100K<n<1M



