ybelkada/common_voice_mr_11_0_copy
Source: Hugging Face. Updated 2023-04-04; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/ybelkada/common_voice_mr_11_0_copy
Resource description:
---
dataset_info:
  features:
  - name: client_id
    dtype: string
  - name: path
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 48000
  - name: sentence
    dtype: string
  - name: up_votes
    dtype: int64
  - name: down_votes
    dtype: int64
  - name: age
    dtype: string
  - name: gender
    dtype: string
  - name: accent
    dtype: string
  - name: locale
    dtype: string
  - name: segment
    dtype: string
  splits:
  - name: train
    num_bytes: 81761699.0
    num_examples: 2245
  - name: validation
    num_bytes: 65082681.0
    num_examples: 1682
  - name: test
    num_bytes: 69247449.0
    num_examples: 1816
  - name: other
    num_bytes: 109682091.0
    num_examples: 2819
  - name: invalidated
    num_bytes: 90463060.0
    num_examples: 2237
  download_size: 407562763
  dataset_size: 416236980.0
---
# Dataset Card for "common_voice_mr_11_0_copy"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
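
The card itself gives no usage instructions, so here is a minimal loading sketch. It assumes the `datasets` library is installed, that the repository is still available under the ID in the title, and that the mirror host from the download link should be used as the endpoint. The helper name `load_split` and the optional `target_rate` parameter are illustrative, not part of the card.

```python
import os
from typing import Optional

# Assumption: route Hub traffic through the mirror from the download link.
# HF_ENDPOINT has to be set before `datasets` / `huggingface_hub` are imported.
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")

REPO_ID = "ybelkada/common_voice_mr_11_0_copy"
SPLITS = ("train", "validation", "test", "other", "invalidated")


def load_split(split: str = "train", target_rate: Optional[int] = None):
    """Load one split; optionally change the audio decoding rate."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the HF_ENDPOINT setting above is already in place.
    from datasets import Audio, load_dataset

    ds = load_dataset(REPO_ID, split=split)
    if target_rate is not None:
        # The card lists 48000 Hz audio; casting the column makes the
        # library resample to `target_rate` when a row is decoded.
        ds = ds.cast_column("audio", Audio(sampling_rate=target_rate))
    return ds
```

For example, `load_split("train", target_rate=16000)` would request the training split with audio resampled to 16 kHz on access. The call needs network access, and the exact form of the decoded `audio` value depends on the installed `datasets` version.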
Provider:
ybelkada
Summary of original information
Dataset overview
Dataset features
- client_id: string
- path: string
- audio: audio data, sampling rate 48000 Hz
- sentence: string
- up_votes: integer (int64)
- down_votes: integer (int64)
- age: string
- gender: string
- accent: string
- locale: string
- segment: string
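
The feature list above can be turned into a small row check. This is a sketch only: the name `schema_problems` and the sample row are made up for illustration, and the `audio` column is checked for presence only, since its decoded form depends on the loader.

```python
from typing import Any, Dict, List

# Column name -> expected Python type, mirroring the feature list above.
EXPECTED_TYPES: Dict[str, type] = {
    "client_id": str,
    "path": str,
    "sentence": str,
    "up_votes": int,
    "down_votes": int,
    "age": str,
    "gender": str,
    "accent": str,
    "locale": str,
    "segment": str,
}
AUDIO_SAMPLING_RATE = 48000  # Hz, as listed for the audio feature


def schema_problems(row: Dict[str, Any]) -> List[str]:
    """Return human-readable schema problems for one row (empty if none)."""
    problems = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in row:
            problems.append(f"missing column: {name}")
            continue
        value = row[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        ):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    if "audio" not in row:
        problems.append("missing column: audio")
    return problems


# Hypothetical row, invented for this example (not taken from the dataset).
sample_row = {
    "client_id": "abc123", "path": "clip_0001.mp3", "audio": object(),
    "sentence": "example sentence", "up_votes": 2, "down_votes": 0,
    "age": "", "gender": "", "accent": "", "locale": "mr", "segment": "",
}
```

Calling `schema_problems(sample_row)` returns an empty list, and changing `up_votes` to the string `"2"` produces a single type-mismatch message.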
Dataset splits
- train: 81761699 bytes, 2245 examples
- validation: 65082681 bytes, 1682 examples
- test: 69247449 bytes, 1816 examples
- other: 109682091 bytes, 2819 examples
- invalidated: 90463060 bytes, 2237 examples
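
To compare the splits, the figures above can be reduced to an average size per example and each split's share of all examples. The sketch below uses only the numbers listed; the function name `split_stats` is illustrative.

```python
from typing import Dict, Tuple

# Split name -> (num_bytes, num_examples), copied from the list above.
SPLITS: Dict[str, Tuple[int, int]] = {
    "train": (81761699, 2245),
    "validation": (65082681, 1682),
    "test": (69247449, 1816),
    "other": (109682091, 2819),
    "invalidated": (90463060, 2237),
}


def split_stats(splits: Dict[str, Tuple[int, int]]) -> Dict[str, Dict[str, float]]:
    """Average KiB per example and share of all examples, per split."""
    total_examples = sum(n for _, n in splits.values())
    return {
        name: {
            "avg_kib_per_example": num_bytes / n / 1024,
            "share_of_examples": n / total_examples,
        }
        for name, (num_bytes, n) in splits.items()
    }


if __name__ == "__main__":
    for name, stats in split_stats(SPLITS).items():
        print(
            f"{name:<12} {stats['avg_kib_per_example']:7.1f} KiB/example "
            f"{stats['share_of_examples']:6.1%} of examples"
        )
```

The five splits together hold 10,799 examples, and `train`, `validation`, and `test` account for 5,743 of them.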
Dataset size
- Download size: 407562763 bytes
- Total dataset size: 416236980 bytes
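
As a quick sanity check on the sizes, the per-split byte counts should add up to the total dataset size. The sketch below verifies that and compares the download size with the total; `check_sizes` is an illustrative name, and all numbers are those listed on this page.

```python
from typing import Dict, Union

# Per-split sizes in bytes, as listed in the split summary.
SPLIT_BYTES: Dict[str, int] = {
    "train": 81761699,
    "validation": 65082681,
    "test": 69247449,
    "other": 109682091,
    "invalidated": 90463060,
}
DOWNLOAD_SIZE = 407562763  # bytes
DATASET_SIZE = 416236980   # bytes


def check_sizes() -> Dict[str, Union[int, bool, float]]:
    """Compare the sum of split sizes with the reported totals."""
    splits_sum = sum(SPLIT_BYTES.values())
    return {
        "splits_sum": splits_sum,
        "matches_dataset_size": splits_sum == DATASET_SIZE,
        "download_ratio": DOWNLOAD_SIZE / DATASET_SIZE,
    }


if __name__ == "__main__":
    for key, value in check_sizes().items():
        print(f"{key}: {value}")
```

The split sizes sum exactly to the listed total of 416236980 bytes, and the download is roughly 2% smaller than that total.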



