TusharGoel/hindi-annotated-tts
Hugging Face, updated 2024-04-16, indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/TusharGoel/hindi-annotated-tts
Resource description:
---
dataset_info:
  config_name: hi
  features:
  - name: client_id
    dtype: string
  - name: path
    dtype: string
  - name: text
    dtype: string
  - name: up_votes
    dtype: int64
  - name: down_votes
    dtype: int64
  - name: age
    dtype: string
  - name: gender
    dtype: string
  - name: accent
    dtype: string
  - name: locale
    dtype: string
  - name: segment
    dtype: string
  - name: utterance_pitch_mean
    dtype: float32
  - name: utterance_pitch_std
    dtype: float32
  - name: snr
    dtype: float64
  - name: c50
    dtype: float64
  - name: speaking_rate
    dtype: float64
  - name: phonemes
    dtype: string
  splits:
  - name: train
    num_bytes: 1458178
    num_examples: 2691
  - name: test
    num_bytes: 1121450
    num_examples: 2095
  - name: validation
    num_bytes: 1086881
    num_examples: 2020
  - name: other
    num_bytes: 1222527
    num_examples: 2126
  - name: invalidated
    num_bytes: 263773
    num_examples: 485
  download_size: 1130056
  dataset_size: 5152809
configs:
- config_name: hi
  data_files:
  - split: train
    path: hi/train-*
  - split: test
    path: hi/test-*
  - split: validation
    path: hi/validation-*
  - split: other
    path: hi/other-*
  - split: invalidated
    path: hi/invalidated-*
---
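The metadata above names one configuration (`hi`) and five splits, which is enough to sketch how the data might be loaded. The example below is a minimal sketch that assumes the Hugging Face `datasets` package is installed and that the repository is still reachable on the Hub under the same ID; the helper name `load_split` is introduced here for illustration and is not part of the dataset.

```python
# Repository ID, config name, and split names are taken from the card metadata.
REPO_ID = "TusharGoel/hindi-annotated-tts"
CONFIG = "hi"
SPLITS = ("train", "test", "validation", "other", "invalidated")


def load_split(split: str = "train"):
    """Load one split of the dataset (hypothetical helper, needs network access)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, CONFIG, split=split)


# Example usage (downloads data from the Hub, so it is left commented out):
# ds = load_split("validation")
# print(ds.column_names)
# print(ds[0]["text"], ds[0]["speaking_rate"])
```

Validating the split name before the import keeps a typo from triggering a network request.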
Provider:
TusharGoel
Summary of original information
Dataset overview
Dataset features
- client_id: string
- path: string
- text: string
- up_votes: integer (int64)
- down_votes: integer (int64)
- age: string
- gender: string
- accent: string
- locale: string
- segment: string
- utterance_pitch_mean: floating point (float32)
- utterance_pitch_std: floating point (float32)
- snr: floating point (float64)
- c50: floating point (float64)
- speaking_rate: floating point (float64)
- phonemes: string
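The feature list can be turned into a simple type check for individual records. The following is a rough sketch in plain Python: the mapping from card dtypes to Python types (string to `str`, int64 to `int`, float32/float64 to `float`) is an assumption, `check_record` is a hypothetical helper, and every value in `sample` is a made-up placeholder rather than real data from the dataset.

```python
# Field names come from the card; the Python types are an assumed mapping.
FEATURES = {
    "client_id": str, "path": str, "text": str,
    "up_votes": int, "down_votes": int,
    "age": str, "gender": str, "accent": str, "locale": str, "segment": str,
    "utterance_pitch_mean": float, "utterance_pitch_std": float,
    "snr": float, "c50": float, "speaking_rate": float,
    "phonemes": str,
}


def check_record(record: dict) -> list[str]:
    """Return a list of schema problems found in one record (empty if none)."""
    problems = []
    for name, expected in FEATURES.items():
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        value = record[name]
        if value is None:
            continue  # assumption: any field may be null
        if expected is float:
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        elif expected is int:
            ok = isinstance(value, int) and not isinstance(value, bool)
        else:
            ok = isinstance(value, expected)
        if not ok:
            problems.append(f"{name}: expected {expected.__name__}, got {type(value).__name__}")
    for name in sorted(record.keys() - FEATURES.keys()):
        problems.append(f"unexpected field: {name}")
    return problems


# Placeholder record: all values are invented for illustration only.
sample = {
    "client_id": "abc123", "path": "clip_0001.mp3", "text": "नमस्ते",
    "up_votes": 2, "down_votes": 0,
    "age": "", "gender": "", "accent": "", "locale": "hi", "segment": "",
    "utterance_pitch_mean": 120.0, "utterance_pitch_std": 25.0,
    "snr": 40.0, "c50": 55.0, "speaking_rate": 12.0,
    "phonemes": "placeholder",
}
print(check_record(sample))
```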
Dataset splits
- train: 2691 examples, 1458178 bytes
- test: 2095 examples, 1121450 bytes
- validation: 2020 examples, 1086881 bytes
- other: 2126 examples, 1222527 bytes
- invalidated: 485 examples, 263773 bytes
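To see how the examples are distributed, the split counts listed above can be converted to shares of the total (9417 examples across all five splits). This is a small illustrative calculation using only the numbers from the listing; the variable names are chosen here.

```python
# Example counts per split, copied from the listing.
SPLIT_EXAMPLES = {
    "train": 2691,
    "test": 2095,
    "validation": 2020,
    "other": 2126,
    "invalidated": 485,
}

total_examples = sum(SPLIT_EXAMPLES.values())

# Fraction of all examples that falls in each split.
split_share = {name: count / total_examples for name, count in SPLIT_EXAMPLES.items()}

for name, count in SPLIT_EXAMPLES.items():
    print(f"{name:<12}{count:>6}{split_share[name]:>8.1%}")
print(f"{'total':<12}{total_examples:>6}")
```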
Dataset size
- Download size: 1130056 bytes
- Total dataset size: 5152809 bytes
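The size figures can be cross-checked against the per-split byte counts. The sketch below sums the split sizes and compares the result with the reported total, then computes the ratio of total size to download size. Reading that ratio as a compression factor assumes the usual Hugging Face convention, in which the download size refers to the stored files and the dataset size to the decoded data; the listing itself does not state this.

```python
# Byte counts per split and the two size totals, copied from the listing.
SPLIT_BYTES = {
    "train": 1458178,
    "test": 1121450,
    "validation": 1086881,
    "other": 1222527,
    "invalidated": 263773,
}
DOWNLOAD_SIZE = 1130056
DATASET_SIZE = 5152809

# The per-split sizes should add up to the reported dataset size.
summed_bytes = sum(SPLIT_BYTES.values())
sizes_consistent = summed_bytes == DATASET_SIZE

# Ratio of total size to download size (assumed to reflect file compression).
size_ratio = DATASET_SIZE / DOWNLOAD_SIZE

print(f"sum of split sizes: {summed_bytes} bytes (matches total: {sizes_consistent})")
print(f"dataset size / download size: {size_ratio:.2f}")
```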



