eswardivi/IndicVoices
Source: Hugging Face. Updated 2024-03-06; indexed 2024-06-22.
Download link:
https://hf-mirror.com/datasets/eswardivi/IndicVoices
Resource description:
---
dataset_info:
- config_name: telugu_train
  features:
  - name: audio
    dtype: audio
  - name: end
    dtype: float64
  - name: text
    dtype: string
  - name: start
    dtype: float64
  - name: speaker_id
    dtype: int64
  - name: audio_raw_duration
    dtype: float64
  - name: scenario
    dtype: string
  - name: task_name
    dtype: string
  splits:
  - name: split1
    num_bytes: 1609730034.0
    num_examples: 1000
  - name: split2
    num_bytes: 1049782097.0
    num_examples: 1000
  - name: split3
    num_bytes: 1444386250.0
    num_examples: 1000
  - name: split4
    num_bytes: 3152422807.0
    num_examples: 1000
  - name: split5
    num_bytes: 5554666300.0
    num_examples: 1000
  - name: split6
    num_bytes: 4003049531.0
    num_examples: 500
  - name: split7
    num_bytes: 4053228693.0
    num_examples: 500
  - name: split8
    num_bytes: 1044856061.0
    num_examples: 344
  - name: split9
    num_bytes: 5059082700.0
    num_examples: 500
  - name: split10
    num_bytes: 4664957526.0
    num_examples: 500
  download_size: 29946910673
  dataset_size: 31636161999.0
- config_name: telugu_valid
  features:
  - name: audio
    dtype: audio
  - name: text
    dtype: string
  - name: audio_raw_duration
    dtype: float64
  - name: scenario
    dtype: string
  - name: task_name
    dtype: string
  splits:
  - name: valid
    num_bytes: 314053800.0
    num_examples: 73
  download_size: 301914962
  dataset_size: 314053800.0
configs:
- config_name: telugu_train
  data_files:
  - split: split1
    path: telugu_train/split1-*
  - split: split2
    path: telugu_train/split2-*
  - split: split3
    path: telugu_train/split3-*
  - split: split4
    path: telugu_train/split4-*
  - split: split5
    path: telugu_train/split5-*
  - split: split6
    path: telugu_train/split6-*
  - split: split7
    path: telugu_train/split7-*
  - split: split8
    path: telugu_train/split8-*
  - split: split9
    path: telugu_train/split9-*
  - split: split10
    path: telugu_train/split10-*
- config_name: telugu_valid
  data_files:
  - split: valid
    path: telugu_valid/valid-*
---
This dataset is processed from [IndicVoices](https://ai4bharat.iitm.ac.in/indicvoices/).
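
As a minimal sketch of how the two configurations in the metadata might be loaded, the snippet below assumes the Hugging Face `datasets` library is installed and that the repository `eswardivi/IndicVoices` is reachable. The names `CONFIG_SPLITS` and `load_one_split` are illustrative helpers, not part of the dataset. Streaming is used by default so that one split can be inspected without downloading the roughly 30 GB of training data.

```python
# Config and split names copied from the dataset card metadata.
CONFIG_SPLITS = {
    "telugu_train": [f"split{i}" for i in range(1, 11)],
    "telugu_valid": ["valid"],
}


def load_one_split(config_name, split, streaming=True):
    """Load a single split of one config (hypothetical helper).

    The split name is validated against the card metadata before any
    network access happens.
    """
    if split not in CONFIG_SPLITS.get(config_name, []):
        raise ValueError(f"unknown config/split: {config_name}/{split}")
    # Imported lazily so the validation above works without the library.
    from datasets import load_dataset

    return load_dataset(
        "eswardivi/IndicVoices",
        config_name,
        split=split,
        streaming=streaming,
    )


if __name__ == "__main__":
    # Assumption: network access is available. Decoding the `audio`
    # column may also require an extra audio backend to be installed.
    valid = load_one_split("telugu_valid", "valid")
    first = next(iter(valid))
    print(first["text"], first["audio_raw_duration"])
```

Note that `telugu_valid` lacks the `start`, `end`, and `speaker_id` columns that `telugu_train` has, so code that handles both configs should not assume those fields exist.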
Provider:
eswardivi
Summary of original information
Dataset overview
Config name: telugu_train
- Features:
  - audio: audio
  - end: float64
  - text: string
  - start: float64
  - speaker_id: int64
  - audio_raw_duration: float64
  - scenario: string
  - task_name: string
- Splits:
  - split1: 1609730034.0 bytes, 1000 examples
  - split2: 1049782097.0 bytes, 1000 examples
  - split3: 1444386250.0 bytes, 1000 examples
  - split4: 3152422807.0 bytes, 1000 examples
  - split5: 5554666300.0 bytes, 1000 examples
  - split6: 4003049531.0 bytes, 500 examples
  - split7: 4053228693.0 bytes, 500 examples
  - split8: 1044856061.0 bytes, 344 examples
  - split9: 5059082700.0 bytes, 500 examples
  - split10: 4664957526.0 bytes, 500 examples
- Download size: 29946910673 bytes
- Dataset size: 31636161999.0 bytes
Config name: telugu_valid
- Features:
  - audio: audio
  - text: string
  - audio_raw_duration: float64
  - scenario: string
  - task_name: string
- Splits:
  - valid: 314053800.0 bytes, 73 examples
- Download size: 301914962 bytes
- Dataset size: 314053800.0 bytes
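
The per-split figures listed above can be cross-checked against the reported dataset size. The sketch below copies the byte and example counts from the card and sums them; the names `TRAIN_SPLITS` and `totals` are illustrative only.

```python
# (num_bytes, num_examples) per telugu_train split, copied from the card.
TRAIN_SPLITS = {
    "split1": (1609730034, 1000),
    "split2": (1049782097, 1000),
    "split3": (1444386250, 1000),
    "split4": (3152422807, 1000),
    "split5": (5554666300, 1000),
    "split6": (4003049531, 500),
    "split7": (4053228693, 500),
    "split8": (1044856061, 344),
    "split9": (5059082700, 500),
    "split10": (4664957526, 500),
}
TRAIN_DATASET_SIZE = 31636161999  # dataset_size reported for telugu_train


def totals(splits):
    """Return (total_bytes, total_examples) over all splits."""
    total_bytes = sum(num_bytes for num_bytes, _ in splits.values())
    total_examples = sum(num_examples for _, num_examples in splits.values())
    return total_bytes, total_examples


total_bytes, total_examples = totals(TRAIN_SPLITS)
print(total_bytes == TRAIN_DATASET_SIZE)  # True
print(total_examples)                     # 7344

# Average size per example varies widely between splits.
for name, (num_bytes, num_examples) in TRAIN_SPLITS.items():
    print(f"{name}: {num_bytes / num_examples / 2**20:.1f} MiB per example")
```

The split byte counts sum exactly to the reported `dataset_size`, and the training config holds 7344 examples in total, compared with 73 in `telugu_valid`.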
Data file paths
- telugu_train:
  - split1: telugu_train/split1-*
  - split2: telugu_train/split2-*
  - split3: telugu_train/split3-*
  - split4: telugu_train/split4-*
  - split5: telugu_train/split5-*
  - split6: telugu_train/split6-*
  - split7: telugu_train/split7-*
  - split8: telugu_train/split8-*
  - split9: telugu_train/split9-*
  - split10: telugu_train/split10-*
- telugu_valid:
  - valid: telugu_valid/valid-*
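
To illustrate how the glob patterns above assign files to splits, here is a small sketch using Python's standard `fnmatch` module. The shard file names are hypothetical (the actual file names in the repository are not given here), and `splits_matching` is an illustrative helper. The hyphen before `*` is what keeps `split1-*` from also capturing `split10` shards.

```python
from fnmatch import fnmatchcase

# Patterns copied from the card's data_files section.
PATTERNS = {f"split{i}": f"telugu_train/split{i}-*" for i in range(1, 11)}


def splits_matching(path):
    """Return the split names whose pattern matches the given file path."""
    return [name for name, pattern in PATTERNS.items()
            if fnmatchcase(path, pattern)]


# Hypothetical shard names, used only to exercise the patterns.
print(splits_matching("telugu_train/split1-00000-of-00004.parquet"))   # ['split1']
print(splits_matching("telugu_train/split10-00000-of-00010.parquet"))  # ['split10']
```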



