adityarra07/train_17000
Source: Hugging Face. Updated 2023-10-19. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/adityarra07/train_17000
Resource description:
---
dataset_info:
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: transcription
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 2265737853.4844837
    num_examples: 17000
  - name: test
    num_bytes: 26655739.452758636
    num_examples: 200
  download_size: 2265471038
  dataset_size: 2292393592.9372425
---
# Dataset Card for "train_17000"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
The dataset has three features: `audio`, `transcription`, and `id`. The audio is sampled at 16000 Hz, and both `transcription` and `id` are strings. The data is split into a training set of 17000 examples and a test set of 200 examples. The download size is 2265471038 bytes (about 2.27 GB), and the dataset size is 2292393592.9372425 bytes (about 2.29 GB).
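As a minimal sketch of how the features and splits described above could be accessed, the snippet below uses the Hugging Face `datasets` library. The repository ID is taken from the page title; the helper names `missing_columns` and `load_test_split` are hypothetical, and the sketch assumes the repository is still publicly available and that `datasets` is installed.

```python
REPO_ID = "adityarra07/train_17000"  # repository ID as given in the page title
EXPECTED_COLUMNS = ("audio", "transcription", "id")  # features listed in the card


def missing_columns(column_names):
    """Return the expected columns that are absent from column_names."""
    present = set(column_names)
    return [name for name in EXPECTED_COLUMNS if name not in present]


def load_test_split():
    """Download the small test split (200 examples per the card).

    Needs network access. The import is local so the rest of the module
    can be used without the `datasets` package installed.
    """
    from datasets import load_dataset

    ds = load_dataset(REPO_ID, split="test")
    absent = missing_columns(ds.column_names)
    if absent:
        raise ValueError(f"columns missing from the dataset: {absent}")
    return ds
```

Calling `load_test_split()` and then reading `ds.num_rows` is a quick way to compare the download against the 200 examples stated in the card. The test split is used here only because it is far smaller than the training split.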
Provider:
adityarra07
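Summary of original information
Dataset overview
Feature information
- Audio
  - Sampling rate: 16000
- Transcription
  - Data type: string
- ID
  - Data type: string
Data splits
- Training set
  - Bytes: 2265737853.4844837
  - Examples: 17000
- Test set
  - Bytes: 26655739.452758636
  - Examples: 200
Data size
- Download size: 2265471038
- Dataset size: 2292393592.9372425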
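The size figures above can be cross-checked with a little arithmetic. The sketch below copies the numbers from the metadata and confirms that the two split sizes add up to the stated dataset size. The variable and function names are illustrative only.

```python
import math

# Figures copied from the dataset metadata.
SPLITS = {
    "train": {"num_bytes": 2265737853.4844837, "num_examples": 17000},
    "test": {"num_bytes": 26655739.452758636, "num_examples": 200},
}
DOWNLOAD_SIZE = 2265471038
DATASET_SIZE = 2292393592.9372425

# Totals across both splits.
total_bytes = sum(split["num_bytes"] for split in SPLITS.values())
total_examples = sum(split["num_examples"] for split in SPLITS.values())

# Average size of one example in each split.
bytes_per_example = {
    name: split["num_bytes"] / split["num_examples"]
    for name, split in SPLITS.items()
}


def to_gb(num_bytes):
    """Convert a byte count to decimal gigabytes (1 GB = 10**9 bytes)."""
    return num_bytes / 1e9


# The split sizes should add up to the stated dataset size
# (compared with a tolerance because the values are floats).
sizes_consistent = math.isclose(total_bytes, DATASET_SIZE, rel_tol=1e-12)

print(f"examples: {total_examples}")
print(f"split sizes sum to dataset_size: {sizes_consistent}")
print(f"download: {to_gb(DOWNLOAD_SIZE):.2f} GB, dataset: {to_gb(DATASET_SIZE):.2f} GB")
for name, value in bytes_per_example.items():
    print(f"{name}: {value:.1f} bytes per example")
```

Both splits average roughly 133 KB per example. The fractional byte counts and the identical per-example averages are consistent with the split sizes being proportional estimates rather than exact measurements, though the source does not say how they were computed.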



