awajai/43-143-phase2-appconv-ime-large-v3-prepared
Source: Hugging Face. Updated 2024-07-19. Indexed 2024-07-22.
Download link:
https://hf-mirror.com/datasets/awajai/43-143-phase2-appconv-ime-large-v3-prepared
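Resource overview:
This dataset contains audio data and the corresponding text sentences. Each sample includes an audio file, the sentence text, a file path, an input length, input features, labels, and a label length. The dataset has a single training split with 11425 samples, a total size of 18843241945.475 bytes, and a download size of 3797183803 bytes.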
This dataset is intended primarily for speech recognition and contains audio, transcripts, and related metadata. The audio is sampled at 16000 Hz. The only split is a training set, and each record carries the sentence, path, input length, input features, labels, and label length needed to train a speech recognition model. Because there is no separate validation or test split, evaluation data would have to be held out from the training split or taken from elsewhere.
Provider:
awajai
Summary of original information
Dataset overview
Features
- audio:
  - Data type: audio
  - Sampling rate: 16000
- sentence:
  - Data type: string
- path:
  - Data type: string
- input_length:
  - Data type: integer (int64)
- input_features:
  - Data type: sequence (float32)
- labels:
  - Data type: sequence (int64)
- labels_length:
  - Data type: integer (int64)
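The feature list above can be turned into a quick sanity check on a single record. The following is a minimal sketch, not part of the dataset's own tooling: `validate_sample` and its helpers are hypothetical names, the record is a plain Python dict, and the shape of the `audio` value (a mapping with a `sampling_rate` key) is an assumption that the listing does not state. Nested sequences are accepted for `input_features` and `labels` because the listing does not say how many dimensions they have.

```python
from numbers import Integral, Real

SAMPLING_RATE = 16000  # from the feature list


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, Integral) and not isinstance(value, bool)


def _is_real(value):
    return isinstance(value, Real) and not isinstance(value, bool)


def _is_number_seq(value, leaf_check):
    """True for a (possibly nested) list/tuple whose leaves all pass leaf_check."""
    if not isinstance(value, (list, tuple)):
        return False
    return all(
        _is_number_seq(v, leaf_check) if isinstance(v, (list, tuple)) else leaf_check(v)
        for v in value
    )


def _is_audio(value):
    # Assumption: decoded audio is a mapping that carries its sampling rate.
    return isinstance(value, dict) and value.get("sampling_rate") == SAMPLING_RATE


# One check per field, in the order the listing gives them.
CHECKS = {
    "audio": _is_audio,
    "sentence": lambda v: isinstance(v, str),
    "path": lambda v: isinstance(v, str),
    "input_length": _is_int,
    "input_features": lambda v: _is_number_seq(v, _is_real),
    "labels": lambda v: _is_number_seq(v, _is_int),
    "labels_length": _is_int,
}


def validate_sample(sample):
    """Return a list of human-readable problems; an empty list means the record passed."""
    problems = []
    for name, check in CHECKS.items():
        if name not in sample:
            problems.append(f"missing field: {name}")
        elif not check(sample[name]):
            problems.append(f"unexpected value for field: {name}")
    return problems


if __name__ == "__main__":
    # Made-up record for illustration only; none of these values come from the dataset.
    example = {
        "audio": {"array": [0.0, 0.1], "sampling_rate": 16000},
        "sentence": "hello",
        "path": "example.wav",
        "input_length": 2,
        "input_features": [[0.1, 0.2]],
        "labels": [1, 2, 3],
        "labels_length": 3,
    }
    print(validate_sample(example))
```

The check deliberately stops at field presence and basic types. It does not assert any relationship between `labels` and `labels_length`, or between `input_length` and the audio, since the listing does not define those fields.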
Data splits
- train:
  - Number of samples: 11425
  - Data size: 18843241945.475 bytes
Dataset size
- Download size: 3797183803 bytes
- Total dataset size: 18843241945.475 bytes
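The raw byte counts are easier to reason about after a little arithmetic. The sketch below uses only the three numbers given in the listing (sample count, total size, download size); the function names are hypothetical. It shows that the prepared data is roughly 17.5 GiB, the download is roughly 3.5 GiB (about one fifth of the total), and the mean record size is around 1.6 MB.

```python
TRAIN_SAMPLES = 11425               # number of samples in the train split
DATASET_BYTES = 18843241945.475     # total dataset size, reproduced as listed
DOWNLOAD_BYTES = 3797183803         # download size


def to_gib(num_bytes):
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30


def summarize_sizes():
    """Derive a few convenience figures from the listed sizes."""
    return {
        "dataset_gib": to_gib(DATASET_BYTES),
        "download_gib": to_gib(DOWNLOAD_BYTES),
        "mean_bytes_per_sample": DATASET_BYTES / TRAIN_SAMPLES,
        "download_ratio": DOWNLOAD_BYTES / DATASET_BYTES,
    }


if __name__ == "__main__":
    for key, value in summarize_sizes().items():
        print(f"{key}: {value:.3f}")
```

These figures are useful mainly for planning disk space: the local copy needs noticeably more room than the download size alone suggests.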
Configuration
- config_name: default
  - Data file path: data/train-*
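The configuration above maps fairly directly onto a call to `load_dataset` from the Hugging Face `datasets` library. The following is a minimal sketch under several assumptions: the library is installed, the repository is reachable and readable with your credentials, and streaming is preferred so the full download is not fetched up front. `load_kwargs`, `open_train_split`, and `first_sample_fields` are hypothetical helper names, not part of any library.

```python
REPO_ID = "awajai/43-143-phase2-appconv-ime-large-v3-prepared"


def load_kwargs(streaming=True):
    """Keyword arguments for datasets.load_dataset, taken from the listing."""
    return {
        "path": REPO_ID,
        "name": "default",       # config_name from the listing
        "split": "train",        # the only split the listing reports
        "streaming": streaming,  # stream records instead of downloading everything
    }


def open_train_split(streaming=True):
    """Open the train split; requires the `datasets` package and network access."""
    # Deferred import so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(**load_kwargs(streaming))


def first_sample_fields():
    """Return the sorted field names of the first record (needs network access)."""
    dataset = open_train_split(streaming=True)
    first = next(iter(dataset))
    return sorted(first.keys())


if __name__ == "__main__":
    # Only prints the arguments; call first_sample_fields() to touch the network.
    print(load_kwargs())
```

If access works, the names returned by `first_sample_fields()` can be compared against the feature list in this listing to confirm that the summary matches the hosted data.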



