my-north-ai/cv_mls_psfb_fs17_68
Source: Hugging Face. Updated 2025-10-04; indexed 2024-07-22.
Download link:
https://hf-mirror.com/datasets/my-north-ai/cv_mls_psfb_fs17_68
Resource summary:
This dataset is intended for audio processing and automatic speech recognition. Each example contains an audio file, its transcription, and the audio duration. The audio is sampled at 16000 Hz, a rate commonly used for speech recognition. The data is divided into a training split (109682 examples), a validation split (11304 examples), and a Bracarense-specific test split (1177 examples), used for model training, validation, and testing respectively. The total download size is 42720638370 bytes, and the dataset size is 37791632469.298 bytes.
Provider:
my-north-ai
Summary of original information
Dataset overview
Features
- audio:
  - sampling rate: 16000
- transcription:
  - data type: string
- duration:
  - data type: float32
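The relationship between the `audio` sampling rate and the `duration` feature can be sketched as below. This is a minimal illustration, not the authors' code: it assumes `duration` is expressed in seconds (the card does not state the unit), and the helper names `duration_seconds` and `matches_duration` are hypothetical.

```python
SAMPLE_RATE_HZ = 16_000  # sampling rate stated on the dataset card


def duration_seconds(num_samples: int, sample_rate: int = SAMPLE_RATE_HZ) -> float:
    """Length of a mono waveform in seconds, given its sample count."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def matches_duration(num_samples: int, stated: float, tolerance: float = 0.01) -> bool:
    """Check a stated duration against the sample count.

    ASSUMPTION: `stated` is in seconds; the card does not give the unit.
    """
    return abs(duration_seconds(num_samples) - stated) <= tolerance
```

A check like this could be used to flag examples whose `duration` value disagrees with the decoded waveform length.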
Data splits
- train:
  - bytes: 31951379121.666
  - examples: 109682
- validation:
  - bytes: 5032132659.544
  - examples: 11304
- test_bracarense:
  - bytes: 808120688.088
  - examples: 1177
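To put the split counts above in proportion, the following sketch computes each split's share of all examples. The counts are copied from the card; the variable names are illustrative only.

```python
# Example counts per split, as listed on the dataset card.
SPLIT_EXAMPLES = {
    "train": 109682,
    "validation": 11304,
    "test_bracarense": 1177,
}

total_examples = sum(SPLIT_EXAMPLES.values())

# Fraction of all examples that falls into each split.
split_shares = {name: count / total_examples for name, count in SPLIT_EXAMPLES.items()}

for name, share in split_shares.items():
    print(f"{name}: {SPLIT_EXAMPLES[name]} examples ({share:.2%})")
```

The training split holds roughly nine tenths of the examples, and the Bracarense test split about one percent.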
Dataset size
- download size: 42720638370
- dataset size: 37791632469.298
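As a consistency check on the figures above, this sketch sums the per-split byte counts and compares the result with the reported dataset size. It uses `Decimal` so the fractional byte values are compared exactly; the decimal-gigabyte conversion (1 GB = 10^9 bytes) is a convention chosen here, not something the card specifies.

```python
from decimal import Decimal

# Per-split byte counts and totals, copied from the dataset card.
SPLIT_BYTES = {
    "train": Decimal("31951379121.666"),
    "validation": Decimal("5032132659.544"),
    "test_bracarense": Decimal("808120688.088"),
}
DATASET_SIZE = Decimal("37791632469.298")
DOWNLOAD_SIZE = Decimal("42720638370")


def to_gb(num_bytes: Decimal) -> float:
    """Convert bytes to decimal gigabytes (10**9 bytes)."""
    return float(num_bytes / Decimal(10**9))


split_total = sum(SPLIT_BYTES.values(), Decimal(0))
sizes_consistent = split_total == DATASET_SIZE

print(f"sum of splits: {to_gb(split_total):.2f} GB, consistent: {sizes_consistent}")
print(f"download size: {to_gb(DOWNLOAD_SIZE):.2f} GB")
```

The three split sizes add up exactly to the reported dataset size, and the reported download size is larger than that total.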
Configuration
- config_name: default
- data_files:
  - train: data/train-*
  - validation: data/validation-*
  - test_bracarense: data/test_bracarense-*
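Given the split names in this configuration, loading one split with the Hugging Face `datasets` library might look like the sketch below. It assumes `datasets` is installed and that the repository is reachable with your credentials; `load_split` is a hypothetical wrapper, and streaming is chosen here only to avoid downloading the full archive up front.

```python
REPO_ID = "my-north-ai/cv_mls_psfb_fs17_68"
SPLITS = ("train", "validation", "test_bracarense")


def load_split(split: str, streaming: bool = True):
    """Load one split of the dataset from the Hugging Face Hub.

    ASSUMPTION: the `datasets` package is installed and the repo is accessible.
    """
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the split-name check works without the dependency.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split, streaming=streaming)
```

For instance, `next(iter(load_split("test_bracarense")))` should return a dictionary which, according to the feature list on the card, has `audio`, `transcription`, and `duration` keys.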