ganga4364/stt_tibetan_dialects_data
Source: Hugging Face. Updated: 2024-12-27. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/ganga4364/stt_tibetan_dialects_data
Overview:
A Tibetan speech dataset for automatic speech recognition (ASR), containing speech samples from several departments. It is split into training, validation, and test sets. Each sample comprises an audio file, a Unicode transcription of the audio, and the same text in Wylie transliteration. Additional features record the department the audio comes from, a grade, the text length, and the audio duration.
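The layout described in the overview can be explored with a short script. The following is a minimal sketch, not the dataset author's own tooling. It assumes the Hugging Face `datasets` library is installed, and the column names `dept` and `audio_len` are hypothetical placeholders for the department and audio-duration features, since the listing does not give the actual column names. Check the dataset's schema and adjust them before use.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Mapping

# Repository id taken from the download link in this listing.
REPO_ID = "ganga4364/stt_tibetan_dialects_data"

# Hypothetical column names (assumptions, not confirmed by the listing).
DEPT_FIELD = "dept"
DURATION_FIELD = "audio_len"


def load_splits():
    """Download every split from the Hugging Face Hub (needs network access)."""
    from datasets import load_dataset  # imported lazily; pip install datasets

    # With no split argument, load_dataset returns a dict-like object
    # mapping each split name to its data.
    return load_dataset(REPO_ID)


def summarize(
    rows: Iterable[Mapping[str, Any]],
    dept_field: str = DEPT_FIELD,
    duration_field: str = DURATION_FIELD,
) -> Dict[str, Any]:
    """Count samples per department and sum the audio duration of a split."""
    per_department: Counter = Counter()
    total_duration = 0.0
    samples = 0
    for row in rows:
        per_department[row[dept_field]] += 1
        total_duration += float(row[duration_field])
        samples += 1
    return {
        "samples": samples,
        "total_duration": total_duration,  # unit is whatever the dataset uses
        "per_department": dict(per_department),
    }
```

With network access, `for name, split in load_splits().items(): print(name, summarize(split))` would print one summary per split, provided the two column names match the real schema.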
Provider:
ganga4364



