XBMU-bo-Lhasa31: A Speech Recognition Dataset for the Lhasa Dialect of Tibetan

Name: XBMU-bo-Lhasa31: A Speech Recognition Dataset for the Lhasa Dialect of Tibetan
Creator: Science Data Bank
Published: 2025-07-01 07:59:19
License: No description available

DataCite Commons. Updated 2025-07-01; indexed 2026-05-05

Download link:

https://www.scidb.cn/detail?dataSetId=bdd9e8849a584d7d9152163022e58c6c

Resource description:

The dataset consists of audio files, text files, and description files: (1) wav is the audio folder, divided into 51 subfolders by speaker. It contains 24,289 speech samples with a total duration of 31.61 hours, an average duration of 4.68 seconds each, and a total size of 2.68 GB. (2) The text in the transcript file corresponds one-to-one with the audio; all text is drawn from the news domain, and non-pronounced symbols in the text have been normalized. (3) The readme.txt file contains basic information about the dataset. (4) resource_lexicon.txt is the pronunciation lexicon file.
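
The figures in the description (speaker count, sample count, total duration) can be checked against a downloaded copy. The following is a minimal sketch, not part of the dataset's own tooling. It assumes the audio folder holds one subfolder per speaker, that the files are uncompressed PCM `.wav` files readable by Python's standard `wave` module, and that the helper name `summarize_corpus` is hypothetical.

```python
import wave
from pathlib import Path


def summarize_corpus(wav_root):
    """Return (speaker_count, utterance_count, total_seconds) for an audio tree.

    Assumption: wav_root contains one subfolder per speaker, each holding
    PCM .wav files. This layout is inferred from the dataset description.
    """
    root = Path(wav_root)
    # Each immediate subdirectory is treated as one speaker.
    speakers = [d for d in sorted(root.iterdir()) if d.is_dir()]
    utterances = 0
    total_seconds = 0.0
    for speaker_dir in speakers:
        for wav_path in sorted(speaker_dir.rglob("*.wav")):
            with wave.open(str(wav_path), "rb") as wf:
                # Duration = number of frames / frames per second.
                total_seconds += wf.getnframes() / wf.getframerate()
            utterances += 1
    return len(speakers), utterances, total_seconds
```

If these assumptions hold, calling `summarize_corpus("wav")` on the extracted dataset should return counts close to the reported 51 speakers and 24,289 samples, and dividing the returned seconds by 3600 should give roughly 31.61 hours.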

Provider:

Science Data Bank

Created:

2025-07-01
