MASRI-HEADSET
Source: arXiv | Updated: 2020-08-13 | Indexed: 2024-06-21
Download link:
https://www.um.edu.mt/projects/masri/downloads.html
Resource description:
MASRI-HEADSET is the first Maltese corpus built specifically for automatic speech recognition (ASR), developed by the MASRI Project at the University of Malta. It contains 8 hours of speech paired with text, recorded in a laboratory environment by 25 participants from different regions of Malta. The recordings cover a wide range of phonetic and phonological variation, which makes the corpus suitable for training ASR systems. Speech samples were selected and recorded following a carefully designed procedure to ensure data quality and diversity. The corpus is intended mainly for research on Maltese ASR and aims to address the shortage of digital resources for the language.
Provider:
University of Malta
Created:
2020-08-13



