Prosody Labelled Dataset for Hindi
Source: arXiv. Updated 2021-12-11. Indexed 2024-06-21.
Download link:
https://github.com/esha-banerjee/Hindi_Au-ToBI
Description:
The Prosody Labelled Dataset for Hindi is a semi-automatically annotated prosodic database developed at Jawaharlal Nehru University. It is intended to strengthen the intonation component of automatic speech recognition (ASR) and text-to-speech (TTS) systems and to support the construction of speech-to-speech machine translation systems. The dataset contains 5,000 declarative and interrogative sentences (23,500 words in total). It was built in two stages: Hindi speech data was first annotated by hand, and those manual annotations were then used to train a prediction model that generates prosodic labels automatically. Its main application is improving the naturalness and accuracy of speech processing systems.
Provider:
Jawaharlal Nehru University
Created:
2021-12-11



