jc4p/CMUdict
Source: Hugging Face. Updated: 2025-03-24. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/jc4p/CMUdict
Resource description:
CMUdict is a dataset based on the Carnegie Mellon University Pronouncing Dictionary. It contains approximately 130,000 word-pronunciation pairs, stored as text-phoneme pairs, and covers English. Its primary use is training and evaluating grapheme-to-phoneme (G2P) conversion models, which map written words to their phonetic pronunciations. Other use cases include text-to-speech (TTS) systems, automatic speech recognition (ASR), language learning applications, linguistic research, and spelling correction.
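To make the text-phoneme pair format concrete, here is a minimal sketch of turning dictionary entries into G2P training pairs. It assumes the classic CMU Pronouncing Dictionary text layout (a word, an optional `(n)` variant marker, then space-separated ARPAbet phonemes); the exact field names and layout of this Hugging Face copy are not stated above, so the sample lines and the helper names `parse_entry` and `strip_stress` are illustrative only.

```python
import re

# Hypothetical sample lines in the classic CMUdict text layout:
# WORD, optional "(n)" variant marker, then space-separated ARPAbet phonemes.
SAMPLE_LINES = [
    ";;; comment lines start with three semicolons",
    "HELLO  HH AH0 L OW1",
    "HELLO(1)  HH EH0 L OW1",
    "READ  R EH1 D",
    "READ(1)  R IY1 D",
]

# Matches a trailing variant marker such as "(1)" on alternate pronunciations.
VARIANT_SUFFIX = re.compile(r"\(\d+\)$")


def parse_entry(line):
    """Return (graphemes, phonemes) for one entry, or None for comments/blank lines."""
    line = line.strip()
    if not line or line.startswith(";;;"):
        return None
    word, *phonemes = line.split()
    word = VARIANT_SUFFIX.sub("", word).lower()
    return list(word), phonemes


def strip_stress(phonemes):
    """Drop the trailing stress digit (0, 1, 2) that ARPAbet vowels carry."""
    return [p.rstrip("012") for p in phonemes]


# Each pair is (list of letters, list of phonemes): the input and target of a G2P model.
pairs = [entry for entry in map(parse_entry, SAMPLE_LINES) if entry is not None]

for graphemes, phonemes in pairs:
    print("".join(graphemes), "->", " ".join(strip_stress(phonemes)))
```

Keeping alternate pronunciations as separate pairs for the same spelling is one possible design choice; a model trained this way sees words such as "read" with more than one valid target. Whether to keep or strip stress digits depends on the downstream task.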
Provider:
jc4p



