Feature Extraction Using Hidden Markov Model for a Phonetic Process

Source: NIAID Data Ecosystem (indexed 2026-05-02)

Download link:

https://zenodo.org/record/6793169


Description:

Speech is one of the primary forms of communication among humans. People consult a dictionary to find the pronunciation of a complex word; the equivalent look-up table for computers is called a phonetic dictionary. A speech recognition process maps a spoken word to its phoneme structure and returns the corresponding grapheme representation. However, speech recognition is challenging because the contextual relationship between words and sentences depends on the speaker's intentions. Further, factors such as timing, accents, noisy environments, and data security threaten accuracy. The present study proposes a new hybrid speech recognition model that considers three significant aspects: sound generation through phonetic representation, sound acoustics for transmission, and sound reception. These steps are realized through a speech-to-text model divided into stages: noise removal, speech-pause detection, and feature extraction through framing and windowing, using a Hidden Markov Model (HMM). The implementation uses Praat, a phonetics tool. The robustness of the model is estimated with the evaluation metrics F-measure and accuracy, which reach 98% and 99%, respectively. Thus, the proposed approach efficiently transforms spoken words into their corresponding text.
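
The framing, windowing, and speech-pause detection stages named in the description can be illustrated with a minimal Python sketch. This is not the authors' Praat implementation. The 25 ms frame length, 10 ms hop, Hamming window, energy threshold, synthetic test signal, and all function names below are assumptions chosen for illustration, since the description does not state these details.

```python
import math


def frame_signal(samples, frame_len, hop_len):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop_len)
    ]


def hamming_window(n):
    """Return the n-point symmetric Hamming window."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def apply_window(frame, window):
    """Multiply a frame by the window, sample by sample."""
    return [s * w for s, w in zip(frame, window)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def detect_speech(frames, threshold):
    """Flag each frame as speech (True) or pause (False) by an energy threshold.

    A fixed threshold is an assumption made for this sketch only.
    """
    return [short_time_energy(f) > threshold for f in frames]


# Synthetic input (assumed): silence, a 440 Hz tone standing in for speech, silence.
SAMPLE_RATE = 16000
signal = (
    [0.0] * 400
    + [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(800)]
    + [0.0] * 400
)

frames = frame_signal(signal, frame_len=400, hop_len=160)  # 25 ms frames, 10 ms hop
window = hamming_window(400)
windowed = [apply_window(f, window) for f in frames]
flags = detect_speech(windowed, threshold=0.01)

print(len(frames), flags)
```

In a full pipeline, the windowed frames flagged as speech would typically be converted to spectral features before being passed to an HMM; that step is omitted here because the description does not specify the features used.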

Created:

2024-07-16
