Enhancing speech recognition through AFF-DCCRN using a PMUT-based bone conduction microphone system

Name: Enhancing speech recognition through AFF-DCCRN using a PMUT-based bone conduction microphone system
Creator: Liu, Chongbin
License: Not specified

Source: IEEE (indexed 2026-04-17)

Download link:

https://ieee-dataport.org/documents/enhancing-speech-recognition-through-aff-dccrn-using-pmut-based-bone-conduction-microphone

Description:

Speech recognition in noisy environments has long been a challenge for Internet of Things (IoT) systems. Speech enhancement (SE) based on neural networks is widely used to address this issue. However, the pre-trained filter parameters used in this approach are prone to overfitting. In this work, a speech enhancement model based on the fusion of bone conduction (BC) and air conduction (AC) speech is developed, with the aim of improving speech recognition accuracy in noisy environments. A customized bone conduction microphone (BCM) system based on piezoelectric micromachined ultrasonic transducers (PMUTs) collects BC speech in real time, while a commercial air conduction microphone (ACM) picks up the AC speech. Instead of using only AC speech, both BC and AC speech are fed into an attention-based feature fusion (AFF) module. In this module, the noise-insensitive BC speech serves as a clean speech reference to adapt the SE backbone for AC speech. The fused speech is then processed by a deep complex convolutional recurrent network (DCCRN) module to produce the enhanced speech. Compared with the original noisy speech, the enhanced speech achieves a character error rate (CER) reduction of over 20%, approaching the result decoded from clean speech. The results indicate that the BC-based speech enhancement model efficiently integrates the characteristics of both types of speech, thereby improving speech recognition accuracy in noisy environments. This work presents an innovative IoT system designed to enhance speech recognition in noisy environments.
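
The description reports its result as a character error rate (CER). For readers unfamiliar with the metric, the sketch below uses the standard definition: the Levenshtein (edit) distance between the recognized text and the reference transcript, divided by the reference length in characters. It is a minimal illustration, not the authors' evaluation code. The function names `edit_distance` and `cer` and the sample strings are hypothetical, and the description does not say whether the reported reduction of over 20% is absolute or relative.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only (not from the dataset).
    reference = "turn on the light"
    noisy_hyp = "turn of the night"
    enhanced_hyp = "turn on the light"
    print(f"CER (noisy):    {cer(reference, noisy_hyp):.3f}")
    print(f"CER (enhanced): {cer(reference, enhanced_hyp):.3f}")
```

Because the metric is normalized by the reference length, it can exceed 1.0 when the hypothesis contains many insertions. Computing it per character rather than per word makes it suitable for languages without whitespace word boundaries.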

Provided by:

Liu, Chongbin
