
Enhancing speech recognition through AFF-DCCRN using a PMUT-based bone conduction microphone system

Indexed from IEEE on 2026-04-17
Download link:
https://ieee-dataport.org/documents/enhancing-speech-recognition-through-aff-dccrn-using-pmut-based-bone-conduction-microphone
Description:
Speech recognition in noisy environments has long posed a challenge in Internet of Things (IoT) systems. Speech enhancement (SE) based on neural networks is a widely employed technique for addressing this issue. However, the pre-trained filter parameters used in this approach are prone to overfitting. In this work, a speech enhancement model based on the fusion of bone conduction (BC) and air conduction (AC) speech is developed, aiming to improve speech recognition accuracy in noisy environments. A customized bone conduction microphone (BCM) system based on piezoelectric micromachined ultrasonic transducers (PMUTs) collects BC speech in real time, while a commercial air conduction microphone (ACM) picks up the AC speech. Instead of using only AC speech, both BC and AC speech are input into an attention-based feature fusion (AFF) module. In this module, the noise-insensitive BC speech serves as a clean speech reference to adapt the SE backbone of AC speech. The fused speech is then processed by a deep complex convolutional recurrent network (DCCRN) module, resulting in enhanced speech. Compared with the original noisy speech, the enhanced speech achieves a character error rate (CER) reduction of over 20%, approaching the result decoded using clean speech. The results indicate that the BC-based speech enhancement model efficiently integrates the characteristics of both types of speech, thereby improving speech recognition accuracy in noisy environments. This work presents an innovative IoT system designed to enhance speech recognition in noisy environments.
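The description reports its result as a character error rate (CER). As a rough illustration of how that metric is commonly computed, the sketch below divides the character-level Levenshtein distance between a reference transcript and a recognizer output by the reference length. It is not the authors' evaluation code: the function names `edit_distance` and `cer` are hypothetical, and stripping spaces before scoring is an assumption that may not match the scoring used for this dataset.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the first i-1 items of ref and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by the reference length.

    Assumption (not from the source): spaces are removed before scoring.
    """
    ref = list(reference.replace(" ", ""))
    hyp = list(hypothesis.replace(" ", ""))
    if not ref:
        raise ValueError("reference transcript must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, `cer(reference, noisy_output)` and `cer(reference, enhanced_output)` can be compared per utterance or averaged over a test set. The description does not state whether its reported reduction is absolute or relative, so that comparison is left to the reader.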
Provided by:
Liu, Chongbin