Multi-channel speech enhancement based on noise feedback using MVDR and MTGAN
Source: China Scientific Data | Updated: 2026-03-25 | Indexed: 2026-04-25
Download link:
https://www.sciengine.com/AA/doi/10.3724/SP.J.1249.2026.01093
Resource description:
Mainstream multi-channel speech enhancement systems typically adopt a cascaded architecture that combines beamforming and post-filtering. In non-stationary noise environments, beamforming often suffers from degraded spatial filtering performance due to noise estimation errors, whereas deep learning-based post-filtering improves residual noise suppression but incurs high computational cost. This paper proposes a closed-loop enhancement framework that integrates a minimum variance distortionless response (MVDR) beamformer with a multi-target generative adversarial network (MTGAN), achieving joint spatial-frequency optimization via a noise estimation feedback mechanism. In this framework, a dual-branch generator in the MTGAN simultaneously performs post-filtering and noise estimation, and the estimated noise is dynamically fed back into the MVDR covariance matrix update to enable iterative closed-loop optimization. Simulation results on public datasets show that the proposed noise feedback mechanism effectively improves the MVDR output performance. Compared with the existing MVDR-CRUSE system, the proposed MVDR+MTGAN system not only reduces model complexity (by 10.5%, from 2.38×10^6 to 2.13×10^6 parameters) but also yields substantial gains across speech quality metrics, with a 6.56 dB increase in average segmental signal-to-noise ratio and a 0.17 improvement in the composite overall quality score (COVL). The proposed method provides an efficient and effective solution for multi-channel speech enhancement in complex acoustic environments.
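The feedback idea described above, in which a noise estimate drives the MVDR covariance update, can be illustrated with a minimal sketch for a single frequency bin. This is not the paper's implementation: the function names, the recursive smoothing factor `alpha`, the diagonal loading term, and the random frames standing in for the MTGAN noise branch are all assumptions made for illustration. The weights use the standard MVDR solution w = R^{-1} d / (d^H R^{-1} d).

```python
import numpy as np

def update_noise_cov(R_prev, noise_frame, alpha=0.95):
    """Recursively smooth the noise spatial covariance for one frequency bin.

    noise_frame is a complex vector of shape (n_mics,) holding the estimated
    noise STFT coefficients (hypothetical stand-in for the fed-back estimate).
    """
    outer = np.outer(noise_frame, noise_frame.conj())
    return alpha * R_prev + (1.0 - alpha) * outer

def mvdr_weights(R_noise, steering, diag_load=1e-6):
    """Standard MVDR weights: w = R^{-1} d / (d^H R^{-1} d).

    A small diagonal loading term (an assumption, not from the source) keeps
    the covariance well conditioned before solving.
    """
    n = R_noise.shape[0]
    R = R_noise + diag_load * (np.trace(R_noise).real / n) * np.eye(n)
    Rinv_d = np.linalg.solve(R, steering)
    return Rinv_d / np.vdot(steering, Rinv_d)

# Toy closed loop: each frame, a noise estimate refreshes the covariance,
# and the beamformer weights are recomputed from the updated matrix.
rng = np.random.default_rng(0)
n_mics = 4
steering = np.exp(-2j * np.pi * rng.random(n_mics))  # assumed steering vector
R = np.eye(n_mics, dtype=complex)                    # initial covariance
for _ in range(50):
    # Random frames stand in for the noise-estimation branch output.
    noise_est = rng.standard_normal(n_mics) + 1j * rng.standard_normal(n_mics)
    R = update_noise_cov(R, noise_est)
    w = mvdr_weights(R, steering)

# Distortionless check: the response toward the target, w^H d, should be 1.
response = np.vdot(w, steering)
```

In a full system this loop would run per frequency bin and per frame, with the beamformer output passed to the post-filter and the post-filter's noise branch supplying `noise_est` for the next update.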
Created:
2026-01-17



