Explainable DNN-Based Beamformer With Postfilter

Bibliographic Details
Published in	IEEE Transactions on Audio, Speech and Language Processing, Vol. 33, pp. 3070-3084
Main Authors	Cohen, Adi; Wong, Daniel; Lee, Jung-Suk; Gannot, Sharon
Format	Journal Article
Language	English
Published	IEEE 2025
Subjects	Array signal processing; Beamforming; deep neural networks for spatial filtering; Microphones; Neural networks; Noise; postfilter; Spatial filters; Speech enhancement; Time-frequency analysis; Training; Vectors; Wiener filters
ISSN	2998-4173
DOI	10.1109/TASLPRO.2025.3581110

Summary:	This paper introduces an explainable DNN-based beamformer with a postfilter (ExNet-BF+PF) for multichannel signal processing. Our approach combines a U-Net with a beamformer structure. The method is a two-stage processing pipeline. In the first stage, time-invariant weights are applied to construct a multichannel spatial filter, namely a beamformer. In the second stage, a time-varying single-channel postfilter is applied at the beamformer output. To further improve speech enhancement, we also incorporate an attention mechanism, motivated by its successful application in noisy and reverberant environments. The proposed scheme removes the need for prior knowledge of the speaker's activity, which classical beamforming designs require. Furthermore, our study fills a gap in the existing literature by conducting a thorough spatial analysis of the network's performance, examining how the network utilizes spatial information during processing. This analysis yields insights into the network's functionality and helps explain its overall performance. A thorough experimental study analyzes the performance of the proposed method in various scenarios and compares both the results and the computational resource consumption with several baseline methods, including the classical Minimum Variance Distortionless Response (MVDR) beamformer, Deep Neural Network (DNN)-based spatial filters that preserve a beamforming structure, and deep end-to-end schemes.
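
The two-stage structure described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: in the paper the beamformer weights and the postfilter come from a trained network, whereas here they are hand-picked placeholder values. The function names (`beamform`, `postfilter`), the `stft[m][t][f]` data layout, and the toy signal are all assumptions made for illustration. Stage 1 applies one fixed complex weight per microphone and frequency bin (filter-and-sum); stage 2 applies a real gain that may change from frame to frame.

```python
from typing import List

# Assumed layout (hypothetical): stft[m][t][f] is the complex STFT coefficient
# of microphone m at time frame t and frequency bin f.
Stft = List[List[List[complex]]]


def beamform(stft: Stft, weights: List[List[complex]]) -> List[List[complex]]:
    """Stage 1: time-invariant filter-and-sum.

    y[t][f] = sum over m of conj(weights[m][f]) * stft[m][t][f].
    The weights depend on microphone and frequency only, not on time.
    """
    n_mics = len(stft)
    n_frames = len(stft[0])
    n_bins = len(stft[0][0])
    return [
        [
            sum(weights[m][f].conjugate() * stft[m][t][f] for m in range(n_mics))
            for f in range(n_bins)
        ]
        for t in range(n_frames)
    ]


def postfilter(beamformed: List[List[complex]],
               gains: List[List[float]]) -> List[List[complex]]:
    """Stage 2: time-varying single-channel gain, z[t][f] = gains[t][f] * y[t][f]."""
    return [
        [g * y for g, y in zip(gain_row, y_row)]
        for gain_row, y_row in zip(gains, beamformed)
    ]


# Toy input: 2 microphones, 2 frames, 1 frequency bin. The second microphone
# observes the same signal with a 90-degree phase shift (multiplied by 1j).
stft = [
    [[1 + 0j], [2 + 0j]],   # microphone 0
    [[0 + 1j], [0 + 2j]],   # microphone 1
]

# Placeholder weights that undo the phase shift and average the two channels.
weights = [[0.5 + 0j], [0 + 0.5j]]

# Placeholder postfilter gains: pass frame 0 unchanged, attenuate frame 1 by half.
gains = [[1.0], [0.5]]

beamformed = beamform(stft, weights)
enhanced = postfilter(beamformed, gains)
```

With these placeholder values, the beamformer output recovers the source amplitudes 1 and 2 in the two frames, and the postfilter then halves the second frame. The split mirrors the summary: the spatial filtering is fixed over time, while the per-frame gain carries all of the time variation.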