BiLSTM-Based Mask Post-Processing Method for a Generalized Eigenvalue Beamformer 


Vol. 46, No. 6, pp. 1078-1086, Jun. 2021
10.7840/kics.2021.46.6.1078


  Abstract

Generalized eigenvalue (GEV) beamforming can estimate a target speech signal from a multi-channel microphone array in noisy environments without relying on the array geometry or the direction-of-arrival (DOA) of the target speech. The GEV beamformer is constructed from the power spectral density (PSD) matrices of the target speech and the noise. This paper proposes a binary mask post-processing method for a GEV beamformer based on a bidirectional long short-term memory (BiLSTM) neural network. The BiLSTM is trained with spectrograms of the multi-channel input speech as inputs and ideal binary masks as outputs. The estimated binary mask is then applied to the multi-channel noisy speech signals to obtain the speech and noise PSD matrices. To further improve the quality of the enhanced speech, the estimated binary mask is also applied in a post-processing stage of the GEV beamformer. The performance of the GEV beamformer is evaluated on the CHiME-3 task by measuring the perceptual evaluation of speech quality (PESQ) and the signal-to-distortion ratio (SDR). Experiments show that the GEV beamformer employing the proposed binary mask post-processing method improves PESQ and SDR by 0.34 mean opinion score (MOS) and 0.91 dB, respectively, compared to the conventional BiLSTM-based mask estimation method.
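The pipeline described in the abstract (mask-weighted PSD estimation, GEV weight computation, and mask-based post-processing) can be sketched roughly as follows. This is a minimal NumPy illustration and not the authors' implementation: the function names `masked_psd`, `gev_weights`, and `gev_beamform` are hypothetical, the mask is assumed to be already estimated (the BiLSTM itself is not shown), the noise mask is assumed to be the complement of the speech mask, the diagonal loading constant `eps` is an added assumption for numerical stability, and the post-processing step is assumed to be an element-wise multiplication of the beamformer output by the mask.

```python
import numpy as np


def masked_psd(stft, mask):
    """Mask-weighted PSD matrices.

    stft: complex array of shape (channels, freqs, frames).
    mask: real array of shape (freqs, frames) with values in [0, 1].
    Returns an array of shape (freqs, channels, channels).
    """
    weighted = stft * mask[None, :, :]
    # Phi[f, c, d] = sum_t mask[f, t] * Y[c, f, t] * conj(Y[d, f, t])
    psd = np.einsum("cft,dft->fcd", weighted, stft.conj())
    norm = np.maximum(mask.sum(axis=-1), 1e-10)
    return psd / norm[:, None, None]


def gev_weights(psd_speech, psd_noise, eps=1e-6):
    """Principal generalized eigenvector of (Phi_speech, Phi_noise) per frequency.

    eps is a diagonal loading term (an assumption, not from the paper) that
    keeps the noise PSD matrix positive definite.
    """
    n_freqs, n_ch, _ = psd_speech.shape
    weights = np.empty((n_freqs, n_ch), dtype=complex)
    for f in range(n_freqs):
        noise = psd_noise[f] + eps * np.eye(n_ch)
        # Whiten with the Cholesky factor: noise = L @ L^H.
        chol = np.linalg.cholesky(noise)
        chol_inv = np.linalg.inv(chol)
        whitened = chol_inv @ psd_speech[f] @ chol_inv.conj().T
        # eigh returns eigenvalues in ascending order; take the largest.
        _, vecs = np.linalg.eigh(whitened)
        weights[f] = chol_inv.conj().T @ vecs[:, -1]
    return weights


def gev_beamform(stft, speech_mask, apply_mask_postprocessing=True):
    """Beamform with GEV weights, optionally masking the output."""
    noise_mask = 1.0 - speech_mask  # assumption: complementary noise mask
    weights = gev_weights(
        masked_psd(stft, speech_mask), masked_psd(stft, noise_mask)
    )
    # Output Z[f, t] = w[f]^H y[f, t]
    enhanced = np.einsum("fc,cft->ft", weights.conj(), stft)
    if apply_mask_postprocessing:
        # Assumed form of the post-processing stage: element-wise masking.
        enhanced = enhanced * speech_mask
    return enhanced
```

In this sketch the generalized eigenvalue problem is solved by whitening with a Cholesky factor so that only NumPy is needed; a library routine for the Hermitian generalized eigenproblem would serve equally well. The GEV solution is defined only up to a complex scale per frequency, so practical systems usually add a normalization step that is omitted here.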



  Cite this article

[IEEE Style]

I. Song and H. K. Kim, "BiLSTM-Based Mask Post-Processing Method for a Generalized Eigenvalue Beamformer," The Journal of Korean Institute of Communications and Information Sciences, vol. 46, no. 6, pp. 1078-1086, 2021. DOI: 10.7840/kics.2021.46.6.1078.

[ACM Style]

Ilhoon Song and Hong Kook Kim. 2021. BiLSTM-Based Mask Post-Processing Method for a Generalized Eigenvalue Beamformer. The Journal of Korean Institute of Communications and Information Sciences, 46, 6, (2021), 1078-1086. DOI: 10.7840/kics.2021.46.6.1078.

[KICS Style]

Ilhoon Song and Hong Kook Kim, "BiLSTM-Based Mask Post-Processing Method for a Generalized Eigenvalue Beamformer," The Journal of Korean Institute of Communications and Information Sciences, vol. 46, no. 6, pp. 1078-1086, 6. 2021. (https://doi.org/10.7840/kics.2021.46.6.1078)