A Real-Time Implementation of a Nested U-NET-Based Speech Enhancement 


Vol. 48, No. 9, pp. 1064-1071, Sep. 2023
10.7840/kics.2023.48.9.1064


  Abstract

Speech enhancement removes noise and improves speech intelligibility, and it is used in many fields such as voice recognition and video conferencing. Recently, DNN-based speech enhancement techniques have gained attention, and the Nested U-Net model, which effectively utilizes both local and global information in the speech signal, has shown excellent performance. To extend the use of speech enhancement, real-time execution must be possible on edge devices such as smartphones. In this paper, NUNet-TLS, one of the latest models based on Nested U-Net, is implemented in real time in a smartphone app environment. The dilated convolution used in NUNet-TLS requires a long computation time because of its frequent memory usage; by replacing it with an LSTM (Long Short-Term Memory), we significantly reduced both memory usage and computation time. The proposed model was implemented as an app in which data input and output are processed frame by frame. In real-time execution, this app showed lower memory usage than previous models while maintaining comparable speech enhancement performance.
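
The frame-by-frame processing described in the abstract can be illustrated with a minimal sketch. It is not the NUNet-TLS architecture or the authors' app code: the class names (`TinyLSTMCell`, `StreamingProcessor`), the helper `dilated_conv_history`, the layer sizes, and the random weights are all hypothetical. The sketch only shows the general idea that a recurrent layer carries a fixed-size state (h, c) from one frame to the next, whereas a causal dilated convolution along the time axis, assumed here for comparison, has to keep (kernel_size - 1) * dilation past frames buffered.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMCell:
    """Single LSTM cell in plain Python (hypothetical sizes, random weights)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight row per gate unit, ordered: input, forget, candidate, output.
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + n_hidden)]
                  for _ in range(4 * n_hidden)]
        self.b = [0.0] * (4 * n_hidden)

    def initial_state(self):
        return [0.0] * self.n_hidden, [0.0] * self.n_hidden

    def step(self, x, state):
        """Consume one frame x and the previous (h, c); return output and new state."""
        h, c = state
        xh = list(x) + list(h)
        z = [sum(wi * vi for wi, vi in zip(row, xh)) + bi
             for row, bi in zip(self.w, self.b)]
        n = self.n_hidden
        i = [sigmoid(v) for v in z[0:n]]
        f = [sigmoid(v) for v in z[n:2 * n]]
        g = [math.tanh(v) for v in z[2 * n:3 * n]]
        o = [sigmoid(v) for v in z[3 * n:4 * n]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, (h_new, c_new)


def run_offline(cell, frames):
    """Process a whole utterance at once (reference result)."""
    state = cell.initial_state()
    outputs = []
    for x in frames:
        y, state = cell.step(x, state)
        outputs.append(y)
    return outputs


class StreamingProcessor:
    """Processes one frame per call, keeping only (h, c) between calls."""

    def __init__(self, cell):
        self.cell = cell
        self.state = cell.initial_state()

    def process_frame(self, x):
        y, self.state = self.cell.step(x, self.state)
        return y


def dilated_conv_history(kernel_size, dilation):
    """Past frames a causal dilated convolution must buffer when streaming."""
    return (kernel_size - 1) * dilation
```

In this sketch, calling `process_frame` once per incoming frame gives the same outputs as `run_offline` on the full sequence, while the memory carried between calls stays at two vectors of length `n_hidden` regardless of how far back the context reaches.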

  Cite this article

[IEEE Style]

J. Cha, S. Hwang, S. W. Park, Y. Park, "A Real-Time Implementation of a Nested U-NET-Based Speech Enhancement," The Journal of Korean Institute of Communications and Information Sciences, vol. 48, no. 9, pp. 1064-1071, 2023. DOI: 10.7840/kics.2023.48.9.1064.

[ACM Style]

Jae-Bin Cha, Seo-Rim Hwang, Sung Wook Park, and Young-Cheol Park. 2023. A Real-Time Implementation of a Nested U-NET-Based Speech Enhancement. The Journal of Korean Institute of Communications and Information Sciences, 48, 9, (2023), 1064-1071. DOI: 10.7840/kics.2023.48.9.1064.

[KICS Style]

Jae-Bin Cha, Seo-Rim Hwang, Sung Wook Park, Young-Cheol Park, "A Real-Time Implementation of a Nested U-NET-Based Speech Enhancement," The Journal of Korean Institute of Communications and Information Sciences, vol. 48, no. 9, pp. 1064-1071, Sep. 2023. (https://doi.org/10.7840/kics.2023.48.9.1064)