A Novel Frequency Domain Linearly Constrained Minimum Variance Filter for Speech Enhancement

Document Type: Original Article

Author

Faculty of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran

Abstract

A reliable speech enhancement method is an important pre-processing step for speech applications because it improves their overall performance. In this paper, we propose a novel frequency-domain method for single-channel speech enhancement. Conventional frequency-domain methods usually neglect the correlation between neighboring time-frequency components of the signals. The proposed method takes this correlation into account in two ways: 1) for the speech signal, we decompose the clean speech into two uncorrelated components; 2) for the noise, we approximate the noise as a rank-1 component. We then design a linearly constrained minimum variance (LCMV) filter that removes the dominant part of the noise while keeping the speech signal undistorted. The performance of the proposed method is evaluated in terms of output signal-to-noise ratio (SNR) and speech distortion index under various noise environments. The evaluation results show that our method yields higher noise reduction and lower speech distortion than some recent methods.
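To make the LCMV idea concrete, the following is a minimal sketch of the textbook LCMV solution, h = R^{-1} C (C^H R^{-1} C)^{-1} f, which minimizes the output variance h^H R h subject to the linear constraints C^H h = f. It is not the filter derived in the paper: the vector length `N`, the vectors `d_x` and `d_v`, the toy covariance `R`, and the choice of constraints (unit gain on the speech direction, a null on the rank-1 noise direction) are illustrative assumptions.

```python
import numpy as np

def lcmv_filter(R, C, f):
    """Textbook LCMV solution: minimize h^H R h subject to C^H h = f."""
    Rinv_C = np.linalg.solve(R, C)            # R^{-1} C
    gram = C.conj().T @ Rinv_C                # C^H R^{-1} C
    return Rinv_C @ np.linalg.solve(gram, f)  # R^{-1} C (C^H R^{-1} C)^{-1} f

rng = np.random.default_rng(0)
N = 4  # assumed number of neighboring time-frequency components per vector

# Hypothetical direction vectors (assumptions, not taken from the paper):
# d_x for the desired speech component, d_v for the rank-1 noise component.
d_x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
d_v = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Toy covariance of the noisy observation: speech + rank-1 noise + white floor.
R = (2.0 * np.outer(d_x, d_x.conj())
     + 5.0 * np.outer(d_v, d_v.conj())
     + 0.1 * np.eye(N))

C = np.column_stack([d_x, d_v])  # constraint matrix
f = np.array([1.0, 0.0])         # unit gain on speech, null on dominant noise

h = lcmv_filter(R, C, f)
speech_gain = h.conj() @ d_x     # h^H d_x, constrained to 1
noise_gain = h.conj() @ d_v      # h^H d_v, constrained to 0
```

With these constraints the speech direction passes with unit gain and the rank-1 noise direction is cancelled, while the remaining degrees of freedom of `h` are spent on minimizing the residual output power. In practice `R`, `d_x`, and `d_v` would have to be estimated from the noisy signal, which this sketch does not attempt.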

Keywords

