Residual Echo Suppression Considering Harmonic Distortion and Temporal Correlation
Abstract
Featured Application
1. Introduction
2. Problem Formulation
3. Harmonic Distortion Residual Echo Suppression
4. Residual Echo Suppression Considering Harmonic Distortion and Temporal Correlation
5. Experiments
5.1. Experiments with Simulated Data
5.2. Experiments with Real-Recorded Data
5.3. Computational Complexity of the Proposed RES Algorithm
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Gay, S.L.; Benesty, J. Acoustic Signal Processing for Telecommunication; Springer Science & Business Media: Berlin, Germany, 2012; Volume 551.
- Hong, J. Stereophonic Acoustic Echo Suppression for Speech Interfaces for Intelligent TV Applications. IEEE Trans. Consum. Electron. 2018, 64, 153–161.
- Spriet, A.; Proudler, I.; Moonen, M.; Wouters, J. Adaptive feedback cancellation in hearing aids with linear prediction of the desired signal. IEEE Trans. Signal Process. 2005, 53, 3749–3763.
- Spriet, A.; Rombouts, G.; Moonen, M.; Wouters, J. Adaptive feedback cancellation in hearing aids. J. Frankl. Inst. 2006, 343, 545–573.
- Paleologu, C.; Ciochină, S.; Benesty, J.; Grant, S.L. An overview on optimized NLMS algorithms for acoustic echo cancellation. EURASIP J. Adv. Signal Process. 2015, 2015, 97.
- Jung, H.K.; Kim, N.S.; Kim, T. A new double-talk detector using echo path estimation. Speech Commun. 2005, 45, 41–48.
- Gänsler, T.; Benesty, J. The fast normalized cross-correlation double-talk detector. Signal Process. 2006, 86, 1124–1139.
- Malvar, H. A modulated complex lapped transform and its applications to audio processing. In Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP99 (Cat. No. 99CH36258), Phoenix, AZ, USA, 15–19 March 1999; IEEE: Piscataway, NJ, USA, 1999; Volume 3, pp. 1421–1424.
- Yang, F.; Enzner, G.; Yang, J. Frequency-domain adaptive Kalman filter with fast recovery of abrupt echo-path changes. IEEE Signal Process. Lett. 2017, 24, 1778–1782.
- Shi, K.; Ma, X. A frequency domain step-size control method for LMS algorithms. IEEE Signal Process. Lett. 2009, 17, 125–128.
- Yang, F.; Cao, Y.; Wu, M.; Albu, F.; Yang, J. Frequency-Domain Filtered-x LMS Algorithms for Active Noise Control: A Review and New Insights. Appl. Sci. 2018, 8, 2313.
- Ni, J.; Li, F. A variable step-size matrix normalized subband adaptive filter. IEEE Trans. Audio Speech Lang. Process. 2009, 18, 1290–1299.
- Ni, J.; Li, F. Adaptive combination of subband adaptive filters for acoustic echo cancellation. IEEE Trans. Consum. Electron. 2010, 56, 1549–1555.
- Stenger, A.; Trautmann, L.; Rabenstein, R. Nonlinear acoustic echo cancellation with 2nd order adaptive Volterra filters. In Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP99 (Cat. No. 99CH36258), Phoenix, AZ, USA, 15–19 March 1999; Volume 2, pp. 877–880.
- Guérin, A.; Faucon, G.; Le Bouquin-Jeannès, R. Nonlinear acoustic echo cancellation based on Volterra filters. IEEE Trans. Speech Audio Process. 2003, 11, 672–683.
- Azpicueta-Ruiz, L.A.; Zeller, M.; Figueiras-Vidal, A.R.; Arenas-García, J.; Kellermann, W. Adaptive combination of Volterra kernels and its application to nonlinear acoustic echo cancellation. IEEE Trans. Audio Speech Lang. Process. 2010, 19, 97–110.
- Park, J.; Chang, J.H. State-Space Microphone Array Nonlinear Acoustic Echo Cancellation Using Multi-Microphone Near-End Speech Covariance. IEEE/ACM Trans. Audio Speech Lang. Process. 2019, 27, 1520–1534.
- Avendano, C. Acoustic echo suppression in the STFT domain. In Proceedings of the 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (Cat. No. 01TH8575), New Paltz, NY, USA, 24–24 October 2001; pp. 175–178.
- Faller, C.; Chen, J. Suppressing acoustic echo in a spectral envelope space. IEEE Trans. Speech Audio Process. 2005, 13, 1048–1062.
- Park, Y.S.; Chang, J.H. Frequency domain acoustic echo suppression based on soft decision. IEEE Signal Process. Lett. 2008, 16, 53–56.
- Panda, B.; Kar, A.; Chandra, M. Non-linear adaptive echo supression algorithms: A technical survey. In Proceedings of the 2014 International Conference on Communication and Signal Processing, Melmaruvathur, India, 3–5 April 2014; pp. 76–80.
- Hoshuyama, O.; Sugiyama, A. An acoustic echo suppressor based on a frequency-domain model of highly nonlinear residual echo. In Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, Toulouse, France, 14–19 May 2006; Volume 5, pp. 269–272.
- Lee, S.Y.; Kim, N.S. A statistical model-based residual echo suppression. IEEE Signal Process. Lett. 2007, 14, 758–761.
- Schwarz, A.; Hofmann, C.; Kellermann, W. Spectral feature-based nonlinear residual echo suppression. In Proceedings of the Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 20–23 October 2013; pp. 1–4.
- Lee, C.M.; Shin, J.W.; Kim, N.S. DNN-based residual echo suppression. In Proceedings of the Sixteenth Annual Conference of the International Speech Communication Association, Dresden, Germany, 6–10 September 2015.
- Carbajal, G.; Serizel, R.; Vincent, E.; Humbert, E. Multiple-input neural network-based residual echo suppression. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 231–235.
- Kuech, F.; Kellermann, W. Nonlinear residual echo suppression using a power filter model of the acoustic echo path. In Proceedings of the 2007 IEEE International Conference on Acoustics, Speech and Signal Processing-ICASSP ’07, Honolulu, HI, USA, 15–20 April 2007; Volume 1, p. I-73.
- Bendersky, D.A.; Stokes, J.W.; Malvar, H.S. Nonlinear residual acoustic echo suppression for high levels of harmonic distortion. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 31 March–4 April 2008; pp. 261–264.
- Martin, R. Noise power spectral density estimation based on optimal smoothing and minimum statistics. IEEE Trans. Speech Audio Process. 2001, 9, 504–512.
- Park, Y.S.; Chang, J.H. Double-talk detection based on soft decision for acoustic echo suppression. Signal Process. 2010, 90, 1737–1741.
- Lamel, L.F.; Kassel, R.H.; Seneff, S. Speech database development: Design and analysis of the acoustic-phonetic corpus. In Proceedings of the ESCA Tutorial and Research Workshop on Speech Input/Output Assessment and Speech Databases, Noordwijkerhout, The Netherlands, 20–23 September 1989; pp. 161–170.
- Varga, A.; Steeneken, H.J. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems. Speech Commun. 1993, 12, 247–251.
- Malik, S.; Enzner, G. State-space frequency-domain adaptive filtering for nonlinear acoustic echo cancellation. IEEE Trans. Audio Speech Lang. Process. 2012, 20, 2065–2079.
- Comminiello, D.; Scarpiniti, M.; Azpicueta-Ruiz, L.A.; Arenas-Garcia, J.; Uncini, A. Functional link adaptive filters for nonlinear acoustic echo cancellation. IEEE Trans. Audio Speech Lang. Process. 2013, 21, 1502–1512.
- Allen, J.B.; Berkley, D.A. Image method for efficiently simulating small-room acoustics. J. Acoust. Soc. Am. 1979, 65, 943–950.
- 3GPP. 3GPP TS 26.132, Technical Specification Group Services and System Aspects; Speech and Video Telephony Terminal Acoustic Test Specification; European Telecommunications Standards Institute: Sophia Antipolis, France, 2020.
- ETSI. ETSI EG 202 396-1 Speech Processing, Transmission and Quality Aspects (STQ); Speech Quality Performance in the Presence of Background Noise; Part 1: Background Noise Simulation Technique and Background Noise Database; European Telecommunications Standards Institute: Sophia Antipolis, France, 2008.

Noise Type | ENR | AEC [8,30] | AEC [8,30] +PFRES [27] | AEC [8,30] +HDRES [28] | AEC [8,30] +Proposed RES |
---|---|---|---|---|---|
Clean | – | 8.59 | 13.14 | 9.58 | 14.41 |
White | 20 dB | 7.94 | 12.56 | 8.96 | 13.53 |
White | 15 dB | 7.34 | 12.03 | 8.29 | 12.52 |
White | 10 dB | 6.25 | 10.85 | 7.09 | 11.04 |
Babble | 20 dB | 7.87 | 12.71 | 8.89 | 13.53 |
Babble | 15 dB | 7.30 | 12.07 | 8.26 | 12.67 |
Babble | 10 dB | 6.21 | 10.86 | 7.07 | 11.32 |
Factory | 20 dB | 7.90 | 12.36 | 8.91 | 13.55 |
Factory | 15 dB | 7.36 | 11.91 | 8.32 | 12.78 |
Factory | 10 dB | 6.28 | 10.92 | 7.13 | 11.44 |
Average | – | 7.31 | 11.94 | 8.25 | 12.68 |

Noise Type | SNR | SER | AEC [8,30] | AEC [8,30] +PFRES [27] | AEC [8,30] +HDRES [28] | AEC [8,30] +Proposed RES |
---|---|---|---|---|---|---|
Clean | – | 5 dB | 2.902 | 2.911 | 2.918 | 2.991 |
Clean | – | 0 dB | 2.665 | 2.710 | 2.697 | 2.771 |
Clean | – | −5 dB | 2.468 | 2.511 | 2.473 | 2.540 |
White | 20 dB | 5 dB | 2.660 | 2.658 | 2.673 | 2.734 |
White | 20 dB | 0 dB | 2.459 | 2.490 | 2.491 | 2.553 |
White | 20 dB | −5 dB | 2.287 | 2.309 | 2.291 | 2.345 |
White | 15 dB | 5 dB | 2.498 | 2.485 | 2.510 | 2.567 |
White | 15 dB | 0 dB | 2.351 | 2.362 | 2.362 | 2.421 |
White | 15 dB | −5 dB | 2.179 | 2.200 | 2.183 | 2.227 |
White | 10 dB | 5 dB | 2.279 | 2.270 | 2.290 | 2.340 |
White | 10 dB | 0 dB | 2.179 | 2.181 | 2.189 | 2.241 |
White | 10 dB | −5 dB | 2.039 | 2.045 | 2.041 | 2.079 |
Babble | 20 dB | 5 dB | 2.501 | 2.512 | 2.513 | 2.568 |
Babble | 20 dB | 0 dB | 2.322 | 2.368 | 2.352 | 2.407 |
Babble | 20 dB | −5 dB | 2.170 | 2.208 | 2.174 | 2.217 |
Babble | 15 dB | 5 dB | 2.322 | 2.336 | 2.332 | 2.381 |
Babble | 15 dB | 0 dB | 2.180 | 2.227 | 2.208 | 2.255 |
Babble | 15 dB | −5 dB | 2.055 | 2.085 | 2.057 | 2.092 |
Babble | 10 dB | 5 dB | 2.089 | 2.108 | 2.096 | 2.138 |
Babble | 10 dB | 0 dB | 1.991 | 2.043 | 2.016 | 2.053 |
Babble | 10 dB | −5 dB | 1.904 | 1.926 | 1.904 | 1.929 |
Factory | 20 dB | 5 dB | 2.584 | 2.589 | 2.598 | 2.658 |
Factory | 20 dB | 0 dB | 2.391 | 2.432 | 2.422 | 2.479 |
Factory | 20 dB | −5 dB | 2.228 | 2.254 | 2.231 | 2.280 |
Factory | 15 dB | 5 dB | 2.422 | 2.430 | 2.434 | 2.489 |
Factory | 15 dB | 0 dB | 2.259 | 2.306 | 2.289 | 2.342 |
Factory | 15 dB | −5 dB | 2.118 | 2.150 | 2.122 | 2.166 |
Factory | 10 dB | 5 dB | 2.206 | 2.212 | 2.216 | 2.266 |
Factory | 10 dB | 0 dB | 2.090 | 2.126 | 2.117 | 2.161 |
Factory | 10 dB | −5 dB | 1.985 | 2.007 | 1.987 | 2.022 |
Average | – | – | 2.293 | 2.315 | 2.306 | 2.357 |

Noise Type | ENR | AEC [8,30] | AEC [8,30] +PFRES [27] | AEC [8,30] +HDRES [28] | AEC [8,30] +Proposed RES |
---|---|---|---|---|---|
Clean | – | 13.44 | 20.58 | 20.38 | 22.39 |
Pub | 20 dB | 13.27 | 17.29 | 17.15 | 18.39 |
Pub | 15 dB | 11.24 | 14.73 | 14.94 | 15.90 |
Pub | 10 dB | 8.51 | 11.61 | 12.04 | 13.00 |
Road | 20 dB | 13.44 | 17.63 | 16.96 | 18.93 |
Road | 15 dB | 11.56 | 15.36 | 14.69 | 16.65 |
Road | 10 dB | 9.02 | 12.51 | 11.77 | 13.86 |
Callcenter | 20 dB | 13.47 | 17.45 | 17.06 | 18.53 |
Callcenter | 15 dB | 11.69 | 15.13 | 14.93 | 16.21 |
Callcenter | 10 dB | 9.21 | 12.19 | 12.01 | 13.30 |
Average | – | 11.49 | 15.45 | 15.19 | 16.72 |

Noise Type | SNR | SER | AEC [8,30] | AEC [8,30] +PFRES [27] | AEC [8,30] +HDRES [28] | AEC [8,30] +Proposed Method |
---|---|---|---|---|---|---|
Clean | – | 5 dB | 3.178 | 3.206 | 3.243 | 3.275 |
Clean | – | 0 dB | 3.013 | 3.057 | 3.087 | 3.135 |
Clean | – | −5 dB | 2.815 | 2.874 | 2.879 | 2.962 |
Pub | 20 dB | 5 dB | 3.005 | 3.024 | 3.033 | 3.065 |
Pub | 20 dB | 0 dB | 2.910 | 2.944 | 2.953 | 3.000 |
Pub | 20 dB | −5 dB | 2.747 | 2.799 | 2.792 | 2.871 |
Pub | 15 dB | 5 dB | 2.767 | 2.788 | 2.779 | 2.807 |
Pub | 15 dB | 0 dB | 2.719 | 2.753 | 2.740 | 2.780 |
Pub | 15 dB | −5 dB | 2.603 | 2.653 | 2.623 | 2.694 |
Pub | 10 dB | 5 dB | 2.684 | 2.707 | 2.694 | 2.720 |
Pub | 10 dB | 0 dB | 2.642 | 2.675 | 2.657 | 2.696 |
Pub | 10 dB | −5 dB | 2.543 | 2.593 | 2.554 | 2.624 |
Road | 20 dB | 5 dB | 3.097 | 3.123 | 3.144 | 3.174 |
Road | 20 dB | 0 dB | 2.969 | 3.010 | 3.029 | 3.074 |
Road | 20 dB | −5 dB | 2.781 | 2.838 | 2.838 | 2.913 |
Road | 15 dB | 5 dB | 2.904 | 2.935 | 2.940 | 2.969 |
Road | 15 dB | 0 dB | 2.824 | 2.867 | 2.871 | 2.913 |
Road | 15 dB | −5 dB | 2.681 | 2.736 | 2.724 | 2.797 |
Road | 10 dB | 5 dB | 2.828 | 2.861 | 2.864 | 2.892 |
Road | 10 dB | 0 dB | 2.758 | 2.802 | 2.804 | 2.846 |
Road | 10 dB | −5 dB | 2.637 | 2.693 | 2.675 | 2.748 |
Callcenter | 20 dB | 5 dB | 3.056 | 3.078 | 3.092 | 3.127 |
Callcenter | 20 dB | 0 dB | 2.944 | 2.983 | 2.994 | 3.043 |
Callcenter | 20 dB | −5 dB | 2.769 | 2.824 | 2.817 | 2.897 |
Callcenter | 15 dB | 5 dB | 2.836 | 2.858 | 2.858 | 2.889 |
Callcenter | 15 dB | 0 dB | 2.776 | 2.810 | 2.806 | 2.851 |
Callcenter | 15 dB | −5 dB | 2.641 | 2.693 | 2.674 | 2.747 |
Callcenter | 10 dB | 5 dB | 2.750 | 2.774 | 2.769 | 2.800 |
Callcenter | 10 dB | 0 dB | 2.702 | 2.736 | 2.728 | 2.772 |
Callcenter | 10 dB | −5 dB | 2.589 | 2.638 | 2.618 | 2.687 |
Average | – | – | 2.806 | 2.844 | 2.843 | 2.892 |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Song, H.; Shin, J.W. Residual Echo Suppression Considering Harmonic Distortion and Temporal Correlation. Appl. Sci. 2020, 10, 5291. https://doi.org/10.3390/app10155291