Ship-Radiated Noise Separation in Underwater Acoustic Environments Using a Deep Time-Domain Network
Abstract
1. Introduction
2. Deep Time-Domain Ship-Radiated Noise Separation Network
2.1. Characteristics of Ship-Radiated Noise
2.2. Deep Time-Domain Ship-Radiated Noise Separation Network
2.3. Encoder Layer and Decoder Layer
2.4. Separation Layer Containing Parallel Dilated Convolution and Group Convolution
2.5. Training Objective
3. Evaluation and Validation
3.1. Dataset and Parameter Settings
3.2. Evaluation Metrics and Comparison Models
3.3. Validation and Performance Analysis
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Parameter Description | Value |
---|---|
Number of convolutional kernels in the encoder layer | 512 |
Convolutional kernel size in the encoder layer | 16 |
Number of convolutional kernels in the separation layer | 128 |
Number of convolutional kernels in the separation layer (generating masks) | 512 |
Number of convolutional blocks in the separation layer | 3 |
Number of convolutional units in a convolutional block | 8 |
Number of input convolutional kernels in a convolutional unit | 512 |
Number of groups in each dilated convolution of the convolutional unit | 8 |
Convolution kernel size for group convolution in a convolutional unit (dilated convolution) | 3 |
Number of residual-connection convolutional kernels in a convolutional unit | 128 |
Number of skip-connection convolutional kernels in a convolutional unit | 128 |
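To make the listed hyperparameters concrete, the following PyTorch sketch assembles one convolutional unit with the channel sizes from the table (128-channel bottleneck features expanded to 512 channels, a dilated group convolution with 8 groups and kernel size 3, and 128-channel residual and skip outputs). This is a minimal, generic Conv-TasNet-style sketch under our own naming; the paper's parallel-dilation arrangement may differ, and the dilation schedule shown is an assumption.

```python
import torch
import torch.nn as nn

class ConvUnit(nn.Module):
    """Illustrative convolutional unit of the separation layer (not the authors' code).

    Channel sizes follow the table above: 128-channel bottleneck features are
    expanded to 512 channels, filtered by a dilated group convolution
    (8 groups, kernel size 3), and projected to 128-channel residual and skip outputs.
    """

    def __init__(self, bottleneck=128, hidden=512, kernel=3, groups=8, dilation=1):
        super().__init__()
        self.expand = nn.Conv1d(bottleneck, hidden, kernel_size=1)
        self.act1 = nn.PReLU()
        self.norm1 = nn.GroupNorm(1, hidden)
        # Dilated group convolution; padding keeps the sequence length unchanged.
        pad = (kernel - 1) * dilation // 2
        self.dconv = nn.Conv1d(hidden, hidden, kernel_size=kernel,
                               groups=groups, dilation=dilation, padding=pad)
        self.act2 = nn.PReLU()
        self.norm2 = nn.GroupNorm(1, hidden)
        self.res_out = nn.Conv1d(hidden, bottleneck, kernel_size=1)   # residual path
        self.skip_out = nn.Conv1d(hidden, bottleneck, kernel_size=1)  # skip path

    def forward(self, x):
        y = self.norm1(self.act1(self.expand(x)))
        y = self.norm2(self.act2(self.dconv(y)))
        return x + self.res_out(y), self.skip_out(y)

# Illustrative stacking: 3 blocks of 8 units, here with exponentially growing
# dilation inside each block (an assumed schedule, as in Conv-TasNet).
units = [ConvUnit(dilation=2 ** i) for _ in range(3) for i in range(8)]
```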
Methods | SNR (dB) | SegSNR (dB) | SNRi (dB) | SI-SNRi (dB) |
---|---|---|---|---|
Conv-TasNet | 16.6928 | 17.4112 | 16.9565 | 1.9447 |
Using parallel dilated convolution | 16.8952 | 17.6931 | 17.2388 | 2.6442 |
Using group convolution | 16.7148 | 17.5242 | 16.9841 | 1.9843 |
Proposed method | 16.8609 | 17.7508 | 17.3023 | 2.5261 |
Methods | SNR (dB) | SegSNR (dB) | SNRi (dB) | SI-SNRi (dB) |
---|---|---|---|---|
Res-UNet | 15.3931 | 16.3755 | 15.5706 | 0.0788 |
UNet | 15.6725 | 16.6569 | 15.8283 | 0.1509 |
Proposed method | 16.8609 | 17.7508 | 17.3023 | 2.5261 |
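SNRi and SI-SNRi in the tables above denote the improvement in (scale-invariant) signal-to-noise ratio of the separated signal over the unprocessed mixture, both measured against the clean reference. A minimal NumPy sketch of the standard SI-SNR and SI-SNRi definitions follows; the helper names are our own and this is not the authors' evaluation code.

```python
import numpy as np

def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between an estimate and a clean reference."""
    reference = reference - reference.mean()
    estimate = estimate - estimate.mean()
    # Project the estimate onto the reference to obtain the scaled target.
    s_target = (np.dot(estimate, reference) / (np.dot(reference, reference) + eps)) * reference
    e_noise = estimate - s_target
    return 10.0 * np.log10((np.sum(s_target ** 2) + eps) / (np.sum(e_noise ** 2) + eps))

def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: gain of the separated estimate over the raw mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)
```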
Methods | Model Size (M) | GFLOPs |
---|---|---|
Res-UNet | 103.0 | 28.5 |
UNet | 33.4 | 33.7 |
Proposed method | 3.67 | 19.8 |
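Model size here is reported in millions of trainable parameters and computational cost in giga floating-point operations per forward pass. As a small illustration of how the parameter count in "M" is typically obtained for a PyTorch model (FLOPs normally require a separate profiling tool and are not shown), one might use a helper such as the following, which is our own and not taken from the paper:

```python
def count_params_in_millions(model):
    """Trainable parameter count of a PyTorch model, in millions."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6
```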
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).