An End-to-End Underwater Acoustic Target Recognition Model Based on One-Dimensional Convolution and Transformer
Abstract
1. Introduction
- The 1DCTN model processed time-domain signals directly, which streamlined recognition by removing the need for hand-crafted feature engineering. Because it operated on raw waveforms, it avoided the inherent limitations of time–frequency representations and preserved the full information contained in the signal.
- The 1DCTN model combined the local feature extraction of 1D CNNs with the long-range dependency modeling of the Transformer, which addressed the difficulty LSTMs have with long-term dependencies and improved recognition accuracy.
- The 1DCTN model was lightweight: it achieved high recognition accuracy at low computational complexity, which made it suitable for resource-constrained, real-world deployments.
- Validation on the public ShipsEar dataset demonstrated these advantages in comparisons against time–frequency features, LSTM network architectures, and other lightweight models.
2. Materials and Methods
2.1. One-Dimensional Convolution
2.2. Multi-Head Self-Attention Mechanism
2.3. 1DCTN Model Architecture
3. Dataset and Preprocessing
3.1. Dataset Description
3.2. Spectral Analysis of the Dataset
3.3. Data Processing and Dataset Partitioning
4. Experiments and Results
4.1. Experimental Setup
4.2. Comparative Evaluation
4.2.1. Performance Comparison with Time–Frequency Features
4.2.2. Performance Comparison with LSTM Network Architectures
4.2.3. Comparison with Other Lightweight Models
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

Layer | Output Shape | Configuration |
---|---|---|
Input | (B, 1, L) | - |
Conv1D + MaxPool | (B, 32, L) | 32 filters, 5 × 1 kernel, pad 2 |
Conv1D + MaxPool | (B, 64, L/2) | 64 filters, 5 × 1 kernel, pad 2 |
Conv1D + MaxPool | (B, 128, L/4) | 128 filters, 5 × 1 kernel, pad 2 |
Reshape | (L/64, B, 128) | - |
Transformer Encoder | (L/64, B, 128) | 3 layers, 4 heads, dropout 0.1, FFN dim 128 |
Global Avg Pooling | (B, 128) | - |
MLP | (B, M) | - |
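
The shape bookkeeping in the architecture table can be traced with a few lines of arithmetic. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the convolution stride of 1 is assumed, and the pooling factor of 4 per stage is an assumption chosen only because it reproduces the L/64 sequence length listed in the Reshape row (the table does not state the pooling size). The number of classes M is set to 5 to match the five categories A to E.

```python
def conv1d_out_len(length, kernel=5, padding=2, stride=1):
    """Output length of a 1D convolution with dilation 1 (standard formula)."""
    return (length + 2 * padding - kernel) // stride + 1


def maxpool1d_out_len(length, pool):
    """Output length of non-overlapping max pooling (kernel == stride == pool)."""
    return length // pool


def trace_shapes(batch, length, channels=(32, 64, 128), pool=4, num_classes=5):
    """Return (layer name, output shape) pairs for a 1DCTN-style stack.

    `pool` is an assumed value; three stages with pool=4 give L/64.
    """
    shapes = [("Input", (batch, 1, length))]
    for out_channels in channels:
        # Kernel 5 with padding 2 keeps the length; pooling shrinks it.
        length = maxpool1d_out_len(conv1d_out_len(length), pool)
        shapes.append(("Conv1D + MaxPool", (batch, out_channels, length)))
    d_model = channels[-1]
    # The encoder consumes sequence-first tensors: (sequence, batch, features).
    shapes.append(("Reshape", (length, batch, d_model)))
    shapes.append(("Transformer Encoder", (length, batch, d_model)))
    # Averaging over the sequence axis leaves one feature vector per sample.
    shapes.append(("Global Avg Pooling", (batch, d_model)))
    shapes.append(("MLP", (batch, num_classes)))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes(batch=8, length=6400):
        print(f"{name:22s} {shape}")
```

With a kernel of 5 and padding of 2, each convolution preserves the sequence length, so all downsampling comes from the pooling layers. Changing `pool` shows how the sequence length seen by the Transformer encoder, and therefore the cost of self-attention, depends on that choice.
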

Category | Ship Types |
---|---|
A | fishing boats, trawlers, mussel boats, tugboats, dredgers |
B | motorboats, pilot boats, sailboats |
C | passenger ferries |
D | ocean liners, ro-ro vessels |
E | background noise recordings |

Category | Recording Serial Numbers | Number of Samples |
---|---|---|
A | 15, 28, 46–49, 66, 73–76, 80, 93–96 | 1808 |
B | 26, 27, 29, 30, 50–52, 56, 57, 68, 70, 72, 77, 79 | 1304 |
C | 6, 10, 40, 42, 43, 52–54, 59–65, 67 | 2632 |
D | 18–20, 22, 24, 25, 58, 69, 71, 78 | 2282 |
E | 81–92 | 1140 |
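
The per-category sample counts above imply a moderately imbalanced dataset. As a small worked example, the sketch below computes the total and each category's share directly from the table. The variable names are illustrative, and nothing here reflects how the authors partitioned the data.

```python
# Sample counts per category, copied from the table above.
samples = {"A": 1808, "B": 1304, "C": 2632, "D": 2282, "E": 1140}

total = sum(samples.values())  # 9166 samples in total

# Fraction of the dataset contributed by each category.
shares = {category: count / total for category, count in samples.items()}

# Ratio between the largest and smallest category, a simple imbalance indicator.
imbalance_ratio = max(samples.values()) / min(samples.values())

if __name__ == "__main__":
    for category, share in shares.items():
        print(f"{category}: {samples[category]:5d} samples ({share:.1%})")
    print(f"total: {total}, largest/smallest: {imbalance_ratio:.2f}")
```
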

Feature | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
---|---|---|---|---|
Mel | 90.24 | 91.03 | 90.97 | 90.99 |
MFCCs | 92.71 | 92.35 | 92.75 | 92.54 |
Time domain | 96.84 | 96.85 | 96.84 | 96.84 |
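
The four metrics in the table above can all be derived from a confusion matrix. The sketch below shows one common way to do so, assuming macro averaging over classes; the averaging scheme is an assumption, since the table does not say which one was used. The function name and the toy matrix in the usage section are hypothetical and are not results from the paper.

```python
def macro_metrics(confusion):
    """Accuracy and macro-averaged precision, recall, and F1, in percent.

    confusion[i][j] is the number of samples of true class i
    that were predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": 100.0 * correct / total,
        "precision": 100.0 * sum(precisions) / n,
        "recall": 100.0 * sum(recalls) / n,
        "f1": 100.0 * sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy two-class confusion matrix, for illustration only.
    toy = [[8, 2],
           [1, 9]]
    for name, value in macro_metrics(toy).items():
        print(f"{name}: {value:.2f}")
```

Under macro averaging every class contributes equally regardless of its size, which is why accuracy and macro recall can differ when the classes are imbalanced.
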
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, K.; Wang, B.; Fang, Z.; Cai, B. An End-to-End Underwater Acoustic Target Recognition Model Based on One-Dimensional Convolution and Transformer. J. Mar. Sci. Eng. 2024, 12, 1793. https://doi.org/10.3390/jmse12101793