Through-Ice Acoustic Source Tracking Using Vision Transformers with Ordinal Classification
Abstract
1. Introduction
2. Materials and Methods
2.1. Acoustic Post-Processing
2.2. Convolutional Neural Network
2.3. Long Short-Term Memory Neural Network
2.4. Transformers
2.5. Vision Transformers
2.6. Loss Functions
2.6.1. Regression
2.6.2. Categorical Classification
2.6.3. Ordinal Classification
2.7. Experiments
2.7.1. Data Explanation
2.8. Network Explanations
2.9. Training and Hyperparameters
3. Results
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
AVS | Acoustic Vector Sensor |
CNN | Convolutional Neural Network |
DNN | Deep Neural Network |
DOA | Direction Of Arrival |
ITAR | International Traffic in Arms Regulations |
LSTM | Long Short-Term Memory |
MHA | Multi-Head Attention |
ML | Machine Learning |
MSE | Mean-Squared Error |
NLP | Natural Language Processing |
RMSE | Root-Mean-Squared Error |
SNR | Signal-to-Noise Ratio |
STFT | Short-Time Fourier Transform |
ViT | Vision Transformer |
Appendix A
Size | CNN | LSTM | Transformer | ViT |
---|---|---|---|---|
Large | 16 | 5 | 8 | 12 |
Small | 4 | 1 | 1 | 8 |
Size | CNN | LSTM | Transformer | ViT |
---|---|---|---|---|
Large | k | k | k | k |
Small | 892 k | 905 k | 843 k | 928 k |
Loss | CNN Large | CNN Small | LSTM Large | LSTM Small |
---|---|---|---|---|
Regression | | | | |
Categorical | | | | |
Ordinal | | | | |

Loss | Transformer Large | Transformer Small | ViT Large | ViT Small |
---|---|---|---|---|
Regression | | | | |
Categorical | | | | |
Ordinal | | | | |
Loss | CNN Large | CNN Small | LSTM Large | LSTM Small |
---|---|---|---|---|
Regression | 671 s | 654 s | 2358 s | 1931 s |
Categorical | 620 s | 657 s | 2170 s | 1933 s |
Ordinal | 675 s | 656 s | 2150 s | 1930 s |

Loss | Transformer Large | Transformer Small | ViT Large | ViT Small |
---|---|---|---|---|
Regression | 1358 s | 639 s | 1700 s | 654 s |
Categorical | 1188 s | 648 s | 1785 s | 658 s |
Ordinal | 1070 s | 647 s | 1752 s | 660 s |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Whitaker, S.; Barnard, A.; Anderson, G.D.; Havens, T.C. Through-Ice Acoustic Source Tracking Using Vision Transformers with Ordinal Classification. Sensors 2022, 22, 4703. https://doi.org/10.3390/s22134703