Ship Detection in Synthetic Aperture Radar Images Based on BiLevel Spatial Attention and Deep Poly Kernel Network
Abstract
1. Introduction
- We construct DPK-Net in the backbone network. It consists of the OC module and the PK module. The OC module reduces data redundancy and makes information processing more efficient. The PK module extracts dense ship features at several receptive fields and fuses them adaptively along the channel dimension, so that contextual information is collected more efficiently.
- We design the BSAM attention mechanism, which obtains global information through sparsity while preserving ship detail, and we use the P-IoU loss to achieve faster bounding-box regression.
- We conduct extensive experiments on the SSDD and HRSID datasets. The proposed model reaches an mAP of 0.988 on SSDD and 0.902 on HRSID, which demonstrates its effectiveness.
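The idea behind the PK module in the first contribution (parallel kernels with different receptive fields, followed by an adaptive fusion) can be illustrated with a deliberately small 1D toy. This is a sketch under stated assumptions, not the authors' implementation: the box filters stand in for learned convolutions, the softmax weights over branches stand in for the adaptive channel-wise fusion, and the names `conv1d_same`, `softmax`, and `poly_kernel_fuse` are hypothetical.

```python
import math


def conv1d_same(signal, kernel):
    """Zero-padded 1D correlation that keeps the input length (odd kernels)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def poly_kernel_fuse(signal, kernel_sizes=(3, 5, 7), branch_logits=None):
    """Run one branch per kernel size and blend the branches with softmax weights.

    Assumption: a mean filter of size k replaces a learned convolution, and
    `branch_logits` (hypothetical) replaces learned fusion weights.
    """
    branches = [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]
    logits = branch_logits if branch_logits is not None else [0.0] * len(branches)
    weights = softmax(logits)
    return [
        sum(w * branch[i] for w, branch in zip(weights, branches))
        for i in range(len(signal))
    ]


if __name__ == "__main__":
    profile = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
    print(poly_kernel_fuse(profile))
```

Raising the logit of the small-kernel branch keeps narrow peaks sharper, while raising the logit of the large-kernel branch favours wider context; a trained network would learn such weights rather than receive them as arguments.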
2. Related Work
2.1. MultiScale Ship Detection
2.2. Attention Mechanism
2.3. Loss Function
3. Materials and Methods
3.1. The Overview of YOLO-MSD
3.2. Deep Poly Kernel Network (DPK-Net)
3.2.1. Optimized Convolution Module (OC)
3.2.2. Poly Kernel Module (PK)
3.3. BiLevel Spatial Attention Module (BSAM)
3.3.1. BiLevel Routing Attention (BRA)
3.3.2. Spatial Attention Module
3.4. Powerful-IoU Loss Function (P-IoU)
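As a rough illustration of the loss named in this heading, the sketch below follows the Powerful-IoU formulation as commonly stated: the IoU loss plus a penalty 1 − exp(−P²), where P averages the distances between corresponding box edges, normalised by the target width and height. Which exact variant the paper uses is not shown in this section, so the formula is an assumption, and the names `iou` and `piou_loss` are hypothetical. Boxes are `(x1, y1, x2, y2)` tuples.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def piou_loss(pred, target):
    """Assumed P-IoU form: (1 - IoU) + (1 - exp(-P^2)).

    P is the mean edge offset between prediction and target, with horizontal
    offsets scaled by the target width and vertical offsets by its height.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    w_gt = tx2 - tx1
    h_gt = ty2 - ty1
    if w_gt <= 0 or h_gt <= 0:
        raise ValueError("target box must have positive width and height")
    p = (
        abs(px1 - tx1) / w_gt
        + abs(px2 - tx2) / w_gt
        + abs(py1 - ty1) / h_gt
        + abs(py2 - ty2) / h_gt
    ) / 4.0
    return (1.0 - iou(pred, target)) + (1.0 - math.exp(-p * p))


if __name__ == "__main__":
    target = (0.0, 0.0, 10.0, 10.0)
    print(piou_loss((1.0, 0.0, 11.0, 10.0), target))
```

Because the penalty depends only on the target size and the edge offsets, it does not grow when the predicted box is simply enlarged to cover the target, which is the property usually cited as the reason for faster regression.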
4. Experiment and Results
4.1. Experimental Platform
4.2. Datasets
4.2.1. HRSID
4.2.2. SSDD
4.3. Model Evaluation
4.4. Experimental Results
4.4.1. Comparison with Existing Methods
4.4.2. Ablation Experiment
5. Discussion
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Metric | Meaning |
---|---|
AP for IoU = 0.50:0.05:0.95 | |
AP for IoU = 0.50 | |
AP for small targets (area < 32²) | |
AP for medium targets (32² < area < 96²) | |
AP for large targets (area > 96²) | |
Frames per second |
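The metric definitions above follow the COCO convention, which can be made concrete with a short sketch. It assumes the usual COCO thresholds (ten IoU values from 0.50 to 0.95 in steps of 0.05, and area limits of 32² and 96² pixels); how boxes lying exactly on a size boundary are assigned is an assumption, and the function names are hypothetical.

```python
def iou_thresholds():
    """IoU thresholds 0.50, 0.55, ..., 0.95 used for the averaged AP."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def size_bucket(width, height):
    """Classify a box as small, medium, or large by pixel area.

    Assumption: boxes exactly on a boundary fall into the larger bucket.
    """
    area = width * height
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


def averaged_ap(ap_per_threshold):
    """Mean of the per-threshold AP values (one value per IoU threshold)."""
    if not ap_per_threshold:
        raise ValueError("need at least one AP value")
    return sum(ap_per_threshold) / len(ap_per_threshold)


if __name__ == "__main__":
    print(iou_thresholds())
    print(size_bucket(20, 20), size_bucket(50, 50), size_bucket(100, 100))
```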
Method | HRSID P | HRSID R | HRSID mAP | SSDD P | SSDD R | SSDD mAP | FPS | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|---|---|---|
Faster-RCNN | 0.378 | 0.560 | 0.454 | 0.502 | 0.944 | 0.851 | 13 | 41.3 | 251.4 |
Cascade R-CNN | 0.739 | 0.634 | 0.651 | 0.908 | 0.941 | 0.905 | 25 | 68.93 | 119.0 |
SSD | 0.928 | 0.438 | 0.681 | 0.936 | 0.552 | 0.899 | 92 | 23.7 | 30.4 |
EfficientDet | 0.969 | 0.331 | 0.484 | 0.959 | 0.533 | 0.713 | 29 | 3.8 | 2.3 |
YOLOv5_n | 0.890 | 0.717 | 0.776 | 0.925 | 0.833 | 0.897 | 95 | 1.9 | 4.5 |
YOLOv7 | 0.847 | 0.724 | 0.819 | 0.928 | 0.782 | 0.902 | 56 | 37.1 | 105.1 |
YOLOv7-tiny | 0.864 | 0.747 | 0.843 | 0.889 | 0.876 | 0.926 | 96 | 6.008 | 13.0 |
RetinaNet | 0.980 | 0.395 | 0.534 | 0.976 | 0.623 | 0.698 | 34 | 36.3 | 10.1 |
CenterNet | 0.948 | 0.696 | 0.788 | 0.948 | 0.604 | 0.785 | 48 | 32.6 | 6.7 |
CSD-YOLO | 0.932 | 0.804 | 0.861 | 0.959 | 0.959 | 0.986 | - | - | - |
Pow-FAN | 0.885 | 0.837 | 0.897 | 0.946 | 0.965 | 0.963 | 31 | 136 | - |
BL-Net | 0.915 | 0.897 | 0.886 | 0.912 | 0.961 | 0.952 | 5 | 47.8 | 417.8 |
FEPS-Net | - | - | 0.897 | - | - | 0.960 | 32 | 37.31 | - |
PPA-Net | 0.903 | 0.882 | 0.893 | 0.952 | 0.912 | 0.952 | - | - | - |
CMFT | 0.813 | 0.911 | 0.896 | 0.924 | 0.981 | 0.973 | - | - | - |
MLSDNet | - | - | 0.897 | - | - | 0.974 | 7 | 5.68 | 18.4 |
FBR-Net | - | - | 0.896 | 0.928 | 0.940 | 0.941 | 25 | 32.5 | 141.3 |
MANet | 0.871 | 0.782 | 0.863 | 0.953 | 0.949 | 0.957 | - | - | - |
Quad-FPN | 0.880 | 0.873 | 0.861 | 0.895 | 0.958 | 0.953 | 11 | - | - |
Ours | 0.902 | 0.788 | 0.902 | 0.968 | 0.902 | 0.988 | 45 | 12.3 | 33.8 |
Experiment | OC Module | PK Module | BSAM | P-IoU | Dataset | P | R | mAP 50 | mAP 50-95 |
---|---|---|---|---|---|---|---|---|---|
1 | - | - | - | - | SSDD | 0.889 | 0.876 | 0.926 | 0.578 |
2 | √ | - | - | - | SSDD | 0.934 | 0.880 | 0.948 | 0.606 |
3 | - | √ | - | - | SSDD | 0.908 | 0.894 | 0.945 | 0.575 |
4 | - | - | √ | - | SSDD | 0.895 | 0.868 | 0.931 | 0.552 |
5 | - | - | - | √ | SSDD | 0.941 | 0.909 | 0.966 | 0.651 |
6 | √ | √ | - | - | SSDD | 0.935 | 0.921 | 0.967 | 0.658 |
7 | - | - | √ | √ | SSDD | 0.927 | 0.951 | 0.970 | 0.657 |
8 | √ | √ | √ | √ | SSDD | 0.968 | 0.902 | 0.988 | 0.661 |
9 | - | - | - | - | HRSID | 0.864 | 0.747 | 0.843 | 0.555 |
10 | √ | - | - | - | HRSID | 0.885 | 0.823 | 0.852 | 0.586 |
11 | - | √ | - | - | HRSID | 0.865 | 0.754 | 0.878 | 0.555 |
12 | - | - | √ | - | HRSID | 0.874 | 0.772 | 0.856 | 0.560 |
13 | - | - | - | √ | HRSID | 0.871 | 0.766 | 0.854 | 0.559 |
14 | √ | √ | - | - | HRSID | 0.895 | 0.785 | 0.886 | 0.568 |
15 | - | - | √ | √ | HRSID | 0.884 | 0.788 | 0.877 | 0.583 |
16 | √ | √ | √ | √ | HRSID | 0.902 | 0.788 | 0.902 | 0.592 |
Experiment | OC Module | PK Module | BSAM | P-IoU | mAP | | | | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|---|---|---|---|
1 | - | - | - | - | 0.843 | 0.462 | 0.748 | 0.215 | 6.008 | 13.0 |
2 | √ | - | - | - | 0.852 | 0.487 | 0.758 | 0.279 | 6.007 | 13.0 |
3 | - | √ | - | - | 0.878 | 0.496 | 0.784 | 0.629 | 11.233 | 18.8 |
4 | - | - | √ | - | 0.856 | 0.460 | 0.753 | 0.259 | 8.599 | 30.6 |
5 | - | - | - | √ | 0.854 | 0.464 | 0.746 | 0.280 | 6.008 | 13.0 |
6 | √ | √ | - | - | 0.886 | 0.518 | 0.785 | 0.433 | 9.689 | 16.0 |
7 | - | - | √ | √ | 0.877 | 0.494 | 0.758 | 0.285 | 10.905 | 31.1 |
8 | √ | √ | √ | √ | 0.902 | 0.534 | 0.784 | 0.426 | 12.281 | 33.6 |
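The accuracy and cost trade-off in this table can be read off with a few lines of arithmetic. The sketch below copies the mAP, Params (M), and FLOPs (G) columns for the eight experiments (the mAP values match the HRSID rows of the preceding ablation table) and compares each configuration with the baseline in experiment 1. The names `ABLATION` and `compare_to_baseline` are hypothetical.

```python
# experiment id -> (mAP, Params in millions, FLOPs in G), copied from the table
ABLATION = {
    1: (0.843, 6.008, 13.0),
    2: (0.852, 6.007, 13.0),
    3: (0.878, 11.233, 18.8),
    4: (0.856, 8.599, 30.6),
    5: (0.854, 6.008, 13.0),
    6: (0.886, 9.689, 16.0),
    7: (0.877, 10.905, 31.1),
    8: (0.902, 12.281, 33.6),
}


def compare_to_baseline(rows, baseline_id=1):
    """Return mAP gain and cost ratios of every experiment versus the baseline."""
    base_map, base_params, base_flops = rows[baseline_id]
    report = {}
    for exp_id, (map_value, params, flops) in rows.items():
        report[exp_id] = {
            "map_gain": round(map_value - base_map, 3),
            "params_ratio": round(params / base_params, 2),
            "flops_ratio": round(flops / base_flops, 2),
        }
    return report


if __name__ == "__main__":
    for exp_id, stats in compare_to_baseline(ABLATION).items():
        print(exp_id, stats)
```

For the full model (experiment 8) this gives an mAP gain of 0.059 at roughly 2.04 times the parameters and 2.58 times the FLOPs of the baseline.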
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Tian, S.; Jin, G.; Gao, J.; Tan, L.; Xue, Y.; Li, Y.; Liu, Y. Ship Detection in Synthetic Aperture Radar Images Based on BiLevel Spatial Attention and Deep Poly Kernel Network. J. Mar. Sci. Eng. 2024, 12, 1379. https://doi.org/10.3390/jmse12081379