LD-Det: Lightweight Ship Target Detection Method in SAR Images via Dual Domain Feature Fusion
Abstract
1. Introduction
- (1) We compress the image before the convolutional layer. This reduces the computational cost of that layer, which speeds up the network and lowers the number of parameters. The compression uses a two-dimensional wavelet transform, which reduces the image resolution while retaining the edge and texture features of the image.
- (2) In the feature extraction network, we add wavelet transform processing to complement the spatial-domain features extracted by the convolutional layers. After obtaining the frequency-domain features of the input image, we feed them back into the network and concatenate them with the processed feature maps along the channel dimension. This improves both spatial- and frequency-domain feature extraction in the network.
- (3) To balance the weights of channel information and spatial information, we add a lightweight triple attention module to the detection head. It matches the horizontal and vertical high-frequency features extracted by the wavelet transform and further strengthens the network’s learning of multi-domain features.
- (4) To improve the localization accuracy and robustness of bounding box regression, we introduce a hybrid loss function that combines Complete Intersection over Union (CIoU) and focal loss. CIoU improves the alignment between predicted and ground-truth boxes by accounting for their overlap, center distance, and aspect ratio consistency, while focal loss dynamically adjusts the loss weight based on IoU values to emphasize hard-to-align samples. Together, the two terms improve detection performance, particularly for small or occluded objects, while maintaining model stability and convergence efficiency.
2. Methodology
2.1. Network Architecture
2.2. Image Compression Based on Wavelet Transform
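As a rough illustration of the compression step outlined in the Introduction, the sketch below applies a one-level 2D wavelet decomposition to a small image. It assumes the Haar wavelet and plain nested lists; the wavelet family, the function name `haar_dwt2`, and the sign convention of the detail sub-bands are illustrative choices, not details confirmed by this section.

```python
def haar_dwt2(img):
    """One-level 2D Haar transform of an image with even height and width.

    Returns four sub-bands (approx, horiz, vert, diag), each of size H/2 x W/2.
    Assumption: orthonormal Haar filters; the wavelet used in LD-Det may differ.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    approx, horiz, vert, diag = [], [], [], []
    for i in range(0, h, 2):
        row_a, row_h, row_v, row_d = [], [], [], []
        for j in range(0, w, 2):
            # Pixels of the current 2 x 2 block: a b / c d
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            row_a.append((a + b + c + d) / 2)  # low-pass in both directions
            row_h.append((a + b - c - d) / 2)  # difference between the two rows
            row_v.append((a - b + c - d) / 2)  # difference between the two columns
            row_d.append((a - b - c + d) / 2)  # diagonal difference
        approx.append(row_a)
        horiz.append(row_h)
        vert.append(row_v)
        diag.append(row_d)
    return approx, horiz, vert, diag


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
approx, horiz, vert, diag = haar_dwt2(image)
print(len(approx), len(approx[0]))  # 2 2: resolution is halved in each dimension
```

The approximation sub-band has half the height and width of the input, which is what reduces the cost of the following convolution, while the three detail sub-bands keep the edge and texture information.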
2.3. Dual Domain Feature Extraction
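The channel-wise connection described in the Introduction amounts to stacking the wavelet sub-bands next to the convolutional feature maps once their spatial sizes match. A minimal sketch follows, assuming each feature map is an H x W nested list and that the sub-bands have already been computed. The function `concat_channels` and the toy inputs are hypothetical, and the actual fusion point in LD-Det is not specified in this section.

```python
def concat_channels(conv_feats, wavelet_bands):
    """Stack two lists of H x W maps along the channel axis.

    Both inputs are lists of channels; every channel must share one H x W size.
    """
    fused = list(conv_feats) + list(wavelet_bands)
    if not fused:
        raise ValueError("no feature maps given")
    size = (len(fused[0]), len(fused[0][0]))
    for idx, fmap in enumerate(fused):
        if len(fmap) != size[0] or any(len(row) != size[1] for row in fmap):
            raise ValueError(f"channel {idx} does not match size {size}")
    return fused


# Two spatial-domain channels and one frequency-domain channel, all 2 x 2 (toy values).
conv_feats = [[[0.1, 0.2], [0.3, 0.4]],
              [[0.5, 0.6], [0.7, 0.8]]]
wavelet_bands = [[[-4.0, -4.0], [-4.0, -4.0]]]
fused = concat_channels(conv_feats, wavelet_bands)
print(len(fused))  # 3 channels after fusion
```

The size check reflects the one hard constraint of this kind of fusion: sub-bands produced by a one-level transform are half the input resolution, so they can only be concatenated with feature maps that have been downsampled by the same factor.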
2.4. Triple Attention v2 Module
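The original convolutional triplet attention module builds each of its three branches on a Z-pool operation, which stacks the maximum and the mean taken along one axis of a C x H x W tensor. The sketch below shows only that pooling step for the three axes, using nested lists. It is not the v2 module itself, whose convolution layers and kernel sizes are not given in this section, and `z_pool` is an illustrative name.

```python
def z_pool(x, axis):
    """Pool a C x H x W nested list along `axis` (0, 1 or 2).

    Returns [max_map, mean_map], each indexed by the two remaining axes.
    """
    dims = (len(x), len(x[0]), len(x[0][0]))
    keep = [a for a in range(3) if a != axis]
    max_map, mean_map = [], []
    for p in range(dims[keep[0]]):
        row_max, row_mean = [], []
        for q in range(dims[keep[1]]):
            values = []
            for r in range(dims[axis]):
                idx = [0, 0, 0]
                idx[keep[0]], idx[keep[1]], idx[axis] = p, q, r
                values.append(x[idx[0]][idx[1]][idx[2]])
            row_max.append(max(values))
            row_mean.append(sum(values) / len(values))
        max_map.append(row_max)
        mean_map.append(row_mean)
    return [max_map, mean_map]


# Toy tensor with C=2, H=2, W=3.
x = [[[1, 2, 3], [4, 5, 6]],
     [[7, 8, 9], [10, 11, 12]]]
spatial = z_pool(x, axis=0)  # pooled over channels: H x W maps
c_w = z_pool(x, axis=1)      # pooled over height: C x W maps
c_h = z_pool(x, axis=2)      # pooled over width: C x H maps
print(spatial[0])  # [[7, 8, 9], [10, 11, 12]]
```

In the original design, each pooled pair is passed through a convolution and a sigmoid to produce attention weights, and the three branch outputs are averaged; those steps are omitted here.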
2.5. Loss Function
2.5.1. CIoU
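The three terms named in the Introduction (overlap, center distance, and aspect ratio consistency) correspond to the standard CIoU loss, 1 - IoU + ρ²/c² + αv. A minimal sketch for a single pair of axis-aligned boxes is given below, assuming the (x1, y1, x2, y2) corner format; the function name and the `eps` stabilizer are illustrative.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """CIoU loss between a predicted and a ground-truth box, both (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Overlap: intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    iou = inter / (w * h + gw * gh - inter + eps)

    # Squared center distance, normalized by the squared diagonal
    # of the smallest box enclosing both boxes.
    rho2 = ((x1 + x2 - gx1 - gx2) / 2) ** 2 + ((y1 + y2 - gy1 - gy2) / 2) ** 2
    c2 = (max(x2, gx2) - min(x1, gx1)) ** 2 + (max(y2, gy2) - min(y1, gy1)) ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


print(round(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)), 6))  # 0.0 for identical boxes
```

Unlike plain IoU loss, the center-distance term still provides a useful signal when the boxes do not overlap at all: for two disjoint unit squares the loss exceeds 1 and shrinks as the centers approach.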
2.5.2. Focal Loss
2.5.3. Hybrid Loss Function
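One way to read the combination described in the Introduction is as a CIoU loss scaled by an IoU-dependent focal weight that grows for poorly aligned boxes. The sketch below uses the weight (1 - IoU)^γ; this form, the name `hybrid_loss`, and the default `gamma` are assumptions made for illustration, since the exact formulation is not given in this section.

```python
def hybrid_loss(iou, ciou, gamma=0.5):
    """Scale a CIoU loss value by a focal-style weight derived from IoU.

    Assumed form: (1 - iou) ** gamma * ciou, so low-IoU (hard-to-align)
    samples contribute more. `gamma` is a hypothetical focusing parameter.
    """
    if not 0.0 <= iou <= 1.0:
        raise ValueError("iou must lie in [0, 1]")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    return (1.0 - iou) ** gamma * ciou


# Same CIoU loss value, different alignment quality.
hard = hybrid_loss(iou=0.2, ciou=1.0)
easy = hybrid_loss(iou=0.8, ciou=1.0)
print(hard > easy)  # True: the poorly aligned sample is weighted more heavily
```

Setting `gamma` to 0 recovers the plain CIoU loss, which makes the focal weighting easy to ablate under this assumed form.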
3. Experiment
4. Conclusions
- (1) The proposed LD-Det model is a lightweight architecture based on partial convolution that integrates multi-domain feature extraction, a compatible triplet attention mechanism, and a hybrid loss function. By fusing spatial- and frequency-domain features, the design improves ship target detection over the YOLOv8n baseline (AP50 of 96.8% versus 95.5%) while maintaining computational efficiency.
- (2) Integrating the wavelet transform into the network enables efficient compression of the input feature maps and extraction of frequency-domain features. Fusing these features with the spatial-domain features improves the model’s ability to capture and use multi-domain information for detection.
- (3) The enhanced triple attention module v2 reduces computational complexity and parameter count by optimizing the number of convolutional layers and the kernel sizes. The module captures channel features and spatial-frequency domain features, so that multi-domain features are used efficiently for robust target detection.
- (4) Hybrid IoU combines CIoU and focal loss. It improves the localization accuracy of bounding boxes, bringing the predicted boxes closer to the ground truth, and also balances the focus on hard-to-detect targets such as small, occluded, and densely packed objects. This reduces both false negatives and false positives, further improving the overall detection accuracy of the model.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Model | P (%) | R (%) | AP50 (%) | FPS | Params (M) |
---|---|---|---|---|---|
SSD 512 | 80.1 | 83.1 | 88.8 | 222.1 | 26.3 |
RetinaNet | 84.1 | 84.8 | 92.9 | 296.1 | 36.5 |
CenterNet | 90.6 | 82.5 | 90.3 | 362.6 | 42.6 |
FCOS | 89.4 | 84.9 | 90.0 | 351.8 | 70.8 |
YOLOv8n | 90.9 | 86.1 | 95.5 | 370.4 | 30.5 |
LD-Det | 94.8 | 92.4 | 96.8 | 312.1 | 24.4 |
Experiment | Ablation | P (%) | R (%) | AP50 (%) | Params (M) | FPS | FLOPs (G) |
---|---|---|---|---|---|---|---|
Ablation-1 | YOLOv8n | 90.9 | 86.1 | 95.5 | 30.5 | 370.4 | 8.4 |
Ablation-2 | +PConv | 90.5 | 81.8 | 93.1 | 22.9 | 500.0 | 6.4 |
Ablation-3 | +WT Extract | 89.1 | 89.9 | 95.9 | 28.4 | 360.3 | 7.8 |
Ablation-4 | +WT Comp | 90.0 | 83.3 | 94.1 | 22.0 | 342.6 | 6.2 |
Ablation-5 | +Tri At | 92.5 | 87.6 | 95.9 | 32.3 | 313.2 | 9.1 |
Ablation-6 | +Hybrid IoU | 95.8 | 93.1 | 98.1 | 30.5 | 332.6 | 10.4 |
Ablation-7 | LD-Det | 94.8 | 92.4 | 96.8 | 24.4 | 312.1 | 8.1 |
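To make the trade-off in the ablation table easier to read, the short script below computes the change from the YOLOv8n baseline (Ablation-1) to the full LD-Det model (Ablation-7). The numbers are copied from the table; the dictionary layout and the helper `relative_change` are illustrative.

```python
# Values copied from the Ablation-1 (YOLOv8n) and Ablation-7 (LD-Det) rows.
baseline = {"P (%)": 90.9, "R (%)": 86.1, "AP50 (%)": 95.5,
            "Params (M)": 30.5, "FPS": 370.4, "FLOPs (G)": 8.4}
ld_det = {"P (%)": 94.8, "R (%)": 92.4, "AP50 (%)": 96.8,
          "Params (M)": 24.4, "FPS": 312.1, "FLOPs (G)": 8.1}


def relative_change(new, old):
    """Percentage change from `old` to `new`."""
    return 100.0 * (new - old) / old


# Cost metrics: report the relative change.
for key in ("Params (M)", "FPS", "FLOPs (G)"):
    print(f"{key}: {relative_change(ld_det[key], baseline[key]):+.1f}%")

# Accuracy metrics are already percentages: report the difference in points.
for key in ("P (%)", "R (%)", "AP50 (%)"):
    print(f"{key}: {ld_det[key] - baseline[key]:+.1f} points")
```

On these figures, LD-Det uses about 20% fewer parameters and slightly fewer FLOPs than the baseline and gains 1.3 points of AP50, while its FPS is roughly 16% lower.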
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yu, H.; Liu, B.; Wang, L.; Li, T. LD-Det: Lightweight Ship Target Detection Method in SAR Images via Dual Domain Feature Fusion. Remote Sens. 2025, 17, 1562. https://doi.org/10.3390/rs17091562