Ships’ Small Target Detection Based on the CBAM-YOLOX Algorithm
Abstract
1. Introduction
2. Small Target Detection of Ships Based on the CBAM-YOLOX Network
2.1. CBAM-YOLOX Network Structure
2.2. Loss Function
2.3. Soft Non-Maximum Suppression Algorithm
3. A Ship’s Small Target Detection Results and Analysis
3.1. Experimental Design
3.1.1. Dataset
3.1.2. Experimental Environment and Parameter Settings
3.1.3. Evaluation Index
3.2. Ablation Experiment
3.2.1. Comparison of Network Training Results
3.2.2. Comparison of the Effect of CBAM Modules in Different Locations
3.2.3. Comparison Experiments of Different Versions of the YOLOX Network
3.2.4. F1 Curve Comparison
3.3. Comparison of Test Results
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Model | Depth_Multiple | Width_Multiple |
---|---|---|
YOLOX_s | 0.33 | 0.50 |
YOLOX_m | 0.67 | 0.75 |
YOLOX_l | 1.00 | 1.00 |
YOLOX_x | 1.33 | 1.25 |
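The multipliers in the table above can be read as scale factors on a common backbone template. The following is a minimal sketch of that scaling, assuming a base of 64 stem channels and 3 bottleneck blocks per stage, as in common open-source CSPDarknet-style backbones. These base values and the helper name `scale_backbone` are assumptions for illustration and are not stated in the source.

```python
# Hypothetical illustration of how depth/width multipliers scale a backbone.
# ASSUMPTION: base of 64 stem channels and 3 blocks per stage (not given in the source).
BASE_CHANNELS = 64
BASE_DEPTH = 3

# (Depth_Multiple, Width_Multiple) copied from the table.
MODELS = {
    "YOLOX_s": (0.33, 0.50),
    "YOLOX_m": (0.67, 0.75),
    "YOLOX_l": (1.00, 1.00),
    "YOLOX_x": (1.33, 1.25),
}


def scale_backbone(depth_multiple, width_multiple):
    """Return (blocks per stage, stem channels) for the given multipliers."""
    channels = int(width_multiple * BASE_CHANNELS)
    depth = max(round(depth_multiple * BASE_DEPTH), 1)  # at least one block
    return depth, channels


for name, (depth_mul, width_mul) in MODELS.items():
    depth, channels = scale_backbone(depth_mul, width_mul)
    print(f"{name}: {depth} block(s) per stage, {channels} stem channels")
```

Under these assumed base values, the smallest variant uses a third of the blocks and half of the channels of YOLOX_l, which is consistent with the large differences in weight size and inference time reported for the four variants.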
Model | Head | Dark5 | Dark4 | Dark3 | mAP@0.5/% | Time/ms |
---|---|---|---|---|---|---|
YOLOX_s | × | × | × | × | 94.66 | 14.26 |
CBAM-YOLOX | √ | × | × | × | 98.59 | 16.10 |
CBAM-YOLOX | × | √ | × | × | 98.61 | 15.77 |
CBAM-YOLOX | × | × | √ | × | 98.63 | 15.73 |
CBAM-YOLOX | × | × | × | √ | 98.67 | 15.72 |
CBAM-YOLOX | × | √ | √ | √ | 98.77 | 16.38 |
CBAM-YOLOX | √ | √ | √ | √ | 98.05 | 18.35 |
Model | R/% | P/% | mAP@0.5/% | Time/ms | Weights/KB |
---|---|---|---|---|---|
YOLOX_s | 88.56 | 89.59 | 94.66 | 14.26 | 35,110 |
YOLOX_m | 93.70 | 94.77 | 97.73 | 17.95 | 99,065 |
YOLOX_l | 95.87 | 94.49 | 97.78 | 23.76 | 212,966 |
YOLOX_x | 94.48 | 95.75 | 97.44 | 34.84 | 387,311 |
YOLOv5s | 92.51 | 93.38 | 94.26 | 15.88 | 7,372.8 |
YOLOv4 | 90.86 | 91.73 | 92.80 | 22.73 | 251,801.6 |
CBAM-YOLOX | 97.37 | 96.52 | 98.67 | 15.72 | 35,119 |
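As a rough cross-check of the recall and precision columns, the standard F1 score, F1 = 2PR / (P + R), can be computed for each row. The sketch below does this with the values copied from the table. The resulting F1 figures are derived here for illustration and are not values reported in the source, whose F1 comparison is given as curves; the function name `f1_score` is a local helper, not a library call.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same unit, here %)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (R/%, P/%) copied from the table; F1 is derived, not reported by the source.
RESULTS = {
    "YOLOX_s": (88.56, 89.59),
    "YOLOX_m": (93.70, 94.77),
    "YOLOX_l": (95.87, 94.49),
    "YOLOX_x": (94.48, 95.75),
    "YOLOv5s": (92.51, 93.38),
    "YOLOv4": (90.86, 91.73),
    "CBAM-YOLOX": (97.37, 96.52),
}

for model, (recall, precision) in RESULTS.items():
    print(f"{model}: F1 = {f1_score(precision, recall):.2f}%")

# Model with the highest derived F1 among the rows above.
best = max(RESULTS, key=lambda m: f1_score(RESULTS[m][1], RESULTS[m][0]))
print("Highest derived F1:", best)
```

Because F1 is a harmonic mean, it rewards models whose precision and recall are both high, so a row with the highest recall and precision in the table will also rank first on this derived metric.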
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, Y.; Li, J.; Chen, Z.; Wang, C. Ships’ Small Target Detection Based on the CBAM-YOLOX Algorithm. J. Mar. Sci. Eng. 2022, 10, 2013. https://doi.org/10.3390/jmse10122013