A Remote Sensing Image Target Detection Algorithm Based on Improved YOLOv8
Abstract
1. Introduction
- (1) An extra detection layer is added to the original multi-scale feature fusion network to generate an additional, larger feature map, improving the network’s ability to learn feature information about small targets.
- (2) A C2f-E structure based on the Efficient Multi-Scale Attention (EMA) module [18] is proposed, which enhances the network’s ability to detect targets of different sizes through cross-spatial learning.
- (3) Wise-IoU [19] replaces CIoU in the original network; its wise gradient gain assignment strategy improves the generalization ability of the model while also improving the overall performance of the detector.
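
To make contribution (1) concrete, the sketch below computes the detection grid size at each output stride. It assumes a 640 × 640 input and the usual YOLOv8 head strides of 8, 16 and 32, with the extra small-target layer placed at stride 4. The input size, the stride of the added layer, and the helper name `grid_sizes` are illustrative assumptions, not values stated in this section.

```python
def grid_sizes(input_size: int, strides: list[int]) -> dict[int, tuple[int, int]]:
    """Map each stride to the (height, width) of its square feature map."""
    sizes = {}
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError(f"input size {input_size} is not divisible by stride {stride}")
        side = input_size // stride
        sizes[stride] = (side, side)
    return sizes


# Assumed baseline head strides (8, 16, 32) and an assumed extra stride-4 layer.
baseline = grid_sizes(640, [8, 16, 32])
extended = grid_sizes(640, [4, 8, 16, 32])

for stride, (height, width) in extended.items():
    cells = height * width
    print(f"stride {stride:>2}: {height}x{width} grid, {cells} cells")
```

Under these assumptions the added layer has a 160 × 160 grid, so each cell covers a 4 × 4 pixel patch instead of 8 × 8, which is one way to see why a larger feature map helps with small targets.
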
2. Introduction of YOLOv8 Detection Network
3. Improvement of YOLOv8
3.1. Additional Detection Layer for Small Targets
3.2. C2f-E Module
3.3. Improvement of the Loss Function
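
As a rough illustration of the loss that replaces CIoU, the sketch below follows the Wise-IoU formulation as described in [19]: v1 scales the IoU loss by a distance-based factor, and v3 (the variant listed in the ablation table) multiplies it by a non-monotonic focusing coefficient driven by the outlier degree β. This is a plain-float sketch, not the training code used in the experiments. The box format, the function names, and the hyperparameter values α = 1.9 and δ = 3 are assumptions, and gradient detachment and the running mean of the IoU loss are not modelled.

```python
import math

Box = tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def wiou_v1(pred: Box, gt: Box) -> float:
    """IoU loss scaled by exp(centre distance^2 / enclosing-box diagonal^2)."""
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2
    # Width and height of the smallest box enclosing both boxes.
    enc_w = max(pred[2], gt[2]) - min(pred[0], gt[0])
    enc_h = max(pred[3], gt[3]) - min(pred[1], gt[1])
    denom = enc_w ** 2 + enc_h ** 2
    factor = math.exp((dx ** 2 + dy ** 2) / denom) if denom > 0 else 1.0
    return factor * (1.0 - iou(pred, gt))


def focusing_coefficient(beta: float, alpha: float = 1.9, delta: float = 3.0) -> float:
    """Non-monotonic gain r = beta / (delta * alpha ** (beta - delta))."""
    return beta / (delta * alpha ** (beta - delta))


def wiou_v3(pred: Box, gt: Box, mean_iou_loss: float) -> float:
    """v1 loss weighted by the focusing coefficient of this box's outlier degree."""
    beta = (1.0 - iou(pred, gt)) / mean_iou_loss  # outlier degree
    return focusing_coefficient(beta) * wiou_v1(pred, gt)


pred_box, gt_box = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
print(f"IoU loss: {1.0 - iou(pred_box, gt_box):.4f}")
print(f"WIoU v1:  {wiou_v1(pred_box, gt_box):.4f}")
print(f"WIoU v3:  {wiou_v3(pred_box, gt_box, mean_iou_loss=0.5):.4f}")
```

With this form of the coefficient, a box whose outlier degree equals δ receives a gain of exactly 1, while very low-quality and very high-quality boxes both receive smaller gains.
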
4. Experimentation and Analysis
4.1. Dataset and Experimental Environment
4.2. Evaluation Indicators
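
The result tables report precision (P), recall (R), mAP@0.5 and mAP@0.5:0.95. A minimal sketch of the standard definitions is given below, assuming detections have already been matched to ground-truth boxes at a fixed IoU threshold. The function names and the toy inputs are hypothetical, and evaluation toolkits differ in how they interpolate the precision-recall curve, so this is not necessarily the exact procedure behind the reported numbers.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """P = TP / (TP + FP), R = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(is_tp: list[bool], num_gt: int) -> float:
    """Area under the precision-recall curve for one class.

    `is_tp` lists detections in descending confidence order; each entry says
    whether that detection matched a ground-truth box at the IoU threshold.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    tp = fp = 0
    recalls, precisions = [0.0], [1.0]
    for hit in is_tp:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangles under the resulting step curve.
    return sum(
        (recalls[i] - recalls[i - 1]) * precisions[i] for i in range(1, len(recalls))
    )


def mean_average_precision(per_class_ap: list[float]) -> float:
    """mAP is the mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


p, r = precision_recall(tp=8, fp=2, fn=2)
ap = average_precision([True, False, True], num_gt=2)
print(f"P = {p:.2f}, R = {r:.2f}, AP = {ap:.4f}")
```

mAP@0.5 uses an IoU threshold of 0.5 to decide whether a detection counts as a true positive; mAP@0.5:0.95 averages the result over IoU thresholds from 0.5 to 0.95 in steps of 0.05.
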
4.3. Comparison of Loss Functions
4.4. Attention Module Comparison Test
4.5. Ablation Study
4.6. Comparison Test
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Niu, R.; Zhi, X.; Jiang, S.; Gong, J.; Zhang, W.; Yu, L. Aircraft Target Detection in Low Signal-to-Noise Ratio Visible Remote Sensing Images. Remote Sens. 2023, 15, 1971.
- Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, 20–25 June 2005; Volume 1, pp. 886–893.
- Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
- Budiman, A.; Fabian; Yaputera, R.A.; Achmad, S.; Kurniawan, A. Student attendance with face recognition (LBPH or CNN): Systematic literature review. Procedia Comput. Sci. 2023, 216, 31–38.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Girshick, R. Fast R-CNN. arXiv 2015.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015; MIT Press: Cambridge, MA, USA, 2015; pp. 21–37.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; IEEE Computer Society: Washington DC, USA, 2016; pp. 779–788.
- Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271.
- Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018.
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 21–37.
- Li, Z.; Zhou, F. FSSD: Feature fusion single shot multibox detector. arXiv 2017.
- Wu, T.; Dong, Y. YOLO-SE: Improved YOLOv8 for Remote Sensing Object Detection and Recognition. Appl. Sci. 2023, 13, 12977.
- Yi, H.; Liu, B.; Zhao, B.; Liu, E. Small Object Detection Algorithm Based on Improved YOLOv8 for Remote Sensing. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 17, 1734–1747.
- Wang, S.; Cao, X.; Wu, M.; Yi, C.; Zhang, Z.; Fei, H.; Zheng, H.; Jiang, H.; Jiang, Y.; Zhao, X.; et al. Detection of Pine Wilt Disease Using Drone Remote Sensing Imagery and Improved YOLOv8 Algorithm: A Case Study in Weihai, China. Forests 2023, 14, 2052.
- Wang, X.; Gao, H.; Jia, Z.; Li, Z. BL-YOLOv8: An Improved Road Defect Detection Model Based on YOLOv8. Sensors 2023, 23, 8361.
- Ouyang, D.; He, S.; Zhang, G.; Luo, M.; Guo, H.; Zhan, J.; Huang, Z. Efficient Multi-Scale Attention Module with Cross-Spatial Learning. In Proceedings of the ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5.
- Tong, Z.; Chen, Y.; Xu, Z.; Yu, R. Wise-IoU: Bounding Box Regression Loss with Dynamic Focusing Mechanism. arXiv 2023.
- Liu, Z.; Ye, K. YOLO-IMF: An improved YOLOv8 algorithm for surface defect detection in industrial manufacturing field. In Proceedings of the International Conference on Metaverse, Honolulu, HI, USA, 23–26 September 2023; Springer Nature: Cham, Switzerland, 2023; pp. 15–28.
- Zhu, Q.; Ma, K.; Wang, Z.; Shi, P. YOLOv7-CSAW for maritime target detection. Front. Neurorobotics 2023, 17, 1210470.
- Xia, G.-S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A Large-scale Dataset for Object Detection in Aerial Images. In Proceedings of the IEEE 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Liu, Y.C.; Shao, Z.R.; Hoffmann, N. Global attention mechanism: Retain information to enhance Channel-spatial interactions. arXiv 2021.
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020.
- Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv 2022.
- Li, S.; Fu, X.; Dong, J. Improved Ship Detection Algorithm Based on YOLOX for SAR Outline Enhancement Image. Remote Sens. 2022, 14, 4070.
Item | Configuration |
---|---|
Operating system | Windows 11 |
CPU | Intel(R) Core(TM) i9-9820X |
GPU | NVIDIA GeForce RTX 3090 |
RAM | 32 GB |
Deep learning framework | PyTorch (1.13.1) |
Interpreter | Python (3.10) |
CUDA version | CUDA (11.7) |
Attention Module | P/% | R/% | mAP@0.5/% | mAP@0.5:0.95/% |
---|---|---|---|---|
CBAM | 83.9 | 77.2 | 82.5 | 54.9 |
GAM | 84.3 | 77.2 | 82.7 | 54.8 |
ECA | 83.8 | 77.1 | 82.1 | 54.5 |
EMA | 84.5 | 77.3 | 82.7 | 55.1 |
Dataset | Base | Layer for Small Target | Wise-IoUv3 | EMA | Small Vehicle | Large Vehicle | Plane | Storage Tank | Ship | Harbor | mAP@0.5/% |
---|---|---|---|---|---|---|---|---|---|---|---|
DOTA | √ | × | × | × | 64.6 | 86.4 | 91.8 | 71.5 | 89.2 | 84.9 | 81.4 |
DOTA | √ | √ | × | × | 67.1 | 86.3 | 91.9 | 75.4 | 89.3 | 81.1 | 81.8 |
DOTA | √ | × | √ | × | 69.4 | 86.4 | 91.7 | 72.7 | 88.8 | 84.1 | 82.2 |
DOTA | √ | √ | √ | × | 69.0 | 86.6 | 92.2 | 75.8 | 88.9 | 82.3 | 82.5 |
DOTA | √ | √ | √ | √ | 71.2 | 87.3 | 92.7 | 76.3 | 89.4 | 79.2 | 82.7 |
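
The relationship between the per-class columns and the final column of the ablation table above can be checked with a few lines of arithmetic. The sketch below assumes the six class columns are per-class AP values at an IoU threshold of 0.5 (the table header does not say so explicitly) and that mAP is their unweighted mean; the configuration labels are shorthand introduced here for readability.

```python
# Per-class values (%) copied from the ablation table, in column order:
# small vehicle, large vehicle, plane, storage tank, ship, harbor.
# Each entry pairs those values with the reported mAP for that row.
ablation_rows = {
    "base": ([64.6, 86.4, 91.8, 71.5, 89.2, 84.9], 81.4),
    "base + small-target layer": ([67.1, 86.3, 91.9, 75.4, 89.3, 81.1], 81.8),
    "base + Wise-IoUv3": ([69.4, 86.4, 91.7, 72.7, 88.8, 84.1], 82.2),
    "base + layer + Wise-IoUv3": ([69.0, 86.6, 92.2, 75.8, 88.9, 82.3], 82.5),
    "base + layer + Wise-IoUv3 + EMA": ([71.2, 87.3, 92.7, 76.3, 89.4, 79.2], 82.7),
}


def mean_ap(per_class: list[float]) -> float:
    """Unweighted mean of the per-class values."""
    return sum(per_class) / len(per_class)


for name, (per_class, reported) in ablation_rows.items():
    computed = mean_ap(per_class)
    print(f"{name:<34} computed {computed:.2f}  reported {reported:.1f}")
```

Each computed mean lies within 0.1 of the reported value, which is consistent with the final column being the mean of rounded per-class figures. The same rows also show that the Harbor value falls from 84.9 to 79.2 in the full model even though the overall mean rises.
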
Algorithm | P/% | R/% | mAP@0.5/% |
---|---|---|---|
SSD | 79.8 | 76.4 | 79.3 |
YOLOv5 | 84.2 | 74.4 | 80.1 |
YOLOv7 [26] | 81.5 | 74.5 | 79.3 |
YOLOX [27] | 84.4 | 75.7 | 80.6 |
YOLOv8 | 84.0 | 76.6 | 81.4 |
Our algorithm | 84.5 | 77.3 | 82.7 |
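
To read the comparison table above as margins rather than absolute values, the sketch below subtracts each baseline from the row labelled "Our algorithm". The numbers are copied from the table; the helper name `gains_over` is hypothetical, and the differences are in percentage points.

```python
# (P, R, mAP at IoU 0.5) in percent, copied from the comparison table.
results = {
    "SSD": (79.8, 76.4, 79.3),
    "YOLOv5": (84.2, 74.4, 80.1),
    "YOLOv7": (81.5, 74.5, 79.3),
    "YOLOX": (84.4, 75.7, 80.6),
    "YOLOv8": (84.0, 76.6, 81.4),
    "Our algorithm": (84.5, 77.3, 82.7),
}


def gains_over(baseline: str) -> tuple[float, ...]:
    """Percentage-point differences (ours minus baseline), rounded to 0.1."""
    ours = results["Our algorithm"]
    return tuple(round(o - b, 1) for o, b in zip(ours, results[baseline]))


for name in results:
    if name != "Our algorithm":
        d_p, d_r, d_map = gains_over(name)
        print(f"vs {name:<7} P {d_p:+.1f}  R {d_r:+.1f}  mAP {d_map:+.1f}")
```

Against the YOLOv8 baseline this gives gains of 0.5, 0.7 and 1.3 percentage points in P, R and mAP at IoU 0.5, respectively.
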
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, H.; Yang, H.; Chen, H.; Wang, J.; Zhou, X.; Xu, Y. A Remote Sensing Image Target Detection Algorithm Based on Improved YOLOv8. Appl. Sci. 2024, 14, 1557. https://doi.org/10.3390/app14041557