YOLOv7-CHS: An Emerging Model for Underwater Object Detection
Abstract
1. Introduction
- After examining the relationship between underwater image enhancement and target detection, we find no association between the two. Image enhancement is therefore not a mandatory step when detecting targets in underwater environments.
- To improve underwater detection accuracy while reducing computational cost, we propose the high-order spatial interaction (HOSI) module as a replacement for the efficient layer aggregation network (ELAN) in the YOLOv7 backbone. The HOSI module combines gated convolution with recursive convolution to model high-order spatial interactions, which makes it flexible and customizable and greatly reduces model complexity (in the ablation, HOSI alone lowers FLOPs from 103.2 G to 43.5 G).
- Drawing inspiration from the transformer’s working mechanism, we propose the contextual transformer (CT) module to augment our detection network, enabling the integration of both dynamic and static context representations to improve the model’s ability to detect small targets.
- We integrate the simple parameter-free attention (SPFA) module into the detection network so that it attends to channel and spatial information simultaneously, improving its ability to selectively extract relevant information.
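The parameter-free attention described in the last contribution can be illustrated with a minimal NumPy sketch. It assumes the energy-based weighting of SimAM, on which SPFA appears to be based; the function names, the `(C, H, W)` layout, and the regularizer value `lam=1e-4` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def spfa_weights(x, lam=1e-4):
    """Per-element attention weights for a feature map x of shape (C, H, W).

    Assumed SimAM-style formulation: elements that deviate strongly from
    their channel mean receive larger weights. No learnable parameters.
    """
    n = x.shape[1] * x.shape[2] - 1                  # spatial size minus one
    mu = x.mean(axis=(1, 2), keepdims=True)          # per-channel mean
    d = (x - mu) ** 2                                # squared deviation
    var = d.sum(axis=(1, 2), keepdims=True) / n      # per-channel variance
    inv_energy = d / (4.0 * (var + lam)) + 0.5       # inverse of minimal energy
    return 1.0 / (1.0 + np.exp(-inv_energy))         # sigmoid

def spfa(x, lam=1e-4):
    """Reweight the feature map with its own 3-D attention weights."""
    return x * spfa_weights(x, lam)

if __name__ == "__main__":
    feat = np.zeros((1, 3, 3))
    feat[0, 1, 1] = 1.0                              # one distinctive activation
    w = spfa_weights(feat)
    print(w[0, 1, 1] > w[0, 0, 0])                   # the distinctive element is weighted higher
```

Because the weights have the same shape as the feature map, a single pass covers both channel and spatial dimensions, which is the property the contribution highlights.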
2. Related Work
2.1. Underwater Object Detection
2.2. Small Object Detection
2.3. Image Enhancement
3. Methods
3.1. Network Architecture
3.2. High-Order Spatial Interaction (HOSI) Module
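As a rough illustration of how gated and recursive convolution can combine into a high-order spatial interaction, the sketch below follows the general shape of a recursive gated convolution. Everything specific here is an assumption made for the example: the 3×3 depthwise kernel, the random weights, the channel split, and all function names. It is not the HOSI module as configured in this paper.

```python
import numpy as np

def pointwise(x, w):
    """1x1 convolution: mix channels of x (C_in, H, W) with w (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def depthwise3x3(x, k):
    """Zero-padded depthwise 3x3 convolution; k has shape (C, 3, 3)."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += k[:, dy, dx, None, None] * padded[:, dy:dy + h, dx:dx + w]
    return out

def recursive_gated_conv(x, order=3, seed=0):
    """Order-n gated interaction on a float array x of shape (C, H, W).

    C must be divisible by 2**(order - 1). Weights are random placeholders.
    """
    rng = np.random.default_rng(seed)
    c = x.shape[0]
    dims = [c // 2 ** i for i in range(order)][::-1]      # e.g. [C/4, C/2, C]
    # Input projection produces one gate plus features for every order.
    w_in = rng.normal(scale=0.1, size=(dims[0] + sum(dims), c))
    fused = pointwise(x, w_in)
    gate, feats = fused[:dims[0]], fused[dims[0]:]
    feats = depthwise3x3(feats, rng.normal(scale=0.1, size=(sum(dims), 3, 3)))
    parts = np.split(feats, np.cumsum(dims)[:-1], axis=0)
    y = gate * parts[0]                                   # first-order gating
    for i in range(order - 1):                            # recursive higher orders
        w_up = rng.normal(scale=0.1, size=(dims[i + 1], dims[i]))
        y = pointwise(y, w_up) * parts[i + 1]
    return pointwise(y, rng.normal(scale=0.1, size=(c, c)))  # output projection

if __name__ == "__main__":
    x = np.random.default_rng(1).normal(size=(8, 5, 6))
    print(recursive_gated_conv(x).shape)
```

Each loop iteration multiplies the running output by another spatially filtered feature group, so the interaction order grows by one per step while the channel width doubles.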
3.3. Contextual Transformer (CT) Module
3.4. Simple Parameter-Free Attention (SPFA) Module
4. Experiments
4.1. Dataset
4.2. Experimental Settings
4.3. Evaluation Metrics
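The result tables report precision (P), recall (R), and mAP at IoU thresholds such as 0.5. The sketch below shows the standard definitions for a single image and class, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` and greedy one-to-one matching in order of confidence. The function names and the matching strategy are illustrative assumptions, not the evaluation code used for the experiments.

```python
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def precision_recall(preds, truths, thr=0.5):
    """Precision and recall at an IoU threshold.

    preds  : list of (box, confidence) pairs
    truths : list of ground-truth boxes
    A prediction is a true positive if it matches an unused ground-truth
    box with IoU >= thr; each ground-truth box can be matched once.
    """
    matched = set()
    tp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best, best_iou = None, thr
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            v = iou(box, gt)
            if v >= best_iou:
                best, best_iou = j, v
        if best is not None:
            matched.add(best)
            tp += 1
    fp = len(preds) - tp          # unmatched predictions
    fn = len(truths) - tp         # missed ground-truth boxes
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall

if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8), ((21, 21, 30, 30), 0.7)]
    print(precision_recall(preds, truths))
```

mAP@0.5 averages the area under the precision-recall curve over classes at this single threshold, while mAP@0.5:0.95 additionally averages over thresholds from 0.5 to 0.95.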
4.4. Image Enhancement
4.5. Ablation Experiments
4.6. Comparisons with Other Methods
4.6.1. Selection of Optimizer
4.6.2. Results on the Starfish Dataset
4.6.3. Results on the DUO Dataset
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest

Method | Starfish P (%) | Starfish R (%) | Starfish mAP@0.5 (%) | Starfish mAP@0.5:0.95 (%) | DUO P (%) | DUO R (%) | DUO mAP@0.5 (%) | DUO mAP@0.5:0.95 (%) |
---|---|---|---|---|---|---|---|---|
Original image | 74.5 | 41.8 | 48.4 | 21.4 | 86.7 | 78.3 | 84.1 | 65.5 |
CLAHE | 79.2 | 46.0 | 51.8 | 22.4 | 88.2 | 75.4 | 83.5 | 62.6 |
DCP | 75.1 | 32.7 | 39.7 | 18.1 | 86.0 | 75.2 | 82.9 | 62.9 |
DeblurGAN-v2 | 74.3 | 34.5 | 41.5 | 17.6 | 86.1 | 77.1 | 84.0 | 63.9 |

Group | CT3 | HOSI | SPFA | Layers | Param. (M) | FLOPs (G) | Starfish P (%) | Starfish R (%) | Starfish mAP@0.5 (%) | DUO P (%) | DUO R (%) | DUO mAP@0.5 (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | × | × | × | 314 | 34.8 | 103.2 | 72.8 | 42.5 | 47.3 | 86.9 | 79.0 | 80.1 |
2 | ✓ | × | × | 354 | 29.13 | 90.7 | 83.9 | 40.9 | 45.3 | 88.9 | 78.1 | 81.4 |
3 | × | ✓ | × | 330 | 37.66 | 43.5 | 80.8 | 39.3 | 44.1 | 86.3 | 76.4 | 80.4 |
4 | × | × | ✓ | 312 | 34.71 | 102.7 | 75.7 | 40.6 | 46.9 | 86.7 | 80.2 | 82.0 |
5 | ✓ | ✓ | × | 370 | 32.00 | 40.4 | 82.7 | 40.5 | 46.3 | 89.8 | 66.2 | 76.3 |
6 | ✓ | × | ✓ | 353 | 29.13 | 90.6 | 81.1 | 42.3 | 50.0 | 86.4 | 79.5 | 81.1 |
7 | × | ✓ | ✓ | 329 | 37.66 | 43.4 | 78.8 | 41.5 | 47.5 | 87.8 | 68.8 | 74.9 |
8 | ✓ | ✓ | ✓ | 369 | 31.98 (−2.82) * | 40.3 (−62.9) * | 74.5 | 41.8 | 48.4 (+1.1) * | 86.7 | 78.3 | 84.1 (+4.0) * |
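The starred values in the last row can be reproduced from the table itself. The short check below assumes that each starred difference is measured against group 1 (no CT3, HOSI, or SPFA); the dictionary keys and function names are made up for the example.

```python
# Values copied from ablation groups 1 (baseline) and 8 (all three modules).
baseline = {"params_m": 34.8, "flops_g": 103.2, "starfish_map50": 47.3, "duo_map50": 80.1}
full = {"params_m": 31.98, "flops_g": 40.3, "starfish_map50": 48.4, "duo_map50": 84.1}

def deltas(base, new):
    """Absolute change of each metric, rounded to two decimals."""
    return {k: round(new[k] - base[k], 2) for k in base}

def relative_change(base, new):
    """Percentage change of each metric relative to the baseline."""
    return {k: round(100.0 * (new[k] - base[k]) / base[k], 1) for k in base}

if __name__ == "__main__":
    for name, value in deltas(baseline, full).items():
        print(f"{name}: {value:+}")
    print(relative_change(baseline, full))
```

Under that assumption the differences come out as −2.82 M parameters, −62.9 G FLOPs, +1.1 mAP@0.5 on Starfish, and +4.0 mAP@0.5 on DUO, matching the table. In relative terms the FLOPs reduction is roughly 61%.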

Model | Optimizer | Starfish P (%) | Starfish R (%) | Starfish mAP@0.5 (%) | Starfish FPS | DUO P (%) | DUO R (%) | DUO mAP@0.5 (%) | DUO FPS |
---|---|---|---|---|---|---|---|---|---|
YOLOv7-CHS | SGD | 74.5 | 41.8 | 48.4 | 30 | 86.7 | 78.3 | 84.1 | 32 |
YOLOv7-CHS | Adam | 66.3 | 46.8 | 52.8 (+4.4) | 32 | 87.6 | 77.0 | 84.6 (+0.5) | 32 |

Model | P (%) | R (%) | mAP@0.5 (%) | FLOPs (G) | FPS |
---|---|---|---|---|---|
Swin-Transformer | 83.0 | 28.9 | 35.8 | 4.5 | 30 |
YOLOv5s | 86.1 | 29.2 | 35.7 | 15.8 | 61 |
YOLOv7-tiny | 72.8 | 28.7 | 34.3 | 13.0 | 106 |
YOLOv7 | 69.1 | 42.5 | 48.3 | 103.2 | 31 |
YOLOv7-CHS | 66.3 | 46.8 | 52.8 | 40.3 | 32 |

Model | Holothurian AP (%) | Echinus AP (%) | Scallop AP (%) | Starfish AP (%) | mAP@0.5 (%) | FLOPs (G) | FPS |
---|---|---|---|---|---|---|---|
Swin-Transformer | 79.3 | 90.2 | 54.7 | 91.0 | 78.8 | 4.5 | 30 |
YOLOv4s-mish | 83.1 | 92.5 | 56.7 | 93.3 | 81.4 | 20.6 | 37 |
YOLOv5s | 81.1 | 91.7 | 45.8 | 93.0 | 77.9 | 15.8 | 61 |
YOLOv7-tiny | 80.8 | 93.6 | 61.8 | 93.0 | 82.3 | 13.0 | 38 |
YOLOv7 | 81.3 | 91.0 | 55.4 | 92.8 | 80.4 | 103.2 | 31 |
YOLOv7-CHS | 85.3 | 93.9 | 64.0 | 95.2 | 84.6 | 40.3 | 32 |
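The relationship between the per-class AP columns and the mAP@0.5 column can be checked directly, assuming mAP@0.5 is the unweighted mean of the four class APs (the usual definition). The variable and function names below are illustrative.

```python
# Per-class AP (Holothurian, Echinus, Scallop, Starfish) and reported mAP@0.5,
# copied from the DUO comparison table.
duo_results = {
    "Swin-Transformer": ([79.3, 90.2, 54.7, 91.0], 78.8),
    "YOLOv4s-mish": ([83.1, 92.5, 56.7, 93.3], 81.4),
    "YOLOv5s": ([81.1, 91.7, 45.8, 93.0], 77.9),
    "YOLOv7-tiny": ([80.8, 93.6, 61.8, 93.0], 82.3),
    "YOLOv7": ([81.3, 91.0, 55.4, 92.8], 80.4),
    "YOLOv7-CHS": ([85.3, 93.9, 64.0, 95.2], 84.6),
}

def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precision."""
    return sum(per_class_ap) / len(per_class_ap)

def matches_reported(results, tol=0.05):
    """True where the class mean agrees with the reported mAP to one decimal."""
    return {
        name: abs(mean_ap(aps) - reported) <= tol + 1e-9
        for name, (aps, reported) in results.items()
    }

if __name__ == "__main__":
    for name, ok in matches_reported(duo_results).items():
        aps, reported = duo_results[name]
        print(f"{name}: mean={mean_ap(aps):.3f} reported={reported} match={ok}")
```

For five of the six rows the class mean reproduces the reported mAP@0.5 to one decimal, including 84.6 for YOLOv7-CHS. For the YOLOv7 row the class mean is about 80.1, whereas the table lists 80.4.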
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhao, L.; Yun, Q.; Yuan, F.; Ren, X.; Jin, J.; Zhu, X. YOLOv7-CHS: An Emerging Model for Underwater Object Detection. J. Mar. Sci. Eng. 2023, 11, 1949. https://doi.org/10.3390/jmse11101949