A Real-Time Fish Target Detection Algorithm Based on Improved YOLOv5
Abstract
1. Introduction
- A Gamma transform is added to the preprocessing stage to improve the contrast and gray-level distribution of underwater images, so that the model can distinguish targets from the background more reliably and detect them more accurately.
- The SE channel attention mechanism is integrated into the lightweight ShuffleNetv2 network, which then replaces the YOLOv5s backbone for feature extraction; this greatly reduces the number of parameters and makes the model lightweight.
- An improved, simplified version of the weighted bidirectional feature pyramid network (BiFPN) is used as a module, and feature fusion is repeated three times to obtain richer feature information and further improve detection performance.
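As a rough illustration of the SE channel attention named in the second contribution, the sketch below applies the three standard SE steps (squeeze by global average pooling, excitation by two fully connected layers, channel-wise scaling) to a tiny feature map in plain Python. The function name `se_block`, the toy weights, and the omission of biases are assumptions made for illustration; where the block sits inside ShuffleNetv2 and which reduction ratio is used are not stated in this section.

```python
import math


def se_block(feature_map, w1, w2):
    """Apply one squeeze-and-excitation step to a feature map.

    feature_map: list of C channels, each a list of rows (H x W floats).
    w1: (C/r) x C weights of the reducing fully connected layer (hypothetical values).
    w2: C x (C/r) weights of the expanding fully connected layer (hypothetical values).
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives one weight in (0, 1) per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    scales = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value of a channel by that channel's weight.
    return [
        [[value * scale for value in row] for row in channel]
        for channel, scale in zip(feature_map, scales)
    ]


# Toy input: 2 channels of size 2 x 2, reduced to 1 hidden unit (r = 2).
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
reweighted = se_block(fmap, w1=[[0.5, 0.5]], w2=[[1.0], [-1.0]])
print(reweighted)
```

The output keeps the input shape; only the relative strength of the channels changes, which is why the mechanism adds few parameters to a lightweight backbone.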
2. Materials and Methods
2.1. Dataset Acquisition
- First, the number of images of each fish class in the F4K dataset was counted;
- Then, for each class with more than 200 samples, 200 samples were randomly selected;
- Finally, the samples selected in the second step were merged with all samples of the classes that have fewer than 200 samples to form the dataset used in this paper.
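The three steps above can be sketched as follows. This is a minimal sketch, assuming the dataset is available as `(image_path, class_name)` pairs; the function name `build_balanced_subset`, the fixed seed, and the placeholder class names are hypothetical and do not come from the paper.

```python
import random
from collections import Counter, defaultdict


def build_balanced_subset(samples, cap=200, seed=0):
    """Return a subset in which every class contributes at most `cap` samples.

    samples: iterable of (image_path, class_name) pairs.
    """
    # Step 1: group (and thereby count) the images of each class.
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append(path)

    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    subset = []
    for label in sorted(by_class):
        paths = by_class[label]
        # Step 2: randomly keep `cap` samples of any class that exceeds the cap.
        if len(paths) > cap:
            paths = rng.sample(paths, cap)
        # Step 3: merge with the smaller classes, which are kept unchanged.
        subset.extend((path, label) for path in paths)
    return subset


# Placeholder data: one large class and one small class.
toy = [(f"a_{i}.png", "species_a") for i in range(500)]
toy += [(f"b_{i}.png", "species_b") for i in range(50)]
print(Counter(label for _, label in build_balanced_subset(toy)))
```

Capping the large classes while keeping the small ones whole reduces class imbalance without discarding any rare-class images.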
2.2. Dataset Annotation
3. Proposed Method
3.1. Data Image Enhancement
3.2. Lightweight Feature Extraction Network Design with a Fused Attention Mechanism
3.2.1. ShuffleNetv2 Model
3.2.2. SE Channel Attention Mechanism
3.2.3. Improved Feature Extraction Network
3.3. Improved Feature Fusion Network Design
3.3.1. BiFPN and Its Simplification
- Reduce the number of BiFPN input nodes to match the three effective feature layers output by the lightweight backbone network;
- Delete the nodes that have only one input edge, because their contribution to feature fusion is small;
- Add an extra cross-scale edge that fuses each feature from the feature extraction network directly with the feature of the same size in the bottom-up path, so that the network retains more shallow semantic information without losing too much of the relatively deep semantic information.
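To make the weighted fusion at a node with the extra edge more concrete, the sketch below uses the fast normalized fusion of the original BiFPN, in which each input is multiplied by a non-negative learnable weight and the weights are normalized by their sum. Whether the improved network in this paper keeps this fusion rule unchanged is an assumption; the function name and the flattened toy features are hypothetical.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with normalized non-negative weights.

    features: list of flattened feature maps, all of the same length.
    weights: one learnable scalar per input feature.
    """
    if len(features) != len(weights):
        raise ValueError("one weight is required per input feature")
    # ReLU keeps every weight non-negative so the normalization is stable.
    positive = [max(0.0, w) for w in weights]
    total = sum(positive) + eps  # eps avoids division by zero
    fused = [0.0] * len(features[0])
    for feature, w in zip(features, positive):
        for i, value in enumerate(feature):
            fused[i] += (w / total) * value
    return fused


# A node with the extra edge fuses three same-size inputs:
backbone_feature = [1.0, 1.0]      # direct edge from the feature extraction network
intermediate_feature = [2.0, 2.0]  # same-level feature from the top-down path
resampled_feature = [3.0, 3.0]     # feature resampled from the adjacent level
fused = fast_normalized_fusion(
    [backbone_feature, intermediate_feature, resampled_feature],
    weights=[1.0, 1.0, 1.0],
)
print(fused)  # close to the element-wise mean, because the weights are equal
```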
3.3.2. Lightweight Ghost Convolution Module
3.3.3. Improved Feature Fusion Networks
4. Experiment and Analysis
4.1. Experimental Configuration
4.2. Evaluation Metrics
4.3. Experimental Results and Analysis
4.3.1. Experiment on the Selection of Gamma Value in Preprocessing
4.3.2. Experiments on Attention Mechanism Selection
4.3.3. Comparative Experiments before and after the Improvement
4.3.4. Ablation Experiment
4.3.5. Comparison of Different Detection Algorithms
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Parameter Name | Parameter Values |
---|---|
Learning rate | 0.01 |
Momentum | 0.937 |
Weight decay | 0.0005 |
Batch size | 16 |
Epochs | 150 |
Value of Gamma | mAP (%) |
---|---|
— | 97.00 |
0.25 | 96.93 |
0.50 | 97.35 |
0.75 | 97.60 |
1.00 | 97.52 |
1.25 | 96.95 |
1.50 | 96.50 |
1.75 | 96.12 |
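
The Gamma values in the table above can be read against the following sketch of a Gamma transform on 8-bit images. It assumes the conventional power-law form s = r^γ on intensities normalized to [0, 1], implemented as a 256-entry lookup table; the exact form and normalization used in the paper are not stated in this section, and the function names are hypothetical.

```python
def gamma_lut(gamma):
    """Build a 256-entry lookup table for s = 255 * (r / 255) ** gamma."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return bytes(min(255, round(255.0 * (v / 255.0) ** gamma)) for v in range(256))


def apply_gamma(pixels, gamma):
    """Apply the Gamma transform to a buffer of 8-bit intensities.

    pixels: bytes-like object, e.g. a raw grayscale or interleaved RGB image.
    """
    # bytes.translate maps every byte through the 256-entry table in one pass.
    return bytes(pixels).translate(gamma_lut(gamma))


# Gamma = 0.75 is the value with the highest mAP in the table above.
dark_pixels = bytes([0, 16, 64, 128, 255])
print(list(apply_gamma(dark_pixels, 0.75)))
```

Under this convention a value below 1 brightens dark regions and leaves the end points 0 and 255 unchanged, while a value of 1 returns the image unmodified.
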
Model | Precision/% | Recall/% | F1/% | Parameters | Model Size/MB | FLOPs/G | mAP/% |
---|---|---|---|---|---|---|---|
ShuffleNetv2 | 92.67 | 92.10 | 92.38 | 842,358 | 2.0 | 1.83 | 96.70 |
+ECA | 92.75 | 92.17 | 92.46 | 842,388 | 2.0 | 1.83 | 96.80 |
+CA | 93.31 | 92.37 | 92.84 | 861,734 | 2.1 | 1.83 | 97.01 |
+CBAM | 92.76 | 92.15 | 92.45 | 851,914 | 2.1 | 1.83 | 96.76 |
+SE | 93.37 | 92.90 | 93.13 | 850,934 | 2.1 | 1.83 | 97.45 |
Model | Parameters | FLOPs/G | Model Size/MB | mAP/% |
---|---|---|---|---|
YOLOv5s | 7,012,822 | 15.76 | 13.7 | 97.60 |
Ours | 1,290,218 | 2.96 | 3.2 | 98.10 |
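
The relative savings implied by the two rows above can be checked with a few lines of arithmetic. The dictionary keys and the helper name are chosen here for illustration; the numbers are taken directly from the table.

```python
baseline = {"parameters": 7_012_822, "flops_g": 15.76, "size_mb": 13.7}  # YOLOv5s
improved = {"parameters": 1_290_218, "flops_g": 2.96, "size_mb": 3.2}    # Ours


def relative_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


reductions = {key: relative_reduction(baseline[key], improved[key]) for key in baseline}
for key, percent in reductions.items():
    print(f"{key}: {percent:.1f}% smaller")
# parameters: 81.6% smaller
# flops_g: 81.2% smaller
# size_mb: 76.6% smaller
```

In other words, the improved model uses roughly one fifth of the parameters and FLOPs of YOLOv5s while its mAP is 0.5 percentage points higher (98.10% versus 97.60%).
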
Scheme | Replace CSPDarkNet53 with ShuffleNetv2 | Add SE Attention Mechanism | Add Improved BiFPN-Short |
---|---|---|---|
0 | | | |
1 | √ | | |
2 | √ | √ | |
3 | √ | √ | √ |
Model | Precision/% | Recall/% | F1/% | Parameters | Model Size/MB | FLOPs/G | mAP/% |
---|---|---|---|---|---|---|---|
Scheme 0 | 95.58 | 92.09 | 93.80 | 7,012,822 | 13.7 | 15.76 | 97.60 |
Scheme 1 | 92.67 | 92.10 | 92.38 | 842,358 | 2.0 | 1.83 | 96.70 |
Scheme 2 | 93.37 | 92.90 | 93.13 | 850,934 | 2.1 | 1.83 | 97.45 |
Scheme 3 | 95.80 | 92.91 | 94.33 | 1,290,218 | 3.2 | 2.96 | 98.10 |
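
As a consistency check on the table above, the F1 column can be recomputed from the precision and recall columns, assuming the standard definition F1 = 2PR / (P + R). The helper name is hypothetical; the values are copied from the table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for each ablation scheme
ablation = {
    "Scheme 0": (95.58, 92.09, 93.80),
    "Scheme 1": (92.67, 92.10, 92.38),
    "Scheme 2": (93.37, 92.90, 93.13),
    "Scheme 3": (95.80, 92.91, 94.33),
}

for name, (precision, recall, reported) in ablation.items():
    computed = f1_score(precision, recall)
    print(f"{name}: computed F1 = {computed:.2f}, reported F1 = {reported:.2f}")
```

All four recomputed values agree with the reported F1 scores to two decimal places.
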
Method | mAP/% | F1/% | FLOPs/G | Parameters | Model Size/MB | FPS |
---|---|---|---|---|---|---|
Faster R-CNN | 96.15 | 82.12 | 369.72 | 136,689,024 | 521.4 | 6.57 |
YOLOv5x | 98.10 | 95.19 | 203.76 | 86,173,414 | 173.1 | 23.09 |
YOLOv4 | 97.16 | 92.00 | 59.95 | 63,937,686 | 244.4 | 14.49 |
SSD | 94.34 | 91.29 | 60.76 | 23,611,734 | 90.6 | 16.36 |
YOLOv5-Lite | 97.70 | 94.00 | 14.58 | 5,257,558 | 11.2 | 30.06 |
YOLOv5s | 97.60 | 93.80 | 15.76 | 7,012,822 | 13.7 | 32.96 |
Ours | 98.10 | 94.33 | 2.96 | 1,290,218 | 3.2 | 30.94 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, W.; Zhang, Z.; Jin, B.; Yu, W. A Real-Time Fish Target Detection Algorithm Based on Improved YOLOv5. J. Mar. Sci. Eng. 2023, 11, 572. https://doi.org/10.3390/jmse11030572