An Accurate Detection Model of Takifugu rubripes Using an Improved YOLO-V7 Network
Abstract
1. Introduction
- (1) Compared with common scenes, underwater images are degraded by lighting, water flow, and water quality. Overlapping and occluded fish bodies also form a relatively complex background, which makes detection harder and leads to inaccurate results.
- (2) During feature extraction and fusion, the feature maps output by each node are not fully utilized, so the feature extraction ability can be further strengthened during training.
- (3) Because cultured T. rubripes are densely stocked and target sizes vary within an image, the detection head of YOLO-V7 needs to be improved.
2. Materials and Methods
2.1. Dataset
2.1.1. Data Acquisition and Image Features
2.1.2. Image Annotation and Dataset Production
2.2. Related Works
2.2.1. YOLO-V7
2.2.2. Evaluation Metrics
2.3. The Proposed Algorithm
2.3.1. Improvements in Feature Extraction Capabilities
2.3.2. Improvement of the Detection Head
3. Results
3.1. Analysis of Training Results
3.2. Algorithm Performance Evaluation
3.2.1. Pre-Training
3.2.2. Performance Comparison to Improve Feature Extraction Capabilities
3.2.3. Performance Comparison to Improve Head
3.2.4. Comparison of Network Pruning Performance
3.3. Performance Comparison of the Overall Algorithm
4. Conclusions and Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
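Algorithm A1: The PyTorch-style code of the improved LgConv & LKDeXt module

    import torch
    import torch.nn as nn
    from timm.models.layers import DropPath  # assumed source of DropPath (stochastic depth)

    # Assumed to be in scope: conv_bn_relu, conv_bn, get_bn and ReparamLargeKernelConv
    # (named as in the public RepLKNet code) and Conv (the YOLO-V7 conv + BN + activation block).


    class LgConv(nn.Module):
        """Residual block: BN -> 1x1 conv -> large depthwise conv -> ReLU -> 1x1 conv."""

        def __init__(self, in_channels, dw_channels, block_lk_size, small_kernel,
                     drop_path, small_kernel_merged=False):
            super().__init__()
            self.pw1 = conv_bn_relu(in_channels, dw_channels, 1, 1, 0)  # 1x1 conv into dw_channels
            self.pw2 = conv_bn(dw_channels, in_channels, 1, 1, 0)  # 1x1 conv back to in_channels
            # Positional arguments: in, out, kernel size, stride, groups, small kernel, merged flag
            self.large_kernel = ReparamLargeKernelConv(
                dw_channels, dw_channels, block_lk_size, 1, dw_channels,
                small_kernel, small_kernel_merged)
            self.lk_nonlinear = nn.ReLU()
            self.prelkb_bn = get_bn(in_channels)
            self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()

        def forward(self, x):
            out = self.prelkb_bn(x)
            out = self.pw1(out)
            out = self.large_kernel(out)
            out = self.lk_nonlinear(out)
            out = self.pw2(out)
            return x + self.drop_path(out)  # residual connection


    class LKDeXt(nn.Module):
        """Two-branch block: n stacked LgConv blocks on one branch, a 1x1 conv on the other."""

        # The source lists a bare `True` here; `shortcut=True` is the assumed parameter name.
        # `shortcut` and `g` are not used in the body.
        def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
            super().__init__()
            c_ = int(c2 * e)  # hidden channels
            self.cv1 = Conv(c1, c_, 1, 1)
            self.cv2 = Conv(c1, c_, 1, 1)
            self.cv3 = Conv(2 * c_, c2, 1)
            self.m = nn.Sequential(*(LgConv(c_, c_, 21, 5, 0.0, False) for _ in range(n)))

        def forward(self, x):
            return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))
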
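The `small_kernel` and `small_kernel_merged` arguments in Algorithm A1 suggest RepLKNet-style structural re-parameterization, in which a small-kernel branch trained in parallel with the large kernel is folded into it for inference. The following is a minimal sketch of that idea under simplifying assumptions: it works in 1-D on plain Python lists, ignores batch-norm folding, and uses 5-tap and 3-tap kernels as stand-ins for the 21 and 5 sizes in Algorithm A1. The function names `conv1d_same` and `merge_small_into_large` are hypothetical and do not come from the paper.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate `signal` with an odd-length `kernel`, zero-padding the borders."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def merge_small_into_large(large, small):
    """Zero-pad the small kernel to the large kernel's length and add element-wise."""
    pad = (len(large) - len(small)) // 2
    padded_small = [0.0] * pad + list(small) + [0.0] * pad
    return [a + b for a, b in zip(large, padded_small)]


signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, 2.0]
large = [0.5, -1.0, 2.0, 1.0, -0.5]  # stand-in for the large kernel
small = [1.0, 0.0, -1.0]             # stand-in for the parallel small kernel

# Training-time view: two parallel branches whose outputs are summed.
two_branch = [
    a + b
    for a, b in zip(conv1d_same(signal, large), conv1d_same(signal, small))
]

# Inference-time view: a single convolution with the merged kernel.
merged = conv1d_same(signal, merge_small_into_large(large, small))

print(all(abs(a - b) < 1e-9 for a, b in zip(two_branch, merged)))
```

Because convolution is linear in the kernel, the two views give the same output, which is why the extra branch adds no inference cost once it has been merged.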
Algorithm A2: The pseudocode of the improved ConvBlock & LKDeXt

    def ConvBlock(x):
        x = Conv(x)
        x = Batch_norm(x)
        x = ReLU(x)
        return x

    def LgConv(x):
        y = Batch_norm(x)
        y = ConvBlock(y)
        y = ReparamLargeKernelConv(y)
        y = ReLU(y)
        y = ConvBlock(y)
        return x + dropout(y)

    def LKDeXt(x):
        y = Conv(concat(LgConv(Conv(x)), Conv(x), dim=1))
        return y

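The channel bookkeeping of LKDeXt can be traced without running any network. The sketch below follows the constructor in Algorithm A1, where the hidden width is `int(c2 * e)` and the two branches are concatenated along the channel dimension before the final convolution. The helper name `lkdext_channels` and the example width of 256 are hypothetical and chosen only for illustration.

```python
def lkdext_channels(c1, c2, e=0.5):
    """Trace channel counts through LKDeXt as written in Algorithm A1."""
    c_hidden = int(c2 * e)  # `c_` in Algorithm A1
    return {
        "input": c1,
        "cv1 + LgConv branch": c_hidden,  # LgConv keeps the channel count
        "cv2 branch": c_hidden,
        "concat (dim=1)": 2 * c_hidden,
        "cv3 output": c2,
    }


trace = lkdext_channels(c1=256, c2=256)
for stage, channels in trace.items():
    print(f"{stage}: {channels}")
```

With the default `e=0.5`, each branch carries half of the output width, so the concatenated tensor has `c2` channels whenever `c2` is even.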

Configuration | Parameter |
---|---|
CPU | Intel Xeon(R) Gold 5128R |
GPU | NVIDIA RTX 3090 Ti |
Operating system | Ubuntu 20.04 |
Development environment | PyCharm 2022.2 |
Accelerated environment | CUDA 11.1 |

Model (Conf-Thresh = 0.25, IoU = 0.5) | Precision | Recall | F1-Score | TP | FP | FN |
---|---|---|---|---|---|---|
YOLO-V7 | 0.91 | 0.82 | 0.86 | 1859 | 174 | 421 |
Improved YOLO-V7 | 0.96 | 0.94 | 0.95 | 2154 | 79 | 126 |

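
The precision, recall, and F1-score values in the table above can be reproduced from the TP, FP, and FN counts. The sketch below assumes the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = harmonic mean of the two). The function name `detection_scores` is hypothetical.

```python
def detection_scores(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the table (Conf-Thresh = 0.25, IoU = 0.5).
counts = {
    "YOLO-V7": (1859, 174, 421),
    "Improved YOLO-V7": (2154, 79, 126),
}

for name, (tp, fp, fn) in counts.items():
    p, r, f1 = detection_scores(tp, fp, fn)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Rounded to two decimals, the computed values match the tabulated ones. TP + FN equals 2280 for both models, consistent with both being evaluated on the same set of annotated targets.
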
Model | YOLO-V7 | Feature Extraction | Improved Head | Network Pruning | GFLOPs |
---|---|---|---|---|---|
1 | √ | | | | 104.8 |
2 | √ | √ | | | 109.9 |
3 | √ | | √ | | 119.4 |
4 (Ours) | √ | √ | √ | √ | 68.2 |

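
To put the GFLOPs figures in proportion, the short sketch below expresses each model's cost relative to Model 1, the unmodified YOLO-V7. The values are copied from the table. The helper name `relative_change` is hypothetical.

```python
def relative_change(value, reference):
    """Percentage change of `value` with respect to `reference`."""
    return 100.0 * (value - reference) / reference


# GFLOPs per model, as listed in the table.
gflops = {"1": 104.8, "2": 109.9, "3": 119.4, "4 (Ours)": 68.2}
baseline = gflops["1"]

changes = {model: relative_change(value, baseline) for model, value in gflops.items()}
for model, pct in changes.items():
    print(f"Model {model}: {gflops[model]:.1f} GFLOPs ({pct:+.1f}% vs. Model 1)")
```

The two intermediate models cost more than the baseline, while the final pruned model needs roughly 35% fewer GFLOPs than the original YOLO-V7.
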
Model | YOLO-V7 | Feature Extraction | Improved Head | Network Pruning | (%) | (%) |
---|---|---|---|---|---|---|
1 | √ | | | | 87.79% | 52.76% |
2 | √ | √ | | | 91.37% | 55.82% |
3 | √ | | √ | | 89.81% | 56.65% |
4 (Ours) | √ | √ | √ | √ | 92.86% | 57.94% |

Model | (%) | (%) |
---|---|---|
YOLO-V5 | 87.11% | 51.80% |
Faster R-CNN | 88.71% | 53.55% |
SSD | 82.26% | 46.43% |
YOLO-V7 | 87.79% | 52.76% |
Ours | 92.86% | 57.94% |
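
The margin of the proposed model over each compared detector can be read off as a difference in percentage points. The sketch below computes it for both score columns of the comparison table. The column names did not survive in the text, so the tuples are labeled only as the first and second column.

```python
# (first column %, second column %) for each model, as listed in the table.
scores = {
    "YOLO-V5": (87.11, 51.80),
    "Faster R-CNN": (88.71, 53.55),
    "SSD": (82.26, 46.43),
    "YOLO-V7": (87.79, 52.76),
    "Ours": (92.86, 57.94),
}

ours_first, ours_second = scores["Ours"]

# Gain of the proposed model over each baseline, in percentage points.
gains = {
    name: (round(ours_first - first, 2), round(ours_second - second, 2))
    for name, (first, second) in scores.items()
    if name != "Ours"
}

for name, (gain_first, gain_second) in gains.items():
    print(f"vs. {name}: +{gain_first:.2f} / +{gain_second:.2f} percentage points")
```

Against the YOLO-V7 baseline the gains are 5.07 and 5.18 percentage points, and the smallest margin in both columns is over Faster R-CNN.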
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhou, S.; Cai, K.; Feng, Y.; Tang, X.; Pang, H.; He, J.; Shi, X. An Accurate Detection Model of Takifugu rubripes Using an Improved YOLO-V7 Network. J. Mar. Sci. Eng. 2023, 11, 1051. https://doi.org/10.3390/jmse11051051