Article

Research on Improved Lightweight Fish Detection Algorithm Based on Yolov8n

by Qingyang Zhang 1,3 and Shizhe Chen 1,2,3,*

1 Institute of Oceanographic Instrumentation, Qilu University of Technology (Shandong Academy of Sciences), Qingdao 266100, China
2 Laoshan Laboratory, Qingdao 266237, China
3 School of Ocean Technology Sciences, Qilu University of Technology (Shandong Academy of Sciences), Qingdao 266300, China
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2024, 12(10), 1726; https://doi.org/10.3390/jmse12101726
Submission received: 5 August 2024 / Revised: 9 September 2024 / Accepted: 14 September 2024 / Published: 1 October 2024
(This article belongs to the Section Ocean Engineering)

Abstract

Fish detection algorithms are of great significance for obtaining aquaculture information, optimizing prey allocation in aquaculture, and improving the growth efficiency and survival rate of fish. To address the high complexity, heavy computational load, and limited equipment resources encountered in existing fish target detection, a lightweight fish detection and recognition method based on the YOLOv8 network, called the CUIB-YOLO algorithm, is proposed. This method introduces an improved C2f-UIB module to replace the original C2f module in the YOLOv8 backbone network, effectively reducing the model’s parameter count. Additionally, the EMA mechanism is incorporated into the neck network to enhance the feature fusion process. Through this optimized design, the Params and FLOPs of the CUIB-YOLO model are reduced to 2.5 M and 7.5 G, respectively, reductions of approximately 16.7% and 7.4% compared to the original YOLOv8n model. The mAP@0.5–0.95 reaches 76.4%, which is nearly identical to that of the YOLOv8n model. Experimental results demonstrate that, compared with current mainstream target detection and recognition algorithms, the proposed model reduces computational load without compromising detection accuracy, achieves model lightweighting, improves inference speed, and enhances real-time performance.

1. Introduction

With the continuous growth in demand for economic fish, combining fish detection technology with machine vision and deep learning has become a mainstream trend to optimize marine fishery supervision and improve the efficiency of fishery resource development. Currently, target detection methods in academic and industrial circles are mainly divided into two categories—R-CNN detection methods based on a two-stage system, including R-CNN [1], Faster R-CNN [2], and Mask R-CNN [3]; and single-stage detection algorithms, such as the SSD [4], RetinaNet [5], and YOLO [6] series. However, training or testing these deep learning models requires significant computing resources due to their complex structures and large number of parameters. Additionally, the computing power of the terminal equipment used in small- and medium-sized enterprises in fisheries and animal husbandry is often limited. Therefore, the practical application of these deep learning models in fish detection and identification presents certain challenges.
In the existing research, various methods have been proposed to lighten network models while preserving accuracy. For example, to address the limited computing resources available for the intelligent harvesting of safflower, Zhang et al. [7] explored lightweight models. Similarly, to tackle the challenge of efficiently detecting tobacco diseases and pests in complex environments, Sun et al. [8] proposed a lightweight network for tobacco pest identification. They replaced the original neck layer with an Asymptotic Feature Pyramid Network (AFPN) [9] and substituted the C2f module in the backbone network with the VoV-GSCSP module, resulting in a 52.66% reduction in model parameters and a 19.9% reduction in GFLOPs compared to the baseline YOLOv8n model. To overcome the storage and computing-power limitations of underwater embedded devices, Wang et al. [10] proposed a lightweight underwater detector based on YOLOv8. They used FasterNet-T0 to replace the Darknet-53 backbone, reducing model parameters by 22.52%, FLOPs by 23.59%, and model size by 22.73%, successfully achieving a lightweight model. In another study, to automate the grading of sweet potatoes of varying quality, Zhang et al. [11] developed an improved YOLOv8 lightweight model. They incorporated Slim-neck [12] and VanillaNet [13] to reduce the model’s complexity and enhanced feature extraction by replacing the loss function with the Wise-IoU loss function. To address the inefficiencies and errors of traditional seed germination test methods, Ouyang et al. [14] proposed a lightweight YOLOv8-R model based on an improved YOLOv8. This model uses PP-LCNet to replace the backbone network, CCFM to replace the neck part, and various other optimizations to reduce parameters and computational load. For real-time fish detection and recognition, Yan et al. [15] introduced a convolutional neural network-based method incorporating the CBAM module to enhance detection accuracy. The CBAM-YOLOv5m model was further lightweighted, leading to a nearly 25% reduction in network parameters and a 20% reduction in computational consumption. Additionally, in response to the increasing challenges of underwater environment detection, several other deep learning-based detection schemes [16,17,18,19] have been proposed to address issues posed by complex environments. Given the limited computing resources available at coal mining sites, handling the computational demands of extensive hardware can be challenging. To address this, Fan et al. [20] introduced CM-YOLOv8, a lightweight object detection algorithm specifically designed for coal mining environments. This algorithm incorporates adaptive predefined anchor boxes tailored to the coal mining data set to improve target detection, and it employs an L1 norm-based pruning method to significantly reduce the model’s computational and parameter requirements without sacrificing accuracy. Tested on the coal mining data set DsLMF+, CM-YOLOv8 achieves a 40% reduction in model volume with less than a 1% decrease in accuracy. Semantic web-based video surveillance systems demand high real-time performance and accuracy for vehicle detection in challenging night scenes. Wang et al. [21] introduced a lightweight night vehicle detection method, MC-YOLO, which combines MobileNetV2 with YOLOv3: MobileNetV2 replaces the DarkNet53 backbone of YOLOv3 for feature extraction, and a convolutional block attention module is added after the convolution layers. Experimental results show that the MC-YOLO model achieves an accuracy of 92.75%, outperforming several advanced models. To detect airport pavement damage quickly and accurately, Liang et al. [22] introduced YOLOv5-APD, an efficient and lightweight detection algorithm designed for mobile devices, in which the model’s complexity is reduced by eliminating redundant nodes during feature fusion. Results indicate that YOLOv5-APD surpasses state-of-the-art algorithms in both performance and efficiency, achieving a mean average precision (mAP) of 0.924.
The above research indicates that significant progress has been made in lightweighting mainstream target detection algorithms across various fields. However, for small- and medium-sized enterprises with limited computing power in their terminal detection equipment, running these systems stably without increasing model parameters remains a challenge. While lightweight models are easier to deploy, they often come with a slight reduction in detection accuracy. To address these issues, this study proposes a more easily deployable lightweight fish detection algorithm based on YOLOv8n. Specifically, an improved C2f module (C2f_UIB) is designed to replace the C2f module in the original YOLOv8n backbone network, reducing the number of model parameters. Additionally, to mitigate the performance degradation due to lightweighting, the EMA mechanism is added to the neck network’s feature fusion part to enhance detection performance.

2. Analysis of YOLOv8 Basic Model

The YOLO [6] (You Only Look Once) series of algorithms has attracted much attention because of its efficiency and accuracy. The YOLOv8 model was released by Ultralytics in early 2023. Based on the scaling factor, models of different sizes, n (nano), s (small), m (medium), l (large), and x (extra large), are provided to meet the needs of different scenarios. Among them, YOLOv8n, the smallest model, maintains high detection accuracy and processing speed, making it suitable for scenarios with limited resources or strict speed requirements. Therefore, this study performs lightweight processing on the YOLOv8n model. The structure of the model is shown in Figure 1 and comprises five parts: input, backbone, neck, head, and output. The backbone and neck replace the C3 structure of YOLOv5 with the C2f structure, which has a richer gradient flow. The neck network removes the convolution operations before upsampling and adopts the PAN-FPN [12] (Path Aggregation Network-Feature Pyramid Network) feature fusion method, which uses two upsampling operations and multiple C2f modules to strengthen the fusion and utilization of feature layers at different scales. The head adopts the current mainstream decoupled-head structure to separate the classification and regression tasks, and an anchor-free approach replaces the anchor-based mechanism of YOLOv5, further improving detection performance.
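For readers who wish to reproduce the baseline, the following minimal sketch loads YOLOv8n through the publicly documented ultralytics Python API and counts its parameters; the weight file name follows the package’s own conventions, and the printed figure should land near the 3.0 M reported for YOLOv8n in Table 4.

```python
# Minimal baseline sketch (assumes `pip install ultralytics`).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # n (nano): the smallest of the n/s/m/l/x variants

# Count learnable parameters to cross-check the ~3.0 M figure for YOLOv8n.
n_params = sum(p.numel() for p in model.model.parameters())
print(f"YOLOv8n parameters: {n_params / 1e6:.2f} M")
```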

3. Improved Model: CUIB-YOLO

Despite being one of the most lightweight mainstream models, the YOLOv8n model still faces challenges such as a relatively large amount of computation and model complexity. In fish detection applications, particularly for small- and medium-sized enterprises, there is a demand for even more lightweight models to better match terminal equipment with limited computing power, thereby reducing power consumption and operational costs. This study proposes a lightweight fish detection model, CUIB-YOLO, as illustrated in Figure 2. The model introduces an improved C2f_UIB (Faster Implementation of CSP Bottleneck with 2 Convolutions_Universal Inverted Bottleneck) module, which replaces the C2f module in the backbone network of the baseline YOLOv8n model. This modification substitutes depthwise convolutions for traditional convolutions, which reduces the redundancy associated with conventional convolution operations, lowers the model’s parameter count, and decreases its overall complexity. To counterbalance the potential performance degradation caused by these lightweight operations, an Efficient Multi-scale Attention (EMA) mechanism is incorporated into the neck network. This mechanism enhances the model’s feature representation capabilities while minimizing computational overhead, allowing the model to balance reduced computational demand against maintained accuracy.

3.1. Improved C2f Module

The Universal Inverted Bottleneck (UIB) [23] is a module introduced in MobileNetV4, a lightweight end-to-end network architecture developed by Google for mobile devices. The structure of this module is depicted in Figure 3. The UIB module includes two optional depthwise convolutions (DW) within the inverted bottleneck block (IB); each filter of a depthwise convolution operates on a single input channel, significantly reducing the number of parameters and the computational requirements. One depthwise convolution is positioned before the expansion layer, while the other is placed between the expansion and projection layers. This design integrates elements from previous models, such as the inverted bottleneck (IB) of MobileNetV2, the ConvNext block, and the feedforward network (FFN) from the Vision Transformer (ViT). Additionally, it introduces a new variant structure called Extra Depthwise (ExtraDW), which effectively reduces computational load and parameter count while maintaining model accuracy. This allows the model to adaptively and efficiently scale across various platforms without the need for a complex architecture search process.
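For illustration, the PyTorch sketch below implements a UIB block with the two optional depthwise convolutions described above. It follows the MobileNetV4 description [23] rather than the authors’ exact code; the expansion ratio, kernel size, and variant flags are illustrative defaults.

```python
import torch
import torch.nn as nn

def conv_bn_act(c_in: int, c_out: int, k: int = 1, s: int = 1,
                g: int = 1, act: bool = True) -> nn.Sequential:
    """Conv + BatchNorm + optional SiLU, the usual YOLO-style building block."""
    layers = [nn.Conv2d(c_in, c_out, k, s, k // 2, groups=g, bias=False),
              nn.BatchNorm2d(c_out)]
    if act:
        layers.append(nn.SiLU())
    return nn.Sequential(*layers)

class UIB(nn.Module):
    """Universal Inverted Bottleneck sketch (after the MobileNetV4 paper [23]).

    Two optional depthwise convolutions: one before the pointwise expansion
    and one between expansion and projection. Enabling both yields the
    ExtraDW variant; disabling both reduces the block to an FFN-style block.
    """
    def __init__(self, c_in: int, c_out: int, expand: float = 2.0,
                 start_dw: bool = True, mid_dw: bool = True, k: int = 3):
        super().__init__()
        c_mid = int(c_in * expand)
        self.start_dw = conv_bn_act(c_in, c_in, k, g=c_in, act=False) if start_dw else nn.Identity()
        self.expand = conv_bn_act(c_in, c_mid, 1)               # pointwise expansion
        self.mid_dw = conv_bn_act(c_mid, c_mid, k, g=c_mid) if mid_dw else nn.Identity()
        self.project = conv_bn_act(c_mid, c_out, 1, act=False)  # pointwise projection
        self.add = c_in == c_out                                # residual when shapes match

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.project(self.mid_dw(self.expand(self.start_dw(x))))
        return x + y if self.add else y
```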
Traditional convolution operations extract redundant, uninformative features during feature extraction, which reduces extraction efficiency, and in the YOLOv8n model the C2f module consumes a large share of computing resources. To reduce this waste of resources, the UIB module replaces the BottleNeck part of the C2f module as the main gradient-flow branch, as sketched below. The structures of the C2f and C2f_UIB modules are shown in Figure 4.
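Continuing the previous sketch (and reusing its conv_bn_act and UIB definitions), a hypothetical C2f_UIB could keep the split/concat topology of YOLOv8’s C2f while swapping the Bottleneck blocks on the main gradient-flow branch for UIB blocks; the block count n and the UIB configuration are illustrative choices, not the authors’ exact settings.

```python
class C2f_UIB(nn.Module):
    """Sketch: C2f topology with UIB blocks in place of the original Bottlenecks."""
    def __init__(self, c_in: int, c_out: int, n: int = 1):
        super().__init__()
        self.c = c_out // 2
        self.cv1 = conv_bn_act(c_in, 2 * self.c, 1)          # split stem
        self.cv2 = conv_bn_act((2 + n) * self.c, c_out, 1)   # fuse all branches
        self.m = nn.ModuleList(UIB(self.c, self.c) for _ in range(n))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = list(self.cv1(x).chunk(2, dim=1))   # split into two halves
        for blk in self.m:                      # main gradient-flow branch
            y.append(blk(y[-1]))
        return self.cv2(torch.cat(y, dim=1))    # concatenate and project
```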

3.2. Introducing the EMA Mechanism

To address the issue of accuracy reduction due to the simplification of the model, an Efficient Multi-Scale Attention (EMA) mechanism [24] is introduced. The structure of the EMA module is illustrated in Figure 5, where ‘g’ represents the number of groups into which the input channel is divided. ‘X Avg Pool’ and ‘Y Avg Pool’ denote horizontal and vertical global pooling operations, respectively. In the EMA module, the input is first divided into groups, which are then processed through different branches. One branch performs global pooling, while the other conducts feature extraction using a 3 × 3 convolution. The output features from both branches are modulated by applying a sigmoid function and normalization operation. These modulated features are then merged via a cross-channel interaction module to capture pixel-level pairwise relationships. The final output is obtained after an additional sigmoid function adjustment.
This approach primarily reduces computational overhead by reshaping certain channels into the batch dimension and grouping the channel dimension into multiple sub-features. By encoding global information, the channel weights in each parallel branch are recalibrated. The pixel-level relationships are then captured through cross-channel interaction, which enhances the model’s feature representation capabilities.
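The sketch below follows the reference implementation released with the EMA paper [24]; it is included for clarity rather than as the authors’ exact integration into the neck, and `channels` must divide evenly into the group count g (here `factor`).

```python
import torch
import torch.nn as nn

class EMA(nn.Module):
    """Efficient Multi-Scale Attention, after the reference code of [24].

    Input channels are divided into `factor` groups (the 'g' of Figure 5);
    a 1 x 1 branch encodes direction-aware context via X/Y average pooling,
    a 3 x 3 branch extracts local features, and the two branches interact
    across channels to produce pixel-level attention weights.
    """
    def __init__(self, channels: int, factor: int = 8):
        super().__init__()
        self.groups = factor
        assert channels % self.groups == 0, "channels must divide evenly into groups"
        c_g = channels // self.groups
        self.softmax = nn.Softmax(dim=-1)
        self.agp = nn.AdaptiveAvgPool2d((1, 1))        # global pooling
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # 'X Avg Pool' (per row)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # 'Y Avg Pool' (per column)
        self.gn = nn.GroupNorm(c_g, c_g)
        self.conv1x1 = nn.Conv2d(c_g, c_g, kernel_size=1)
        self.conv3x3 = nn.Conv2d(c_g, c_g, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.size()
        g = x.reshape(b * self.groups, -1, h, w)       # fold groups into batch dim
        # 1 x 1 branch: joint encoding of horizontal and vertical context.
        x_h = self.pool_h(g)
        x_w = self.pool_w(g).permute(0, 1, 3, 2)
        hw = self.conv1x1(torch.cat([x_h, x_w], dim=2))
        x_h, x_w = torch.split(hw, [h, w], dim=2)
        x1 = self.gn(g * x_h.sigmoid() * x_w.permute(0, 1, 3, 2).sigmoid())
        # 3 x 3 branch: local feature extraction.
        x2 = self.conv3x3(g)
        # Cross-channel interaction: each branch's pooled, softmaxed descriptor
        # re-weights the other branch's spatial map.
        x11 = self.softmax(self.agp(x1).reshape(b * self.groups, -1, 1).permute(0, 2, 1))
        x12 = x2.reshape(b * self.groups, c // self.groups, -1)
        x21 = self.softmax(self.agp(x2).reshape(b * self.groups, -1, 1).permute(0, 2, 1))
        x22 = x1.reshape(b * self.groups, c // self.groups, -1)
        weights = (x11 @ x12 + x21 @ x22).reshape(b * self.groups, 1, h, w)
        # Final sigmoid adjustment, then restore the original layout.
        return (g * weights.sigmoid()).reshape(b, c, h, w)
```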

4. Results

4.1. Data Sets and Evaluation Indicators

The data set used in this study is taken from the Roboflow Universe data set library and contains images of three kinds of fish: gurame (gourami), mas (goldfish), and pacu. To simulate the complex underwater imaging environment, each image is randomly processed and augmented using grayscale conversion, noise addition, brightness adjustment, and other methods. The resulting data set contains 5172 images, each resized to a uniform 640 × 640. Sample data are shown in Figure 6, and a comparison before and after processing is shown in Figure 7.
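A sketch of the kind of augmentation pipeline described above, using PIL and NumPy; the application probabilities, noise level, and brightness range are illustrative assumptions rather than the authors’ settings.

```python
import random
import numpy as np
from PIL import Image, ImageEnhance

def augment(img: Image.Image) -> Image.Image:
    """Randomly apply grayscale, Gaussian noise, or brightness adjustment."""
    if random.random() < 0.3:
        img = img.convert("L").convert("RGB")        # grayscale conversion
    if random.random() < 0.3:                        # additive Gaussian noise
        arr = np.asarray(img).astype(np.float32)
        arr += np.random.normal(0.0, 15.0, arr.shape)
        img = Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))
    if random.random() < 0.3:                        # brightness jitter
        img = ImageEnhance.Brightness(img).enhance(random.uniform(0.6, 1.4))
    return img.resize((640, 640))                    # unify image size
```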
The training set, test set, and validation set are divided according to the ratio of 8:0.5:1.5 (80%, 5%, and 15%, respectively), and the specific division is shown in Table 1.
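A minimal sketch of such a split under the stated 8:0.5:1.5 ratio; the image directory and random seed are placeholders, and the exact per-split counts reported in Table 1 come from the authors’ own partition.

```python
import random
from pathlib import Path

files = sorted(Path("datasets/fish/images").glob("*.jpg"))  # placeholder path
random.seed(0)
random.shuffle(files)

n = len(files)
n_train, n_test = int(0.80 * n), int(0.05 * n)
splits = {
    "train": files[:n_train],                 # 80%
    "test": files[n_train:n_train + n_test],  # 5%
    "val": files[n_train + n_test:],          # 15%
}
for name, items in splits.items():
    print(name, len(items))
```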
In this study, the model’s performance is evaluated using the following metrics: mean average precision (mAP), floating point operations (FLOPs), and the number of parameters (Params). A lower Params value indicates fewer parameters, resulting in a lighter model that requires less computing power. Similarly, a lower FLOPs value suggests reduced computational resource usage and a lower model complexity.
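Params can be read directly from the model, and FLOPs are commonly estimated with a profiler such as thop; the sketch below shows one conventional way to obtain figures in the units used by Tables 3 and 4 (FLOPs/G, Params/M). This is a typical recipe, not necessarily the authors’ measurement script.

```python
import torch
from thop import profile  # pip install thop

def model_cost(model: torch.nn.Module, imgsz: int = 640):
    """Return (FLOPs in G, Params in M) for a single 3-channel input."""
    dummy = torch.zeros(1, 3, imgsz, imgsz)
    macs, params = profile(model, inputs=(dummy,), verbose=False)
    # thop counts multiply-accumulates; FLOPs are often reported as 2 x MACs.
    return 2 * macs / 1e9, params / 1e6
```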

4.2. Test Configuration and Parameter Setting

In this experiment, the initial learning rate was set to 0.01, the weight decay was 0.0005, the batch size was 16, and the number of training epochs was 200. The experimental configuration environment of this study is shown in Table 2.
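Under the ultralytics training API, the hyperparameters above map onto a call of the following form; the model and data set YAML names are placeholders standing in for this study’s configuration files.

```python
from ultralytics import YOLO

# Sketch of a training run under the hyperparameters of Section 4.2.
model = YOLO("cuib-yolo.yaml")  # hypothetical model config for CUIB-YOLO
model.train(
    data="fish.yaml",           # placeholder: 3-class fish data set, 640 x 640
    epochs=200,                 # number of training rounds
    batch=16,                   # batch size
    lr0=0.01,                   # initial learning rate
    weight_decay=0.0005,        # weight decay
    imgsz=640,
)
```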

4.3. Analysis of Lightweight Network Structure Ablation Test

To validate the effectiveness of the improved CUIB-YOLO algorithm proposed in this study, the model’s performance is analyzed across four scenarios—the YOLOv8n baseline algorithm, the addition of the C2f_UIB module, the inclusion of the EMA mechanism, and the complete CUIB-YOLO algorithm. The results of the ablation experiments are presented in Table 3.
The experimental results in Table 3 indicate that replacing the C2f module in the baseline network with the C2f_UIB module increases processing speed and reduces the model’s parameter count, albeit with a slight decrease in mAP, precision (P), and recall (R). When the C2f_UIB module and the EMA mechanism are applied together, the performance is better than with the C2f_UIB module alone. Compared to the original YOLOv8n model, the improved model shows only a minor decrease in mAP, P, and R, but achieves a significant reduction of 0.5 M (500,000) in parameter count.
The primary goal of this study is to achieve a lightweight model that can be easily deployed on devices with limited computing resources, such as terminals or mobile platforms. The slight reduction in mAP, P, and R has minimal impact on the model’s overall effectiveness.

4.4. Comparative Experiment of Different Models

To assess the performance of the improved CUIB-YOLO algorithm, several mainstream YOLO models are included in comparative experiments, with the results summarized in Table 4. From the table, YOLOv3-tiny exhibits relatively high parameter and computational requirements. YOLOv5n, with parameters and computation similar to the proposed model, has a lower mAP. YOLOv5s, being a larger model, does not align with the study’s goal of model lightweightness. YOLOv8n achieves a 0.1% higher mAP@0.5–0.95 than the proposed model, but the proposed model performs better in terms of parameter count and computational efficiency. The YOLOv9 model has the highest accuracy, but its parameter count is 20 times that of the benchmark model, which is difficult to accommodate in resource-constrained settings. Although YOLOv9c reduces the parameter count to 25 M, its accuracy is lower than that of the model proposed in this study. YOLOv10n has a parameter count similar to the proposed model, but its accuracy is 3% lower.
Experiments demonstrate that the improved model effectively reduces complexity, making it more suitable for deployment on devices with limited computing resources.
To verify the effectiveness of incorporating the EMA mechanism into our experimental model, CUIB-YOLO, we conducted comparative tests with three other attention mechanisms of simpler structure: iRMB [25], MLCA [26], and ACmix [27]. The results, presented in Table 5, show that under identical conditions the EMA mechanism improves model accuracy more than the other three mechanisms, making it the most suitable choice among those tested.

4.5. Visualization of Test Results

The detection results of YOLOv8n and the other models are compared in Figure 8. It is evident that the lightweight CUIB-YOLO model introduced in this study not only detects the targets that YOLOv8n can detect, but also successfully identifies some occluded features and small targets with subtle characteristics, and it does so with high confidence, further underscoring its lightweight advantage without compromising detection accuracy.

5. Discussion

Experimental comparison reveals the current limitations of the model. While the complexity of the model and the number of parameters are successfully reduced, the detection accuracy achieved is only comparable to that of the original model. In addition, although the EMA mechanism enhances the feature representation capability, the model’s accuracy remains insufficient when processing small-target images. Future research will focus on maintaining the lightweight characteristics of the model while improving precision on small targets.

6. Conclusions

To address the challenge of deploying popular target detection algorithms, which often have a high model complexity and a large number of parameters, on devices with limited computing resources, this study proposes a lightweight fish detection algorithm, CUIB-YOLO. Based on ablation tests and comparative experiments, the following conclusions are drawn:
  • Model Optimization: The BottleNeck component of the C2f module is replaced with the UIB module as the primary gradient-flow branch, and this combined module substitutes for the C2f module in the original model’s backbone network, reducing the parameter count (from 3.0 M to 2.2 M in the ablation study) and achieving lightweight processing. Additionally, the incorporation of the EMA mechanism enhances the model’s feature processing performance and improves detection accuracy.
  • Performance Comparison: Compared to YOLOv3-tiny, YOLOv5s, and YOLOv8n, the improved lightweight network model demonstrates parameter reductions of 79%, 72%, and 16%, respectively. Relative to YOLOv8n, the mAP@0.5–0.95 decreases by only 0.1%.
The experimental results show that the improved CUIB-YOLO model effectively reduces model complexity and parameter count while maintaining high detection accuracy. This makes it particularly advantageous for small and medium-sized enterprises using terminals or mobile devices with limited computing resources, facilitating efficient fish farming information retrieval and enhancing farming efficiency.

Author Contributions

Resources, S.C.; Writing—original draft, Q.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China, grant number 2017YFC1403303, and the National Natural Science Foundation of China, grant number 41976179.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 142–158. [Google Scholar] [CrossRef] [PubMed]
  2. Choi, J.Y.; Han, J.M. Deep learning (Fast R-CNN)-based evaluation of rail surface defects. Appl. Sci. 2024, 14, 1874. [Google Scholar] [CrossRef]
  3. Mu, X.; He, L.; Heinemann, P.; Schupp, J.; Karkee, M. Mask R-CNN based apple flower detection and king flower identification for precision pollination. Smart Agric. Technol. 2023, 4, 100151. [Google Scholar] [CrossRef]
  4. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37. [Google Scholar]
  5. Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
  6. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
  7. Zhang, X.Y.; Hu, G.R.; Li, P.H.; Cao, X.Y.; Zhang, H.; Chen, J.; Zhang, L.L. Lightweight Safflower Recognition Method Based on Improved YOLOv8n. Acta Agric. Eng. 2024, 40, 163–170. [Google Scholar]
  8. Sun, D.; Zhang, K.; Zhong, H.; Xie, J.; Xue, X.; Yan, M.; Wu, W.; Li, J. Efficient Tobacco Pest Detection in Complex Environments Using an Enhanced YOLOv8 Model. Agriculture 2024, 14, 353. [Google Scholar] [CrossRef]
  9. Yang, G.; Lei, J.; Zhu, Z.; Cheng, S.; Feng, Z.; Liang, R. AFPN: Asymptotic Feature Pyramid Network for Object Detection. arXiv 2023, arXiv:2306.15988v2. [Google Scholar]
  10. Zhang, M.; Wang, Z.; Song, W.; Zhao, D.; Zhao, H. Efficient Small-Object Detection in Underwater Images Using the Enhanced YOLOv8 Network. Appl. Sci. 2024, 14, 1095. [Google Scholar] [CrossRef]
  11. Zhang, S.; Wang, K.; Zhang, H.; Wang, T.; Gao, X.; Song, Y.; Wang, F. An improved YOLOv8 for fiber bundle segmentation in X-ray computed tomography images of 2.5D composites to build the finite element model. Compos. Part A 2024, 185, 108337. [Google Scholar] [CrossRef]
  12. Li, H.; Li, J.; Wei, H.; Liu, Z.; Zhan, Z.; Ren, Q. Slim-neck by GSConv: A lightweight-design for real-time detector architectures. arXiv 2022, arXiv:2206.02424v3. [Google Scholar] [CrossRef]
  13. Chen, H.; Wang, Y.; Guo, J.; Tao, D. VanillaNet: The Power of Minimalism in Deep Learning. arXiv 2023, arXiv:2305.12972v2. [Google Scholar]
  14. Ouyang, Z.; Fu, X.; Zhong, Z.; Bai, R.; Cheng, Q.; Gao, G.; Li, M.; Zhang, H.; Zhang, Y. An exploration of the influence of ZnO NPs treatment on germination of radish seeds under salt stress based on the YOLOv8-R lightweight model. Plant Methods 2024, 20, 110. [Google Scholar] [CrossRef] [PubMed]
  15. Yan, Z.; Hao, L.; Yang, J.; Zhou, J. Real-Time Underwater Fish Detection and Recognition Based on CBAM-YOLO Network with Lightweight Design. J. Mar. Sci. Eng. 2024, 12, 1302. [Google Scholar] [CrossRef]
  16. Zhao, H.; Jin, J.; Liu, Y.; Guo, Y.; Shen, Y. FSDF: A high-performance fire detection framework. Expert Syst. Appl. 2024, 238, 121665. [Google Scholar] [CrossRef]
  17. Liu, P.Z.; Qian, W.B.; Wang, Y.L. YWnet: A convolutional block attention-based fusion deep learning method for complex underwater small target detection. Ecol. Inform. 2024, 79, 102401. [Google Scholar] [CrossRef]
  18. Ji, W.; Peng, J.Q.; Xu, B.; Zhang, T. Real-time detection of underwater river crab based on multi-scale pyramid fusion image enhancement and Mobile CenterNet model. Comput. Electron. Agric. 2023, 204, 107522. [Google Scholar] [CrossRef]
  19. Xu, X.C.; Liu, Y.; Lyu, L.; Yan, P.; Zhang, J.Y. MAD-YOLO: A quantitative detection algorithm for dense small-scale marine benthos. Ecol. Inform. 2023, 75, 102022. [Google Scholar] [CrossRef]
  20. Fan, Y.; Mao, S.; Li, M.; Wu, Z.; Kang, J. CM-YOLOv8: Lightweight YOLO for Coal Mine Fully Mechanized Mining Face. Sensors 2024, 24, 1866. [Google Scholar] [CrossRef] [PubMed]
  21. Wang, X.; Hao, X.; Kang, K. MC-YOLO-Based Lightweight Detection Method for Nighttime Vehicle Images in a Semantic Web-Based Video Surveillance System. Int. J. Semant. Web Inf. Syst. (IJSWIS) 2023, 19, 1–18. [Google Scholar] [CrossRef]
  22. Liang, H.; Gong, H.; Gong, L.; Zhang, M. Automated detection of airfield pavement damages: An efficient light-weight algorithm. Int. J. Pavement Eng. 2023, 24. [Google Scholar] [CrossRef]
  23. Qin, D.; Leichner, C.; Delakis, M.; Fornoni, M.; Luo, S.; Yang, F.; Wang, W.; Banbury, C.; Ye, C.; Akin, B.; et al. MobileNetV4—Universal Models for the Mobile Ecosystem. arXiv 2024, arXiv:2404.10518. [Google Scholar]
  24. Ouyang, D.; He, S.; Zhang, G.; Luo, M.; Guo, H.; Zhan, J.; Huang, Z. Efficient Multi-Scale Attention Module with Cross-Spatial Learning. arXiv 2023, arXiv:2305.13563v2. [Google Scholar]
  25. Zhang, J.; Li, X.; Li, J.; Liu, L.; Xue, Z.; Zhang, B.; Jiang, Z.; Huang, T.; Wang, Y.; Wang, C. Rethinking Mobile Block for Efficient Attention-based Models. arXiv 2023, arXiv:2301.01146v4. [Google Scholar]
  26. Wan, D.; Lu, R.; Shen, S.; Xu, T.; Lang, X.; Ren, Z. Mixed local channel attention for object detection. Eng. Appl. Artif. Intell. 2023, 123, 106442. [Google Scholar] [CrossRef]
  27. Pan, X.; Ge, C.; Lu, R.; Song, S.; Chen, G.; Huang, Z.; Huang, G. On the integration of self-attention and convolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 815–825. [Google Scholar]
Figure 1. YOLOv8n network structure diagram.
Figure 2. CUIB-YOLO network structure diagram.
Figure 3. Universal Inverted Bottleneck (UIB) module.
Figure 4. C2f and C2f_UIB module structure diagrams.
Figure 5. EMA module structure diagram.
Figure 6. Fish data set example.
Figure 7. Comparison of fish data before and after processing.
Figure 8. Comparison of test results.
Table 1. Fish data set division.

Data Set    Number of Images    Proportion (%)
train       4062                80
test        370                 5
val         740                 15
Table 2. Experimental environment configuration.

Name                Configuration
Operating System    Windows 11
CPU                 AMD EPYC 7642 48-Core (AMD, Santa Clara, CA, USA)
GPU                 RTX 3090 (NVIDIA, Santa Clara, CA, USA)
Memory              24 GB
Python              3.10
CUDA                11.8
PyTorch             2.1.2

Table 3. Model ablation experimental results.

YOLOv8n Model    C2f_UIB    EMA    mAP@0.5/%    mAP@0.5–0.95/%    FLOPs/G    Params/M
√                ×          ×      95.4         76.5              8.1        3.0
√                √          ×      94.9         72.7              6.1        2.2
√                ×          √      96.5         77.1              8.1        3.1
√                √          √      95.7         76.4              7.5        2.5
Table 4. Performance comparison results of different models.

Model          mAP@0.5/%    mAP@0.5–0.95/%    FLOPs/G    Params/M
YOLOv3-tiny    88.8         62.0              18.9       12.1
YOLOv5n        88.1         64.1              7.1        2.5
YOLOv5s        92.2         69.7              23.8       9.1
YOLOv8n        95.4         76.5              8.1        3.0
YOLOv9         96.8         79.4              264.9      60.0
YOLOv9c        93.3         75.1              102.3      25.0
YOLOv10n       92.6         72.9              7.2        2.7
CUIB-YOLO      95.7         76.4              7.5        2.5
Table 5. Comparison of results of different attention mechanisms.

Model        mAP@0.5/%    mAP@0.5–0.95/%    FLOPs/G    Params/M
iRMB         94.1         72.1              7.0        2.4
MLCA         94.4         72.8              6.3        2.1
ACmix        94.1         72.3              6.9        2.4
CUIB-YOLO    95.7         76.4              7.5        2.5