Tomato Recognition Method Based on the YOLOv8-Tomato Model in Complex Greenhouse Environments
Abstract
1. Introduction
2. Materials and Methods
2.1. Image Acquisition and Dataset Construction
2.1.1. Image Acquisition
2.1.2. Data Preprocessing and Dataset Construction
2.2. Improvements to YOLOv8
2.2.1. YOLOv8 Object Detection Algorithm
2.2.2. YOLOv8-Tomato Object Detection Algorithm
2.2.3. Large Separable Kernel Attention
2.2.4. Dysample Dynamic Upsampler
2.2.5. Inner-CIoU Loss Function
2.3. Model Training and Performance Evaluation
2.3.1. Experimental Environment
2.3.2. Assessment of Indicators
3. Results and Discussion
3.1. Visualization
3.1.1. Graphical Analysis of Results
3.1.2. Visualization and Analysis of Grad-CAM
3.1.3. Visualization of the Detection Results of the Improved Model
3.2. Ablation Experiment
3.3. Comparison of Different Models
4. Conclusions
- (1) To address the case in which the edges of green tomatoes closely resemble the green-leaf background, causing some immature tomatoes to be misrecognized as background, the LSKA attention mechanism is added to the SPPF layer of the YOLOv8 model. This better captures the shape features of the tomatoes, strengthens the network's attention to essential features, and effectively improves overall model performance;
- (2) To increase resource-utilization efficiency, the original upsampling module was replaced with Dysample, an ultra-lightweight and efficient dynamic upsampler that provides higher-quality upsampling;
- (3) The CIoU loss function of the YOLOv8n model was replaced with the Inner-CIoU loss function, which uses an auxiliary bounding box to compute the loss and accelerate bounding-box regression, effectively improving recognition accuracy;
- (4) Under identical test settings, the ablation experiments confirmed that the YOLOv8-Tomato model outperformed the baselines on all metrics. The final mAP@0.5 reached 99.4%, which is 7.5%, 11.6%, 8.6%, 3.3%, and 0.6% higher than the Faster RCNN, SSD, YOLOv3-tiny, YOLOv5, and YOLOv8 models, respectively. This enables fast and accurate detection of tomatoes in complex environments and meets the requirements of practical applications.
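As a minimal, illustrative sketch (not the authors' implementation), the Inner-CIoU idea in point (3) can be made concrete: the IoU term is computed on auxiliary boxes obtained by rescaling both the predicted and ground-truth boxes about their centers by a ratio factor, and the usual CIoU center-distance and aspect-ratio penalties are then added. The `(cx, cy, w, h)` box format, the default `ratio` value, and the function names below are assumptions for illustration:

```python
import math

def iou_scaled(box1, box2, ratio=1.0):
    """IoU of two boxes (cx, cy, w, h) after rescaling both by `ratio`
    about their centers -- the auxiliary boxes of Inner-IoU."""
    (cx1, cy1, w1, h1), (cx2, cy2, w2, h2) = box1, box2
    w1, h1, w2, h2 = w1 * ratio, h1 * ratio, w2 * ratio, h2 * ratio
    # Intersection of the two axis-aligned auxiliary boxes
    iw = max(0.0, min(cx1 + w1 / 2, cx2 + w2 / 2) - max(cx1 - w1 / 2, cx2 - w2 / 2))
    ih = max(0.0, min(cy1 + h1 / 2, cy2 + h2 / 2) - max(cy1 - h1 / 2, cy2 - h2 / 2))
    inter = iw * ih
    union = w1 * h1 + w2 * h2 - inter
    return inter / (union + 1e-9)

def inner_ciou_loss(pred, gt, ratio=0.75):
    """Inner-CIoU sketch: CIoU's penalty terms plus an IoU term computed
    on ratio-scaled auxiliary boxes (ratio < 1 speeds up regression on
    high-IoU samples)."""
    (cx1, cy1, w1, h1), (cx2, cy2, w2, h2) = pred, gt
    inner_iou = iou_scaled(pred, gt, ratio)
    # CIoU penalties: normalized center distance + aspect-ratio consistency
    cw = max(cx1 + w1 / 2, cx2 + w2 / 2) - min(cx1 - w1 / 2, cx2 - w2 / 2)
    ch = max(cy1 + h1 / 2, cy2 + h2 / 2) - min(cy1 - h1 / 2, cy2 - h2 / 2)
    c2 = cw ** 2 + ch ** 2 + 1e-9           # diagonal of smallest enclosing box
    rho2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    v = (4 / math.pi ** 2) * (math.atan(w2 / h2) - math.atan(w1 / h1)) ** 2
    iou = iou_scaled(pred, gt, 1.0)
    alpha = v / (1 - iou + v + 1e-9)
    return 1 - inner_iou + rho2 / c2 + alpha * v
```

With `ratio < 1` the auxiliary boxes are smaller than the real ones, which enlarges the gradient on already well-overlapping samples and is the mechanism the Inner-IoU paper credits with accelerating bounding-box regression; `ratio > 1` instead favors low-IoU samples.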
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Bac, C.W.; Van Henten, E.J.; Hemming, J. Harvesting robots for high-value crops: State-of-the-art review and challenges ahead. J. Field Robot. 2014, 31, 888–911.
- Fu, L.; Gao, F.; Wu, J. Application of consumer RGB-D cameras for fruit detection and localization in field: A critical review. Comput. Electron. Agric. 2020, 177, 105687.
- Goldenberg, L.; Yaniv, Y.; Porat, R. Mandarin fruit quality: A review. J. Sci. Food Agric. 2018, 98, 18–26.
- Chen, Q.; Yin, C.; Guo, Z. Current status and future development of the key technologies for apple picking robots. Trans. Chin. Soc. Agric. Eng. 2023, 38, 1–15.
- Sun, S.; Jiang, M.; He, D. Recognition of green apples in an orchard environment by combining the GrabCut model and Ncut algorithm. Biosyst. Eng. 2019, 187, 201–213.
- Lu, J.; Lee, W.S.; Gan, H. Immature citrus fruit detection based on local binary pattern feature and hierarchical contour analysis. Biosyst. Eng. 2018, 171, 78–90.
- Hayashi, S.; Yamamoto, S.; Saito, S. Field operation of a movable strawberry-harvesting robot using a travel platform. Jpn. Agric. Res. Q. 2014, 48, 307–316.
- Zhao, Y.; Gong, L.; Zhou, B.; Hua, Y.; Niu, Q.; Liu, C. Object recognition algorithm of tomato harvesting robot using non-color coding approach. Trans. Chin. Soc. Agric. Mach. 2016, 47, 1–7.
- Fukushima, K. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 1980, 36, 193–202.
- LeCun, Y.; Bottou, L.; Bengio, Y. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
- Girshick, R.; Donahue, J.; Darrell, T. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014.
- Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015.
- Ren, S.; He, K.; Girshick, R. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
- He, K.; Gkioxari, G.; Dollár, P. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
- Wang, Z.; Ling, Y.; Wang, X. An improved Faster R-CNN model for multi-object tomato maturity detection in complex scenarios. Ecol. Inform. 2022, 72, 101886.
- Fu, L.; Feng, Y.; Majeed, Y. Kiwifruit detection in field images using Faster R-CNN with ZFNet. IFAC-PapersOnLine 2018, 51, 45–50.
- Long, J.; Zhao, C.; Lin, S.; Guo, W.; Wen, C.; Zhang, Y. Segmentation method of the tomato fruits with different maturities under greenhouse environment based on improved Mask R-CNN. Trans. Chin. Soc. Agric. Eng. 2021, 37, 100–108.
- Redmon, J.; Divvala, S.; Girshick, R. You Only Look Once: Unified, real-time object detection. In Proceedings of the 2016 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
- Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
- Li, C.; Li, L.; Jiang, H. YOLOv6: A single-stage object detection framework for industrial applications. arXiv 2022, arXiv:2209.02976.
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023.
- Zhang, X.; Gao, Q.; Pan, D. Picking recognition research of pineapple in complex field environment based on improved YOLOv3. J. Chin. Agric. Mech. 2021, 42, 201–206.
- Chen, J.; Wang, Z.; Wu, J.; Hu, Q.; Zhao, C.; Tan, C.; Teng, L.; Luo, T. An improved YOLOv3 based on dual path network for cherry tomatoes detection. J. Food Process Eng. 2021, 44, e13803.
- Gai, R.; Chen, N.; Yuan, H. A detection algorithm for cherry fruits based on the improved YOLO-v4 model. Neural Comput. Appl. 2023, 35, 13895–13906.
- Li, T.; Sun, M.; Ding, X.; Li, Y.; Zhang, G.; Shi, G.; Li, W. Tomato recognition method at the ripening stage based on YOLO v4 and HSV. Trans. Chin. Soc. Agric. Eng. 2021, 37, 183–190.
- Xiong, J.T.; Han, Y.L.; Wang, X. Method of maturity detection for papaya fruits in natural environment based on YOLO v5-lite. Trans. Chin. Soc. Agric. Mach. 2023, 54, 243–252.
- Rong, J.; Zhou, H.; Zhang, F.; Yuan, T.; Wang, P. Tomato cluster detection and counting using improved YOLOv5 based on RGB-D fusion. Comput. Electron. Agric. 2023, 207, 107741.
- Long, Y.; Yang, Z.; He, M. Recognizing apple targets before thinning using improved YOLOv7. Trans. Chin. Soc. Agric. Eng. 2023, 39, 191–199.
- Chen, W.; Liu, M.; Zhao, C.J. MTD-YOLO: Multi-task deep convolutional neural network for cherry tomato fruit bunch maturity detection. Comput. Electron. Agric. 2024, 216, 108533.
- Yang, G.; Wang, J.; Nie, Z.; Yang, H.; Yu, S. A lightweight YOLOv8 tomato detection algorithm combining feature enhancement and attention. Agronomy 2023, 13, 1824.
- Lau, K.W.; Po, L.M.; Rehman, Y.A.U. Large separable kernel attention: Rethinking the large kernel attention design in CNN. Expert Syst. Appl. 2024, 236, 121352.
- Liu, W.; Lu, H.; Fu, H. Learning to upsample by learning to sample. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–6 October 2023.
- Zheng, Z.; Wang, P.; Liu, W. Distance-IoU loss: Faster and better learning for bounding box regression. Proc. AAAI Conf. Artif. Intell. 2020, 34, 12993–13000.
- Zhang, H.; Xu, C.; Zhang, S. Inner-IoU: More effective intersection over union loss with auxiliary bounding box. arXiv 2023, arXiv:2311.02877.
Ablation experiment results:

| Module | Precision/% | Recall/% | mAP@50/% | mAP@50–95/% | FLOPs/G | Params/M |
| --- | --- | --- | --- | --- | --- | --- |
| YOLOv8n | 96.1 | 96.6 | 98.8 | 90.1 | 8.1 | 3.006 |
| +LSKA | 98.0 | 98.1 | 99.0 | 90.7 | 8.3 | 3.278 |
| +Dysample | 97.2 | 97.2 | 99.1 | 90.6 | 8.1 | 3.018 |
| +Inner-IoU | 95.5 | 97.9 | 99.3 | 90.4 | 8.1 | 3.006 |
| +Inner + LSKA | 96.4 | 97.3 | 99.0 | 90.8 | 8.3 | 3.291 |
| +Dy + LSKA | 96.7 | 97.6 | 99.1 | 90.7 | 8.3 | 3.291 |
| +Inner + Dy | 97.5 | 97.7 | 99.2 | 90.8 | 8.1 | 3.018 |
| YOLOv8-Tomato | 99.1 | 99.0 | 99.4 | 91.0 | 8.3 | 3.291 |
Comparison of different models:

| Model Name | Precision/% | Recall/% | mAP@50/% | Model Size/MB |
| --- | --- | --- | --- | --- |
| Faster-RCNN | 82.5 | 89.6 | 91.9 | 110.8 |
| SSD | 81.7 | 89.8 | 87.8 | 93.3 |
| YOLOv3-tiny | 87.9 | 88.3 | 90.8 | 23.8 |
| YOLOv5 | 90.1 | 94.4 | 96.1 | 17.1 |
| YOLOv8n | 96.1 | 96.6 | 98.8 | 6.1 |
| YOLOv8-Tomato | 99.1 | 99.0 | 99.4 | 6.8 |
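As a quick, illustrative consistency check, the mAP@50 gains quoted in the conclusions follow directly from this comparison table:

```python
# Baseline mAP@50 values from the model-comparison table (in %)
baselines = {"Faster-RCNN": 91.9, "SSD": 87.8, "YOLOv3-tiny": 90.8,
             "YOLOv5": 96.1, "YOLOv8n": 98.8}
ours = 99.4  # YOLOv8-Tomato mAP@50

# Gain of YOLOv8-Tomato over each baseline, rounded to one decimal
gains = {model: round(ours - v, 1) for model, v in baselines.items()}
# gains == {'Faster-RCNN': 7.5, 'SSD': 11.6, 'YOLOv3-tiny': 8.6,
#           'YOLOv5': 3.3, 'YOLOv8n': 0.6}
```

These match the 7.5%, 11.6%, 8.6%, 3.3%, and 0.6% improvements stated in conclusion point (4).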
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zheng, S.; Jia, X.; He, M.; Zheng, Z.; Lin, T.; Weng, W. Tomato Recognition Method Based on the YOLOv8-Tomato Model in Complex Greenhouse Environments. Agronomy 2024, 14, 1764. https://doi.org/10.3390/agronomy14081764