Tea-YOLOv8s: A Tea Bud Detection Model Based on Deep Learning and Computer Vision
Abstract
1. Introduction
2. Related Work
2.1. Attention Mechanism: Focusing on Selective Information
2.2. Deformable Convolution
2.3. Spatial Pyramid Pooling
2.4. The Model Structure of the YOLOv8 Network
3. Materials and Methods
3.1. Image Acquisition
3.2. Data Pre-Processing
3.3. The Proposed Tea-YOLOv8s Model
3.4. Experimental Environment
3.5. Training Parameters
3.6. Evaluation Metrics
4. Experiments and Results
4.1. Data Augmentation Performance
4.2. Comparison of the Overall Precision of the Network Models
4.3. Ablation Experiment
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
**Experimental environment.**

Environmental Parameter | Value
---|---
CPU | AMD Ryzen 9 4900H
GPU | NVIDIA GeForce 3060 Ti
RAM | 16 GB
Video memory | 6 GB
Operating system | Windows 10
Deep learning framework | PyTorch
cuDNN | 10.1
OpenCV | 4.5.2
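As a quick sanity check (not part of the paper), the following sketch prints the framework versions and GPU visibility that this environment implies:

```python
import torch
import cv2

# Report the versions and GPU visibility the experiments assume.
print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")       # e.g. GeForce 3060 Ti
    print(f"cuDNN: {torch.backends.cudnn.version()}")     # the paper reports 10.1
print(f"OpenCV: {cv2.__version__}")                        # the paper reports 4.5.2
```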
**Training parameters.**

Parameter | Value | Parameter | Value
---|---|---|---
Learning rate | 0.01 | Batch size | 4
Image size | 640 × 640 | Epochs | 300
Momentum | 0.937 | Mixup probability | 0.5
Optimizer | SGD | Weight decay | 0.0005
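These hyperparameters map directly onto the Ultralytics training API. The sketch below is a minimal reproduction of the reported setup, assuming a hypothetical dataset config `tea_buds.yaml`; note that it trains the stock YOLOv8s baseline, not the modified Tea-YOLOv8s architecture:

```python
from ultralytics import YOLO

# Train the YOLOv8s baseline with the hyperparameters from the table above.
# "tea_buds.yaml" is a placeholder dataset config, not a file from the paper.
model = YOLO("yolov8s.pt")
model.train(
    data="tea_buds.yaml",
    imgsz=640,            # image size 640 x 640
    epochs=300,           # epochs 300
    batch=4,              # batch size 4
    optimizer="SGD",      # optimizer SGD
    lr0=0.01,             # initial learning rate 0.01
    momentum=0.937,       # momentum 0.937
    weight_decay=0.0005,  # weight decay 0.0005
    mixup=0.5,            # mixup probability 0.5
)
```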
**Comparison of the original images with globally and locally homomorphic-filtered images at different block sizes.**

Block Size | 8 × 8 | 10 × 10 | 12 × 12 | 14 × 14 | 16 × 16 | 18 × 18
---|---|---|---|---|---|---
Original image | 3.2250 | 3.5879 | 3.8071 | 4.0067 | 4.1375 | 4.3077
Global homomorphic filtering | 2.7675 | 3.0972 | 3.3048 | 3.4921 | 3.6157 | 3.7762
Local homomorphic filtering | 3.4887 | 3.9484 | 4.1994 | 4.4251 | 4.5674 | 4.7571
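Homomorphic filtering corrects uneven illumination in the frequency domain; the local variant compared above runs the same routine on each image block independently rather than on the whole image. Below is a minimal sketch of the global filter, where the gain and cutoff parameters (`gamma_l`, `gamma_h`, `c`, `d0`) are illustrative defaults, not values from the paper:

```python
import numpy as np

def homomorphic_filter(gray, gamma_l=0.5, gamma_h=2.0, c=1.0, d0=30.0):
    """Homomorphic filtering of a grayscale image (uint8, values 0-255).

    The log transform turns the multiplicative illumination/reflectance
    model into an additive one; a Gaussian high-frequency-emphasis filter
    then attenuates low frequencies (illumination, weighted by gamma_l < 1)
    and boosts high frequencies (detail, weighted by gamma_h > 1).
    """
    log_img = np.log1p(gray.astype(np.float64))
    spectrum = np.fft.fftshift(np.fft.fft2(log_img))

    rows, cols = gray.shape
    u = np.arange(rows) - rows / 2
    v = np.arange(cols) - cols / 2
    dist_sq = u[:, None] ** 2 + v[None, :] ** 2   # squared distance from centre
    h = (gamma_h - gamma_l) * (1.0 - np.exp(-c * dist_sq / d0 ** 2)) + gamma_l

    filtered = np.real(np.fft.ifft2(np.fft.ifftshift(h * spectrum)))
    out = np.expm1(filtered)                      # undo the log transform
    out = 255.0 * (out - out.min()) / (np.ptp(out) + 1e-9)  # rescale for display
    return out.astype(np.uint8)
```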
**Overall precision comparison of the network models.**

Models | P | R | F1 | mAP@0.5
---|---|---|---|---
YOLOv3 | 90.04% | 64.91% | 75.44% | 74.25%
YOLOv4 | 90.49% | 15.18% | 26.00% | 51.96%
YOLOv5s | 94.57% | 72.02% | 81.77% | 81.87%
YOLOXs | 94.05% | 54.61% | 69.10% | 73.93%
YOLOv8s | 93.08% | 76.13% | 83.76% | 84.68%
Tea-YOLOv8s | 94.80% | 81.23% | 87.49% | 88.27%
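Here F1 is the harmonic mean of precision (P) and recall (R), and the unlabeled fourth metric column is taken to be mAP@0.5, consistent with the ablation table below. The F1 column follows directly from the P and R columns:

```python
def f1(p, r):
    """Harmonic mean of precision and recall (percent in, percent out)."""
    return 2 * p * r / (p + r)

print(f"YOLOv8s:     {f1(93.08, 76.13):.2f}%")  # 83.76%, matching the table
print(f"Tea-YOLOv8s: {f1(94.80, 81.23):.2f}%")  # 87.49%, matching the table
```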
**Ablation experiment results.**

Models | P | R | F1 | mAP@0.5 | mAP@0.5:0.95
---|---|---|---|---|---
YOLOv8s | 93.08% | 76.13% | 83.76% | 84.68% | 56.93%
YOLOv8s + GAM | 91.98% | 79.96% | 85.55% | 86.25% | 59.67%
YOLOv8s + SPPFCSPC | 92.79% | 78.94% | 85.31% | 86.12% | 59.81%
YOLOv8s + DCNv2 | 91.48% | 79.01% | 84.79% | 85.36% | 60.27%
YOLOv8s + GAM + SPPFCSPC | 93.77% | 80.92% | 86.87% | 87.39% | 64.30%
YOLOv8s + GAM + DCNv2 | 93.04% | 81.17% | 86.70% | 86.84% | 63.69%
YOLOv8s + SPPFCSPC + DCNv2 | 93.83% | 79.38% | 86.00% | 86.19% | 61.38%
Tea-YOLOv8s | 94.80% | 81.23% | 87.49% | 88.27% | 65.72%
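The stricter mAP@0.5:0.95 column averages AP over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, which is why it sits well below mAP@0.5. The sketch below illustrates only this averaging step; the per-threshold AP values are placeholders, not results from the paper:

```python
import numpy as np

# mAP@0.5:0.95 is the mean AP over the ten IoU thresholds 0.50, 0.55, ..., 0.95.
# The AP values below are made-up placeholders that only show the averaging.
iou_thresholds = np.arange(0.50, 1.00, 0.05)
ap_at_threshold = [0.88, 0.86, 0.83, 0.79, 0.74, 0.67, 0.58, 0.46, 0.32, 0.14]

for t, ap in zip(iou_thresholds, ap_at_threshold):
    print(f"AP@{t:.2f} = {ap:.2f}")
print(f"mAP@0.5:0.95 = {np.mean(ap_at_threshold):.3f}")  # 0.627 for these values
```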
**Model complexity comparison.**

Models | Layers | Parameters | Inference Time | GFLOPs
---|---|---|---|---
YOLOv8s | 168 | 11.1 M | 19.0 ms | 28.4
YOLOv8s + GAM | 201 | 19.7 M | 27.5 ms | 44.2
YOLOv8s + SPPFCSPC | 178 | 17.6 M | 21.6 ms | 33.6
YOLOv8s + DCNv2 | 178 | 11.4 M | 24.9 ms | 25.0
YOLOv8s + GAM + SPPFCSPC | 211 | 26.2 M | 31.3 ms | 49.3
YOLOv8s + GAM + DCNv2 | 211 | 20.0 M | 35.7 ms | 40.8
YOLOv8s + SPPFCSPC + DCNv2 | 188 | 17.9 M | 28.2 ms | 30.2
Tea-YOLOv8s | 221 | 26.5 M | 37.1 ms | 45.9
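Parameter counts and inference times like those above can be re-measured with PyTorch. The sketch below does so for the stock YOLOv8s baseline; absolute timings depend on hardware, and this CPU-only loop will not match the GPU timings reported in the table:

```python
import time
import torch
from ultralytics import YOLO

model = YOLO("yolov8s.pt").model  # underlying nn.Module (DetectionModel)
model.eval()

# Parameter count: should land near the 11.1 M reported for YOLOv8s.
n_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {n_params / 1e6:.1f} M")

# Average forward-pass latency over 100 runs on a dummy 640 x 640 input.
x = torch.zeros(1, 3, 640, 640)
with torch.no_grad():
    start = time.perf_counter()
    for _ in range(100):
        model(x)
print(f"Inference: {(time.perf_counter() - start) / 100 * 1e3:.1f} ms/image")
```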
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xie, S.; Sun, H. Tea-YOLOv8s: A Tea Bud Detection Model Based on Deep Learning and Computer Vision. Sensors 2023, 23, 6576. https://doi.org/10.3390/s23146576