YOLOv5-ACS: Improved Model for Apple Detection and Positioning in Apple Forests in Complex Scenes
Abstract
1. Introduction
Research Motivation and Contribution
2. Related Work
2.1. Object Detection
2.2. YOLOv5 Network
2.3. Application of Object Detection in Fruit Detection on the Tree
3. Materials and Methods
3.1. Data Collection
3.2. Apple Object Detection Based on Improved YOLOv5s Network
3.2.1. Focus Module
3.2.2. SPD-Conv Module
- Space-to-Depth (SPD) Module
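$$
\begin{aligned}
f_{1,0} &= F[1:S:\mathit{scale},\ 0:S:\mathit{scale}],\ \dots,\ f_{\mathit{scale}-1,0} = F[\mathit{scale}-1:S:\mathit{scale},\ 0:S:\mathit{scale}];\\
f_{1,1} &= F[1:S:\mathit{scale},\ 1:S:\mathit{scale}],\ \dots,\ f_{\mathit{scale}-1,1} = F[\mathit{scale}-1:S:\mathit{scale},\ 1:S:\mathit{scale}];\\
f_{1,\mathit{scale}-1} &= F[1:S:\mathit{scale},\ \mathit{scale}-1:S:\mathit{scale}],\ f_{\mathit{scale}-1,\mathit{scale}-1} = F[\mathit{scale}-1:S:\mathit{scale},\ \mathit{scale}-1:S:\mathit{scale}].
\end{aligned}
$$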
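The slicing pattern above can be sketched in plain Python. This is a minimal illustration rather than the authors' implementation: it assumes a single-channel S × S feature map stored as a nested list, the function name `space_to_depth` is hypothetical, and the offset-0 sub-maps (such as f0,0) are included on the assumption that they follow the same slicing rule as the listed terms.

```python
def space_to_depth(F, scale):
    """Split an S x S map F into scale**2 sub-maps.

    sub_maps[(x, y)] corresponds to F[x:S:scale, y:S:scale] in the
    notation above: every scale-th row starting at row x, and every
    scale-th column starting at column y.
    """
    S = len(F)
    sub_maps = {}
    for y in range(scale):      # column offset
        for x in range(scale):  # row offset
            sub_maps[(x, y)] = [row[y:S:scale] for row in F[x:S:scale]]
    return sub_maps


# Toy 4 x 4 feature map with scale = 2 (values are illustrative only).
F = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
f = space_to_depth(F, scale=2)
print(f[(1, 0)])  # rows 1 and 3, columns 0 and 2 -> [[4, 6], [12, 14]]
```

Each sub-map has size (S/scale) × (S/scale), and together the scale² sub-maps contain every value of F exactly once, so the downsampling step discards no pixels.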
- Nonstrided Convolution Module
3.2.3. C3SE Module
3.2.4. Context Augmentation Module
3.2.5. Multi-Scale Feature Fusion
3.2.6. Context Aggregation Block
3.2.7. CoordConv
3.3. Experimental Environment and Parameter Settings
3.3.1. Experimental Platform and Parameter Settings
3.3.2. Pre-Trained Model
3.3.3. Model Evaluation Indicators
4. Results and Analysis
4.1. Comparison with Other Deep Learning Models
4.2. Ablation Experiments
4.3. Multi-Scale Object Comparison Experiments
4.4. Detection of Fruits on the Apple Tree in Complex Scenes
4.5. Holdout Cross-Validation
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Dataset | Number of Images | Resolution (pixels) | Features and Manual Augmentation |
---|---|---|---|
Daytime | 800 | 1920 × 1080 | Daytime light: sunlight, backlight; Night light: dim light (76), uneven artificial light (124); Occlusion: leaf occlusion, branch occlusion, occlusion between apples; Individual difference: size difference, maturity difference; Overall: high-density arrangement; Overall brightening (800) and darkening (800) of daytime images; Vertical blur (1000) and horizontal blur (1000) |
Nighttime | 200 | 1280 × 960 | |
Model | Precision (%) | Recall (%) | mAP_0.5 (%) | mAP_0.5:0.95 (%) |
---|---|---|---|---|
C3SE_YOLOv5-ACS | 95.1 | 93.9 | 98.3 | 74.3 |
C3CBAM_YOLOv5-ACS | 94.6 | 94.0 | 98.2 | 74.0 |
C3ECA_YOLOv5-ACS | 95.2 | 93.5 | 98.2 | 74.2 |
Method | Precision (%) | Recall (%) | mAP_0.5 (%) | mAP_0.5:0.95 (%) |
---|---|---|---|---|
Weighted Fusion | 95.1 | 93.9 | 98.3 | 74.1 |
Adaptive Fusion | 95.1 | 93.9 | 98.3 | 74.3 |
Concatenation Fusion | 95.1 | 93.6 | 98.2 | 74.1 |
Model | Precision (%) | Recall (%) | mAP_0.5 (%) | mAP_0.5:0.95 (%) |
---|---|---|---|---|
Faster RCNN | 73.4 | 92.2 | 90.6 | 48.7 |
SSD | 94.3 | 29.6 | 76.1 | 36.6 |
YOLOv5s | 95.0 | 92.9 | 97.7 | 71.6 |
YOLOv7 | 95.0 | 93.0 | 97.7 | 70.8 |
YOLOv5-ACS | 95.1 | 93.9 | 98.3 | 74.3 |
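The gain of YOLOv5-ACS over the YOLOv5s baseline can be read off the comparison table with a few lines of Python. The sketch below only re-expresses the table values; the dictionary layout and the helper name `gain_over` are illustrative choices, not part of the original work.

```python
# Metrics copied from the comparison table above (all values in %).
results = {
    "Faster RCNN": {"precision": 73.4, "recall": 92.2,
                    "mAP_0.5": 90.6, "mAP_0.5:0.95": 48.7},
    "SSD": {"precision": 94.3, "recall": 29.6,
            "mAP_0.5": 76.1, "mAP_0.5:0.95": 36.6},
    "YOLOv5s": {"precision": 95.0, "recall": 92.9,
                "mAP_0.5": 97.7, "mAP_0.5:0.95": 71.6},
    "YOLOv7": {"precision": 95.0, "recall": 93.0,
               "mAP_0.5": 97.7, "mAP_0.5:0.95": 70.8},
    "YOLOv5-ACS": {"precision": 95.1, "recall": 93.9,
                   "mAP_0.5": 98.3, "mAP_0.5:0.95": 74.3},
}


def gain_over(baseline, model, table):
    """Return the per-metric difference (percentage points) of model vs. baseline."""
    return {
        metric: round(table[model][metric] - table[baseline][metric], 1)
        for metric in table[baseline]
    }


gains = gain_over("YOLOv5s", "YOLOv5-ACS", results)
print(gains)
```

On these figures the largest difference is in mAP_0.5:0.95 (2.7 percentage points), while precision is nearly unchanged (0.1 points).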
Stage | Model | Precision (%) | Recall (%) | mAP_0.5 (%) | mAP_0.5:0.95 (%) |
---|---|---|---|---|---|
Backbone | YOLOv5s | 95.0 | 92.9 | 97.7 | 71.6 |
Backbone | YOLOv5s + SPD | 94.7 | 94.0 | 97.9 | 72.2 |
Backbone | YOLOv5s + SPD + C3SE | 94.7 | 94.4 | 98.0 | 73.1 |
Backbone | YOLOv5s + SPD + C3SE + CAM | 95.3 | 94.0 | 98.0 | 73.2 |
Neck | YOLOv5s + P2 | 95.0 | 92.8 | 97.9 | 72.7 |
Neck + Head | YOLOv5s + P2 + CABlock | 94.2 | 93.6 | 98.0 | 72.9 |
Neck + Head | YOLOv5s + P2 + CABlock + CoordConv | 95.0 | 93.0 | 98.0 | 72.7 |
Backbone + Neck + Head | YOLOv5s + SPD + C3SE + CAM + P2 + CABlock | 94.7 | 94.0 | 98.2 | 74.1 |
Backbone + Neck + Head | YOLOv5s + SPD + C3SE + CAM + P2 + CABlock + Conv | 95.1 | 93.8 | 98.2 | 74.1 |
Backbone + Neck + Head | YOLOv5s + SPD + C3SE + CAM + P2 + CABlock + CoordConv | 95.1 | 93.9 | 98.3 | 74.3 |
Model | APs (%) | APm (%) | APl (%) | ARs (%) | ARm (%) | ARl (%) |
---|---|---|---|---|---|---|
YOLOv5s | 19.0 | 67.0 | 83.0 | 27.1 | 70.8 | 86.0 |
YOLOv5s + SPD + C3SE | 21.6 | 68.3 | 83.9 | 32.3 | 72.0 | 86.8 |
YOLOv5-ACS | 29.0 | 69.7 | 85.0 | 39.6 | 73.6 | 87.7 |
Model | Scene | Precision (%) | Recall (%) | mAP_0.5 (%) | mAP_0.5:0.95 (%) |
---|---|---|---|---|---|
YOLOv5s | Nighttime | 91.6 | 92.3 | 96.6 | 75.1 |
YOLOv5-ACS | Nighttime | 91.7 | 93.3 | 97.2 | 78.1 |
YOLOv5s | Nighttime motion blur | 89.9 | 89.2 | 94.7 | 68.2 |
YOLOv5-ACS | Nighttime motion blur | 90.4 | 89.6 | 95.2 | 70.9 |
YOLOv5s | Daytime | 95.7 | 94.8 | 98.4 | 74.8 |
YOLOv5-ACS | Daytime | 95.8 | 95.8 | 98.9 | 77.6 |
YOLOv5s | Daytime motion blur | 94.8 | 91.6 | 97.1 | 69.4 |
YOLOv5-ACS | Daytime motion blur | 93.7 | 93.5 | 97.8 | 72.2 |
YOLOv5s | Dimming | 95.7 | 93.7 | 98.1 | 73.4 |
YOLOv5-ACS | Dimming | 95.7 | 95.0 | 98.7 | 76.2 |
YOLOv5s | Brightening | 95.8 | 94.5 | 98.3 | 74.7 |
YOLOv5-ACS | Brightening | 96.0 | 95.6 | 98.8 | 77.4 |
Model | Random Seed | Precision (%) | Recall (%) | mAP_0.5 (%) | mAP_0.5:0.95 (%) |
---|---|---|---|---|---|
YOLOv5s | 1 | 95.2 | 92.8 | 97.4 | 71.7 |
YOLOv5-ACS | 1 | 95.4 | 93.7 | 98.1 | 74.6 |
YOLOv5s | 2 | 94.8 | 93.0 | 97.0 | 71.6 |
YOLOv5-ACS | 2 | 95.0 | 93.6 | 98.1 | 74.5 |
YOLOv5s | 3 | 95.4 | 92.4 | 97.3 | 71.4 |
YOLOv5-ACS | 3 | 95.7 | 93.1 | 98.1 | 74.4 |
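The holdout results across the three random seeds can be summarised by their mean and spread. The following sketch uses only the mAP_0.5:0.95 column of the table above; the variable names and the choice of max-minus-min as the spread measure are illustrative assumptions.

```python
from statistics import mean

# mAP_0.5:0.95 (%) for random seeds 1, 2, 3, copied from the table above.
runs = {
    "YOLOv5s": [71.7, 71.6, 71.4],
    "YOLOv5-ACS": [74.6, 74.5, 74.4],
}

summary = {
    model: {
        "mean": round(mean(scores), 2),
        "spread": round(max(scores) - min(scores), 1),  # max minus min
    }
    for model, scores in runs.items()
}

for model, stats in summary.items():
    print(f"{model}: mean={stats['mean']}, spread={stats['spread']}")
```

With these values the improved model averages 74.5% against roughly 71.57% for the baseline, and both vary by no more than 0.3 percentage points across seeds, so the gap is much larger than the seed-to-seed variation.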
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, J.; Wang, C.; Xing, J. YOLOv5-ACS: Improved Model for Apple Detection and Positioning in Apple Forests in Complex Scenes. Forests 2023, 14, 2304. https://doi.org/10.3390/f14122304