Multi-Object Detection Method in Construction Machinery Swarm Operations Based on the Improved YOLOv4 Model
Abstract
1. Introduction
2. Improved YOLOv4 Network Model
2.1. The YOLOv4 Network Model
2.2. Improved Network Model
2.2.1. K-Means Algorithm
2.2.2. Dilated Convolution
2.2.3. Focal Loss
3. Model Training and Tuning
3.1. Experimental Dataset
3.1.1. Dataset Acquisition
3.1.2. Calibration of the Dataset
3.2. Experimental Platform and Parameters
4. Experimental Results and Analysis
4.1. Evaluation Metrics for Model Performance
4.2. Comparative Analysis of Experimental Results
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Parameter α | Parameter γ | mAP (%) |
---|---|---|
0.25 | 2 | 96.39 |
0.3 | 2 | 96.62 |
0.35 | 2 | 96.47 |
0.4 | 2 | 96.45 |
0.45 | 2 | 96.27 |
0.5 | 2 | 96.32 |
0.25 | 1.5 | 96.75 |
0.3 | 1.5 | 97.03 |
0.35 | 1.5 | 96.46 |
0.4 | 1.5 | 96.32 |
0.45 | 1.5 | 96.13 |
0.5 | 1.5 | 96.24 |
0.3 | 1 | 96.47 |
0.3 | 0 | 95.75 |
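
The sweep above varies the two focal-loss hyperparameters. The best reported mAP (97.03%) is at α = 0.3, γ = 1.5 and the lowest (95.75%) is at γ = 0. As a rough illustration of what the two parameters control, the following is a minimal sketch of the commonly used binary focal loss, FL(p_t) = −α_t (1 − p_t)^γ log(p_t). It is not the authors' implementation: the function name `focal_loss`, the scalar interface, and the choice of defaults are assumptions made for this example, and how the loss is wired into YOLOv4 is not shown here.

```python
import math


def focal_loss(p: float, y: int, alpha: float = 0.3, gamma: float = 1.5) -> float:
    """Binary focal loss for a single prediction (illustrative sketch).

    p     -- predicted probability of the positive class, in [0, 1]
    y     -- ground-truth label, 0 or 1
    alpha -- class-balance weight for the positive class
    gamma -- focusing parameter; larger values down-weight easy examples
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)              # keep log() finite
    p_t = p if y == 1 else 1.0 - p               # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha   # weight of the true class
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (p = 0.9) contributes far less than a hard one (p = 0.1).
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)
print(f"easy={easy:.5f} hard={hard:.5f}")
```

With γ = 0 the modulating factor (1 − p_t)^γ equals 1, so under this formulation the loss reduces to α-weighted cross-entropy, which is the setting of the last row of the table.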
Model | K-Means | Focal Loss | Dilated Convolution | F1 | mAP (%) | FPS |
---|---|---|---|---|---|---|
YOLOv4 | × | × | × | 0.938 | 94.87 | 31.70 |
YOLOv4 | √ | × | × | 0.938 | 96.15 | 32.11 |
YOLOv4 | × | √ | × | 0.93 | 95.96 | 30.17 |
YOLOv4 | × | × | √ | 0.935 | 95.81 | 30.14 |
YOLOv4 | √ | × | √ | 0.94 | 95.68 | 30.86 |
YOLOv4 | × | √ | √ | 0.928 | 95.37 | 30.34 |
YOLOv4 | √ | √ | × | 0.938 | 96.14 | 30.49 |
YOLOv4 | √ | √ | √ | 0.94 | 97.03 | 31.11 |
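
One of the three modules toggled in the table above is dilated convolution. A dilated kernel with k taps and dilation rate d samples inputs d positions apart, so it spans k + (k − 1)(d − 1) inputs while keeping the same number of weights. The sketch below shows this in one dimension using plain Python lists. The function names and the example rates are hypothetical; the dilation rates and layer placement used in the paper are not given in this excerpt.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """1-D dilated cross-correlation with no padding and stride 1."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Each kernel tap reads the input `dilation` steps after the previous tap.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

print(effective_kernel_size(3, 1))            # 3
print(effective_kernel_size(3, 2))            # 5
print(dilated_conv1d(signal, kernel, 1))      # [6, 9, 12, 15, 18]
print(dilated_conv1d(signal, kernel, 2))      # [9, 12, 15]
```

The same three weights cover five inputs at rate 2 instead of three at rate 1, which is why dilation enlarges the receptive field without adding parameters.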
Model | Loader AP (%) | Excavator AP (%) | Truck AP (%) | Person AP (%) | F1 | mAP (%) | FPS |
---|---|---|---|---|---|---|---|
Faster-RCNN (Resnet50) | 97.21 | 95.20 | 92.11 | 86.43 | 0.75 | 93.00 | 20.94 |
Faster-RCNN (Vgg) | 96.77 | 94.14 | 90.80 | 84.54 | 0.68 | 91.56 | 21.32 |
SSD (mobilenetv2) | 96.08 | 91.60 | 86.73 | 71.05 | 0.82 | 86.37 | 71.37 |
SSD (Vgg) | 99.25 | 96.03 | 92.82 | 83.38 | 0.897 | 92.87 | 72.04 |
YOLOv4 | 98.13 | 97.41 | 91.25 | 92.69 | 0.938 | 94.87 | 31.70 |
Improved YOLOv4 | 99.79 | 97.39 | 96.61 | 94.33 | 0.94 | 97.03 | 31.11 |
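
The mAP column in the comparison above can be checked against the per-class AP values, since mAP is usually defined as the unweighted mean of the class APs. The sketch below recomputes that mean for each row as a sanity check. The names `ROWS` and `mean_ap` are introduced here for illustration, and equal weighting of the four classes (loader, excavator, truck, person) is an assumption of this sketch rather than something stated in this excerpt.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-class AP (%) and reported mAP (%), copied from the comparison table.
ROWS = {
    "Faster-RCNN (Resnet50)": (["97.21", "95.20", "92.11", "86.43"], "93.00"),
    "Faster-RCNN (Vgg)":      (["96.77", "94.14", "90.80", "84.54"], "91.56"),
    "SSD (mobilenetv2)":      (["96.08", "91.60", "86.73", "71.05"], "86.37"),
    "SSD (Vgg)":              (["99.25", "96.03", "92.82", "83.38"], "92.87"),
    "YOLOv4":                 (["98.13", "97.41", "91.25", "92.69"], "94.87"),
    "Improved YOLOv4":        (["99.79", "97.39", "96.61", "94.33"], "97.03"),
}


def mean_ap(per_class_ap):
    """Unweighted mean of per-class AP, rounded half-up to two decimals."""
    total = sum(Decimal(v) for v in per_class_ap)
    mean = total / len(per_class_ap)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


computed = {model: mean_ap(aps) for model, (aps, _) in ROWS.items()}

# Rows whose recomputed mean differs from the reported mAP.
mismatches = {
    model: (computed[model], Decimal(reported))
    for model, (_, reported) in ROWS.items()
    if computed[model] != Decimal(reported)
}

for model, value in computed.items():
    print(f"{model}: recomputed mAP = {value}")
print("rows that differ from the table:", list(mismatches))
```

Under that assumption, five of the six rows reproduce the reported mAP to two decimals. The Faster-RCNN (Resnet50) row averages to 92.74 against a reported 93.00, and the source gives no explanation for the difference.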
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Hou, L.; Chen, C.; Wang, S.; Wu, Y.; Chen, X. Multi-Object Detection Method in Construction Machinery Swarm Operations Based on the Improved YOLOv4 Model. Sensors 2022, 22, 7294. https://doi.org/10.3390/s22197294