Applying Deep Learning and Single Shot Detection in Construction Site Image Recognition
Abstract
1. Introduction
2. Literature Review
- Sliding window: the simplest but most time-consuming method [30], based on exhaustive search. Windows of various sizes scan the image, and feature information is extracted from every window. The features are then fed to a classifier, which estimates the probability that the window contains the object to be detected, as presented in Figure 3.
- Region proposal: information in the image, such as texture, edges, and color, is used to predetermine the regions of interest (ROI) likely to contain the object, and the probability of a match is then determined for these regions. High recall is maintained by filtering thousands of regions per second. Representative algorithms include R-CNN, Fast R-CNN, and Faster R-CNN [31,32,33,34], as shown in Figure 4.
- Grid-based regression: the image is divided into a grid, and regions of various sizes are selected with the grid cells as centers. Regression determines the probability that each bounding box contains the target. This approach is suitable for real-time detection. Representative algorithms include You Only Look Once (YOLO) and the Single Shot MultiBox Detector (SSD) [35], as shown in Figure 5.
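To make the cost of the sliding-window approach concrete, the following minimal sketch enumerates every window that a scan would have to classify. The image size, window sizes, and stride are hypothetical values chosen for illustration, not settings used in this study.

```python
from typing import Iterator, List, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height)


def sliding_windows(img_w: int, img_h: int,
                    sizes: List[Tuple[int, int]],
                    stride: int) -> Iterator[Window]:
    """Yield every window of each size that fits fully inside the image."""
    for win_w, win_h in sizes:
        for y in range(0, img_h - win_h + 1, stride):
            for x in range(0, img_w - win_w + 1, stride):
                yield (x, y, win_w, win_h)


# Hypothetical settings: a 300 x 300 image, three square window sizes, stride 8.
sizes = [(64, 64), (128, 128), (256, 256)]
windows = list(sliding_windows(300, 300, sizes, stride=8))

# Each window would need its own feature extraction and classifier call.
print(f"windows to classify: {len(windows)}")
```

Even with this coarse stride and only three window sizes, more than a thousand windows must be classified for a single small image, which is why the exhaustive scan is the slowest of the three approaches.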
3. Methodology
3.1. Study Setup
3.2. Collection of Job Site Images for a Construction Project
- Legal and free job site pictures obtained from Google under “Creative Commons”;
- Free databases provided by computer vision institutes, such as ImageNet and Labelme of MIT; and
- Photos of construction job sites taken for the study.
3.3. Method of Object Detection (SSD)
4. Study Contents and Outcomes
4.1. Establishment and Testing of Single Shot MultiBox Detector Model
4.2. Model Training Data Analysis
4.3. Single Shot MultiBox Detector Deep Learning Model Training Outcomes
- Monitoring the operation status of construction site personnel and equipment: real-time monitoring, including entry and exit times, the number of construction personnel, and the number of pieces of equipment present at a given time, thereby effectively improving construction safety and efficiency.
- Ensuring the supply of construction site materials: effectively monitoring the entry and exit of construction site materials and their inventory status, ensuring the adequate and timely supply and use of materials on site.
- Improving the efficiency of construction site management: automatically recording the entry and exit time, location, and other information of construction site personnel and equipment, reducing the cost and risk of manual management, and improving the efficiency and accuracy of site management.
- Optimizing construction site scheduling: using image recognition technology to record construction logs and monitor the progress of various works at the construction site, adjusting the schedule promptly, improving construction efficiency, and reducing construction delays.
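As an illustration of how detector output could feed the monitoring applications listed above, the following sketch counts the objects of each class found in one frame. The detection format (label, confidence score), the sample values, and the function name are assumptions made for this example and do not come from the study's implementation.

```python
from collections import Counter
from typing import Iterable, Tuple

Detection = Tuple[str, float]  # (class label, confidence score)


def count_by_class(detections: Iterable[Detection],
                   min_score: float = 0.5) -> Counter:
    """Count detections per class, keeping only those at or above min_score."""
    return Counter(label for label, score in detections if score >= min_score)


# Hypothetical detections for a single frame.
frame = [
    ("worker", 0.91),
    ("worker", 0.42),   # below the threshold, so it is ignored
    ("machine", 0.88),
    ("rebar", 0.55),
    ("worker", 0.67),
]

counts = count_by_class(frame)
print(dict(counts))
```

Logging such counts with a timestamp per frame would give the kind of personnel and equipment record described in the first and third items.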
5. Conclusions and Suggestions
- Regarding personnel detection: changes in worker posture, changes in site brightness, and object occlusion would lead to false detections.
- Regarding material detection: densely packed rebar made joints and sections difficult to distinguish to varying degrees; in addition, with the single-target detection algorithm, the stacking of background and foreground differed between samples, which may have reduced the sensitivity of the model to the samples and resulted in false detections.
- Regarding equipment detection: the equipment classes included excavators, shovel loaders, dump trucks, cranes, concrete mixer trucks, etc. More data were available for equipment than for construction personnel and materials, and its identification performance was better. However, extending the training with data from another project may lead to further overfitting.
- The evolutionary many-objective optimization algorithm with new techniques, such as domain decomposition and multi-objective optimization decomposition, can improve the efficiency and accuracy of construction site management and enhance image recognition in construction engineering [46].
- Optimizing truck scheduling through algorithms can improve the efficiency and accuracy of material transportation and scheduling at construction sites, leading to intelligent and automated material transportation and ultimately enhancing construction efficiency and quality [47].
- Multi-objective optimization algorithms can significantly enhance the efficiency and accuracy of construction site management tasks, such as material transportation, equipment scheduling, and personnel management. Integrating image recognition applications with these algorithms enables the intelligent and automated monitoring and control of construction sites, improving construction efficiency and quality [48].
- Image recognition technology can monitor the construction site in real time, detect potential risk factors, and determine the direction of improvement. At the same time, efficient dock scheduling algorithms can optimize construction materials and equipment logistics, reduce waiting time, and improve overall productivity [49].
- A further direction is to combine image recognition technology, which monitors construction site safety in real time and detects potential safety hazards early, with NSGA-II and MOPSO algorithms for ambulance routing to improve rescue efficiency and emergency response capabilities [50].
- Applying the augmented self-adaptive parameter control method to a broader range of construction scenarios can improve construction efficiency and safety. Further research will explore combining the technique with other optimization algorithms to enhance its effectiveness and reduce construction costs [51].
- To enhance the simultaneous detection of personnel, equipment, and materials, upcoming methods will include feature pyramids, complete intersection over union (CIoU) loss, focal loss, and bag-of-freebies optimization for target detection [52].
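Of the methods named in the last item, focal loss is the simplest to illustrate. The sketch below assumes the commonly published binary form, FL(p_t) = -α_t (1 - p_t)^γ log(p_t); the function name and the defaults γ = 2 and α = 0.25 are illustrative choices, not settings from this study.

```python
import math


def focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Binary focal loss for one prediction.

    p is the predicted probability of the positive class and y is the true
    label (1 or 0). gamma and alpha are assumed defaults for illustration.
    """
    p_t = p if y == 1 else 1.0 - p              # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, 1e-12), 1.0)             # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)   # well-classified positive: strongly down-weighted
hard = focal_loss(0.1, 1)   # misclassified positive: keeps most of its loss
print(f"easy example: {easy:.5f}, hard example: {hard:.5f}")
```

The (1 - p_t)^γ factor shrinks the contribution of examples the model already classifies well, so training concentrates on hard examples such as occluded workers or densely packed rebar.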
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Yang, J.; Arif, O.; Vela, P.A.; Teizer, J.; Shi, Z. Tracking multiple workers on construction sites using video cameras. Adv. Eng. Inform. 2010, 24, 428–434.
- Riveiro, B.; Lourenço, P.B.; Oliveira, D.V.; González-Jorge, H.; Arias, P. Automatic morphologic analysis of quasi-periodic masonry walls from LiDAR. Comput.-Aided Civ. Infrastruct. Eng. 2016, 31, 305–319.
- Thakar, V.; Saini, H.; Ahmed, W.; Soltani, M.M.; Aly, A.; Yu, J.Y. Efficient Single-Shot Multi-Box Detector for Construction Site Monitoring. In Proceedings of the 2018 IEEE International Smart Cities Conference (ISC2), Kansas City, MO, USA, 16–19 September 2018; pp. 1–6.
- Zhu, Z.; Ren, X.; Chen, Z. Visual tracking of construction jobsite workforce and equipment with particle filtering. J. Comput. Civ. Eng. 2016, 30, 04016023.
- Wang, Q.; Cheng, J.C.; Sohn, H. Automated estimation of reinforced precast concrete rebar positions using colored laser scan data. Comput.-Aided Civ. Infrastruct. Eng. 2017, 32, 787–802.
- Nimmo, J.; Green, R. Pedestrian avoidance in construction sites. In Proceedings of the 2017 International Conference on Image and Vision Computing New Zealand (IVCNZ), Christchurch, New Zealand, 4–6 December 2017; pp. 1–6.
- Alizadehsalehi, S.; Yitmen, I. A Concept for Automated Construction Progress Monitoring: Technologies Adoption for Benchmarking Project Performance Control. Arab. J. Sci. Eng. 2018, 44, 4993–5008.
- Fang, W.; Ding, L.; Luo, H.; Love, P.E. Falls from heights: A computer vision-based approach for safety harness detection. Autom. Constr. 2018, 91, 53–61.
- Mahami, H.; Nasirzadeh, F.; Ahmadabadian, A.H.; Esmaeili, F.; Nahavandi, S. Imaging network design to improve the automated construction progress monitoring process. Constr. Innov. 2019, 19, 386–404.
- Greeshma, A.S.; Edayadiyil, J.B. Automated progress monitoring of construction projects using Machine learning and image processing approach. Mater. Today Proc. 2022, 65, 554–563.
- Del Savio, A.; Luna, A.; Cárdenas-Salas, D.; Vergara, M.; Urday, G. Dataset of manually classified images obtained from a construction site. Data Brief 2022, 42, 108042.
- Fang, Q.; Li, H.; Luo, X.; Ding, L.; Luo, H.; Rose, T.M.; An, W. Detecting non-hardhat-use by a deep learning method from far-field surveillance videos. Autom. Constr. 2018, 85, 1–9.
- Fang, W.; Ding, L.; Zhong, B.; Love, P.E.; Luo, H. Automated detection of workers and heavy equipment on construction sites: A convolutional neural network approach. Adv. Eng. Inform. 2018, 37, 139–149.
- Kim, Y.; Choi, Y. Smart Helmet-Based Proximity Warning System to Improve Occupational Safety on the Road Using Image Sensor and Artificial Intelligence. Int. J. Environ. Res. Public Health 2022, 19, 16312.
- Buniya, M.K.; Othman, I.; Sunindijo, R.Y.; Kashwani, G.; Durdyev, S.; Ismail, S.; Antwi-Afari, M.F.; Li, H. Critical Success Factors of Safety Program Implementation in Construction Projects in Iraq. Int. J. Environ. Res. Public Health 2021, 18, 8469.
- Yeşilmen, S.; Tatar, B. Efficiency of convolutional neural networks (CNN) based image classification for monitoring construction related activities: A case study on aggregate mining for concrete production. Case Stud. Constr. Mater. 2022, 17, e01372.
- Lee, H.; Grosse, R.; Ranganath, R.; Ng, A.Y. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada, 14–18 June 2009; pp. 609–616.
- Makantasis, K.; Protopapadakis, E.; Doulamis, A.; Doulamis, N.; Loupos, C. Deep convolutional neural networks for efficient vision based tunnel inspection. In Proceedings of the 2015 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP), Cluj-Napoca, Romania, 3–5 September 2015; pp. 335–342.
- Pan, Y.; Zhang, L. Roles of artificial intelligence in construction engineering and management: A critical review and future trends. Autom. Constr. 2021, 122, 10357.
- Yan, X.; Li, H.; Wang, C.; Seo, J.; Zhang, H.; Wang, H. Development of ergonomic posture recognition technique based on 2D ordinary camera for construction hazard prevention through view-invariant features in 2D skeleton motion. Adv. Eng. Inform. 2017, 34, 152–163.
- Sepas-Moghaddam, A.; Etemad, A. Deep Gait Recognition: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 264–284.
- Lin, C.-L.; Fan, K.-C.; Lai, C.-R.; Cheng, H.-Y.; Chen, T.-P.; Hung, C.-M. Applying a Deep Learning Neural Network to Gait-Based Pedestrian Automatic Detection and Recognition. Appl. Sci. 2022, 12, 4326.
- Arabi, S.; Haghighat, A.K.; Sharma, A. A deep-learning-based computer vision solution for construction vehicle detection. Comput.-Aided Civ. Infrastruct. Eng. 2020, 35, 753–767.
- Chou, J.-S.; Liu, C.-H. Automated Sensing System for Real-Time Recognition of Trucks in River Dredging Areas Using Computer Vision and Convolutional Deep Learning. Sensors 2021, 21, 555.
- Li, Y.; Lu, Y.J.; Chen, J. A deep learning approach for real-time rebar counting on the construction site based on YOLOv3 detector. Autom. Constr. 2021, 124, 103602.
- Cha, Y.J.; Choi, W.; Büyüköztürk, O. Deep learning-based crack damage detection using convolutional neural networks. Comput.-Aided Civ. Infrastruct. Eng. 2017, 32, 361–378.
- Chang, C.W.; Lin, C.H.; Lien, H.S. Measurement radius of reinforcing steel bar in concrete using digital image GPR. Constr. Build. Mater. 2009, 23, 1057–1063.
- CS231n Convolutional Neural Networks for Visual Recognition, Stanford. 2016. Available online: http://cs231n.stanford.edu/ (accessed on 27 March 2023).
- Martinez, P.; Al-Hussein, M.; Ahmad, R. A scientometric analysis and critical review of computer vision applications for construction. Autom. Constr. 2019, 107, 102947.
- Sermanet, P.; Eigen, D.; Zhang, X.; Mathieu, M.; Fergus, R.; LeCun, Y. Overfeat: Integrated recognition, localization and detection using convolutional networks. arXiv 2013, arXiv:1312.6229.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 142–158.
- Girshick, R. Fast r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 11–18 December 2015; pp. 1440–1448.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28, 91–99.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. Ssd: Single shot Multi-box detector. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2016; pp. 21–37.
- Yudin, D.; Slavioglo, D. Usage of fully convolutional network with clustering for traffic light detection. In Proceedings of the 2018 7th Mediterranean Conference on Embedded Computing (MECO), Budva, Montenegro, 10–14 June 2018; pp. 1–6.
- Wang, Y.; Wang, C.; Zhang, H. Combining a single shot Multi-box detector with transfer learning for ship detection using sentinel-1 SAR images. Remote Sens. Lett. 2018, 9, 780–788.
- Deshpande, A. A Beginner’s Guide to Understanding Convolutional Neural Networks. Retrieved March 2017; Volume 31. Available online: https://adeshpande3.github.io/A-Beginner’s-Guide-To-Understanding-Convolutional-Neural-Networks/ (accessed on 27 March 2023).
- Dorafshan, S.; Thomas, R.J.; Maguire, M. Comparison of deep convolutional neural networks and edge detectors for image-based crack detection in concrete. Constr. Build. Mater. 2018, 186, 1031–1045.
- Spencer, B.F., Jr.; Hoskere, V.; Narazaki, Y. Advances in computer vision-based civil infrastructure inspection and monitoring. Engineering 2019, 5, 199–222.
- Dung, C.V. Autonomous concrete crack detection using deep fully convolutional neural network. Autom. Constr. 2019, 99, 52–58.
- Fang, W.; Love, P.E.; Luo, H.; Ding, L. Computer vision for behaviour-based safety in construction: A review and future directions. Adv. Eng. Inform. 2020, 43, 100980.
- Li, X.; Chi, H.L.; Lu, W.; Xue, F.; Zeng, J.; Li, C.Z. Federated transfer learning enabled smart work packaging for preserving personal image information of construction worker. Autom. Constr. 2021, 128, 103738.
- Del Savio, A.A.; Luna, A.; Cárdenas-Salas, D.; Vergara Olivera, M.; Urday Ibarra, G. The use of artificial intelligence to identify objects in a construction site. In Proceedings of the International Conference on Artificial Intelligence and Energy System (ICAIES) in Virtual Mode, Jaipur, India, 12–13 June 2021.
- Zhao, H.; Zhang, C. An online-learning-based evolutionary many-objective algorithm. Inf. Sci. 2020, 509, 1–21.
- Dulebenets, M.A. An Adaptive Polyploid Memetic Algorithm for scheduling trucks at a cross-docking terminal. Inf. Sci. 2021, 565, 390–421.
- Pasha, J.; Nwodu, A.L.; Fathollahi-Fard, A.M.; Tian, G.; Li, Z.; Wang, H.; Dulebenets, M.A. Exact and metaheuristic algorithms for the vehicle routing problem with a factory-in-a-box in multi-objective settings. Adv. Eng. Inform. 2022, 52, 101623.
- Dulebenets, M.A. A novel memetic algorithm with a deterministic parameter control for efficient berth scheduling at marine container terminals. Marit. Bus. Rev. 2017, 2, 302–330.
- Rabbani, M.; Oladzad-Abbasabady, N.; Akbarian-Saravi, N. Ambulance routing in disaster response considering variable patient condition: NSGA-II and MOPSO algorithms. J. Ind. Manag. Optim. 2022, 18, 1035.
- Kavoosi, M.; Dulebenets, M.A.; Abioye, O.F.; Pasha, J.; Wang, H.; Chi, H. An augmented self-adaptive parameter control in evolutionary computation: A case study for the berth scheduling problem. Adv. Eng. Inform. 2019, 42, 100972.
- Tan, M.; Pang, R.; Le, Q.V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790.
Method | FPS | Boxes | mAP |
---|---|---|---|
Faster R-CNN | 7 | 6000 | 73.2 |
Fast YOLO | 155 | 98 | 52.7 |
SSD300 | 29 | 8732 | 74.3 |
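
The "Boxes" column can be reproduced from the published network configurations. The following sketch assumes the standard SSD300 layout (six feature maps of 38, 19, 10, 5, 3, and 1 cells per side with 4, 6, 6, 6, 4, and 4 default boxes per cell) and the original YOLO layout (a 7 × 7 grid with 2 boxes per cell); the variable and function names are chosen for illustration.

```python
from typing import List, Tuple

# (cells per side of the square feature map, default boxes per cell)
SSD300_LAYERS: List[Tuple[int, int]] = [
    (38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4),
]
YOLO_GRID: List[Tuple[int, int]] = [(7, 2)]


def count_boxes(layers: List[Tuple[int, int]]) -> int:
    """Total number of boxes predicted over all square feature maps."""
    return sum(side * side * boxes_per_cell for side, boxes_per_cell in layers)


ssd300_boxes = count_boxes(SSD300_LAYERS)
yolo_boxes = count_boxes(YOLO_GRID)
print(ssd300_boxes, yolo_boxes)  # 8732 98
```

Most of the SSD300 boxes come from the 38 × 38 feature map, which is the layer responsible for the smallest objects.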
Author (Year) | Abstract |
---|---|
Dorafshan, S., Thomas, R. J., and Maguire, M. (2018) [40] | Compares the performance of deep convolutional neural networks and edge detection algorithms for image-based crack detection in concrete, finding that the neural network approach outperforms traditional edge detection methods. |
Spencer Jr, B. F., Hoskere, V. and Narazaki, Y. (2019) [41] | Reviews recent advances in computer vision-based civil infrastructure inspection and monitoring techniques, including object detection, semantic segmentation, and deep learning methods, and highlights their benefits and challenges. |
Dung, C. V. (2019) [42] | Proposes an autonomous system for concrete crack detection using a deep, fully convolutional neural network, achieving high accuracy and efficiency compared to traditional manual inspection methods. |
Fang, W., et al. (2020) [43] | A review and discussion of future directions of computer vision for behavior-based safety in construction. |
Li, Y., Lu, Y. and Chen, J. (2021) [25] | A deep learning approach based on the YOLOv3 detector is proposed for real-time rebar counting on construction sites, which can effectively improve construction efficiency and safety. |
Chou, J. S. and Liu, C. H. (2021) [24] | An automated system for recognizing trucks in real-time in river dredging areas using computer vision and deep learning. |
Li, X., Chi, H., Lu, W., Xue, F., Zeng, J., and Li, C. Z. (2021) [44] | An intelligent work packaging system that preserves construction workers’ personal image information using federated transfer learning. |
Del Savio, A. A., et al. (2021) [45] | Describes how artificial intelligence (AI) and computer vision are used to identify objects and equipment on a construction site and how they can improve safety and efficiency. |
Lin, C.-L., et al. (2022) [22] | Presents a gait-based pedestrian automatic detection and recognition system using a deep learning neural network. |
Greeshma, A. S. and Edayadiyil, J. B. (2022) [10] | An automated system that uses machine learning and image processing to monitor construction project progress. |
Del Savio, A., Luna, A., Cárdenas-Salas, D., Vergara, M., and Urday, G. (2022) [11] | A manually classified dataset of construction site images containing 1046 images of eight object classes that can be used to develop computer vision techniques in the engineering and construction fields. |
Yeşilmen, S. and Tatar, B. (2022) [16] | The efficiency of using convolutional neural networks (CNN) for image classification in monitoring construction-related activities, with a case study on aggregate mining for concrete production. |
 | Predicted Positive | Predicted Negative |
---|---|---|
Actual Positive | TP = 30 | FN = 18 |
Actual Negative | FP = 3 | TN = 10 |
Indicators | Value |
---|---|
F1 Measure | 64% |
Overall Accuracy | 66% |
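
The overall accuracy reported above follows directly from the confusion-matrix counts (TP = 30, FN = 18, FP = 3, TN = 10). A minimal sketch of that calculation, with a function name chosen for illustration; it covers overall accuracy only:

```python
def overall_accuracy(tp: int, fn: int, fp: int, tn: int) -> float:
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + fn + fp + tn)


acc = overall_accuracy(tp=30, fn=18, fp=3, tn=10)
print(f"{acc:.0%}")  # 66%
```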
[Image table: original images (Originals 1, Originals 2) and their detection outcomes (Outcomes 1, Outcomes 2) for Image Data Forms 1 and 2]
Class | mAP | Recall (Threshold = 0.5) | Precision (Threshold = 0.5) | F1-Score (Threshold = 0.5) |
---|---|---|---|---|
Rebar | 0.29 | 0.09 | 1.00 | 0.17 |
Worker | 0.53 | 0.37 | 0.86 | 0.52 |
Machine | 0.69 | 0.62 | 0.95 | 0.75 |
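
The F1-scores in this table are consistent with the usual definition of F1 as the harmonic mean of precision and recall. The sketch below recomputes them from the tabulated precision and recall values; the function and variable names are chosen for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per class, copied from the table above.
reported = {
    "Rebar": (1.00, 0.09, 0.17),
    "Worker": (0.86, 0.37, 0.52),
    "Machine": (0.95, 0.62, 0.75),
}

for name, (precision, recall, reported_f1) in reported.items():
    computed = round(f1_score(precision, recall), 2)
    print(f"{name}: computed {computed:.2f}, reported {reported_f1:.2f}")
```

The rebar row shows why F1 is reported alongside precision: a precision of 1.00 with a recall of 0.09 means that almost every rebar instance is missed, and the harmonic mean falls to 0.17.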
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lung, L.-W.; Wang, Y.-R. Applying Deep Learning and Single Shot Detection in Construction Site Image Recognition. Buildings 2023, 13, 1074. https://doi.org/10.3390/buildings13041074