Article

Bolt Positioning Detection Based on Improved YOLOv5 for Bridge Structural Health Monitoring

Faculty of Engineering, China University of Geosciences, Wuhan 430074, China
* Author to whom correspondence should be addressed.
Sensors 2023, 23(1), 396; https://doi.org/10.3390/s23010396
Submission received: 21 November 2022 / Revised: 26 December 2022 / Accepted: 27 December 2022 / Published: 30 December 2022

Abstract

To improve the stability of bridge structures, we detect the bolts whose failure causes the loss of symmetry about the bridge center. In the acquired images, bolts are small-scale objects against complex backgrounds, and their feature expression ability is limited. To address these problems, we propose a new bolt positioning detection method based on an improved YOLOv5 for bridge structural health monitoring. This paper makes three major contributions. First, according to the calibrated anchor boxes of bolts, the size and proportion parameters of the initial anchor boxes are optimized by the K-means++ clustering algorithm to solve the initial clustering problem of anchor boxes in object detection. Second, the hypercolumn (HC) technique fuses the low-level global features of the trunk with the high-level local features of three different scales to solve the problems of inefficient anchor distribution and insufficient extraction of classification features. In this way, we improve both the accuracy and speed of bolt detection. Finally, we establish a dataset of 1494 bridge bolt images from network collection and public datasets, on which we compare and verify the new method. The experimental results show that the precision (P) of the improved YOLOv5x reaches 87.3% and the average precision (AP) reaches 86.3%, which are 6.5% and 5.9% higher than those of the original YOLOv5x, respectively.

1. Introduction

The sudden loss of symmetry stability of the bridge centerline affects the structural health of the bridge and can cause accidents [1]. As main components commonly used in bridges, bolts can effectively reduce distortion and fatigue stresses and avoid the failure of bridge structures [2]. Missing and loose bolts caused by long-term exposure to harsh environments affect the overall structural health of the bridge and can further cause accidents [3]. Bolt detection usually relies on manual inspection. This method involves a heavy workload, low efficiency, and large errors, and it places high demands on the energy and experience of inspection personnel [4]. It not only consumes considerable manpower and material resources but also risks missing bolts entirely. Therefore, to maintain bridge structural health, it is necessary to develop a real-time, accurate detection method for bridge bolts.
Structural health monitoring based on machine learning shows a good ability to detect defects and damage in engineering structures [5]. As a branch of machine learning, computer vision has become an important approach and development direction for structural health monitoring [6]. Some studies have identified loose bolt detection based on computer vision as an important research direction in structural health monitoring [7], since bolt loosening causes repeated loading and vibration of steel structures in bridges. Computer vision object detection is used to locate bolts and realize real-time damage monitoring, which plays an important role in ensuring the structural health and stability of bridges, vehicles, and dams [8]. The connection status of bolts has an important impact on the safety and reliability of the whole bridge structure [9], so it is necessary to strengthen the application of intelligent technology to detect bolt status in bridge structural health monitoring.
Object detection based on computer vision is a process in which the computer extracts the color, geometry, corner, texture, and other attribute features of the target objects, quickly locates the relevant information in the image, extracts and analyzes that information, and finally understands the target object. Existing object detection algorithms are divided into traditional methods and deep learning-based methods. Traditional object detection methods obtain the features of detected objects in images by manually designing feature descriptors; the features are encoded and passed to machine learning classifiers for detection [10,11,12]. The choice of descriptor for the detected object's features affects the detection accuracy, and the generalization ability of traditional methods needs to be improved. Object detection methods based on deep learning are mainly divided into regression/classification-based frameworks and region proposal-based frameworks [13]. The region proposal-based framework usually includes a multi-stage process of region proposal, region classification, and region position adjustment. It offers high detection accuracy and shared calculation parameters but suffers from long training times and slow detection speed. The regression/classification-based framework directly predicts the class probabilities and region coordinates of the objects in a single stage. It is fast in detection and good at learning object features; this category mainly includes the YOLO series [14,15,16,17] and SSD [18]. Regression/classification-based frameworks are more suitable for real-time detection, but their accuracy on small-scale objects needs to be improved.
The YOLO series of object detection networks is widely used in object detection for bridge structural health monitoring because of its high precision and fast speed [19]. Deng J et al. [20] detected concrete cracks with the YOLOv2 object detector, effectively assessing the robustness of bridge structures; compared with Faster-RCNN, the YOLOv2 detector showed better results in both accuracy and detection speed. Teng S et al. [21] used an improved YOLOv3 to detect surface defects (cracks and exposed rebar) in bridge structural health monitoring; the improved YOLOv3 performed best in terms of detection accuracy and speed. Han Q et al. [22] proposed using YOLOv3 for panoramic crack location monitoring of common steel structure connections, and YOLOv3 showed good performance in both running speed and small-scale object detection. At present, bridge structural health monitoring mostly relies on object detection of cracks in concrete structures. Hou R et al. [23] innovatively proposed using YOLO networks to locate trucks in real time to trigger the operation of a bridge structural health monitoring system. Bolts are related to the nature of the bridge structure and are one of the key points of bridge structural health monitoring [24]. A bridge structural health monitoring system triggered by real-time bolt positioning detection needs further research.
Bolts in real scenes are small-scale objects for computer vision object detection, and object detection combined with the geometric and texture features of the bolt has become a current research direction. In terms of feature description and classifier training, Ramana L et al. [25] used histograms of oriented gradients (HOG) features and two classifiers to conduct classification training for small-scale bolts. Wang C et al. [26] converted color images into gray images to improve detection accuracy, positioned bolts by convolutional neural network digit recognition, and detected bolt rotation angles by Hough transform line detection. Dou Y et al. [27] used template matching and geometric structure constraints to locate railway bolts. Park J et al. [28] combined the Canny edge detector, the Hough transform (HT), and the circular Hough transform (CHT) to detect nuts. In the region proposal-based framework, Huynh TC et al. [29] first converted the color image to gray level and then generated candidate areas based on a regional convolutional neural network (RCNN) to detect and locate bolts, which enhanced the accuracy of bolt detection. For real-time positioning detection of bolts, Sun Y et al. [30] marked bolts and trained YOLOv5 to locate and detect the marking circle on each bolt; although the detection accuracy is very high, manual labeling of each bolt is required. Zhao K et al. [31] proposed an improved YOLOv3 algorithm that realizes high-precision detection of bolts through data augmentation, improved clustering algorithms, and feature fusion.
Existing object detection methods based on computer vision realize bolt detection through the extraction or constraint of bolt geometric features. Although the above methods achieve high accuracy, most of them still need manual intervention. Bolt detection shows good results in regression/classification-based frameworks, which can adapt to current intelligent real-time detection requirements, but they still lack a high-precision positioning method for bridge bolts.
In this paper, the regression/classification-based framework of YOLOv5 is used to detect bolts on a bridge. The main difficulty of this study is that bolts, as small-scale objects, make it difficult to extract features and to distribute anchors, resulting in insufficient detection accuracy and wasted computing power. The contributions of this study are as follows:
(1) According to the calibration anchor boxes of bridge bolt images, we propose a method based on the K-means++ clustering algorithm to optimize the initial anchor box size and proportion parameters. This method solves the initial anchor boxes clustering problem in new datasets.
(2) To solve the problems of inefficient anchor distribution and insufficient extraction of classification features, we use the hypercolumn technique to combine the low-level global features of the trunk with the high-level local features of three different scales and use a stairstep upsample structure to generate a single-scale output. This method enhances the detection accuracy and speeds up detection.

2. Improved YOLOv5 Network

There are four versions of the YOLOv5 network framework, namely YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x, which differ in network depth and width. YOLOv5s is the fastest and smallest version, but its detection performance is also the lowest. YOLOv5x has the best detection performance, the largest depth and width, and the slowest detection speed. To achieve efficient detection of bolts on bridges, we propose an improved YOLOv5x. The method first optimizes the anchor box parameters through the K-means++ clustering algorithm to improve the accuracy of object detection and localization and then creates a single-output prediction structure to aggregate the detection features of the various scales. The entire architecture is shown in Figure 1.
As shown in Figure 1, the architecture difference between YOLOv5 and the improved YOLOv5 is the head. The redundant structure of the YOLOv5 head is replaced by a lightweight block with HC, which is upsampled with a stairstep structure. At the same time, the improved head uses a single output instead of three to provide higher resolution and speed.

2.1. Optimization of Anchor

YOLOv5x applies the K-means clustering algorithm for adaptive anchor boxes, and the initial nine preset anchor box parameters were generated on the COCO dataset. The nine initial anchor boxes in YOLOv5x are divided into three groups: P3/8 holds the initial anchor boxes of the large-scale output layer, P4/16 those of the medium-scale output layer, and P5/32 those of the small-scale output layer. YOLOv5x sets the maximum threshold (thr) of the width/height ratio of the anchor boxes in the dataset to 4.0. During training, YOLOv5x clusters the labeled anchor boxes by the K-means clustering algorithm under thr = 4.0 and updates the anchor boxes for the new dataset. K-means is an unsupervised algorithm that clusters the data into the nine specified categories; before clustering, nine cluster centers are randomly initialized, so the result depends heavily on the initial selection of the cluster centers. In this study, the K-means++ clustering algorithm was adopted to cluster the labeled anchor boxes. It modifies the initial values of the anchor boxes and the thr of the aspect ratio of the labeled anchor boxes, improving the accuracy and effect of detection to a certain extent.
The K-means clustering algorithm is an unsupervised classification algorithm that divides a dataset into K categories. The algorithm takes k points as cluster centers, assigns the data to them, and iteratively updates each center to obtain the optimal classification. K-means selects the initial cluster centers randomly, and different selections affect the clustering result. The K-means++ clustering algorithm is an improvement of K-means: when initializing the centers, it keeps the centers of different classes as far apart as possible. This effectively avoids the problem that a random selection of centers converges only to a local optimum. The steps of the K-means++ clustering algorithm are shown in Figure 2.
The dataset includes the public dataset and network image collection. In order to obtain the anchor boxes suitable for the dataset, this study set the clustering center k as 9 for the clustering test. The clustering results are shown in Figure 3.
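As a concrete illustration, the clustering step can be sketched in a few lines of numpy. This is a minimal sketch, not the authors' implementation: the function name and the exact seeding and refinement details are our own, and we fold in the thr = 4.0 aspect-ratio filter described in Section 2.1 before clustering.

```python
import numpy as np

def kmeans_pp_anchors(boxes, k=9, iters=100, thr=4.0, seed=0):
    """Cluster labeled (width, height) boxes into k anchor shapes.

    K-means++ seeding: each new center is sampled with probability
    proportional to its squared distance from the nearest center
    already chosen, so initial centers spread out over the data.
    """
    boxes = np.asarray(boxes, dtype=float)
    # Drop boxes whose aspect ratio exceeds the thr = 4.0 limit.
    boxes = boxes[boxes.max(1) / boxes.min(1) <= thr]
    rng = np.random.default_rng(seed)
    centers = boxes[rng.integers(len(boxes))][None, :]
    while len(centers) < k:
        # Squared distance of every box to its nearest chosen center.
        d2 = ((boxes[:, None, :] - centers[None, :, :]) ** 2).sum(-1).min(1)
        centers = np.vstack(
            [centers, boxes[rng.choice(len(boxes), p=d2 / d2.sum())]])
    # Standard Lloyd iterations refine the seeded centers.
    for _ in range(iters):
        assign = ((boxes[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = boxes[assign == j].mean(0)
    # Sort by area so the nine anchors group naturally by scale.
    return centers[np.argsort(centers.prod(1))]
```

Sorting the nine centers by area mirrors how YOLOv5 splits its anchors into the three output-layer groups.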

2.2. Multi-Scale Feature Fusion

YOLOv5x trains based on the anchor boxes and detects bolts in three output layers. The initial nine anchor boxes are split into triplets and assigned to the large, medium, and small anchor boxes detected in the output layers. When detection objects are evenly distributed across large, medium, and small scales, this scheme shows good detection ability. However, for small-scale objects such as bolts, the object expression ability is weak. Feeding a small-scale object's feature map into an output layer of an improper scale leads to inefficient allocation of anchors and wasted computing power: a small-scale object mistakenly routed to the wrong scale output layer not only yields low-precision detection but also greatly reduces the utilization of the other two output layers. An example of output-layer feature map visualization for the three scales is shown in Figure 4.
In the feature extraction process of a convolutional neural network, as the number of layers increases, the high-level features describe more semantic information about the objects. The YOLOv5 framework uses upsampling to reduce, to a certain extent, the loss of global features across the network layers. In the prediction stage, directly applying non-maximum suppression (NMS) is not good for the expression of small-scale object detection. In actual scenes, bolts are small-scale objects. Fusing high-level features with low-level features can express object features more effectively and improve detection accuracy.
To solve the above problems, we propose a network structure with a single output layer that integrates multiple scales. The hypercolumn (HC) [32] is a technique that stacks the activation values of all convolutional network units above a given input pixel position into a column vector. The HC classifier takes an anchor box and resizes it to the fixed input size of the corresponding network; in this way, feature maps at different scales are extracted. Bilinear interpolation is used to resize the feature maps, which are then stitched together to obtain a matrix in which each vector fuses all the information from a pixel. Finally, a sigmoid is used to classify the feature data and obtain the corresponding object prediction result for each pixel. Considering that bolts are small-scale objects and that background information can enhance the detection of small-scale objects, we fuse low-level global features with high-level local features to enhance the detection accuracy. The overall structure based on HC classifiers is shown in Figure 5.
In terms of formula calculation, the output feature map using the HC is shown in Equation (1).
F = g\left(F_1,\; k\!\left[F_2,\, 2^{1}\right],\; k\!\left[F_3,\, 2^{2}\right],\; \ldots,\; k\!\left[F_n,\, 2^{n-1}\right]\right)
where F is the feature map obtained by fusion, F_n is the feature map at scale n, k[·, 2^{n−1}] is a function that scales a feature map to a uniform size by bilinear interpolation, 2^{n−1} is the scaling factor, and g(·) is the function that concatenates the feature maps. It can be seen from Equation (1) that the large scaling factors lead to imbalance in the fusion process. Therefore, we propose a new HC classifier structure with stairsteps. We add the feature map of the previous scale into the fusion process at each step so as to avoid unbalanced fusion of the features. The comparison of the two HC classifier structures is shown in Figure 6.
In terms of formula calculation, the new HC with stairsteps is shown in Equation (2).
F = k\left[\,k\left[F_n,\, 2\right] + F_{n-1},\; 2\,\right]
where F is the new feature map obtained by the HC with stairsteps, applied recursively across adjacent scales. This structure not only solves the problem of inefficient anchor distribution but also compensates for the insufficient high-level local features of small targets through fusion with the low-level global features.
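The two fusion schemes of Equations (1) and (2) can be sketched in numpy as follows. This is an illustrative sketch under two simplifying assumptions: nearest-neighbor upsampling stands in for the bilinear interpolation k[·, 2] used in the paper, and all levels are assumed to share one channel width, as a 1×1 convolution would ensure inside the real network.

```python
import numpy as np

def up2(x):
    """Double a (H, W, C) map's resolution (nearest neighbor stands in
    for the bilinear interpolation k[., 2] of the paper)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def hc_flat(features):
    """Eq. (1): scale every level up to the largest map, then concatenate.
    features[0] is the shallowest (largest) map, features[-1] the deepest."""
    target = features[0].shape[0]
    resized = []
    for f in features:
        while f.shape[0] < target:
            f = up2(f)
        resized.append(f)
    return np.concatenate(resized, axis=-1)

def hc_stairstep(features):
    """Eq. (2): starting from the deepest map, repeatedly upsample by 2
    and add the next-shallower level, so no single large scaling factor
    dominates the fusion."""
    fused = features[-1]
    for prev in reversed(features[:-1]):
        fused = up2(fused) + prev
    return fused
```

With three levels the stairstep form expands to k[k[F3, 2] + F2, 2] + F1, matching the nested structure of Equation (2).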

3. Experiment

3.1. Dataset

A dataset of 1494 bolt images was constructed, partly from the public dataset NPU-BOLT [33] (337 images) and partly from web collection (1157 images). To enhance the robustness of network detection, the bolt backgrounds in the dataset include whiteboards and bridges, and the images cover multiple angles and resolutions. We then used the original Mosaic data augmentation method in YOLOv5, which randomly crops and scales four images and then randomly arranges and stitches them into one image. This augmentation not only enriches the dataset but also adds small-scale objects and improves the training speed of the network. We selected 1045 (70%) images as the training set, 299 (20%) as the validation set, and 150 (10%) as the test set. The LabelImg annotation tool was used to label the bolt positions in the images, generating the corresponding txt files for training. When constructing the samples, the length of each captured image is resized to 640 pixels, and the width is adjusted accordingly to maintain the original proportions of the image.
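The resizing and Mosaic steps can be sketched as follows. This is a hedged illustration: we assume the longer image side is the one scaled to 640 pixels, and the mosaic4 helper omits the label-box remapping and random scaling that YOLOv5's actual Mosaic augmentation performs.

```python
import numpy as np

def resize_dims(w, h, target=640):
    """Scale so the longer side becomes `target`, keeping the aspect ratio."""
    s = target / max(w, h)
    return round(w * s), round(h * s)

def mosaic4(images, size=640, seed=0):
    """Minimal Mosaic sketch: place four images around a random center
    point on a (size, size) canvas, cropping whatever overflows."""
    rng = np.random.default_rng(seed)
    cx = int(rng.integers(size // 4, 3 * size // 4))
    cy = int(rng.integers(size // 4, 3 * size // 4))
    canvas = np.full((size, size, 3), 114, dtype=np.uint8)  # gray padding
    corners = [(0, 0, cy, cx), (0, cx, cy, size),
               (cy, 0, size, cx), (cy, cx, size, size)]
    for img, (y0, x0, y1, x1) in zip(images, corners):
        h, w = y1 - y0, x1 - x0
        canvas[y0:y1, x0:x1] = img[:h, :w]  # crop each image to its cell
    return canvas
```

For example, a 1920 × 1080 photo maps to 640 × 360 under resize_dims, preserving its 16:9 proportion.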

3.2. Details

The experimental environment is as follows: Windows 11 operating system, AMD R7-5800H central processing unit (CPU), Nvidia GeForce RTX 3060LP graphics processor (GPU), the PyTorch 1.10.0+cu113 framework, and the Python 3.6 programming language.
The training parameters are as follows: 100 training epochs; an initial learning rate of 0.01; SGD as the optimizer for updating parameters; a batch size of 8; and an image size of 640 × 640.
Average precision (AP), precision (P), and frames per second (FPS) are selected as evaluation indexes for the model. FPS, the number of images the network detects per second, measures the speed of object detection: the shorter the detection time per image, the higher the FPS, and the better the network meets the real-time detection requirements of intelligent industry. The specific calculation formulas for P and AP are as follows:
P = \frac{TP}{TP + FP}
R = \frac{TP}{TP + FN}
AP = \int_0^1 P(R)\, dR
where TP is the number of object samples correctly identified as objects, FP is the number of non-object samples incorrectly identified as objects, FN is the number of object samples incorrectly identified as non-objects (missed detections), and TN is the number of non-object samples correctly identified as non-objects. P is the precision and R is the recall: R measures the completeness of the results, and P measures their accuracy. AP is the area under the P-R curve and is an important index for evaluating an object detection network; a higher AP indicates better detection performance.
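These three definitions can be written down directly; the sketch below uses our own function names, and since the paper does not specify how the integral in AP is discretized, trapezoidal integration over (recall, precision) points is one common choice.

```python
def precision(tp, fp):
    """P = TP / (TP + FP): fraction of detections that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """R = TP / (TP + FN): fraction of ground-truth objects found."""
    return tp / (tp + fn)

def average_precision(pr_points):
    """Area under the P-R curve via trapezoidal integration over
    (recall, precision) points, sorted by recall."""
    pts = sorted(pr_points)
    return sum((r1 - r0) * (p0 + p1) / 2
               for (r0, p0), (r1, p1) in zip(pts, pts[1:]))
```

A detector that keeps precision 1.0 at every recall level scores AP = 1.0, while one whose precision decays linearly from 1.0 to 0.0 scores AP = 0.5.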

3.3. Experimental Results

Among the four versions of YOLOv5, YOLOv5x has the highest detection accuracy and YOLOv5s the fastest detection speed. Therefore, we applied the improved method to both the YOLOv5x and YOLOv5s models. The experimental results are shown in Table 1.
As can be seen from Table 1, the AP and P of the improved YOLOv5x network reach 86.3% and 87.3%, respectively, and the detection speed is also improved. AP and P are highest for the improved YOLOv5x, and FPS is highest for the improved YOLOv5s. Compared with the original YOLOv5x, AP increased by 5.9% and P by 6.5%, which proves the effectiveness of the proposed method in improving detection accuracy. Compared with the improved YOLOv5s, the improved YOLOv5x has higher detection accuracy and lower detection speed, but its FPS of 26.46 still meets the requirements for real-time detection. Considering both detection accuracy and speed, the improved YOLOv5x is superior to the other models.
To verify the accuracy and real-time performance of our method, Figure 7 shows the variation in precision across training epochs. A sufficient number of epochs ensures convergence of the whole training process and enables the model to achieve its best performance under a given parameter configuration. As can be seen from Figure 7, both methods converge: the improved YOLOv5 converges after about 30 epochs, while YOLOv5 requires about 60 epochs. Moreover, the P of the improved YOLOv5 is always greater than that of the original YOLOv5. Therefore, the proposed method can effectively improve both the precision and speed of bolt detection.
In order to more intuitively verify the validity of our method, we selected several representative images and compared the detection results with the original YOLOv5.
Figure 8 shows a visual comparison of detection results. The first row (a) shows the detection results of YOLOv5x, and the second row (b) those of the improved YOLOv5x. For sample 1 in the first column, the improved YOLOv5x effectively avoids false detection. For sample 2 in the second column, it improves the accuracy of object detection. For sample 3 in the third column, it avoids false detections and improves accuracy in multi-target detection, and it can even detect some small, distant objects. It can be seen that the K-means++ clustering algorithm and the HC with stairsteps improve the effect of bolt detection. In conclusion, the improved YOLOv5 improves the positioning detection accuracy of bolts and reduces the misjudgment of small-scale objects, which verifies the effectiveness of the improvements proposed in this paper.

3.4. Ablation Experiments

In order to evaluate the influence of the two improved methods on the results, ablation comparison experiments were conducted on the models. The experimental results are shown in Table 2.
It can be seen from Table 2 that the addition of K-means++ clustering algorithm can effectively improve the detection accuracy for bolts, and the addition of the HC with stairsteps can not only effectively improve the detection accuracy but also improve the detection speed.
(1) Comparing the experimental results of YOLOv5x and YOLOv5x + K-means++, after the improvement of the anchor parameters by the K-means++ clustering algorithm, the AP and P of the model increased by 3.7% and 2.7%, respectively, with no significant change in detection speed. The results show that the anchor parameters obtained by the K-means++ clustering algorithm improve the accuracy of bolt detection. Therefore, high AP and P values can be ensured by selecting appropriate anchor initialization parameters.
(2) Comparing the experimental results of YOLOv5x and YOLOv5x + HC with stairsteps, after the addition of the stairstep HC structure, the AP and P of the algorithm improved by 2.9% and 2.1%, respectively, and the detection speed also improved. The results show that this change to the network structure effectively realizes the detection of small-scale objects and improves the detection accuracy. The HC with stairsteps not only fuses high-level local features with low-level global features, solving the unbalanced distribution of the small-scale bolt anchor boxes across the output scales, but also avoids the waste of computing power in the three-scale prediction branches.
(3) Comparing the experimental results of YOLOv5x and the improved YOLOv5x, after combining the two improvements, the AP and P of the algorithm improved by 5.9% and 6.5%, respectively, and the detection speed also improved. The results show that the combination of the two improvements raises the overall detection accuracy of the model while also improving its detection speed compared with the original model.

4. Conclusions

In this paper, an improved YOLOv5x model, a regression/classification-based framework, is proposed for bolt positioning detection. First, the improved YOLOv5x adopts the K-means++ clustering algorithm to learn the proportions and distribution of the anchor boxes from the new dataset. Second, it fuses the low-level global features of the trunk with the high-level local features of three different scales using the HC technique, which solves the problem of inefficient anchor box distribution and enhances the feature extraction of small-scale objects. Finally, we built a multi-scenario bolt dataset from public datasets and network data acquisition and evaluated the improved YOLOv5x in comparison experiments. The results show that our improved YOLOv5x is effective: the final AP is 86.3%, P is 87.3%, and the FPS of 26.46 meets real-time detection requirements.
At the same time, this paper has some limitations. Bolt positioning detection is only the first step in bridge structural health monitoring; identifying bolt loosening and wear is the next direction of our research. To monitor the health of bridge structures in real time, we will study the relative position of bolts and nuts, the calculation of bolt rotation angles, bolt material judgment, and related topics.

Author Contributions

Conceptualization, D.W. and W.C.; methodology, D.W.; formal analysis, M.Z.; investigation, D.S.; resources, D.S.; data curation, M.Z.; writing—original draft preparation, D.W.; writing—review and editing, W.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The study used openly available data.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Plaut, R.H.; Davis, F.M. Sudden lateral asymmetry and torsional oscillations of section models of suspension bridges. J. Sound Vib. 2007, 307, 894–905.
  2. Wang, Y.; Wang, C.; Duan, L. Bonding and bolting angle reinforcement for distortion-induced fatigue in steel girder bridges. Thin-Walled Struct. 2021, 166, 108027.
  3. Wang, L.; Qu, W.L.; Li, Y.F.; Wang, Y.F. Dynamic analysis of power transmission tower collapse with wind load. Adv. Mater. Res. 2014, 838, 494–497.
  4. Chen, Z.; Wang, L.; Fei, Z.; Deng, Y. Weakly Supervised Bolt Detection Model Based on Attention Mechanism. In Proceedings of the International Forum on Digital TV and Wireless Multimedia Communications, Shanghai, China, 3–4 December 2021; Springer: Singapore, 2022; pp. 325–337.
  5. Flah, M.; Nunez, I.; Ben Chaabene, W.; Nehdi, M.L. Machine learning algorithms in civil structural health monitoring: A systematic review. Arch. Comput. Methods Eng. 2021, 28, 2621–2643.
  6. Ye, X.W.; Dong, C.Z.; Liu, T. A review of machine vision-based structural health monitoring: Methodologies and applications. J. Sensors 2016, 2016, 7103039.
  7. Dong, C.Z.; Catbas, F.N. A review of computer vision–based structural health monitoring at local and global levels. Struct. Health Monit. 2021, 20, 692–743.
  8. Zhang, Y.; Sun, X.; Loh, K.J.; Su, W.; Xue, Z.; Zhao, X. Autonomous bolt loosening detection using deep learning. Struct. Health Monit. 2020, 19, 105–122.
  9. Chen, Y.; Xue, X. Advances in the Structural Health Monitoring of Bridges Using Piezoelectric Transducers. Sensors 2018, 18, 4312.
  10. Viola, P.; Jones, M.J. Robust Real-Time Face Detection. Int. J. Comput. Vis. 2004, 57, 137–154.
  11. Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), San Diego, CA, USA, 20–25 June 2005; pp. 886–893.
  12. Felzenszwalb, P.F.; Girshick, R.B.; McAllester, D.; Ramanan, D. Object Detection with Discriminatively Trained Part-Based Models. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32, 1627–1645.
  13. Zhao, Z.Q.; Zheng, P.; Xu, S.T.; Wu, X. Object detection with deep learning: A review. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 3212–3232.
  14. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
  15. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the IEEE Conference on Computer Vision & Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525.
  16. Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
  17. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934.
  18. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot Multibox Detector. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 21–37.
  19. Zhang, G.Q.; Wang, B.; Li, J.; Xu, Y.L. The application of deep learning in bridge health monitoring: A literature review. Adv. Bridge Eng. 2022, 3, 22.
  20. Deng, J.; Lu, Y.; Lee, V.C.S. Imaging-based crack detection on concrete surfaces using You Only Look Once network. Struct. Health Monit. 2021, 20, 484–499.
  21. Teng, S.; Liu, Z.; Li, X. Improved YOLOv3-Based Bridge Surface Defect Detection by Combining High- and Low-Resolution Feature Images. Buildings 2022, 12, 1225.
  22. Han, Q.; Liu, X.; Xu, J. Detection and Location of Steel Structure Surface Cracks Based on Unmanned Aerial Vehicle Images. J. Build. Eng. 2022, 50, 104098.
  23. Hou, R.; Jeong, S.; Lynch, J.P.; Law, K.H. Cyber-physical system architecture for automating the mapping of truck loads to bridge behavior using computer vision in connected highway corridors. Transp. Res. Part C Emerg. Technol. 2020, 111, 547–571.
  24. Sivasuriyan, A.; Vijayan, D.S.; LeemaRose, A.; Revathy, J.; Monicka, S.G.; Adithya, U.R.; Daniel, J.J. Development of Smart Sensing Technology Approaches in Structural Health Monitoring of Bridge Structures. Adv. Mater. Sci. Eng. 2021, 2021, 2615029.
  25. Ramana, L.; Choi, W.; Cha, Y.J. Automated Vision-Based Loosened Bolt Detection Using the Cascade Detector. In Sensors and Instrumentation; Springer: Cham, Switzerland, 2017; Volume 5, pp. 23–28.
  26. Wang, C.; Wang, N.; Ho, S.-C.; Chen, X.; Song, G. Design of a New Vision-Based Method for the Bolts Looseness Detection in Flange Connections. IEEE Trans. Ind. Electron. 2019, 67, 1366–1375.
  27. Dou, Y.; Huang, Y.; Li, Q.; Luo, S. A fast template matching-based algorithm for railway bolts detection. Int. J. Mach. Learn. Cybern. 2014, 5, 835–844. [Google Scholar] [CrossRef]
  28. Park, J.; Kim, T.; Kim, J. Image-Based Bolt-Loosening Detection Technique of Bolt Joint in Steel Bridges. In Proceedings of the 6th International Conference on Advances in Experimental Structural Engineering, 11th International Workshop on Advanced Smart Materials and Smart Structures Technology, Champaign, IL, USA, 1–2 August 2015. [Google Scholar]
  29. Huynh, T.C.; Park, J.H.; Jung, H.J.; Kim, J.T. Quasi-autonomous bolt-loosening detection method using vision-based deep learning and image processing. Autom. Constr. 2019, 105, 102844. [Google Scholar] [CrossRef]
  30. Sun, Y.; Li, M.; Dong, R.; Chen, W.; Jiang, D. Vision-Based Detection of Bolt Loosening Using YOLOv5. Sensors 2022, 22, 5184. [Google Scholar] [CrossRef] [PubMed]
  31. Zhao, K.; Wang, Y.; Zuo, Y.; Zhang, C. Palletizing Robot Positioning Bolt Detection Based on Improved YOLO-V3. J. Intell. Robot. Syst. 2022, 104, 41. [Google Scholar] [CrossRef]
  32. Hariharan, B.; Arbeláez, P.; Girshick, R.; Malik, J. Hypercolumns for object segmentation and fine-grained localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 447–456. [Google Scholar]
  33. YARTINZ. NPU-BOLT. Available online: https://www.kaggle.com/datasets/yartinz/npu-bolt (accessed on 20 November 2022).
Figure 1. A comparison of YOLOv5x and improved YOLOv5x architecture. (a) YOLOv5x; (b) improved YOLOv5x.
Figure 2. K-Means++ clustering algorithm.
Figure 3. K-means++ algorithm clustering results.
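The anchor-clustering step illustrated in Figures 2 and 3 can be sketched in a few lines. The following is a minimal, illustrative NumPy implementation (not the authors' code): K-means++ seeding picks the first center uniformly at random and each subsequent center with probability proportional to the squared distance from the nearest chosen center, then ordinary Lloyd iterations refine the centers. The synthetic (width, height) data standing in for the labeled bolt boxes is an assumption for demonstration only.

```python
import numpy as np

def kmeans_pp_anchors(boxes, k=9, iters=50, seed=0):
    """Cluster (width, height) pairs into k anchor sizes via K-means++."""
    rng = np.random.default_rng(seed)
    boxes = np.asarray(boxes, dtype=float)
    # K-means++ seeding: spread initial centers apart.
    centers = [boxes[rng.integers(len(boxes))]]
    for _ in range(k - 1):
        d2 = np.min(((boxes[:, None, :] - np.array(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(boxes[rng.choice(len(boxes), p=d2 / d2.sum())])
    centers = np.array(centers)
    # Standard Lloyd iterations: assign, then recompute means.
    for _ in range(iters):
        assign = np.argmin(((boxes[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = boxes[assign == j].mean(axis=0)
    return centers[np.argsort(centers.prod(axis=1))]  # sorted by box area

# Synthetic bolt-box dimensions (pixels) standing in for the labeled dataset.
rng = np.random.default_rng(1)
wh = np.abs(np.vstack([rng.normal(s, 3, size=(100, 2)) for s in (16, 32, 64)]))
anchors = kmeans_pp_anchors(wh, k=9)
print(anchors.round(1))
```

The sorted output maps naturally onto YOLOv5's three detection scales: the three smallest anchors to the finest grid, the three largest to the coarsest.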
Figure 4. Visual feature maps of three output layers: (a) original image; (b1) 19×19 feature map; (b2) 38×38 feature map; (b3) 76×76 feature map.
Sensors 23 00396 g004
Figure 5. The overall structure based on HC classifiers.
Figure 6. Structure of HC classifier and HC classifier with stairsteps. (a) HC classifier; (b) HC classifier with stairsteps.
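The core hypercolumn operation behind the HC classifier in Figures 5 and 6 is simple to state: upsample the coarser feature maps to the finest grid and stack all scales along the channel axis, so each spatial position carries both low-level global and high-level local features at once. The sketch below is an illustrative NumPy version with nearest-neighbor upsampling; the channel counts and the use of the three YOLOv5 output scales (19×19, 38×38, 76×76, from Figure 4) are assumptions for demonstration.

```python
import numpy as np

def upsample_nn(fmap, factor):
    """Nearest-neighbor upsampling of a (C, H, W) feature map."""
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)

def hypercolumn(features, base_hw):
    """Stack multi-scale feature maps into one hypercolumn tensor.

    Each coarser map is upsampled to the finest grid `base_hw`, then all
    maps are concatenated along the channel axis.
    """
    base_h, _ = base_hw
    cols = [upsample_nn(f, base_h // f.shape[1]) for f in features]
    return np.concatenate(cols, axis=0)

# Toy stand-ins for the three output scales (channels, H, W).
p3 = np.random.rand(4, 76, 76)   # fine scale
p4 = np.random.rand(8, 38, 38)   # medium scale
p5 = np.random.rand(16, 19, 19)  # coarse scale
hc = hypercolumn([p3, p4, p5], base_hw=(76, 76))
print(hc.shape)  # all scales stacked on the 76×76 grid
```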
Figure 7. Variation in precision during training epochs.
Figure 8. Visual comparison of detection results: row (a), detection results of YOLOv5x; row (b), detection results of the improved YOLOv5x.
Sensors 23 00396 g008
Table 1. Comparison of experimental results.
Model               AP      P       FPS
YOLOv5x             80.4%   80.8%   23.47
Improved YOLOv5x    86.3%   87.3%   26.46
YOLOv5s             76.2%   74.9%   39.37
Improved YOLOv5s    82.6%   82.3%   53.76
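The P and AP metrics reported in Tables 1 and 2 can be computed from confidence-ranked detections as follows. This is a minimal illustrative sketch, not the authors' evaluation code: detections are assumed sorted by descending confidence and flagged 1/0 depending on whether they matched a ground-truth bolt (e.g. at IoU ≥ 0.5); AP is the area under the precision-recall curve after the usual monotone interpolation of precision.

```python
def precision_and_ap(tp_flags, num_gt):
    """Final precision and average precision from ranked detections."""
    tps = fps = 0
    precisions, recalls = [], []
    for t in tp_flags:
        tps += t
        fps += 1 - t
        precisions.append(tps / (tps + fps))
        recalls.append(tps / num_gt)
    # Make precision monotonically non-increasing from right to left.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Integrate precision over recall (all-point interpolation).
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_r)
        prev_r = r
    return precisions[-1], ap

# 4 detections against 3 ground-truth bolts: hit, hit, miss, hit.
p_final, ap = precision_and_ap([1, 1, 0, 1], num_gt=3)
print(round(p_final, 3), round(ap, 3))
```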
Table 2. Comparison results of ablation experiments.
Model                          AP      P       FPS
YOLOv5x                        80.4%   80.8%   23.47
Improved YOLOv5x               86.3%   87.3%   26.46
YOLOv5x + K-means++            84.1%   83.5%   23.87
YOLOv5x + HC with stairsteps   83.3%   82.9%   25.97
Wang, D.; Zhang, M.; Sheng, D.; Chen, W. Bolt Positioning Detection Based on Improved YOLOv5 for Bridge Structural Health Monitoring. Sensors 2023, 23, 396. https://doi.org/10.3390/s23010396