1. Introduction
Transmission lines often need to cross mountains and rivers and are mainly distributed in complex terrain, such as mountainous and hilly regions with harsh climates, which significantly complicates line inspection. Insulators operate for long periods in harsh and complex environments involving strong electric fields, intense sunlight, high temperatures, and mechanical stress; once their degradation reaches a certain level, their insulation performance declines [1,2,3,4]. On high-voltage transmission lines in particular, insulator deterioration directly threatens the safe operation of power systems. To ensure the stable and safe operation of the power grid, defect detection for power insulators has become an essential task in power system surveillance. It is therefore important to study insulator image recognition and defect detection methods [5]. Before deep learning matured, image recognition technology was limited: an early insulator defect detection method used edge-feature extraction to obtain the insulator's shape, fitted the contour with an elliptic equation, and finally determined the missing parts of the insulator by analysis and counting. Another method used an image filtering algorithm to extract the insulator's edges for image recognition. These methods are computationally complex and inefficient. With the development of artificial intelligence technology, computer vision has gradually been applied to detecting defects in transmission line insulators.
Current deep learning-based target detection and recognition algorithms fall broadly into two categories. One is region proposal-based target detection algorithms, with representatives such as the region-based convolutional neural network (R-CNN) [6], Fast R-CNN [7], Faster R-CNN [8], and spatial pyramid pooling networks (SPP-Net) [9]. The other is regression-based target detection and recognition algorithms, such as the single shot multibox detector (SSD) [10] and the YOLOv2 [11], YOLOv3 [12], and YOLOv4 [13] models in the you-only-look-once (YOLO) [14] series.
Wang et al. [15] proposed a transmission line ice thickness identification method combining a MobileNet v3 lightweight feature extraction network with an SSD detection network, achieving an accuracy of 74.5%. Zhao et al. [16] constructed an automatic defect detection model, the automatic visual shape clustering network (AVSCNet), to detect missing bolt parts, with a detection accuracy of up to 87.6%. Davari et al. [17] used Faster R-CNN to detect defects on distribution lines in each frame of UV–visible video, identified corona discharges on the lines by color thresholding, and described fault severity by the ratio of spot area to defect area. Rong et al. [18] applied Faster R-CNN, the Hough transform, and advanced stereovision (SV) to detect vegetation encroachment on power transmission lines, converting two-dimensional (2D) images of vegetation and transmission lines into three-dimensional (3D) height and location results for accurate identification and localization. Feng et al. [19] proposed a YOLOv5-based target detection model for the automatic detection of insulator defects; comparing four versions of the YOLOv5 model, the YOLOv5x model with k-means clustering could effectively identify and locate insulator defects in transmission lines, but its maximum accuracy was only 86.8%. Liu et al. [20] proposed the MTI-YOLO network model, which uses a multi-scale detection head and a multi-scale feature fusion structure to improve detection accuracy, but it only detects defects of common insulators. Wu et al. [21] proposed a CenterNet-based insulator defect detection method that simplified the backbone network and applied an attention mechanism to suppress useless information, improving detection accuracy; however, the detection speed is low, and when two different defect classes share the same centroid, CenterNet can detect only one of them. Qiu et al. [22] proposed an improved YOLOv4-based algorithm for insulator defect detection, in which a GraphCut image enhancement method and image sharpening were used to rebuild the dataset, and a MobileNet lightweight network was fused with the YOLOv4 model structure. Tao et al. [23] proposed a new cascaded convolutional neural network architecture for insulator defect localization and detection, which uses a CNN based on a region proposal network to transform defect detection into a two-level object detection problem. Wang et al. [24] proposed an insulator defect detection method based on an improved ResNeSt and a region proposal network (RPN); a new network based on ResNeSt was first built, and the improved RPN was then added for feature extraction to better detect minor defects on insulators.
Insulators are diverse and complex, and traditional manual inspection is inefficient and prone to missed and false detections. It is therefore especially important to research methods for insulator defect detection. This article focuses on the problem of insulator defects in power systems, applying deep learning-based target detection to insulator defect detection, which is significant for improving the inspection intelligence of power systems.
The YOLO series is a family of deep learning neural network image recognition algorithms. The YOLOv5 algorithm, a representative single-stage detection algorithm, has the advantages of a small code base, a simple pipeline, fast detection speed, and high detection accuracy, making it one of the image recognition algorithms closest to engineering practice. In this paper, the YOLOv5 algorithm is improved for insulator image recognition and defective insulator detection. First, the insulator defect image samples are re-clustered using the k-means clustering algorithm to obtain prior anchor box parameters of different sizes. Second, the normalization-based attention module (NAM) is added to the feature extraction part of the YOLOv5 algorithm; NAM redesigns the channel and spatial attention submodules and uses a contribution factor on the weights to improve the attention mechanism, employing batch normalization scale factors, with the standard deviation indicating weight importance. Third, gnConv recursive gated convolution replaces the standard convolution modules of the neck network in the YOLOv5 model; gnConv models higher-order spatial interactions and offers high performance, scalability, and translation invariance. These improvements enhance the network's feature extraction capability and feature fusion efficiency, improving YOLOv5s detection performance. The proposed method achieves a good balance between accuracy and speed, and its performance fully meets the demands of power inspection for online insulator localization.
2. Related Work
The YOLO algorithm is a target detection algorithm proposed by Joseph Redmon [14], and YOLOv5 builds on the original YOLO detection framework with the most effective optimization strategies of recent years, yielding large improvements in both speed and accuracy. The YOLOv5 algorithm has four versions: YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x, which differ in the depth and width settings of the model. A deeper backbone network produces more feature maps, and deeper networks imply more complexity. Among them, YOLOv5s is the network with the smallest depth and the smallest feature map width; however, a larger model does not necessarily yield better detection accuracy, and the choice must suit the practical application. The YOLO model can generally be divided into four modules: the input, the benchmark backbone network, the neck network, and the head.
As shown in Figure 1, the network structure of YOLOv5 is as follows. The input stage includes mosaic augmentation, adaptive anchor box calculation, and adaptive image scaling, and finally converts the image into a 640 × 640 × 3 tensor fed into the network. The backbone mainly extracts features from the input image: first a slice operation through the conv + BN + SiLU (CBS) layer; then downsampling for feature extraction through the convolution modules; then the C3 (BottleneckCSP) and Conv operations, which produce the feature maps; and finally the spatial pyramid pooling (SPPF) module, which improves accuracy. The neck network usually sits between the backbone network and the head network; it makes better use of the features extracted by the backbone, achieving multi-scale prediction through the feature pyramid network (FPN) and path aggregation network (PAN) structures with their upsampling and downsampling processes [25]. The head output layer mainly uses the previously extracted features to make predictions and produce the target detection results. This article describes an improvement of the YOLOv5s network that increases the accuracy of insulator defect detection.
3. Methodology
3.1. K-Means Algorithm for Re-Clustering Anchor Frames
In general, in anchor-based target detection algorithms, most anchors are designed by hand; for example, the classical SSD and Faster R-CNN models each use nine hand-designed anchors of different sizes and aspect ratios. The disadvantage is that manually designed anchors are not guaranteed to suit different datasets: if the designed anchor sizes differ substantially from the target sizes in the dataset, the model's detection performance suffers. For YOLOv2, Joseph Redmon proposed replacing manual design with k-means clustering, which clusters the bounding boxes of the training set and automatically generates a set of anchors better suited to the dataset, improving detection. The default anchor boxes of the YOLOv5 algorithm are preset for the COCO dataset, whose image sizes and detection targets do not match the insulator defect dataset used here, so we use the k-means clustering method to recalculate anchor boxes that match the labeled boxes of this dataset. The k-means clustering method works mainly by computing the distance (similarity) between samples and grouping nearby samples into the same class (cluster) [26]. The primary steps of the k-means algorithm are as follows:
- Step 1: Initialize K cluster centers (assume K = 2).
- Step 2: Randomly select K samples among all samples as the initial cluster centers, as shown in Figure 2a, where the two black solid dots represent the two randomly initialized cluster centers.
- Step 3: Calculate the (Euclidean) distance of each sample from each cluster center, and assign each sample to the nearest cluster. Different colors distinguish different clusters, as in Figure 2b.
- Step 4: Update the cluster centers: calculate the mean of all samples in each cluster as the new cluster center. As shown in Figure 2c, the two blue solid points have moved to the centers of the corresponding clusters.
- Step 5: Repeat Steps 3 and 4 until the cluster centers stop changing, or change little enough to satisfy the given termination condition. The final clustering result is shown in Figure 2d.
Figure 2.
K-means clustering. (a) Randomly initialize two cluster centers; (b) Calculate the Euclidean distance between the sample and the cluster center. (c) Update the cluster centers. (d) Final clustering result.
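The five steps above can be sketched in a few lines of Python. This is a minimal NumPy illustration (the function name `kmeans` is ours, not the paper's implementation); an empty cluster simply keeps its previous center.

```python
import numpy as np

def kmeans(samples, k, max_iter=100, seed=0):
    """Minimal k-means: Euclidean distance, mean-based center updates."""
    rng = np.random.default_rng(seed)
    # Step 2: randomly pick k samples as the initial cluster centers.
    centers = samples[rng.choice(len(samples), k, replace=False)].astype(float)
    for _ in range(max_iter):
        # Step 3: assign each sample to its nearest center.
        dists = np.linalg.norm(samples[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: recompute each center as the mean of its cluster
        # (an empty cluster keeps its old center).
        new_centers = centers.copy()
        for i in range(k):
            members = samples[labels == i]
            if len(members):
                new_centers[i] = members.mean(axis=0)
        # Step 5: stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

For two well-separated groups of points, the two returned centers converge to the group means regardless of which samples are drawn as initial centers.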
For re-clustering the anchor boxes with the k-means algorithm, a bounding box is usually represented by its top-left and bottom-right vertices, i.e., (x1, y1, x2, y2). When clustering the boxes, we only need their widths and heights as features, first normalized by the image width and height, i.e., w = w_box/W_img and h = h_box/H_img. If we directly use the Euclidean distance as the metric in the standard k-means algorithm, large boxes will generate more error than small boxes in the clustering results. Since we only care about the intersection over union (IOU) between an anchor and a box, and not about box size, the IOU is a more appropriate metric, as shown in Figure 3.
Suppose we have box = (w_b, h_b) and anchor = (w_a, h_a); then we have (1):

IOU(box, anchor) = [min(w_b, w_a) × min(h_b, h_a)] / [w_b h_b + w_a h_a − min(w_b, w_a) × min(h_b, h_a)]
(1)

We do not care about the box's position when calculating the IOU here; we assume all boxes' top-left vertices are at the origin. Obviously, the value of the IOU lies between 0 and 1, and the more similar two boxes are, the larger their IOU value. Since more similar boxes should be "closer", the final metric is shown in (2):

d(box, anchor) = 1 − IOU(box, anchor)
(2)
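Because all boxes are assumed to share the same top-left origin, the IOU metric in (1) and the distance in (2) depend only on widths and heights. They can be computed as follows (illustrative helper functions with names of our choosing, not the paper's code):

```python
def iou_wh(box, anchor):
    """IOU of two origin-aligned boxes, each given as (width, height)."""
    w1, h1 = box
    w2, h2 = anchor
    inter = min(w1, w2) * min(h1, h2)   # overlap area of origin-aligned boxes
    union = w1 * h1 + w2 * h2 - inter   # total covered area
    return inter / union

def kmeans_distance(box, anchor):
    """Distance metric of (2): similar boxes (high IOU) are 'close'."""
    return 1.0 - iou_wh(box, anchor)
```

For example, a 2 × 2 box against a 1 × 1 anchor gives an intersection of 1 and a union of 4, so IOU = 0.25 and distance = 0.75.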
The steps to perform k-means on the boxes are as follows:
- Step 1: Random initialization: select K boxes as the initial anchors.
- Step 2: Using the IOU metric, assign each box to the anchor closest to it.
- Step 3: Calculate the average width and height of all boxes in each cluster and update the anchors.
- Step 4: Repeat Steps 2 and 3 until the anchors no longer change, or the maximum number of iterations is reached.
A new set of anchor boxes was obtained by clustering the labeled boxes of the dataset used in this paper. The new anchor boxes match the sizes of the insulator defects in the dataset more closely, making the actual detection results more consistent with the task requirements.
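Putting the pieces together, the anchor re-clustering of Steps 1–4 can be sketched as below. This is a NumPy illustration under our own helper names (`iou_wh`, `anchor_kmeans`): anchors are updated as mean widths and heights, and an empty cluster keeps its previous anchor.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """Pairwise IOU between (N, 2) box sizes and (K, 2) anchor sizes,
    with all boxes aligned at the origin (only width/height matter)."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def anchor_kmeans(boxes, k, max_iter=100, seed=0):
    """k-means over (w, h) pairs using d = 1 - IOU as the distance metric."""
    rng = np.random.default_rng(seed)
    # Step 1: randomly pick k labeled boxes as the initial anchors.
    anchors = boxes[rng.choice(len(boxes), k, replace=False)].astype(float)
    for _ in range(max_iter):
        # Step 2: assign each box to the anchor with the smallest 1 - IOU.
        labels = (1.0 - iou_wh(boxes, anchors)).argmin(axis=1)
        # Step 3: update each anchor to the mean width/height of its cluster.
        new_anchors = anchors.copy()
        for i in range(k):
            members = boxes[labels == i]
            if len(members):
                new_anchors[i] = members.mean(axis=0)
        # Step 4: stop when the anchors no longer change.
        if np.allclose(new_anchors, anchors):
            break
        anchors = new_anchors
    return anchors
```

Run on a set of labeled boxes split between small and large targets, the returned anchors settle on the mean size of each group.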
3.2. Normalization-Based Attention Mechanism (NAM)
The NAM serves as an efficient and lightweight attention mechanism [27]. Compared with other attention mechanisms, it requires no additional computation or parameters such as fully connected or convolutional layers. It adopts the modular integration of CBAM and redesigns the channel and spatial attention submodules.
For the channel attention submodule, the scale factor of batch normalization (BN) is used to measure the variance of each channel and indicate its importance, as shown in (3):

B_out = BN(B_in) = γ × (B_in − μ_B) / sqrt(σ_B² + ε) + β
(3)

where μ_B and σ_B² are the mean and variance, respectively, of the mini-batch, and γ and β are the trainable affine transformation parameters. The channel attention submodule is shown in Figure 4 and (4):

M_c = sigmoid(W_γ (BN(F_1)))
(4)

where M_c represents the output features, γ_i is the scaling factor of each channel, F_1 is the input feature, and the weights are W_γ = γ_i / Σ_j γ_j, so we can obtain the weight of each channel.
For the spatial attention submodule, the scale factor of BN is applied to the spatial dimension to measure the importance of each pixel, which is called pixel normalization. The corresponding spatial attention submodule is shown in Figure 5 and (5):

M_s = sigmoid(W_λ (BN_s(F_2)))
(5)

where M_s denotes the output features, λ_i is the scaling factor, F_2 is the input feature, and the weights are W_λ = λ_i / Σ_j λ_j. To suppress the less significant weights, we add a regularization term to the loss function, as shown in (6), where x denotes the input, y the output, W the network weights, l(·) the loss function, g(·) the penalty function, and p the factor balancing the penalties on γ and λ:

Loss = Σ_(x,y) l(f(x, W), y) + p Σ g(γ) + p Σ g(λ)
(6)
The effectiveness of the NAM attention mechanism varies with its insertion position, and the depth of the module affects where the attention should be inserted. In this paper, the NAM module is integrated into the YOLOv5s model; through repeated network training, the module was finally inserted into the neck at the upsampling position. The core of this stage is to sample the obtained feature maps with multiple pooling kernels of different sizes for feature extraction. We add the NAM attention mechanism after it, which reduces the weight of less essential features. This approach applies a sparsity penalty to the attention module weights, making their computation more efficient. In this way, the channel and spatial dimensions can be better integrated to recover the channel features and spatial location information of insulator defect images, which not only extracts the essential location information of insulator defects but also identifies the defect class more quickly and accurately, helping to improve detection accuracy.
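The channel-weighting idea of (3) and (4) can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: `gamma` stands for the learned BN scale factors, the BN statistics themselves are omitted, and `nam_channel_attention` is a hypothetical helper name.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nam_channel_attention(x, gamma):
    """Sketch of NAM channel weighting on a (C, H, W) feature map.
    Channels with larger BN scale factors (gamma) vary more and are
    treated as more informative, so they receive larger weights."""
    weights = gamma / gamma.sum()            # w_i = gamma_i / sum_j gamma_j
    gate = sigmoid(x * weights[:, None, None])  # re-weight, then gate
    return x * gate                          # apply the attention map
```

As a design note, because the weights come directly from BN parameters that are trained anyway, this attention adds essentially no extra parameters, which is the "lightweight" property the text describes.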
3.3. Recursive Gated Convolutions (gnConv)
gnConv is an efficient operation for performing higher-order spatial interactions based on gated convolutions and a recursive design [28]. gnConv is built from standard convolutions, linear projections, and element-wise multiplication; the new operation is highly flexible and customizable, has input-adaptive spatial mixing similar to self-attention, and does not add a large amount of computation.
Figure 6 shows the structure of the gnConv recursive gated convolution.
3.3.1. Gated Convolution-Based Input Adaptive Interaction
We seek to perform spatial interactions more efficiently and effectively through some simple convolutional and fully connected layer operations.
The basic operation of this method is the gated convolution (gConv). Let x ∈ R^(HW×C) be the input feature; the output of the gated convolution can be expressed as (7) and (8):

[p_0, q_0] = φ_in(x), p_0, q_0 ∈ R^(HW×C)
(7)

p_1 = f(q_0) ⊙ p_0, y = φ_out(p_1)
(8)

The input features are first linearly projected to give p_0 and q_0; φ_in and φ_out are linear projection layers that perform channel mixing. f is a depthwise convolution, with f(q_0)^(i,c) = Σ_(j∈Ω_i) w^c_(i→j) q_0^(j,c), where Ω_i is the local window centered on position i and w represents the convolution weights of f. The above equations introduce interactions with the neighboring features q_0^(j,c) through multiplication between elements. After the depthwise convolution, the element-wise product with p_0 is taken to obtain p_1, which after another linear projection yields the output y; at this point, the first-order spatial interaction has been extracted.
3.3.2. Higher-Order Interactions for Recursive Gated Convolution
After achieving effective first-order spatial interaction with gConv, a recursive gated convolution, gnConv, is designed to further enhance the capacity of the model by introducing higher-order interactions.
We start by applying a higher-order linear projection φ_in to obtain the features p_0 and q_0, …, q_(n−1), as shown in (9):

[p_0, q_0, q_1, …, q_(n−1)] = φ_in(x), p_0 ∈ R^(HW×C_0), q_k ∈ R^(HW×C_k)
(9)

Then, executing the gated convolution recursively, we can sequentially obtain p_(k+1), as shown in (10):

p_(k+1) = f_k(q_k) ⊙ g_k(p_k) / α, k = 0, 1, …, n − 1
(10)

Here the output is scaled by 1/α for stable training, {f_k} is a set of depthwise convolutional layers, and {g_k} is used for dimensional alignment across the different orders, as shown in (11):

g_k = Identity, k = 0; g_k = Linear(C_(k−1), C_k), 1 ≤ k ≤ n − 1
(11)

As shown in (10), the interaction order of p_k increases by 1 with each recursive step, and the output of the last recursive step, p_n, is input to the projection layer φ_out to obtain the result of gnConv, so gnConv can perform spatial interactions of order n. To ensure that higher-order interactions do not introduce a tremendous computational cost, the channel dimension of each order is set as in (12):

C_k = C / 2^(n−k−1), 0 ≤ k ≤ n − 1
(12)

Equation (12) indicates that higher-order spatial interactions are performed in a coarse-to-fine manner, where lower orders are computed with fewer channels.
gnConv is not designed merely to mimic self-attention. It has the following three advantages: (1) Simplicity and efficiency: the convolution-based implementation avoids the quadratic complexity of self-attention, and progressively increasing the channel width while performing spatial interactions allows the model to achieve higher-order interactions with limited complexity. (2) Scalability: the second-order interaction of self-attention is extended to arbitrary orders, further improving the detection capability of the model. (3) gnConv fully inherits the translation equivariance of standard convolution, introducing a beneficial inductive bias for the defect detection task and avoiding the asymmetry caused by local attention.
Because the neck part of the YOLOv5 model performs further deep feature extraction, we applied the gnConv recursive gated convolution to the neck, replacing the original standard convolutions, and obtained the best results through experiments, reflecting the efficiency of accurate insulator defect recognition.
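The recursive structure of (9)–(12) can be sketched loosely in NumPy. This is an illustrative toy version only, not the HorNet implementation: the projection matrices are random and untrained, a 3 × 3 box filter stands in for the learned depthwise convolutions f_k, the stabilizing 1/α scaling is omitted, the dimension-alignment projection is applied after the element-wise product for brevity, and the names `gnconv` and `dwconv3x3` are ours.

```python
import numpy as np

def dwconv3x3(x):
    """Depthwise 3x3 box filter: a stand-in for the learned depthwise conv f_k."""
    pad = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += pad[:, dy:dy + x.shape[1], dx:dx + x.shape[2]]
    return out / 9.0

def gnconv(x, n=3, seed=0):
    """Simplified n-order recursive gated convolution on a (C, H, W) tensor."""
    rng = np.random.default_rng(seed)
    C, H, W = x.shape
    # Channel widths per order, coarse to fine: C_k = C / 2^(n-k-1), eq. (12).
    dims = [C // 2 ** (n - 1 - k) for k in range(n)]
    # phi_in: one projection producing p_0 and all gating branches q_0..q_{n-1}.
    proj_in = rng.standard_normal((dims[0] + sum(dims), C)) * 0.1
    feats = np.einsum("oc,chw->ohw", proj_in, x)
    p = feats[:dims[0]]                      # p_0
    qs, start = [], dims[0]
    for d in dims:
        qs.append(feats[start:start + d])    # q_0 ... q_{n-1}
        start += d
    for k in range(n):
        # Recursive gating: gate p with the spatially mixed branch q_k.
        p = dwconv3x3(qs[k]) * p
        if k + 1 < n:
            # Align channels to the next (wider) order C_k -> C_{k+1}.
            g = rng.standard_normal((dims[k + 1], dims[k])) * 0.1
            p = np.einsum("oc,chw->ohw", g, p)
    # phi_out: project the highest-order result back to C channels.
    proj_out = rng.standard_normal((C, dims[-1])) * 0.1
    return np.einsum("oc,chw->ohw", proj_out, p)
```

Note how each recursion multiplies in one more spatially mixed branch, so the output is an n-th-order product of input features, while the halved channel widths of the lower orders keep the extra cost small.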
Figure 7 shows the network structure of the improved optimal model.