Research on an Improved Detection Algorithm Based on YOLOv5s for Power Line Self-Exploding Insulators

Hu, Caiping; Min, Shiyu; Liu, Xinyi; Zhou, Xingcai; Zhang, Hangchuan

doi:10.3390/electronics12173675


by Caiping Hu ¹,*, Shiyu Min ¹, Xinyi Liu ², Xingcai Zhou ² and Hangchuan Zhang ¹

¹ Department of Computer Engineering, Jinling Institute of Technology, Nanjing 211169, China

² School of Statistics and Data Science, Nanjing Audit University, Nanjing 211815, China

* Author to whom correspondence should be addressed.

Electronics 2023, 12(17), 3675; https://doi.org/10.3390/electronics12173675

Submission received: 23 July 2023 / Revised: 23 August 2023 / Accepted: 29 August 2023 / Published: 31 August 2023


Abstract:

When inspecting power line insulators for self-explosion defects, traditional algorithms suffer from long detection times, insufficient accuracy, and difficulty detecting defects effectively in complex environments. To address these problems, we adopt YOLOv5s, an advanced one-stage object detection algorithm that offers fast training and excellent detection performance, and apply it to improve the detection precision and classification accuracy of insulator self-explosions. To further enhance YOLOv5s, we introduce a BiFPN (Bidirectional Feature Pyramid Network) module for feature fusion, which learns importance weights for the different input features so that each contributes according to its relevance. To tackle the challenge of detecting small objects in the self-exploding insulator dataset, we incorporate an SPD (space-to-depth convolution) module that focuses on capturing features in small regions and uses stride-1 convolution layers to avoid losing fine-grained information. To address the high similarity between self-exploding and intact insulator regions, we introduce an attention mechanism that concentrates on the defective insulator regions to gather more information about insulator defects. Experimental results validate that all three improvement methods significantly enhance detection precision. The final model achieves improvements of 2.0% in precision, 0.9% in recall, and 1.5% in average detection accuracy (mAP). On the test dataset, the model effectively detects insulators with self-explosion defects.

Keywords:

insulator defect; object detection; YOLOv5s algorithm; BiFPN; SPD; attention mechanism

1. Introduction

Insulators play a crucial role in power lines, providing electrical insulation and mechanical support [1]. However, insulator failure has become one of the main factors affecting the operational stability and line safety of power equipment [2]. In recent years, breakthroughs in AI [3] and deep learning algorithms [4] have provided new possibilities to solve this problem. In particular, convolutional neural networks, with their high accuracy and wide applicability, make insulator defect detection even more feasible [5]. At the same time, the inspection method of UAVs combined [6] with machine vision technology [7] also greatly reduces the pressure of traditional manual operation [8].

In the detection of insulator defects, many scholars have carried out related studies [9]. These studies aim to propose effective methods and algorithms to help automatically detect insulator defects and improve the detection accuracy [10]. These research works involve different techniques and methods, including the traditional image-processing algorithm [11], machine learning method [12], and deep learning technology [13]. The SVM-based method uses PMU to achieve insulator intelligent detection [14]. A modified SSD method to detect defects can be used [15]. There are also intelligent classification algorithms using multi-channel CNN [16] and improved FDM [17]. Many scholars have successfully achieved fault identification using the FineMask RCNN algorithm [18] and have achieved a good improvement in accuracy [19]. They have also used the lightweight network [20] to detect the self-explosion and crack problems of glass insulators [21] and proposed a solution of the lightweight algorithm on the embedded system [22]. In addition, the real-time two-stage FastR-CNN algorithm is used to inspect the power line [23], and the real-time FastR-CNN algorithm is improved, proposing an enhanced CNN backbone network [24] and successfully identifying and judging the insulator defects in the UAV image. Some scholars have achieved the extraction of feature masks and defect detection [25] based on the U-Net segmentation model. In addition, through the optimization of deep learning feature positioning, the level of insulator pollution is judged [17], which provides an important basis for line safety assessment.

In addition, some scholars used the transfer learning method to locate and identify [26] power equipment components, with the average accuracy improved by 1.1%. Some scholars have established the transfer learning method based on mixed samples and proposed the equilibrium loss function to solve the sample imbalance problem [8]. The detection accuracy of insulator defects has reached 75.1%. Some scholars have proposed the insulator defect detection method [2] for multi-scale feature coding and dual attention fusion, whose accuracy has been significantly improved [20]. In addition, we also study and compare the YOLO series algorithms to detect transmission lines in real-time [27] and use embedded systems such as drones to carry the latest algorithms [28] to improve the algorithm’s lightweight aspects, such as lightweight processing on the new network, CenterNet [29].

Although many studies have addressed the insulator defect detection problem to some extent, some challenges remain [30]. For example, original data scenarios are scarce, and it is difficult to achieve both high detection accuracy and a small model size against complex backgrounds [31]. The YOLO series is constantly being optimized and updated, and a large number of versions are being explored, such as YOLOv5n, YOLOv5s, YOLOv5m, YOLOv5l, YOLOv5x, YOLOv6n, YOLOv6t, YOLOv6s, YOLOv6m, YOLOv6l, YOLOv7, and YOLOv7x [32]. Recently, the YOLOv8 architectures were released, with YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, and YOLOv8x available for target detection [33]. These latest algorithms bring considerable improvements in accuracy, speed, and detection performance [34]. However, the model in this paper mainly targets individual users and small devices and scenarios [35], for which a more practical and mature model is needed. We therefore build on the mature YOLOv5 series [36] to improve accuracy and detection performance; because the target is personal small devices and scenarios [37], we chose YOLOv5s from the YOLOv5 series as the baseline algorithm to improve [38]. Although the latest YOLO algorithms offer some gains in accuracy and speed, the YOLOv6, YOLOv7, and YOLOv8 series are not yet mature in application [39], are not yet widely adopted as target detection algorithms [40], and pose certain difficulties for individual users in actual operation and dataset training [41]. The newer series also have higher hardware requirements, which increases the cost of training and use [42]. YOLOv5 is a mature and advanced one-stage target detection algorithm [43] whose backbone, CSPDarknet53, is a modified version of the YOLOv3 architecture and contains cross-stage partial (CSP) modules that help reduce computational cost while maintaining accuracy [44]. Therefore, in terms of practicality and economy, we mainly chose the YOLOv5s algorithm.

In summary, although many studies have made progress in insulator defect detection, challenges remain [16]. These include limited original data scenes [45] and the difficulty of balancing detection accuracy [5] against model size under complex backgrounds [46]. Therefore, this study focuses on improving the YOLOv5s algorithm in light of the dataset characteristics to further enhance the effectiveness of insulator defect detection. Three innovations are introduced into the YOLOv5s network structure: the BiFPN feature fusion concept, the SPD (space-to-depth) convolution module, and attention mechanisms (CBAM, SimAM). These innovations aim to address issues with feature fusion, small object detection, and the high complexity of the scene around insulators.

Firstly, to address the problem of direct concatenation without considering weight design during feature fusion in YOLOv5s, we adopt the idea of weighted feature fusion using the BiFPN module. By introducing weights to learn the importance of different input features, we enhance the model’s feature extraction capabilities. Secondly, to tackle the challenge of detecting small objects in the dataset, we incorporate the SPD convolution module. This module utilizes a unique approach to improve small object detection by segmenting input feature maps into sub-feature maps and adjusting stride settings in Conv layers to prevent fine-grained information loss. Typically, using a larger stride in convolutional operations results in the loss of subtle and fine-grained information in input feature maps. This is particularly important for small objects as they may occupy only a small area in the image. To address this issue, we change the stride of Conv layers from the traditional setting of 2 to 1, allowing us to focus more on feature information within small regions and preserve more details. Furthermore, to address the issue of high complexity around insulators and their similarity with other objects, we introduce attention mechanisms (CBAM, SimAM). By applying attention mechanisms, we can allocate more attention resources to regions relevant to insulator self-explosion detection, thereby obtaining more information about insulator defects. Experimental results demonstrate that all three improvement methods significantly enhance the model’s accuracy. Therefore, we integrate these methods and further improve model performance using transfer learning. Based on evaluation metrics (P, R, mAP), it is found that the YOLOv5s + BiFPN + SPD + CBAM model exhibits optimal performance. There are noticeable improvements in precision (P), recall (R), and mean average precision (mAP). 
Using the trained YOLOv5s + BiFPN + SPD + CBAM model for object detection on test datasets enables the identification of images with insulator self-exploding conditions and the detection of such conditions in power line images.

2. Materials and Methods

2.1. YOLOv5s Algorithm

The YOLOv5s algorithm consists of three main components: the backbone network, the neck network, and the head network. The backbone network extracts features from input images and includes the Focus module, the Convolutional module, the C3 bottleneck layer, and the Spatial Pyramid Pooling - Fast (SPPF) module. The YOLOv5s network used in this study replaces the SPP module with SPPF, placed at the last layer of the backbone network. Compared to SPP, SPPF computes faster while yielding similar results.

The neck network is located between the backbone network and the head network and is responsible for fusing extracted features. It combines high-level semantic information with low-level positional information through up-sampling layers and CSP modules [47]. By fusing features in this way, the neck network can effectively utilize information from different levels to enhance the model’s adaptability to various scales and complex scenes. The neck network plays a crucial role in object detection tasks and contributes positively to improving feature robustness and diversity.

In object detection tasks, after processing by the backbone and neck networks, the extracted features are passed to the detection layer for further processing. Firstly, through non-maximum suppression (NMS), the detection layer filters out redundant bounding boxes [48]. This method eliminates overlapping boxes and retains only those with the highest confidence scores. It determines whether a box should be kept by comparing the overlap between different boxes. If the overlap exceeds a certain threshold, it selects the box with higher confidence as a result to keep. Secondly, after filtering out bounding boxes, the detection layer returns predictions for the most probable class based on the highest confidence score. This prediction indicates which class is most likely associated with that particular bounding box, enabling object classification in object detection tasks. Additionally, the detection layer returns coordinate information of the predicted bounding box, representing the object’s position in the image. These coordinate values help accurately locate the object for subsequent processing and analysis. Through these steps, the detection layer effectively processes and analyzes input features to generate final object detection results. This approach has been widely applied in practice with good performance. Therefore, the detection layer plays a crucial role in object detection algorithms.
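The NMS filtering step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the YOLOv5s implementation: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, the 0.45 overlap threshold, and the example boxes are all assumptions made for the sketch.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: keep the most confident box, drop boxes that overlap it too much."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending confidence
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]  # survivors go to the next round
    return keep

# Hypothetical detections: the first two boxes overlap heavily, the third is separate.
boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)
```

In this example the second box overlaps the first with an IoU above the threshold and is suppressed, while the third box does not overlap and is retained.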

With such a network structure design, the YOLOv5s algorithm can effectively perform object detection tasks and demonstrates good performance in experiments. In the experiment shown in Figure 1, we compared the improved algorithm proposed in this paper with other algorithms, including the two-stage Faster R-CNN detector and relatively small models from the YOLO series (YOLOv6s, YOLOv7s, YOLOv8s), using the same dataset for training and testing. The improved YOLOv5s algorithm shows improvements in precision (P), recall (R), and average detection accuracy (mAP).

2.2. YOLOv5s Algorithm Improvement

2.2.1. BiFPN Module

BiFPN, short for Bidirectional Feature Pyramid Network, is a weighted bidirectional feature pyramid network. This module was proposed in the paper “EfficientDet: Scalable and Efficient Object Detection” [49] and has been proven to facilitate fast and convenient multi-scale feature fusion.

In the YOLOv5s algorithm, the neck network used is PAN (Path Aggregation Network) [49], as shown in Figure 2a. However, in the fusion process of low-level feature information and high-level semantic information, the PAN structure directly uses concatenation operation without any weight design. This simple bidirectional fusion fails to consider the differences in contribution between different features.

To address this issue, this paper introduces the weighted feature fusion concept of the BiFPN module. Introducing weights to learn the importance of different input features enhances the feature extraction capability of the YOLOv5s algorithm and further improves its accuracy. The network structure of this module is shown in Figure 2b.

By using the BiFPN module for weighted feature fusion, the YOLOv5s algorithm can better handle multi-scale features and improve performance in object detection tasks.

Below, we will introduce three methods of weighted fusion:

(1): Unbounded fusion:

O = \sum_{i = 1}^{N} w_{i} \cdot I_{i}

(1)

where N represents the number of features, I_i represents the i-th feature, and w_i represents the corresponding weight. As the weight w_i can take any value, it is called unbounded fusion. However, due to the unbounded nature of weight values, it can lead to unstable model training. Therefore, Tan et al. proposed normalizing the weights [50] to constrain their range of values. Limiting the range of weight values effectively controls the learning ability and generalization ability of the model.

(2): Softmax-based fusion:

O = \sum_{i = 1}^{N} \frac{e^{w_{i}}}{\sum_{j = 1}^{N} e^{w_{j}}} \cdot I_{i}

(2)

Softmax-based fusion combines multiple features by applying the Softmax function to the weights associated with each feature. This normalizes the weights to values between 0 and 1 that represent the relative importance or contribution of each feature, so the network can effectively weigh and integrate different features. However, experimental verification shows that the additional Softmax processing can significantly slow down the training speed of the network. Therefore, Tan et al. [50] further proposed a fast fusion method to reduce the additional latency cost.

(3): Fast normalized fusion:

O = \sum_{i = 1}^{N} \frac{w_{i}}{ϵ + \sum_{j = 1}^{N} w_{j}} \cdot I_{i}

(3)

To ensure numerical stability, an activation function is applied to each w_{i} so that w_{i} ≥ 0, and setting ϵ = 0.0001 avoids numerical instability. Fast normalized fusion is an efficient weight normalization method: like the Softmax operation, it restricts each weight to between 0 and 1, but it is more computationally efficient.

The BiFPN structure applies fast normalization fusion to normalize the weights to the range of [0, 1], enhancing the perception capability of targets in different scenarios. It also fuses information from feature maps at different levels in the prediction stage. In this paper, leveraging the idea of fast normalization feature fusion in BiFPN, we improve the YOLOv5s network to enhance its feature extraction capability and thereby improve the detection accuracy of the YOLOv5s algorithm.
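The difference between Equations (2) and (3) can be made concrete with a small NumPy sketch. It is illustrative only: the helper names, the use of ReLU as the activation that keeps w_i ≥ 0, and the toy feature maps are assumptions, and in a real network the weights would be learnable parameters rather than fixed arrays.

```python
import numpy as np

def softmax_weights(w):
    """Equation (2): Softmax-normalized fusion weights."""
    e = np.exp(w - np.max(w))  # subtract the max for numerical safety
    return e / e.sum()

def fast_normalized_weights(w, eps=1e-4):
    """Equation (3): fast normalized fusion weights."""
    w = np.maximum(w, 0.0)  # assumed ReLU, so every w_i >= 0
    return w / (eps + w.sum())

def fuse(features, weights):
    """Weighted sum of same-shaped feature maps."""
    return sum(wi * f for wi, f in zip(weights, features))

# Two toy feature maps of the same shape (hypothetical values).
features = [np.ones((2, 2)), 2.0 * np.ones((2, 2))]
raw_w = np.array([1.0, 3.0])

fast_w = fast_normalized_weights(raw_w)
soft_w = softmax_weights(raw_w)
fused = fuse(features, fast_w)

# A negative raw weight is clipped to zero by the activation.
clipped_w = fast_normalized_weights(np.array([-1.0, 2.0]))
```

Both schemes produce weights in [0, 1], but the fast variant needs only a clip, a sum, and a division instead of exponentials, which is the source of its lower latency.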

2.2.2. SPD Module

The SPD (space-to-depth) convolutional layer was proposed by Sunkara and Luo [51] to address the issues of low image resolution and small object detection. In some application scenarios, due to device limitations or other factors, the image resolution may be relatively low, resulting in lost or blurred details. Additionally, the small size of small objects makes them harder to detect and recognize accurately at low resolutions. Although convolutional neural networks (CNNs) have achieved significant success in image classification and object detection tasks, common design flaws in existing CNN architectures lead to performance degradation in difficult tasks such as low-resolution images and small object detection.

To overcome this problem, the SPD module is introduced into CNNs. The SPD module is a space-to-depth convolutional layer that applies image transformation techniques to CNNs and performs down sampling on feature maps. Unlike traditional convolution operations, the SPD module retains all information in the channel dimension without any loss of information.

Specifically, given a feature map X of arbitrary size, the SPD module divides it into a series of sub-feature maps f_{x,y}. Each sub-feature map is formed by the entries X(i, j) of the original feature map X whose indices start at offset (x, y) and advance in steps of T, as shown in Equation (4) below. Here T represents the division ratio.

By introducing the SPD module for space-to-depth convolutional operations, fine-grained information can be effectively preserved and feature learning efficiency can be improved when dealing with tasks involving low image resolution and small object detection. This is of great significance for improving model performance and accuracy.

f_{0,0} = X[0:S:T, 0:S:T], f_{1,0} = X[1:S:T, 0:S:T], \dots, f_{T-1,0} = X[T-1:S:T, 0:S:T];
\vdots
f_{0,T-1} = X[0:S:T, T-1:S:T], f_{1,T-1} = X[1:S:T, T-1:S:T], \dots, f_{T-1,T-1} = X[T-1:S:T, T-1:S:T]

(4)

According to Figure 3, when the division ratio T = 2, the SPD module divides the feature map X into 4 sub-feature maps: f_{0,0}, f_{0,1}, f_{1,0}, f_{1,1}. Each sub-feature map has half the spatial size of the original feature map, (S / 2, S / 2, C_{1}). These four sub-feature maps are then concatenated along the channel dimension to form a new feature map X^{'}. In general, each spatial dimension of the new feature map is reduced by a factor of T to \frac{S}{T}, and the channel dimension grows by a factor of T^{2} to T^{2} \cdot C_{1}; for T = 2, the spatial dimensions are halved and the channel dimension is quadrupled. Through this process, the SPD module transforms the original feature map X (S, S, C_{1}) into a new feature map X^{'} (S / 2, S / 2, 4 C_{1}). This transformation preserves all information and changes the feature map in both spatial and channel dimensions, providing more possibilities for subsequent feature extraction and object detection tasks.
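The slicing in Equation (4) maps directly onto strided array indexing. The sketch below is a minimal NumPy illustration under stated assumptions: the function name `space_to_depth`, the (S, S, C_1) axis layout, and the order in which sub-maps are concatenated are choices made for the example, and the stride-1 convolution that follows the SPD step in the full module is omitted.

```python
import numpy as np

def space_to_depth(x, scale=2):
    """Rearrange an (S, S, C) feature map into (S/scale, S/scale, scale**2 * C).

    Each sub-map f_{r,c} = x[r::scale, c::scale, :] follows Equation (4);
    the sub-maps are then concatenated along the channel axis.
    """
    height, width, _ = x.shape
    if height % scale or width % scale:
        raise ValueError("spatial size must be divisible by the scale")
    sub_maps = [
        x[row_offset::scale, col_offset::scale, :]
        for col_offset in range(scale)
        for row_offset in range(scale)
    ]
    return np.concatenate(sub_maps, axis=-1)

# Toy feature map with S = 4 and C_1 = 3 (hypothetical values).
x = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
y = space_to_depth(x, scale=2)
```

Here `y` has shape (2, 2, 12): the spatial size is halved and the channel count quadrupled, and every value of `x` appears exactly once in `y`, which is the sense in which the down-sampling loses no information.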

2.2.3. CBAM Attention Mechanism

The CBAM (Convolutional Block Attention Module) attention mechanism was proposed by Woo et al. in their paper “CBAM: Convolutional Block Attention Module” [52]. They stated that CBAM is a lightweight and versatile module that can be applied to feed-forward convolutional neural networks and that it achieves consistent accuracy improvements across various classification and detection datasets, demonstrating its wide applicability.

The CBAM attention module is a key technique for enhancing the performance of neural networks. It consists of two main components: the channel attention module and the spatial attention module.

The channel attention module, whose structure is shown in Figure 4, aggregates the spatial information of the features. Specifically, it performs average pooling and max pooling on the input features to obtain the average-pooled feature Favg and the max-pooled feature Fmax. These two features are then processed to generate a weight vector Wc that represents the importance of each channel. This weight vector is multiplied element-wise with the original input features to reweight each channel, which makes better use of the correlations between channels and enhances the network’s ability to distinguish different classes of objects.

The spatial attention module is primarily used for extracting spatial information from feature maps. It performs average pooling and max pooling along the channel dimension to obtain the average-pooled feature Fsavg and the max-pooled feature Fsmax. These two features are then processed to generate a weight map Ws that represents the importance of each spatial position, and this map is multiplied element-wise with the input features to reweight each spatial position. This allows the network to focus on important regions within an image and improves performance in detection and localization tasks.

In the channel attention module, the average-pooled feature Favg and the max-pooled feature Fmax are each passed through a shared multi-layer perceptron (MLP) with one hidden layer, and the two outputs are summed to form a new feature. Applying the sigmoid activation function to this new feature yields the channel attention map Mc. Here, φ represents the sigmoid activation function, as shown in Equation (5) below.

M_{C}(F) = φ[MLP(F_{avg}) + MLP(F_{max})]

(5)

Unlike the channel attention module, the spatial attention module focuses on where the important information lies in the feature map. It first performs average pooling and max pooling on the input feature map along the channel dimension, compressing the data along that dimension, which reduces computational complexity while extracting key information. The two compressed feature maps are then concatenated along the channel dimension to form an effective feature descriptor, so that each position incorporates information from both pooling operations. In this way, the spatial attention module captures correlations between different positions and highlights important regions, allowing the network to focus on important information in the feature map and recognize target objects more accurately. The module can enhance performance and is applicable to various image-processing tasks.

To further extract important information along the spatial dimension, a convolutional layer with a kernel size of 7 × 7 (f^{7 \times 7}) is used to process the feature map.

Finally, by mapping the processed feature map to a range of 0 to 1 using the sigmoid function, we obtain the spatial attention map Ms as shown in Figure 5. This mapping represents the importance of each position on the feature map. Through the processing of the spatial attention module, critical information on the feature map can be captured more accurately, enhancing model perception in terms of spatial dimensions. This attention mechanism is significant for tasks such as image classification and object detection and can be widely applied in deep learning models across different domains.
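Using the notation of Equation (5), the spatial attention computation described above can be written compactly. The expression below follows the original CBAM formulation and is offered as a reading aid rather than as a numbered equation of this paper; [· ; ·] denotes concatenation along the channel dimension, and F^{s}_{avg} and F^{s}_{max} are the channel-wise average-pooled and max-pooled maps.

```latex
M_{S}(F) = φ\left( f^{7 \times 7} \left( [F^{s}_{avg} ; F^{s}_{max}] \right) \right)
```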

The operation principle of the CBAM (Convolutional Block Attention Module) attention module is shown in Figure 6. The CBAM attention mechanism processes the input features through a series of steps to extract important information from the image and enhance the connections between features.

First, the input features are processed as intermediate features F. Then, the channel attention map Mc is computed using channel attention. This map represents the importance of each channel for the overall feature representation. Before computing spatial attention, the intermediate features F are multiplied element-wise with the channel attention map Mc, adjusting the intermediate features based on their channel importance.

Next, spatial attention is applied to the weighted intermediate features to obtain new feature maps in both spatial and channel dimensions. This process enhances the connections between features and better extracts effective features. The newly obtained feature maps are then weighted to obtain the final feature F’. The weighting operation can be adjusted according to specific task requirements to further optimize model performance.

Through the processing of the CBAM attention mechanism, deep learning models can accurately extract important information from images and enhance connections between features in both channel and spatial dimensions.
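The sequential channel-then-spatial refinement described above can be sketched in NumPy as follows. This is a simplified illustration, not the authors' implementation: the (C, H, W) layout, the bias-free MLP, the reduction ratio of 4, the zero padding, and the randomly initialized weights are all assumptions, and in a trained network the MLP weights and the 7 × 7 kernel would be learned.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(f, w1, w2):
    """Equation (5): M_C = sigmoid(MLP(F_avg) + MLP(F_max)) for f of shape (C, H, W)."""
    f_avg = f.mean(axis=(1, 2))  # spatial average pooling -> (C,)
    f_max = f.max(axis=(1, 2))   # spatial max pooling -> (C,)

    def mlp(v):
        # Shared MLP with one hidden layer and ReLU (biases omitted for brevity).
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(f_avg) + mlp(f_max))

def spatial_attention(f, kernel):
    """M_S = sigmoid(conv_kxk([avg; max] over channels)) for f of shape (C, H, W)."""
    pooled = np.stack([f.mean(axis=0), f.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    height, width = f.shape[1:]
    out = np.zeros((height, width))
    for i in range(height):
        for j in range(width):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)

def cbam(f, w1, w2, kernel):
    """Apply channel attention first, then spatial attention on the reweighted features."""
    mc = channel_attention(f, w1, w2)
    f_channel = f * mc[:, None, None]
    ms = spatial_attention(f_channel, kernel)
    return f_channel * ms[None, :, :]

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
channels, reduction, size = 8, 4, 8
feature = rng.normal(size=(channels, size, size))
w1 = 0.1 * rng.normal(size=(channels // reduction, channels))
w2 = 0.1 * rng.normal(size=(channels, channels // reduction))
kernel = 0.1 * rng.normal(size=(2, 7, 7))

mc = channel_attention(feature, w1, w2)
refined = cbam(feature, w1, w2, kernel)
```

Because both attention maps lie between 0 and 1, the refined output has the same shape as the input and each entry is a scaled-down copy of the corresponding input entry.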

The CBAM attention mechanism plays an important role in image processing and can be widely applied in various image-processing tasks across different domains. Whether it is classification, detection, or segmentation tasks, introducing the CBAM attention mechanism can improve model performance. Its flexibility and effectiveness make it an indispensable part of deep learning models.

2.2.4. SimAM Parameter-Free Attention Mechanism

SimAM (Simple, Parameter-Free Attention Module) was proposed by Yang et al. [53] in the article “SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks”. Traditional channel attention mechanisms improve model performance by assigning weights over the channel dimension, but doing so adds parameters. In contrast, SimAM assesses the importance of each neuron through an energy function and assigns 3D attention weights to the feature maps. This technique requires no additional parameters and can be applied directly to existing models to improve their feature extraction ability. Specifically, SimAM calculates an energy value for each neuron and then assigns 3D attention weights to the feature maps according to these energy values. Such weights are assigned jointly across the channel and spatial dimensions, which better captures the key information in the feature map. A schematic diagram of the 3D attention weight assignment is shown in Figure 7.

With the application of SimAM, the model can effectively extract important features and requires no additional parameters. This makes SimAM a simple and efficient attention mechanism with broad applications in a variety of deep learning tasks.

The construction of the energy function draws mainly on the characteristics of the SVM [54] algorithm. Specifically, sample points far from the decision boundary are classified as easy samples, while sample points near the decision boundary are hard samples, which is consistent with basic findings in neuroscience theory. Let t denote the target neuron of the input features in a single channel and x_{i} the other neurons in that channel; let w_{t} and b_{t} denote the weight and bias of the linear transformation applied to t and x_{i}; let M = H \times W be the number of neurons in a single channel; and let i be the index over the spatial dimension. The energy function is then defined as:

e_{t}(w_{t}, b_{t}, y, x_{i}) = \frac{1}{M - 1} \sum_{i = 1}^{M - 1} (-1 - (w_{t} x_{i} + b_{t}))^{2} + (1 - (w_{t} t + b_{t}))^{2} + λ w_{t}^{2}

(6)

The calculation of attention weights can be accelerated by using the closed-form solutions for w_{t} and b_{t}. Substituting these solutions into Equation (6) yields the minimum energy e_{t}^{*}, where \hat{μ} and \hat{σ}^{2} denote the mean and variance of the neurons in the channel:

{e_{t}}^{*} = \frac{4 ({\hat{σ}}^{2} + λ)}{{(t - \hat{μ})}^{2} + 2 {\hat{σ}}^{2} + 2 λ}

(7)

According to Equation (7), the smaller the minimum energy e_{t}^{*}, the more linearly separable the neuron t is from the other neurons, and the higher its importance. The SimAM module therefore uses the energy function to measure how each neuron differs from the other neurons and thereby determines its importance. The final refinement can be expressed by Equation (8), where E groups all e_{t}^{*} values across the channel and spatial dimensions. By weighting every position and channel of the feature map according to its energy, the SimAM module enables the network to learn the relationships between features more accurately, extract information useful for the task, and provide a more accurate feature representation for subsequent tasks. It can improve model performance and is suitable for various image-processing tasks.

\tilde{X} = sigmoid(\frac{1}{E}) \odot X

(8)
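Equations (7) and (8) translate almost directly into array operations. The sketch below is a minimal NumPy version under stated assumptions: the (C, H, W) layout, the function names, and λ = 1e-4 are choices made for the example, and the variance is computed over all M neurons of a channel, which may differ slightly from the normalization used in reference implementations.

```python
import numpy as np

def simam_attention(x, lam=1e-4):
    """Per-neuron attention weights sigmoid(1 / e_t*) for x of shape (C, H, W)."""
    mu = x.mean(axis=(1, 2), keepdims=True)   # channel mean
    var = x.var(axis=(1, 2), keepdims=True)   # channel variance
    # Equation (7): minimum energy of every neuron t in its channel.
    e_star = 4.0 * (var + lam) / ((x - mu) ** 2 + 2.0 * var + 2.0 * lam)
    # Equation (8): a lower energy gives a larger 1 / e_t* and a larger weight.
    return 1.0 / (1.0 + np.exp(-1.0 / e_star))

def simam(x, lam=1e-4):
    """Refined features: attention weights multiplied element-wise with x."""
    return simam_attention(x, lam) * x

# A channel where one neuron stands out from its neighbors (hypothetical values).
x = np.zeros((1, 3, 3))
x[0, 1, 1] = 10.0
attention = simam_attention(x)

# A constant channel: every neuron has (t - mu) = 0 and zero variance.
constant = np.full((2, 3, 3), 5.0)
refined_constant = simam(constant)
```

The neuron that differs most from its channel mean receives the largest weight, and no learnable parameter is involved: the only constant is λ.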

3. Experimentation and Result Analysis

3.1. Data Preparation

A part of the experimental dataset was collected from the China Power Line Insulator Dataset (CPLID), which included 600 normal insulator images taken by UAVs and 248 defective insulator images. This portion of the dataset was utilized as the validation set (InsulatorDate2023_verify). Additionally, a portion of the insulator dataset was provided by the Power Supply Company in Peshawar, Pakistan, comprising a total of 7371 images. The test set (InsulatorDate2023_test) consists of 2543 images, encompassing 64 images of normal insulators captured from various angles, 149 images of normal insulators on power lines, 15 images of normal insulators on power lines in complex environments, 845 images of defective insulators on power lines, and 1485 images of defective power line insulators. The training set (InsulatorDate2023_train) comprises 4828 diverse images of insulator defects. To enhance the dataset’s diversity, we employed techniques such as image rotation, cropping, brightness adjustment, and merging insulator images from different backgrounds. These backgrounds encompass a range of large and complex environments, including rivers, farmland, cities, and grasslands. For annotation, we utilized the YOLO format, employing two labels: insulator and defect. The labeling annotation tool in the Python environment facilitated this process. Through these efforts, we successfully created a comprehensive dataset named InsulatorDate2023, which includes the validation set (InsulatorDate2023_verify), test set (InsulatorDate2023_test), and training set (InsulatorDate2023_train), ensuring its wide representativeness.

3.2. Experimental Environment

All experiments for detecting and evaluating insulator defects on transmission lines were run on the same platform. To reduce computational cost, we resized all raw images to 320 × 320 pixels. We trained for 300 epochs with a batch size of 8, using stochastic gradient descent as the optimization algorithm with an initial learning rate of 0.01 and a termination learning rate of 0.2. The weights obtained at the end of training and parameter tuning can be used for subsequent tests and applications to detect insulator defects in transmission lines accurately. The experimental environment configuration is shown in Table 1.
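
The two learning-rate values can be read in more than one way. One possible reading, which the text does not confirm, is that 0.2 acts as a final learning-rate factor, so that the rate decays from 0.01 to 0.01 × 0.2 = 0.002 over the 300 epochs. The sketch below shows a linear decay under that assumption; the function name `linear_lr` and the parameter names `lr0` and `lrf` are hypothetical.

```python
def linear_lr(epoch, epochs=300, lr0=0.01, lrf=0.2):
    """Learning rate at `epoch`, decaying linearly from lr0 to lr0 * lrf.

    Assumes lrf is a multiplicative final factor, not an absolute rate.
    """
    progress = epoch / epochs
    factor = (1.0 - progress) * (1.0 - lrf) + lrf
    return lr0 * factor


schedule = [linear_lr(e) for e in (0, 150, 300)]
```

Under this reading the rate is 0.01 at the start, about 0.006 halfway through, and about 0.002 at the end of training.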

3.3. Evaluation Indicators

Precision (P), recall (R), and average detection accuracy (mAP) were selected as the evaluation indexes; they are calculated as shown in Equations (9) and (10). Here, TP (true positives) is the number of samples that are predicted positive and are actually positive; FP (false positives) is the number of samples that are predicted positive but are actually negative; and FN (false negatives) is the number of samples that are actually positive but are predicted negative. In Equation (10), P_{Ci} is the prediction precision of the ith sample in category C, N is the number of samples in that category, and NC is the total number of categories. Equation (11) gives the intersection over union (IOU) of two regions X and Y. Together, these metrics evaluate the performance of the model in the target detection task: precision measures how many of the detections are correct, recall measures how many of the actual targets are found, and the average detection accuracy (mAP) averages the accuracy over all categories to give an overall evaluation value.

P = \frac{TP}{TP + FP}, \quad R = \frac{TP}{TP + FN}

(9)

AP_{C} = \frac{\sum_{i = 1}^{N} P_{Ci}}{N}, \quad mAP = \frac{\sum_{C = 1}^{NC} AP_{C}}{NC}

(10)
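
Equations (9) and (10) can be written out directly. The sketch below follows the two equations literally, with AP taken as the mean of the per-sample precision values in a category; the function names and all counts and precision values in the example are made up for illustration and are not results from this paper.

```python
def precision(tp, fp):
    """Equation (9): share of predicted positives that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Equation (9): share of actual positives that are found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def average_precision(per_sample_precisions):
    """Equation (10): AP_C as the mean of the P_Ci values of one category."""
    return sum(per_sample_precisions) / len(per_sample_precisions)


def mean_average_precision(ap_by_class):
    """Equation (10): mAP as the mean of AP_C over the NC categories."""
    return sum(ap_by_class.values()) / len(ap_by_class)


# Hypothetical counts and per-sample precisions, for illustration only.
p = precision(tp=90, fp=10)   # 0.9
r = recall(tp=90, fn=30)      # 0.75
ap = {
    "insulator": average_precision([1.0, 0.9, 0.8]),
    "defect": average_precision([0.8, 0.6]),
}
m_ap = mean_average_precision(ap)
```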

IOU = \frac{X \cap Y}{X \cup Y}

(11)
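
For axis-aligned bounding boxes, Equation (11) reduces to a ratio of areas. The sketch below assumes that X and Y are two boxes given as (x_min, y_min, x_max, y_max), for example a predicted box and a ground-truth box; this section does not spell out that interpretation, and the function name `box_iou` is illustrative.

```python
def box_iou(a, b):
    """IOU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


iou_overlap = box_iou((0, 0, 10, 10), (5, 5, 15, 15))     # 25 / 175
iou_disjoint = box_iou((0, 0, 10, 10), (20, 20, 30, 30))  # 0.0
```

A detection is commonly counted as a true positive when its IOU with a ground-truth box exceeds a threshold such as 0.5.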

3.4. Result Analysis

3.4.1. YOLOv5s Compared with Other YOLO Algorithms

To further illustrate the feasibility and advantages of selecting YOLOv5s as the base algorithm, we compared it with the YOLO series algorithms mentioned in the literature [32,33], which include the small models YOLOv5s, YOLOv6s, and YOLOv8s; the medium models YOLOv5m, YOLOv6m, and YOLOv8m; the large models YOLOv5l, YOLOv6l, and YOLOv8l; and the super-large models YOLOv5x, YOLOv7x, and YOLOv8x. It is worth noting that, although the YOLOv6, YOLOv7, and YOLOv8 series are newer, the small model YOLOv5s, which adopts CSPDarknet53, a modified version of the YOLOv3 architecture, is a mature and advanced one-stage target detection algorithm. The improved model that we propose is mainly intended for small datasets with few samples and inconspicuous targets, so a small model is our preferred choice, and this type of algorithm also helps to reduce the computational cost. Compared with the YOLOv6, YOLOv7, and YOLOv8 algorithms, the YOLOv5 series is more mature and more widely used [36]. Our proposed algorithm can therefore build on the existing mature small model YOLOv5s and optimize it. We mainly compared YOLOv5s with the other small models, YOLOv6s and YOLOv8s, and with the other algorithms in the YOLOv5 series: YOLOv5n, YOLOv5m, YOLOv5l, and YOLOv5x. To ensure the consistency of the index data, we used the same platform and a unified test set (InsulatorDate2023_test). The models are analyzed according to the three evaluation indexes of precision (P), recall (R), and average detection accuracy (mAP). A comparative analysis of the experimental results is shown in Table 2.

The experimental results show that the YOLO series algorithms differ only slightly in the three evaluation indexes of precision (P), recall (R), and average detection accuracy (mAP). The precision (P) of the large model YOLOv5l is the same as that of YOLOv5s, but its recall (R) and average detection accuracy (mAP) are lower. Although the super-large model YOLOv5x has a higher recall (R) than YOLOv5s, it is lower in both precision (P) and average detection accuracy (mAP). Although the latest small model YOLOv8s equals YOLOv5s in recall (R), it is still lower in precision (P) and average detection accuracy (mAP). The other three models (YOLOv5n, YOLOv5m, and YOLOv6s) are lower than YOLOv5s in all three indicators. Therefore, we chose YOLOv5s as the base algorithm to improve.

3.4.2. Comparing the Improved YOLOv5s with Other State-of-the-Art Algorithms

To further demonstrate the superiority of the improved algorithm based on YOLOv5s, we compared it experimentally with several advanced algorithms mentioned in references [23,24], such as Faster-RCNN, SSD, YOLOv3, and YOLOv3-tiny. To ensure the consistency of the index data, we used the same platform and a unified test set (InsulatorDate2023_test). The models are analyzed according to the three evaluation indexes of precision (P), recall (R), and average detection accuracy (mAP). A comparative analysis of the experimental results is shown in Table 3.

The experimental results show that the recall (R) of Faster-RCNN is 1.8% higher than that of the improved algorithm, but its precision (P) is 40.6% lower and its average detection accuracy (mAP) is 2.9% lower, so the improved YOLOv5s detects more accurately than Faster-RCNN. Although YOLOv3 uses the same base model as the YOLOv5s algorithm in this paper and its average detection accuracy (mAP) is 0.5% higher than that of the improved YOLOv5s algorithm, its recall (R) and precision (P) are 4.1% and 3.4% lower, respectively (Table 3). In terms of accuracy and speed, the improved YOLOv5s algorithm is a considerable improvement. Therefore, the YOLOv5s + BiFPN + SPD + CBAM algorithm proposed in this paper is the optimal model among the mainstream advanced algorithms compared, especially in terms of precision (P).
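
The gaps discussed here can be recomputed from Table 3. The short sketch below does so in percentage points, using the values from Table 3 in the order (P, R, mAP); the names `TABLE_3`, `IMPROVED`, and `gaps` are illustrative.

```python
# (P, R, mAP) values copied from Table 3.
TABLE_3 = {
    "Faster-RCNN": (0.572, 0.901, 0.861),
    "SSD": (0.956, 0.621, 0.852),
    "YOLOv3": (0.944, 0.842, 0.895),
    "YOLOv3-tiny": (0.909, 0.821, 0.862),
}
IMPROVED = (0.978, 0.883, 0.890)  # YOLOv5s + BiFPN + SPD + CBAM


def gaps(model):
    """Improved-model metric minus the compared model's metric, in points."""
    return tuple(
        round((ours - theirs) * 100, 1)
        for ours, theirs in zip(IMPROVED, TABLE_3[model])
    )


faster_rcnn_gap = gaps("Faster-RCNN")  # (40.6, -1.8, 2.9)
yolov3_gap = gaps("YOLOv3")            # (3.4, 4.1, -0.5)
```

A positive value means the improved model is higher on that metric; a negative value means the compared model is higher.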

3.4.3. Impact of Improved YOLOv5s on Model Performance

In this paper, three ways were proposed to improve the YOLOv5s network structure: BiFPN weighted feature fusion, the SPD (space-to-depth) convolution layer, and an attention mechanism (CBAM, SimAM). These improvements are all designed to further improve the detection accuracy of the YOLOv5s algorithm. First, by introducing the BiFPN_Concat2 module, we applied the fast normalized fusion method, which introduces weights during feature fusion to enhance the feature extraction ability of the model. This method better captures the important features during network training and improves detection accuracy. Second, by introducing SPD modules, we replaced the strided convolution layers of the network with SPD-Conv and changed the stride of the convolutional layer to 1, which improves detection of small targets by better capturing their detailed information. Finally, an attention mechanism (CBAM, SimAM) was introduced to improve the ability of the model to extract insulator features and to obtain detailed information related to the self-explosion detection of insulators, strengthening the attention of the model on the critical regions and thus improving detection. Taken together, the results showed that each of the three improvements effectively improved the accuracy of the model. Therefore, this paper further studies whether combining the three improvements jointly improves the detection accuracy.
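
The weighted fusion idea can be sketched in a few lines. The example below follows the fast normalized fusion rule described for BiFPN in the EfficientDet paper, where each input is scaled by a non-negative learnable weight and the weights are normalized by their sum plus a small constant. It is not the paper's BiFPN_Concat2 module; the function name, the flat-list representation of feature maps, and the example weights are assumptions for illustration.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with non-negative, normalised weights."""
    # ReLU keeps every weight non-negative so the normalisation stays stable.
    relu_w = [max(0.0, w) for w in weights]
    total = sum(relu_w) + eps
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(relu_w, features)) / total
        for i in range(length)
    ]


# Two hypothetical feature vectors; the second is weighted three times as much.
fused = fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 3.0])
```

In training, the weights would be learned together with the rest of the network, so that more informative inputs contribute more to the fused feature.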

To analyze the influence of the individual improvements on performance, this paper builds 11 improved models by combining the improvements and analyzes their effect according to the three evaluation indexes of precision (P), recall (R), and average detection accuracy (mAP). To ensure the consistency of the index data, we used a unified test set (InsulatorDate2023_test) for testing. A comparative analysis of the experimental results is shown in Table 4.
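
The count of 11 improved models is consistent with treating BiFPN and SPD as on/off switches and the two attention mechanisms as alternatives (none, CBAM, or SimAM), which gives 2 × 2 × 3 = 12 configurations including the unmodified baseline. This reading is inferred from the rows of Table 4; the sketch below enumerates the configurations under that assumption, and the helper name `ablation_names` is illustrative.

```python
from itertools import product

ATTENTION = (None, "CBAM", "SimAM")  # at most one attention module per model


def ablation_names(base="YOLOv5s"):
    """List every combination of BiFPN, SPD, and one optional attention module."""
    names = []
    for use_bifpn, use_spd, attention in product((False, True), (False, True), ATTENTION):
        parts = [base]
        if use_bifpn:
            parts.append("BiFPN")
        if use_spd:
            parts.append("SPD")
        if attention:
            parts.append(attention)
        names.append(" + ".join(parts))
    return names


configs = ablation_names()
improved = [name for name in configs if name != "YOLOv5s"]
print(len(configs), len(improved))  # 12 11
```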

Based on the analysis in Table 4, we found that each improvement had a positive effect on the model. In particular, BiFPN weighted feature fusion achieved a significant improvement in the mAP value. SPD and CBAM also improved the model performance to some extent, whereas the SimAM attention mechanism did not significantly improve the average detection accuracy. Regarding precision (P), applying a single improvement or a combination of two improvements to the YOLOv5s algorithm did not significantly improve it. However, when the three improvements were applied together, the precision (P) increased by 2.0%. Overall, the YOLOv5s model improved with BiFPN, SPD, and CBAM performed best among the compared models. With this improvement, the precision (P) increased from 0.958 to 0.978, by 2.0%; the recall (R) increased from 0.874 to 0.883, by 0.9%; and the average detection accuracy (mAP) increased from 0.865 to 0.89, by 1.5%.

Based on the comparative analysis of the results, we chose YOLOv5s + BiFPN + SPD + CBAM as the test model. The model is improved in three aspects. First, the BiFPN_Concat2 module was introduced in the YOLOv5s network, borrowing the BiFPN idea of fast normalized feature fusion. This improvement introduces weight parameters in the feature fusion process, and the importance of each input feature is gradually learned during training. Second, we applied the SPD module to the YOLOv5s network. Since the YOLOv5s network uses some convolutional layers with stride 2 in the backbone and neck, we replaced these convolutional layers with SPD-Conv and set the stride to 1 to improve the detection of small targets. Finally, CBAM modules were added to the last layer of the backbone network and to layers 23, 28, and 33 of the neck network. The CBAM module strengthens the focus of the model on the critical regions, thus improving the detection accuracy. The YOLOv5s + BiFPN + SPD + CBAM model is the optimal model of our choice. The final network structure is shown in Figure 8, where the round green box is the BiFPN_Concat2 module, the round red box is the SPD-Conv module, and the round blue box is the CBAM module.
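
The space-to-depth step behind SPD-Conv can be illustrated on a single channel. The sketch below splits an H × W map into scale × scale interleaved sub-maps, each of size (H/scale) × (W/scale), so the spatial resolution is reduced without discarding any value; in SPD-Conv the sub-maps would then be concatenated along the channel dimension and passed to a stride-1 convolution, which is not shown. The function name, the ordering of the sub-maps, and the 4 × 4 example are illustrative assumptions.

```python
def space_to_depth(channel, scale=2):
    """Split one H x W channel (2D list) into scale*scale interleaved sub-maps."""
    height, width = len(channel), len(channel[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    sub_maps = []
    for col_offset in range(scale):      # sub-map ordering is illustrative
        for row_offset in range(scale):
            sub_maps.append(
                [row[col_offset::scale] for row in channel[row_offset::scale]]
            )
    return sub_maps


# 4 x 4 toy channel holding the values 0..15 in row-major order.
toy = [[r * 4 + c for c in range(4)] for r in range(4)]
subs = space_to_depth(toy)
print(subs[0])  # [[0, 2], [8, 10]]
```

Unlike a stride-2 convolution, which skips input positions, every input value ends up in exactly one sub-map, which is why fine-grained information about small targets is retained.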

As shown in the comparison charts of precision, recall, and mAP_0.5 in Figure 9, Figure 10 and Figure 11, the precision, recall, and average detection accuracy (mAP) of the YOLOv5s + BiFPN + SPD + CBAM model were significantly improved compared to the YOLOv5s algorithm.

The detection results obtained from the YOLOv5s + BiFPN + SPD + CBAM model and the YOLOv5s model are shown in Figure 12. In the first set of comparison images, which contain a large number of scattered insulators, some with worn lower edges, the improved YOLOv5s model detects all insulators, including the worn ones, whereas the YOLOv5s algorithm misses some insulators and does not find the worn ones. In the second set, the improved model identifies broken insulators that resemble other objects with a higher confidence level, so its detections are more accurate and convincing. In the third set, which contains distant, small damaged insulators, the improved model detects the small targets, while the YOLOv5s algorithm does not. In the fourth set, there is an incomplete damaged insulator at the edge of the picture; the YOLOv5s algorithm did not identify this insulator, but the improved model identified it and marked the damage. To further demonstrate the effect of the improved algorithm on insulator defect detection, the algorithm was tested on the publicly available China Power Line Insulator Dataset (CPLID). The average accuracy of insulator defect detection is 99.5% and the missed-detection rate is 0, so the effect is significant.

The following is a further explanation of the improvement in the detection accuracy of the optimal model:

(1): Improved multi-target detection accuracy

The optimal model significantly improves multi-target detection accuracy. As shown in Figure 13a, the targets in the original image are relatively scattered and small. As shown in Figure 13b, the original YOLOv5s model did not detect all the targets: the missed-detection rate was high, and the smaller defect targets were not detected, so the accuracy needed to be improved. As shown in Figure 13c, the optimal YOLOv5s model detects all the targets, including the small defect targets, with significantly improved accuracy and a reduced missed-detection rate for defect targets.

(2): Improved detection accuracy of confusing targets

The optimal model significantly improves the detection accuracy for confusing targets. As shown in Figure 14a, the damaged insulator in the original image resembles other objects in the background and is easily confused with them. The detection results of the original YOLOv5s model are shown in Figure 14b: the original model detected only the targets with obvious features and missed the confusing defect targets. The detection results of the improved YOLOv5s model proposed in this paper are shown in Figure 14c: the model detected all targets and successfully detected and marked the defects of the easily confused targets.

(3): Improved detection accuracy of remote small targets

The optimal model significantly improves the detection accuracy for distant small targets. As shown in Figure 15a, the insulators in the original image are far away and the targets are small. The detection results of the original YOLOv5s model are shown in Figure 15b: the original model detected only the close targets and missed the distant ones. As shown in Figure 15c, the improved YOLOv5s model proposed in this paper detects all targets, including the small, distant ones.

(4): Improved detection accuracy of incomplete targets

The optimal model significantly improves the detection accuracy for incompletely displayed targets. As shown in Figure 16a, the original image contains insulators that are only partly visible at the edge of the image. The detection results of the original YOLOv5s model are shown in Figure 16b: the original model did not detect the incomplete insulators. The results of the improved YOLOv5s model are shown in Figure 16c: the model detects all targets, including the incompletely displayed ones.

4. Discussion

With the maturing of artificial intelligence technology, deep learning methods have been widely used in various fields. Among them, the convolutional neural network, with its excellent ability to analyze spatial information, plays an important role in bioinformatics, medical image recognition, disease prediction, clinical decision support, and drug development. Compared with traditional shallow networks, convolutional neural networks achieve a higher overall recognition rate. In this paper, we focus on the YOLOv5s algorithm, addressing its existing problems to improve the detection effect. We performed a descriptive statistical analysis of the insulator dataset and found problems such as high similarity between self-exploding insulator regions and normal insulators, insufficient sample size, unbalanced sample distribution, and the difficulty of small target detection. To address these problems, this paper proposed three improvement ideas.

First, this paper introduces the BiFPN idea of weighted feature fusion to address the problem of feature fusion in the YOLOv5s algorithm. By introducing weight parameters, the model learns the importance of different input features, which improves its feature extraction ability. Second, this paper introduces the SPD (space-to-depth) convolution module. This module splits the feature maps into a series of sub-feature maps and focuses on the feature information of small regions. Meanwhile, the stride of the convolution layer was changed from 2 to 1 to avoid the loss of fine-grained information. Finally, an attention mechanism (CBAM, SimAM) is introduced to address the complex backgrounds of insulator images and the high similarity between insulator regions and other regions. By devoting more attention to the insulator regions, the model obtains more information related to self-explosion detection. The results of the empirical analysis show that applying a single improvement or a combination of two improvements to the YOLOv5s algorithm does not significantly improve the precision (P). However, when the recall (R) and the average detection accuracy (mAP) are also considered, applying the three improvement methods simultaneously can significantly improve the model performance. Based on the comparison of the evaluation indexes, the YOLOv5s model improved with BiFPN, SPD, and CBAM was selected as the optimal model. It improved precision (P), recall (R), and average detection accuracy (mAP) by 2.0%, 0.9%, and 1.5%, respectively.

Author Contributions

Conceptualization, C.H. and X.Z.; methodology, C.H. and X.L.; software, C.H. and X.L.; validation, S.M., X.L. and C.H.; formal analysis, X.L. and X.Z.; investigation, H.Z.; resources, S.M. and C.H.; data curation, H.Z.; writing—original draft preparation, C.H. and S.M.; writing—review and editing, C.H. and X.L.; visualization, S.M.; supervision, X.Z. and C.H.; project administration, C.H. and X.Z.; funding acquisition, C.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Jinling Institute of Technology High-level Talent Research Start-up Project (JIT-RCYJ-202102), Key R&D Plan Project of Jiangsu Province (BE2022077), Jinling Institute of Technology Science and Education Integration Project (2022KJRH18), Jiangsu Province College Student Innovation Training Program Project (202313573080Y, 202313573081Y).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

This research employed publicly available datasets for its experimental studies.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Yang, L.; Fan, J.; Liu, Y.; Li, E.; Peng, J.; Liang, Z. A Review on State-of-the-Art Power Line Inspection Techniques. IEEE Trans. Instrum. Meas. 2020, 69, 9350–9365.
2. Zhong, H.; Liu, Y.; Wei, J.; Fu, Q.; Yi, B. A real-time railway fastener inspection method using the lightweight depth estimation network. Measurement 2022, 189, 110613.
3. Meng, F.; Xu, B.; Zhang, T.; Muthu, B.; Sivaparthipan, C.B. Application of AI in image recognition technology for power line inspection. Energy Syst. 2021, 2021, 3073248.
4. Arun, R.A.; Umamaheswari, S. Effective and efficient multi-crop pest detection based on deep learning object detection models. J. Intell. Fuzzy Syst. 2022, 43, 5185–5203.
5. Lan, Y.; Xu, W. Insulator defect detection algorithm based on a lightweight network. J. Phys. Conf. Ser. 2022, 2181, 12007.
6. Zhang, T.; Zhang, Y.; Xin, M.; Liao, J.; Xie, Q. A Light-Weight Network for Small Insulator and Defect Detection Using UAV Imaging Based on Improved YOLOv5. Sensors 2023, 23, 5249.
7. Wang, J.; Liu, G.; Yuan, J.; Wei, G. Image fusion technology and application in power inspection. Tech. Autom. Appl. 2019, 38, 4.
8. Taylor, T.; Lanovaz, M.J. Agreement between visual inspection and objective analysis methods: A replication and extension. J. Appl. Behav. Anal. 2022, 55, 986–996.
9. Niu, S.; Zhou, X.; Zhou, D.; Yang, Z.; Liang, H.; Su, H. Fault Detection in Power Distribution Networks Based on Comprehensive-YOLOv5. Sensors 2023, 23, 6410.
10. Wang, D.; Sun, J.; Zhang, T. Self-explosion defect detection method of glass insulator based on improved generative adversarial network. High Volt. Eng. 2022, 48, 1096–1103.
11. Miao, X.; Liu, X.; Chen, J.; Zhuang, S.; Fan, J.; Jiang, H. Insulator Detection in Aerial Images for Transmission Line Inspection Using Single Shot Multibox Detector. IEEE Access 2019, 7, 9945–9956.
12. Zhou, B.; Khosla, A.; Lapedriza, A.; Oliva, A.; Torralba, A. Learning Deep Features for Discriminative Localization. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2921–2929.
13. Lv, B.; Gao, X.; Feng, S.; Yuan, J. Deep Learning Detection Algorithm for Surface Defects of Automobile Door Seals. Teh. Vjesn. 2022, 29, 1499–1506.
14. Saranya, K.; Muniraj, C. A SVM Based Condition Monitoring of Transmission Line Insulators Using PMU for Smart Grid Environment. Power Energy Eng. 2016, 4, 47–60.
15. Li, R.; Zhang, Y.; Zhai, D. Pin Defect Detection of Transmission Line Based on Improved SSD. High Volt. Eng. 2021, 47, 3795–3802.
16. Ding, J.; Cao, H.; Ding, X.; An, C. High Accuracy Real-Time Insulator String Defect Detection Method Based on Improved YOLOv5. Front. Energy Res. 2022, 10, 928164.
17. Zhao, M.; Chang, C.H.; Xie, W.; Xie, Z.; Hu, J. Cloud shape classification system based on multi-channel cnn and improved fdm. IEEE Access 2020, 8, 44111–44124.
18. Zhao, S.; Liu, J.; Bao, X. Influence of pollution flashover on outdoor high-voltage power equipment and preventive measures. Mod. Manuf. Technol. Equip. 2021, 57, 138–140.
19. Li, X.; Su, H.; Liu, G. Insulator Defect Recognition Based on Global Detection and Local Segmentation. IEEE Access 2020, 8, 59934–59946.
20. Vriesman, D.; Britto, A.S.; Zimmer, A.; Koerich, A.L.; Paludo, R. Automatic visual inspection of thermoelectric metal pipes. Signal Image Video Process. 2019, 13, 975–983.
21. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv 2015, arXiv:1506.01497.
22. Lu, W.; Zhou, Z.; Ruan, X.; Yan, Z.; Cui, G. Insulator Detection Method Based on Improved Faster R-CNN with Aerial Images. In Proceedings of the 2021 2nd International Symposium on Computer Engineering and Intelligent Communications (ISCEIC), Nanjing, China, 6–8 August 2021; pp. 417–420.
23. Wu, C.; Ma, X.; Kong, X.; Zhu, H. Research on insulator defect detection algorithm of transmission line based on CenterNet. PLoS ONE 2021, 16, e0255135.
24. Xia, H.; Yang, B.; Li, Y.; Wang, B. An improved center-net model for insulator defect detection using aerial imagery. Sensors 2022, 22, 2850.
25. Wang, C.-Y.; Liao, H.-Y.M.; Wu, Y.-H.; Chen, P.-Y.; Hsieh, J.-W.; Yeh, I.-H. CSPNet: A new backbone that can enhance learning capability of CNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops 2020, Virtual, 14–19 June 2020; pp. 390–391.
26. Jin, B.; Cruz, L.; Gonçalves, N. Deep facial diagnosis: Deep transfer learning from face recognition to facial diagnosis. IEEE Access 2020, 8, 123649–123661.
27. Li, Z.; Rao, Z.; Ding, L.; Ding, B.; Fang, J.; Ma, X. YOLOv5s-D: A Railway Catenary Dropper State Identification and Small Defect Detection Model. Appl. Sci. 2023, 13, 7881.
28. Stefenon, F.S.; Singh, G.; Souza, J.B.; Freire, R.Z. Optimized hybrid YOLOu-Quasi-ProtoPNet for insulators classification. IET Gener. Transm. Distrib. 2023, 17, 3501–3511.
29. Zhou, T. A lightweight improvement of YOLOv5 for insulator fault detection. J. Phys. Conf. Ser. 2023, 2492, 12029.
30. Qi, Y.; Li, Y.; Du, A. Research on an Insulator Defect Detection Method Based on Improved YOLOv5. Appl. Sci. 2023, 13, 5741.
31. Wu, J.; Zhou, Y. An Improved Few-Shot Object Detection via Feature Reweighting Method for Insulator Identification. Appl. Sci. 2023, 13, 6301.
32. Singh, G.; Stefenon, S.F.; Yow, K.C. Interpretable visual transmission lines inspections using pseudo-prototypical part network. Mach. Vis. Appl. 2023, 34, 41.
33. Souza, B.J.; Stefenon, S.F.; Singh, G.; Freire, R.Z. Hybrid-YOLO for classification of insulators defects in transmission lines based on UAV. Int. J. Electr. Power Energy Syst. 2023, 148, 108982.
34. Yuan, J.; Zheng, X.; Peng, L.; Qu, K.; Luo, H.; Wei, L.; Jin, J.; Tan, F. Identification method of typical defects in transmission lines based on YOLOv5 object detection algorithm. Energy Rep. 2023, 9, 323–332.
35. Wang, S.; Gao, L.; Hu, T.; Fu, D.; Liu, W. Component Detection of Overhead Transmission Line Based on CBAM-Efficient-YOLOv5. J. Phys. Conf. Ser. 2023, 2456, 12020.
36. Zhao, J.; Liu, L.; Chen, Z.; Ji, Y.; Feng, H. A New Orientation Detection Method for Tilting Insulators Incorporating Angle Regression and Priori Constraints. Sensors 2022, 22, 9773.
37. Huang, Y.; Jiang, L.; Han, T.; Xu, S.; Liu, Y.; Fu, J. High-Accuracy Insulator Defect Detection for Overhead Transmission Lines Based on Improved YOLOv5. Appl. Sci. 2022, 12, 12682.
38. Yang, Y.; Wang, X. Insulator self-shattering detection based on YOLOv5 under small sample conditions. J. Phys. Conf. Ser. 2022, 2378, 12073.
39. He, H.; Zhang, Z.; Jia, Q.; Huang, L.; Cheng, Y.; Chen, B. Wildfire detection for transmission line based on improved lightweight YOLO. Energy Rep. 2023, 9, 512–520.
40. Zhang, J.; Lei, J.; Qin, X.; Li, B.; Li, Z.; Li, H.; Zeng, Y.; Song, J. A Fitting Recognition Approach Combining Depth-Attention YOLOv5 and Prior Synthetic Dataset. Appl. Sci. 2022, 12, 11122.
41. Li, Y.; Zou, G.; Zou, H.; Zhou, C.; An, S. Insulators and Defect Detection Based on the Improved Focal Loss Function. Appl. Sci. 2022, 12, 10529.
42. Li, Y.; Ni, M.; Lu, Y. Insulator defect detection for power grid based on light correction enhancement and YOLOv5 model. Energy Rep. 2022, 8, 807–814.
43. Wang, Q.; Si, G.; Qu, K.; Gong, J.; Cui, L. Transmission Line Foreign Body Fault Detection Using Multi-Feature Fusion Based on Modified YOLOv5. J. Phys. Conf. Ser. 2022, 2320, 12028.
44. Xu, W.; Zhong, X.; Luo, M.; Weng, L.; Zhou, G. End-to-End Insulator String Defect Detection in a Complex Background Based on a Deep Learning Model. Front. Energy Res. 2022, 10, 928162.
45. Han, G.; He, M.; Gao, M.; Yu, J.; Liu, K.; Qin, L. Insulator Breakage Detection Based on Improved YOLOv5. Sustainability 2022, 14, 6066.
46. Huang, W.; Li, T.; Xiao, Y.; Wen, Y.; Deng, Z. Insulator Defect Detection Algorithm Based on Improved YOLOv5s; Guilin University of Electronic Technology (China): Guilin, China; Guangxi Normal University: Guilin, China, 2022.
47. Neubeck, A.; Van Gool, L. Efficient non-maximum suppression. In Proceedings of the 18th International Conference on Pattern Recognition, Washington, DC, USA, 20–24 August 2006; Volume 3, pp. 850–855.
48. Setio, A.A.; Ciompi, F.; Litjens, G.; Gerke, P.; Jacobs, C.; van Riel, S.J.; Wille, M.M.; Naqibullah, M.; Sanchez, C.I.; van Ginneken, B. Pulmonary nodule detection in CT images: False positive reduction using multi-view convolutional networks. IEEE Trans. Med. Imaging 2016, 35, 1160–1169.
49. Mollalo, A.; Rivera, K.M.; Vahedi, B. Artificial neural network modeling of novel coronavirus (COVID-19) incidence rates across the continental United States. Int. J. Environ. Res. Public Health 2020, 17, 4204.
50. Tan, M.; Pang, R.; Le, Q.V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790.
51. Sunkara, R.; Luo, T. No More Strided Convolutions or Pooling: A New CNN Building Block for Low-Resolution Images and Small Objects. arXiv 2022, arXiv:2208.03641.
52. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. Cbam: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
53. Yang, L.; Zhang, R.Y.; Li, L.; Xie, X. Simam: A simple, parameter-free attention module for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 11863–11874.
54. Müller, K.R.; Mika, S.; Tsuda, K.; Schölkopf, K. An introduction to kernel-based learning algorithms. In Handbook of Neural Network Signal Processing; CRC Press: Boca Raton, FL, USA, 2018; pp. 4–16.

Figure 1. YOLOv5s algorithm network architecture.

Figure 2. (a) PAN network structure, (b) BiFPN network structure.

Figure 3. The SPD module: (a) the feature map X; (b) division of the original feature map; (c) the divided sub-feature maps; (d) the sub-feature maps concatenated along the channel dimension; (e) the new feature map X^{'}.

Figure 4. Channel attention module structure.

Figure 5. Spatial attention module structure.

Figure 6. The CBAM structure.

Figure 7. Schematic diagram of assigning 3D attention weights.

Figure 8. The optimal model network structure in the combination.

Figure 9. Comparison chart of precision.

Figure 10. Comparison chart of recall.

Figure 11. Comparison chart of mAP.

Figure 12. Detection diagram with the improved model.

Figure 13. Improvement model: detection for dispersed targets, (a) original picture, (b) the result of YOLOv5s model, and (c) the result of improved YOLOv5s model.

Figure 14. The comparison diagram of the original and improved models for the detection of confusing targets, (a) original picture, (b) the result of YOLOv5s model, and (c) the result of improved YOLOv5s model.

Figure 15. The comparison map of the original and improved model for the remote target detection, (a) original picture, (b) the result of YOLOv5s model, and (c) the result of improved YOLOv5s model.

Figure 16. The comparison diagram of the original and improved detection model for incomplete targets, (a) original picture, (b) the result of YOLOv5s model, and (c) the result of improved YOLOv5s model.

Table 1. Experimental environment configuration.

Parameter	Configuration
GPU	NVIDIA GeForce RTX3050 (4G)
Cuda	12.2
Python	3.9.12
Torch	1.11
Image size	320 × 320
Batch size	8
Epochs	300
Learning rate (initial)	0.01
Learning rate (final)	0.2

Table 2. The model of YOLO series algorithm analysis.

Model	Base Model	P	R	mAP
YOLOv5n	CSPDarknet53	0.955	0.870	0.861
YOLOv5s	CSPDarknet53	0.958	0.874	0.865
YOLOv5m	CSPDarknet53	0.957	0.872	0.864
YOLOv5l	CSPDarknet53	0.958	0.871	0.863
YOLOv5x	CSPDarknet53	0.957	0.875	0.862
YOLOv6s	EfficientRep	0.955	0.873	0.863
YOLOv8s	CSPDarknet53	0.957	0.874	0.864

Table 3. Comparison of the improved model with other advanced algorithms.

Model	Base Model	P	R	mAP
Faster-RCNN	ResNet-50	0.572	0.901	0.861
SSD	VGG-SSD	0.956	0.621	0.852
YOLOv3	CSPDarknet53	0.944	0.842	0.895
YOLOv3-tiny	Darknet53-tiny	0.909	0.821	0.862
YOLOv5s + BiFPN + SPD + CBAM	CSPDarknet53	0.978	0.883	0.890

Table 4. Model validity analysis.

Model	BiFPN	SPD	CBAM	SimAM	P	R	mAP
YOLOv5s	×	×	×	×	0.958	0.874	0.865
YOLOv5s + BiFPN	√	×	×	×	0.961	0.886	0.877
YOLOv5s + SPD	×	√	×	×	0.944	0.893	0.873
YOLOv5s + CBAM	×	×	√	×	0.949	0.889	0.876
YOLOv5s + SimAM	×	×	×	√	0.951	0.904	0.869
YOLOv5s + BiFPN + SPD	√	√	×	×	0.949	0.880	0.878
YOLOv5s + BiFPN + CBAM	√	×	√	×	0.955	0.886	0.874
YOLOv5s + BiFPN + SimAM	√	×	×	√	0.954	0.884	0.869
YOLOv5s + SPD + CBAM	×	√	√	×	0.930	0.879	0.877
YOLOv5s + SPD + SimAM	×	√	×	√	0.974	0.865	0.880
YOLOv5s + BiFPN + SPD + CBAM	√	√	√	×	0.978	0.883	0.890
YOLOv5s + BiFPN + SPD + SimAM	√	√	×	√	0.964	0.866	0.865
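
The gains of the three-module combination over the baseline can be read directly off Table 4. The short sketch below only does that arithmetic; the values are copied from the table, while the variable names and the helper `gains_in_points` are hypothetical names chosen for this example.

```python
# (precision, recall, mAP) copied from Table 4
baseline = (0.958, 0.874, 0.865)   # YOLOv5s
improved = (0.978, 0.883, 0.890)   # YOLOv5s + BiFPN + SPD + CBAM


def gains_in_points(base, new):
    """Return the per-metric difference in percentage points."""
    return tuple(round((n - b) * 100, 1) for b, n in zip(base, new))


for name, gain in zip(("P", "R", "mAP"), gains_in_points(baseline, improved)):
    print(f"{name}: {gain:+.1f} points")
```

The same helper can be applied to any other row of the table to compare an individual module or pairing against the baseline.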

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
