Article

Lightweight Insulator and Defect Detection Method Based on Improved YOLOv8

1
Power Marketing Service & Operation Management Branch, Inner Mongolia Power (Group) Co., Ltd., Hohhot 010010, China
2
College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(19), 8691; https://doi.org/10.3390/app14198691 (registering DOI)
Submission received: 26 August 2024 / Revised: 24 September 2024 / Accepted: 24 September 2024 / Published: 26 September 2024

Abstract

Insulator and defect detection is a critical technology for the automated inspection of transmission and distribution lines within smart grids. However, the development of a lightweight, real-time detection platform suitable for deployment on drones faces significant challenges. These include the high complexity of existing algorithms, the limited availability of UAV images, and persistent issues with false positives and missed detections. To address these issues, this paper proposes a lightweight drone-based insulator defect detection method (LDIDD) that integrates data augmentation and attention mechanisms based on YOLOv8. Firstly, to address the limitations of the existing insulator dataset, data augmentation techniques are developed to enhance the diversity and quantity of samples in the dataset. Secondly, because the complexity of the network model hinders its application on UAV equipment, depthwise separable convolution is incorporated into the YOLOv8 framework to make it lightweight. Thirdly, a convolutional block attention mechanism is integrated into the feature extraction module to enhance the detection of small insulator targets in aerial images. The experimental results show that, compared to YOLOv8, the improved network reduces the computational volume by 46.6% while maintaining a stable mAP of 98.3%, which enables a lightweight insulator defect detection network suitable for deployment on UAV equipment without sacrificing detection performance.

1. Introduction

Insulators [1,2,3,4] are among the most common electrical components in transmission and distribution lines. Because they are exposed to the natural environment for long periods, they are susceptible to defects or self-explosion under extreme weather such as gusty winds and heavy rain, which affects the normal operation of the lines. The state of insulators is therefore an important element of electric power inspection. Compared with traditional manual inspection, the emerging Unmanned Aerial Vehicle (UAV) inspection technology offers high flexibility, safety, and economy. It can effectively address the high danger of live-line manual operations and the high labor intensity caused by the wide distribution of lines [5,6,7,8], which is of great significance for ensuring the safety of electric power.
Although the use of drones for photo shooting and analysis is more efficient than traditional manual on-site inspections, the images obtained by drones usually have a wide field of view, resulting in a complex and cluttered background of the insulator. In addition, a single image may contain multiple insulators at the same time, or the insulator target may be small. In this context, the detection of insulator defects faces significant challenges. Therefore, there is an urgent need for a detection method for small target insulators and their defects in complex backgrounds to improve detection efficiency.
In recent years, there have been many studies on collecting images with drones and then analyzing and detecting insulator defects with the help of deep learning algorithms. Cao et al. [9] introduced prior knowledge on insulators, namely, the edge information of insulators, to improve the accuracy of model detection. Their experimental results show that the improved ResNet-18 model can detect insulator self-explosion defects with an accuracy of 95.1%, and the method can be applied to different types of CNN networks. Huang et al. [10] proposed combining the insulator multi-fault target detection algorithm with deep learning to automatically learn the high-level features of multiple insulator images with different surface faults and added traditional low-level visual features (color features and texture features) to more fully extract the effective features of the image, thereby improving the recognition accuracy. The insulator fault detection method proposed by Han et al. [11] includes two steps. First, a new neural network is designed to accurately obtain the position of the insulator. Then, an insulator fault location method based on RoI (Region of Interest) is designed. The experimental results on various aerial images of insulators show that this method has high average location accuracy and low running time.
Yang et al. [12] proposed a defect detection method based on incremental learning for the problems of slow detection speed and high false detection rate in existing defect detection methods. This method can quickly screen and locate defective bolts in high-resolution images, thus assisting inspectors to a certain extent and improving work efficiency. Liu et al. [13] proposed a more effective detection method based on the YOLOv7 [14] model by introducing the Kmeans++ algorithm and an attention mechanism. This method is able to achieve the detection of tiny insulator targets in complex backgrounds, which substantially improves the accuracy of detection. Liu et al. [15] added a Coordinate Attention (CA) mechanism to YOLOv5 [16], proposed to replace the C3 network with a Transformer encoder block, and used a small target orientated detection head in order to improve accuracy in detecting small target insulators in UAV images, which significantly enhanced the detection effect. Liu et al. [17] achieved excellent performance in insulator fault detection in aerial images with different backgrounds by improving the YOLOv3 model with a Spatial Pyramid Pooling (SPP) network and multi-scale prediction network, and the detection capability of the improved model in complex backgrounds was significantly enhanced.
Extensive research on improving the performance of insulator defect detection has also been conducted [18,19,20,21,22]. The Single-Shot Detector (SSD) algorithm proposed by Liu et al. [23] fuses the bounding box positioning method of the YOLO algorithm with the region-preselected-box method of the Faster Region-based Convolutional Neural Network (Faster R-CNN) [24] algorithm, so that SSD can cope with low-resolution images while still guaranteeing a certain detection accuracy. Subsequently, the YOLO series algorithms [25,26,27] underwent multiple optimizations and improvements. Notably, the YOLOv3 algorithm was employed by Adou et al. [28] for insulator positioning and fault detection tasks, demonstrating superior detection performance and speed. However, due to the high complexity of YOLOv3, including its large computational and parameter demands, its practical application in engineering projects remains limited. These studies have demonstrated the feasibility and effectiveness of fault detection and defect analysis of UAV inspection images based on deep learning techniques. However, many issues still need further investigation, such as improving the detection of small-target insulator defect features, making the model lightweight enough for the hardware of inspection UAVs, and reducing the number of network parameters to improve real-time detection performance.
To address the aforementioned issues, this paper conducts an in-depth study on the intelligent detection and identification of insulator defects in transmission line inspections and proposes the lightweight drone-based insulator defect detection method (LDIDD). The main contributions are as follows:
  • For the problems of the small number of samples and unbalanced proportion in the existing dataset, this paper adopts a variety of data enhancement techniques, including methods such as brightness transformation and adding noise, to expand the size and diversity of the dataset. These enhancement techniques not only increase the number of samples in the dataset but also improve the robustness of the model. In addition, in this paper, the labels of discolored insulators are added on top of the existing labels of normal insulators and defective insulators. By incorporating the label type for discolored insulators, the dataset’s label types are enriched, allowing the model to identify more types of insulator defects and enhancing the comprehensiveness and accuracy of the detection.
  • This paper takes the YOLOv8 target detection algorithm as the base network and makes lightweight improvements to its feature extraction module. Specifically, the MobileNetv3 model [29,30,31] is used to replace the C3 module and convolution module in the Backbone part of YOLOv8. MobileNetv3 adopts depthwise separable convolution and the BNeck module, which significantly reduce the computational complexity and parameter count, making the model lightweight while maintaining high detection accuracy. In addition, this paper also optimizes the complexity of the improved model, which further improves its running efficiency and makes it more suitable for deployment on resource-constrained embedded systems and mobile devices.
  • Recognizing that insulator defects are often small target features in captured images and are therefore susceptible to misidentification or oversight, this study integrates a lightweight attention mechanism to alleviate these challenges. The convolutional block attention module (CBAM) [32] enhances the feature representation in the channel and spatial dimensions through two sub-modules, namely, channel attention and spatial attention, respectively, which increases the model’s attention to key features. This enhancement significantly boosts the accuracy of insulator and defect detection, especially in small target detection tasks.
The rest of this article is structured as follows: Section 2 provides a comprehensive overview of the data augmentation techniques employed, including image transformation and noise injection, as well as the dataset enhancement strategies implemented through the introduction of label variations. Section 3 delves into the development of the enhanced LDIDD method, encompassing the deployment of a depthwise separable convolutional algorithm, modifications to the YOLOv8 framework, and the integration of an attention mechanism. Section 4 describes the experimental setup, presents a detailed analysis of the results, and compares the performance metrics. Finally, Section 5 summarizes this study’s main findings and suggests potential directions for future research.

2. Dataset Processing and Augmentation

2.1. Basic Dataset

This study is based on the open-source Chinese Power Line Insulator Dataset (CPLID) [33]. This dataset is divided into two parts: normal insulators and defective insulators, containing 600 images of normal insulators and 248 images of defective insulators, respectively, all captured by UAVs. Due to the limited number of images in the dataset, the training model may not acquire sufficient discriminative features for generalization, potentially leading to overfitting. Therefore, this paper first employs data augmentation techniques to expand the dataset, constructing a comprehensive UAV inspection image dataset for training and testing.

2.2. Image Transformation for Data Augmentation

By analyzing the original images in the dataset, it was found that most images were captured under good lighting conditions. To enhance the model’s capability to handle target images under varying lighting environments and aerial angles, this section first applies brightness transformation to the original images in the dataset, simulating insulators under different lighting conditions. Subsequently, new image samples are generated through random rotation operations, including both clockwise and counterclockwise rotations. The matrices and formulas for random rotations are as follows:
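$$\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \\ 1 \end{bmatrix} \tag{1}$$
$$x = x_0 \cos\theta - y_0 \sin\theta, \qquad y = x_0 \sin\theta + y_0 \cos\theta \tag{2}$$
where $x$ and $y$ represent the coordinates of the pixels after rotation, $x_0$ and $y_0$ represent the original coordinates of the pixels, and $\theta$ is the rotation angle.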
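The rotation formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `rotate_point` and `rotate_about_center` are hypothetical, and rotating about an arbitrary center (for example, the image center) is an assumption the text does not state. Note that in image coordinates, where the y axis points downward, a positive angle appears as a clockwise rotation on screen.

```python
import math


def rotate_point(x0, y0, theta_deg):
    """Apply x = x0*cos(t) - y0*sin(t), y = x0*sin(t) + y0*cos(t) about the origin."""
    theta = math.radians(theta_deg)
    x = x0 * math.cos(theta) - y0 * math.sin(theta)
    y = x0 * math.sin(theta) + y0 * math.cos(theta)
    return x, y


def rotate_about_center(points, theta_deg, center):
    """Rotate a list of (x, y) points about `center` (assumed: e.g. the image center).

    Useful for moving bounding-box corners along with a rotated image.
    """
    cx, cy = center
    rotated = []
    for x0, y0 in points:
        # Shift to the origin, rotate, then shift back.
        x, y = rotate_point(x0 - cx, y0 - cy, theta_deg)
        rotated.append((x + cx, y + cy))
    return rotated
```

When an image is rotated for augmentation, its bounding-box labels need the same transformation; `rotate_about_center` shows one way to move the box corners consistently.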

2.3. Noise Injection for Data Augmentation

During UAV aerial photography of outdoor transmission lines, images may be affected by noise caused by rising camera temperature or uneven lighting conditions. Both types of noise approximately follow a normal distribution, so Gaussian noise can be introduced for image augmentation. The probability density function is given by
$$G(x) = \frac{1}{\sqrt{2\pi}\,\delta} \exp\left(-\frac{(x-\mu)^2}{2\delta^2}\right) \tag{3}$$
where μ is the mean and δ is the standard deviation. After introducing Gaussian noise, further random rotation operations are performed to generate new samples. The effects of image enhancement through rotation, brightness transformation, and noise processing are shown in Figure 1 below.
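As a rough sketch of the brightness and noise augmentations described in Sections 2.2 and 2.3, the snippet below works on a grayscale image stored as a nested list of 0 to 255 pixel values. The function names, the default noise level `delta=10.0`, and the multiplicative brightness model are all illustrative assumptions; the paper does not give these details.

```python
import math
import random


def gaussian_pdf(x, mu=0.0, delta=1.0):
    """Probability density G(x) of a normal distribution with mean mu and std delta."""
    return math.exp(-((x - mu) ** 2) / (2 * delta ** 2)) / (math.sqrt(2 * math.pi) * delta)


def clip_pixel(value):
    """Round and clamp a value to the valid 8-bit pixel range [0, 255]."""
    return min(255, max(0, round(value)))


def add_gaussian_noise(image, mu=0.0, delta=10.0, seed=None):
    """Add independent Gaussian noise to every pixel (delta=10.0 is an assumed default)."""
    rng = random.Random(seed)
    return [[clip_pixel(p + rng.gauss(mu, delta)) for p in row] for row in image]


def adjust_brightness(image, factor):
    """Scale pixel intensities; factor > 1 brightens, factor < 1 darkens (assumed model)."""
    return [[clip_pixel(p * factor) for p in row] for row in image]
```

Chaining `adjust_brightness`, `add_gaussian_noise`, and a rotation yields several new samples from each original image.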

2.4. Adding Label Variations

In the CPLID dataset, all insulators are made of red ceramic, with the defect label type exclusively limited to missing defects. This narrow scope restricts the generalizability and practical engineering applicability of the trained model, as it fails to account for the diverse range of insulator defects encountered in real-world scenarios. A thorough review of the literature reveals that in addition to missing defects, insulators are also susceptible to discoloration defects, often resulting from oxidation and corrosion. The absence of these defect types in the dataset significantly limits the model’s ability to address the broader spectrum of issues that can arise during insulator inspections, thereby diminishing its utility in practical engineering contexts.
To simulate the whitening defects commonly observed in red insulators, this study utilizes advanced image processing techniques to modify the color of the original insulator images. These alterations are designed to closely replicate the whitening phenomenon that occurs in real-world red insulators due to factors such as weathering, oxidation, and other environmental influences. The results of this discoloration process, which effectively mimic the visual characteristics of actual whitening defects, are presented in Figure 2.
By incorporating the simulation of real-world discoloration phenomena, the model becomes more versatile and better equipped to handle the diverse challenges encountered in practical engineering environments. Consequently, this approach not only broadens the model’s applicability but also increases its practical utility, ensuring that it can be effectively deployed in various real-world scenarios where different types of insulator defects are prevalent.

3. Improved YOLOv8 Lightweight Object Detection Network

3.1. Depthwise Separable Convolution

In the original YOLOv8s network architecture, the convolutional layer sequentially performs convolution operations across all channels of the input. While this approach enables the simultaneous extraction of both channel and spatial features, it also substantially escalates the computational complexity of the network. Specifically, each convolutional kernel must conduct convolution operations on every input channel, leading to a significant increase in computational and parameter overhead. This heightened complexity not only demands more computational resources but also results in a larger model size, potentially hindering the network’s efficiency and suitability for deployment in resource-constrained environments. The trade-off between feature extraction capability and computational efficiency presents a critical challenge in optimizing the network for practical applications, particularly where real-time processing and lower computational demands are essential.
To reduce the computational complexity, a depthwise separable convolutional algorithm [34,35,36] can be used. It reduces the computation and parameter count by decomposing the regular convolution into two separate steps. The first is depthwise convolution, which performs the convolution operation independently on each input channel, thus focusing on extracting the spatial features of each channel. This is followed by pointwise convolution, which combines information from different channels by blending the resulting feature maps across channels with a 1 × 1 convolution kernel. This approach significantly reduces the amount of computation and the number of parameters, yielding a lighter and faster network while maintaining the performance of the model.
As illustrated in Figure 3, the left side depicts the traditional convolutional approach, while the right side presents the depthwise separable convolutional method, which has been integrated into the MobileNetv3 model. This comparison effectively highlights the distinctions between the two convolutional operations. The conventional approach involves performing convolution operations across all input channels simultaneously, resulting in substantial computational and parameter overhead. In contrast, the depthwise separable convolutional approach, as applied in MobileNetv3, separates the convolution process into two distinct stages: depthwise convolution and pointwise convolution. This separation reduces the computational complexity and parameter count by focusing on individual channels separately before combining them, thereby optimizing both efficiency and performance.
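The two stages can be illustrated with a small pure-Python sketch. It is meant only to show the data flow, not to stand in for the MobileNetv3 implementation: feature maps are nested lists, and the function names are hypothetical. The sketch also uses stride 1 with no padding, which is a simplifying assumption, so the output is smaller than the input.

```python
def depthwise_conv(x, kernels):
    """Depthwise step: convolve each of the M channels with its own K x K kernel.

    x: list of M channels, each an H x W nested list.
    kernels: list of M kernels, each K x K.
    Assumes stride 1 and no padding, so each output is (H-K+1) x (W-K+1).
    """
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        h, w = len(channel), len(channel[0])
        out.append([
            [sum(channel[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(w - k + 1)]
            for i in range(h - k + 1)
        ])
    return out


def pointwise_conv(x, weights):
    """Pointwise step: a 1 x 1 convolution mixing M channels into N output channels.

    weights: N rows, each holding M coefficients.
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(wm * x[m][i][j] for m, wm in enumerate(row)) for j in range(w)]
         for i in range(h)]
        for row in weights
    ]


def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Depthwise convolution followed by pointwise convolution."""
    return pointwise_conv(depthwise_conv(x, dw_kernels), pw_weights)
```

The depthwise step never mixes channels and the pointwise step never looks at spatial neighbors, which is where the savings over a conventional convolution come from.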
The differences in computational complexity between depthwise separable convolution and conventional convolution can be analyzed as follows. Assume that the input feature map size is $D_F \times D_F \times M$ and the output feature map size is $D_F \times D_F \times N$, where $D_F$ is the width and height shared by the input and output feature maps, and $M$ and $N$ are the numbers of input and output channels.
For conventional convolution with a $D_K \times D_K$ kernel, the computational complexity is
$$D_K \times D_K \times M \times N \times D_F \times D_F \tag{4}$$
For depthwise separable convolution, the depthwise convolution part costs $D_K \times D_K \times M \times D_F \times D_F$ and the pointwise convolution part costs $M \times N \times D_F \times D_F$, so the total computational complexity is
$$D_K \times D_K \times M \times D_F \times D_F + M \times N \times D_F \times D_F \tag{5}$$
Therefore, the ratio of the cost of depthwise separable convolution to that of conventional convolution is
$$\frac{D_K \times D_K \times M \times D_F \times D_F + M \times N \times D_F \times D_F}{D_K \times D_K \times M \times N \times D_F \times D_F} = \frac{1}{N} + \frac{1}{D_K^2} \tag{6}$$
Generally, $N$ is relatively large, so the $1/N$ term is small. For example, if a 3 × 3 convolution kernel is used, the ratio is close to 1/9; that is, depthwise separable convolution reduces the amount of computation by roughly 8 to 9 times compared to conventional convolution.
The visual representation in Figure 3 underscores the practical advantages of adopting depthwise separable convolution, particularly in terms of reducing computational demands while maintaining effective feature extraction capabilities.
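The cost comparison above is easy to check numerically. The short sketch below evaluates the conventional cost, the depthwise separable cost, and the closed-form ratio $1/N + 1/D_K^2$; the layer sizes in the usage note are arbitrary illustrative values, not figures taken from the paper.

```python
def conv_cost(d_k, m, n, d_f):
    """Multiply-accumulate count of a conventional D_K x D_K convolution."""
    return d_k * d_k * m * n * d_f * d_f


def dsc_cost(d_k, m, n, d_f):
    """Cost of depthwise (D_K x D_K per channel) plus pointwise (1 x 1) convolution."""
    depthwise = d_k * d_k * m * d_f * d_f
    pointwise = m * n * d_f * d_f
    return depthwise + pointwise


def cost_ratio(d_k, n):
    """Closed-form ratio of depthwise separable cost to conventional cost."""
    return 1 / n + 1 / d_k ** 2
```

For an assumed layer with a 3 × 3 kernel, 64 input channels, 128 output channels, and a 40 × 40 feature map, `dsc_cost(3, 64, 128, 40) / conv_cost(3, 64, 128, 40)` agrees with `cost_ratio(3, 128)`, which is about 0.119, that is, a reduction of roughly 8.4 times.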
Here, the activation function is ReLU6, which clips the output of the convolution to a maximum of 6 and can thereby improve the robustness of the model. The definition of ReLU6 is as follows:
$$\mathrm{ReLU6}(x) = \begin{cases} 0, & x \le 0 \\ x, & 0 < x < 6 \\ 6, & x \ge 6 \end{cases} \tag{7}$$
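The piecewise definition of ReLU6 is equivalent to clamping the input to the interval [0, 6], as this one-line sketch shows (the function name `relu6` is chosen here for illustration):

```python
def relu6(x):
    """ReLU6: 0 for x <= 0, x for 0 < x < 6, and 6 for x >= 6."""
    return min(max(0.0, x), 6.0)
```

Bounding the activation keeps its range small, which is commonly cited as helpful for low-precision inference on mobile hardware.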

3.2. Lightweight YOLOv8s Enhancement

The basic YOLOv8s consists of three main components, namely, Backbone, Neck, and Head. Each component plays a critical role in the overall functionality of YOLOv8s, contributing to its efficacy in object detection and classification tasks.
The Backbone is mainly used for feature extraction. It consists of a convolutional neural network with pre-trained parameters and extracts feature maps at different levels through multi-layer convolution and pooling operations on the input image. These feature maps contain rich information about the image and are the basis for subsequent processing.
The main function of the Neck part is to fuse the multi-layer features extracted by the backbone part. Through feature fusion, the Neck part can retain and enhance the key information of the image, which makes the subsequent target detection and classification more accurate. The Neck part usually adopts structures such as Feature Pyramid Network (FPN) [37] or Path Aggregation Network (PANet) [38], in order to realize multi-scale feature fusion.
The Head part is responsible for using the fused features obtained from the Neck part for predicting the types and locations of the targets in the Bounding Box. It performs the target detection task by generating the class probability and location information of each detection frame through a series of convolutional and fully connected layers.
To optimize the YOLOv8s network, this paper introduces the BNeck module of MobileNetv3 to replace the convolutional and C3 modules in the original Backbone part. The network structure after optimization is shown in Figure 4. The red line in the figure represents the multi-scale fusion operation, which combines feature information from different scales so that the network can maintain high accuracy in detecting targets of different sizes; in particular, the detection of small targets is significantly improved.

3.3. Integration of CBAM Attention Mechanism

Considering that insulators have a small spatial proportion in the overall aerial image, they are typical small targets and thus are easily misidentified or missed. To solve this problem, a lightweight attention module CBAM is added to the feature extraction part of YOLOv8s, the specific structure of which is shown in Figure 5. The CBAM module enhances the accuracy of small target detection by adding an attention mechanism to the feature map, which enhances attention to the target features.
The CBAM module contains two sub-modules: the Channel Attention Module (CAM) and Spatial Attention Module (SAM). The CAM mainly works on the channel dimension. It enhances the representation of useful features by learning the importance weights of each channel while suppressing irrelevant or redundant features. SAM acts mainly on the spatial dimension. It enhances the spatial feature representation of the target object by learning the importance weights of each spatial location.
The CAM module first performs global average pooling and global maximum pooling on the input feature maps over the spatial dimensions and then inputs both pooling results into a shared MLP; the MLP outputs are combined into per-channel weights that adjust the feature strength of each channel. This approach preserves the channel dimension while compressing the spatial extent, allowing the network to focus more on the key feature information in the input image. The SAM module first performs average pooling and maximum pooling on the input feature map along the channel dimension and then concatenates the two pooling results in the channel dimension to form a two-channel feature map. This feature map is passed through a convolutional layer that outputs a spatial attention map, which is used to adjust the feature intensity at each spatial location in the input feature map. This approach preserves the spatial dimensions while compressing the number of channels, allowing the network to focus more on the spatial information of the target object.
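The CAM and SAM steps described above can be sketched in plain Python on nested-list feature maps. This is an illustrative sketch under stated assumptions, not the network's actual code: the function names and weight layouts are hypothetical, the sigmoid gating and the summation of the two MLP outputs follow the original CBAM design rather than details given in this paper, and the spatial kernel size is left to the caller.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def shared_mlp(vec, w1, w2):
    """Two-layer MLP with a ReLU hidden layer; the same weights serve both pooled vectors."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(x, w1, w2):
    """CAM: global average and max pooling per channel, shared MLP, sum, sigmoid."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    mx = [max(map(max, ch)) for ch in x]
    a, b = shared_mlp(avg, w1, w2), shared_mlp(mx, w1, w2)
    return [sigmoid(p + q) for p, q in zip(a, b)]


def spatial_attention(x, kernel):
    """SAM: mean and max over channels, then a K x K convolution and sigmoid.

    kernel: 2 x K x K weights over the [mean, max] maps; zero padding keeps H x W.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    mean_map = [[sum(x[m][i][j] for m in range(c)) / c for j in range(w)] for i in range(h)]
    max_map = [[max(x[m][i][j] for m in range(c)) for j in range(w)] for i in range(h)]
    maps = [mean_map, max_map]
    k = len(kernel[0])
    pad = k // 2
    att = []
    for i in range(h):
        row = []
        for j in range(w):
            s = 0.0
            for n in range(2):
                for a in range(k):
                    for b in range(k):
                        ii, jj = i + a - pad, j + b - pad
                        if 0 <= ii < h and 0 <= jj < w:
                            s += maps[n][ii][jj] * kernel[n][a][b]
            row.append(sigmoid(s))
        att.append(row)
    return att


def cbam(x, w1, w2, kernel):
    """Apply channel attention first, then spatial attention, to feature map x."""
    ca = channel_attention(x, w1, w2)
    x = [[[v * ca[m] for v in r] for r in ch] for m, ch in enumerate(x)]
    sa = spatial_attention(x, kernel)
    return [[[v * sa[i][j] for j, v in enumerate(r)] for i, r in enumerate(ch)] for ch in x]
```

Because both attention maps lie between 0 and 1, CBAM only rescales features; it adds few parameters (the small MLP and one convolution kernel), which is consistent with its description as a lightweight module.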

4. Experimental Results and Analysis

4.1. Experimental Environment

The host used for training has an AMD Ryzen 7 4800H processor (Advanced Micro Devices, Inc., Santa Clara, CA, USA) with a clock frequency of 2.9 GHz, 8 cores, and 16 threads. It is equipped with an RTX 2060 graphics card (NVIDIA, Santa Clara, CA, USA) with 6 GB of dedicated video memory and 8 GB of shared video memory, which meets the training requirements of the lightweight model. Training is carried out on the expanded dataset, with the batch size set to 16 and the number of epochs set to 200. The visual graphs generated by the YOLO network can be viewed during training to monitor how the metrics change. The dataset used is the CPLID dataset after data augmentation, split into training, validation, and test sets at a ratio of 8:1:1. In the network design, the activation function employed is ReLU, and the loss function used is Cross-Entropy. These settings were selected to keep training efficient and to optimize model performance.
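An 8:1:1 split of the augmented dataset could be produced along the following lines. The helper name `split_dataset`, the fixed seed, and the choice to shuffle before splitting are assumptions made for illustration; the paper does not describe how the split was drawn.

```python
import random


def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle items reproducibly and split them into train/validation/test subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test
```

One design point worth noting: if augmented copies of the same original image end up in both the training and test sets, test metrics can be optimistic, so splitting by original image before augmenting is a common precaution.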

4.2. Experimental Evaluation Metrics

If the confidence threshold is allowed to vary from 0 to 1 and the precision (P) and recall (R) values under each threshold are calculated, a two-dimensional curve, the PR curve, can be obtained by taking recall and precision as the horizontal and vertical coordinates, respectively. The average precision (AP) is then measured as the area under the PR curve:
$$AP = \int_0^1 P_{th}(r)\,dr \tag{8}$$
where the expressions for P and R are given next:
$$P = \frac{TP}{TP + FP} \tag{9}$$
$$R = \frac{TP}{TP + FN} \tag{10}$$
where TP represents the number of true positive samples, FP represents the number of false-positive samples, and FN represents the number of false-negative samples.
The mean average precision (mAP) is obtained by averaging the AP values over all classes and is expressed as
$$mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i \tag{11}$$
where $N$ is the number of classes and $AP_i$ is the AP of the $i$-th class. In the experimental process, mAP@0.5 (mAP at an IoU threshold of 0.5) is used as the evaluation metric. In this paper, mAP refers to mAP@0.5. The expression is given by Equation (12):
$$mAP@0.5 = \frac{1}{N} \sum_{i=1}^{N} AP_i(IoU_{th} = 0.5) \tag{12}$$
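The metrics above can be computed with a short sketch such as the following. It assumes that each detection of a class has already been matched against the ground truth at an IoU threshold of 0.5 and flagged as a true or false positive. The function names are hypothetical, and the monotone precision envelope used before integrating is a common convention in detection benchmarks, not something this paper specifies.

```python
def precision_recall(tp, fp, fn):
    """P = TP / (TP + FP) and R = TP / (TP + FN), returning 0.0 for an empty denominator."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return p, r


def average_precision(detections, num_gt):
    """Area under the PR curve for one class.

    detections: list of (confidence, is_true_positive) pairs, already matched
    to ground truth at a fixed IoU threshold (assumed 0.5 here).
    num_gt: number of ground-truth objects of this class.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    points = []  # (recall, precision) as the confidence threshold is lowered
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        points.append((tp / num_gt, tp / (tp + fp)))
    # Precision envelope (assumed convention): make precision non-increasing in recall.
    for i in range(len(points) - 2, -1, -1):
        points[i] = (points[i][0], max(points[i][1], points[i + 1][1]))
    ap, prev_recall = 0.0, 0.0
    for recall, precision in points:
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)
```

For instance, a class with two ground-truth objects and three ranked detections (true, false, true) gives an AP of 5/6 under this convention.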

4.3. Experimental Result Analysis

Figure 6 illustrates the curves of precision and recall metrics as they vary with epochs during training. It can be observed from the curves that the network converges rapidly in the early stages of training, with both precision and recall reaching approximately 0.9 after around the 30th epoch. As training progresses, the two metrics gradually stabilize around the 50th epoch and eventually settle at a high level above 0.97 by the 200th epoch. This indicates that the model exhibits strong detection performance and generalization ability on this dataset. Moreover, the stability of precision and recall suggests that the model did not experience significant overfitting or underfitting during training, validating the effectiveness and robustness of the designed network. Overall, these results demonstrate the excellent performance of the proposed method on this dataset, as well as the rationality of the network architecture and training strategy.
To comprehensively evaluate the performance of the optimized lightweight model, comparative experiments were conducted using the CPLID dataset. It is a large-scale and high-resolution dataset specifically developed for the detection of defects in power line insulators under various real-world conditions, including lighting variations, complex backgrounds, and multiple types of defects such as cracks and damage. Each image in the dataset was meticulously annotated by domain experts to ensure accuracy and reliability, making CPLID a robust benchmark for assessing defect detection performance.
The optimized models were trained and tested on the CPLID dataset together with several typical detection frameworks (such as YOLOv3, SSD, and Faster RCNN). The hyperparameter settings of these frameworks were set according to the parameters recommended in their original papers. By training and testing these models on the same dataset, their specific performance data under different evaluation criteria were obtained. All models were run on the same hardware environment, and evaluation indicators such as mAP, precision, and recall were calculated to ensure a fair and consistent comparison.
The experimental results, as shown in Table 1, demonstrate significant improvements in the accuracy and efficiency achieved by the optimized lightweight model compared to other models. The evaluation criteria values for the other frameworks were obtained by replicating their experiments using the same CPLID dataset, following the hyperparameters and settings outlined in their respective publications.
From the table, it can be seen that insulator detection is a multi-scale task, with larger insulator targets and smaller defect targets, and that two-stage object detection algorithms such as Faster R-CNN perform poorly on it. Particularly for smaller defect targets, the detection accuracy is only 63.6%, much lower than that of the single-stage object detection networks. This supports the choice of the single-stage YOLO network as the basis for lightweight improvement in this study.
Through iterations of the YOLO network version, the network structure has become more reasonable. The computational and parameter complexity of the YOLOv8s model is significantly lower than earlier versions like YOLOv3, mainly due to updates that reduced the number of residual blocks and convolutional kernels and reorganized module sequences to improve network performance. For the detection task in this study, YOLOv8s demonstrates high precision, with all metrics above 96% and an average accuracy of 96.3%. However, in terms of model parameters, YOLOv8s remains larger compared to lightweight networks such as MobileNetv3.
Compared to conventional object detection networks, lightweight networks like MobileNetv3 and ShuffleNetv2 have lower computational and parameter complexity but also exhibit lower recall rates (R), indicating poorer detection capabilities. In practical detection scenarios, these lightweight models often miss detections, identifying many potential targets as background, especially smaller defect targets. Therefore, this study addresses the insufficient detection accuracy of lightweight networks by adopting the multi-scale fusion concept of the YOLO network, achieving practical effectiveness.
Furthermore, the proposed algorithm incorporates MobileNetv3 modules to optimize the YOLOv8s Backbone network, reducing the model’s parameter and computational complexity by 46.6%. Meanwhile, the improved model shows a slight increase in precision, recall, and detection accuracy compared to the original YOLOv8s network; for example, the mAP@0.5 increased by about 2% while the GFLOPs decreased. All metrics of our model remain above 97%, indicating that its performance exceeds the task requirements. Thus, appropriate lightweight modifications do not compromise performance. The improved network also outperforms the MobileNetv3 network itself: its computational complexity is comparable to that of MobileNetv3 after lightweight optimization, while its detection accuracy is significantly higher.
Based on the comparative analysis above, the optimized method in this study introduces depthwise separable convolution to make the YOLOv8s model lightweight. This allows the model to sustain high accuracy and robustness in object detection tasks while significantly reducing resource consumption, with a substantial decrease in both parameter count and computational complexity and no loss of precision. This adaptation ensures that the YOLOv8s model meets the practical requirements for real-time applications on mobile platforms, balancing high performance with low computational overhead.

4.4. Ablation Experiment

To verify the effectiveness of the CBAM module for detecting small-target insulators, ablation experiments were also conducted. The experimental results are shown in Table 2 and Figure 7.
Table 2 shows that introducing the CBAM module slightly increases the number of network parameters (from 3,940,765 to 3,982,355), while the effect on computation is negligible (7.1 versus 7.2 GFLOPs). With CBAM added, the overall, insulator, and fading accuracies all rise, whereas the defect accuracy reported in Table 2 falls from 99.1% to 96.9%. The color-changing (fading) insulators show the largest improvement, at 4 percentage points. A possible reason is that the CBAM module strengthens the model’s attention to color-related key features, which improves its ability to distinguish color changes.
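The per-class changes and the parameter overhead in Table 2 can be tabulated with a few lines of arithmetic. The sketch below uses only the values reported in Table 2; the dictionary keys are illustrative names.

```python
# Values copied from Table 2.
without_cbam = {"overall": 93.8, "insulator": 96.9, "defect": 99.1, "fading": 84.9}
with_cbam = {"overall": 94.8, "insulator": 98.9, "defect": 96.9, "fading": 88.9}
params_without, params_with = 3_940_765, 3_982_355

# Change in accuracy per label, in percentage points.
deltas = {name: round(with_cbam[name] - without_cbam[name], 1) for name in without_cbam}

# Parameter overhead of CBAM, absolute and relative.
extra_params = params_with - params_without
extra_pct = extra_params / params_without * 100.0

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f} points")
print(f"extra parameters: {extra_params} ({extra_pct:.2f}%)")
```

The parameter overhead works out to roughly 1% of the network, which is consistent with describing CBAM as a lightweight attention module.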
Figure 7 shows that the overall detection performance of the network improves further after the CBAM module is introduced. Comparing Figure 7a,b, adding CBAM raises the detection accuracy of large insulators from 0.82 to 0.90, that of discoloration by 0.05, and that of small insulators from 0.72 to 0.78. Comparing Figure 7c,d, adding CBAM brings little improvement under a relatively simple background, where the detection accuracy of insulators and defects increases by only 0.01. The CBAM module therefore maintains good overall detection performance while keeping the network lightweight, and it improves the detection of small-target insulators. The improvement is most evident for insulators that occupy a small proportion of the image, for individual insulators with defects such as discoloration that are prone to missed detection, and for small targets.
Taken together, the results in Table 2 and Figure 7 show that the CBAM module improves the accuracy of insulator defect detection while keeping the network lightweight. The network therefore retains its efficiency and real-time performance, which makes it suitable for practical scenarios such as UAV inspection and provides a more effective and reliable solution for the intelligent detection of insulator defects in power transmission line inspection.

4.5. Results

Figure 8 shows the test results of the last epoch of the training process. Figure 8a shows the label visualization, in which all annotations in the label file are drawn on the image. Figure 8b shows the network’s predictions on the same batch of images, again drawn as boxes on the image. As the figure shows, the model achieves an accuracy of more than 0.9 in detecting insulator defects, and the average detection accuracy for large insulators is 0.94. Although the detection accuracy for small insulators is only 0.76, all insulators are still detected for small insulators with complex backgrounds, such as the picture in the first row and third column of Figure 8b. This shows that the improved model can accurately identify and locate insulators of different sizes and can also efficiently detect insulator defects. This high-precision detection capability is important for practical applications and can improve the efficiency and reliability of transmission line inspection.
To further assess the model’s robustness, it was tested on samples both before and after the addition of noise. Figure 9 presents a comparative analysis of the model’s performance in detecting insulator defects under these two conditions. When no noise is added, the accuracy of the model for the overall insulator in the picture is 0.84, and the accuracy for the defective insulator is 0.89. When noise is added to the same picture, the accuracy for the defective insulator decreases by 0.02 (to 0.87), but the accuracy for the overall insulator increases by 0.05 (to 0.89). This observation shows that the model maintains a high level of detection accuracy and completeness even when the accuracy of defective insulator detection is reduced by noise interference. Despite the adverse effects of noise, the model still identifies all defects, which highlights its robustness and reliability in noisy environments and enhances its potential in practical applications where such conditions are prevalent.
It can also be seen from the comparison graph that the addition of noise has little effect on the overall detection of insulators. The model is still able to accurately identify and locate insulators in the presence of noise. This indicates that the improved, optimized model performs well against noise interference, with strong robustness and stability.
The enhancement in anti-noise performance is particularly critical for practical applications, especially in the context of transmission line inspections where image data are often subject to interference from various external factors, such as adverse weather conditions and variations in the performance of filming equipment. The model proposed in this study demonstrates the capability to maintain high accuracy and robust detection integrity even in the presence of noise, thereby significantly enhancing the practicality and reliability of the system in complex, real-world environments. This improvement underscores the model’s suitability for deployment in challenging operational scenarios where consistent performance is essential.
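The noise test described above can be approximated with a simple perturbation routine. The following is a minimal sketch that adds zero-mean Gaussian noise to an 8-bit grayscale image stored as nested lists; the function name, the noise level `sigma`, and the synthetic test image are assumptions for illustration, since the noise parameters used in the experiments are not stated here.

```python
import random


def add_gaussian_noise(image, sigma=15.0, seed=None):
    """Return a copy of an 8-bit grayscale image with zero-mean Gaussian noise added.

    `image` is a list of rows, each a list of integer pixel values in [0, 255].
    `sigma` is a hypothetical noise level, not a value taken from the paper.
    """
    rng = random.Random(seed)
    noisy = []
    for row in image:
        # Perturb each pixel, round to an integer, and clip to the valid range.
        noisy.append([min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row])
    return noisy


# Synthetic 4 x 8 test image (illustrative only).
image = [[(x * 8 + y * 4) % 256 for x in range(8)] for y in range(4)]
noisy = add_gaussian_noise(image, sigma=15.0, seed=0)
print(noisy[0])
```

Running the detector on both `image` and `noisy` and comparing the per-box confidence scores mirrors the before-and-after comparison in Figure 9.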

5. Conclusions

This paper presents a lightweight drone-based algorithm for insulator defect detection. Due to the limited number and poor balance of samples in the CPLID insulator dataset, data augmentation techniques such as brightness variation and noise addition were applied to enrich sample types and quantity. Additionally, the dataset was enhanced by introducing color-changing insulator labels alongside existing labels for normal and defective insulators, thereby diversifying label types. Considering the limited hardware conditions of mobile devices, this study introduced depthwise separable convolution and used the MobileNetv3 model as the Backbone network to replace the feature extraction part of YOLOv8s for lightweight improvement. Furthermore, given the relatively small proportion of insulators in the images, a lightweight attention mechanism, CBAM, was integrated into the feature extraction module of YOLOv8s.
The experimental results demonstrate that the proposed LDIDD achieves a 46.6% reduction in computational complexity and an increase of about 2 percentage points in mAP@0.5 compared with the base YOLOv8s, reaching a stable mAP of around 98.3%. The findings confirm that the proposed method achieves both a lightweight design and high recognition accuracy in detecting insulator defects within UAV inspection images. Furthermore, it is suitable for deployment on the mobile devices of UAVs, offering a viable alternative to traditional manual inspection and meeting the real-time requirements of insulator detection on mobile platforms.
Future work will focus on adding more types of labels and refining the identification accuracy for each type of defect. Also, research could focus on optimizing the method’s real-time processing capabilities and enhancing its adaptability across diverse environments and tasks, especially in more intricate transmission line inspection scenarios. Additionally, exploring the integration of the model with edge computing technologies or expanding its application to smart grids and smart cities is also a promising area.

Author Contributions

Conceptualization, Y.L., X.L., and Z.W.; methodology, Z.W.; software, Y.L.; validation, X.L. and R.Q.; formal analysis, Y.C. and A.P.; investigation, Z.W.; resources, X.H.; data curation, R.Q.; writing—original draft preparation, X.L. and A.P.; writing—review and editing, Y.L. and A.P.; visualization, Y.C.; supervision, Z.W.; project administration, Y.L.; funding acquisition, Y.L. and Z.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Project of Inner Mongolia Power Co., Ltd., under Grant No. LX01234742, and was partly supported by the Key Foundation of Zhejiang Provincial Natural Science of China, under Grant Nos. LZ22F010005 and LTGY24F010002.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

Authors Yanxing Liu, Xudong Li, Ruyu Qiao, Yu Chen, Xueliang Han were employed by the company Inner Mongolia Power (Group) Co., Ltd. who provided funding and technical support for the work. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Lu, Z.; Li, Y.; Shuang, F.; Han, C. InsDef: Few-Shot Learning-Based Insulator Defect Detection Algorithm With a Dual-Guide Attention Mechanism and Multiple Label Consistency Constraints. IEEE Trans. Power Deliv. 2023, 38, 4166–4178.
  2. Liu, J.; Hu, M.; Dong, J.; Lu, X. Summary of Insulator Defect Detection Based on Deep Learning. Electr. Power Syst. Res. 2023, 224, 109688.
  3. Liu, Y.; Liu, D.; Huang, X.; Li, C. Insulator Defect Detection with Deep Learning: A Survey. IET Gener. Transm. Distrib. 2023, 17, 3541–3558.
  4. Li, X.; Su, H.; Liu, G. Insulator Defect Recognition Based on Global Detection and Local Segmentation. IEEE Access 2020, 8, 59934–59946.
  5. Ju, M.; Luo, H.; Wang, Z.; Hui, B.; Chang, Z. The Application of Improved YOLO V3 in Multi-Scale Target Detection. Appl. Sci. 2019, 9, 3775.
  6. Wang, Z.; Gao, Q.; Xu, J.; Li, D. A Review of UAV Power Line Inspection. In Proceedings of the Advances in Guidance, Navigation and Control; Yan, L., Duan, H., Yu, X., Eds.; Springer: Singapore, 2022; pp. 3147–3159.
  7. Lekidis, A.; Anastasiadis, A.G.; Vokas, G.A. Electricity Infrastructure Inspection Using AI and Edge Platform-Based UAVs. Energy Rep. 2022, 8, 1394–1411.
  8. Zormpas, A.; Moirogiorgou, K.; Kalaitzakis, K.; Plokamakis, G.A.; Partsinevelos, P.; Giakos, G.; Zervakis, M. Power Transmission Lines Inspection Using Properly Equipped Unmanned Aerial Vehicle (UAV). In Proceedings of the 2018 IEEE International Conference on Imaging Systems and Techniques (IST), Krakow, Poland, 16–18 October 2018; pp. 1–5.
  9. Cao, Y.; Xu, H.; Su, C.; Yang, Q. Accurate Glass Insulators Defect Detection in Power Transmission Grids Using Aerial Image Augmentation. IEEE Trans. Power Deliv. 2023, 38, 956–965.
  10. Huang, X.; Shang, E.; Xue, J.; Ding, H.; Li, P. A Multi-Feature Fusion-Based Deep Learning for Insulator Image Identification and Fault Detection. In Proceedings of the 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), Chongqing, China, 12–14 June 2020; Volume 1, pp. 1957–1960.
  11. Han, J.; Yang, Z.; Zhang, Q.; Chen, C.; Li, H.; Lai, S.; Hu, G.; Xu, C.; Xu, H.; Wang, D.; et al. A Method of Insulator Faults Detection in Aerial Images for High-Voltage Transmission Lines Inspection. Appl. Sci. 2019, 9, 2009.
  12. Yang, W.; Hao, J.; Yu, J. Defect Detection Method for Key Components of Transmission Line Based on Incremental Learning. In Proceedings of the 2022 3rd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), Xi’an, China, 15–17 July 2022; pp. 315–319.
  13. Liu, Z.; Tang, H.; Zheng, W.; Lu, X.; Huang, F.; Yu, B. Insulator Micro-Defect Recognition Based on Improved YOLOv7 Model. In Proceedings of the 2023 International Conference on Internet of Things, Robotics and Distributed Computing (ICIRDC), Rio De Janeiro, Brazil, 29–31 December 2023; pp. 454–458.
  14. Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475.
  15. Liu, L.; Fu, J.; Yan, L.; Xie, H. Insulator Defect Detection Based on Deep Learning. In Proceedings of the 2023 3rd International Conference on Electrical Engineering and Mechatronics Technology (ICEEMT), Nanjing, China, 21–23 July 2023; pp. 836–839.
  16. Jocher, G. YOLOv5 by Ultralytics. Available online: https://github.com/ultralytics/yolov5 (accessed on 10 July 2024).
  17. Liu, J.; Liu, C.; Wu, Y.; Xu, H.; Sun, Z. An Improved Method Based on Deep Learning for Insulator Fault Detection in Diverse Aerial Images. Energies 2021, 14, 4365.
  18. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
  19. Zhong, J.; Liu, Z.; Yang, C.; Wang, H.; Gao, S.; Núñez, A. Adversarial Reconstruction Based on Tighter Oriented Localization for Catenary Insulator Defect Detection in High-Speed Railways. IEEE Trans. Intell. Transp. Syst. 2022, 23, 1109–1120.
  20. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988.
  21. Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving Into High Quality Object Detection. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6154–6162.
  22. Zhang, H.; Chang, H.; Ma, B.; Wang, N.; Chen, X. Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training. In Proceedings of the Computer Vision—ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XV; Springer: Berlin/Heidelberg, Germany, 2020; pp. 260–275.
  23. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the Computer Vision—ECCV 2016; Leibe, B., Matas, J., Sebe, N., Welling, M., Eds.; Springer International Publishing: Cham, Switzerland, 2016; pp. 21–37.
  24. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
  25. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6517–6525.
  26. He, M.; Qin, L.; Deng, X.; Liu, K. MFI-YOLO: Multi-Fault Insulator Detection Based on an Improved YOLOv8. IEEE Trans. Power Deliv. 2024, 39, 168–179.
  27. Zhang, Q.; Zhang, J.; Li, Y.; Zhu, C.; Wang, G. IL-YOLO: An Efficient Detection Algorithm for Insulator Defects in Complex Backgrounds of Transmission Lines. IEEE Access 2024, 12, 14532–14546.
  28. Adou, M.W.; Xu, H.; Chen, G. Insulator Faults Detection Based on Deep Learning. In Proceedings of the 2019 IEEE 13th International Conference on Anti-Counterfeiting, Security, and Identification (ASID), Xiamen, China, 25–27 October 2019; pp. 173–177.
  29. Zhao, L.; Wang, L. A New Lightweight Network Based on MobileNetV3. KSII Trans. Internet Inf. Syst. TIIS 2022, 16, 1–15.
  30. Jia, L.; Wang, Y.; Zang, Y.; Li, Q.; Leng, H.; Xiao, Z.; Long, W.; Jiang, L. MobileNetV3 With CBAM for Bamboo Stick Counting. IEEE Access 2022, 10, 53963–53971.
  31. Qian, S.; Ning, C.; Hu, Y. MobileNetV3 for Image Classification. In Proceedings of the 2021 IEEE 2nd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), Nanchang, China, 26–28 March 2021; pp. 490–497.
  32. Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the Computer Vision—ECCV 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 3–19.
  33. Zhang, X.; Zhang, Y.; Liu, J.; Zhang, C.; Xue, X.; Zhang, H.; Zhang, W. InsuDet: A Fault Detection Method for Insulators of Overhead Transmission Lines Using Convolutional Neural Networks. IEEE Trans. Instrum. Meas. 2021, 70, 1–12.
  34. Yang, D.; Luo, Z. A Parallel Processing CNN Accelerator on Embedded Devices Based on Optimized MobileNet. IEEE Internet Things J. 2023, 10, 18844–18852.
  35. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.-C. MobileNetV2: Inverted Residuals and Linear Bottlenecks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4510–4520.
  36. Howard, A.; Sandler, M.; Chen, B.; Wang, W.; Chen, L.-C.; Tan, M.; Chu, G.; Vasudevan, V.; Zhu, Y.; Pang, R.; et al. Searching for MobileNetV3. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1314–1324.
  37. Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 936–944.
  38. Wang, K.; Liew, J.H.; Zou, Y.; Zhou, D.; Feng, J. PANet: Few-Shot Image Semantic Segmentation With Prototype Alignment. arXiv 2019, arXiv:1908.06391.
  39. Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767.
  40. Ma, N.; Zhang, X.; Zheng, H.-T.; Sun, J. ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. arXiv 2018, arXiv:1807.11164.
  41. Jocher, G.; Chaurasia, A.; Qiu, J. Ultralytics YOLO. Available online: https://github.com/ultralytics/ultralytics (accessed on 20 September 2024).
Figure 1. Data enhancement diagram: (a) original image; (b) clockwise rotation 90°; (c) rotated 180°; (d) increased brightness; (e) reduced brightness; (f) Gaussian noise added.
Figure 2. Added color-changing label. The red squares highlight the visual characteristics of the actual whitening defects.
Figure 3. A comparison of the structure of depthwise separable convolution with that of conventional convolution, indicating two distinct stages, depthwise and pointwise.
Figure 4. A diagram of the proposed network structure, indicating the revisions in the Backbone, Neck, and Head parts of the original YOLOv8.
Figure 5. A diagram of the CBAM’s structure, indicating the composition of the CAM and SAM substructures.
Figure 6. The performance curves during training: (a) recall; (b) precision.
Figure 7. Comparison before and after the addition of the CBAM module: (a) Image 1 without CBAM; (b) Image 1 with CBAM; (c) Image 2 without CBAM; (d) Image 2 with CBAM.
Figure 8. A detailed diagram of the test’s visualization results: (a) label chart; (b) prediction chart.
Figure 9. Detection result: (a) before adding noise; (b) after adding noise.
Table 1. Performance comparison between lightweight models and other models.

| Model | Insulator Detection AP/% | Defect Detection AP/% | Precision/% | Recall/% | mAP@0.5/% | GFLOPs |
| --- | --- | --- | --- | --- | --- | --- |
| SSD [23] | 89.3 | 87.2 | 79.6 | 80.7 | 88.3 | 115.7 |
| Fast R-CNN [24] | 87.9 | 63.6 | 64.8 | 85.9 | 75.8 | 130.4 |
| MobileNetv3 [31] | 88.4 | 80.8 | 92.7 | 76.3 | 84.6 | 6.3 |
| YOLOv3 [39] | 91.6 | 82.2 | 90.5 | 83.5 | 87.9 | 154.6 |
| ShuffleNetv2 [40] | 80.2 | 84.4 | 91.5 | 76.5 | 82.3 | 8.0 |
| YOLOv8s [41] | 96.3 | 96.2 | 96.0 | 96.9 | 96.3 | 13.5 |
| Our Improved Method | 98.9 | 97.8 | 97.4 | 98.4 | 98.3 | 7.2 |
Table 2. Comparison of detection accuracy and network complexity before and after the introduction of CBAM.

| Model | Overall/% | Insulator/% | Defect/% | Fading/% | Parameters | GFLOPs |
| --- | --- | --- | --- | --- | --- | --- |
| Without CBAM | 93.8 | 96.9 | 99.1 | 84.9 | 3,940,765 | 7.1 |
| With CBAM | 94.8 | 98.9 | 96.9 | 88.9 | 3,982,355 | 7.2 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Liu, Y.; Li, X.; Qiao, R.; Chen, Y.; Han, X.; Paul, A.; Wu, Z. Lightweight Insulator and Defect Detection Method Based on Improved YOLOv8. Appl. Sci. 2024, 14, 8691. https://doi.org/10.3390/app14198691
