Research on Defect Diagnosis of Transmission Lines Based on Multi-Strategy Image Processing and Improved Deep Network

Gou, Ming; Tang, Hao; Song, Lei; Chen, Zhong; Yan, Xiaoming; Zeng, Xiangwen; Fu, Wenlong

doi:10.3390/pr12091832


by Ming Gou ¹, Hao Tang ¹, Lei Song ¹, Zhong Chen ¹, Xiaoming Yan ²,*, Xiangwen Zeng ³ and Wenlong Fu ²,*

¹ Yichang Electric Power Survey & Design Institute Co., Ltd., Yichang 443000, China
² College of Electrical Engineering and New Energy, China Three Gorges University, Yichang 443002, China
³ Hubei Zefeng Electric Power Design Co., Ltd., Yichang 443002, China
* Authors to whom correspondence should be addressed.

Processes 2024, 12(9), 1832; https://doi.org/10.3390/pr12091832

Submission received: 29 July 2024 / Revised: 25 August 2024 / Accepted: 26 August 2024 / Published: 28 August 2024

(This article belongs to the Special Issue Sustainable and Intelligent Energy Systems and Processes: Recent Advances and Challenges)


Abstract

Manual inspection of transmission line images captured by unmanned aerial vehicles (UAVs) is time-consuming and labor-intensive, and it is prone to high rates of false detections and missed defects. With the development of artificial intelligence, deep learning-based image recognition methods can automatically detect various categories of transmission line defects in UAV images. However, existing methods are often constrained by incomplete feature extraction and imbalanced sample categories, which limit detection precision. To address these issues, a novel method based on multi-strategy image processing and an improved deep network is proposed for defect diagnosis of transmission lines. Firstly, multi-strategy image processing is used to extract the effective area of transmission lines. Then, a generative adversarial network is employed to generate images of transmission lines to enhance the diversity of the training samples. Finally, the deep network GoogLeNet is improved by replacing the original cross-entropy loss function with a focal loss function to achieve deep feature extraction of images and defect diagnosis of transmission lines. A real, imbalanced transmission line dataset covering normal, broken-strand, and loose-strand conditions is used to validate the effectiveness of the proposed method. The experimental results and comparative analysis show that the proposed method is suitable for recognizing defects of transmission lines.

Keywords:

defect diagnosis; multi-strategy image processing; morphological analysis; generative adversarial network; GoogLeNet; focal loss

1. Introduction

Transmission lines serve as critical infrastructure for the transmission of electricity, playing a crucial role in modern industrial and urban social life [1,2]. However, these lines, exposed to natural environments over long periods, are prone to defects such as broken and loose strands, posing threats to the safe operation of electrical systems [3,4]. To ensure the reliable and safe operation of the power system, regular inspections of transmission lines are necessary. Initially, these inspections heavily relied on manual methods, which were both dangerous and inefficient [5,6,7]. With the advancement of drone technology, UAVs are now employed for transmission line inspections. While UAVs significantly reduce the need for hazardous manual aerial work, the analysis of transmission line images captured under visible light by UAVs requires costly manual rechecking and often leads to high misdiagnosis rates. Given these challenges, the effective and precise detection of faults in transmission lines has become an urgent problem that needs to be addressed.

Before deep learning was applied to defect detection in power transmission lines, research already focused on detecting broken and loose strands in these lines. Some researchers employed non-destructive testing theories to inspect the lines, which offered higher sensitivity than several common methods at the time. However, because the structure of power transmission lines differs from that of other metal components, the effectiveness of defect recognition with this method is often reduced. Komoda et al. [8] used visual inspection methods to detect defects in the lines. This method collects line images and identifies defects by extracting line contours or comparing images. Although this method is more direct than the previous one, it is sensitive to the angle of the acquisition device and to the image background, and it cannot achieve automatic identification of defects in transmission line images. Subsequently, researchers used traditional methods and signal characteristics to inspect transmission lines. Cheng et al. [9] proposed a method based on image space features to detect defects in the insulators of transmission lines. Yuan et al. [10] used non-destructive ultrasonic phased array technology to inspect composite insulators. Xiao et al. [11] designed a new overhead ground line detection technology based on the effect of gap, elevation distance, defect width, and section loss rate on the magnetic leakage signal.

With the rapid advancement of artificial intelligence [12,13], deep learning-based methods have been widely adopted for automatically modeling collected transmission line images, thereby facilitating the detection of defects such as broken and loose strands. Ni et al. [14] adapted the traditional Faster R-CNN model and utilized the Inception-ResNet-v2 network as the base feature extractor to detect defects in critical components of transmission lines. Chen et al. [15] proposed an enhanced Faster R-CNN network incorporating deformable convolutions and feature pyramid modules for the intelligent detection of transmission line defects. Fu et al. [16] employed a three-channel feature fusion network to enhance feature extraction capabilities while preserving spatial and semantic information, achieving high-precision detection of transmission line defects.

In recent years, an increasing number of scholars have dedicated their efforts to researching efficient denoising algorithms to enhance the clarity of images. Yu et al. [17] proposed a noise-reduction algorithm, the adaptive neighborhood weighted median filtering (NW-AMF) algorithm, to accurately identify insulator defects. The algorithm utilizes a weighted summation technique to calculate the median value of the neighborhood of a pixel point, effectively filtering out noise in the captured aerial images. Bhadra et al. [18] presented a novel architecture for anomaly detection and classification of high-voltage transmission lines. The architecture utilizes a self-attentive convolutional neural network augmented with wavelet transform (WSAT-CNN). The WSAT-CNN model is designed to improve noise immunity and prioritize fault characteristics. Shen et al. [19] designed a transmission line safety warning technology based on multi-source data sensing to address the issue of poor timeliness in traditional transmission line safety warnings. The multi-source data for the transmission line are acquired through preprocessing the transmission line video image, which includes histogram equalization, denoising, sharpening, edge detection, and segmentation.

The aforementioned studies provide various new perspectives on detecting faults in transmission lines. However, in practical situations, instances of transmission line faults are relatively rare compared with normal conditions, resulting in an imbalance between normal samples and samples with broken and loose strands. This significantly impacts the accuracy of line condition recognition. Additionally, directly applying photos captured by drones for image detection may lead to lower detection accuracy. Therefore, this paper proposes a method for detecting broken and loose strands in transmission lines based on multi-strategy image processing and an improved deep network. Firstly, to address the influence of lighting or other background factors on the recognition of captured images of broken and loose strands, multi-strategy image processing, including wavelet denoising-based image enhancement [20,21], HSV color space-based multi-threshold segmentation [22], and morphological analysis, is proposed to process the images. Subsequently, GAN [23] is employed to generate images of transmission lines to enhance the sample morphological diversity and reduce the effect of data imbalance. Finally, the deep network GoogLeNet is improved by superseding the original cross-entropy loss function with focal loss function [24] to enhance the defect recognition accuracy. The contributions of this paper are as follows:

(1): A multi-strategy image processing method is proposed to extract the transmission line area to reduce the interference of the environmental background in UAV photos;
(2): GAN is used to generate transmission line images, which enhance the diversity of sample morphology and reduce the impact of data imbalance;
(3): The focal loss function is introduced into the GoogLeNet feature extraction network so that the network can achieve higher fault detection accuracy in the case of an imbalance between class samples.

The rest of this paper is organized as follows: Section 2 describes the multi-strategy processing of the transmission line image to extract the valid region. Section 3 presents the transmission line defect detection method based on the improved deep network. Subsequently, in Section 4, a real transmission line image dataset is used to test the performance of the proposed method. Finally, the conclusions are provided in Section 5.

2. Multi-Strategy Image Processing

Extracting the line area is an important prerequisite for the accurate detection of broken strand defects in transmission lines. In this paper, a multi-strategy image processing method, including wavelet denoising-based image enhancement, HSV color space-based multi-threshold segmentation, and morphological analysis, is used to extract the line regions. This method effectively reduces the interference of background noise with the transmission line, which lays the groundwork for the subsequent detection of transmission line defects.

2.1. Image Enhancement Based on Wavelet Denoising

The influence of the image acquisition device or its surrounding environment can negatively affect the image quality, resulting in noise in the image. These noisy signals will degrade the quality of the image and may confuse useful information in the image, thus reducing the stability and accuracy of image processing.

In order to reduce noise interference in image processing, this paper uses wavelet hard thresholding to denoise the collected transmission line images. The basic idea of wavelet hard threshold denoising is to separate the image signal from the noise by using the wavelet transform and then process the wavelet coefficients according to a set threshold. The hard threshold function sets the decomposition coefficients smaller than the threshold in the different-scale spaces to zero, while preserving the decomposition coefficients larger than the threshold. Wavelet hard thresholding effectively separates image information from noise, removes noise, and retains useful details. It is a simple, easy-to-implement method widely used in image denoising.

The wavelet coefficient preservation calculation formula of wavelet hard threshold image denoising is as follows:

\hat{w}_{j,k} = \begin{cases} w_{j,k}, & |w_{j,k}| \geq \lambda \\ 0, & |w_{j,k}| < \lambda \end{cases},

(1)

where w_{j,k} is the wavelet coefficient and λ is the critical threshold.
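The hard threshold rule in Equation (1) can be illustrated with a minimal NumPy sketch. The coefficient values and the threshold `lam=0.3` below are hypothetical; in a full pipeline the coefficients would come from a 2-D wavelet decomposition of the image, which is not shown here.

```python
import numpy as np


def hard_threshold(w, lam):
    """Eq. (1): keep coefficients with |w| >= lam, set the rest to zero."""
    w = np.asarray(w, dtype=float)
    return np.where(np.abs(w) >= lam, w, 0.0)


# Hypothetical detail coefficients from one decomposition level.
coeffs = np.array([[0.8, -0.05, 0.02],
                   [-1.3, 0.10, -0.40]])
denoised = hard_threshold(coeffs, lam=0.3)
print(denoised)
```

Small coefficients, which are assumed to be dominated by noise, are removed, while large coefficients pass through unchanged.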

2.2. Multi-Threshold Segmentation Based on HSV Color Space

Image segmentation is one of the most critical steps in image processing. It refers to the process of dividing the pixels of an image into several sets of regions, where each set represents an entity or the background in the image. The threshold method converts a gray image to a binary image by dividing it into two regions based on pixel values and a threshold. It is essentially a transformation from an input image F to an output image G. The transformation formula is as follows:

G(i, j) = \begin{cases} 1, & F(i, j) \geq T \\ 0, & F(i, j) < T \end{cases},

(2)

where T is the threshold value, G(i, j) = 1 for elements in the target region, and G(i, j) = 0 for elements in the background.

HSV multi-threshold segmentation is an image processing technology that uses the characteristics of HSV color space to segment images by setting multiple thresholds. This method considers both color and brightness information, improving the accuracy of extracting target objects and features. By setting different thresholds, multiple target regions in an image can be effectively segmented and extracted, which can be used for various image processing and analysis tasks. Since the multi-threshold segmentation of HSV color space can better reflect the salient color features, it is widely used in the field of image processing. Among them, the three components of the HSV color space corresponding to the image are:

H = \begin{cases} 0^\circ, & \max = \min \\ 60^\circ \times \frac{G - B}{\max - \min} + 0^\circ, & \max = R \text{ and } G \geq B \\ 60^\circ \times \frac{G - B}{\max - \min} + 360^\circ, & \max = R \text{ and } G < B \\ 60^\circ \times \frac{B - R}{\max - \min} + 120^\circ, & \max = G \\ 60^\circ \times \frac{R - G}{\max - \min} + 240^\circ, & \max = B \end{cases},

(3)

S = \begin{cases} 0, & \max = 0 \\ 1 - \frac{\min}{\max}, & \text{otherwise} \end{cases},

(4)

where R, G, and B represent the red, green, and blue channels of the original image, respectively, max and min are the maximum and minimum of the R, G, and B values of a pixel, H represents the hue, and S represents the saturation. The third component, the value V, is equal to max.
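As a minimal sketch, Equations (3) and (4) can be written as a per-pixel function and compared with the standard-library `colorsys` conversion. The channel values are assumed to be scaled to [0, 1], and V = max follows the standard HSV definition rather than a numbered equation in this section.

```python
import colorsys


def rgb_to_hsv(r, g, b):
    """Convert one pixel with R, G, B in [0, 1] to (H in degrees, S, V)."""
    mx, mn = max(r, g, b), min(r, g, b)
    delta = mx - mn
    if delta == 0:
        h = 0.0
    elif mx == r:
        h = 60.0 * (g - b) / delta + (0.0 if g >= b else 360.0)
    elif mx == g:
        h = 60.0 * (b - r) / delta + 120.0
    else:
        h = 60.0 * (r - g) / delta + 240.0
    s = 0.0 if mx == 0 else 1.0 - mn / mx  # Eq. (4)
    v = mx  # value component, assumed standard definition V = max
    return h, s, v


pixel = (0.2, 0.4, 0.6)  # hypothetical bluish pixel
h, s, v = rgb_to_hsv(*pixel)
h_ref, s_ref, v_ref = colorsys.rgb_to_hsv(*pixel)  # reference, H in [0, 1)
print(h, s, v, h_ref * 360.0)
```

The reference implementation returns the hue as a fraction of a full turn, so it is multiplied by 360 before comparison.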

Thresholding is one of the most commonly used segmentation methods that classifies pixels in an image according to their gray value with a preset threshold. In transmission line segmentation, it is difficult to extract the transmission line region with single thresholding. To address this, we use multi-threshold segmentation with threshold intervals to divide the image into regions and then apply morphological processing to remove background noise and isolate the transmission line region.
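The interval-based segmentation described above can be sketched as follows. The function name, the tiny test image, and the threshold intervals are all hypothetical; the intervals used in the paper are not given in this section.

```python
import numpy as np


def multi_threshold_mask(hsv, intervals):
    """Return a binary mask that is 1 where every channel lies in its interval.

    hsv:       array of shape (rows, cols, 3) holding H (degrees), S, V.
    intervals: ((h_lo, h_hi), (s_lo, s_hi), (v_lo, v_hi)), inclusive bounds.
    """
    hsv = np.asarray(hsv, dtype=float)
    mask = np.ones(hsv.shape[:2], dtype=bool)
    for channel, (lo, hi) in enumerate(intervals):
        mask &= (hsv[..., channel] >= lo) & (hsv[..., channel] <= hi)
    return mask.astype(np.uint8)


# Hypothetical 2 x 2 HSV image: two greyish conductor pixels, two sky pixels.
hsv_image = np.array([[[0.0, 0.05, 0.60], [210.0, 0.70, 0.90]],
                      [[30.0, 0.10, 0.55], [205.0, 0.65, 0.95]]])
# Hypothetical intervals selecting low-saturation, mid-brightness pixels.
intervals = ((0.0, 360.0), (0.0, 0.2), (0.3, 0.8))
mask = multi_threshold_mask(hsv_image, intervals)
print(mask)
```

Each pixel is kept only if all three components fall inside their intervals, which is what distinguishes this from thresholding a single gray value.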

2.3. Extraction of Transmission Line Regions Based on Morphological Processing

Multi-threshold segmentation yields multiple connected regions, including the transmission line region, the insulator region, and many small areas of interference. Firstly, the segmented image is processed by noise reduction and the closing operation to eliminate small noise interference and form large connected target regions. Then, the areas of all connected regions are calculated, and the transmission line region is selected by screening and extracted. The formula for calculating the area of a connected region is as follows:

A = \sum_{i = 1}^{n} \sum_{j = 1}^{m} I (i, j),

(5)

where A represents the area of the connected region, I(i, j) represents the value of the pixel (i, j), and n and m represent the width and height of the image, respectively.

The principle of this formula is that the values of all pixels in the connected region are summed, and the result is the area of the connected region. Analysis of the transmission line images in this study shows that the transmission lines occupy the largest area, so the corresponding connected region is the largest. The transmission line region can therefore be extracted by selecting the largest connected region.
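A minimal sketch of selecting the largest connected region is given below. It assumes a binary input image and 4-connectivity, neither of which is specified in the text, and it uses a plain breadth-first search rather than any particular image-processing library.

```python
from collections import deque

import numpy as np


def largest_connected_region(binary):
    """Return (mask, area) of the largest 4-connected region of 1-pixels."""
    binary = np.asarray(binary, dtype=np.uint8)
    rows, cols = binary.shape
    visited = np.zeros((rows, cols), dtype=bool)
    best_pixels = []
    for r0 in range(rows):
        for c0 in range(cols):
            if binary[r0, c0] == 0 or visited[r0, c0]:
                continue
            # Breadth-first search collects one connected region.
            queue, pixels = deque([(r0, c0)]), []
            visited[r0, c0] = True
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and binary[nr, nc] == 1 and not visited[nr, nc]):
                        visited[nr, nc] = True
                        queue.append((nr, nc))
            if len(pixels) > len(best_pixels):
                best_pixels = pixels
    mask = np.zeros((rows, cols), dtype=np.uint8)
    for r, c in best_pixels:
        mask[r, c] = 1
    area = int(mask.sum())  # Eq. (5): sum of pixel values in the region mask
    return mask, area


# Hypothetical segmentation result: one long line plus two small blobs.
segmented = np.array([[1, 1, 1, 1, 1],
                      [0, 0, 0, 0, 0],
                      [0, 1, 0, 0, 1],
                      [0, 1, 0, 0, 0]])
line_mask, line_area = largest_connected_region(segmented)
print(line_area)
print(line_mask)
```

In this toy image the top row forms the largest region, so the two smaller blobs are discarded as interference.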

3. Detection of Transmission Line Defects

Multi-strategy image processing can effectively remove the noise and extract the effective region of the transmission line. The following work focuses on defect diagnosis of transmission lines based on an improved deep network. This approach helps to improve classification accuracy for the classes with few samples by making the model focus on difficult-to-classify samples, thereby enhancing the overall performance of the detection system.

3.1. GAN

GAN, proposed by Goodfellow et al. [25], consists of two sub-networks, a generator and a discriminator, as illustrated in Figure 1. This algorithm has demonstrated strong capabilities in learning data representations through mutual competition. The training strategy is defined by a minimax game in which both components are trained simultaneously. The generator (G) draws samples from a simple noise distribution, such as a Gaussian or uniform distribution, maps them to the data space of the real input data, and is trained to generate data that appear as realistic as possible. The discriminator (D), on the other hand, is trained to maximize the probability of correctly identifying the source of the input data. As a result of this adversarial training process, the distribution of the generated data tends to approximate that of the real data.

However, the original GAN uses the Jensen–Shannon divergence to measure the difference between distributions, which fails to accurately reflect the disparity between the two distributions. This makes it hard for the generator to obtain useful gradients for optimizing its parameters, often leading to poor sample quality in practice. To address this issue, Arjovsky et al. [26] proposed the Wasserstein GAN (WGAN) algorithm. However, weight clipping in WGAN can lead to optimization difficulties: the parameters often converge to the clipping boundaries, so the discriminator tends to learn a simplistic mapping function and the fitting capability of the network is not fully realized. Weight clipping can also easily cause vanishing or exploding gradients.

To further enhance training stability, a gradient penalty (GP) [27] was introduced on top of the original WGAN, leading to the development of WGAN with gradient penalty (WGAN-GP). The gradient penalty ensures that the discriminator meets Lipschitz constraints. The specific definition of the gradient penalty term is:

GP = \lambda E_{\hat{x} \sim p_{\hat{x}}} [(\|\nabla_{\hat{x}} D(\hat{x})\|_{2} - 1)^{2}],

(6)

where \hat{x} represents the linear interpolation between real samples and generated samples:

\hat{x} = t x_{r} + (1 - t) x_{f}, t \in [0, 1],

(7)

where \hat{x} is obtained by random interpolation between a real sample x_{r} and a generated sample x_{f}.

Upon the foundation of the original WGAN optimization objective, the objective function of WGAN-GP is as follows:

W(P_{r}, P_{g}) = E_{x_{r} \sim P_{r}} [D(x_{r})] - E_{x_{f} \sim P_{g}} [D(x_{f})] - \lambda E_{\hat{x} \sim p_{\hat{x}}} [(\|\nabla_{\hat{x}} D(\hat{x})\|_{2} - 1)^{2}],

(8)

where \|\cdot\|_{2} represents the L2 norm, ∇ is the gradient operator, and λ is the coefficient of the gradient penalty term, which is set to 10.
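To make Equations (6) to (8) concrete, the sketch below evaluates them for a toy critic D(x) = v · tanh(Wx), chosen only because its input gradient has a closed form. This critic, its random weights, and the uniform sampling of t are assumptions for illustration; a real WGAN-GP critic is a deep network whose gradient is obtained by automatic differentiation.

```python
import numpy as np


def critic(x, W, v):
    """Toy critic D(x) = v . tanh(W x), applied row-wise to a batch x."""
    return np.tanh(x @ W.T) @ v


def critic_grad(x, W, v):
    """Analytic gradient of the toy critic with respect to each input row."""
    h = np.tanh(x @ W.T)
    return ((1.0 - h ** 2) * v) @ W


def gradient_penalty(x_real, x_fake, W, v, lam=10.0, rng=None):
    """Eqs. (6)-(7): penalise gradient norms that deviate from 1."""
    rng = np.random.default_rng() if rng is None else rng
    t = rng.uniform(0.0, 1.0, size=(x_real.shape[0], 1))  # one t per sample
    x_hat = t * x_real + (1.0 - t) * x_fake                # Eq. (7)
    grad_norm = np.linalg.norm(critic_grad(x_hat, W, v), axis=1)
    return lam * np.mean((grad_norm - 1.0) ** 2)           # Eq. (6)


def wgan_gp_objective(x_real, x_fake, W, v, lam=10.0, rng=None):
    """Eq. (8): critic score gap minus the gradient penalty."""
    gap = critic(x_real, W, v).mean() - critic(x_fake, W, v).mean()
    return gap - gradient_penalty(x_real, x_fake, W, v, lam, rng)


rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))        # hypothetical critic weights
v = rng.normal(size=4)
x_real = rng.normal(size=(8, 3))   # stand-ins for real samples
x_fake = rng.normal(size=(8, 3))   # stand-ins for generated samples
gp = gradient_penalty(x_real, x_fake, W, v, rng=rng)
objective = wgan_gp_objective(x_real, x_fake, W, v, rng=rng)
print(gp, objective)
```

The penalty is zero only when the critic's gradient norm equals 1 at every interpolated point, which is how the Lipschitz constraint is encouraged without clipping the weights.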

3.2. GoogLeNet

GoogLeNet [28] is a deep convolutional neural network developed by the Google research team in 2014. It won first place in the ImageNet competition. The network structure is shown in Figure 2. The network structure made some improvements on the basis of LeNet [29] and AlexNet [30], and introduced new design ideas and techniques so that the network could better deal with complex image classification tasks. Compared with traditional convolutional neural networks, GoogLeNet uses a parallel architecture called the “Inception” module, which is able to simultaneously extract features at different scales and merge them together. This design enhances the network’s ability to capture details and global information without increasing its depth or number of parameters.

The network structure of GoogLeNet contains 22 network layers, but the number of parameters is only 1/36 of that of VGGNet. This is mainly due to the design of the “Inception” module, which reduces the number of parameters by using various convolution kernels of different sizes and pooling operations, and concatenating the feature maps at the end. This parameter efficiency enables GoogLeNet to perform training and inference with little computational resources. GoogLeNet also uses a parallel structure, letting different convolution and pooling operations run simultaneously in separate branches. This design can accelerate the training process of the network, reduce the training period, and improve the convergence and generalization ability of the network. The success of GoogLeNet not only promotes the development of deep learning in the field of image classification, but also provides important implications for subsequent network design. Its innovation and efficiency have enabled deep learning research to enter a new phase, laying the foundation for more complex computer vision tasks. In this paper, the powerful feature extraction capability of GoogLeNet is used to extract features of transmission lines.
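The parameter saving of a multi-branch module can be illustrated by counting convolution weights. The sketch below assumes that 1 × 1 convolutions reduce the channel count before the 3 × 3 and 5 × 5 branches, a detail not stated in the text above, and the branch widths are illustrative values rather than figures taken from this paper.

```python
def conv_params(in_ch, out_ch, k):
    """Number of weights in a k x k convolution (biases ignored)."""
    return in_ch * out_ch * k * k


def inception_params(in_ch, c1, c3r, c3, c5r, c5, pool_proj):
    """Weights in a four-branch module with 1 x 1 reduction layers."""
    return (conv_params(in_ch, c1, 1)                               # 1x1 branch
            + conv_params(in_ch, c3r, 1) + conv_params(c3r, c3, 3)  # 3x3 branch
            + conv_params(in_ch, c5r, 1) + conv_params(c5r, c5, 5)  # 5x5 branch
            + conv_params(in_ch, pool_proj, 1))                     # pooling branch


def naive_params(in_ch, c1, c3, c5):
    """Same branch widths, but 3x3 and 5x5 convolutions see all input channels."""
    return (conv_params(in_ch, c1, 1) + conv_params(in_ch, c3, 3)
            + conv_params(in_ch, c5, 5))


# Assumed branch widths (illustrative, not taken from this paper).
cfg = dict(in_ch=192, c1=64, c3r=96, c3=128, c5r=16, c5=32, pool_proj=32)
with_reduction = inception_params(**cfg)
without_reduction = naive_params(cfg["in_ch"], cfg["c1"], cfg["c3"], cfg["c5"])
out_channels = cfg["c1"] + cfg["c3"] + cfg["c5"] + cfg["pool_proj"]
print(with_reduction, without_reduction, out_channels)
```

The output channel count is the sum of the branch widths because the branch feature maps are concatenated along the channel axis.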

3.3. Focal Loss

The focal loss function [31] was proposed by Lin et al. as a solution to data imbalance in object detection. It weights the contribution of each sample to the loss according to its classification error: when the model classifies a sample correctly with high confidence, the contribution of that sample to the loss is reduced. This approach addresses class imbalance by indirectly focusing the loss on the challenging classes. To introduce the focal loss function, first consider the common binary classification cross-entropy loss function, which is expressed as follows:

CE(p, y) = \begin{cases} - \log (p), & y = 1 \\ - \log (1 - p), & \text{otherwise} \end{cases},

(9)

where p is the estimated probability of the model and p ∈ [0, 1], y ∈ {±1} represents the true class. The focal loss function can be extended to the case of multi-class classification by defining the parameter p_t, which is defined as follows:

p_{t} = \begin{cases} p, & y = 1 \\ 1 - p, & \text{otherwise} \end{cases},

(10)

According to Equation (10), the binary classification cross-entropy loss function can be rewritten as follows:

C E (p, y) = C E (p_{t}) = - \log (p_{t}),

(11)

The balanced binary classification cross-entropy loss function is introduced as follows:

C E (p_{t}) = - α_{t} \log (p_{t}),

(12)

Equation (12) addresses the problem of class imbalance by applying a weight factor α to class 1 and 1 − α to class −1. This formulation is a simple extension of the binary classification cross-entropy loss function, where α can be set to the inverse class frequency or treated as a hyperparameter chosen by cross-validation. The focal loss function is a further extension of the cross-entropy loss function that includes a modulating factor. The formula is given in Equation (13):

L_{f} = - \alpha_{t} (1 - p_{t})^{\gamma} \log (p_{t}),

(13)

where α and γ are both adjustable parameters, and γ ≥ 0 is a fixed value that controls how quickly well-classified samples are down-weighted. When γ = 0, the focal loss function reduces to the balanced cross-entropy loss function in Equation (12), and as γ increases, the effect of the modulating factor (1 − p_t)^γ also increases. The weight α_t is defined as follows:

\alpha_{t} = \begin{cases} \alpha, & y = 1 \\ 1 - \alpha, & \text{otherwise} \end{cases}

(14)

where α is used as a fixed value between 0 and 1 to balance the positive and negative labeled samples; this parameter constitutes a general solution for the balanced class, and the classification accuracy using the α-balanced form is better than that using the non-α-balanced form.
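The binary focal loss in Equation (13) can be sketched for a single sample as follows. The default values α = 0.25 and γ = 2 are assumptions for illustration, since this section does not state the values used in the experiments.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Eq. (13) for one sample; p is the predicted probability of class y = 1."""
    p_t = p if y == 1 else 1.0 - p              # Eq. (10)
    alpha_t = alpha if y == 1 else 1.0 - alpha  # Eq. (14)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def balanced_cross_entropy(p, y, alpha=0.25):
    """Eq. (12): alpha-balanced cross-entropy, i.e. focal loss with gamma = 0."""
    return focal_loss(p, y, alpha=alpha, gamma=0.0)


easy = focal_loss(0.95, 1)   # confidently correct positive sample
hard = focal_loss(0.30, 1)   # poorly classified positive sample
print(easy / balanced_cross_entropy(0.95, 1))  # modulating factor (1 - 0.95)^2
print(hard / balanced_cross_entropy(0.30, 1))  # modulating factor (1 - 0.30)^2
```

The two ratios show the effect of the modulating factor: the loss of the easy sample is scaled down far more strongly than the loss of the hard sample, so training concentrates on the hard one.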

3.4. Proposed Method

In practice, there are relatively few defects in transmission lines, resulting in an imbalance between normal samples and fault samples, which adversely affects the accuracy of line state recognition. Therefore, this paper proposes a novel method for defect diagnosis of transmission lines. Firstly, considering potential illumination and background interferences in captured images, multi-strategy image processing including wavelet denoising-based image enhancement, HSV color space-based multi-threshold segmentation, and morphological analysis is proposed to extract the transmission line area. Subsequently, GAN is employed to generate images of transmission lines to enhance the sample morphological diversity and reduce the effect of data imbalance. Finally, the deep network GoogLeNet is improved by superseding the original cross-entropy loss function with focal loss function to achieve deep feature extraction of images and defect diagnosis of transmission lines. The specific implementation steps are outlined below, and the flowchart is depicted in Figure 3.

Step 1: Utilize drones and other equipment to acquire original images of transmission lines.

Step 2: Preprocess the original image, including wavelet denoising, multi-threshold segmentation, and morphological processing.

Step 3: Perform sample enhancement on the processed transmission line image through GAN to reduce the impact caused by sample imbalance.

Step 4: Divide the transmission line image data into a training set and test set.

Step 5: Train the imbalanced training set using a GoogLeNet feature-extraction model based on the focal loss function.

Step 6: Save the best-performing trained model and evaluate it using the test set.

Step 7: Obtain detection results for broken strands and loose strands of the transmission lines.

4. Experimental Results and Analysis

The experimental part aims to verify the validity of the proposed method on the actual transmission line image dataset. In this study, three classical deep learning models, including AlexNet, MobileNet-V2 [32], and DenseNet [33], were selected as comparative experimental objects, which have excellent performance in fault diagnosis and are used widely in the area of image classification. By comparing the performance of different models in classification accuracy and calculation efficiency, the advantages and disadvantages of the proposed method are evaluated. Through the experimental results, the practicability and effect of the proposed method in practical application will be verified, and a scientific basis and reference will be provided for further engineering applications.

4.1. Dataset Introduction

A total of 1660 transmission line images were collected in this experiment, covering three health states: normal, loose strand, and broken strand. Among them, there were 660 normal samples, 646 loose-strand fault samples, and 354 broken-strand fault samples, and the resolution of each image was 3024 × 4032. The dataset was processed using the proposed HSV color space-based multi-threshold segmentation and morphological method, yielding the dataset shown in Figure 4. The processed dataset was then divided so that 60 samples of each class were used as the test set and the remaining samples of each class were used as the training set. The division of the dataset is shown in Table 1.
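The split sizes implied by these counts can be worked out with a short sketch. It uses only the class counts and the 60-per-class test size stated above, and it assumes that no GAN-generated samples are included, so the figures may differ from Table 1 if augmentation was applied before the split.

```python
# Class counts reported in the text (before any GAN-generated samples).
class_counts = {"normal": 660, "loose strand": 646, "broken strand": 354}
TEST_PER_CLASS = 60  # 60 test samples per class, as stated in the text

split = {
    name: {"train": count - TEST_PER_CLASS, "test": TEST_PER_CLASS}
    for name, count in class_counts.items()
}
total_train = sum(s["train"] for s in split.values())
total_test = sum(s["test"] for s in split.values())
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())

for name, sizes in split.items():
    print(f"{name}: train={sizes['train']}, test={sizes['test']}")
print(total_train, total_test, round(imbalance_ratio, 2))
```

Because the test set is balanced while the training set is not, the class imbalance affects training only, which is where the focal loss is applied.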

4.2. Results and Analysis

In this paper, the programming language is Python 3.8 and the deep learning framework is PyTorch 1.13.1. The computer runs Windows 10 with an Intel(R) Core(TM) i5-13490F CPU and an NVIDIA GeForce RTX 3060 Ti graphics card with 8 GB of video memory, and the CUDA version is 11.6.1. To limit the number of training parameters caused by the large input image size, the collected transmission line images with a resolution of 3024 × 4032 were first resized to 224 × 224. The Adam optimizer was used with the focal loss function, and the learning rate was set to 0.0002. The batch size of GoogLeNet was set to 16, and the number of training epochs was set to 200.

To verify the effectiveness of the improved model proposed in this paper, AlexNet with the cross-entropy loss function, MobileNet-V2, and DenseNet were selected for comparison. The other conditions of the three models were consistent with those of the model in this paper, except for the loss function, and all models were trained for 200 rounds. All experiments were repeated 10 times and the average metrics of 10 tests were employed for evaluation, including accuracy, recall, F1-score, and minimum loss. The metrics of all models are shown in Table 2, while the training curve and loss curve of the training set are shown in Figure 5, and the confusion matrices with the highest results among the 10 tests for all models are plotted in Figure 6.

As shown in Table 2, AlexNet exhibited the lowest classification accuracy of 86.34%, indicating its inferiority compared with the other three models in terms of overall classification capability. Although AlexNet achieved a recall of 90.11% and an F1-score of 89.10%, demonstrating strong performance in correctly identifying samples, its overall classification performance was limited by its lower accuracy, suggesting challenges in recognizing certain categories relative to the other models. In contrast, DenseNet excelled, with a classification accuracy of 94.01%, a recall of 97.20%, and an F1-score of 97.37%. This suggests that DenseNet provides a more comprehensive and balanced recognition ability across all categories, resulting in superior overall performance.

It is evident that the proposed model outperformed other methods across all metrics. With a classification accuracy of 97.83%, it demonstrated exceptional recognition capabilities. The recall rate of 97.81% and F1-score of 97.78% further highlighted the model’s excellence in both accurate identification and comprehensive classification. Additionally, achieving the lowest loss value of 0.04 substantiated its superior performance. Collectively, these metrics indicate that the proposed model has a significant performance advantage and exhibits outstanding overall effectiveness.

In addition, Figure 5 shows that the loss of the proposed model decreased more significantly and smoothly after the cross-entropy loss function was replaced. The classification accuracy reached a peak of 97.83% and stabilized after 150 epochs of training. The fluctuations in the accuracy curve before 150 epochs could be attributed to the model's insufficient extraction of transmission line image features and the incomplete mapping of these features to their corresponding labels. After 150 epochs, the accuracy curve of the proposed model became almost smooth and reached the highest accuracy among the compared models. In contrast, the average test accuracy of DenseNet reached 94.01%, but its training accuracy curve fluctuated strongly, the loss decline was unstable, and the convergence was poor. The average test accuracy of the AlexNet model was only 86.34%, and its training was unstable. The improved model proposed in this paper therefore had the best test accuracy, its loss decreased more smoothly, and its loss converged to the lowest value of 0.04 during training. The comparison of the four methods indicates that using GoogLeNet as the feature extraction model achieved a better feature extraction effect than AlexNet, MobileNet-V2, and DenseNet. At the same time, for the imbalanced training samples in the dataset, focal loss yields a better classification effect. Compared with the original cross-entropy loss function, the loss curve obtained with the focal loss function is smoother and more stable, which shows the effectiveness of the proposed method.

As shown in Figure 6, AlexNet had a poor classification effect across the categories, with a highest accuracy of only 87.78%. Although DenseNet and MobileNet-V2 had better classification results than AlexNet, with highest classification accuracies of 94.44% and 92.22%, respectively, they still misclassified more samples than the proposed method. In contrast, the highest accuracy of the proposed method reached 98.89%. Only two loose-strand samples were misclassified as broken-strand samples, which was attributed to a certain similarity in features between loose and broken faults in some images of transmission lines. The other comparative models had poorer classification ability due to their limited feature extraction capabilities.
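The best-run figure for the proposed method can be checked from a confusion matrix. The matrix below is reconstructed from the description above (60 test images per class, two loose-strand images predicted as broken); the exact layout of Figure 6 and the use of macro averaging for recall and F1-score are assumptions.

```python
import numpy as np

# Rows: true class, columns: predicted class, order: normal, loose, broken.
# Reconstructed from the text; the exact layout of Figure 6 is an assumption.
cm = np.array([[60, 0, 0],
               [0, 58, 2],
               [0, 0, 60]])

accuracy = np.trace(cm) / cm.sum()
recall = np.diag(cm) / cm.sum(axis=1)      # per-class recall
precision = np.diag(cm) / cm.sum(axis=0)   # per-class precision
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy = {accuracy:.4f}")
print(f"macro recall = {recall.mean():.4f}, macro F1 = {f1.mean():.4f}")
```

With 178 of 180 test images classified correctly, the accuracy rounds to the reported 98.89%.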

To further validate the effectiveness of the proposed improved GoogLeNet, the classification results of the four methods on the validation set were visualized by T-SNE (t-distributed stochastic neighbor embedding), as shown in Figure 7. As can be seen from the figure, the proposed method had the fewest misclassified samples among the four methods. In addition, DenseNet classified better than MobileNet-V2 and AlexNet. AlexNet performed worst on this dataset, with the most overlap among the three categories.

5. Conclusions

The traditional manual inspection of transmission line images captured by UAVs is increasingly being replaced by deep learning because of its capability to extract features automatically for defect detection. However, under practical conditions, the fault images captured by UAVs are often imbalanced and limited, as the probability of physical faults in transmission lines is much lower than that of normal states. Additionally, the captured images are usually affected by varying shooting backgrounds, which hinders effective feature extraction. Existing research rarely considers these issues simultaneously. To overcome the defects of traditional inspection methods and the shortcomings of existing deep-learning-based inspection methods, this paper proposes a novel method based on multi-strategy image processing and an improved deep network to conduct defect diagnosis of transmission lines.

Firstly, multi-strategy image processing, including image enhancement based on wavelet denoising, multi-threshold segmentation based on HSV color space, and morphological analysis, was proposed to reduce background interference and extract the effective area of transmission lines. Then, a GAN was used to generate transmission line images, which enhanced the diversity of sample morphology by augmenting the samples and reduced the impact of data imbalance. Finally, GoogLeNet was improved by replacing the original cross-entropy loss function with the focal loss function to realize deep feature extraction of the images and defect diagnosis of the transmission lines. An actual transmission line data set was used to test the performance of the proposed method. Experiments showed that, compared with the contrastive models, the proposed method had the best accuracy in the classification and detection of line defects.
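As an illustration of the HSV thresholding step only, the sketch below builds a binary mask that keeps pixels whose hue, saturation, and value all fall inside given ranges. The function name `hsv_mask`, the threshold ranges, and the toy pixel values are hypothetical; the thresholds used in the paper are not restated in this section, and the wavelet denoising and morphological steps are omitted.

```python
import colorsys


def hsv_mask(image, h_range, s_range, v_range):
    """Return a binary mask (1 = keep) for an RGB image given as nested lists.

    Each pixel is an (R, G, B) tuple with components in 0..255. A pixel is
    kept when its H, S and V values (each scaled to 0..1) lie inside all
    three closed ranges.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            inside = (
                h_range[0] <= h <= h_range[1]
                and s_range[0] <= s <= s_range[1]
                and v_range[0] <= v <= v_range[1]
            )
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask


# Toy 2x3 image: sky on the left, a gray conductor in the middle, vegetation on the right.
SKY, GRAY_A, GRAY_B, GREEN = (135, 206, 235), (128, 128, 128), (100, 100, 100), (34, 139, 34)
image = [
    [SKY, GRAY_A, GREEN],
    [SKY, GRAY_B, GREEN],
]

# Hypothetical thresholds: low saturation and medium brightness select gray metal.
mask = hsv_mask(image, h_range=(0.0, 1.0), s_range=(0.0, 0.2), v_range=(0.3, 0.8))
print(mask)
```

In this toy case only the low-saturation gray pixels pass the thresholds, while the saturated sky and vegetation pixels are rejected. A practical pipeline would normally follow this with morphological opening or closing to clean up the mask.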

Although the proposed method demonstrated excellent performance in detecting defects in transmission lines, it focused primarily on surface defects and is limited in detecting potential internal faults caused by overheating. Multimodal models are a potential direction for our future research, as they can achieve more comprehensive fault detection for transmission lines using multidimensional information, including images, sensor data, infrared imagery, and so on. However, studying multimodal models in complex transmission line environments may raise challenges related to equipment installation and data collection.

Author Contributions

Conceptualization, M.G. and X.Y.; methodology, M.G.; software, H.T. and L.S.; validation, Z.C. and X.Z.; writing—original draft preparation, M.G.; writing—review and editing, X.Y. and W.F.; visualization, H.T. and W.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to privacy, legal, or ethical reasons.

Conflicts of Interest

Authors Ming Gou, Hao Tang, Lei Song, and Zhong Chen were employed by the company Yichang Electric Power Survey & Design Institute Co., Ltd. Author Xiangwen Zeng was employed by the company Hubei Zefeng Electric Power Design Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. The basic structure of GAN.

Figure 2. Structure diagram of GoogLeNet.

Figure 3. Flowchart of the proposed method.

Figure 4. Sample schematic diagram of the transmission line defect dataset.

Figure 5. Test accuracy and loss curves for different models.

Figure 6. Confusion matrix of test results for different models.

Figure 7. T-SNE of test results for different models.

Table 1. Specific number of different defect classes.

Defect Class	Normal	Loose	Broken
Train	600	586	294
Test	60	60	60
Label	0	1	2
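The degree of imbalance in the training split of Table 1 can be quantified directly from the counts. The sketch below computes each class's share and, as one common heuristic, inverse-frequency class weights normalized to a mean-style scale; these weights are an illustrative assumption and are not stated to be the α values used in the paper.

```python
# Training counts copied from Table 1.
train_counts = {"normal": 600, "loose": 586, "broken": 294}

total = sum(train_counts.values())
num_classes = len(train_counts)

# Share of each class in the training set.
shares = {name: n / total for name, n in train_counts.items()}

# Ratio between the largest and the smallest class.
imbalance_ratio = max(train_counts.values()) / min(train_counts.values())

# Inverse-frequency weights, w_k = total / (K * n_k); a balanced set gives w_k = 1.
# This weighting scheme is an assumption for illustration only.
weights = {name: total / (num_classes * n) for name, n in train_counts.items()}

for name in train_counts:
    print(f"{name:>6}: share={shares[name] * 100:.2f}%  weight={weights[name]:.3f}")
print(f"imbalance ratio (largest / smallest) = {imbalance_ratio:.2f}")
```

The broken class makes up roughly a fifth of the training samples, about half the size of either other class, which is the kind of imbalance that focal loss and GAN-based augmentation are meant to mitigate.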

Table 2. Comparison of experimental results of different models.

Method	Accuracy (%)	Recall (%)	F1-Score (%)	Loss
AlexNet	86.34	90.11	89.10	0.20
DenseNet	94.01	97.20	97.37	0.16
MobileNet-V2	91.76	96.97	96.68	0.19
Proposed method	97.83	97.81	97.78	0.04

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
