Article

Real-Time Detection of Insulator Defects with Channel Pruning and Channel Distillation

1 School of Electronic Information, Central South University, Changsha 410082, China
2 School of Automation, Central South University, Changsha 410082, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(19), 8587; https://doi.org/10.3390/app14198587
Submission received: 27 August 2024 / Revised: 14 September 2024 / Accepted: 19 September 2024 / Published: 24 September 2024
(This article belongs to the Special Issue Deep Learning for Object Detection)

Abstract

Insulators are essential for electrical insulation and structural support in transmission lines. With the advancement of deep learning, object detection algorithms have become primary tools for detecting insulator defects. However, challenges such as low detection accuracy for small targets, weak feature map representation, the insufficient extraction of key information, and a lack of comprehensive datasets persist. This paper introduces OD (Omni-dimensional dynamic)-YOLOv7-tiny, an enhanced insulator defect detection method. We replace the YOLOv7-tiny backbone with FasterNet and optimize the convolution structure using PConv, improving spatial feature extraction efficiency and operational speed. Additionally, we incorporate the OD-SlimNeck feature fusion module and a decoupled detection head to enhance accuracy. For deployment on edge devices, channel pruning and channel-wise distillation are applied, significantly reducing model parameters while maintaining high accuracy. Experimental results show that the improved model reduces parameters by 53% and increases accuracy and mean average precision (mAP) by 3.9% and 2.2%, respectively. These enhancements confirm the effectiveness of our lightweight model for insulator defect detection on edge devices.

1. Introduction

The development of China’s power industry has led to increasingly long and complex transmission lines. Insulators, as critical components, serve dual purposes: providing mechanical support and ensuring electrical insulation [1]. Exposed to the atmosphere year-round, insulators are susceptible to natural factors such as lightning strikes, pollution, and severe cold. Failures in insulators can cause transmission lines to contact towers or other lines, resulting in power outages and potentially large-scale blackouts. Consequently, regular inspections of porcelain insulators on transmission lines are crucial for the reliable delivery of electricity across the grid [2]. Traditionally, insulator inspections have been performed manually. However, inspecting high-voltage transmission lines poses significant challenges in terms of difficulty, efficiency, and safety [3].
The rapid development of UAV (unmanned aerial vehicle) technology has enabled power systems to use drones to capture images of insulators along high-voltage transmission lines [4]. These images are then analyzed and diagnosed using image processing algorithms. Despite this advancement, the wide angles and variable perspectives of drone photography present challenges such as small target sizes and complex backgrounds, leading to low detection accuracy, weak feature map representation, and the limited extraction of key information. Object detection algorithms based on deep learning have proven highly effective at identifying defects in target objects, such as external defects of capacitors [5]. As a result, researchers are increasingly utilizing image processing and deep learning techniques to detect insulator defects.
To address these challenges, this paper proposes an improved method for insulator defect detection based on the YOLOv7-tiny network. We replace the backbone network with the FasterNet architecture to reduce the model’s parameters. In the neck part of the YOLO model, we introduce the OD-SlimNeck feature fusion network, incorporating full-dimension dynamic convolution in the feature extraction layer. This enhancement not only improves the performance of the lightweight network but also enhances the detection of small targets. Additionally, a decoupled detection head is used to separately predict classification and localization tasks, thereby increasing the accuracy of insulator defect detection.
To further optimize the model, we applied channel pruning, significantly reducing the parameter size without a notable decrease in accuracy. Finally, to enhance detection accuracy further, we used the YOLOv7 model as a teacher model and applied channel distillation (CD) to our pruned model, thereby improving the accuracy of insulator defect detection.
This research aims to employ an improved deep learning network architecture for detecting insulator defects and to lighten the model through channel pruning and channel knowledge distillation, enabling deployment on mobile devices such as drones. The academic contributions of this paper are summarized as follows:
1. Enhanced Feature Fusion Network: We proposed the OD-SlimNeck network architecture for detecting insulator defects using drones. This enhancement improves the detection of small defects and prevents the loss of detailed information. By integrating multi-scale features and leveraging both local and global features, the network’s feature mapping representation is enhanced, thereby improving detection performance. The full-dimension dynamic convolution allows the network to extract regions of interest from complex backgrounds and interference-laden images, focusing on key targets and further boosting the model’s performance.
2. Model Compression through Channel Pruning: We applied the group slim channel pruning method to compress the improved model. Before pruning, the model undergoes sparsity training, ensuring that, within certain pruning limits, its performance is not significantly compromised. Channel pruning yields a more lightweight network with fewer parameters and a lower computational load, and its effectiveness is evaluated.
3. Channel Distillation for Accuracy Enhancement: We employed the YOLOv7 model as a teacher model to perform channel distillation on the pruned model. This method enhances the accuracy of insulator defect detection without increasing the number of parameters, providing a feasible solution for deploying insulator defect detection models on mobile devices like drones, ensuring the precise identification of insulator defects.
The remainder of the study is organized as follows: Section 2 describes the development of object detection technology. Section 3 details the improvements made to the insulator defect detection model, including channel pruning and channel distillation. Section 4 presents the experimental details and results, covering data, evaluation metrics, and model performance analysis. Finally, Section 5 summarizes the findings of this research.

2. Related Work

Object detection technology can be broadly categorized into traditional methods and deep learning-based methods [6], with the introduction of AlexNet [7] marking the transition between the two. Currently, mainstream deep learning-based object detection algorithms fall into two major categories: (1) region proposal-based two-stage object detection algorithms and (2) regression-based end-to-end single-stage object detection algorithms. Typical two-stage object detection algorithms include R-CNN [8], Fast R-CNN [9], and Faster R-CNN [10]. These methods first generate region proposals and then classify them, which often yields higher accuracy but at the cost of increased computational complexity. Representative single-stage object detection algorithms include YOLOv3 [11], YOLOv4 [12], YOLOv7 [13], and RetinaNet [14]. The YOLO (You Only Look Once) series, known for its speed and versatility, has become a significant focus of research in recent years. Redmon et al. introduced the YOLO algorithm in 2016 [15], framing object detection as a regression problem to enable end-to-end rapid detection. Since its inception, the YOLO series has undergone continuous improvements and upgrades, incorporating effective advancements from other researchers in the field.
Network pruning, a viable method for eliminating redundant branches within networks, is divided into two primary types: weight pruning and structured pruning. Weight pruning targets and removes inconsequential connections with minimal weights in trained networks, as noted in [16]. This approach, while effective, requires specialized sparse matrix operation libraries for acceleration, which may not be well suited to hardware implementations. In contrast, structured pruning, as discussed in [17], tends to offer a more balanced trade-off between flexibility and ease of implementation. Recent advancements in this field [18] involve entirely removing channels based on their small incoming weights, rather than just individual weights, followed by fine-tuning the network to restore accuracy. Additionally, the methods cited in [19,20] apply sparsity regularization to various structural levels of convolutional neural networks (CNNs), including filters, channels, or layers.

3. Materials and Methods

3.1. YOLOV7-Tiny Algorithm

As one of the most prominent object detection algorithms, the YOLO series effectively balances detection speed and accuracy. YOLOv7 is available in multiple versions, each varying in terms of the width and depth of the network model, yet all sharing a similar structure. Given the importance of speed in insulator defect detection using drone aerial photography, this paper selects the smallest and fastest version: YOLOv7-tiny. The specific structure is illustrated in Figure 1.

3.2. The Improved YOLOv7-Tiny Network Model

The improved YOLOv7-tiny builds upon the original YOLOv7-tiny by incorporating the FasterNet architecture as the backbone network, introducing the OD-SlimNeck feature extraction framework in the neck part, and utilizing a decoupled detection head. These enhancements reduce the number of parameters while improving the accuracy of insulator defect detection, making it more suitable for target detection in drone aerial images. The framework of the improved YOLOv7-tiny algorithm is illustrated in Figure 2, with its specific structure detailed in the following sections.
1. FasterNet Backbone
When conducting insulator defect detection tasks using drone aerial photography, it is crucial to ensure that the detection model can be deployed on mobile devices like drones while maintaining high detection speed. To achieve this, we focus on designing cost-effective, high-speed neural networks.
The YOLOv7-tiny neural network architecture includes some redundant computations, leading to a higher number of floating-point operations (FLOPs). This increase in FLOPs contributes to the model’s latency. The relationship among latency, FLOPs, and FLOPS is shown below:
$$\mathrm{Latency} = \frac{\mathrm{FLOPs}}{\mathrm{FLOPS}} \tag{1}$$
FLOPs denotes the number of floating-point operations, and FLOPS denotes floating-point operations per second; their ratio measures computational latency. As Formula (1) shows, the FasterNet backbone improves detection speed by effectively reducing FLOPs while increasing FLOPS [21], thereby lowering latency and increasing computation speed without compromising accuracy.
Depthwise Convolution (DWConv) is a prevalent optimization technique used in backbone networks. Unlike traditional convolution, DWConv allocates a single convolution kernel to each channel, ensuring that each channel is processed by exactly one kernel. This approach significantly reduces redundant computation and floating-point operations (FLOPs). However, substituting regular convolution with DWConv alone may reduce network accuracy, so pointwise convolution (PWConv) is often employed alongside DWConv to restore precision [22]. In this setup, the number of DWConv channels is expanded from c to c′, exceeding the original channel count c of standard convolution to offset the accuracy loss incurred by DWConv. Nevertheless, this increase in channels requires more memory access, elevating latency and diminishing overall computational efficiency. The memory access of DWConv is given in Equation (2), where h and w denote the height and width of the feature map, respectively, and c′ is the expanded channel count.
$$h \times w \times 2c' + k^2 \times c' \approx h \times w \times 2c' \tag{2}$$
The memory accesses for regular convolution are as follows:
$$h \times w \times 2c + k^2 \times c^2 \approx h \times w \times 2c \tag{3}$$
When c′ exceeds c, the memory demand of DWConv clearly surpasses that of standard convolution. A novel convolution module is therefore needed to remedy the inefficiencies of both regular convolution and DWConv and to enhance detection speed.
Compared with regular Conv and DWConv, PConv in FasterNet applies conventional convolution to only a part of the input channels to extract spatial features, leaving the remaining channels unchanged. If the feature map is stored contiguously or regularly in memory, the first or last consecutive channels are taken as representative of the entire feature map. According to the experimental results, compared to regular convolution, PConv requires only 1/16th of the floating-point operations, and its memory access is reduced to just 1/4th of that of regular convolution. Similar to DWConv, the FasterNet module adds pointwise convolution (PWConv) to PConv to capture correlations between input channels. This combination forms two structures: a T-shaped convolution and two independent convolutions. Comparisons of the different convolution modes are shown in Figure 3.
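To make the mechanism concrete, the following is a minimal PyTorch sketch of PConv, assuming a partial ratio of c_p = c/4 (consistent with the 1/16 FLOPs figure above, since (1/4)² = 1/16); the class name and ratio are illustrative, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Minimal sketch of partial convolution (PConv): a regular k x k
    convolution is applied to only the first c_p channels, and the
    remaining channels pass through untouched."""

    def __init__(self, channels: int, kernel_size: int = 3, partial_ratio: float = 0.25):
        super().__init__()
        self.c_p = int(channels * partial_ratio)  # channels that get convolved
        self.conv = nn.Conv2d(self.c_p, self.c_p, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split: convolve the leading c_p channels, keep the rest as identity.
        x1, x2 = torch.split(x, [self.c_p, x.size(1) - self.c_p], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)

# Quick check: shapes are preserved.
x = torch.randn(1, 64, 56, 56)
print(PConv(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```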
Unlike traditional convolution, T-shaped convolution weights the central position more heavily. While the T-shaped convolution module can be used directly for more efficient computation, it requires more floating-point operations (FLOPs) than PConv combined with PWConv. For identical input and output features, the FLOPs of T-shaped convolution are as follows:
$$h \times w \times \left(k^2 \times c_p \times c + c \times (c - c_p)\right) \tag{4}$$
The FLOPs of PConv and PWConv are as follows:
$$h \times w \times \left(k^2 \times c_p^2 + c \times c_p\right) \tag{5}$$
where $c_p$ is the number of first or last consecutive channels accessed in contiguous memory, with $c > c_p$ and $c - c_p > c_p$. Clearly, the FLOPs of T-shaped convolution exceed those of PConv combined with PWConv.
Each FasterNet Block consists of one partial convolution (PConv) layer and two pointwise convolution (PWConv) layers, as illustrated in Figure 4. Moreover, normalization and activation layers play crucial roles in neural networks, serving as indispensable components. In each FasterNet module, Batch Normalization (BN) and Rectified Linear Units (ReLUs) [23] are integrated into the two PWConv layers. BN enhances training speed and accuracy, while the ReLU, serving as the activation function, accelerates model training and mitigates the issue of vanishing gradients. Positioning the normalization and activation layers between the two PWConv layers of each FasterNet module helps reduce latency and maintain feature diversity.
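Below is a hedged sketch of this block layout in PyTorch, reusing the PConv class from the previous sketch; the expansion factor and the residual connection follow the FasterNet design in [21], but the exact widths are assumptions.

```python
import torch
import torch.nn as nn

class FasterNetBlock(nn.Module):
    """Sketch of a FasterNet block: one PConv followed by two 1x1 PWConv
    layers, with BN and ReLU placed only between the two PWConv layers.
    Assumes the PConv class from the previous sketch."""

    def __init__(self, channels: int, expansion: int = 2):
        super().__init__()
        hidden = channels * expansion
        self.pconv = PConv(channels)
        self.pwconv1 = nn.Conv2d(channels, hidden, 1, bias=False)
        self.bn = nn.BatchNorm2d(hidden)
        self.act = nn.ReLU(inplace=True)
        self.pwconv2 = nn.Conv2d(hidden, channels, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shortcut = x
        x = self.pconv(x)
        x = self.act(self.bn(self.pwconv1(x)))  # BN + ReLU between the PWConvs
        x = self.pwconv2(x)
        return x + shortcut  # residual connection preserves feature diversity
```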
Replacing the backbone network of YOLOv7-tiny with FasterNet effectively decreases FLOPs and increases FLOPS, which in turn reduces memory access and latency. This enhancement shortens the detection time per image, leading to higher FPS values.
2. New Feature Fusion Network: OD-SlimNeck
Object detection is a critical downstream task in computer vision. For many mobile devices, models with large numbers of parameters struggle to meet real-time detection requirements. To accelerate prediction, input images in a CNN almost always undergo a similar transformation in the backbone: spatial information is progressively transferred into the channel dimension. Each spatial (width and height) compression and channel expansion of the feature maps can cause a partial loss of semantic information. Dense convolutional computation maximally preserves the hidden connections between individual channels, whereas sparse convolution severs these connections completely.
GSConv aims to maintain this connectivity throughout its application [24], but deploying it across all stages of the model increases network depth, which hinders data flow and substantially extends inference time. By the time feature maps reach the neck stage, they are at their most condensed (maximal channel dimension, minimal spatial dimension), so further transformations are unnecessary. Consequently, a more efficient strategy is to employ GSConv only in the neck. GSConv replaces standard convolution (SC) at a computational cost of about 60–70% of SC while contributing comparably to the model's learning capability. The GS bottleneck layer is then built on top of GSConv to further enhance the model's performance. Figure 5 illustrates the structure of the GSConv module.
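A rough PyTorch sketch of the GSConv idea follows: half the output channels come from a dense standard convolution, the other half from a cheap depthwise convolution of that result, and a channel shuffle mixes the two halves. The kernel sizes and activation choice are assumptions based on [24], not a verified reproduction.

```python
import torch
import torch.nn as nn

class GSConv(nn.Module):
    """Sketch of GSConv: a dense half plus a depthwise half, interleaved
    by a channel shuffle so dense and cheap channels mix."""

    def __init__(self, c1: int, c2: int, k: int = 1, s: int = 1):
        super().__init__()
        c_ = c2 // 2  # assumes c2 is even
        self.dense = nn.Sequential(nn.Conv2d(c1, c_, k, s, k // 2, bias=False),
                                   nn.BatchNorm2d(c_), nn.SiLU(inplace=True))
        self.cheap = nn.Sequential(nn.Conv2d(c_, c_, 5, 1, 2, groups=c_, bias=False),
                                   nn.BatchNorm2d(c_), nn.SiLU(inplace=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y1 = self.dense(x)
        y2 = self.cheap(y1)  # depthwise branch keeps computation low
        y = torch.cat((y1, y2), dim=1)
        # Channel shuffle: interleave dense and depthwise channels.
        b, c, h, w = y.shape
        return y.view(b, 2, c // 2, h, w).transpose(1, 2).reshape(b, c, h, w)
```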
The construction of the GS bottleneck module is illustrated in Figure 6a. Following this, we have designed the cross-layer part of the network, the VoV-GSCSP module, using a one-time aggregation approach. The VoV-GSCSP module maintains satisfactory accuracy while reducing computational and architectural complexity. Figure 6b depicts the construction of the VoV-GSCSP module. Moreover, the SlimNeck was designed with hardware-friendliness in mind, facilitating easier deployment and operation across various hardware platforms. Therefore, replacing the neck part of YOLOv7-tiny with SlimNeck not only maintains good performance, but also enhances efficiency and broadens potential applications.
Due to the parameter-sharing characteristics of convolutional neural networks (CNNs) [25], traditional feature filters utilize convolutional kernel parameters that are fixed for the entire image. Consequently, when optimizing CNNs, there is often a need to upscale both the model size and parameters, as well as to utilize larger datasets. This requirement inevitably results in a significant increase in computational overhead. Omni-dimensional dynamic convolution (ODConv) expands upon traditional dynamic convolution kernels by incorporating the spatial kernel size and the number of input and output channels within the kernel space [26]. The ODConv is represented in Figure 7. This approach establishes a four-dimensional parallel strategy that leverages complementary attention mechanisms. The ODConv module substitutes the standard convolution in the backbone network. This substitution enhances network performance, rendering it more efficient and streamlined. The computational equations are as follows:
$$z = \left(\alpha_1 \odot W_1 + \cdots + \alpha_N \odot W_N\right) * y \tag{6}$$
$$\alpha_i = \alpha_{w_i} \odot \alpha_{f_i} \odot \alpha_{c_i} \odot \alpha_{s_i}, \quad i = 1, 2, \ldots, N \tag{7}$$
where $y \in \mathbb{R}^{h \times w \times c_{in}}$ and $z \in \mathbb{R}^{h \times w \times c_{out}}$ represent the input and output features, respectively. $W_i$, the $i$-th convolution kernel with $c_{out}$ filters, is defined by $W_m^i \in \mathbb{R}^{k \times k \times c_{in}}$, where $m = 1, \ldots, c_{out}$. The scalar $\alpha_{w_i} \in \mathbb{R}$ weights $W_i$ with an attention mechanism; $\alpha_{f_i} \in \mathbb{R}^{c_{out}}$, $\alpha_{c_i} \in \mathbb{R}^{c_{in}}$, and $\alpha_{s_i} \in \mathbb{R}^{k \times k}$ are newly introduced attention scales. The operation $\odot$ denotes element-wise multiplication across the different dimensions of the kernel space, and $*$ denotes convolution.
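The following simplified PyTorch sketch illustrates Equations (6) and (7): four attention branches computed from a GAP descriptor of the input modulate N candidate kernels, which are then aggregated into one kernel per sample. The reduction ratio, the number of kernels, and the grouped-convolution trick for per-sample kernels are illustrative assumptions, not the reference ODConv implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ODConv2d(nn.Module):
    """Simplified sketch of omni-dimensional dynamic convolution."""

    def __init__(self, c_in, c_out, k=3, num_kernels=4, reduction=16):
        super().__init__()
        self.k, self.c_in, self.c_out, self.N = k, c_in, c_out, num_kernels
        self.weight = nn.Parameter(torch.randn(num_kernels, c_out, c_in, k, k) * 0.02)
        hidden = max(c_in // reduction, 4)
        self.fc = nn.Linear(c_in, hidden)
        # One attention head per kernel-space dimension.
        self.attn_spatial = nn.Linear(hidden, k * k)      # alpha_s
        self.attn_in = nn.Linear(hidden, c_in)            # alpha_c
        self.attn_out = nn.Linear(hidden, c_out)          # alpha_f
        self.attn_kernel = nn.Linear(hidden, num_kernels)  # alpha_w

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b = x.size(0)
        ctx = F.relu(self.fc(x.mean(dim=(2, 3))))  # GAP descriptor, (b, hidden)
        a_s = torch.sigmoid(self.attn_spatial(ctx)).view(b, 1, 1, 1, self.k, self.k)
        a_i = torch.sigmoid(self.attn_in(ctx)).view(b, 1, 1, self.c_in, 1, 1)
        a_o = torch.sigmoid(self.attn_out(ctx)).view(b, 1, self.c_out, 1, 1, 1)
        a_n = torch.softmax(self.attn_kernel(ctx), dim=1).view(b, self.N, 1, 1, 1, 1)
        # Eqs. (6)-(7): element-wise modulate each kernel, then sum over N.
        w = (a_n * a_o * a_i * a_s * self.weight.unsqueeze(0)).sum(dim=1)
        # Grouped-conv trick to apply a different kernel to each sample.
        out = F.conv2d(x.reshape(1, b * self.c_in, *x.shape[2:]),
                       w.reshape(b * self.c_out, self.c_in, self.k, self.k),
                       padding=self.k // 2, groups=b)
        return out.view(b, self.c_out, *out.shape[2:])
```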
Building on the concepts introduced for SlimNeck and ODConv, we aimed to both reduce the inference speed and complexity of the network’s neck section, while also enhancing the model’s accuracy in detecting insulator defects. Consequently, we propose a novel structure: OD-Slimneck, as illustrated in Figure 8. By integrating the ODConv module following the GSConv module, this structure increases the model’s parameter count minimally, yet significantly boosts the accuracy in detecting small-scale targets such as insulator defects. Replacing the neck network of YOLOV7-tiny with our OD-SlimNeck structure not only utilizes the lightweight SlimNeck design, but also significantly enhances the detection performance for small objects with ODConv compared to standard convolution layers.
3. Decoupled Head
The detection head of YOLOv7-tiny is a coupled head, which typically feeds the feature maps output by the convolutional layers directly into several fully connected or convolutional layers. This generates outputs for object positions and categories, with parameters shared between the classification and localization branches. Although this architecture is straightforward, it requires a significant number of parameters and computational resources, which can lead to overfitting. Furthermore, the focus of the classification and localization tasks differs: classification is primarily concerned with identifying which features are closest to existing categories, while localization focuses on positional coordinates for adjusting bounding-box parameters [27]. Using the same feature map for both classification and localization therefore often yields suboptimal results. Inspired by the decoupled head of YOLOX [28], we designed a decoupled head module that can be integrated into YOLOv7-tiny. Unlike traditional coupled prediction heads, our module fully shares weights for category predictions while only partially sharing weights for localization and confidence scores, and it incorporates a mixed-channel strategy to build a more efficient decoupled head structure. This approach reduces latency while maintaining accuracy, facilitates network convergence, and enhances the detection performance of deep convolutional neural networks. The structure of the decoupled head is illustrated in Figure 9. By replacing the coupled head of YOLOv7-tiny with a decoupled head, we effectively reduce the number of parameters and the computational complexity, boosting the model's generalization capability and robustness and significantly improving object detection performance.
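As a rough illustration of the decoupling idea (in the YOLOX style rather than our exact mixed-channel design), the sketch below gives classification and regression their own branches after a shared stem; the channel widths and anchor count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    """Sketch of a decoupled detection head: a shared 1x1 stem, then
    separate branches for class scores vs. box regression and objectness."""

    def __init__(self, in_channels: int, num_classes: int,
                 num_anchors: int = 3, width: int = 128):
        super().__init__()
        def conv_bn_act(c1, c2, k):
            return nn.Sequential(nn.Conv2d(c1, c2, k, padding=k // 2, bias=False),
                                 nn.BatchNorm2d(c2), nn.SiLU(inplace=True))
        self.stem = conv_bn_act(in_channels, width, 1)
        self.cls_branch = nn.Sequential(conv_bn_act(width, width, 3),
                                        conv_bn_act(width, width, 3))
        self.reg_branch = nn.Sequential(conv_bn_act(width, width, 3),
                                        conv_bn_act(width, width, 3))
        self.cls_pred = nn.Conv2d(width, num_anchors * num_classes, 1)
        self.box_pred = nn.Conv2d(width, num_anchors * 4, 1)  # x, y, w, h
        self.obj_pred = nn.Conv2d(width, num_anchors * 1, 1)  # confidence

    def forward(self, x: torch.Tensor):
        x = self.stem(x)
        cls_feat, reg_feat = self.cls_branch(x), self.reg_branch(x)
        # Classification and localization use separate feature maps.
        return self.cls_pred(cls_feat), self.box_pred(reg_feat), self.obj_pred(reg_feat)
```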
4. Channel Pruning of the Improved YOLOv7-tiny Network
Although the improved YOLOv7-tiny model has fewer parameters, deploying it on drones for real-time insulator inspections remains a significant challenge due to the limited memory and computational capabilities of embedded devices. To address this issue, we applied a channel pruning algorithm to the trained YOLOv7-tiny insulator defect detection model to streamline it, facilitating easier deployment on lightweight devices with constrained processing power.
The most common method of model pruning is channel pruning, which requires minimal software and hardware resources and effectively balances fine-grained and coarse-grained approaches. As illustrated in Figure 10, channel pruning significantly reduces the model’s size, structure, and the number of parameters.
For our improved YOLOv7-tiny model, we employed channel pruning following the network slimming approach [29]. Initially, we trained the model with sparse regularization, assigning a scaling factor to each channel to signify its importance, and then conducted channel sparsity training to distinguish essential from non-essential channels. The Batch Normalization (BN) layer, a powerful operator that ensures rapid convergence and enhances generalization, is typically connected after the convolution layer in most network architectures. Specifically, the BN layer uses mini-batch statistics to normalize feature activations, as detailed below:
$$Z_{out} = \gamma \frac{Z_{in} - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta \tag{8}$$
In this context, $\mu$ and $\sigma$ represent the mean and standard deviation of the features in a mini-batch, respectively, and $\epsilon$ is a small constant that prevents division by zero. $\gamma$ and $\beta$ are trainable scaling and shifting factors, respectively, and $Z_{out}$ is the output of the Batch Normalization (BN) layer.
The Batch Normalization (BN) layers are inserted following the convolutional layers that have trainable factors [30], where the scaling factor serves as an indicator of channel importance. The loss function used during the sparse training process is as follows:
$$L = \sum_{(x, y)} l\left(f(x, W), y\right) + \alpha \sum_{\gamma \in \Gamma} g(\gamma) \tag{9}$$
The first term is the training loss of the network, where x and y denote the training inputs and labels, respectively, and W represents the trainable parameters of the network. The second term is the L1 regularization $g(\cdot)$ applied to the scaling coefficients $\gamma$ of the Batch Normalization (BN) layers, summed over the set of all such coefficients $\Gamma$. The penalty factor $\alpha$ balances the two terms. By minimizing L, the model tends to drive the $\gamma$ coefficients of the BN layers towards zero, thereby achieving structural sparsity.
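In practice, the L1 term of Equation (9) can be applied as a subgradient added to the BN scale factors after the ordinary backward pass. The sketch below shows one such sparsity step; the value of the penalty factor alpha is an illustrative assumption.

```python
import torch
import torch.nn as nn

def add_bn_l1_sparsity(model: nn.Module, alpha: float = 1e-4) -> None:
    """Add the subgradient of the L1 penalty in Eq. (9) to every BN scale
    factor (gamma), driving unimportant channels towards zero."""
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d) and m.weight.grad is not None:
            # d/d(gamma) of alpha * |gamma| is alpha * sign(gamma).
            m.weight.grad.add_(alpha * torch.sign(m.weight.data))

# Typical sparse-training step (sketch):
#   loss = detection_loss(model(images), targets)   # first term of Eq. (9)
#   loss.backward()
#   add_bn_l1_sparsity(model, alpha=1e-4)           # second term of Eq. (9)
#   optimizer.step()
```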
After training under sparsity regularization, we obtained a model with many scaling factors approaching zero. We can then prune the channels whose scaling factors are close to zero by removing all associated incoming and outgoing connections along with their respective weights. We employ a global threshold across all layers to prune the channels, which is defined as a certain percentile of all scaling factor values.
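A minimal sketch of this global-threshold selection is shown below: all BN scaling factors are pooled, the pruning percentile defines one global threshold, and per-layer keep masks are derived from it. Rewiring the adjacent convolutions after masking is a further step omitted here.

```python
import torch
import torch.nn as nn

def select_prune_channels(model: nn.Module, prune_ratio: float = 0.5):
    """Pool all BN scaling factors, take the `prune_ratio` percentile as a
    single global threshold, and mark channels below it for removal."""
    gammas = torch.cat([m.weight.data.abs().flatten()
                        for m in model.modules() if isinstance(m, nn.BatchNorm2d)])
    threshold = torch.quantile(gammas, prune_ratio)  # global percentile threshold
    masks = {}
    for name, m in model.named_modules():
        if isinstance(m, nn.BatchNorm2d):
            masks[name] = m.weight.data.abs() > threshold  # True = keep channel
    return threshold, masks
```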
Finally, the channel-pruned lightweight model may initially experience some loss in accuracy. However, this can largely be compensated for by a subsequent fine-tuning process conducted on the pruned network. Fine-tuning gently recalibrates the parameters of the pruned model through mild optimization, compensating for the accuracy loss caused by the removal of channels, and achieving a balance between efficiency and performance. The entire process of channel pruning is illustrated in Figure 11.
5. Channel Distillation
Knowledge distillation has significant applications in computer vision, particularly in the field of object detection, where it is used to reduce model parameters and improve the accuracy of existing models [31]. After channel pruning, to quickly restore the accuracy of the model, we assist the training of the pruned model through channel distillation. Unlike traditional knowledge distillation (KD), the channel distillation (CD) method transfers channel information from the teacher to the student [32].
In SENet [33], the channel-wise attention mechanism enables the model to learn the weights for each channel, which are then multiplied with the original channels. This process enhances the features of important channels while diminishing those of less significant ones, making the extracted features more directional and improving the network’s predictive performance. The weights for each channel are calculated as follows:
$$w_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j) \tag{10}$$
The variable $w_c$ represents the weight of the $c$-th channel, H and W denote the spatial dimensions of the feature map, and $u_c(i, j)$ is the activation of the $c$-th channel at spatial position $(i, j)$.
Each channel in a feature map corresponds to a visual pattern, yet the significance of these patterns varies across channels [34]. Given that the teacher model outperforms the student model, the visual patterns learned by the teacher are presumed more accurate, so we want the student model to emulate them. We use Global Average Pooling (GAP) to assess the importance of the feature map of each channel, representing each channel's attention information, and treat these attention values as knowledge. Both the student and the teacher compute the attention information from their respective feature maps, and the teacher guides the student in learning this attention information. In this way, the student gains insight into the teacher's channel-specific attention, enhancing its performance. Typically, the layers of the teacher and student models do not align; for simplicity, we apply channel distillation only at the stages where the spatial resolution is reduced. Moreover, if the numbers of channels do not match, we adopt the FitNets approach, using a 1 × 1 convolution to upscale dimensions [35]: the student's feature map is first lifted to match the teacher's channel count, and channel distillation follows. The channel distillation (CD) loss is defined as follows:
$$CD(s, t) = \frac{\sum_{i=1}^{n} \sum_{j=1}^{c} \left(w_s^{ij} - w_t^{ij}\right)^2}{n \times c} \tag{11}$$
In this formula, CD(s, t) is the CD loss between the student s and the teacher t, $w^{ij}$ denotes the attention weight of the $j$-th channel for the $i$-th sample, n is the number of samples, and c is the number of channels.
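A minimal sketch of this loss, with GAP attention as in Equation (10) and an optional FitNets-style 1 × 1 adapter for mismatched channel counts, might look as follows (the function and argument names are illustrative):

```python
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F

def cd_loss(student_feat: torch.Tensor, teacher_feat: torch.Tensor,
            adapter: Optional[nn.Conv2d] = None) -> torch.Tensor:
    """Channel-distillation loss of Eq. (11): per-channel attention is the
    GAP of each feature map (Eq. (10)); the student mimics the teacher's
    attention. A caller-supplied 1x1 conv lifts mismatched student widths."""
    if adapter is not None:
        student_feat = adapter(student_feat)
    w_s = student_feat.mean(dim=(2, 3))  # (n, c): GAP over H and W, Eq. (10)
    w_t = teacher_feat.mean(dim=(2, 3))
    return F.mse_loss(w_s, w_t)  # mean over n samples and c channels

# Example with mismatched widths (hypothetical sizes):
#   adapter = nn.Conv2d(128, 256, kernel_size=1)
#   loss = cd_loss(student_feat, teacher_feat, adapter)
```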
After conducting channel distillation across different channels, we proceed with guided knowledge distillation (GKD) following the softmax predictions. Figure 12 illustrates the structure of our distillation process.
GKD is designed based on the foundational principles of knowledge distillation (KD) [36]. The core concept of KD involves calculating the distribution of predictions between a teacher model and a student model. By progressively minimizing the discrepancy between these two distributions, the output distribution of the student model becomes similar to that of the teacher model. The formula for KD is as follows:
$$p = \mathrm{softmax}\left(\frac{a}{T}\right) \tag{12}$$
$$KD(s, t) = \frac{\sum_{i=1}^{n} KL\left(p_s^i, p_t^i\right)}{n} \tag{13}$$
In this context, p represents the class probability distribution computed from the logits a, and T denotes the temperature parameter; a higher T yields a softer class probability distribution. Here, n refers to the batch size, and KD(s, t) is the mean Kullback–Leibler (KL) divergence between $p_s^i$ and $p_t^i$.
Although the teacher network is more accurate than the student network, the teacher model still exhibits some prediction errors during batch processing, leading to significant issues. When the teacher makes incorrect predictions, this erroneous knowledge is transferred to the student, worsening the student’s performance. To address this, we have developed an improved version of knowledge distillation (KD), termed guided knowledge distillation (GKD). Our approach ensures that only the positive prediction distributions from the teacher are transferred to the student, while the negative predictions are disregarded. More specifically, our method involves backpropagating the KD loss only for samples correctly predicted by the teacher network, ignoring the errors. The formula for GKD is defined as follows:
$$GKD(s, t) = \frac{\sum_{i=1}^{n} I\left(p_t^i, y_i\right) \, KL\left(p_s^i, p_t^i\right)}{\sum_{i=1}^{n} I\left(p_t^i, y_i\right)} \tag{14}$$
In this formula, $I(p_t^i, y_i)$ is an indicator function that equals 1 when the teacher's prediction matches the true label $y_i$ and 0 otherwise. For example, for a batch of n samples in which the teacher model correctly predicts $n_1$ of them, the GKD (guided knowledge distillation) loss is computed only over those $n_1$ correctly predicted samples.
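A compact sketch of GKD combining Equations (12)–(14) is given below; the temperature value is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def gkd_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
             labels: torch.Tensor, T: float = 4.0) -> torch.Tensor:
    """Guided KD of Eq. (14): KL divergence between temperature-softened
    distributions (Eq. (12)), averaged only over samples the teacher
    classifies correctly."""
    p_t = F.softmax(teacher_logits / T, dim=1)
    log_p_s = F.log_softmax(student_logits / T, dim=1)
    # Per-sample KL(p_t || p_s).
    kl = (p_t * (torch.log(p_t.clamp_min(1e-12)) - log_p_s)).sum(dim=1)
    correct = teacher_logits.argmax(dim=1).eq(labels).float()  # indicator I
    return (correct * kl).sum() / correct.sum().clamp_min(1.0)

# Example:
#   s, t = torch.randn(8, 2), torch.randn(8, 2)
#   y = torch.randint(0, 2, (8,))
#   print(gkd_loss(s, t, y))
```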
After enhancing the network model as described above, channel pruning was applied, followed by channel distillation to obtain our final OD-YOLOV7-tiny network model. The entire process is illustrated in Figure 13.

4. Experiment and Results

4.1. The Experimental Environment and Dataset

In this study, all training and testing were conducted on a single computer equipped with an NVIDIA GeForce RTX 3090 graphics card (NVIDIA, Santa Clara, CA, USA). The machine was running Ubuntu 20.04, using CUDA version 11.3, and the deep learning framework PyTorch 1.11.0 [37]. Training was performed on both the original YOLOv7-tiny model and an improved version of the YOLOv7-tiny model, as well as our teacher model, YOLOv7. The training images were set to a resolution of 640 × 640 pixels. The batch size was configured at 16, with a total of 300 training iterations. The Adam optimizer was selected, starting with an initial learning rate of 0.001 [38].
The original dataset used in this study contains two types of defects: insulator damage and insulator flashover, as depicted in Figure 14. The dataset includes insulator defects of varying sizes, enhancing its realism. To improve the model's generalizability, the original images were augmented with random rotations, flips, and brightness and contrast adjustments, expanding the dataset to 1631 distinct images of insulator defects. These images were standardized to a resolution of 640 × 640 pixels, annotated in the VOC2007 format, and split in a 7:2:1 ratio into training, testing, and validation sets of 1142, 326, and 123 images, respectively.
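For illustration, the sketch below shows offline augmentations of this kind and a 7:2:1 split using torchvision; the parameter ranges and directory layout are assumptions, and for detection data the bounding-box coordinates would need to be transformed alongside the images (omitted here).

```python
import random
from pathlib import Path

from torchvision import transforms

# Offline augmentations of the kind described above; the ranges and the
# "data/images" directory are illustrative assumptions.
augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.3, contrast=0.3),
    transforms.Resize((640, 640)),
])
# aug_img = augment(pil_img)  # applied to each PIL image

# 7:2:1 split of the augmented image list into train/test/validation.
paths = sorted(Path("data/images").glob("*.jpg"))
random.seed(0)
random.shuffle(paths)
n = len(paths)
train = paths[:int(0.7 * n)]             # 70% training
test = paths[int(0.7 * n):int(0.9 * n)]  # 20% testing
val = paths[int(0.9 * n):]               # 10% validation
```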

4.2. Performance Metrics

To validate the effectiveness of the model, precision (P), recall (R), average precision (AP), and mean average precision (mAP) [39], which are commonly used in deep learning, are adopted as evaluation metrics. The formulas are given in (15)–(18):
$$P = \frac{TP}{TP + FP} \tag{15}$$
$$R = \frac{TP}{TP + FN} \tag{16}$$
$$AP = \int_0^1 p(r)\,dr \tag{17}$$
$$mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i \tag{18}$$
where TP, FP, and FN are the numbers of true positives, false positives, and false negatives, respectively, and N is the number of object classes.
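A small NumPy sketch of Equations (15)–(18) is given below, using the standard interpolated precision envelope for the AP integral; it is a generic implementation, not the exact evaluation code used in our experiments.

```python
import numpy as np

def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Eqs. (15) and (16) from raw detection counts."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(recall: np.ndarray, precision: np.ndarray) -> float:
    """Eq. (17): area under the precision-recall curve, using the
    monotone (interpolated) precision envelope."""
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    p = np.maximum.accumulate(p[::-1])[::-1]  # enforce non-increasing precision
    idx = np.where(r[1:] != r[:-1])[0]        # points where recall changes
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

# Eq. (18): mAP is the mean of the per-class AP values, e.g.
#   map50 = sum(ap_per_class) / len(ap_per_class)
```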

4.3. Experiment

1. Ablation experiment
To verify the effectiveness of the proposed models, each was tested on the established dataset and compared with the original YOLOv7-tiny. The comparative results are presented in Table 1.
From Table 1, it is evident that replacing the backbone of the original YOLOv7-tiny network with FasterNet does not significantly alter the accuracy of insulator defect detection, yet it substantially reduces the model's parameter count. Across multiple experiments, adding the OD-SlimNeck structure to the YOLOv7-tiny model with the FasterNet backbone increased the precision of insulator defect detection from 71.6% to 74.5%, a 2.9% improvement over the original YOLOv7-tiny network, along with a 7.1% increase in recall and a 5.8% rise in mAP@0.5 compared to the original. The addition of the decoupled detection head further enhanced the detection results, achieving a precision of 78.4%, a recall of 65.4%, and an mAP@0.5 of 71.2%. Although this addition increases the parameter count and the inference time rises slightly to 1.31 ms, the model still meets the requirements for real-time detection, confirming the effectiveness of the algorithmic improvements made in this study.
2. Channel pruning
We conducted channel pruning on the improved model described above. Table 2 presents the main parameters and results obtained from sparse training, channel pruning, and fine-tuning. Here, 'Speed_up' is defined as the ratio of the computational load before channel pruning to that after pruning. The number of parameters in a model reflects its complexity: more parameters allow the model to learn more complex features and patterns, theoretically enabling higher detection accuracy, but also require more computational resources and inference time. To balance detection accuracy and speed, we set 'Speed_up' to 2.0 and 3.0, effectively reducing the number of parameters to one-half and one-third of the original, respectively. The 'Sparsity rate' is the proportion of weights in the model that are set to zero; a higher sparsity rate means more weights are zeroed, making the model sparser.
From Table 2, we found that when the speed-up ratio is set to 3.0, the detection accuracy of the pruned model declines significantly, failing to balance detection accuracy against speed. When the speed-up ratio is set to 2.0, however, the pruned model maintains a relatively stable detection accuracy despite a roughly 50% reduction in the number of parameters. Therefore, we employed a speed-up ratio of 2.0 for channel pruning. Furthermore, we compared the random pruning method against our channel pruning method; the comparative results of model performance before and after the two pruning methods are shown in Table 3.
The comparison results in Table 3 indicate that channel pruning reduces the model parameters by 49.79%, a decrease of 2.7 M. The forward inference time is reduced by 0.53 ms, while the mAP@0.5:0.95 decreases by 4.8%. Despite the significant reduction in parameters and GFLOPs, the accuracy of the pruned model remains close to that of the pre-pruning insulator defect detection model. These findings suggest that the improved model can be effectively compressed through channel pruning. The changes in the number of channels per convolutional layer after channel pruning are shown in Figure 15: the orange segments represent the number of channels in each convolutional layer before pruning, while the red segments represent the number after pruning. It is evident that the channel count of most convolutional layers has been substantially reduced, demonstrating the effectiveness of the channel pruning algorithm in decreasing the model's parameter count.
Sparse training introduces sparsity constraints, such as L1 regularization, during the training process, encouraging a significant portion of the weights in the network to converge to zero [40]. As illustrated in Figure 16, after undergoing sparse training, most of the model’s weights will be close to zero. This means that compression technologies can be employed to further reduce space consumption when storing and transmitting the model.
3. Channel distillation
The student network designed in this study is the channel-pruned model, while the teacher network is based on YOLOv7; after channel distillation, we refer to the result as the distilled network. Their detection results, together with a comparison of the student (pruned) network and the distilled network, are shown in Figure 17. The results indicate that, without altering the number of parameters, channel distillation raises the precision of the pruned model to 75.5% (a gain of 2.2%) and its mAP@0.5:0.95 to 28.4% (a gain of 0.4%). This demonstrates that channel distillation effectively enhances the precision of the lightweight detection model for insulator defect detection.
4. Comparative experiment
At the same time, we conducted experiments with SSD-lite [41], YOLOv8n, YOLOX-tiny, and YOLOv5s on our dataset. As shown in Table 4, the improved OD-YOLOv7-tiny achieves the highest mAP@0.5:0.95 among these algorithms while having the fewest parameters, effectively balancing model compression and accuracy loss.
5. Generalization ability verification
To verify the robustness and generalization capability of our algorithm, we conducted defect detection using our method on the publicly available PCB Defects dataset from the Kaggle website [42]. The PCB Defects dataset contains 1386 images, with six types of defects: missing hole, mouse bite, open circuit, short, spur, and spurious copper. Our model achieved an average accuracy of 48.6% in detecting these six defect types within this dataset. As illustrated in Figure 18, the model successfully detects all missing hole defects on the PCB, demonstrating its strong generalization ability.

4.4. Experiment Results and Analysis

As shown in Table 5, after channel distillation the resulting distilled model (OD-YOLOv7-tiny) has a parameter size of 2.8 M, a 53% reduction compared to the original YOLOv7-tiny model. In terms of speed, the inference time of the original YOLOv7-tiny model was 1.25 ms per image, whereas the final OD-YOLOv7-tiny model achieves 0.78 ms, an improvement of 37.6%. Regarding accuracy, the OD-YOLOv7-tiny model improves precision (P), recall (R), mAP@0.5, and mAP@0.5:0.95 by 3.9%, 5.3%, 2.2%, and 0.3%, respectively, compared to the original YOLOv7-tiny model. These gains in model size and speed were achieved without compromising accuracy. Figure 19 illustrates the mAP@0.5 and mAP@0.5:0.95 curves throughout the training process for the various networks, where the improved model refers to the enhanced network, the pruned model to the network after channel pruning, and the distilled model to the network after channel distillation, which is the final network, OD-YOLOv7-tiny.
As shown in Figure 19, among the four models the network-improved model performs best over training. The accuracy of the model decreases after channel pruning; however, the detection capability of OD-YOLOv7-tiny, which underwent channel distillation, surpasses that of the original YOLOv7-tiny model. By reducing model size and increasing speed, the proposed method can be effectively deployed on small edge devices, allowing the real-time detection of insulator defects on power transmission lines and meeting the application requirements for rapid response. Figure 20a,b display the heatmaps for our model's detection of insulator damage and insulator flashover, respectively. These heatmaps reveal which regions of the feature maps exhibit higher activation values, providing an intuitive understanding of the model's attention to different features and regions.
In order to directly test the detection outcomes of the improved OD-YOLOv7-tiny, we randomly selected several images from the test set of our insulator defect dataset, which contained various defects for detection, and conducted a visual comparison with YOLOv7-tiny. Figure 21 illustrates the detection performance of both models in different scenarios. It is evident that the improved OD-YOLOv7-tiny exhibits superior detection capabilities compared to the original YOLOv7-tiny.

5. Conclusions

This study addresses the real-time detection of insulator defects by introducing the OD-YOLOv7-tiny algorithm, which refines the YOLOv7-tiny framework through network improvements, channel pruning, and channel distillation. We evaluated the enhanced algorithm against the original YOLOv7-tiny and further verified its generalization on the public PCB Defects dataset. The experimental results demonstrate that our algorithm significantly improves performance by reducing model size, increasing detection speed, and enhancing overall detection accuracy: the OD-YOLOv7-tiny model achieved a 53% reduction in parameters, a 37.6% increase in detection speed, and a 3.9% improvement in precision. This balance between model compression and minimal accuracy loss makes it highly suitable for deployment on small-scale edge devices, meeting the demands of practical applications. In future work, we will continue to explore insulator defect detection, aiming to further improve detection performance with more efficient methods while reducing computational requirements.

Author Contributions

Conceptualization, D.M.; Methodology, D.M., X.X. and Z.J.; Software, D.M.; Validation, L.X.; Formal analysis, X.X.; Investigation, L.X.; Writing—original draft, D.M.; Writing—review & editing, X.X.; Supervision, L.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant numbers 61803390 and 61501525) and the Major Scientific Instrument Development Project of the National Natural Science Foundation of China (grant number 61927803).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liu, K.; Li, B.; Qin, L.; Li, Q.; Zhao, F.; Wang, Q.; Xu, Z.; Yu, J. A review on the application of deep learning target detection algorithm in overhead transmission line insulator defect detection. High Volt. Technol. 2022, 49, 3584–3595. [Google Scholar]
  2. Tian, Y. Artificial intelligence image recognition method based on convolutional neural network algorithm. IEEE Access 2020, 8, 125731–125744. [Google Scholar] [CrossRef]
  3. Wang, Y.; Li, Q.; Liu, Y.; Wang, C. Insulator defect detection based on improved YOLOv5 algorithm. In Proceedings of the 2023 IEEE 12th Data Driven Control and Learning Systems Conference (DDCLS), Xiangtan, China, 12–14 May 2023; pp. 770–775. [Google Scholar]
  4. Zeng, Y.; Zhang, R.; Lim, T.J. Wireless communications with unmanned aerial vehicles: Opportunities and challenges. IEEE Commun. Mag. 2016, 54, 36–42. [Google Scholar] [CrossRef]
  5. Xu, L.; Xu, X.; Xia, Q.; Yao, Y.; Jiang, Z. A light-weight defect detection model for capacitor appearance based on the Yolov5. Measurement 2024, 232, 114717. [Google Scholar] [CrossRef]
  6. Zou, Z.; Chen, K.; Shi, Z.; Guo, Y.; Ye, J. Object detection in 20 years: A survey. Proc. IEEE 2023, 111, 257–276. [Google Scholar] [CrossRef]
  7. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25. [Google Scholar] [CrossRef]
  8. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
  9. Girshick, R. Fast r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
  10. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149. [Google Scholar] [CrossRef]
  11. Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
  12. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
  13. Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–27 June 2023; pp. 7464–7475. [Google Scholar]
  14. Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
  15. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788. [Google Scholar]
  16. Han, S.; Pool, J.; Tran, J.; Dally, W. Learning both weights and connections for efficient neural network. Adv. Neural Inf. Process. Syst. 2015, 28. [Google Scholar]
  17. Li, H.; Kadav, A.; Durdanovic, I.; Samet, H.; Graf, H.P. Pruning Filters for Efficient ConvNets. arXiv 2016, arXiv:1608.08710. [Google Scholar]
  18. Molchanov, P.; Tyree, S.; Karras, T.; Aila, T.; Kautz, J. Pruning convolutional neural networks for resource efficient inference. arXiv 2016, arXiv:1611.06440. [Google Scholar]
  19. Ye, J.; Lu, X.; Lin, Z.; Wang, J.Z. Rethinking the smaller-norm-less-informative assumption in channel pruning of convolution layers. arXiv 2018, arXiv:1802.00124. [Google Scholar]
  20. Huang, Z.; Wang, N. Data-driven sparse structure selection for deep neural networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 304–320. [Google Scholar]
  21. Chen, J.; Kao, S.-h.; He, H.; Zhuo, W.; Wen, S.; Lee, C.-H.; Chan, S.-H.G. Run, Don’t walk: Chasing higher FLOPS for faster neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 12021–12031. [Google Scholar]
  22. Hua, B.-S.; Tran, M.-K.; Yeung, S.-K. Pointwise convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 984–993. [Google Scholar]
  23. Nair, V.; Hinton, G.E. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel, 21–24 June 2010; pp. 807–814. [Google Scholar]
  24. Li, H.; Li, J.; Wei, H.; Liu, Z.; Zhan, Z.; Ren, Q. Slim-neck by GSConv: A better design paradigm of detector architectures for autonomous vehicles. arXiv 2022, arXiv:2206.02424. [Google Scholar]
  25. Rawat, W.; Wang, Z. Deep convolutional neural networks for image classification: A comprehensive review. Neural Comput. 2017, 29, 2352–2449. [Google Scholar] [CrossRef]
  26. Li, C.; Zhou, A.; Yao, A. Omni-dimensional dynamic convolution. arXiv 2022, arXiv:2209.07947. [Google Scholar]
  27. Zhuang, J.; Qin, Z.; Yu, H.; Chen, X. Task-specific context decoupling for object detection. arXiv 2023, arXiv:2303.01047. [Google Scholar]
  28. Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. Yolox: Exceeding yolo series in 2021. arXiv 2021, arXiv:2107.08430. [Google Scholar]
  29. Liu, Z.; Li, J.; Shen, Z.; Huang, G.; Yan, S.; Zhang, C. Learning efficient convolutional networks through network slimming. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2736–2744. [Google Scholar]
  30. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 448–456. [Google Scholar]
  31. Yang, Y.; Qiu, J.; Song, M.; Tao, D.; Wang, X. Distilling knowledge from graph convolutional networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 7074–7083. [Google Scholar]
  32. Zhou, Z.; Zhuge, C.; Guan, X.; Liu, W. Channel distillation: Channel-wise attention for knowledge distillation. arXiv 2020, arXiv:2006.01683. [Google Scholar]
  33. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar]
  34. Simon, M.; Rodner, E. Neural activation constellations: Unsupervised part model discovery with convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1143–1151. [Google Scholar]
  35. Adriana, R.; Nicolas, B.; Ebrahimi, K.S.; Antoine, C.; Carlo, G.; Yoshua, B. Fitnets: Hints for thin deep nets. Proc. ICLR 2015, 2, 1. [Google Scholar]
  36. Hinton, G.; Vinyals, O.; Dean, J. Distilling the knowledge in a neural network. arXiv 2015, arXiv:1503.02531. [Google Scholar]
  37. Paszke, A.; Gross, S.; Chintala, S.; Chanan, G.; Yang, E.; DeVito, Z.; Lin, Z.; Desmaison, A.; Antiga, L.; Lerer, A. Automatic differentiation in pytorch. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  38. Zhang, Z. Improved adam optimizer for deep neural networks. In Proceedings of the 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), Banff, AB, Canada, 4–6 June 2018; pp. 1–2. [Google Scholar]
  39. Henderson, P.; Ferrari, V. End-to-end training of object class detectors for mean average precision. In Proceedings of the Computer Vision–ACCV 2016: 13th Asian Conference on Computer Vision, Taipei, Taiwan, 20–24 November 2016; Revised Selected Papers, Part V 13. pp. 198–213. [Google Scholar]
  40. Srinivas, S.; Subramanya, A.; Venkatesh Babu, R. Training sparse neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 138–145. [Google Scholar]
  41. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part I 14. pp. 21–37. [Google Scholar]
  42. Tang, S.; He, F.; Huang, X.; Yang, J. Online PCB defect detector on a new PCB defect dataset. arXiv 2019, arXiv:1902.06197. [Google Scholar]
Figure 1. The YOLOv7-tiny algorithm framework.
Figure 2. The framework of the improved YOLOv7-tiny algorithm.
Figure 3. Comparison of different convolution modes.
Figure 4. Structure of the FasterNet Block.
Figure 5. The structure of the GSConv module.
Figure 6. The structures of the (a) GS bottleneck module and (b) VoV-GSCSP module.
Figure 7. Structure of the ODConv.
Figure 8. Structure of the OD-SlimNeck.
Figure 9. Structure of the decoupled head.
Figure 10. Pruning schematic diagram of the channel pruning algorithm.
Figure 11. The entire process of channel pruning.
Figure 12. The structure of our distillation process.
Figure 13. The process of obtaining the OD-YOLOv7-tiny network model.
Figure 14. Examples of the two types of defects in the dataset: (a) insulator damage and (b) insulator flashover.
Figure 15. The changes in the number of channels per convolutional layer after channel pruning.
Figure 16. The distribution of model weights after sparse training.
Figure 17. The comparative results of the student and distilled networks.
Figure 18. The detection of PCB defects.
Figure 19. The mAP@0.5 throughout the training process for various networks.
Figure 20. The heatmaps for our model's detection of insulator damage and insulator flashover.
Figure 21. The detection performance of both models in different scenarios.
Table 1. Results of ablation experiments.

| Algorithm | P | R | mAP@0.5 | mAP@0.5:0.95 | Parameter Size (M) | GFLOPs | Inference Time (Batch Size 32) |
|---|---|---|---|---|---|---|---|
| A: YOLOv7-tiny | 71.6% | 56.0% | 62.4% | 28.1% | 6.1 | 13.0 | 1.25 ms |
| B: A + FasterNet | 71.1% | 60.5% | 63.4% | 28.4% | 4.1 | 8.5 | 1.16 ms |
| C: B + OD-SlimNeck | 74.5% | 63.1% | 68.2% | 30.3% | 3.9 | 7.7 | 1.14 ms |
| D: C + Decoupled_detect | 78.4% | 65.4% | 71.2% | 32.8% | 5.5 | 10.7 | 1.31 ms |
Table 2. Main parameters and results of different speed-up experiments.

| Speed_up Ratio | Sparsity Learning Epochs | Fine-Tune Training Epochs | Sparsity Rate | P | mAP@0.5 | Inference Time (Batch Size 32) |
|---|---|---|---|---|---|---|
| 2.0 | 300 | 300 | 32.7% | 73.3% | 64.6% | 0.78 ms |
| 3.0 | 300 | 300 | 46.5% | 50.6% | 39.2% | 0.61 ms |
Table 3. The comparative results of model performance before and after the two pruning methods.

| Model | P | R | mAP@0.5 | mAP@0.5:0.95 | Parameter Size (M) | GFLOPs | Inference Time (Batch Size 32) |
|---|---|---|---|---|---|---|---|
| Before pruning | 78.4% | 65.4% | 71.2% | 32.8% | 5.5 | 10.7 | 1.31 ms |
| After channel pruning | 73.3% | 58.5% | 64.6% | 28.0% | 2.8 | 5.3 | 0.78 ms |
| After random pruning | 70.9% | 58.6% | 61.7% | 26.3% | 2.8 | 5.3 | 0.81 ms |
Table 4. Detection results of different algorithms.

| Algorithm | mAP@0.5 | mAP@0.5:0.95 | Parameter Size (M) | GFLOPs |
|---|---|---|---|---|
| OD-YOLOv7-tiny | 66.3% | 28.4% | 2.8 | 5.3 |
| SSD-lite | 47.4% | 12.0% | 3.0 | 3.8 |
| YOLOX-tiny | 68.9% | 26.7% | 5.0 | 7.6 |
| YOLOv5s | 68.4% | 27.5% | 4.9 | 12.4 |
| YOLOv8n | 69.2% | 28.1% | 3.2 | 8.7 |
Table 5. The parameters and detection results of the progressively optimized models.

| Network | P | R | mAP@0.5 | mAP@0.5:0.95 | Parameter Size (M) | GFLOPs | Inference Time (Batch Size 32) |
|---|---|---|---|---|---|---|---|
| YOLOv7-tiny | 71.6% | 56.0% | 62.4% | 28.1% | 6.1 | 13.0 | 1.25 ms |
| Improved model | 78.4% | 65.4% | 71.2% | 32.8% | 5.5 | 10.7 | 1.31 ms |
| Pruned model | 73.3% | 58.5% | 64.6% | 28.0% | 2.8 | 5.3 | 0.78 ms |
| Distilled model | 75.5% | 61.3% | 66.3% | 28.4% | 2.8 | 5.3 | 0.78 ms |