Article

Soldering Defect Segmentation Method for PCB on Improved UNet

Zhongke Li and Xiaofang Liu *

School of Computer Science and Engineering, Sichuan University of Science & Engineering, Yibin 644000, China

* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(16), 7370; https://doi.org/10.3390/app14167370
Submission received: 7 May 2024 / Revised: 20 July 2024 / Accepted: 22 July 2024 / Published: 21 August 2024
(This article belongs to the Special Issue Advances in Computer Vision and Semantic Segmentation, 2nd Edition)

Abstract

Despite being indispensable devices in the electronic manufacturing industry, printed circuit boards (PCBs) may develop various soldering defects in the production process, which seriously affect the product's quality. Due to the substantial background interference in soldering defect images and the small and irregular shapes of the defects, the accurate segmentation of soldering defects is a challenging task. To address this issue, a method to improve the encoder–decoder network structure of UNet is proposed for PCB soldering defect segmentation. To enhance the feature extraction capabilities of the encoder and focus more on deeper features, VGG16 is employed as the network encoder. Moreover, a hybrid attention module called the DHAM, which combines channel attention and dynamic spatial attention, is proposed to reduce the background interference in images and direct the model's focus more toward defect areas. Additionally, based on GSConv, the RGSM is introduced and applied in the decoder to enhance the model's feature fusion capabilities and improve the segmentation accuracy. The experiments demonstrate that the proposed method can effectively improve the segmentation accuracy for PCB soldering defects, achieving an mIoU of 81.74% and an mPA of 87.33%, while maintaining a relatively low number of model parameters (22.13 M) and reaching 30.16 FPS, thus meeting real-time detection speed requirements.

1. Introduction

As an indispensable carrier of electronic components, a printed circuit board (PCB) realizes various circuit functions and plays an essential role in support, fixation, and heat dissipation [1]. Widely used across industries such as computing and telecommunications, PCBs constitute the primary components of nearly all electronic devices [2]. With the advent of the fourth industrial revolution, there are heightened demands on the levels of integration and quality in PCB manufacturing [3]. High-quality PCB products undergo stringent inspection procedures to ensure their stability and reliability [4]. Current efforts by electronic manufacturers to enhance production quality focus on developing high-precision inspection methods, following the trend of "zero-defect manufacturing" and aiming to minimize defect occurrences as much as possible [5]. Given the complexity of PCB manufacturing processes, the emergence of various types of defects is inevitable [6], including soldering defects such as missing solder, solder shorts, and solder balls. The soldering quality of the electronic components mounted on a PCB greatly affects the overall product quality, as soldering defects directly or indirectly degrade the product's actual performance. Therefore, solder quality inspection of PCBs is a crucial step in ensuring product compliance and reliability. PCB defect detection not only optimizes production lines and reduces the likelihood of defects but also conserves resources, minimizes capacity wastage, and improves resource utilization. It has thus become a crucial step in enhancing PCB production quality, and businesses urgently require efficient PCB defect detection technologies to address this issue [7].
In the early stages of PCB manufacturing, defect detection relied primarily on visual inspection by workers [8], which has obvious drawbacks. First, it is heavily influenced by human factors: accuracy varies with the workers' skill level and fatigue. Second, as PCB production volumes grow and line spacing and component sizes shrink, visual inspection becomes increasingly challenging and fails to meet the demands of modern industrial production. With technological advancements, PCB defect detection has gradually shifted toward advanced machine equipment such as automated optical inspection (AOI) [9]. AOI technology primarily uses template matching or image recognition methods [10], significantly enhancing detection efficiency. However, the detection outcome depends directly on the template matching accuracy and lighting conditions, and the equipment still requires supervision, maintenance, and result interpretation by skilled personnel [11]. The inefficiency of traditional detection methods underscores the crucial role of efficient, automated PCB defect detection in the inspection process.
Currently, deep-learning-based detection methods have become the mainstream direction for PCB defect detection due to their high detection efficiency [12]. Deep-learning-based PCB defect detection mainly relies on two types of networks: object detection and semantic segmentation [13]. Object detection networks can be further divided into two-stage algorithms, represented by Faster-RCNN [14], and single-stage algorithms, dominated by SSD [15] and the YOLO series [16]. For example, Ding et al. [17] proposed TDD-net for PCB defect detection by combining Faster-RCNN with a feature pyramid network, motivated by the facts that PCB defects occupy a small fraction of the image and have irregular shapes. Shi et al. [18] proposed SSDT, a tiny-defect detection algorithm based on SSD, which improved PCB surface defect detection accuracy. Tang et al. [19] introduced a lightweight PCB defect detection model, light-PDD, which utilizes an improved cross-stage local structure for feature fusion; this model reduces complexity, enhances detection speed, and achieves high accuracy.
Semantic segmentation involves partitioning an image into regions with different features, extracting regions of interest [20]. Kang et al. [21] proposed an adaptive feature reconstruction network (AFRNet) consisting of a Siamese encoder with shared parameters and a symmetric feature reconstruction module. This significantly improves the segmentation accuracy of various benchmark models for defects on PCBs. Ling et al. [22] proposed combining the similarity measurement of Siamese networks with encoder–decoder semantic segmentation networks to further enhance the segmentation capabilities for soldering defects on PCBs. Although both networks achieve high segmentation performance, their architectures require both template images and defect images as input, making the segmentation performance highly dependent on image matching, and the detection speed of the networks is relatively slow. Furthermore, the preparation of template images and defect images is tedious and requires high-level acquisition techniques. The UNet network proposed by Ronneberger et al. [23], which adopts a U-shaped encoder–decoder network structure, effectively utilizes the contextual information in images to enhance the segmentation accuracy and is widely applied in defect detection. Chen et al. [24] introduced a multi-scale dynamic attention fusion UNet model (MDAF-UNet), providing an effective solution for the segmentation of defects in mobile phone backplanes. Kamanli [25] proposed MCPAD-UNet for steel surface defect segmentation based on multi-scale cross-patch attention with dilated convolution. This model accurately identifies and segments defects in industrial processes, leveraging multi-scale contextual information to handle complex and subtle defects.
In summary, the challenges in PCB soldering defect segmentation include small defect sizes, irregular contours, and substantial interference from background similarity. Most previous studies on PCB defect segmentation used template matching, as in references [21,22]. Although template matching can achieve satisfactory segmentation results, it requires both template images and defect images, which imposes high data acquisition requirements and results in a slow segmentation speed. To address these issues, this paper proposes three key improvements based on UNet, whose structure requires only defect images for segmentation; after the improvements, the segmentation accuracy is enhanced while the real-time segmentation speed requirement is still met. First, we enhance the encoder's focus on the differences between features at different scales; second, we introduce a hybrid attention mechanism to reduce background similarity interference and strengthen the segmentation of small soldering defect contours at various scales; third, we enhance the decoder's feature fusion capabilities. Using only defect images, the proposed method achieves higher segmentation accuracy for PCB soldering defects with a small number of model parameters and a fast detection speed. The main contributions of this paper are as follows.
(1) The encoder adopts a backbone network that varies in its attention to different scales, flexibly adjusting the degree of attention to differently scaled feature layers during the feature extraction stage. While maintaining an emphasis on shallower feature layers, it enhances the attention to deeper feature layers, thereby improving the overall model's ability to segment defect edges.
(2) The dynamic hybrid attention module (DHAM) is proposed to replace the ordinary skip connections in the network. In conditions where background image similarity causes substantial interference, the DHAM effectively utilizes features at various scales to enhance the attention to defect contours.
(3) The residual GSConv module (RGSM), based on GSConv, is applied within the decoder network, enhancing the fusion of low-resolution and high-resolution feature maps to generate more accurate segmentation results.
The rest of this paper is organized as follows. Section 2 describes the structure and each module of the improved UNet. Section 3 presents the experimental setup, evaluation metrics, and loss function. Section 4 presents the experimental results and analysis, and Section 5 provides the conclusions.

2. Methodology

2.1. Architecture Overview

The encoder–decoder structure is the mainstream network structure for semantic segmentation, and UNet is a classic and effective instance of it: the encoder performs feature extraction and downsampling to reduce the image resolution, while the decoder performs feature recovery, feature fusion, and segmentation. In this paper, based on the UNet architecture, VGG16 is selected as the encoder, the DHAM is introduced to replace the skip connections, and the decoder is entirely replaced with RGSMs to enhance the network's attention to defects and improve its feature fusion capability, thereby enhancing the segmentation accuracy. The improved model slightly increases the number of parameters and slightly reduces the segmentation speed, but both remain within acceptable limits, and the model retains clear advantages over comparable models.
Figure 1 illustrates the improved UNet network architecture proposed in this paper. The blue arrows represent the 3 × 3 convolutional layer and the ReLU activation function, which are used for feature extraction in the encoder. The pink arrows represent the 2 × 2 max pooling layer, which reduces the image’s dimensions. The orange arrows represent the DHAM, which enhances the attention to defects under interference from background similarity. The yellow arrows represent the upsampling layer, which increases the image’s dimensions for feature fusion. The green arrows represent the RGSM, which enhances the feature fusion capabilities of the decoder. The purple arrows represent the 1 × 1 convolutional layer, which alters the output image dimensions to produce the final segmentation result. This network architecture first employs VGG16 as the encoder, replaces ordinary skip connections with the DHAM, and then uses multiple RGSMs as the decoder, with a 1 × 1 convolutional layer to output the final result of segmentation.

2.2. VGG16 Encoder

In the encoder of the original UNet network, each layer consists of two stacked 3 × 3 convolutional layers followed by a 2 × 2 max pooling layer to form a downsampling module. This design endows the model with strong feature learning capabilities. However, its structure is relatively simple, with the same operations applied to shallow and deep feature layers. As a result, there is insufficient attention to the differences between feature layers at different scales, leading to limited feature extraction capabilities.
For PCB soldering defects, due to their small size and irregular contours, it is crucial to focus more on relatively deeper feature layers during the feature extraction stage to capture the more detailed edge features of the defects. VGG16, with a similar feature extraction approach to the UNet encoder structure, possesses more robust feature extraction capabilities in the deeper layers due to its additional convolutional layers and max pooling layers. By replacing the original encoder with the feature extraction part of VGG16, we can flexibly adjust the attention to differently scaled feature layers during the feature extraction stage. This adjustment enhances the focus on deeper feature layers while maintaining an emphasis on shallower feature layers. This approach boosts the ability to capture defect contours and extract high-level semantic information, thereby improving the overall model’s capability for defect edge segmentation.
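To make the encoder swap concrete, the sketch below shows one way to slice torchvision's VGG16 feature extractor into the five stages whose pre-pool outputs feed the skip connections; the slice indices follow torchvision's standard layer ordering and are our illustration rather than code from the paper.

```python
import torch
import torchvision

# A minimal sketch, assuming torchvision's VGG16 serves as the encoder;
# pass pretrained weights here if ImageNet initialization is desired.
vgg = torchvision.models.vgg16(weights=None).features

# Index of the 2x2 max pooling layer that closes each of the five stages
# in torchvision.models.vgg16().features.
pool_idx = [4, 9, 16, 23, 30]

def vgg16_encoder(x: torch.Tensor):
    """Return the five pre-pool feature maps (skip features) and the bottleneck."""
    feats, start = [], 0
    for p in pool_idx:
        x = vgg[start:p](x)   # conv + ReLU layers of this stage
        feats.append(x)       # feature map consumed by the decoder skip path
        x = vgg[p](x)         # 2x2 max pooling halves the resolution
        start = p + 1
    return feats, x

# Usage: feats, bottleneck = vgg16_encoder(torch.randn(1, 3, 512, 512))
```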

2.3. Dynamic Hybrid Attention Module

For PCB soldering defect images, the similarity interference in the background is severe due to the small spacing and high similarity between PCB components. Introducing attention mechanisms can alleviate background interference by directing the model’s focus toward the critical parts of the defect areas, thereby improving the segmentation accuracy. This paper proposes a dynamic hybrid attention module, which combines channel attention and dynamic spatial attention in parallel. By integrating different attention perspectives from the channels and spatial dimensions, this module comprehensively considers the features of the attention area, filters out other information with substantial similarity interference in the image, and selects critical information from the defect area to enhance the model’s feature-capturing capability. Additionally, dynamic convolution [26] is introduced into the spatial attention, adapting the convolution kernel weights flexibly based on different input channels to enhance the adaptability to differently scaled context information and improve the feature representation capabilities. This paper incorporates the DHAM into the skip connections between each encoder and decoder, replacing the simple connection operation in the original UNet network. It selects essential feature information from multiple scales and focuses more specifically on features of different scales, thereby further improving the model’s performance.
The DHAM, shown in Figure 2, comprises channel attention and dynamic spatial attention.
In the channel attention module (CAM), average pooling is utilized to aggregate the feature maps of each channel, followed by two fully connected layers to learn the characteristics of each channel and determine which channels are more significant. Subsequently, the output scale is adjusted through a batch normalization layer to obtain the channel attention weights. The calculation of the CAM is as follows:
$$f_{ca} = BN\big(FC_2\big(FC_1\big(Avg(f_{in})\big)\big)\big)$$
where $f_{in}$ denotes the input feature, $Avg$ represents average pooling, $FC_1$ and $FC_2$ represent fully connected layers, and $BN$ represents batch normalization.
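As an illustration, a minimal PyTorch sketch of this channel attention branch follows; the reduction ratio of 16 and the ReLU between the two fully connected layers are common-practice assumptions not specified above.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """A sketch of the CAM branch: Avg -> FC_1 -> FC_2 -> BN."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)                # Avg(f_in)
        self.fc1 = nn.Linear(channels, channels // reduction)  # FC_1
        self.fc2 = nn.Linear(channels // reduction, channels)  # FC_2
        self.bn = nn.BatchNorm1d(channels)                     # BN rescales the weights

    def forward(self, f_in: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = f_in.shape
        w = self.avg_pool(f_in).view(b, c)
        w = self.bn(self.fc2(torch.relu(self.fc1(w))))
        return w.view(b, c, 1, 1)   # f_ca, broadcastable over H x W
```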
Dynamic convolution utilizes multiple convolutional kernels in a single layer and dynamically aggregates them using an attention mechanism in a nonlinear manner, where the attention is input-dependent. In a single layer, dynamic convolution sets up $K$ parallel convolutional kernels. For each input $x$, it calculates an input-dependent attention weight $\pi_k(x)$; the output of each convolutional kernel is multiplied by the corresponding attention weight, and the weighted outputs are summed to obtain the final convolution result. The attention weights $\pi_k$ are obtained by compressing the global spatial information with global average pooling, followed by two fully connected layers (with a ReLU activation introduced in between) and a softmax function. The dynamic convolution structure is computationally efficient and uses small kernels, greatly enhancing the model's feature representation capabilities. The expression for dynamic convolution is as follows:
$$y = g\left(\tilde{W}^{T}(x)\,x + \tilde{b}(x)\right)$$
$$\tilde{W}(x) = \sum_{k=1}^{K} \pi_k(x)\,\tilde{W}_k, \qquad \tilde{b}(x) = \sum_{k=1}^{K} \pi_k(x)\,\tilde{b}_k$$
$$\text{s.t.}\quad 0 \le \pi_k(x) \le 1, \qquad \sum_{k=1}^{K} \pi_k(x) = 1$$
where the aggregated weight $\tilde{W}(x)$ and bias $\tilde{b}(x)$ are functions of the input, $g$ is an activation function, and $\pi_k$ is the attention weight of the $k$-th linear function $\tilde{W}_k^{T}x + \tilde{b}_k$.
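A compact PyTorch sketch of dynamic convolution following these equations is given below; $K = 4$ kernels and the attention branch's reduction ratio are assumptions, and the per-sample aggregated kernels are evaluated in one pass as a grouped convolution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv2d(nn.Module):
    """A sketch of dynamic convolution [26]: K parallel kernels aggregated
    by input-dependent softmax attention weights pi_k(x)."""
    def __init__(self, in_ch, out_ch, kernel_size, K=4, reduction=4, padding=0):
        super().__init__()
        self.padding = padding
        self.weight = nn.Parameter(
            0.02 * torch.randn(K, out_ch, in_ch, kernel_size, kernel_size))
        self.bias = nn.Parameter(torch.zeros(K, out_ch))
        # pi_k(x): GAP -> FC -> ReLU -> FC -> softmax over the K kernels.
        hidden = max(in_ch // reduction, 1)
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, K), nn.Softmax(dim=1))

    def forward(self, x):
        b, c, h, w = x.shape
        pi = self.attn(x)                                     # (B, K)
        # Aggregate the K kernels per sample, then evaluate all samples at
        # once by folding the batch into the channel axis (groups=b trick).
        W = torch.einsum('bk,koihw->boihw', pi, self.weight)  # per-sample kernels
        bias = (pi @ self.bias).reshape(-1)                   # (B * out_ch,)
        out = F.conv2d(x.reshape(1, b * c, h, w),
                       W.reshape(-1, c, *W.shape[-2:]),
                       bias, padding=self.padding, groups=b)
        return out.view(b, -1, *out.shape[-2:])
```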
In the branch of the dynamic spatial attention module (DSAM), the process starts by compressing and integrating the feature maps using a 1 × 1 convolution. Subsequently, three dynamic convolutions are applied to effectively utilize the contextual information from feature maps of different scales and enhance the feature representation at different spatial positions. Finally, a 1 × 1 convolution adjusts the output scale, resulting in spatial attention weights. The calculation for the DSAM is as follows:
$$f_{sa} = BN\left(c_2^{1\times1}\left(d_3^{3\times3}\left(d_2^{3\times3}\left(d_1^{3\times3}\left(c_1^{1\times1}(f_{in})\right)\right)\right)\right)\right)$$
where $c$ represents a standard convolution operation, $d$ represents a dynamic convolution operation, $BN$ denotes batch normalization, and the superscript denotes the convolutional filter size. Adding the channel attention and spatial attention and passing the sum through a sigmoid activation gives
$$f_{cs} = \sigma(f_{ca} + f_{sa})$$
where $\sigma$ represents the sigmoid activation function. Based on this, the final feature map output by the dynamic hybrid attention module is
$$f_{out} = f_{in} + (f_{in} \otimes f_{cs})$$
where $\otimes$ denotes element-wise multiplication.
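Putting the pieces together, the following sketch assembles the DHAM from the ChannelAttention and DynamicConv2d sketches above; the compression width of the spatial branch (channels // 4) is an assumption.

```python
import torch
import torch.nn as nn

class DHAM(nn.Module):
    """Sketch of the DHAM: channel attention and dynamic spatial attention
    run in parallel, are summed, squashed by a sigmoid, and used to reweight
    the input with a residual connection (f_out = f_in + f_in * f_cs)."""
    def __init__(self, channels: int):
        super().__init__()
        mid = max(channels // 4, 1)
        self.cam = ChannelAttention(channels)
        self.dsam = nn.Sequential(
            nn.Conv2d(channels, mid, 1),            # c1: 1x1 compression
            DynamicConv2d(mid, mid, 3, padding=1),  # d1
            DynamicConv2d(mid, mid, 3, padding=1),  # d2
            DynamicConv2d(mid, mid, 3, padding=1),  # d3
            nn.Conv2d(mid, 1, 1),                   # c2: 1x1 back to a spatial map
            nn.BatchNorm2d(1))

    def forward(self, f_in: torch.Tensor) -> torch.Tensor:
        # (B, C, 1, 1) + (B, 1, H, W) broadcasts to a full attention map.
        f_cs = torch.sigmoid(self.cam(f_in) + self.dsam(f_in))
        return f_in + f_in * f_cs
```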

2.4. RGSM Decoder

Slim-neck introduces a lightweight convolution system called GSConv [27], whose structure is illustrated in Figure 3. Depth-wise separable convolution can reduce the number of parameters and the computational overhead but does not effectively integrate spatial and channel information. The GSConv structure first applies a standard convolution operation to the input feature map to obtain the first set of feature maps. Then, these feature maps undergo depth-wise separable convolution. Specifically, a separate convolutional kernel is applied independently to each input channel. Subsequently, a 1 × 1 convolutional kernel is used to perform linear combinations between the channels, generating the second set of feature maps. Subsequently, these two sets of feature maps are concatenated and subjected to channel shuffle operations, enhancing the flow of feature information in both the spatial and channel dimensions while reducing the network’s computational complexity.
The role of the decoder is mainly to fuse the downsampled feature maps with the upsampled feature maps to restore detailed image information. Because PCB soldering defects are small and irregular, the detailed defect information is severely diminished after multiple rounds of feature extraction, so effective feature fusion operations are required in the decoder to restore the features. To address this issue, this paper proposes a residual GSConv module (RGSM), based on GSConv and incorporating residual connections, for use in the decoder. The RGSM structure, shown in Figure 4, takes as input a feature map obtained by concatenating downsampled and upsampled feature maps. First, a 1 × 1 GSConv is employed to fuse the features and exploit the channel shuffle operation, recovering detailed information from the shallower features and aiding feature restoration from the upsampled features. Through these operations, features of different depths but with the same scale, which were previously simply concatenated, are fused into new features. ReLU activation is then applied to enhance the nonlinearity of the features, and GSConv is used again to strengthen the feature fusion. Next, a residual connection combines these features with those from a parallel 3 × 3 standard convolution branch, followed by ReLU activation to enhance the nonlinearity of the combined feature maps, thereby strengthening the feature expression capabilities. Additionally, the RGSM effectively reduces the number of model parameters, decreases the network's computational costs, and alleviates the parameter increase caused by deepening the encoder network, thereby facilitating overall model optimization.
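The sketch below illustrates GSConv and the RGSM as just described; the kernel size of the second GSConv, the normalization details, and even channel counts are assumptions.

```python
import torch
import torch.nn as nn

def channel_shuffle(x: torch.Tensor, groups: int = 2) -> torch.Tensor:
    """Interleave the channels of the two concatenated halves."""
    b, c, h, w = x.shape
    return x.view(b, groups, c // groups, h, w).transpose(1, 2).reshape(b, c, h, w)

class GSConv(nn.Module):
    """Sketch of GSConv [27]: standard conv to half the output channels,
    depth-wise + point-wise conv on that result, concatenation, shuffle."""
    def __init__(self, in_ch, out_ch, kernel_size=1, stride=1):
        super().__init__()
        half, pad = out_ch // 2, kernel_size // 2
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, half, kernel_size, stride, pad, bias=False),
            nn.BatchNorm2d(half), nn.ReLU(inplace=True))
        self.dw = nn.Sequential(
            nn.Conv2d(half, half, 3, 1, 1, groups=half, bias=False),  # depth-wise
            nn.Conv2d(half, half, 1, bias=False),                     # point-wise
            nn.BatchNorm2d(half), nn.ReLU(inplace=True))

    def forward(self, x):
        y1 = self.conv(x)
        y2 = self.dw(y1)
        return channel_shuffle(torch.cat([y1, y2], dim=1))

class RGSM(nn.Module):
    """Sketch of the RGSM: 1x1 GSConv -> ReLU -> GSConv, added to a parallel
    3x3 standard convolution branch, then a final ReLU."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.gs1 = GSConv(in_ch, out_ch, kernel_size=1)
        self.gs2 = GSConv(out_ch, out_ch, kernel_size=3)
        self.shortcut = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch))

    def forward(self, x):
        y = self.gs2(torch.relu(self.gs1(x)))
        return torch.relu(y + self.shortcut(x))
```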

3. Experimental Setup and Evaluation Indicators

3.1. Dataset

The PCB soldering defect dataset used in this study consists of five types of defects [22], namely missing solder (MS), missing proteus (MP), solder short (SS), overturned direction (OD), and solder ball (SB) defects. MS refers to components having insufficient or no solder joints, which may cause short circuits on the PCB. MP indicates that the components are not installed in the correct position, resulting in an open circuit. SS refers to two or more adjacent solder joints of components melted together, leading to open circuits or abnormal circuit functions. OD denotes the reverse installation of chip components, which may affect the functionality of the PCB. SB represents solder balls that violate the minimum spacing requirements. Figure 5 provides examples of these five types of defects. Additionally, there are 285 instances of MS defects, 166 cases of MP defects, 92 cases of SS defects, 128 cases of OD defects, and 177 instances of SB defects.

3.2. Evaluation Indicators

This paper employs the following four commonly used semantic segmentation metrics to comprehensively evaluate the segmentation performance of the model: mean intersection over union (mIoU), mean pixel accuracy (mPA), frames per second (FPS), and number of model parameters. True positive (TP) represents the number of correctly predicted positive samples, true negative (TN) denotes the number of correctly predicted negative samples, false positive (FP) indicates the number of negative samples incorrectly predicted as positive, and false negative (FN) represents the number of positive samples incorrectly predicted as negative.
The mIoU is a commonly used evaluation metric for semantic image segmentation. It calculates, for each class, the ratio of the intersection to the union of the predicted set and the ground truth, and then averages across classes. The definition of the mIoU is as follows:
$$\mathrm{mIoU} = \frac{1}{N}\sum_{i=1}^{N}\frac{TP_i}{TP_i + FP_i + FN_i}$$
The mPA aims to measure the accuracy of pixel-level classification tasks. It represents the average proportion of correctly predicted pixels for each class out of the total number of pixels. The calculation formula is as follows:
$$\mathrm{mPA} = \frac{1}{N}\sum_{i=1}^{N}\frac{TP_i + TN_i}{TP_i + TN_i + FP_i + FN_i}$$
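For concreteness, both metrics can be computed from a confusion matrix accumulated over the validation set, as in the following sketch, which mirrors the two equations above.

```python
import numpy as np

def confusion_matrix(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> np.ndarray:
    """Accumulate an N x N confusion matrix from integer label maps."""
    mask = (gt >= 0) & (gt < num_classes)
    return np.bincount(num_classes * gt[mask].astype(int) + pred[mask].astype(int),
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)

def miou_mpa(cm: np.ndarray, eps: float = 1e-10):
    """mIoU and mPA as defined above: per-class IoU = TP/(TP+FP+FN) and
    per-class PA = (TP+TN)/(TP+TN+FP+FN), averaged over the classes."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - tp - fp - fn
    miou = np.mean(tp / (tp + fp + fn + eps))
    mpa = np.mean((tp + tn) / (tp + tn + fp + fn + eps))
    return miou, mpa
```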

3.3. Experiment Details

The dataset consists of 340 images with defects, out of which 240 are used for training and 100 for validation. To ensure that the model learns an adequate number of defect features and maintains robustness to varying lighting conditions, image angles, and sizes, data augmentation is applied to the training set. This includes randomly adjusting the brightness with a factor ranging from −0.2 to 0.2, adjusting the contrast with a factor ranging from −0.2 to 0.2, randomly rotating the image between −90 and 90 degrees, and resizing the image to 512 × 512 pixels. These augmentation techniques are applied individually or in random combinations, resulting in an expanded training set of 720 images. The validation set remains unchanged to evaluate the model’s training performance and ensure its generalization.
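A sketch of such an augmentation pipeline is shown below; the 0.5 application probability per transform is an assumption, and geometric transforms are applied jointly to the image and its mask so the labels stay aligned.

```python
import random
import torchvision.transforms.functional as TF
from torchvision.transforms import InterpolationMode

def augment(image, mask):
    """Apply the augmentations described above to a (PIL) image/mask pair."""
    if random.random() < 0.5:   # brightness factor in [-0.2, 0.2] around 1.0
        image = TF.adjust_brightness(image, 1.0 + random.uniform(-0.2, 0.2))
    if random.random() < 0.5:   # contrast factor in [-0.2, 0.2] around 1.0
        image = TF.adjust_contrast(image, 1.0 + random.uniform(-0.2, 0.2))
    if random.random() < 0.5:   # rotation between -90 and 90 degrees
        angle = random.uniform(-90.0, 90.0)
        image = TF.rotate(image, angle)
        mask = TF.rotate(mask, angle)
    image = TF.resize(image, [512, 512])
    # Nearest-neighbor interpolation keeps mask values as valid class labels.
    mask = TF.resize(mask, [512, 512], interpolation=InterpolationMode.NEAREST)
    return image, mask
```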
The input image size is uniformly set to 512 × 512. Training is conducted for 200 epochs with a batch size of 8, using the SGD optimizer for parameter optimization. The initial learning rate is set to 0.0001, and the cosine annealing learning rate schedule is employed for adjustment. The hardware of the experimental environment is detailed in Table 1.
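The corresponding training loop might look like the following sketch; the momentum and weight decay values are not reported above and are therefore assumptions.

```python
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

def train(model, train_loader, criterion, epochs=200, lr=1e-4):
    """Sketch of the stated setup: SGD at lr 1e-4 with cosine annealing."""
    optimizer = SGD(model.parameters(), lr=lr, momentum=0.9, weight_decay=1e-4)
    scheduler = CosineAnnealingLR(optimizer, T_max=epochs)
    for _ in range(epochs):
        for images, masks in train_loader:   # batch size 8, 512 x 512 inputs
            optimizer.zero_grad()
            loss = criterion(model(images), masks)
            loss.backward()
            optimizer.step()
        scheduler.step()                     # anneal once per epoch
```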

3.4. Loss Function

In PCB soldering defect images, the defect areas occupy a tiny proportion of the entire image, leading to a severe imbalance between positive and negative samples. Moreover, there is also an imbalance in the quantities of different types of defects. The choice of the loss function plays a crucial role in defect segmentation.
This paper adopts a hybrid loss function combining focal loss and Dice loss to address the issues above. Focal loss enables the model to focus more on defect areas in complex backgrounds, while Dice loss helps to alleviate the problem of poor training performance due to class imbalance. The formulas for the focal loss, Dice loss, and focal–Dice loss are as follows:
$$L_{focal} = -\frac{1}{N}\sum_{i=1}^{N}\left[(1-\alpha)\,(1-p_i^{*})\,p_i^{\gamma}\log(1-p_i) + \alpha\,p_i^{*}\,(1-p_i)^{\gamma}\log(p_i)\right]$$
$$L_{dice} = 1 - \frac{2TP}{2TP + FP + FN}$$
$$L = L_{focal} + L_{dice}$$
where $p_i^{*}$ represents the actual pixel value and $p_i$ represents the probability predicted by the model for a defect. $\alpha$ is the balancing factor weighting the loss contributions of positive and negative samples; because background pixels vastly outnumber defect pixels in PCB soldering images, $\alpha$ is set to 0.9. $\gamma$ down-weights the loss of easily separable samples so that the model pays more attention to difficult samples, improving the performance and robustness on them; $\gamma$ is set to 2.
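A minimal sketch of this focal–Dice loss for the binary defect/background case follows, written directly from the three equations above; the epsilon terms are only for numerical stability.

```python
import torch

def focal_dice_loss(logits, target, alpha=0.9, gamma=2.0, eps=1e-6):
    """Hybrid loss: focal loss plus Dice loss. `logits` is the raw model
    output and `target` the 0/1 ground-truth mask; the multi-class case in
    the paper would apply this per class."""
    p = torch.sigmoid(logits)
    # Focal loss: down-weight easy pixels; defects are weighted by alpha = 0.9.
    focal = -(alpha * target * (1 - p).pow(gamma) * torch.log(p + eps)
              + (1 - alpha) * (1 - target) * p.pow(gamma)
              * torch.log(1 - p + eps)).mean()
    # Dice loss from soft TP/FP/FN counts.
    tp = (p * target).sum()
    fp = (p * (1 - target)).sum()
    fn = ((1 - p) * target).sum()
    dice = 1 - 2 * tp / (2 * tp + fp + fn + eps)
    return focal + dice
```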
In this paper, based on the improved UNet model, experimental results were obtained using the Dice loss, focal loss, CE loss, CE–Dice loss, and focal–Dice loss. The results, shown in Table 2, indicate that the choice of loss function has a crucial impact on the training effectiveness of the soldering defect segmentation model. The focal–Dice loss best overcomes the challenges posed by the tiny proportion of defects in the images and the class imbalance.

4. Experimental Results

4.1. Encoder Comparison Experiments

The encoder primarily functions as a feature extractor, and selecting a suitable encoder is crucial for the overall segmentation performance of the model. Table 3 presents a comparison of the segmentation performance results obtained using different encoders, including MobileNetV3, ResNet50, and VGG16. When using VGG16, the model achieves the highest mIoU and mPA by incorporating deeper convolutional networks for the enhanced extraction of deeper-level features. Specifically, compared to MobileNetV3, the mIoU increases by 6.77%, and, compared to ResNet50, it increases by 4.3%. Similarly, the mPA increases by 5.34% compared to MobileNetV3 and 3.58% compared to ResNet50. These results indicate that VGG16 possesses stronger feature extraction capabilities. MobileNetV3, with fewer parameters and designed as a lightweight encoder, exhibits weaker feature extraction capabilities for soldering defect features, resulting in lower segmentation performance.

4.2. Attention Module Comparison Experiments

To validate the effectiveness of the proposed DHAM, classical attention mechanisms (BAM, CBAM, ECA) and recently proposed ones (LSKA, CCA) were compared at the same position within the network, with VGG16 as the encoder. The evaluation criteria included the mIoU, mPA, and number of model parameters, as shown in Table 4. Large selective kernel attention (LSKA) [28] can adjust the receptive fields but does not perform well for small soldering defects. Context anchor attention (CCA) [29], on the other hand, enhances the central features by utilizing the interdependencies between pixels, leading to improved segmentation accuracy. BAM [30], CBAM [31], and the DHAM are all hybrid attention mechanisms based on channel and spatial attention. ECA [32], however, focuses solely on channel information while ignoring spatial information, resulting in no improvement in the segmentation performance. The experiments show that hybrid attention, which focuses on defect areas from both the channel and spatial perspectives, enhances the PCB soldering defect segmentation performance. With BAM, the mIoU reaches 80.25% and the mPA reaches 86.24%, while CBAM achieves an mIoU of 80.45% and an mPA of 86.69%. The DHAM, by strengthening the adaptability to context information at different scales, yields the greatest improvement in the mIoU, reaching 80.50% with an mPA of 86.67%, and, as a lightweight module, it adds almost no additional parameters.

4.3. Ablation Experiments

To validate the effectiveness of each module improvement in enhancing the model performance, this study conducted ablation experiments on VGG16, the DHAM, and the RGSM individually and in combination. The evaluation criteria included the mIoU, mPA, and number of model parameters, as shown in Table 5. Module A represents the VGG16 encoder, Module B represents the DHAM, and Module C represents the RGSM.
First, Modules A, B, and C each improved the mIoU and mPA relative to the original UNet model. Module A, by enhancing the extraction of deep features, yielded significant improvements in both the mIoU and mPA but also increased the number of model parameters. Module B combined the channel and spatial attention perspectives and adaptively adjusted the spatial convolution kernel weights to enhance the feature expression; with only a minor increase in parameters, the mIoU reached 77.54% and the mPA rose to 83.76%. Module C employed effective feature fusion operations to restore the features, improving the model's ability to express feature details and enhancing the segmentation capabilities; the mIoU increased to 79.33%, the mPA rose to 84.24%, and the number of model parameters was reduced. Any combination of two modules outperformed their individual usage, indicating that the advantages of each module were not masked when used together but rather complemented each other, enhancing the overall segmentation capabilities of the model. The same trend was observed when combining all three modules. Although Module A increased the number of model parameters, the lightweight nature of Module B and the parameter-reducing effect of Module C resulted in a final model with only a 2.54 M increase in parameters over the original model, while the overall segmentation performance improved to an mIoU of 81.74% and an mPA of 87.33%.

4.4. Comparison Experiments

To demonstrate the effectiveness of our proposed method, we compared several common segmentation methods, including UNet [23], DeepLabv3plus [33], SegFormer [34], PSPNet [35], and HRNet [36], using the mIoU, mPA, number of model parameters, and FPS as evaluation metrics on the PCB soldering defect dataset. The comparison results are presented in Table 6, and the specific IoU values for the background and each defect are provided in Table 7. From the experimental results, it can be observed that DeepLabv3plus did not perform well on small defects such as SS, resulting in a relatively low overall mIoU, a large number of model parameters, and a slower speed. Although SegFormer had the smallest number of model parameters and the fastest speed, its mIoU and mPA were not outstanding. PSPNet exhibited a faster detection speed but had a larger number of model parameters and relatively lower segmentation performance. While UNet had fewer parameters and a faster detection speed, its mIoU was only 77.40% and its mPA 83.11%. HRNet demonstrated higher segmentation accuracy but had a larger number of model parameters and a slower detection speed. None of these algorithms could simultaneously achieve high segmentation accuracy, a fast detection speed, and a low number of model parameters. In contrast, our proposed method achieved the highest mIoU and mPA, reaching 81.74% and 87.33%, improvements of 4.34% and 4.22% over UNet, respectively. Moreover, our method outperformed the others in segmenting every defect type, surpassing them significantly on the most challenging MS and SS defects while maintaining a lower number of model parameters and a faster detection speed. Overall, it achieved the best performance among the compared methods.
Additionally, Figure 6 shows the segmentation results of the various methods for different defect types. All methods performed well in segmenting defects with large, regular shapes. However, as shown in the second row of images, DeepLabv3plus, SegFormer, and UNet failed to detect small SB defects. The first and third rows show that the other methods did not fully capture the details of certain defects. In contrast, our method demonstrates stronger detection capabilities and superior detail segmentation for small defect targets.

5. Conclusions

PCB soldering defect segmentation plays a crucial role in the quality control of PCBs. In this study, we propose an improved method for PCB soldering defect segmentation based on UNet. To address the issue of low segmentation accuracy, our proposed model replaces the encoder with the VGG16 encoder. By deepening the network’s feature extraction part to enhance the encoder’s feature extraction capabilities, we aim to capture more detailed edge features of defect contours.
Additionally, we introduce a hybrid attention module based on channel attention and dynamic spatial attention, replacing the simple skip connections in the UNet structure. This helps to mitigate interference from complex backgrounds and facilitates the better acquisition of defect information.
In the decoder network part, we employ the residual GSConv module to fuse features from different depths, enhancing the feature fusion capabilities and better restoring the detail information in the image.
Finally, we design a hybrid loss function combining the Dice loss and focal loss to enhance the model’s ability to distinguish between defects and backgrounds.
The experimental results demonstrate that our model achieves significant segmentation performance while maintaining a fast detection speed and a lower number of model parameters. In future work, we will continue to investigate the influence of different parameters on model training and apply the model to other PCB defects.

Author Contributions

Conceptualization, Z.L. and X.L.; methodology, Z.L.; software, Z.L.; validation, Z.L.; formal analysis, Z.L. and X.L.; investigation, Z.L.; resources, Z.L. and X.L.; data curation, Z.L.; writing—original draft preparation, Z.L.; writing—review and editing, Z.L. and X.L.; visualization, Z.L.; supervision, X.L.; project administration, X.L.; funding acquisition, X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by the Academician (Expert) Workstation Fund Project of Sichuan Province of China, under Grant 2016YSGZZ01; in part by the Special Fund for Training High Level Innovative Talents of Sichuan University of Science and Engineering, under Grant B12402005; in part by the Sichuan University of Science and Engineering for Talent Introduction project, under Grant 2021RC16; in part by the Project of Sichuan University of Science and Engineering for Research on the Ideological and Political Construction of Machine Learning Curriculum, under Grant SZ202204; and in part by the Innovation Fund of Postgraduate, Sichuan University of Science & Engineering, under Grant Y2024117.

Data Availability Statement

The data that support the findings of this study are available at the online repository of Baidu (https://pan.baidu.com/s/1_xLYqLI8HwxcsBO3i0wsyQ, access code: 25l6, accessed on 20 July 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Yuan, Z.; Tang, X.; Ning, H.; Yang, Z. LW-YOLO: Lightweight Deep Learning Model for Fast and Precise Defect Detection in Printed Circuit Boards. Symmetry 2024, 16, 418. [Google Scholar] [CrossRef]
  2. Lakshmi, G.; Sankar, V.U.; Sankar, Y.S. A Survey of PCB Defect Detection Algorithms. J. Electron. Test. 2023, 39, 541–554. [Google Scholar] [CrossRef]
  3. Zhou, Y.; Yuan, M.; Zhang, J.; Ding, G.; Qin, S. Review of Vision-Based Defect Detection Research and Its Perspectives for Printed Circuit Board. J. Manuf. Syst. 2023, 70, 557–578. [Google Scholar] [CrossRef]
  4. Zheng, J.; Sun, X.; Zhou, H.; Tian, C.; Qiang, H. Printed Circuit Boards Defect Detection Method Based on Improved Fully Convolutional Networks. IEEE Access 2022, 10, 109908–109918. [Google Scholar] [CrossRef]
  5. Fonseca, L.A.L.O.; Iano, Y.; Oliveira, G.G.d.; Vaz, G.C.; Carnielli, G.P.; Pereira, J.C.; Arthur, R. Automatic Printed Circuit Board Inspection: A Comprehensible Survey. Discov. Artif. Intell. 2024, 4, 10. [Google Scholar] [CrossRef]
  6. Sankar, V.U.; Lakshmi, G.; Sankar, Y.S. A Review of Various Defects in PCB. J. Electron. Test. 2022, 38, 481–491. [Google Scholar] [CrossRef]
  7. Chen, I.-C.; Hwang, R.-C.; Huang, H.-C. PCB Defect Detection Based on Deep Learning Algorithm. Processes 2023, 11, 775. [Google Scholar] [CrossRef]
  8. Xu, Y.; Huo, H. DSASPP: Depthwise Separable Atrous Spatial Pyramid Pooling for PCB Surface Defect Detection. Electronics 2024, 13, 1490. [Google Scholar] [CrossRef]
  9. Xiao, G.; Hou, S.; Zhou, H. PCB Defect Detection Algorithm Based on CDI-YOLO. Sci. Rep. 2024, 14, 7351. [Google Scholar] [CrossRef] [PubMed]
  10. Ulger, F.; Yuksel, S.E.; Yilmaz, A.; Gokcen, D. Solder Joint Inspection on Printed Circuit Boards: A Survey and A Dataset. IEEE Trans. Instrum. Meas. 2023, 72, 1–21. [Google Scholar] [CrossRef]
  11. Ling, Q.; Isa, N.A.M. Printed Circuit Board Defect Detection Methods Based on Image Processing, Machine Learning and Deep Learning: A Survey. IEEE Access 2023, 11, 15921–15944. [Google Scholar] [CrossRef]
  12. Zhang, L.; Chen, J.; Chen, J.; Wen, Z.; Zhou, X. LDD-Net: Lightweight Printed Circuit Board Defect Detection Network Fusing Multi-Scale Features. Eng. Appl. Artif. Intell. 2024, 129, 107628. [Google Scholar] [CrossRef]
  13. Saberironaghi, A.; Ren, J.; El-Gindy, M. Defect Detection Methods for Industrial Products Using Deep Learning Techniques: A Review. Algorithms 2023, 16, 95. [Google Scholar] [CrossRef]
  14. Božič, J.; Tabernik, D.; Skočaj, D. End-to-End Training of a Two-Stage Neural Network for Defect Detection. In Proceedings of the 2020 25th International Conference on Pattern Recognition, Milan, Italy, 10–15 January 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 5619–5626. [Google Scholar]
  15. Jiang, W.; Li, T.; Zhang, S.; Chen, W.; Yang, J. PCB Defects Target Detection Combining Multi-Scale and Attention Mechanism. Eng. Appl. Artif. Intell. 2023, 123, 106359. [Google Scholar] [CrossRef]
  16. Zhao, Q.; Ji, T.; Liang, S.; Yu, W. PCB Surface Defect Fast Detection Method Based on Attention and Multi-Source Fusion. Multimed. Tools Appl. 2024, 83, 5451–5472. [Google Scholar] [CrossRef]
  17. Ding, R.; Dai, L.; Li, G.; Liu, H. TDD-net: A Tiny Defect Detection Network for Printed Circuit Boards. CAAI Trans. Intell. Technol. 2019, 4, 110–116. [Google Scholar] [CrossRef]
  18. Shi, W.; Lu, Z.; Wu, W.; Liu, H. Single-Shot Detector with Enriched Semantics for PCB Tiny Defect Detection. J. Eng. 2020, 13, 366–372. [Google Scholar] [CrossRef]
  19. Tang, J.; Wang, Z.; Zhang, H.; Li, H.; Wu, P.; Zeng, N. A Lightweight Surface Defect Detection Framework Combined with Dual-Domain Attention Mechanism. Expert Syst. Appl. 2024, 238, 121726. [Google Scholar] [CrossRef]
  20. Yu, Y.; Wang, C.; Fu, Q.; Kou, R.; Huang, F.; Yang, B.; Yang, T.; Gao, M. Techniques and Challenges of Image Segmentation: A Review. Electronics 2023, 12, 1199. [Google Scholar] [CrossRef]
  21. Kang, D.; Lai, J.; Zhu, J.; Han, Y. An Adaptive Feature Reconstruction Network for the Precise Segmentation of Surface Defects on Printed Circuit Boards. J. Intell. Manuf. 2023, 34, 3197–3214. [Google Scholar] [CrossRef]
  22. Ling, Z.; Zhang, A.; Ma, D.; Shi, Y.; Wen, H. Deep Siamese Semantic Segmentation Network for PCB Welding Defect Detection. IEEE Trans. Instrum. Meas. 2022, 71, 1–11. [Google Scholar] [CrossRef]
  23. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention-MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Part III. pp. 234–241. [Google Scholar]
  24. Chen, H.; Min, B.-W. Research on Mobile Phone Backplane Defect Segmentation Based on MDAF-UNet. Electronics 2024, 13, 1385. [Google Scholar] [CrossRef]
  25. Kamanli, A.F. A Novel Multi-Scale Cross-Patch Attention with Dilated Convolution (MCPAD-UNET) for Metallic Surface Defect Detection. Signal Image Video Process. 2024, 18, 485–494. [Google Scholar] [CrossRef]
  26. Chen, Y.; Dai, X.; Liu, M.; Chen, D.; Yuan, L.; Liu, Z. Dynamic Convolution: Attention over Convolution Kernels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11030–11039. [Google Scholar]
  27. Li, H.; Li, J.; Wei, H.; Liu, Z.; Zhan, Z.; Ren, Q. Slim-neck by GSConv: A Better Design Paradigm of Detector Architectures for Autonomous Vehicles. arXiv 2022, arXiv:2206.02424. [Google Scholar]
  28. Li, Y.; Hou, Q.; Zheng, Z.; Cheng, M.-M.; Yang, J.; Li, X. Large Selective Kernel Network for Remote Sensing Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 16794–16805. [Google Scholar]
  29. Cai, X.; Lai, Q.; Wang, Y.; Wang, W.; Sun, Z.; Yao, Y. Poly Kernel Inception Network for Remote Sensing Detection. arXiv 2024, arXiv:2403.06258. [Google Scholar]
  30. Park, J.; Woo, S.; Lee, J.-Y.; Kweon, I.S. Bam: Bottleneck Attention Module. arXiv 2018, arXiv:1807.06514. [Google Scholar]
  31. Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. Cbam: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
  32. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11534–11542. [Google Scholar]
  33. Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 833–851. [Google Scholar]
  34. Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. Adv. Neural Inf. Process. Syst. 2021, 34, 12077–12090. [Google Scholar]
  35. Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid Scene Parsing Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2881–2890. [Google Scholar]
  36. Sun, K.; Xiao, B.; Liu, D.; Wang, J. Deep High-Resolution Representation Learning for Human Pose Estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 5693–5703. [Google Scholar]
Figure 1. The improved UNet network architecture.
Figure 2. The DHAM structure.
Figure 3. The GSConv structure.
Figure 4. The RGSM structure.
Figure 5. Schematic of soldering defects. The first row shows images without defects; the second row shows the corresponding images with defects. (a) MS. (b) MP. (c) SS. (d) OD. (e) SB.
Figure 6. Segmentation results for each method. (a) Original figure. (b) Ground truth. (c) DeepLabv3plus. (d) SegFormer-B1. (e) PSPNet-ResNet50. (f) UNet. (g) HRNet-W32. (h) Our method.
Table 1. Experimental environment.

| Category | Version |
| --- | --- |
| Operating system | Ubuntu 18.04 |
| GPU | NVIDIA RTX 3090 (24 GB) (NVIDIA, Santa Clara, CA, USA) |
| CPU | Xeon(R) Platinum 8255C (Intel, Santa Clara, CA, USA) |
| Programming | Python 3.9.18 + PyTorch 2.0.1 + CUDA 11.7 |
Table 2. The results of different loss functions when utilizing our proposed method.

| Loss Function | mIoU/% | mPA/% |
| --- | --- | --- |
| Dice | 60.28 | 63.06 |
| Focal | 61.53 | 66.80 |
| CE | 62.26 | 66.45 |
| CE + Dice | 81.56 | 87.14 |
| Focal + Dice | 81.74 | 87.33 |
Table 3. Comparison of different encoders.

| Encoder | mIoU/% | mPA/% | Parameters/M |
| --- | --- | --- | --- |
| MobileNetV3 | 73.33 | 80.76 | 8.96 |
| ResNet50 | 75.80 | 82.52 | 43.93 |
| VGG16 | 80.10 | 86.10 | 24.89 |
Table 4. Comparison of different attention modules.

| Attention Module | mIoU/% | mPA/% | Parameters/M |
| --- | --- | --- | --- |
| LSKA | 76.27 | 83.55 | 25.49 |
| CCA | 80.32 | 86.50 | 25.62 |
| ECA | 80.07 | 86.05 | 24.89 |
| BAM | 80.25 | 86.24 | 25.00 |
| CBAM | 80.45 | 86.69 | 24.94 |
| DHAM | 80.50 | 86.67 | 24.96 |
Table 5. Results of ablation experiments.

| Method | mIoU/% | mPA/% | Parameters/M |
| --- | --- | --- | --- |
| UNet | 77.40 | 83.11 | 19.59 |
| UNet + A | 80.10 | 86.10 | 24.89 |
| UNet + B | 77.54 | 83.76 | 19.73 |
| UNet + C | 79.33 | 84.24 | 16.76 |
| UNet + A + B | 80.50 | 86.67 | 24.96 |
| UNet + A + C | 81.61 | 87.07 | 22.06 |
| UNet + B + C | 79.86 | 85.60 | 16.90 |
| UNet + A + B + C | 81.74 | 87.33 | 22.13 |
Table 6. Comparison results of different methods.

| Method | mIoU/% | mPA/% | Parameters/M | FPS |
| --- | --- | --- | --- | --- |
| DeepLabv3plus | 71.70 | 80.56 | 54.71 | 25.50 |
| SegFormer-B1 | 75.26 | 83.76 | 13.68 | 60.93 |
| PSPNet-ResNet50 | 76.60 | 84.23 | 46.71 | 39.19 |
| UNet | 77.40 | 83.11 | 19.59 | 39.57 |
| HRNet-W32 | 78.27 | 85.56 | 29.54 | 22.31 |
| Ours | 81.74 | 87.33 | 22.13 | 30.16 |
Table 7. Comparison of IoU results on the background and each defect for different methods.

| Method | BG | MP | MS | SB | SS | OD |
| --- | --- | --- | --- | --- | --- | --- |
| DeepLabv3plus | 99.66 | 87.02 | 43.85 | 80.59 | 41.16 | 77.90 |
| SegFormer-B1 | 99.69 | 86.15 | 44.43 | 81.44 | 60.87 | 79.00 |
| PSPNet-ResNet50 | 99.72 | 88.37 | 43.00 | 81.78 | 65.28 | 81.43 |
| UNet | 99.74 | 88.03 | 52.40 | 84.69 | 56.57 | 82.99 |
| HRNet-W32 | 99.73 | 88.55 | 49.13 | 84.51 | 64.62 | 83.07 |
| Ours | 99.78 | 89.82 | 57.59 | 87.61 | 70.74 | 84.93 |
