Lightweight Reconstruction Network for Surface Defect Detection Based on Texture Complexity Analysis

Shi, Hui; Li, Gangyan; Bao, Hanwei

doi:10.3390/electronics12173617


by Hui Shi, Gangyan Li and Hanwei Bao *

School of Mechanical and Electronic Engineering, Wuhan University of Technology, Wuhan 430074, China

* Author to whom correspondence should be addressed.

Electronics 2023, 12(17), 3617; https://doi.org/10.3390/electronics12173617

Submission received: 1 August 2023 / Revised: 20 August 2023 / Accepted: 23 August 2023 / Published: 27 August 2023


Abstract

Deep learning networks have shown excellent performance in surface defect recognition and classification for certain industrial products. However, defect samples of most industrial products are scarce and cover a wide variety of defect types, which makes methods that require a large number of defect samples for training unsuitable. In this paper, a lightweight surface defect detection network (LRN-L) based on texture complexity analysis is proposed. Training requires only defect-free samples, which can be easily obtained in large numbers. LRN-L includes two stages: the texture reconstruction stage and the defect localization stage. In the texture reconstruction stage, a lightweight reconstruction network (LRN) based on a convolutional autoencoder is designed to reconstruct defect-free texture images; a loss function combining structural loss and L1 loss is proposed to improve the detection effect; and a calculation model for image complexity is built, with which the texture complexity of the texture samples is calculated and the textures are divided into three levels. In the defect localization stage, the residual between the reconstructed image and the original image is taken as the possible defect region, and the defect is localized via a segmentation algorithm. The network structure, loss function, texture complexity and other factors of LRN-L are analyzed in detail, and LRN-L is compared with other similar algorithms on multiple texture datasets. The results show that LRN-L has strong robustness, accuracy and generalization ability, and is more suitable for industrial online detection.

Keywords: defect detection; deep learning; convolution autoencoder; loss function; texture complexity

1. Introduction

Traditional machine learning methods can effectively solve the defect detection problem for a variety of industrial products, such as bearings [1], mobile screens [2], coiled materials [3], rails [4] and steel beams [5]. These methods rely on manually designed feature extractors adapted to a specific product image dataset, and feed the extracted features into classifiers such as SVM (support vector machine) [6] and NN (neural network) [7] to determine whether the product has defects. However, when the product surface has a complex background texture, large variation in defect scale, or defect regions whose features resemble the background (as shown in Figure 1), traditional machine learning methods cannot meet the detection requirements.

Since AlexNet [8] was proposed, the deep learning method based on convolutional neural network (CNN) has become the mainstream method in the field of surface defect detection [9,10,11,12]. CNN can not only automatically learn image features, but also extract more abstract image features through the superposition of multiple convolution layers, which has better feature representation ability than the manually designed feature extraction algorithm. According to the results of network output, the defect detection algorithm based on deep learning can be divided into the defect classification method, defect recognition method and defect segmentation method.

The algorithms based on defect classification usually train classical classification networks on the samples, and the trained model can distinguish defective from defect-free samples. Tian [13] used two CNN networks to detect defects in six types of images; Xu [14] proposed a CNN classification network integrating VGG (Visual Geometry Group) and ResNet to detect and classify the surface defects of rollers; Weimer [15] also used CNN to identify defect categories. Such methods usually do not locate the defect areas.

In order to accurately locate the defect area, some researchers have adapted networks that perform well in object detection tasks and applied them to surface defect detection. Such algorithms are mostly based on R-CNN [16], SSD (single-shot multibox detector) [17], YOLO (You Only Look Once) [18] and other networks. Chen [19] applied deep CNN (DCNN) to accelerate defect detection.

To achieve pixel-level detection accuracy, some researchers have used segmentation networks. For example, Huang [20] used U-Net to transform the defect detection task into a semantic segmentation task, which improves the accuracy of magnetic tile surface detection, and Long [21] used a fully convolutional network (FCN) to segment the defect area. These methods all rely on a certain number of defect samples.

In many cases, the type of product defect is unpredictable, and it is difficult to collect a large number of defect samples. To solve these problems, researchers have turned to small-sample and unsupervised learning methods. For example, Yu [22] trained the YOLOv3 network on a small number of defective samples to achieve high-accuracy detection. Autoencoder (AE) based methods have also been used for surface defect detection, such as the convolutional autoencoder (CAE) [23], the stacked noise reduction autoencoder based on the Fisher criterion (FCSDA) [24], the robust autoencoder (RCAE) [25] and the sparse denoising autoencoder network fused with gradient difference information [26]. Mei [27] proposed the multi-scale convolution autoencoder network (MSCDAE), which reconstructs the image and generates the detection result from the reconstruction residual. Compared with traditional unsupervised algorithms, such as PHOT (phase-only transform) [28] and DCT (discrete cosine transform) [29], MSCDAE greatly improves the model evaluation indices. Yangh [30] applied feature clustering on top of MSCDAE to improve the reconstruction accuracy of the texture background. The data samples used for the above reconstruction networks are mostly regular textures, and differences in image texture are not considered, so the reported detection accuracy can neither fully reflect the performance of these methods nor measure their generalization ability.

In addition to the autoencoder, the generative adversarial network (GAN) [31] has also been applied to unsupervised defect detection. By learning from a large number of normal samples, the generator of a GAN learns the texture features of normal samples. Zhao [32] combined GAN and autoencoder, inserting defects into defect-free samples and training the GAN to restore the images. He [33] used SGAN and an autoencoder to train on unlabeled steel surface defect samples, extract fine-grained image features and classify them. Schlegl [34] proposed the AnoGAN network for unsupervised anomaly detection in lesion images. However, GAN suffers from unstable performance [35] in applications.

Considering the scarcity of defect samples in application, this paper proposes a method based on a lightweight reconstruction network for low-complexity texture (LRN-L). This method uses only a small number of defect-free samples to train the reconstruction network, so that the network learns to reconstruct defect-free texture. When abnormal samples are input, the trained model can detect their abnormal regions. In addition to the experimental analysis of the network structure, loss function, algorithm efficiency and other aspects, this paper introduces a texture complexity index and uses a calculation model of texture complexity to grade the texture samples, in order to evaluate the detection ability and applicability of LRN-L.

2. LRN-L

LRN-L is divided into two stages: the texture reconstruction stage and the defect location stage. In the texture reconstruction stage, the reconstruction network (LRN) is designed based on CAE and trained with only a small number of defect-free samples, so that it generates defect-free images. In the defect location stage, the residual image between the reconstructed image and the original image is computed, and the defect is located by a segmentation algorithm. The LRN-L model is shown in Figure 2.

2.1. Texture Complexity

Texture complexity reflects the difficulty of certain operations (such as image enhancement and defect detection). It serves two purposes: measuring the performance of an algorithm, and classifying textures or measuring the similarity between them. The structure of the reconstruction network is closely related to the texture complexity, so textures of different complexity levels call for different network structures. Texture complexity can be measured in different ways [36,37,38,39,40]. In this paper, the GLCM (gray level co-occurrence matrix) [41] is used to statistically analyze texture features that reflect the complexity.

If the image has N gray levels, the gray level co-occurrence matrix P is an N × N matrix whose element in the i-th row and j-th column is the probability that two pixels with gray levels i and j, separated by the displacement δ = (Δx, Δy), occur together in the image. δ determines the distance and direction between the two pixels. Four directions θ are commonly used: 0°, δ = (Δx, 0); 45°, δ = (Δx, Δy); 90°, δ = (0, Δy); and 135°, δ = (−Δx, Δy).
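
The definition of the co-occurrence matrix can be made concrete with a short sketch. The function name `glcm`, the list-of-rows image representation and the unit-distance `OFFSETS` table are illustrative assumptions, not taken from the paper; the sign of the y axis is also an assumption and only decides which diagonal is labelled 45° and which 135°.

```python
def glcm(image, dx, dy, levels):
    """Normalised gray level co-occurrence matrix for the offset (dx, dy).

    `image` is a list of rows holding integer gray levels in [0, levels).
    P[i][j] is the fraction of pixel pairs (p, q), with q displaced from p
    by dx columns and dy rows, whose gray levels are i and j.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset is larger than the image")
    return [[n / total for n in row] for row in counts]


# Assumed unit-distance offsets (dx, dy) for the four usual directions.
OFFSETS = {0: (1, 0), 45: (1, 1), 90: (0, 1), 135: (-1, 1)}

# A 4 x 4 image with four gray levels, analysed in the 0 degree direction.
example = [[0, 0, 1, 1],
           [0, 0, 1, 1],
           [0, 2, 2, 2],
           [2, 2, 3, 3]]
P0 = glcm(example, *OFFSETS[0], levels=4)
```

This sketch counts ordered pairs only, so the matrix is not symmetric; some implementations also count the reversed pair, which is a design choice the paper does not specify.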

Generally, the five most commonly used parameters are extracted from the GLCM to describe texture features: Energy J, Entropy H, Contrast G, Deficit Q (the inverse difference moment) and Correlation COV, which are defined as follows:

J = \sum_{i = 0}^{N - 1} \sum_{j = 0}^{N - 1} {P_{i j}}^{2},

(1)

H = - \sum_{i = 0}^{N - 1} \sum_{j = 0}^{N - 1} P_{i j} \log_{2} P_{i j},

(2)

G = \sum_{i = 0}^{N - 1} \sum_{j = 0}^{N - 1} {(i - j)}^{2} P_{i j},

(3)

Q = \sum_{i = 0}^{N - 1} \sum_{j = 0}^{N - 1} \frac{P_{i j}}{1 + {(i - j)}^{2}},

(4)

\mathrm{COV} = \frac{\sum_{i = 0}^{N - 1} \sum_{j = 0}^{N - 1} i j P_{i j} - μ_{1} μ_{2}}{{σ_{1}}^{2} {σ_{2}}^{2}},

(5)

where

\begin{matrix} μ_{1} = \sum_{i = 0}^{N - 1} i \sum_{j = 0}^{N - 1} P_{i j}, & μ_{2} = \sum_{j = 0}^{N - 1} j \sum_{i = 0}^{N - 1} P_{i j}, \\ {σ_{1}}^{2} = \sum_{i = 0}^{N - 1} {(i - μ_{1})}^{2} \sum_{j = 0}^{N - 1} P_{i j}, & {σ_{2}}^{2} = \sum_{j = 0}^{N - 1} {(j - μ_{2})}^{2} \sum_{i = 0}^{N - 1} P_{i j}. \end{matrix}
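
As a rough illustration of Formulas (1) to (5), the sketch below computes the five parameters from a normalised co-occurrence matrix given as a list of rows. The function name `glcm_features` and the dictionary keys are hypothetical. The correlation follows Formula (5) as printed, dividing by the product of the two variances; the widely used Haralick correlation divides by σ₁σ₂ instead, which bounds the value to [−1, 1].

```python
import math


def glcm_features(P):
    """Energy, entropy, contrast, deficit and correlation of a GLCM P."""
    n = len(P)
    cells = [(i, j, P[i][j]) for i in range(n) for j in range(n)]

    energy = sum(p * p for _, _, p in cells)                        # Formula (1)
    entropy = -sum(p * math.log2(p) for _, _, p in cells if p > 0)  # Formula (2)
    contrast = sum((i - j) ** 2 * p for i, j, p in cells)           # Formula (3)
    deficit = sum(p / (1 + (i - j) ** 2) for i, j, p in cells)      # Formula (4)

    # Marginal distributions over rows (index i) and columns (index j).
    row = [sum(P[i]) for i in range(n)]
    col = [sum(P[i][j] for i in range(n)) for j in range(n)]
    mu1 = sum(i * row[i] for i in range(n))
    mu2 = sum(j * col[j] for j in range(n))
    var1 = sum((i - mu1) ** 2 * row[i] for i in range(n))
    var2 = sum((j - mu2) ** 2 * col[j] for j in range(n))

    # Formula (5) as printed; a constant image has zero variance, in which
    # case this sketch returns 0.0 rather than dividing by zero.
    denom = var1 * var2
    numer = sum(i * j * p for i, j, p in cells) - mu1 * mu2
    correlation = numer / denom if denom > 0 else 0.0

    return {"J": energy, "H": entropy, "G": contrast,
            "Q": deficit, "COV": correlation}


features = glcm_features([[0.5, 0.0], [0.0, 0.5]])
```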

The GLCMs in the four directions are extracted from the texture image, and J, H, G, Q and COV are calculated for each direction, denoted J_i, H_i, G_i, Q_i and COV_i, where i = 1, 2, 3, 4. To make the texture features independent of direction, the harmonic mean of each feature parameter over the four directions is calculated by Formula (6). Taking J as an example, the energy values in the four directions are J₁, J₂, J₃ and J₄, and the energy value J of the texture image is obtained from Formula (6).

J = \frac{4}{\sum_{i = 1}^{4} 1 / J_{i}},

(6)
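
Formula (6) is the harmonic mean of the four directional values. A minimal sketch follows; the function name and the handling of a zero value are assumptions of this example, since the paper does not say what happens when a directional feature is exactly zero.

```python
def harmonic_mean(values):
    """Harmonic mean of directional feature values, as in Formula (6)."""
    if any(v == 0 for v in values):
        # Assumption of this sketch: return the limiting value 0.0
        # instead of dividing by zero.
        return 0.0
    return len(values) / sum(1.0 / v for v in values)


# Energy values in the 0, 45, 90 and 135 degree directions (made-up numbers).
J = harmonic_mean([1.0, 2.0, 4.0, 4.0])
```

Because the harmonic mean is dominated by the smallest value, a single direction with a low feature value pulls the combined value down more strongly than an arithmetic mean would.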

Among the five parameters, J, H and G are positively correlated with texture complexity, while Q and COV are negatively correlated with it. Inspired by SSIM [42], G, Q and COV are selected as indicators of texture complexity based on the texture features of industrial product surface images. The mean square error (MSE) is used to assign weights to G, Q and COV, and the texture complexity f is constructed as shown in Formula (7). In Formula (7), PC_i (i = 1, 2, 3) denotes G, Q and COV, respectively, and ā, MSE_i and ω_i denote the average of the three parameters, the squared deviation of PC_i from that average, and the weight assigned to PC_i, respectively.

\begin{cases} ā = (G + Q + \mathrm{COV}) / 3 \\ \mathrm{MSE}_{i} = {(\mathrm{PC}_{i} - ā)}^{2}, \quad i = 1, 2, 3 \\ ω_{i} = \frac{\mathrm{MSE}_{i}}{\sum_{k = 1}^{3} \mathrm{MSE}_{k}}, \quad i = 1, 2, 3 \\ f = ω_{1} \mathrm{PC}_{1} + ω_{2} (1 - \mathrm{PC}_{2}) + ω_{3} (1 - \mathrm{PC}_{3}) \end{cases},

(7)
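
The weighting scheme of Formula (7) can be sketched as follows. The function name is hypothetical, and the sketch assumes that G, Q and COV have already been normalised to [0, 1], which the (1 − PC) terms suggest but the text does not state. The equal-weight fallback for three identical inputs is also an assumption, since Formula (7) is undefined in that case.

```python
def texture_complexity(G, Q, COV):
    """Texture complexity f and its weights, following Formula (7)."""
    pcs = [G, Q, COV]
    mean = sum(pcs) / 3
    mse = [(pc - mean) ** 2 for pc in pcs]
    total = sum(mse)
    if total == 0:
        # All three parameters equal: fall back to equal weights (assumption).
        weights = [1 / 3, 1 / 3, 1 / 3]
    else:
        weights = [m / total for m in mse]
    # G raises complexity; Q and COV lower it, hence the (1 - PC) terms.
    f = weights[0] * G + weights[1] * (1 - Q) + weights[2] * (1 - COV)
    return f, weights


f, weights = texture_complexity(0.6, 0.3, 0.3)
```

In this example the contrast deviates most from the average, so it receives the largest weight (2/3) and dominates the resulting complexity value.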

Mario [44] divided image textures into three levels according to complexity: low-complexity, medium-complexity and high-complexity textures, represented by L, M and H, respectively, as shown in Figure 3.

2.2. Lightweight Reconstruction Network Model (LRN)

The core of lightweight reconstruction network model is to comprehensively design the network from two aspects, namely, network structure and detection speed, while maintaining accuracy. According to the characteristics of industrial product texture samples, some improvements are made on the basis of CAE. The structure of LRN is shown in Figure 4.

First, the original image is input into the network, and three convolution kernels of size 1 × 1, 3 × 3 and 5 × 5 are used to obtain multi-scale features, which are then fed to the CAE module. The output of the decoding module is deconvolved with kernels of different sizes to obtain reconstructed images at three scales, and the final reconstructed image is obtained via feature fusion. Like MSCDAE [27], LRN obtains multi-scale features, but at a lower computational cost.

The CAE module of LRN includes four convolution sub-modules and four deconvolution sub-modules. Each convolution sub-module includes a convolution layer, a BN (batch normalization) layer [43] and a nonlinear activation layer. The first three convolution sub-modules also include a pooling layer that changes the image scale. The activation function is ReLU6. The first three convolution layers use a 5 × 5 kernel, and the last uses a 3 × 3 kernel.

The depth of the reconstruction network determines the reconstruction ability of the autoencoder. A model with a complex network structure improves texture feature extraction, but it also improves feature extraction in the defect region, so the defect itself is reconstructed. The residual operation then fails to reveal the defect because the difference between the reconstructed image and the original image is not obvious enough. LRN uses a lightweight network structure, which has limited ability to reconstruct textures. However, through the design of multi-scale features and the loss function, the network can not only fully learn the characteristics of normal texture but also reconstruct defective areas as normal texture.

2.3. Loss Function

The LRN takes the reconstruction error between the original image and the reconstructed image as the loss function to promote the convergence of the network. Set the input image as x and the reconstructed image as y.

1. L₁ Loss

L₁ loss is also known as MAE (mean absolute error) loss, which is defined as:

L_{1} = |x - y| + λ {‖ω‖}_{F},

(8)

where ω represents the set of weight matrices in the reconstructed network, λ represents the penalty factor of the regularization term, and 0 < λ < 1.

2. L₂ Loss

L₂ loss is also called MSE (mean squared error) loss, which is a common loss function to evaluate the difference between the reconstructed image and original image, and it is defined as follows:

L_{2} = {‖x - y‖}^{2} + λ {‖ω‖}_{F},

(9)

Compared with L₁, L₂ is more sensitive to abnormal areas and over-penalizes large errors, as is the case for MSCDAE [27], so LRN adopts the L₁ loss.

3. Structural Loss

Neither L₁ nor L₂ considers the structural characteristics of texture, so LRN introduces SSIM (structural similarity index) [44] to build a loss function. SSIM compares two images in terms of brightness, contrast and structure [45], as shown in Formula (10). The larger the SSIM, the more similar the images are; when the two images are identical, SSIM = 1. Therefore, to use it as a loss function, SSIM is subtracted from 1, as shown in Formula (11).

\mathrm{SSIM}(x, y) = \frac{2 μ_{x} μ_{y} + C_{1}}{μ_{x}^{2} + μ_{y}^{2} + C_{1}} \times \frac{2 σ_{x} σ_{y} + C_{2}}{σ_{x}^{2} + σ_{y}^{2} + C_{2}} \times \frac{σ_{x y} + C_{3}}{σ_{x} σ_{y} + C_{3}},

(10)

L_{\mathrm{SSIM}}(x, y) = 1 - \mathrm{SSIM}(x, y),

(11)

where μ_{x} and μ_{y} are the average brightness of x and y, σ_{x} and σ_{y} are the standard deviations of their pixel values, σ_{x y} is the covariance of x and y, and C₁, C₂ and C₃ are constant terms that are added to avoid situations where the denominator is zero.

4. Loss Function of LRN

The loss function designed in this paper, L_LRN, combines the advantages of L₁ and L_SSIM and adopts a combined form, as shown in Formula (12), where α is a weight factor with the range of (0, 1) to balance the proportion of L₁ loss and L_SSIM.

L_{\mathrm{LRN}} = α L_{1} + (1 - α) L_{\mathrm{SSIM}},

(12)
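
The combined loss of Formula (12) can be sketched in plain Python as below. Several points are assumptions of this example rather than details from the paper: images are lists of rows with values in [0, 1]; SSIM is computed from global image statistics using the standard luminance, contrast and structure terms, whereas practical implementations usually average it over local windows; the constants use the common defaults C₁ = (0.01 L)², C₂ = (0.03 L)² and C₃ = C₂ / 2 for dynamic range L; and the regularisation term λ‖ω‖ is left out, since it is typically handled by the optimizer's weight decay. All function names are hypothetical.

```python
import math


def _flatten(img):
    return [p for row in img for p in row]


def l1_loss(x, y):
    """Mean absolute error between two images."""
    a, b = _flatten(x), _flatten(y)
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def ssim_global(x, y, dynamic_range=1.0):
    """SSIM from global statistics (luminance * contrast * structure)."""
    a, b = _flatten(x), _flatten(y)
    n = len(a)
    c1 = (0.01 * dynamic_range) ** 2   # assumed default constants
    c2 = (0.03 * dynamic_range) ** 2
    c3 = c2 / 2
    mu_x, mu_y = sum(a) / n, sum(b) / n
    var_x = sum((p - mu_x) ** 2 for p in a) / n
    var_y = sum((q - mu_y) ** 2 for q in b) / n
    cov = sum((p - mu_x) * (q - mu_y) for p, q in zip(a, b)) / n
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov + c3) / (sd_x * sd_y + c3)
    return luminance * contrast * structure


def lrn_loss(x, y, alpha=0.15):
    """Formula (12): alpha * L1 + (1 - alpha) * (1 - SSIM)."""
    return alpha * l1_loss(x, y) + (1 - alpha) * (1 - ssim_global(x, y))


original = [[0.2, 0.4], [0.6, 0.8]]
flipped = [[0.8, 0.6], [0.4, 0.2]]
loss_same = lrn_loss(original, original)
loss_diff = lrn_loss(original, flipped)
```

An identical reconstruction gives a loss close to zero, while the flipped image, which has the same brightness and contrast but the opposite structure, is penalised mainly through the SSIM term.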

2.4. Defect Location

1. Residual Image

The residual between the original image (Figure 5a, where the red circle marks the defect area) and the image reconstructed by LRN (Figure 5b, where the red circle marks the reconstructed defect area) is computed using Formula (13). The residual image, shown in Figure 5c, contains the location information of the abnormal area.

r = {(x - y)}^{2},

(13)

2. Noise Removal

The residual image contains a lot of noise, which forms pseudo defects that interfere with locating the real defect area. A mean filter is used for denoising, and the result is shown in Figure 5d.

3. Threshold Segmentation and Defect Location

The adaptive threshold method is used to locate the defect, and the final result is obtained, as shown in Figure 5e.
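
The three localization steps above can be sketched end to end. The paper does not specify the filter size or which adaptive threshold is used, so the 3 × 3 window, the border handling and the data-dependent threshold (mean plus `n_sigma` standard deviations of the filtered residual) are assumptions of this example, as are all function names.

```python
import math


def squared_residual(x, y):
    """Formula (13): pixel-wise squared difference."""
    return [[(p - q) ** 2 for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def mean_filter(img, k=3):
    """k x k mean filter; border pixels average over in-bounds neighbours."""
    rows, cols, h = len(img), len(img[0]), k // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [img[rr][cc]
                      for rr in range(max(0, r - h), min(rows, r + h + 1))
                      for cc in range(max(0, c - h), min(cols, c + h + 1))]
            out[r][c] = sum(window) / len(window)
    return out


def threshold_mask(img, n_sigma=2.0):
    """Binary mask of pixels above mean + n_sigma * std (assumed rule)."""
    values = [v for row in img for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    t = mean + n_sigma * std
    return [[v > t for v in row] for row in img]


def locate_defects(x, y, k=3, n_sigma=2.0):
    """Residual, denoising and thresholding chained together."""
    return threshold_mask(mean_filter(squared_residual(x, y), k), n_sigma)


# Toy example: an 8 x 8 image with a 3 x 3 bright defect, and a
# reconstruction that has replaced the defect with background.
x = [[1.0 if 2 <= r <= 4 and 2 <= c <= 4 else 0.0 for c in range(8)]
     for r in range(8)]
y = [[0.0] * 8 for _ in range(8)]
mask = locate_defects(x, y)
```

With these settings the mask keeps the core of the defect and drops its blurred fringe; a lower `n_sigma` keeps more of the fringe at the risk of admitting pseudo defects.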

3. Experiment

In this paper, LRN-L is tested on surface texture datasets of industrial products. The factors that influence LRN-L, including the loss function, network structure and texture complexity, are analyzed in detail. Finally, LRN-L is compared with other similar unsupervised algorithms. The program was implemented in Python 3.6 with the PyTorch framework. Performance testing was carried out using CUDA 9.0 and cuDNN 5.1. The workstation has an Intel Xeon X5 CPU @ 2.9 GHz and 128 GB of DDR4 memory, and runs Ubuntu 16.04. The GPU is a single NVIDIA GTX 1080 Ti with 11 GB of video memory.

3.1. Dataset Introduction

The texture samples are shown in Figure 6. Figure 6a–j are from the DAGM2007 dataset [46], which contains 10 kinds of texture samples. Figure 6k–n are from the MVTec dataset [35]. Figure 6o is from AITEX [47]. Each of the 15 kinds of texture contains 100 defect-free positive samples for training and 10 defect samples for testing. The image size is 512 × 512.

3.2. Evaluation Index

This paper uses Recall, Precision and F₁ Measure to evaluate the performance of LRN-L, which are defined as follows:

\mathrm{Recall} = \frac{TP}{TP + FN} \times 100\%,

(14)

\mathrm{Precision} = \frac{TP}{TP + FP} \times 100\%,

(15)

F_{1}\ \mathrm{Measure} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}},

(16)

where TP is the number of defect samples whose defects are correctly segmented, FN is the number of defect samples in which no defect area is detected, and FP is the number of normal samples in which a defect area is detected.
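
Formulas (14) to (16) translate directly into code. The sketch below returns fractions rather than percentages, matching how the values are quoted in the text; the function name and the convention of returning 0.0 for an empty denominator are assumptions of this example.

```python
def detection_metrics(tp, fp, fn):
    """Recall, Precision and F1 Measure from sample counts."""
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    if precision + recall == 0:
        return recall, precision, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return recall, precision, f1


# Made-up counts: 8 correct detections, 1 false alarm, 2 missed defects.
recall, precision, f1 = detection_metrics(tp=8, fp=1, fn=2)
```

F₁ is the harmonic mean of Precision and Recall, so it is high only when both false alarms and missed defects are rare.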

3.3. Network Structure Comparison Experiment

The network structure affects the training results of the reconstructed network. In this experiment, the structure of LRN is compared with classic networks such as FCN [21] and U-Net [48]. The experimental results are shown in Figure 7.

The results show that the reconstruction network should not have too many layers when it is used to detect texture surface defects. Although a deep network structure has a strong feature extraction ability, it tends to reconstruct the defect area as well, so the residual between the reconstructed image and the original image is almost zero and the defect cannot be located. A lightweight reconstruction network, in contrast, not only fully learns the texture features of positive samples but also reconstructs the defect area as normal texture, producing an obvious reconstruction error. Therefore, LRN needs neither many layers nor complex structures such as GRL (Global Residual Learning) [49], sub-pixel layers [50] and residual connections [51].

3.4. Loss Function Comparison Experiment

In this experiment, L₁, L₂, L_SSIM and their combinations, including L_LRN, are compared. During training, the image block (patch) size is 32 × 32 pixels and the batch size is 256; after 1000 iterations, the model outputs are passed to the defect location module.
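
To illustrate the patch-based training input, the sketch below cuts an image into 32 × 32 blocks. The paper does not say whether patches overlap or how they are sampled, so the non-overlapping tiling (stride equal to the patch size) and the function name are assumptions of this example.

```python
def extract_patches(image, patch=32, stride=32):
    """Cut a list-of-rows image into patch x patch blocks, row by row."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - patch + 1, stride):
        for c in range(0, cols - patch + 1, stride):
            patches.append([row[c:c + patch] for row in image[r:r + patch]])
    return patches


# A blank 512 x 512 image, the size used for the texture samples.
blank = [[0.0] * 512 for _ in range(512)]
patches = extract_patches(blank)
```

A smaller stride would produce overlapping patches and more training data per image, at the cost of more redundancy between patches.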

Figure 8a,b show the experimental results for two types of surface defect samples under various loss functions; the red circle marks the defect area. Figure 8a shows defect samples with irregular surface texture. The residual results show that using L₂ as the loss function produces more noise points outside the real defect area, forming pseudo defects; using L_SSIM alone yields a detected defect area slightly smaller than the real one; and L_LRN achieves a better result than the other loss functions. Figure 8b shows defect samples with regular surface texture. Here, the defect area obtained with L₂ is poorly preserved, similar to the result obtained with L₂ + L_SSIM. L_LRN achieves a good result, similar to that of L₁ alone.

Table 1 shows the Precision, Recall and F₁ Measure of LRN under different loss functions. For the defect samples with irregular surface texture in Figure 8a, L_LRN achieves the highest Recall and F₁ Measure, 0.75 and 0.82, respectively, and is slightly inferior to L_SSIM in Precision. For the defect samples with regular surface texture in Figure 8b, L₁ alone achieves the highest Recall of 0.76, followed by L_LRN with 0.71. L_SSIM alone achieves the highest Precision of 0.96, and L_LRN is second with 0.87. For F₁ Measure, L₁ performs best.

The results show the following: (1) For a regular texture, L₁ alone, L_SSIM alone and their combination (L_LRN) all achieve good results with only slight differences. (2) For an irregular texture, L_LRN is recommended because it obtains better results. (3) L_LRN handles more types of texture surface abnormalities and is the best overall choice among the loss functions compared.

3.5. Experiment of Texture Complexity

In the face of defect detection tasks with different texture complexities, it is necessary to evaluate the applicability of LRN-L. The characteristic parameters of texture samples shown in Figure 6 are calculated according to Formulas (6) and (7) and are shown in Table 2. The experimental results are shown in Table 3.

From the results in Table 3, LRN-L reconstructs low-complexity and medium-complexity textures well and yields a higher defect detection rate on them, but its efficacy diminishes on high-complexity textures. Notably, for low-complexity and medium-complexity textures there is no direct linear relationship between the evaluation indices and the texture complexity. For instance, samples (d) and (j), despite their low texture complexity, were effectively undetectable: their texture structures are irregular and inhomogeneous, so all three indices are low.

Overall, LRN-L yields superior results when applied to samples with low-complexity and medium-complexity textures, particularly those with relatively uniform texture structures. On the other hand, samples with low-complexity and medium-complexity textures but non-uniform texture structures have low detection indices. LRN-L is unsuited to deal with high-complexity textures.

3.6. Experiment of Loss Function under Different Weight Factors

L_LRN is a combination of L₁ and L_SSIM, as shown in Formula (12). Weight factor α was used to balance the relative importance of these two components. Using sample (g) in Figure 6, with α from 0.15 to 0.85, we conducted a series of comparative experiments in increments of 0.1. The results are shown in Figure 9 and Table 4.

As illustrated in Figure 9, the results vary significantly with α. As α increases, the proportion of L_SSIM decreases, which reduces the structural influence. The results obtained at α = 0.15 exhibit the least noise and the most accurate defect localization. Table 4 shows that α = 0.15 produces the highest Recall and F₁ Measure, 0.79 and 0.73, respectively, as well as the second-highest Precision.

3.7. Comparison Experiment with Related Algorithms

In this experiment, LRN-L is compared with traditional unsupervised methods (LCA [52] and PHOT [28]) and with an autoencoder-based unsupervised method (MSCDAE [27]); the literature has shown that MSCDAE outperforms other autoencoder methods such as ACAE [9] and RCAE [25]. The experiment uses the texture samples (b), (e), (j), (n) and (o) in Figure 6, which are low-complexity and medium-complexity textures. Each of the five kinds of texture contains 100 defect-free positive samples for training and 10 defect samples for testing. The default network parameters are as follows: the block size is 32 × 32, the batch size is 256, the number of epochs is 1000 and the weight α is 0.15. The results are shown in Figure 10.

LCA eliminates the high-frequency part, which represents the background, and retains the low-frequency part, which represents the defect; it is therefore not suitable for irregular textures, as shown for No. 3 in Figure 10. PHOT is effective only on No. 3. MSCDAE detects the defect areas of all samples but also flags some defect-free areas as suspected defects, as in No. 1, No. 3 and No. 5. LRN-L achieves good detection results on all types of defects and textures.

In addition, Recall, Precision and F₁ Measure are used to quantitatively analyze the experimental results of the above four methods, as shown in Table 5 (the optimal result is highlighted in bold font).

As can be seen from Table 5, the three metrics of LRN-L are superior to other algorithms in almost all types of samples. The Recall on sample No. 3 is slightly lower than that of MSCDAE, but MSCDAE will simultaneously detect defect-free areas and generate pseudo defects.

The efficiency of the algorithms is also compared. Sample images of 1024 × 1024 pixels were used in the experiment. The processing times of the four methods on the same hardware are compared in Table 6. The average detection time of LRN-L is 2.82 ms, which meets the requirements of industrial real-time detection. The other methods are more time-consuming, which limits their practical application.

4. Conclusions

In this paper, a texture defect detection method based on a reconstruction network (LRN-L) is proposed. LRN is a reconstruction network built on a CAE with a lightweight structure. In the texture reconstruction stage, only defect-free samples are used for training, which addresses the shortage of defective samples in industry. In the defect location stage, the defect region is accurately located by a segmentation algorithm. The L_LRN loss function is designed for defect detection and improves the detection effect. An evaluation index of image complexity is established, the texture complexity of the texture samples is calculated, and the textures are divided into complexity levels. This paper discusses the influence of network structure, loss function, texture complexity and other factors on unsupervised defect detection, and compares the proposed LRN-L with other unsupervised algorithms on multiple types of texture samples. The results show that LRN-L has strong robustness, accuracy and generalization ability, and is well suited to deployment in industrial inspection. Because the network is lightweight, LRN-L is best suited to detecting surface defects of industrial products with low-complexity and medium-complexity textures.

Author Contributions

Conceptualization, G.L. and H.B.; methodology, H.S.; software, H.S.; validation, H.S. and H.B.; formal analysis, H.S.; investigation, H.S.; resources, H.S.; data curation, H.S.; writing—original draft preparation, H.S.; writing—review and editing, H.S. and G.L.; visualization, H.S.; supervision, G.L.; project administration, G.L.; funding acquisition, H.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Acknowledgments

The authors thank Gangyan Li for his helpful suggestions with regard to this paper. We also thank Wenyong Yu for his helpful analysis of the methodology. We also thank Haiming Yao for his helpful collaboration on and corrections to this paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

Deng, S.; Cai, W.; Xu, Q.; Liang, B. Defect detection of bearing surfaces based on machine vision technique. In Proceedings of the International Conference on Computer Application and System Modeling (ICCASM 2010), Taiyuan, China, 22 October 2010. [Google Scholar]
Jian, C.; Gao, J.; Ao, Y. Automatic surface defect detection for mobile phone screen glass based on machine vision. Appl. Soft Comput. 2017, 52, 348–358. [Google Scholar] [CrossRef]
Bulnes, F.G.; Usamentiaga, R.; Garcia, D.F.; Molleda, J. An efficient method for defect detection during the manufacturing of web materials. J. Intell. Manuf. 2016, 27, 431–445. [Google Scholar] [CrossRef]
Jin, X.T.; Wang, Y.N.; Zhagn, H.; Liu, L.; Zhong, H.; Hei, Z.D. Deep Rail: Automatic visual detection system for railway surface defect using Bayesian CNN and attention network. Acta Autom. Sin. 2019, 45, 2312–2327. [Google Scholar]
Li, L.F.; Ma, W.F.; Li, L.; Lu, C.J. Research on detection algorithm for bridge cracks based on deep learning. Acta Autom. Sin. 2019, 45, 1727–1742. [Google Scholar]
Chen, S.; Hu, T.; Liu, G.; Pu, Z.; Li, M.; Du, L. Defect classification algorithm for IC photomask based on PCA and SVM. In Proceedings of the Congress on Image and Signal Processing, Sanya, China, 27 May 2008. [Google Scholar]
Huang, J.X.; Li, D.; Ye, F.; Zhang, W.J. Detection of surface defection of solder on flexible printed circuit. Opt. Precis. Eng. 2010, 18, 2443–2453. [Google Scholar]
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
Napoletano, P.; Piccoli, F.; Schettini, R. Anomaly detection in nanofibrous materials by CNN-based self-similarity. Sensors 2018, 18, 209. [Google Scholar] [CrossRef]
Cha, Y.J.; Choi, W.; Suh, G.; Mahmoudkhani, S.; Büyüköztürk, O. Autonomous structural visual inspection using region-based deep learning for detecting multiple damage types. Comput.-Aided Civ. Infrastruct. Eng. 2018, 33, 731–747. [Google Scholar] [CrossRef]
Gao, Y.; Gao, L.; Li, X.; Yan, X. A semi-supervised convolutional neural network-based method for steel surface defect recognition. Robot. Comput.-Integr. Manuf. 2020, 61, 1018–1025. [Google Scholar] [CrossRef]
Zhao, Z.; Xu, G.; Qi, Y.; Liu, N.; Zhang, T. Multi-patch deep features for power line insulator status classification from aerial images. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24 July 2016.
Wang, T.; Chen, Y.; Qiao, M.; Snoussi, H. A fast and robust convolutional neural network-based defect detection model in product quality control. Int. J. Adv. Manuf. Technol. 2018, 94, 3465–3471.
Xu, X.; Zheng, H.; Guo, Z.; Wu, X.; Zheng, Z. SDD-CNN: Small Data-Driven Convolution Neural Networks for Subtle Roller Defect Inspection. Appl. Sci. 2019, 9, 1364.
Weimer, D.; Scholz, R.B.; Shpitalni, M. Design of deep convolutional neural network architectures for automated feature extraction in industrial inspection. Manuf. Technol. 2016, 65, 417–420.
Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015.
Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. Comput. Vis. Pattern Recognit. 2016, 6, 779–788.
Chen, J.; Liu, Z.; Wang, H.; Núñez, A.; Han, Z. Automatic defect detection of fasteners on the catenary support device using deep convolutional neural network. IEEE Trans. Instrum. Meas. 2017, 67, 257–269.
Huang, Y.; Qiu, C.; Guo, Y.; Wang, X.; Yuan, K. Surface defect saliency of magnetic tile. In Proceedings of the IEEE 14th International Conference on Automation Science and Engineering, Munich, Germany, 20 August 2018.
Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 39, 640–651.
Yu, W.; Zhang, Y.; Shi, H. Surface Defect Inspection Under a Small Training Set Condition. In Proceedings of the International Conference on Intelligent Robotics and Applications, Shenyang, China, 8 August 2019.
Masci, J.; Meier, U.; Cireşan, D.; Schmidhuber, J. Stacked convolutional auto-encoders for hierarchical feature extraction. In Proceedings of the International Conference on Artificial Neural Networks, Torremolinos, Spain, 8 June 2011.
Li, Y.; Zhao, W.; Pan, J. Deformable patterned fabric defect detection with fisher criterion-based deep learning. IEEE Trans. Autom. Sci. Eng. 2016, 14, 1256–1264.
Chalapathy, R.; Menon, A.K.M.; Chawla, S. Robust, Deep and Inductive Anomaly Detection. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases; Springer: Berlin/Heidelberg, Germany, 2017; pp. 36–51.
Yuan, J.; Zhang, Y.J. Application of sparse denoising autoencoder network with gradient difference information for abnormal action detection. Acta Autom. Sin. 2017, 43, 604–610.
Mei, S.; Yang, H.; Yin, Z. An Unsupervised-Learning-Based Approach for Automated Defect Inspection on Textured Surfaces. IEEE Trans. Instrum. Meas. 2018, 67, 1266–1277.
Aiger, D.; Talbot, H. The phase only transform for unsupervised surface defect detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 295–302.
Lin, H.D. Tiny surface defect inspection of electronic passive components using discrete cosine transform decomposition and cumulative sum techniques. Image Vis. Comput. 2008, 26, 603–621.
Yang, H.; Chen, Y.; Song, K.; Yin, Z. Multiscale Feature-Clustering-Based Fully Convolutional Autoencoder for Fast Accurate Visual Inspection of Texture Surface Defects. IEEE Trans. Autom. Sci. Eng. 2019, 16, 1450–1467.
Makhzani, A.; Shlens, J.; Jaitly, N.; Goodfellow, I.; Frey, B. Adversarial autoencoders. arXiv 2015, arXiv:1511.05644.
Zhao, Z.; Li, B.; Dong, R.; Zhao, P. A Surface Defect Detection Method Based on Positive Samples. In Proceedings of the International Conference on Artificial Intelligence, Nanjing, China, 28–31 August 2018; Pacific Rim. Springer: Cham, Switzerland, 2018; pp. 473–481.
Di, H.; Ke, X.; Peng, Z.; Dongdong, Z. Surface defect classification of steels with a new semi-supervised learning method. Opt. Lasers Eng. 2019, 117, 40–48.
Schlegl, T.; Seeböck, P.; Waldstein, S.M.; Schmidt-Erfurth, U.; Langs, G. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In Proceedings of the International Conference on Information Processing in Medical Imaging, Boone, NC, USA, 25–30 June 2017; Springer: Cham, Switzerland, 2017; Volume 6, pp. 146–157.
Bergmann, P.; Fauser, M.; Sattlegger, D.; Steger, C. A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Los Angeles, CA, USA, 15 June 2019; pp. 9592–9600.
Chen, Y.Q.; Duan, J.; Zhu, Y.; Qian, X. Research on the Image Complexity Based on Texture Features. Chin. Opt. 2015, 8, 407–413.
Zou, J.; Liu, C.C. Texture classification by matching co-occurrence matrices on statistical manifolds. In Proceedings of the 10th IEEE International Conference on Computer and Information Technology (CIT 2010), Bradford, UK, 29 June 2010; pp. 1–7.
Gao, Z.Y.; Yang, X.M.; Gong, J.M.; Jin, H. Research on Image Complexity Description Methods. J. Image Graph. 2010, 15, 129–135.
Guo, X.Y.; Li, W.S.; Qian, Y.H.; Bai, R.Y.; Jia, C.H. Computational Evaluation Methods of Visual Complexity Perception for Images. Acta Electron. Sin. 2020, 48, 819–826.
Yang, L.; Zhou, Y.; Yang, J.; Chen, L. Variance WIE based infrared images processing. Electron. Lett. 2006, 42, 857–859.
Haralick, R.M.; Shanmugam, K. Texture features for image classification. IEEE Trans. Syst. Man Cybern. 1973, 3, 610–621.
Bergmann, P.; Löwe, S.; Fauser, M.; Sattlegger, D.; Steger, C. Improving unsupervised defect segmentation by applying structural similarity to autoencoders. arXiv 2018, arXiv:1807.02011.
Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv 2015, arXiv:1502.03167.
Zhao, H.; Gallo, O.; Frosio, I.; Kautz, J. Loss functions for image restoration with neural networks. IEEE Trans. Comput. Imaging 2016, 3, 47–57.
Lv, C.; Zhang, Z.; Shen, F.; Zhang, F.; Su, H. A Fast Surface Defect Detection Method Based on Background Reconstruction. Int. J. Precis. Eng. Manuf. 2019, 21, 363–375.
Jager, M.; Knoll, C.; Hamprecht, F.A. Weakly supervised learning of a classifier for unusual event detection. IEEE Trans. Image Process. 2019, 17, 1700–1708.
Silvestre, B.J.; Albero, A.T.; Miralles, I.; Pérez-Llorens, R.; Moreno, J. A Public Fabric Database for Defect Detection Methods and Results. Autex Res. J. 2019, 19, 363–374.
Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Istanbul, Turkey, 17 October 2016; pp. 234–241.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June 2016; pp. 770–778.
Shi, W.; Caballero, J.; Huszár, F.; Totz, J.; Aitken, A.P.; Bishop, R.; Rueckert, D.; Wang, Z. Real-time single image and video super-resolution using an efficient sub-pixel Convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June 2016; pp. 1874–1883.
Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June 2016; pp. 4700–4708.
Tsai, D.M.; Huang, T.Y. Automated surface inspection for statistical textures. Image Vis. Comput. 2003, 21, 307–323.

Figure 1. Various surface defects. (a) Dark defects. (b) Bright defects. (c) Large-scale defects covering the image. (d) Minor defects. (e) Defects with small color difference. (f,g) Defects similar to texture. (h) Fuzzy defects.

Figure 2. LRN-L model.

Figure 3. Classification of texture complexity.

Figure 4. The structure of LRN.

Figure 5. Defect location operation process. (a) The original image. (b) Reconstruction image via LRN. (c) The residual image obtained via Formula (13). (d) Filtered residual map. (e) Defect location.
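
The five panels of Figure 5 correspond to a residual, filter, and segment pipeline. The following is a minimal NumPy sketch of that flow under stated assumptions, not the authors' implementation: the residual is taken as an absolute pixel difference (Formula (13) is not reproduced here and may differ), the filter is a plain 3×3 mean filter, and the segmentation step is a global threshold at mean + k·std. The names `mean_filter3`, `locate_defects`, and the parameter `k` are hypothetical.

```python
import numpy as np


def mean_filter3(img: np.ndarray) -> np.ndarray:
    """3x3 box filter with edge replication (assumed stand-in for the paper's filter)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    acc = np.zeros_like(img, dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0


def locate_defects(original: np.ndarray, reconstructed: np.ndarray, k: float = 3.0) -> np.ndarray:
    """Return a boolean defect mask from an image and its reconstruction."""
    # (c) residual: absolute difference is an assumption, not Formula (13) itself
    residual = np.abs(original.astype(np.float64) - reconstructed.astype(np.float64))
    # (d) filtered residual map
    filtered = mean_filter3(residual)
    # (e) defect location: simple global threshold (assumed segmentation rule)
    threshold = filtered.mean() + k * filtered.std()
    return filtered > threshold


# Toy example: a flat texture with one bright 4x4 patch that the
# (idealised) reconstruction does not reproduce.
original = np.full((32, 32), 0.5)
original[10:14, 20:24] = 1.0
reconstructed = np.full((32, 32), 0.5)
mask = locate_defects(original, reconstructed)
print("defect pixels flagged:", int(mask.sum()))
```

Because the box filter spreads the residual, the mask in this toy case covers the patch plus a thin border around it; a stricter `k` or a morphological clean-up would tighten it.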

Figure 6. Defect location operation process. (a–j) DAGM2007 dataset. (k–n) MVTec dataset. (o) AITEX dataset.

Figure 7. Residual images of the network structure comparison experiment.

Figure 8. Results under different loss functions. (a) Irregular texture samples. (b) Regular texture samples.

Figure 9. Comparison under different weight factors.

Figure 10. Comparison results of multiple methods.

Table 1. Results under different loss functions (A: irregular texture sample; B: regular texture sample).

Index	Sample	L₁	L₂	L_SSIM	L₂ + L_SSIM	L_LRN
Precision	A	0.93	0.35	0.93	0.52	0.89
Precision	B	0.84	0.65	0.96	0.70	0.87
Recall	A	0.51	0.38	0.59	0.50	0.75
Recall	B	0.76	0.70	0.59	0.67	0.71
F₁ Measure	A	0.66	0.36	0.72	0.51	0.82
F₁ Measure	B	0.80	0.67	0.73	0.69	0.78
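
As a quick sanity check on Table 1, the F₁ column can be recomputed from the precision and recall columns. The sketch below assumes the standard definition of F₁ as the harmonic mean of precision and recall; the helper name `f1_score` and the dictionary layout are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for sample A, copied from Table 1.
TABLE1_SAMPLE_A = {
    "L1": (0.93, 0.51, 0.66),
    "L2": (0.35, 0.38, 0.36),
    "L_SSIM": (0.93, 0.59, 0.72),
    "L2 + L_SSIM": (0.52, 0.50, 0.51),
    "L_LRN": (0.89, 0.75, 0.82),
}

for name, (p, r, reported) in TABLE1_SAMPLE_A.items():
    print(f"{name:12s} recomputed F1 = {f1_score(p, r):.3f} (reported {reported:.2f})")
```

The recomputed values agree with the reported ones to within about 0.01. The small gaps are plausibly due to rounding of the two-decimal inputs, or to F₁ being averaged per image rather than computed from averaged precision and recall.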

Table 2. Characteristic parameters of texture samples.

Samples	J	H	G	Q	COV	f
a	0.025	3.983	5.720	0.391	0.032	3.344
b	0.009	4.653	2.173	0.273	0.001	1.586
c	0.043	3.439	0.819	0.692	0.212	0.8035
d	0.148	2.343	0.738	0.765	0.600	0.569
e	0.035	3.649	1.408	0.601	0.207	1.1005
f	0.013	4.755	6.285	0.415	0.048	3.6185
g	0.100	2.682	0.558	0.781	0.474	0.542
h	0.042	3.451	0.648	0.731	0.172	0.738
i	0.045	3.383	0.702	0.716	0.209	0.7465
j	0.063	3.227	1.131	0.675	0.295	0.918
k	0.035	5.273	1.160	0.664	0.166	0.997
l	0.121	3.555	0.290	0.845	0.513	0.3885
m	0.188	2.969	0.298	0.854	1.007	0.1455
n	0.021	5.808	2.215	0.525	0.123	1.546
o	0.074	4.203	1.386	0.703	0.254	1.066
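
One observation that can be checked directly from Table 2: every tabulated f value equals (G + 1 − COV)/2 computed from the G and COV columns of the same row. The sketch below verifies this numerically. It is an empirical check against the table only; the complexity model defined in Section 2.1 is not reproduced here and may be stated in a different, more general form. The function name `complexity_from_table` is hypothetical.

```python
# (G, COV, reported f) per texture sample, copied from Table 2.
TABLE2 = {
    "a": (5.720, 0.032, 3.344),
    "b": (2.173, 0.001, 1.586),
    "c": (0.819, 0.212, 0.8035),
    "d": (0.738, 0.600, 0.569),
    "e": (1.408, 0.207, 1.1005),
    "f": (6.285, 0.048, 3.6185),
    "g": (0.558, 0.474, 0.542),
    "h": (0.648, 0.172, 0.738),
    "i": (0.702, 0.209, 0.7465),
    "j": (1.131, 0.295, 0.918),
    "k": (1.160, 0.166, 0.997),
    "l": (0.290, 0.513, 0.3885),
    "m": (0.298, 1.007, 0.1455),
    "n": (2.215, 0.123, 1.546),
    "o": (1.386, 0.254, 1.066),
}


def complexity_from_table(g: float, cov: float) -> float:
    """Relation observed in Table 2 (an inference from the data, not the paper's stated formula)."""
    return (g + 1.0 - cov) / 2.0


max_err = max(abs(complexity_from_table(g, cov) - f) for g, cov, f in TABLE2.values())
print(f"largest deviation from the tabulated f: {max_err:.2e}")
```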

Table 3. Results under different texture complexity.

Samples	f	Level	Precision	Recall	F₁ Measure
a	3.344	H	0.001	0.001	0.001
b	1.586	M	0.855	0.799	0.822
c	0.8035	L	0.68	0.908	0.777
d	0.569	L	0.034	0.337	0.062
e	1.1005	M	0.925	0.883	0.904
f	3.6185	H	0.001	0.001	0.001
g	0.542	L	0.937	0.742	0.828
h	0.738	L	0.739	0.854	0.792
i	0.7465	L	0.824	0.946	0.881
j	0.918	L	0.291	0.431	0.348
k	0.997	L	0.596	0.064	0.116
l	0.3885	L	0.754	0.823	0.787
m	0.1455	L	0.807	0.492	0.612
n	1.546	M	0.935	0.948	0.941
o	1.066	M	0.884	0.772	0.824
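
To make the trend in Table 3 easier to see, the F₁ values can be averaged per complexity level. The sketch below uses only the numbers in the table; reading H, M, and L as high, medium, and low complexity is an assumption based on the f values in each group.

```python
from collections import defaultdict
from statistics import mean

# (sample, level, F1 measure), copied from Table 3.
TABLE3 = [
    ("a", "H", 0.001), ("b", "M", 0.822), ("c", "L", 0.777),
    ("d", "L", 0.062), ("e", "M", 0.904), ("f", "H", 0.001),
    ("g", "L", 0.828), ("h", "L", 0.792), ("i", "L", 0.881),
    ("j", "L", 0.348), ("k", "L", 0.116), ("l", "L", 0.787),
    ("m", "L", 0.612), ("n", "M", 0.941), ("o", "M", 0.824),
]

f1_by_level = defaultdict(list)
for _sample, level, f1 in TABLE3:
    f1_by_level[level].append(f1)

mean_f1 = {level: mean(values) for level, values in f1_by_level.items()}
for level in ("L", "M", "H"):
    print(f"level {level}: n = {len(f1_by_level[level])}, mean F1 = {mean_f1[level]:.3f}")
```

On these fifteen samples the medium level has the highest mean F₁, the low level is mixed (samples d, j, and k pull the average down), and the two high-complexity samples are effectively undetected.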

Table 4. Comparison of test results under different weight factors.

Weight factor α	0	0.15	0.25	0.35	0.45	0.55	0.65	0.75	0.85	1
Precision	0.71	0.69	0.58	0.28	0.46	0.53	0.23	0.89	0.54	0.62
Recall	0.72	0.79	0.62	0.73	0.65	0.67	0.52	0.55	0.72	0.45
F₁ Measure	0.71	0.73	0.60	0.41	0.54	0.60	0.32	0.68	0.62	0.52
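
The weight factor with the best F₁ in Table 4 can be read off programmatically. This is a small sketch over the tabulated values only; the list names are illustrative.

```python
# Columns of Table 4, in the order of the weight factor row.
ALPHAS = [0, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 1]
PRECISION = [0.71, 0.69, 0.58, 0.28, 0.46, 0.53, 0.23, 0.89, 0.54, 0.62]
RECALL = [0.72, 0.79, 0.62, 0.73, 0.65, 0.67, 0.52, 0.55, 0.72, 0.45]
F1 = [0.71, 0.73, 0.60, 0.41, 0.54, 0.60, 0.32, 0.68, 0.62, 0.52]

# Pick the weight factor that maximises each index.
best_alpha_f1, best_f1 = max(zip(ALPHAS, F1), key=lambda pair: pair[1])
best_alpha_precision, best_precision = max(zip(ALPHAS, PRECISION), key=lambda pair: pair[1])
best_alpha_recall, best_recall = max(zip(ALPHAS, RECALL), key=lambda pair: pair[1])

print(f"best F1        = {best_f1:.2f} at alpha = {best_alpha_f1}")
print(f"best precision = {best_precision:.2f} at alpha = {best_alpha_precision}")
print(f"best recall    = {best_recall:.2f} at alpha = {best_alpha_recall}")
```

In this table F₁ and recall both peak at α = 0.15, while precision peaks at α = 0.75. The response is not monotonic in α, so a finer sweep around 0.15 may be worthwhile.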

Table 5. Comparison of detection effects of different algorithms.

Index	Sample	LCA	PHOT	MSCDAE	LRN-L
Recall	1	0.478	0.133	0.203	0.799
Recall	2	0.612	0.318	0.359	0.946
Recall	3	0.117	0.341	0.966	0.707
Recall	4	0.641	0.414	0.881	0.948
Recall	5	0.663	0.155	0.562	0.772
Precision	1	0.024	0.112	0.143	0.855
Precision	2	0.412	0.367	0.696	0.824
Precision	3	0.002	0.478	0.444	0.793
Precision	4	0.899	0.006	0.920	0.935
Precision	5	0.436	0.324	0.463	0.884
F₁ Measure	1	0.045	0.122	0.168	0.822
F₁ Measure	2	0.492	0.341	0.662	0.881
F₁ Measure	3	0.004	0.398	0.608	0.732
F₁ Measure	4	0.748	0.012	0.900	0.941
F₁ Measure	5	0.526	0.210	0.508	0.824
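
A compact way to summarise Table 5 is the mean F₁ per algorithm and the best algorithm per test sample. The sketch below uses only the F₁ rows of the table; the variable names are illustrative.

```python
from statistics import mean

# F1 measure for test samples 1-5, copied from Table 5.
F1_BY_ALGORITHM = {
    "LCA": [0.045, 0.492, 0.004, 0.748, 0.526],
    "PHOT": [0.122, 0.341, 0.398, 0.012, 0.210],
    "MSCDAE": [0.168, 0.662, 0.608, 0.900, 0.508],
    "LRN-L": [0.822, 0.881, 0.732, 0.941, 0.824],
}

mean_f1 = {name: mean(scores) for name, scores in F1_BY_ALGORITHM.items()}

# For each sample index, the algorithm with the highest F1.
winners = [
    max(F1_BY_ALGORITHM, key=lambda name: F1_BY_ALGORITHM[name][i])
    for i in range(5)
]

for name, value in sorted(mean_f1.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:7s} mean F1 = {value:.3f}")
print("best per sample:", winners)
```

LRN-L has the highest F₁ on all five samples, although MSCDAE has the higher recall on sample 3 (0.966 against 0.707).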

Table 6. Comparison of processing time.

Algorithms	PHOT	LCA	MSCDAE	LRN-L
Time (ms)	450	430	9746.59	2.82
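
The timings in Table 6 translate into speed-up factors as follows. The sketch assumes the reported times are per image and that images are processed one after another; the throughput figure is derived under that assumption and is not a number reported in the table.

```python
# Processing time in milliseconds, copied from Table 6.
TIME_MS = {"PHOT": 450.0, "LCA": 430.0, "MSCDAE": 9746.59, "LRN-L": 2.82}

# How many times faster LRN-L is than each baseline.
speedup = {
    name: t / TIME_MS["LRN-L"]
    for name, t in TIME_MS.items()
    if name != "LRN-L"
}

# Images per second, assuming sequential per-image processing.
throughput = 1000.0 / TIME_MS["LRN-L"]

for name, factor in speedup.items():
    print(f"LRN-L is about {factor:.0f}x faster than {name}")
print(f"LRN-L throughput: about {throughput:.0f} images per second")
```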

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

