1. Introduction
Welding inspection, that is, the inspection of the quality of welded products to ensure the integrity, reliability, safety and usability of the welded structure, is widely used in the aerospace, aviation, automotive, machinery, shipbuilding and other industries.
Although welding technology has developed to a considerably mature stage, welding defects may occur as a result of improper manual operation, unstable environments and welding equipment problems. As shown in
Figure 1, the defects can include porosity, concavity, cracks, etc. A good welding defect detection technology will improve the productivity of the manufacturing industry, speed up the production cycle for high-quality products and cut the costs of labor and materials [
1].
The most traditional welding inspection method, which is carried out by experienced professionals with naked eyes and specialized tools, not only results in detection inefficiency but also demands a large number of professionals. Moreover, detection accuracy cannot be guaranteed due to visual fatigue and other problems caused by working long hours. Other methods can be categorized as either destructive testing or non-destructive testing methods depending on how the detection is conducted [
2]. Non-destructive testing methods can achieve detection without causing any damage to the tested object and thus have been widely used and studied.
The non-destructive testing methods include magnetic particle testing, eddy current testing, magneto-optical imaging testing, ultrasonic testing, infrared testing, penetrant testing and phased array ultrasonic testing, each of which has its own limitations [
3,
4,
5,
6,
7]. The first three methods have certain requirements regarding the type of material to be tested. Specifically, magnetic particle testing requires that the test piece be ferromagnetic; eddy current testing is only applicable to the detection of surface and near-surface defects in conductive materials, as many interference factors are involved in the detection process; and magneto-optical imaging inspection produces images in which either the object itself or the background is unclear, thus requiring a series of further image-processing algorithms to improve the contrast of the magneto-optical image and highlight the welding characteristic information. Ultrasonic testing and infrared testing, on the other hand, place high requirements on the surface of the object to be tested. Ultrasonic testing does not work well on rough surfaces, because a rough surface interferes with the ultrasonic projection and thus affects the accuracy of the test results. Infrared inspection cannot assess the shape, size or location of welding defects. The performance of penetrant testing is significantly affected by the imaging agent and the testing environment. Phased array ultrasonic testing is a new technology in the field of non-destructive testing. It has the advantages of high sensitivity, high resolution and the ability to detect complex workpieces and deep defects. However, this technology also has shortcomings: defects are not displayed intuitively, making them difficult to characterize; it does not allow real-time detection; it is only suitable for detecting internal defects larger than 5 mm with regular shapes; and it places high demands on operators.
Another method of non-destructive testing, the structured light-based non-destructive testing method, though not sensitive to the internal defects of the weld, has been widely used as an excellent non-destructive testing method for the surface defects of welds. This method is generally implemented to generate original images and data by laser scanning the weldment with either laser points or beams [
8]. Structured-light inspection features high accuracy, compact hardware and a high sampling rate, and line structured-light inspection has proven to offer better stability, efficiency and performance. However, most existing structured-light non-destructive testing methods operate on the structured-light images themselves, which cannot fully exploit the advantages of the high-precision data and lack good noise robustness. With the rise of fields such as deep learning and signal processing, improving the performance of structured-light non-destructive testing with high-performance models for dimensional expansion and convolutional neural networks is of great research value. This paper focuses on expanding the scale and dimensionality of defect features by encoding high-precision welding data and on recognizing the resulting two-dimensional encoded images with mature convolutional neural network models.
As a result, starting from the collection of the original grayscale image of the weld contour with a laser sensor, this study treated the corresponding welding data as one-dimensional time-series data carrying all the defect characteristics of the weld, adopting the Gram angular field and Markov transition field methods to encode the data into two-dimensional images. LeNet, AlexNet, ResNet, VGG and other deep-learning algorithms were then applied to classify and recognize the four types of welding detection results: no defects, holes, burrs and depressions. The flow chart for this method is shown in
Figure 2. Experiments verified that this method is 4–6% more effective in identifying and classifying the four types of welding detection results than detection methods that process welding images directly.
In recent years, non-destructive testing has been widely used in welding manufacturing, additive manufacturing (AM), textile reinforced concrete (TRC), building performance diagnostic inspections and other popular fields. Although NDT technology has been developed over decades and applied in various manufacturing inspection scenarios, it still has shortcomings [
9,
10,
11]. NDT techniques such as visual inspection, magnetic particle inspection, penetrant inspection, ultrasonic inspection, radiographic inspection, acoustic emission and eddy current inspection are mostly manual and heavily dependent upon inspectors’ knowledge and experience, leaving room for errors [
12].
The optimization methods for welding defect detection can be divided into three branches: traditional algorithms, machine-learning methods and deep-learning methods. Some scholars have proposed traditional welding defect detection methods based on the morphological and geometric characteristics of welding images. For example, Masoumeh Aminzadeh et al. proposed a welding defect detection background subtraction technique based on grayscale morphology. In this method, which uses optical inspection non-destructive quality monitoring technology, the low computational load of the morphological operations used makes it more computationally efficient than background subtraction techniques such as spline approximation and surface fitting. The performance of this technique was tested by applying it to the detection of defects in welds with non-uniform strength distributions, where the defects were precisely segmented [
13]. Zeng et al. proposed a welding seam recognition method using two directional lights. First, directional lights were projected onto the edges of the seams to generate unique artificial light and shadow (LS) features. Then, image processing algorithms based on threshold and edge extraction were used to calculate the accurate edges of the seams to achieve weld recognition. Finally, a welding seam inspection platform was built, and the welding seam identification and deviation correction experiments were carried out. The experimental results showed that the proposed detection method could effectively identify the edge of the seam [
14]. Common weld detection algorithms are likely to be disturbed by the noise from the spatter and arc during the welding process. A weld seam recognition algorithm based on structured-light vision has been proposed to overcome this challenge. The algorithm can be divided into three steps: initial laser center line recognition, online laser center line detection and weld feature extraction. A Laplacian of Gaussian filter is used for the recognition of the laser center line. Then, an algorithm based on the NURBS-snake model detects the laser center line online in a dynamic region of interest (ROI). Using the line obtained from the previous step, feature points are determined through segmentation and straight-line fitting, while the position of the weld seam can be calculated according to the feature points. The accuracy, efficiency and robustness of the recognition algorithm have been verified with experiments [
15]. However, the above-mentioned welding detection methods have turned out to have low detection accuracy, and their handcrafted structural features can hardly be transferred to other contexts.
With the rapid development of machine-learning technology, a variety of machine learning-based NDT methods have emerged in the field of additive industrial manufacturing [
16]. Machine learning (ML) has been applied to various aspects of AM to improve the whole design and manufacturing workflow, especially in the era of industry 4.0. Goh et al. discussed the application of various types of machine-learning techniques to various aspects of additive manufacturing [
17]. Various machine-learning methods have been proposed in the field of welding defect detection. Hongquan Jiang et al. proposed a new method for weld defect classification based on the analytic hierarchy process (AHP) and Dempster–Shafer (DS) evidence theory. The DS evidence theory-based method was presented to improve the accuracy of classification and included the calculation of standard feature values based on frequency histogram analysis and an improved Dempster's rule of combination based on WF. A case study on the classification of steam turbine weld defects was provided to illustrate and evaluate the proposed techniques [
18]. Sudhagar et al. proposed a system for the detection and classification of defective welds using weld surface images. The weld surfaces produced under different welding conditions were captured by a digital camera and processed to extract features. The features were extracted from the weld surface images with a maximally stable extremal region algorithm and used as the input for classification of the weld joint. The support vector machine algorithm was used for the classification of welds using these surface image features [
19]. Lee et al. proposed a system for monitoring the welding of galvanized steel sheets with a spectrometer. In this study, the Fisher criterion was used to achieve defect feature ranking, and the k-nearest neighbor algorithm was used to select the feature ranking and achieve defect classification [
20]. Juan-Manuel Alvarado-Orozco et al. proposed a method based on machine learning and the random forest (RF) algorithm to classify porosity and achieved high accuracy. The proposed method was divided into three stages: the preprocessing stage—image denoising, smoothing and unblurring to highlight the areas with pores; the feature extraction stage—segmentation of pores and the morphological/geometrical features that describe the porosity; and the intelligent classifier stage—definition, training, testing and validation of the random forest classifier [
21]. Compared to traditional defect detection methods, these machine-learning methods have delivered better results for classification and detection. However, with the rapid development of deep learning in the past few years, deep-learning methods have surpassed machine-learning methods in classification and recognition accuracy.
At present, the development of deep-learning technology is gradually becoming mature. Deep learning has shown better performance in welding image classification than traditional machine learning for the training of a large number of data samples. A variety of methods adopting deep-learning neural networks for welding defect detection have also been proposed. For example, Je-Kang Park et al. proposed a method based on a convolutional neural network (CNN) that uses a single RGB camera to inspect welding defects on the transmission surface of the engine. The proposed method consists of two steps, beginning with extraction of the welding area to be inspected from the captured image. In this first step, to extract the welding area from the captured image, a CNN-based approach is used to detect the center of the engine transmission in the image. In the second stage, the extracted area is identified by another CNN as defective or non-defective [
22]. Zhifen Zhang et al. designed an 11-layer CNN classification model based on weld images to identify weld penetration defects. The CNN model made full use of arc light information by combining it in various ways to form complementary features. Test results showed that the CNN model performed better than the authors' previous work [
23]. Industrial X-ray weld image defect detection is an important research field for non-destructive testing (NDT) [
24]. Dong et al. proposed a multitask deep one-class CNN for defect classification. They built a stacked encoder–decoder autoencoder to learn feature representation from normal images. The encoder is used as a feature extractor based on the hard sharing scheme for multitask learning. For defect detection, their approach achieved results almost as good as the supervised method, even without any annotated data [
25]. Chen et al. focused on establishing an end-to-end automatic detection model for X-ray welding defects to improve the accuracy and efficiency of detection based on a deep learning algorithm. Considering the feature information of welding defects, this study achieved improvements on the basis of the Faster R-CNN method and used the deep residual network Res2Net to improve the feature extraction ability [
26].
In the above-mentioned methods, most of the welding datasets used were composed of X-ray images and welding images collected by a CCD camera. These types of welding images have certain limitations. When X-ray imaging or a CCD camera is used to collect welding images for defect classification, noise is introduced and the accuracy of the classification is affected. Various researchers have proposed noise reduction methods for processing image noise, including skeleton extraction, morphological opening and closing operations, filtering and other approaches, but the results are not good enough and residual noise remains. Furthermore, X-ray images and images collected by a CCD camera are not sensitive to welding surface defects, such as depressions and burrs, and the radiation produced during X-ray image collection is a problem that needs to be considered. In order to address these downsides, the method proposed in this paper does not work directly on the welding image but processes the corresponding data from the original weld image collected by the laser acquisition device.
Deep learning has achieved considerable advances in the field of computer vision, but, for one-dimensional time series, direct application of general predictive models does not work well. The problem can be attributed to the difficulty of training neural networks, the scarcity of large-scale labeled datasets and insufficient research on 1D-CNNs compared to 2D-CNNs. If we convert one-dimensional time series into corresponding two-dimensional time-series images, we can achieve better recognition results, as in the image field. In the field of time-series classification, many researchers have also proposed related methods. For example, Hatami et al. used recurrence plots (RPs) to transform time series into 2D texture images and then took advantage of the deep CNN classifier [
27]. Li et al. first adopted the slide relative position to convert the time series data into 2D images during preprocessing and then employed a CNN to classify these images. This made the best use of the advantages of CNN in image recognition [
28].
In welding defect detection and recognition, the results of the structured-light sampling of welds are similar to one-dimensional time series. Therefore, this study extracted the center trajectory of the weld structured-light stripe, regarded it as one-dimensional time-series sampling at a fixed time interval, and treated the entire dataset as one-dimensional time-series data for processing. By encoding the one-dimensional time-series data into two-dimensional time-series images, detection and classification of weld defects can be realized. Compared with other welding inspection methods, the method presented in this study has three advantages: first, it is easier to highlight defects by expanding the dimensions and scales of weld features; second, treating the data points as a one-dimensional time series makes the description of the contour features of the weld more accurate; third, one-dimensional time series can be encoded into two-dimensional images, achieving higher classification accuracy with existing deep-learning models, and the generalization ability of the algorithm is stronger and more stable.
2. Method of This Study
Using a laser profile sensor, we sampled and detected the welding seam of a steel plate. By doing so, welding surface classification of the holes, burrs, depressions and non-defect samples in the steel plate welding process was realized. A burr defect is defined herein as any small, visible protrusion in the weld area; a hole defect is defined as any visible void area; and a depression defect is defined as any weld area below the level of the steel plate. The study was carried out in three steps. First, we obtained one-dimensional weld data by performing denoising processing on the gray image of the original weld and extracting the center trajectory of the weld structured-light strip. Then, we obtained the weld data as a one-dimensional time series and converted the sequence code into two-dimensional time-series images. Finally, we used a deep neural network to classify weld defects and verify the advanced nature of the method proposed in this paper. The general process of the study is shown in
Figure 2.
2.1. Image Denoising Processing
The original structured-light image had to be smoothed and denoised since it contained obvious noise interference. We conducted a comparative analysis of commonly used denoising algorithms, including the median filter, mean filter, Gaussian filter and adaptive median filter. Considering that the laser line is narrow, the chosen filter had to preserve the continuity of the laser line while suppressing the large-scale reflection halo noise introduced during laser sampling. The adaptive median filter method was therefore used to denoise and smooth the original structured-light image. The processed image is shown in
Figure 3b.
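As an illustrative sketch (not the authors' implementation, whose window parameters are unspecified), the adaptive median filtering step can be written in a few lines of NumPy: the window at each pixel grows until the local median is no longer an impulse value, and only impulse-valued pixels are replaced, which helps preserve the narrow laser line while removing halo and speckle noise.

```python
import numpy as np

def adaptive_median_filter(img, max_window=7):
    """Adaptive median filter sketch: grow the window at each pixel until
    the median is not an impulse (min < median < max), then replace the
    pixel only if the pixel itself is an impulse."""
    img = np.asarray(img, dtype=np.float64)
    pad = max_window // 2
    padded = np.pad(img, pad, mode="edge")
    out = img.copy()
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            w = 3
            while w <= max_window:
                half = w // 2
                win = padded[r + pad - half:r + pad + half + 1,
                             c + pad - half:c + pad + half + 1]
                zmin, zmed, zmax = win.min(), np.median(win), win.max()
                if zmin < zmed < zmax:
                    # median is reliable: keep pixel unless it is an impulse
                    if not (zmin < img[r, c] < zmax):
                        out[r, c] = zmed
                    break
                w += 2
            else:
                # window reached its maximum size: fall back to the median
                out[r, c] = zmed
    return out
```

A fixed-size median filter would blur a one-pixel-wide stripe at every position; the staged test above only rewrites pixels that look like impulses, which is why this family of filters suits laser-stripe images.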
2.2. Extraction of the Trajectory of the Weld Center
After the image was denoised, the center trajectory of the light strip of the original structured-light image was extracted using the Steger algorithm. Common light strip center trajectory-extraction algorithms used in welding inspection include the geometric center method, extreme value method, gray barycentric method, direction template method and Hessian matrix method, all of which, however, have certain limitations. Both the geometric center method and the extreme value method have fast extraction speeds, but they are easily affected by image noise. The gray barycentric method is not sensitive to translation of the light-stripe section and can reduce the error caused by the asymmetry of the gray distribution. The direction template method has high accuracy and good robustness, but its positioning accuracy is only at the pixel level, and it also involves a large amount of calculation and a slow processing speed. The Hessian matrix method has high accuracy but requires multiple large-scale two-dimensional Gaussian convolutions, so its calculation speed is slow. In comparison to the algorithms mentioned above, the Steger algorithm, based on the Hessian matrix, can achieve sub-pixel-precision positioning of the center of the light strip by taking the Taylor expansion in the normal direction of the light strip, with strong robustness and fast processing speeds.
The Steger algorithm calculates the eigenvalues and eigenvectors based on the Hessian matrix to obtain the normal direction of the light strip center of the structured-light image and then performs Taylor expansion in the normal direction to obtain the corresponding normal extreme point, which is the required light strip center trajectory for sub-pixels [
29]. For any point (x, y) on the light line of the structured-light image, the Hessian matrix is expressed as:

H(x, y) = | r_xx  r_xy |
          | r_xy  r_yy |        (1)

where r_xx, r_xy, r_yx and r_yy represent the second-order partial derivatives of the image and H(x, y) represents the Hessian matrix.
The eigenvector corresponding to the maximum eigenvalue of the Hessian matrix gives the normal direction of the light strip, which is represented by (n_x, n_y). With the point (x_0, y_0) as the reference point, the sub-pixel coordinates of the center of the light strip are expressed as:

(p_x, p_y) = (x_0 + t·n_x, y_0 + t·n_y)        (2)

where t is expressed as:

t = −(n_x·r_x + n_y·r_y) / (n_x²·r_xx + 2·n_x·n_y·r_xy + n_y²·r_yy)        (3)
If t·n_x ∈ [−0.5, 0.5] and t·n_y ∈ [−0.5, 0.5], that is, if the point where the first derivative is zero lies within the current pixel, then (p_x, p_y) represents the center point of the light strip, expressed in sub-pixel coordinates.
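The Hessian eigen-decomposition and normal-direction extremum step described above can be sketched as follows. This is a simplified NumPy illustration: it uses plain central differences in place of the Gaussian-derivative convolutions of the full Steger algorithm, and the function name is ours.

```python
import numpy as np

def steger_subpixel_center(r, x, y):
    """Given a smoothed image r and an integer stripe pixel (x, y), return
    the sub-pixel stripe centre: eigen-decompose the Hessian, take the
    eigenvector of the largest-magnitude eigenvalue as the stripe normal
    (nx, ny), then solve for the extremum offset t along that normal."""
    # first- and second-order partial derivatives by central differences
    rx  = (r[y, x + 1] - r[y, x - 1]) / 2.0
    ry  = (r[y + 1, x] - r[y - 1, x]) / 2.0
    rxx = r[y, x + 1] - 2.0 * r[y, x] + r[y, x - 1]
    ryy = r[y + 1, x] - 2.0 * r[y, x] + r[y - 1, x]
    rxy = (r[y + 1, x + 1] - r[y + 1, x - 1]
           - r[y - 1, x + 1] + r[y - 1, x - 1]) / 4.0
    H = np.array([[rxx, rxy], [rxy, ryy]])
    vals, vecs = np.linalg.eigh(H)
    n = vecs[:, np.argmax(np.abs(vals))]   # normal direction (nx, ny)
    nx, ny = n
    t = -(nx * rx + ny * ry) / (nx * nx * rxx + 2.0 * nx * ny * rxy
                                + ny * ny * ryy)
    # accept only if the extremum lies inside the current pixel
    if abs(t * nx) <= 0.5 and abs(t * ny) <= 0.5:
        return x + t * nx, y + t * ny
    return None
```

Running this at every pixel flagged as stripe foreground, and keeping one accepted point per column, yields the sub-pixel center trajectory used in the following sections.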
In the experiment, the text data for the weld center trajectory were obtained using the Steger algorithm. It was found that the distance intervals in the horizontal direction were the same. If the horizontally equal interval distance unit here is regarded as the time unit of the same time interval, we can take it as a special unit of one-dimensional time-series data in the same dimension. By taking the extension direction of the line laser as the X axis and the height information returned by the line laser sensor as the Y axis, we can construct the image shown in
Figure 4.
2.3. One-Dimensional (1D) Data Coding of Two-Dimensional (2D) Time-Series Images (GAF)
When a one-dimensional sequence is converted into a corresponding two-dimensional image, better recognition performance can be achieved by using contemporary machine vision. After analysis, the central trajectory text data obtained in this experiment could be treated as special one-dimensional time series data. We adopted the one-dimensional time series method to process welding data in the experiment and to convert it to the corresponding two-dimensional time-series image.
In the field of welding defect detection, converting one-dimensional sequence data into two-dimensional images and analyzing them with deep-learning models is a method worth testing. Through experimental analysis, the central trajectory text data were encoded as two-dimensional color time-sequence images for the neural network learning process. The process used to generate the two-dimensional color time-sequence images is shown in
Figure 5.
This article introduces two frameworks that encode one-dimensional time series as two-dimensional time-series images. The first is the Gram angular field (GAF) method. This method encodes a time series in a polar-coordinate-based representation rather than Cartesian coordinates and applies trigonometric transformations to the resulting angles: using the cosine of the sum of two angles yields the GASF, while using the sine of the difference of two angles yields the GADF. The second framework is the Markov transition field (MTF) method, which uses the Markov transitions of the time series. Images produced with the MTF method represent the first-order Markov transition probabilities along one dimension and the time dependency along the other [
30].
The principle of the GAF method is to convert the one-dimensional time-series data from the rectangular coordinate system to the polar coordinate system and to identify the time dependency between different time points by considering the angle sums and angular differences between those points. In the GAF method, the images are Gram matrices, each element of which is a trigonometric function of the angles at two time points (i.e., superimposed). There are two implementations: GASF (corresponding to the angle sum) and GADF (corresponding to the angle difference).
Supposing all vectors are unit vectors, the Gram matrix can be written with the following formula:

G = | <x_1, x_1>  <x_1, x_2>  …  <x_1, x_n> |
    | <x_2, x_1>  <x_2, x_2>  …  <x_2, x_n> |
    |     ⋮            ⋮       ⋱       ⋮     |
    | <x_n, x_1>  <x_n, x_2>  …  <x_n, x_n> |        (4)

where <x_i, x_j> = cos(θ_ij) and θ_ij is the angle between the two vectors. Single-variable time sequences can only explain data characteristics and latent states to some extent, so our goal was to find an alternative with richer representations. The answer was the Gram matrix, which retains time dependency. In the Gram matrix, the diagonal elements provide information on each feature, while the remaining elements describe the relations between the features. Therefore, the Gram matrix can not only show the features of the data but also reflect the close links between different features.
Given a time series X = {x_1, x_2, …, x_n}, in order to ensure that the inner product is not biased toward the maximum observation, we normalize X so that all values in the time series fall in the interval [−1, 1] or [0, 1]:

x̃_i = ((x_i − max(X)) + (x_i − min(X))) / (max(X) − min(X))        (5)

x̃_i = (x_i − min(X)) / (max(X) − min(X))        (6)

where max(X) and min(X) are the maximum and minimum values in the time series, and Equations (5) and (6) give the results normalized to the intervals [−1, 1] and [0, 1], respectively. By encoding the normalized value as the angle cosine and the timestamp as the radius, we represent the time series X̃ in polar coordinates, and the equation is as follows:

φ_i = arccos(x̃_i),  −1 ≤ x̃_i ≤ 1,  x̃_i ∈ X̃
r_i = t_i / N,  t_i ∈ ℕ        (7)
In Equation (7), t_i is a timestamp, and N is a constant factor that standardizes the polar coordinate span. x̃_i is the normalized time-series element value in [−1, 1], φ_i is the angle in polar coordinates and r_i is the radius in polar coordinates. This conversion has two advantages.
On the one hand, the map φ = arccos(x̃) is bijective and monotonic for x̃ ∈ [−1, 1]. Therefore, the proposed map produces a unique result with a unique inverse mapping in the polar coordinate system for a given time series. On the other hand, the conversion to polar coordinates preserves time dependency.
After converting the time series into polar form, the time dependency over different time intervals can be determined from an angular perspective by considering the trigonometric sum and difference between each pair of points. The Gramian summation angular field (GASF) and Gramian difference angular field (GADF) are defined as follows:

GASF = [cos(φ_i + φ_j)] = X̃ᵀ·X̃ − (√(I − X̃²))ᵀ·√(I − X̃²)        (8)

GADF = [sin(φ_i − φ_j)] = (√(I − X̃²))ᵀ·X̃ − X̃ᵀ·√(I − X̃²)        (9)

where I is the unit row vector [1, 1, …, 1]. The two types of Gramian angular fields (GAFs) are actually quasi-Gram matrices [<x̃_i, x̃_j>].
The transformed polar coordinate representation constitutes a new class of Gram matrix:

G_GASF = [cos(φ_i + φ_j)]_{n×n}        (10)

G_GADF = [sin(φ_i − φ_j)]_{n×n}        (11)

In the above formulas, G_GASF is the Gram matrix obtained with the GASF method, while G_GADF is the Gram matrix obtained with the GADF method. The one-dimensional time series is encoded as a GASF matrix and a GADF matrix by the above algorithm.
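The GASF/GADF construction described above can be sketched in a few lines of NumPy (the function name and details are illustrative, not the authors' code): rescale the series, map each value to an angle, then broadcast the angle sums and differences into square matrices.

```python
import numpy as np

def gramian_angular_fields(x):
    """Encode a 1-D series as GASF/GADF images: rescale to [-1, 1],
    map each value to an angle phi = arccos(x~), then form
    GASF[i, j] = cos(phi_i + phi_j) and GADF[i, j] = sin(phi_i - phi_j)."""
    x = np.asarray(x, dtype=np.float64)
    # min-max rescaling to [-1, 1]
    x_tilde = (2.0 * x - x.max() - x.min()) / (x.max() - x.min())
    phi = np.arccos(np.clip(x_tilde, -1.0, 1.0))  # clip guards rounding error
    gasf = np.cos(phi[:, None] + phi[None, :])
    gadf = np.sin(phi[:, None] - phi[None, :])
    return gasf, gadf
```

For a weld profile of n sampled heights, each field is an n × n image; the two fields (plus an MTF channel) can then be stacked into a multi-channel input for a CNN.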
2.4. One-Dimensional (1D) Data Coding of Two-Dimensional (2D) Time-Series Image (MTF)
The Markov transition matrix is not sensitive to the time dependency of the sequence. Based on the first-order Markov chain, and considering the time position relation, we used the Markov transition field (MTF) method in this study [
31].
Given a time series X = {x_1, x_2, …, x_n}, we define Q quantile bins q_1, …, q_Q for the time series and assign each x_i to its respective bin q_j (j ∈ [1, Q]). The weighted adjacency matrix W, of size Q × Q, is then constructed by counting the first-order Markov chain transitions between bins along the time axis: w_{ij} is the frequency with which a point in bin q_i is immediately followed by a point in bin q_j. After row normalization (Σ_j w_{ij} = 1), W is the Markov transition matrix. However, W is insensitive to the distribution of X and to the time dependency of the time steps. To overcome this shortcoming, the Markov transition field (MTF) is defined as follows:

M = | w_{ij | x_1 ∈ q_i, x_1 ∈ q_j}  …  w_{ij | x_1 ∈ q_i, x_n ∈ q_j} |
    | w_{ij | x_2 ∈ q_i, x_1 ∈ q_j}  …  w_{ij | x_2 ∈ q_i, x_n ∈ q_j} |
    |               ⋮                 ⋱               ⋮                |
    | w_{ij | x_n ∈ q_i, x_1 ∈ q_j}  …  w_{ij | x_n ∈ q_i, x_n ∈ q_j} |        (12)
The w_{ij} in the MTF is the probability of the q_i → q_j transition. By considering the time positions, the Q × Q matrix of transition probabilities is spread along the time axis into the n × n MTF matrix M. The main diagonal of the MTF matrix, M_{ii}, captures the probability of transition from each quantile to itself (the self-transition probability). To improve computational efficiency, a blurring kernel {1/m²} is used to average the pixels in each non-overlapping m × m patch; that is, the transition probabilities are aggregated over each subsequence of length m. By doing so, one-dimensional time-series data are encoded into a Markov transition field (MTF) matrix.
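The MTF encoding described above can likewise be sketched in NumPy (the bin count and helper name are illustrative): assign each sample to a quantile bin, estimate the row-normalized first-order transition matrix, then index it by the bin labels of every pair of time steps.

```python
import numpy as np

def markov_transition_field(x, n_bins=4):
    """Encode a 1-D series as an MTF image: assign each value to a quantile
    bin, estimate the first-order Markov transition matrix W between bins,
    then spread W over time: M[k, l] = W[bin(x_k), bin(x_l)]."""
    x = np.asarray(x, dtype=np.float64)
    # interior quantile edges; digitize maps each x_i to a bin in [0, n_bins)
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    bins = np.digitize(x, edges)
    # count transitions bin[t] -> bin[t+1] and row-normalize
    W = np.zeros((n_bins, n_bins))
    for a, b in zip(bins[:-1], bins[1:]):
        W[a, b] += 1.0
    W /= np.maximum(W.sum(axis=1, keepdims=True), 1.0)  # avoid 0-division
    return W[bins[:, None], bins[None, :]]
```

The optional patch-averaging step with the {1/m²} kernel can be applied afterwards to reduce the image from n × n to (n/m) × (n/m) before feeding it to the network.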
2.5. Neural Network Model
One of the earliest neural networks, the LeNet network has reduced heavy computing costs through the use of convolution, parameter sharing and pooling [
32]. After the LeNet network, the AlexNet network appeared and verified the efficiency of deep convolutional neural networks. It uses the ReLU function as the activation function for the neural network, dropout regularization to control the overfitting and the parallel computing power of the GPU to accelerate the training of the network [
33]. After the AlexNet network, the VggNet neural network appeared. In contrast to the AlexNet network, the VggNet neural network adopts several continuous 3 × 3 convolution kernels to replace the large convolution (11 × 11, 5 × 5) in AlexNet, and removes the local response normalization layer in AlexNet [
34]. With a deeper network structure, a smaller convolution kernel and a pooled sampling domain, the VGG network model manages to control the number of parameters while obtaining more image features, thus avoiding excessive computation and an overly complex structure, and many powerful networks have improved on it [
35]. Compared to the traditional VGG network, the ResNet network, which appeared later, is less complex and requires a smaller number of parameters. It utilizes a residual network in which the gradient does not disappear when the number of network layers is increased. Therefore, the classification accuracy is improved, and the deep network degradation problem is solved [
36].