Article

Product Quality Anomaly Recognition and Diagnosis Based on DRSN-SVM-SHAP

School of Mechanical and Precision Instrument Engineering, Xi’an University of Technology, Xi’an 710048, China
*
Author to whom correspondence should be addressed.
Symmetry 2024, 16(5), 532; https://doi.org/10.3390/sym16050532
Submission received: 21 March 2024 / Revised: 23 April 2024 / Accepted: 25 April 2024 / Published: 29 April 2024

Abstract

Conventional quality control methodologies are inadequate for fully elucidating the aberrant patterns of product quality. A multitude of factors influence product quality, yet the limited number of controlled quality characteristics is insufficient for accurately diagnosing quality abnormalities. Additionally, there are asymmetries in data collection, data pre-processing, and model interpretation. In this context, a quality anomaly recognition and diagnosis model for the complex product manufacturing process is constructed based on a deep residual network, support vector machine (SVM), and Shapley additive explanation (SHAP). Given the numerous complex product quality characteristic indexes and unpredictable accidental factors in the production process, it is necessary to mine the deep relationship between quality characteristic data and quality state. This mining is achieved by utilizing the strong feature extraction ability of the deep residual shrinkage network (DRSN) through self-learning. The symmetry of the data within the model has also been taken into account to ensure a more balanced and comprehensive analysis. The excellent binary classification ability of the support vector machine is combined with the DRSN to identify the quality anomaly state. The SHAP interpretable model is employed to diagnose the quality anomaly problem of a single product and to identify and diagnose quality anomalies in the manufacturing process of complex products. The effectiveness of the model is validated through case analysis. The accuracy of the DRSN-SVM quality anomaly recognition model reaches 99%, as demonstrated by example analysis, and the model exhibits faster convergence and significantly higher accuracy compared with the naive Bayesian model classification and support vector machine classification models.

1. Introduction

It is of paramount importance to monitor the production process and detect quality defects in a timely manner during the manufacturing of a product. The diagnosis of quality problems allows for the determination of the key factors that lead to abnormal production or quality conditions, which in turn enables real-time adjustment of the production process. This is an essential means of ensuring the quality of product manufacturing. The advent of intelligent manufacturing processes and the continuous development of new-generation information technology have led to an increased focus on the use of big data, artificial intelligence, and other technologies to achieve real-time quality prediction, quality anomaly identification, and diagnosis in the manufacturing process. This is an important area of research [1].
Xu et al. [2] employed a convolutional neural network (CNN) to automatically extract significant features from raw production data, with the aim of identifying control chart patterns in statistical process control (SPC) and detecting quality anomaly patterns. Yu et al. [3] proposed a method for identifying control chart anomalies based on a CNN and a long short-term memory (LSTM) network to recognize quality anomaly patterns in the manufacturing process. Chiu et al. [4] proposed a hybrid technique that employs singular spectrum analysis and random forest to identify SPC control charts and verify quality anomaly patterns. Wan et al. [5] proposed a method for recognizing quality anomaly patterns using an optimized random forest and multi-feature extraction, which addresses the limitation of conventional multi-source control charts in identifying the specific variables causing process anomalies. Li et al. [6] combined a genetic algorithm with a probabilistic neural network to identify anomalies from quality control charts, in response to the late detection of quality anomalies and the small number of instances of product quality defects.
The intricate manufacturing processes inherent to mechanical production make it difficult to produce defect-free products, and describing these defects through control chart models is correspondingly difficult. Machine learning algorithms, however, can significantly improve the accuracy with which the types of quality defects are identified, because their data analysis capabilities expose details hidden in the original data. Zhao et al. [7] mitigated the impact of asymmetrical and imbalanced samples on the performance of a convolutional neural network by incorporating a sensitivity coefficient in the loss function, and optimized the model parameters using orthogonal tests to identify quality anomaly patterns. Liu et al. [8] employed a deep belief network to extract primary features from raw manufacturing process data, creating an online process diagnosis model for highly effective quality recognition. Mondal et al. [9] proposed a hierarchical Bayesian network (HBN) framework for multivariate and multistage process fault diagnosis. Yu et al. [3] developed a stacked denoising autoencoder technique for identifying defect patterns in manufacturing processes. Donovan et al. [10] introduced a cost-sensitive deep convolutional neural network for identifying manufacturing process control chart patterns, with the objective of detecting quality anomaly patterns. Jeong et al. [11] developed a recursive quantitative analysis methodology that extracts non-linear characteristics from quality data; these features were combined with temporal order parameters to create new feature vectors that improve quality diagnosis accuracy. Benmahamed et al. [12] combined the nearest neighbor algorithm and the support vector machine with the Duval method, and optimized the support vector machine with the particle swarm optimization algorithm. Li et al. [13] proposed a diagnostic method for inferring surface topography problems in groove grinding, in which a Bayesian network-based Leaky Noisy-OR model was employed to diagnose multi-factor process quality in the presence of incomplete information. Cai et al. [14] proposed an improved mesh-optimized principal component analysis-support vector machine mean drift monitoring model for identifying anomalous variables. Ma et al. [15] examined quality anomaly detection by constructing a Gaussian mixture model and refining the Mahalanobis distance. Mondal et al. [9] also proposed a unified framework using a bi-level Bayesian network to determine the absolute mean deviation of the feature state through inferred state distributions produced by the HBN. This framework also produces a control chart to identify process changes and diagnose their root causes. Xu et al. [16] proposed a Bayesian optimization multi-attention deep residual network suitable for industrial soft sensing models. By combining the soft threshold function with the residual network, an attention fusion module, and a Bayesian optimization strategy, the problems of noise and of thresholds being compressed to 0 are effectively solved, thus avoiding information loss and improving the rationality of the hyperparameter setting.
According to the aforementioned research, it is imperative to ascertain the current quality status within the product manufacturing process, establish whether the current product quality is abnormal, and diagnose the cause of the anomaly quickly and precisely. However, the real-time performance of identifying and diagnosing quality anomalies in the manufacturing process of complex products needs improvement because the quality-related data are high-dimensional, highly coupled, and noisy. The performance of a neural network model is closely related to its complexity: an overly complex model performs well but is not conducive to deployment on hardware or to online applications [17]. Therefore, this paper addresses quality anomaly identification and diagnosis in the manufacturing process of complex products. Considering the numerous quality characteristic indexes and the significant noise levels, a method is proposed for identifying quality anomalies that employs a self-learning neural network and a support vector machine, and the SHAP technique is then used to diagnose the root cause of each anomalous quality sample. First, the noise reduction and self-learning capabilities of the deep residual network are used to mine the redundant data, so that the primary features associated with abnormal quality are identified from extensive historical data. Second, the features extracted by the deep residual network are used as input to the support vector machine, whose excellent binary classification performance is used to identify quality anomalies. Third, a quality diagnosis model based on SHAP is developed to identify the root cause of abnormal quality by assessing the contribution of each quality characteristic. The model is then tested using an example.
Section 1 outlines the research background of this paper, the current state of research, and an evaluation of existing research, and presents the core research questions, the importance of the research, the methodology adopted, the technical path, and the overall structural design. Section 2 then proposes a new neural network-based framework for the identification and diagnosis of quality anomalies in manufacturing processes. By extracting quality characteristics using the deep residual shrinkage network (DRSN), combining it with the support vector machine (SVM) for anomaly identification, and applying the SHAP method for diagnosing the causes of anomalies, a comprehensive DRSN-SVM-SHAP diagnostic model is created, the practicality of which is verified by examples. Section 3 briefly summarises the main work and research results of the paper.

2. Materials and Methods

2.1. Analysis of Quality Anomaly Identification and Diagnosis in the Complex Product Manufacturing Process

General models for identifying and diagnosing quality anomalies have inherent flaws that make them less accurate and less efficient. Figure 1 illustrates the general methodology for identifying and diagnosing quality anomalies in the manufacturing process. The product quality formation process in the figure shows that the quality of complex products is influenced by numerous factors. Quality anomaly patterns derived from historical data are not comprehensive, so new quality anomalies emerge that cannot be identified in a timely and effective manner. The quality anomaly classification process in the figure shows that quality anomalies are typically divided into several types, which increases the complexity of the model. The data dimensionality reduction process, while reducing the number of quality features, also loses some of the features, which affects the accuracy of identification. In diagnosing the causes of quality anomalies, a general solution is typically provided for a class of problems but not for each sample. As a result, quality problems cannot be handled quickly, which delays production.
The manufacturing process for complex products is multifaceted and involves various quality indicators. Meanwhile, the correlation between the different factors affecting product quality in the manufacturing process can lead to quality abnormalities that are difficult to identify. It is challenging to accurately diagnose the primary factors responsible for quality issues. To this end, this paper presents a method for identifying quality anomalies based on Deep Neural Networks (DNN) and SVM. Additionally, the SHAP model is utilized to diagnose the underlying cause of such anomalies, as illustrated in Figure 2. Initially, as shown in the deep residual network quality feature extraction model in Figure 2, the feature extraction capability of the deep residual shrinkage network is used to extract the relevant information for anomaly identification from the quality characteristic data, which is characterized by high-level noise and strong redundancy. Subsequently, the quality diagnostic model is based on the SVM model in Figure 2. The product quality is categorized into normal and abnormal categories with the help of SVM’s stable and good binary classification ability for high-dimensional data. Additionally, the quality anomaly identification process for individual products can be achieved by utilizing the dependable interpretation abilities delivered by the SHAP model in conjunction with deep learning algorithms. Furthermore, the contribution ranking of quality characteristic data that affects the quality anomaly is provided to facilitate the identification and diagnosis of manufacturing process quality anomalies. The flow of identification and diagnosis of quality anomalies in the manufacturing process based on DRSN-SVM-SHAP in the figure reflects the operation logic of the model. 
As illustrated in the model’s structure, in addition to the model’s inherent advantages, such as accelerated convergence and enhanced accuracy, it also possesses numerous intrinsic advantages for addressing the variability inherent to the manufacturing process. These advantages are primarily as follows: (1). The model exhibits profound feature learning and characterization capacity. The deep architectural design of DRSN enables the automatic acquisition of multi-level, non-linear feature representations from raw data, along with the consideration of symmetry factors. This capability effectively captures complex, hidden variability patterns within the manufacturing process. The resulting representation enables the model to maintain high anomaly identification accuracy in the face of multi-factor variability. (2). The model has the ability to generalize and adapt. As a classifier, SVM has good generalization performance and the ability to identify anomalous samples, which is suitable for dealing with symmetric information feature space. Meanwhile, SHAP can provide a local interpretation of the model prediction, help identify the key factors leading to anomalies, and facilitate the model’s adaptation to specific variability. (3). The model is capable of dynamic feature importance assessment. In addition to providing a global view of feature contributions, SHAP also dynamically assesses the importance of each feature in identifying product quality anomalies under specific manufacturing conditions. Furthermore, SHAP analyses can reveal how a particular symmetry or symmetry-breaking pattern affects the identification of quality anomalies. This allows the model to respond quickly to changes in the manufacturing process and provide valuable insights for targeted process improvement and anomaly mitigation strategies. 
The application of the aforementioned strategies within this model enables the model to effectively cope with the variability inherent to the manufacturing process, to accurately and timely identify and diagnose anomalies, and to improve the stability of the manufacturing process and product quality.

2.2. Construction of a Neural Network-Based Model for Identifying and Diagnosing Quality Anomalies

Identifying problems in the complex product manufacturing process requires quality anomaly analysis. Quality-related data are imbalanced, asymmetric, noisy, redundant, and highly coupled, and these characteristics are the main causes of low recognition and diagnosis accuracy. This paper presents a neural network-based model for recognizing and diagnosing quality anomalies in complex product manufacturing processes. The processing method for the highly imbalanced and asymmetric quality data set is outlined first. The residual shrinkage network, which is capable of learning from high-noise, high-redundancy data, is then employed to extract effective quality-related features from the data. The support vector machine, which exhibits excellent binary classification performance, takes the extracted features as input to identify the current quality state. Finally, the SHAP model is employed to diagnose the cause of quality anomalies by explaining the process of feature extraction and quality anomaly identification.
When designing a quality prediction model for complex manufacturing processes, the issue of dataset imbalance and asymmetry must be addressed, as it remains a core reason for the poor accuracy in recognizing quality anomalies. Proper processing methods are therefore needed to improve accuracy. Common data processing techniques for imbalanced datasets include oversampling, undersampling, and mixed sampling. Among these, mixed sampling combines oversampling to expand the data set with undersampling to clean the expanded data set. This helps to avoid both the over-fitting that can result from repeatedly drawing the same samples in oversampling and the loss of core data in undersampling.
The “SMOTE + ENN” technique for mixed sampling first creates new minority samples according to Equation (1). The Euclidean distance is used to measure the “distance” between samples, and for each minority sample $x$, a sample $\tilde{x}$ is randomly selected from its M nearest neighbors to create a new sample $x_{new}$. The Edited Nearest Neighbors (ENN) approach is then used to predict the class of each sample in the expanded dataset, and samples with incorrect predictions are eliminated to generate the final dataset.
$$x_{new} = x + rand(0, 1) \times (\tilde{x} - x) \tag{1}$$
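The interpolation in Equation (1) and the ENN cleaning step can be sketched in plain Python as follows. This is a minimal illustration rather than the authors' implementation: the function names `smote_sample` and `enn_filter`, the neighbor count `k`, and the majority-vote rule used for the ENN "prediction" are assumptions introduced here.

```python
import math
import random


def smote_sample(x, minority, m, rng):
    """Create one synthetic minority sample from x following Equation (1).

    x must be an element of `minority`; m is the number of nearest
    neighbours considered (hypothetical parameter name).
    """
    # Rank the other minority samples by Euclidean distance to x.
    neighbours = sorted(
        (p for p in minority if p is not x),
        key=lambda p: math.dist(x, p),
    )[:m]
    x_tilde = rng.choice(neighbours)   # randomly chosen neighbour
    r = rng.random()                   # rand(0, 1)
    return [xi + r * (ti - xi) for xi, ti in zip(x, x_tilde)]


def enn_filter(samples, labels, k=3):
    """Keep only samples whose label matches the majority of their k nearest neighbours."""
    kept = []
    for idx, (x, y) in enumerate(zip(samples, labels)):
        others = [
            (math.dist(x, s), lab)
            for j, (s, lab) in enumerate(zip(samples, labels))
            if j != idx
        ]
        others.sort(key=lambda t: t[0])
        votes = [lab for _, lab in others[:k]]
        predicted = max(set(votes), key=votes.count)
        if predicted == y:             # wrongly "predicted" samples are dropped
            kept.append((x, y))
    return kept


rng = random.Random(0)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_point = smote_sample(minority[0], minority, m=2, rng=rng)
```

The synthetic point always lies on the segment between a minority sample and one of its neighbors, which is why a cleaning step such as ENN is useful afterwards: interpolated points that land inside the majority region are removed.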

2.2.1. Quality Feature Extraction Based on the Deep Residual Shrinkage Network

The manufacturing process for complex products is intricate and involves many quality features. Conventionally, the raw quality data are first reduced using principal component analysis, random forest, correlation filtering, and other techniques. However, some attributes are lost during dimension reduction, leading to inaccuracies in the recognition results. This paper therefore adopts deep feature extraction based on the residual shrinkage network, drawing on the ability of convolutional neural networks to learn correlations in the data and on the ability of the residual modules in the deep residual shrinkage network to process high levels of noise and redundant information, which leads to efficient feature dimension reduction. The DRSN used in this model is architecturally symmetric, making the feature extraction process more balanced and stable.
1. Convolutional Neural Network
The Convolutional Neural Network (CNN) comprises an input layer, a hidden layer, and an output layer. Through convolution and pooling operations, the hidden layer autonomously learns and extracts features from extensive data.
  • Convolutional layer (Conv). To extract features from the original data, a convolution operation is used rather than the full connection used between layers of a conventional artificial neural network. This approach adopts local connections and weight sharing to reduce computation while retaining upper-layer information for full feature extraction in lower layers. The operation is expressed in Formula (2).
    $$y^{l(i,j)} = K_i^{l} * x^{l(r^j)} = \sum_{j'=0}^{c-1} K_i^{l(j')} \, x^{l(j+j')} \tag{2}$$
    Here, $K_i^{l(j')}$ denotes the $j'$-th weight of the $i$-th convolution kernel in the $l$-th layer, $x^{l(j+j')}$ denotes the $j'$-th weighted position in the $j$-th convolved region $r^j$ of the $l$-th layer, and $c$ is the size of the convolution kernel.
  • Pooling layer (Pooling). It appears in pairs with the convolutional layer and compresses the features after convolution to achieve data dimensionality reduction. Depending on the calculation method, the pooling operation is divided into random pooling, maximum pooling, and so on. Random pooling, for example, calculates a probability value for each element in the current window and randomly retains features according to these probabilities, which gives stronger generalization ability.
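The convolution sum in Formula (2) and a pooling step can be illustrated with a short one-dimensional sketch. The helper names `conv1d` and `max_pool1d` are hypothetical, a single kernel without bias is assumed, and maximum pooling is used here instead of random pooling because it is deterministic and easier to check by hand.

```python
def conv1d(x, kernel):
    """Valid 1-D convolution: y[j] = sum over j' of kernel[j'] * x[j + j']."""
    c = len(kernel)  # kernel size
    return [
        sum(kernel[jp] * x[j + jp] for jp in range(c))
        for j in range(len(x) - c + 1)
    ]


def max_pool1d(x, size):
    """Non-overlapping maximum pooling with window `size`."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


features = conv1d([1, 2, 3, 4], [1, 0, -1])   # [-2, -2]
pooled = max_pool1d([1, 3, 2, 5], 2)          # [3, 5]
```

The same kernel weights are reused at every position `j`, which is the weight sharing mentioned above, and pooling halves the feature length in this example.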
2. Deep residual networks
Deeper convolutional neural networks provide more comprehensive feature extraction but may also suffer from overfitting and gradient explosion. The Residual Network (ResNet) [18] allows the depth of the convolutional neural network to be increased by adding batch normalization layers and activation functions together with identity shortcuts. The residual module consists of two convolutional layers and an identity shortcut, as depicted in Figure 3. The identity path lets the gradient of the loss function propagate more effectively through the deep residual network, which eases parameter updates, enhances training speed, and mitigates network degradation.
Here, the cuboid represents the feature map with C channels, W width, and 1 height, and K represents the number of convolution kernels in the convolution layer (K = C means that the number of convolution kernels is the same as the number of channels of the input feature). ① indicates that the size of the input feature map is equal to the size of the output feature map. ② means that the width of the output feature map is halved. ③ means that the width of the output feature map is halved and the number of channels is doubled.
Batch normalization (BN) is utilized for the standardization of data and to diminish the impact of variations in data feature distribution on the model. Equations (3)–(6) reflect the batch calculation process.
$$\mu = \frac{1}{N_{batch}} \sum_{n=1}^{N_{batch}} x_n \tag{3}$$
$$\sigma^2 = \frac{1}{N_{batch}} \sum_{n=1}^{N_{batch}} (x_n - \mu)^2 \tag{4}$$
$$\hat{x}_n = \frac{x_n - \mu}{\sqrt{\sigma^2 + \varepsilon}} \tag{5}$$
$$y_n = \gamma \hat{x}_n + \beta \tag{6}$$
Here, $x_n$ and $y_n$ denote the input and output features of the $n$-th sample, $\gamma$ and $\beta$ denote the two trainable parameters that scale and shift the distribution, and $\varepsilon$ is a positive number close to zero.
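The batch calculation in Equations (3)–(6) can be followed step by step in a small sketch for a single feature. The function name `batch_norm` and the default `eps` value are assumptions made for illustration; in a trained network `gamma` and `beta` would be learned rather than passed in by hand.

```python
import math


def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift it."""
    n = len(xs)
    mu = sum(xs) / n                                     # Equation (3)
    var = sum((x - mu) ** 2 for x in xs) / n             # Equation (4)
    x_hat = [(x - mu) / math.sqrt(var + eps) for x in xs]  # Equation (5)
    return [gamma * xh + beta for xh in x_hat]           # Equation (6)


normalized = batch_norm([1.0, 2.0, 3.0, 4.0])
```

With `gamma = 1` and `beta = 0` the output has zero mean and a variance slightly below one, the small gap coming from `eps` in the denominator.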
An activation function (AF) is employed to transform data non-linearly. A widely used choice is the Rectified Linear Unit (ReLU). Compared with the sigmoid and tanh functions, ReLU is widely used in CNNs since it not only speeds up learning but also alleviates the vanishing gradient problem [19]. The derivative of ReLU is either 1 or 0, so the value range of the features remains roughly unchanged when transferred between layers, which improves the speed of training.
$$y = \max(x, 0)$$
Here, $x$ and $y$ represent the input and output features of the activation function, respectively.
3. Deep Residual Shrinkage Network (DRSN)
A deep residual shrinkage network (DRSN) integrates an attention mechanism and a soft threshold function into the deep residual network [20]. The attention mechanism lets the network identify unimportant features, the soft threshold function sets them to zero, and the useful features are preserved, so the module effectively extracts useful features from noisy signals. The DRSN consists of an input layer, a convolutional layer, stacked residual shrinkage building units (RSBUs), batch normalization (BN), an activation function, global average pooling (GAP), and a fully connected layer [21], as depicted in Figure 4. The DRSN offers several key advantages for feature extraction in this model: (1). Residual learning: the DRSN contains residual connections that allow the network to learn additive residuals instead of mapping inputs directly to outputs. This alleviates the vanishing gradient problem encountered in deep networks and allows deeper architectures to be trained more efficiently. Deep networks can capture more complex and abstract features, thus improving the discriminative power of the extracted features. (2). Efficient information propagation: residual connections bypass the nonlinear activation function, enabling gradients to flow directly through the network. Shrinkage also suppresses noise and irrelevant features, thereby accelerating training convergence, reducing overfitting, and ultimately improving model performance. (3). Reduced feature dimension: the DRSN helps to reduce the dimensionality of the feature space while retaining important information. The addition of residual connections and shrinkage operations enables the model to generalize better to new data.
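The soft threshold function at the core of the shrinkage unit is not written out in the text, so the sketch below uses its standard form, $y = \mathrm{sgn}(x)\max(|x| - \tau, 0)$. How the threshold $\tau$ is obtained is an assumption here: following the usual DRSN design, it is taken as a scaling coefficient `alpha` in (0, 1) times the mean absolute value of the features, with `alpha` passed in by hand instead of being produced by the attention sub-network.

```python
import math


def soft_threshold(x, tau):
    """Shrink x towards zero by tau; values with |x| <= tau become zero."""
    return math.copysign(max(abs(x) - tau, 0.0), x)


def shrink_features(xs, alpha):
    """Apply soft thresholding with tau = alpha * mean(|x|).

    alpha stands in for the attention output (hypothetical, fixed here).
    """
    tau = alpha * sum(abs(x) for x in xs) / len(xs)
    return [soft_threshold(x, tau) for x in xs]


shrunk = shrink_features([-2.0, 0.2, 1.0, -0.4], alpha=0.5)
```

In this example the threshold is 0.45, so the two small-magnitude features are set to zero while the two larger ones are kept with reduced magnitude, which is the noise-suppression behavior described above.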

2.2.2. Quality Anomaly Recognition Based on DRSN-SVM

The identification of quality anomalies in complex product manufacturing processes relies primarily on the current values of the product quality characteristic indexes to determine whether the product meets quality standards. The deep residual shrinkage network (DRSN) removes noise and redundancy from the original quality data, reducing the data dimension. The support vector machine (SVM) is then used to identify abnormal quality states based on the reduced quality data. Combining the deep residual shrinkage network with the SVM addresses the data redundancy caused by high noise and high-dimensional product quality characteristics, leading to good recognition results.
The support vector machine is a linear classifier in supervised learning [22]. Its basic principle is to find a hyperplane that meets the classification requirements and keeps the points in the training set as far from the separating hyperplane as possible, maximizing the margin on both sides of it. Let the training set be $D = \{(x_i, y_i)\},\ i = 1, 2, \ldots, n$, where $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, 1\}$ are the feature vectors and associated labels, respectively. A separating hyperplane can be written as $W^T x + b = 0$, where $W$ is the normal vector of the hyperplane and $b$ is its bias.
Find the maximum-margin hyperplane, that is, the one with the greatest distance from the nearest training points, that satisfies the following for all sample points:
$$y_i (W^T x_i + b) \ge 1 - \zeta_i, \quad i = 1, 2, \ldots, n$$
$\zeta_i$ is the slack variable of the $i$-th constraint, and the optimization problem corresponding to the SVM model can be formulated as follows:
$$\min \varphi(W) = \frac{1}{2} \|W\|^2 + C \sum_{i=1}^{n} \zeta_i \quad \text{s.t.} \quad y_i (W^T x_i + b) \ge 1 - \zeta_i, \quad \zeta_i \ge 0, \quad i = 1, 2, \ldots, n$$
$C$ is a penalty factor applied when a constraint is violated, in order to prevent overfitting. Lagrange multipliers $\alpha_i$ ($0 \le \alpha_i \le C$) are introduced to convert the problem into the corresponding dual problem:
$$\max D(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0$$
The solution to the problem is $\alpha^* = (\alpha_1^*, \alpha_2^*, \ldots, \alpha_n^*)^T$. Each $\alpha_i^*$ is either 0 or corresponds to a support vector, and a small number of support vectors determine the separating hyperplane. The optimal classification function obtained is:
$$f(x) = \mathrm{sgn}\left( \sum_{i=1}^{n} \alpha_i^* y_i (x_i \cdot x) + b^* \right)$$
Here, $\mathrm{sgn}$ is the sign function, and $b^* = y_j - \sum_{i=1}^{n} \alpha_i^* y_i (x_i \cdot x_j)$ for a support vector $x_j$ with $0 < \alpha_j^* < C$.
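To make the classification function concrete, the sketch below evaluates the bias and the sign decision from a given dual solution. It does not solve the dual problem: the multipliers for the two-point toy set were worked out by hand (both equal 0.25 for the hard-margin case), and the function names `svm_bias` and `svm_decide` are hypothetical.

```python
def dot(a, b):
    """Inner product of two feature vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def svm_bias(alphas, ys, xs, j):
    """b* = y_j - sum_i alpha_i* y_i (x_i . x_j), with x_j a support vector."""
    return ys[j] - sum(a * y * dot(x, xs[j]) for a, y, x in zip(alphas, ys, xs))


def svm_decide(x, alphas, ys, xs, b):
    """f(x) = sgn(sum_i alpha_i* y_i (x_i . x) + b*)."""
    score = sum(a * y * dot(xi, x) for a, y, xi in zip(alphas, ys, xs)) + b
    return 1 if score >= 0 else -1


# Toy training set: one normal (+1) and one abnormal (-1) sample.
xs = [[1.0, 1.0], [-1.0, -1.0]]
ys = [1, -1]
alphas = [0.25, 0.25]            # hand-derived dual solution for this set

b_star = svm_bias(alphas, ys, xs, j=0)
label = svm_decide([2.0, 0.5], alphas, ys, xs, b_star)
```

Only samples with non-zero multipliers enter the sums, which is why a small number of support vectors is enough to define the classifier.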

2.2.3. Quality Abnormal Cause Diagnosis Based on SHAP

The DRSN-SVM model for identifying quality anomalies determines whether a product is qualified. When quality is abnormal, identifying the root cause promptly is crucial. Due to the complexity of the production process for complex products and the high dimensionality of the quality data, determining the true cause of an anomaly by empirical diagnosis alone is challenging. The SHAP model is an interpretability tool for machine learning algorithms that assesses the marginal contribution of each feature value to the prediction or classification result [23]. SHAP helps identify the root cause of anomalies in the following ways: (1). Feature importance: The SHAP value indicates the relative importance of each feature in determining the model predictions. This helps to determine which features have the greatest impact when an anomaly occurs. (2). Interpretation of individual predictions: This enables an understanding of how combinations of features contribute to a particular prediction, thus identifying outlier combinations or extremes that may be associated with an anomaly. (3). Feature interactions: SHAP can reveal how features interact and contribute to model outputs, thus helping to identify complex relationships that may be contributing to anomalies. In this paper, we employ the SHAP model to visualize the process of quality anomaly recognition. By calculating the contribution of each feature to the classification result of each sample, we can identify the key characteristics responsible for product quality anomalies.
(1) Shapley value
A game refers to behavior of a competitive or adversarial nature. For the players in a cooperative game, how to fairly distribute the benefits of successful cooperation is a problem that needs to be studied. In 1953, Lloyd Shapley proposed the Shapley value distribution method, which measures the benefit due to each player in the cooperation. The allocation must determine the benefit $\varphi_i(v)$ received by each player $i$ and satisfy:
$$\sum_{i \in s} \varphi_i(v) \ge v(s)$$
$$\varphi_i(v) = \sum_{s \in S_i} \omega(|s|) \left[ v(s) - v(s \setminus \{i\}) \right]$$
Here, $S_i$ is the set of all subsets of the player set $I$ that contain player $i$, $|s|$ is the number of elements of the set $s$, and $\omega(|s|)$ is the weighting factor.
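The Shapley value formula can be evaluated exactly for a small game by enumerating every coalition that contains a given player. The weighting factor is not spelled out in the text, so the sketch assumes the standard Shapley weight $\omega(|s|) = (n - |s|)!\,(|s| - 1)!\,/\,n!$ for $n$ players. The three-player payoff table is invented purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, v):
    """Exact Shapley values; v maps a frozenset of players to its payoff."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for k in range(len(others) + 1):
            for rest in combinations(others, k):
                s = frozenset(rest) | {i}   # a coalition containing player i
                weight = factorial(n - len(s)) * factorial(len(s) - 1) / factorial(n)
                total += weight * (v(s) - v(s - {i}))   # marginal contribution
        phi[i] = total
    return phi


# Illustrative payoffs (made-up numbers): single players earn nothing alone.
payoff = {
    frozenset(): 0, frozenset("A"): 0, frozenset("B"): 0, frozenset("C"): 0,
    frozenset("AB"): 4, frozenset("AC"): 3, frozenset("BC"): 2,
    frozenset("ABC"): 6,
}
phi = shapley_values(["A", "B", "C"], payoff.__getitem__)
```

For this payoff table the values work out to roughly 2.5, 2.0, and 1.5 for A, B, and C, and they sum to the payoff of the full coalition. The enumeration grows exponentially with the number of players, which is why SHAP implementations rely on approximations or model-specific shortcuts for high-dimensional data.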
(2) SHAP (Shapley additive explanation) value
The SHAP value relies on the Shapley value distribution approach from game theory to compute the marginal contribution of each feature to the prediction of a sample by the machine learning model, that is, the change in the model output when the feature is added, averaged over all feature orderings. The SHAP value of a feature represents its contribution to the prediction. This paper uses it to analyze the impact of individual features on the output of the black-box machine learning model. Typical measures of variable importance, for instance Pearson correlations, can only establish a relationship over the complete dataset and not for each sample. SHAP resolves the black-box prediction issue for each sample by providing local interpretability, as depicted in Figure 5.
Taking a linear prediction model as an example, it is easy to calculate the impact of a single feature on the model:
$$\hat{f}(x) = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p$$
Here, $x$ is the sample instance, each $x_j$ ($j = 1, 2, \ldots, p$) is a feature value of the sample $x$, and $\beta_j$ is the weight corresponding to feature $j$. The marginal contribution $\phi_j$ of the $j$-th feature to the prediction $\hat{f}(x)$ is:
$$\phi_j(\hat{f}) = \beta_j x_j - E(\beta_j X_j) = \beta_j x_j - \beta_j E(X_j)$$
Here, $E(\beta_j X_j)$ is the average effect estimate of feature $j$. The contribution is the difference between the feature effect and the mean effect. The total contribution of all features of sample $x$ equals the predicted value of sample $x$ minus the average predicted value:
$$\sum_{j=1}^{p} \phi_j(\hat{f}) = \sum_{j=1}^{p} \left( \beta_j x_j - E(\beta_j X_j) \right) = \left( \beta_0 + \sum_{j=1}^{p} \beta_j x_j \right) - \left( \beta_0 + \sum_{j=1}^{p} E(\beta_j X_j) \right) = \hat{f}(x) - E(\hat{f}(X))$$
The interpretation model can represent the prediction process of the entire sample as well as calculate the contribution of each characteristic value in the examples. Consequently, it enables the identification of the fundamental features that cause abnormal results in the prediction.
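The additivity property of the linear case can be checked numerically. The sketch below is a minimal illustration with hypothetical weights and a hypothetical three-sample background set. It computes $\phi_j = \beta_j (x_j - E(X_j))$ for each feature and compares the sum of the contributions with $\hat{f}(x) - E(\hat{f}(X))$.

```python
def predict(beta0, beta, x):
    """Linear model f(x) = beta0 + sum_j beta_j * x_j."""
    return beta0 + sum(b * v for b, v in zip(beta, x))


def linear_shap(beta, background, x):
    """Contribution of each feature: beta_j * (x_j - mean of feature j)."""
    n = len(background)
    means = [sum(row[j] for row in background) / n for j in range(len(beta))]
    return [beta[j] * (x[j] - means[j]) for j in range(len(beta))]


# Hypothetical weights and background data (not taken from the paper)
beta0, beta = 1.0, [2.0, -1.0, 0.5]
background = [[1.0, 2.0, 3.0], [3.0, 0.0, 1.0], [2.0, 4.0, 2.0]]
x = [4.0, 1.0, 6.0]

phi = linear_shap(beta, background, x)
mean_prediction = sum(predict(beta0, beta, row) for row in background) / len(background)
difference = predict(beta0, beta, x) - mean_prediction

print(phi)              # per-feature contributions
print(sum(phi), difference)
```

Both printed totals agree, which is the linear-model statement that the contributions sum to the prediction minus the average prediction.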

2.2.4. Establishment of a Quality Anomaly Recognition and Diagnosis Model Based on DRSN-SVM-SHAP

As previously stated, quality data from the manufacturing process of the complex product are collected together with the corresponding quality results. The sample matrix is then divided proportionally into training and test sets. The DRSN extracts features from the training set, and these features form the input of the SVM classifier, which produces the quality anomaly classification results. Quality anomaly diagnosis is then conducted, yielding the final DRSN-SVM-SHAP model for recognizing and diagnosing quality anomalies in complex product manufacturing.
The process of quality anomaly recognition and diagnosis for complex product manufacturing based on DRSN-SVM-SHAP is as follows:
Step 1: Collect the relevant processing factor data during processing.
Step 2: Clean the data.
Step 3: Apply "SMOTE + ENN" hybrid sampling to increase the minority class data.
Step 4: Normalize the data using max–min standardization.
Step 5: Train the DRSN with the labeled quality anomaly data to obtain the initial DRSN parameters.
Step 6: Transfer the parameters before the Flatten layer of the DRSN model to the DRSN part of the DRSN-SVM model, and use the grid search algorithm to optimize the hyperparameters.
Step 7: Repeat Steps 4–6 to generate the DRSN-SVM model.
Step 8: Feed the quality characteristics into the model to identify quality anomalies.
Step 9: Input the abnormal quality data into the SHAP model for quality diagnosis.
Step 10: Output the quality diagnosis results.
The quality anomaly recognition and diagnosis process for complex product manufacturing based on DRSN-SVM-SHAP is shown in Figure 6.
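Of the steps above, the max–min standardization in Step 4 is the easiest to show in isolation. The following sketch is a minimal illustration, not the authors' implementation: the function names are hypothetical, and fitting the minimum and maximum on the training rows and reusing them for new rows is an assumed (though common) practice that the paper does not describe.

```python
def fit_min_max(rows):
    """Return per-column minima and maxima of the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, mins, maxs):
    """Scale each value to (v - min) / (max - min); constant columns map to 0."""
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, mins, maxs)
        ])
    return scaled


# Hypothetical training rows: second column is constant
train = [[1.0, 10.0], [3.0, 10.0], [2.0, 10.0]]
mins, maxs = fit_min_max(train)
train_scaled = apply_min_max(train, mins, maxs)
print(train_scaled)
```

Mapping constant columns to zero avoids a division by zero; such columns carry no information and would normally be removed during data cleaning.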

2.2.5. Model Evaluation

In this paper, the confusion matrix and ROC curve, commonly used evaluation indexes of classification models, were used to evaluate the model [24].
(1) Confusion Matrix
A model is evaluated with the confusion matrix, which counts the samples for each pair of true class and predicted class. In a binary classification model, TP (true positive) is the number of positive samples that the model correctly predicts as positive, and FN (false negative) is the number of positive samples that the model incorrectly predicts as negative. TN (true negative) is the number of negative samples correctly predicted as negative, and FP (false positive) is the number of negative samples incorrectly predicted as positive. Table 1 displays the confusion matrix. The matrix's diagonal holds the numbers of samples correctly classified by the model; higher values indicate better model performance. A type I error occurs when the model predicts a positive class when in fact the true class is negative. This type of error frequently results in overdiagnosis or overtreatment. A type II error occurs when the model predicts a negative class when in fact the true class is positive. This type of error frequently results in underdiagnosis or failure to take necessary action. Other evaluation indexes derived from the confusion matrix are the classification Accuracy, Recall, Precision, and the F-measure. Values closer to 1 correspond to more accurate model classification results. Equations (17)–(19) reflect the Accuracy, Recall, Precision, and F-measure calculation processes.
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
$$Recall = \frac{TP}{TP + FN}$$
$$Precision = \frac{TP}{TP + FP}$$
$$F = \frac{2 \times Precision \times Recall}{Precision + Recall}$$
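The four indexes follow directly from the confusion matrix counts. The sketch below is a minimal illustration; the function name and the counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, Recall, Precision and F-measure from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts: 100 true positives in total, 100 true negatives in total
metrics = classification_metrics(tp=90, tn=70, fp=30, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, Recall is high because few positives are missed, while Precision is lower because of the 30 false positives; the F-measure lies between the two.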
(2) Receiver Operating Characteristic Curve (ROC Curve)
In the ROC curve, the horizontal axis represents the false positive rate, that is, the proportion of negative samples that are misclassified as positive. The vertical axis represents the true positive rate, that is, the Recall derived from the confusion matrix. The closer the curve is to the upper left corner, the better the classification performance of the model.
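To make the construction of the curve concrete, the following sketch sweeps a decision threshold over the classifier scores, records one (false positive rate, true positive rate) point per threshold, and integrates the curve with the trapezoidal rule. The labels and scores are hypothetical; labels use 1 for positive and −1 for negative, matching the labeling convention of the case data.

```python
def roc_points(labels, scores):
    """ROC points obtained by lowering the threshold through all distinct scores."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y != 1)
        points.append((fp / neg, tp / pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a piecewise-linear curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and scores (higher score = more likely positive)
labels = [1, 1, -1, 1, -1, -1]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
auc_value = area_under_curve(points)
print(points)
print(auc_value)
```

In this example, eight of the nine positive-negative pairs are ranked correctly, so the area under the curve is 8/9.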
(3) Matthews Correlation Coefficient (MCC)
MCC is mainly used to evaluate binary classification. It takes TP, TN, FP, and FN into account, is a more balanced index, and can be used when the samples are imbalanced or asymmetric. The value of MCC lies in the range [−1, 1]: a value of 1 means that the prediction is in perfect agreement with the actual result, a value of 0 means that the prediction is no better than a random prediction, and a value of −1 means that the prediction is completely inconsistent with the actual result. MCC thus essentially describes the correlation coefficient between the predicted and actual results.
$$MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP) \times (TP + FN) \times (TN + FP) \times (TN + FN)}}$$
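The three reference values of MCC can be reproduced with a few lines of Python. This is a minimal sketch with hypothetical counts; returning 0 when the denominator is zero is an assumed convention that the paper does not discuss.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion matrix counts."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined cases are reported as 0
        return 0.0
    return (tp * tn - fp * fn) / denominator


perfect = mcc(tp=50, tn=50, fp=0, fn=0)       # every sample classified correctly
chance = mcc(tp=25, tn=25, fp=25, fn=25)      # predictions unrelated to the truth
inverted = mcc(tp=0, tn=0, fp=50, fn=50)      # every sample classified wrongly
print(perfect, chance, inverted)
```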

2.3. Case Analysis

A data set of manufacturing processes for a complex semiconductor part consists of 1567 instances (104 of which are unqualified). Each instance represents a product entity with 591 features and 1 label (label 1 for qualified and label −1 for unqualified).
If the data are simply divided into a training set and a validation set according to a fixed percentage, the resulting validation set will contain very few data points and the validation scores will fluctuate significantly. In such cases, it is preferable to use k-fold cross-validation, which randomly divides the data set into k partitions of equal size. For each partition i, the model is trained on the remaining k − 1 partitions and then evaluated on partition i. The final validation score is the average of the k scores. Previous findings indicate that k = 10 yields higher performance and less skewed estimates [25]. In this paper, k is set to 10, which means that in each fold 90% of the data is used for training, while the remaining 10% is employed as a test set for model testing.
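The partitioning described above can be sketched with the standard library alone. The generator below is a minimal illustration: the function name and the fixed seed are hypothetical, and when the number of samples is not divisible by k the partitions differ in size by at most one sample. The sample count 2184 is the size of the data set after resampling reported in this case study.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


splits = list(k_fold_indices(2184, 10))
for train, test in splits:
    print(len(train), len(test))
```

The final validation score would then be the mean of the k scores obtained by training on each `train` list and evaluating on the matching `test` list.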

2.3.1. Data Preprocessing

First, the features with problematic entries, such as null values and missing values, are identified and sorted out. Data analysis found 12 columns of unique values, 33 columns with more than 40% null values, and 98 columns of repeated values; after these were deleted, 448 feature columns remained. The remaining missing values were filled with the average value of the corresponding feature, giving a data set without null values that contains 1463 positive examples (qualified) and 104 negative examples (unqualified). After sampling by "SMOTE + ENN", the data set has a total of 2184 examples, including 1442 positive examples and 742 negative examples. Part of the data after preprocessing is shown in Table 2.
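Two of the cleaning operations above, removing columns with more than 40% null values and filling the remaining gaps with the feature mean, can be sketched as follows. The helper names and the small data matrix are hypothetical, and `None` is assumed to mark a missing entry.

```python
def drop_sparse_columns(rows, max_null_ratio=0.4):
    """Keep only columns whose share of missing entries is at most max_null_ratio."""
    n = len(rows)
    keep = [
        j for j in range(len(rows[0]))
        if sum(r[j] is None for r in rows) / n <= max_null_ratio
    ]
    return [[r[j] for j in keep] for r in rows], keep


def impute_mean(rows):
    """Replace each missing entry with the mean of the observed values in its column."""
    means = []
    for column in zip(*rows):
        present = [v for v in column if v is not None]
        # A fully empty column has no mean; 0.0 is an arbitrary placeholder
        means.append(sum(present) / len(present) if present else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


# Hypothetical raw data: the middle column is 80% missing
raw = [
    [1.0, None, 5.0],
    [3.0, None, None],
    [None, 2.0, 7.0],
    [2.0, None, 6.0],
    [2.0, None, 6.0],
]
filtered, kept_columns = drop_sparse_columns(raw)
clean = impute_mean(filtered)
print(kept_columns)
print(clean)
```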

2.3.2. Model Structure and Parameter Settings

For complex product manufacturing quality, the DRSN-SVM anomaly recognition model is formed by stacking multiple deep residual shrinkage modules, with a grid search algorithm used for parameter optimization. This paper designs a deep residual shrinkage network with 12 residual shrinkage modules for feature extraction; the model structure and parameters are shown in Table 3. The neural network structure parameters include the number, width, and stride of the convolution kernels ("/2" means that the stride is 2). The important parameters of the SVM are the penalty coefficient C and γ. The optimized value C = 1 indicates a low tolerance to errors, and the optimized value γ = 0.0003 indicates that the model has strong generalization ability.
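The output sizes listed in Table 3 can be reproduced with simple arithmetic. The sketch below assumes that every layer pads its input so that the output length is the input length divided by the stride, rounded up; the paper does not state the padding scheme, so this is an assumption. The layer list mirrors the rows of Table 3.

```python
from math import ceil

# (layer type, number of kernels, stride) following the rows of Table 3
layers = (
    [("Conv", 4, 2), ("RBU", 4, 2)]
    + [("RBU", 4, 1)] * 3
    + [("RBU", 8, 2)]
    + [("RBU", 8, 1)] * 3
    + [("RBU", 16, 2)]
    + [("RBU", 16, 1)] * 3
)


def trace_shapes(length, layers):
    """Return (layer type, channels, output length) after each layer."""
    shapes = []
    for name, channels, stride in layers:
        # Assumed "same" padding: output length = ceil(input length / stride)
        length = ceil(length / stride)
        shapes.append((name, channels, length))
    return shapes


shapes = trace_shapes(448, layers)
for name, channels, length in shapes:
    print(f"{name}: {channels} x {length} x 1")
```

Starting from the 448 retained features, the four stride-2 layers halve the length to 224, 112, 56, and 28, and the list contains the 12 residual shrinkage modules mentioned in the text.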

2.3.3. Result Analysis

The training set data are fed into the deep residual shrinkage network for feature extraction, and the extracted features are input into the support vector machine model for classification. Figure 7 shows the learning curve of the manufacturing quality anomaly recognition model of complex products based on the DRSN-SVM model. The learning curves for quality anomaly identification using the naive Bayes classifier and the SVM classifier are shown in Figure 8. In Figures 7 and 8, the red and green dots and lines indicate the training score and cross-validation score, respectively, which represent the performance of the model on the training and test sets. The light red and light green regions indicate the respective ranges of variation of the model's scores for different numbers of training instances. As these regions shrink, the model's performance becomes more stable.
As can be seen from Figure 8 and Table 4, the accuracy of the naive Bayes classifier decreases first and then increases with the increase in sample size, and finally, the accuracy is 0.84 when the sample size reaches 1600. The accuracy of the support vector machine classifier is 0.98 when the number of samples is 1600. However, the accuracy of the DRSN-SVM model is 1 when the sample size is 1170. Obviously, the quality anomaly identification method based on DRSN-SVM converges earlier and has higher accuracy. On this basis, the SHAP model is used to diagnose quality anomalies.
(1) Analysis of the data sets
Combining feature importance with feature effects, the SHAP values of all features for every sample in the data set are drawn as a scatterplot, as shown in Figure 9. The Y-axis lists the feature names of the dataset, sorted by feature importance. The X-axis shows the SHAP value of each feature, which is the impact of the feature on the prediction. The points are coded by color according to their relation to the recognition result: red represents a positive relation with the recognition result, and blue represents a negative relation. For each sample, if the combined result is greater than 0.5, the product is judged qualified; otherwise, it is judged unqualified. The jitter of the same feature in the Y-axis direction represents the SHAP value distribution of the feature in the entire dataset, and the wider the distribution is, the more influential the feature is. Due to the large number of features, only the top 20 features that affect the recognition results are selected for display.
Figure 9 shows that feature 19 has the greatest influence on the recognition results, closely followed by feature 3 and feature 8. Conversely, feature 4 and feature 9 have a negligible influence on the model outcomes. However, the scatterplot alone is insufficient to show the overall magnitude of each feature's impact on the model, because that impact depends on the distribution of the SHAP values. The color indicates the direction of influence that each feature has on the recognition result, and the features are ranked by importance. Figure 10 displays a bar chart of the average impact of each feature on the model's output, which makes it easier to assess the significance of each feature.
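A ranking of the kind shown in the bar chart can be derived from a matrix of per-sample SHAP values. The sketch below assumes that the average impact is the mean absolute SHAP value of each feature, which is the usual convention for such charts but is not stated explicitly in the text; the SHAP values and the function name are hypothetical.

```python
def rank_by_mean_abs_shap(shap_matrix, feature_names):
    """Rank features by the mean absolute SHAP value over all samples."""
    n = len(shap_matrix)
    scores = {
        name: sum(abs(row[j]) for row in shap_matrix) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values: one row per sample, one column per feature
feature_names = ["feature 3", "feature 8", "feature 19"]
shap_matrix = [
    [0.10, -0.05, 0.30],
    [-0.20, 0.05, -0.10],
    [0.30, 0.05, 0.50],
]

ranking = rank_by_mean_abs_shap(shap_matrix, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking absolute values before averaging prevents positive and negative contributions of the same feature from cancelling each other out.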
(2) Sample analysis
By analyzing the input of a single sample using SHAP, we can determine the contribution value of each characteristic towards the recognition result. The outcomes are presented in Figure 11, where red highlights the features that lead to a positive recognition result and blue indicates those that lead to a negative result. The values of the feature are represented by the numbers below the bar. The wider the color area of the bar chart, the greater the feature’s influence. If the f(x) value is less than 0.5, the instance’s identification result is unqualified, and vice versa. The single-sample SHAP analysis of samples 257 and 1290 was conducted as described below.
As depicted in Figure 11, sample 257's model output value is 0.11, resulting in an unqualified classification. The main contributor to identifying this sample as an anomaly was feature 3, followed by feature 19 and feature 5. The result can therefore be diagnosed: the cause of this sample's abnormality is linked to feature 3, feature 19, and feature 5.
As illustrated in Figure 12, sample 1290’s model output value is 0.59, leading to a qualified classification. During the qualification identification process, feature 19 is the most influential, with feature 14 and feature 18 as secondary contributors.
Figure 12 is inadequate in presenting the SHAP value for every feature. By sorting the contribution value of each feature and using it as the X-axis, we can generate a histogram (Figure 13), which provides a clear visual representation of the SHAP value for every feature in sample 1290.
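The single-sample reading can be summarized in code. The sketch below assumes the usual additive form of a SHAP explanation, in which the model output equals a base value plus the sum of the feature contributions, and applies the 0.5 decision rule described in the text. The base value, feature names, and contributions are hypothetical and do not correspond to samples 257 or 1290.

```python
def explain_sample(base_value, contributions, threshold=0.5):
    """Return the model output, the decision, and features sorted by |contribution|."""
    output = base_value + sum(contributions.values())
    decision = "qualified" if output >= threshold else "unqualified"
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return output, decision, ranked


# Hypothetical explanation of one sample
contributions = {"feature A": -0.25, "feature B": 0.05, "feature C": -0.15}
output, decision, ranked = explain_sample(0.6, contributions)

print(f"f(x) = {output:.2f} -> {decision}")
for name, value in ranked:
    print(f"{name}: {value:+.2f}")
```

Sorting by absolute contribution puts the features most responsible for the decision first, which is the ordering used when the contributions are drawn as a histogram.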

2.3.4. Model Evaluation

The DRSN-SVM quality anomaly recognition model was evaluated using the confusion matrix and ROC curve. The test set data were fed into the model for testing, and the resulting confusion matrix is displayed in Figure 14. Of the 143 positive cases, 141 were correctly classified, along with 75 of the 76 negative cases, giving TP = 141, TN = 75, FN = 2, and FP = 1. Further calculations yield Accuracy = 0.99, Recall = 0.986, Precision = 0.993, F = 0.989, and MCC = 0.97, indicating satisfactory classification performance of the model.
The ROC curve of the DRSN-SVM quality anomaly recognition model is depicted in Figure 15. The AUC of 0.99 indicates that the model classifies well.

3. Conclusions

A quality anomaly recognition and diagnosis model based on DRSN-SVM-SHAP has been developed to address two difficulties: traditional quality control charts cannot fully express quality anomaly patterns, and the numerous factors affecting product quality, combined with the limited number of controlled quality characteristics, make it difficult to diagnose quality anomalies accurately. The study used the deep residual shrinkage network to extract features from the quality characteristic data. SVM was then used to identify anomalies from the reduced-dimension features. The resulting DRSN-SVM model recognizes quality anomalies in the manufacturing of complex products by classifying manufacturing quality as normal or anomalous. The SHAP model was used to diagnose the root causes of abnormalities in each sample and provide decision-making suggestions for managers. The model was verified using actual production data and evaluated with a confusion matrix, ROC curve, and other indicators. The final outcomes indicate that the DRSN-SVM quality anomaly recognition model achieved 99% accuracy, converging faster and with significantly higher accuracy than the naive Bayes and support vector machine classification models.

Author Contributions

Conceptualization, Y.L. and Z.W.; methodology, Y.L., Z.W. and D.Z.; software, M.Y. and X.G.; validation, M.Y., X.G. and L.B.; formal analysis, Y.L.; investigation, Z.W.; resources, Y.L.; data curation, D.Z.; writing—original draft preparation, Y.L.; writing—review and editing, Z.W.; visualization, D.Z. and L.B.; supervision, M.Y.; project administration, Z.W.; funding acquisition, Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Key Research and Development Program of Shaanxi (Program Nos. 2021SF-421, 2021SF-422), the Key Scientific Research Program of Shaanxi Provincial Education Department (Program No. 20JY047), and the Collaborative Innovation Center of Modern Equipment Green Manufacturing in Shaanxi Province, China.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, H.; Zhang, G.; Jiang, J.; Zong, X. Strategy for improving Product Quality of Manufacturing Industry in China. Strateg. Study CAE 2022, 24, 38–47.
  2. Xu, J.; Lv, H.; Zhuang, Z.; Lu, Z.; Zou, D.; Qin, W. Control Chart Pattern Recognition Method Based on Improved One-dimensional Convolutional Neural Network. IFAC-Pap. 2019, 52, 1537–1542.
  3. Yu, J.; Zheng, X.; Wang, S. A deep autoencoder feature learning method for process pattern recognition. J. Process Control 2019, 79, 1–15.
  4. Chiu, J.E.; Tsai, C.H. On-line concurrent control chart pattern recognition using singular spectrum analysis and random forest. Comput. Ind. Eng. 2021, 159, 107538.
  5. Wan, Y.; Zhu, B. Abnormal patterns recognition in bivariate autocorrelated process using optimized random forest and multi-feature extraction. ISA Trans. 2021, 109, 102–112.
  6. Li, L.L.; Chen, K.; Gao, J.M.; Li, H. Research on Quality Anomaly Recognition Method Based on Optimized Probabilistic Neural Network. Shock Vib. 2021, 27, 2813–2821.
  7. Zhao, Z.Y.; Liu, Y.M.; Wang, N. Dynamic Process Quality Abnormal Patterns Recognition Method Based on improved Convolutional Neural Network. Ind. Eng. Manag. 2021, 26, 69–76.
  8. Liu, Y.; Zhou, H.; Tsung, F.; Zhang, S. Real-time quality monitoring and diagnosis for manufacturing process profiles based on deep belief networks. Comput. Ind. Eng. 2019, 136, 494–503.
  9. Mondal, P.P.; Ferreira, P.; Kapoor, G.; Bless, P. Monitoring and Diagnosis of Multistage Manufacturing Processes Using Hierarchical Bayesian Networks. Procedia Manuf. 2021, 53, 32–43.
  10. Donovan, F.; Talayeh, R. A cost-sensitive convolution neural network learning for control chart pattern recognition. Expert Syst. Appl. 2020, 150, 113275.
  11. Jeong, C.; Fang, X. Two-Dimensional Variable Selection and Its Applications in the Diagnostics of Product Quality Defects. IISE Trans. 2021, 54, 619–629.
  12. Benmahamed, Y.; Kherif, O.; Teguar, M.; Boubakeur, A.; Ghoneim, S.S. Accuracy Improvement of Transformer Faults Diagnostic Based on DGA Data Using SVM-BA Classifier. Energies 2021, 14, 2970.
  13. Li, J.Y.; Yu, Z.H.; Xu, X.G. Diagnosis method of multi-cause process quality under incomplete information. J. Harbin Inst. Technol. 2016, 48, 88–93.
  14. Cai, Y.; Chen, S.; Wang, Y. Multivariate Mean Shift Diagnostic Model Based on Support Vector Machine. In Proceedings of the 2017 IEEE 7th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), Honolulu, HI, USA, 31 July 2017–4 August 2017.
  15. Ma, L.; Dong, J.; Peng, K. Root cause diagnosis of quality-related faults in industrial multimode processes using robust Gaussian mixture model and transfer entropy. Neurocomputing 2018, 285, 60–73.
  16. Xu, J.; Gao, S.; Dang, X.; Zhao, W.; Zhang, Q.; Qiu, S. BO-MADRSN: Bayesian optimized multi-attention residual shrinkage networks for industrial soft sensor modeling. Measurement 2023, 224, 113477.
  17. Yang, S.; Zhang, H.; Li, Z.; Duan, S.; Yan, J. Identification of industrial exhaust based on an electronic nose with an interleaved grouped residual convolutional compression network. Sens. Actuators A Phys. 2023, 363, 114692.
  18. Zhao, Z.; Luo, Z.; Wang, P.; Li, J. Survey on Image Classification Algorithms Based on Deep Residual Network. Comput. Syst. Appl. 2020, 29, 14–21.
  19. Jiao, J.; Zhao, M.; Lin, J.; Liang, K. A comprehensive review on convolutional neural network in machine fault diagnosis. Neurocomputing 2020, 417, 36–63.
  20. Zhao, M.; Zhong, S.; Fu, X.; Tang, B.; Pecht, M. Deep Residual Shrinkage Networks for Fault Diagnosis. IEEE Trans. Ind. Inform. 2020, 16, 4681–4690.
  21. Qin, Y.; Liu, X.; Zhang, F.; Shan, Q.; Zhang, M. Improved deep residual shrinkage network on near infrared spectroscopy for tobacco qualitative analysis. Infrared Phys. Technol. 2023, 129, 104575.
  22. Liu, X.; Zhang, Z. Parameter optimization of Support Vector Machine based on improved grid search method. J. Jiangxi Univ. Sci. Technol. 2019, 40, 5–9.
  23. Liu, Y.; Ke, J.; Jiang, H.; Song, X. Improvement of the PoS Consensus Mechanism in Blockchain Based on Shapley Value. J. Comput. Res. Dev. 2018, 55, 2208–2218.
  24. Li, Y.X.; Chai, Y.; Hu, Y.Q.; Yin, H.P. Review of imbalanced data classification methods. Control Decis. 2019, 34, 673–688.
  25. Imran, U.; Waris, A.; Nayab, M.; Shafiq, U. Examining the Impact of Different K Values on the Performance of Multiple Algorithms in K-Fold Cross-Validation. In Proceedings of the 3rd International Conference on Digital Futures and Transformative Technologies (ICoDT2), Islamabad, Pakistan, 3–4 October 2023; pp. 1–4.
Figure 1. The general process of quality anomaly identification and diagnosis.
Figure 2. Framework for quality anomaly identification and diagnosis in the manufacturing process.
Figure 3. Deep residual network structure diagram.
Figure 4. Structure diagram of the residual shrinkage module.
Figure 5. Schematic of SHAP Model Function.
Figure 6. Process of a complex product manufacturing quality anomaly recognition model based on DRSN-SVM.
Figure 7. Accuracy curve of a quality anomaly recognition model based on DRSN-SVM.
Figure 8. Naive Bayes classifier and support vector machine classifier classification accuracy curve diagram.
Figure 9. Data set feature SHAP value distribution scatterplot.
Figure 10. Feature Shapley value distribution histogram of the data set.
Figure 11. Single-sample SHAP analysis chart of sample 257.
Figure 12. Single-sample SHAP analysis chart of sample 1290.
Figure 13. Single-sample SHAP analysis histogram of sample 1290.
Figure 14. Confusion matrix of quality anomaly classification.
Figure 15. ROC curve of the DRSN-SVM quality anomaly recognition model.
Table 1. Confusion matrix table.

| Confusion Matrix | True Value: Positive | True Value: Negative |
|---|---|---|
| Predicted value: Positive | TP | FP (Type I) |
| Predicted value: Negative | FN (Type II) | TN |
Table 2. Partial data after preprocessing.

| ID | 0 | 1 | 2 | 3 | 4 | 589 | Label |
|---|---|---|---|---|---|---|---|
| 0 | 3030.93 | 2564 | 2187.733 | 1411.127 | 1.3602 | 99.6701 | −1 |
| 1 | 3095.78 | 2465.14 | 2230.422 | 1463.661 | 0.8294 | 208.2045 | −1 |
| 2 | 2932.61 | 2559.94 | 2186.411 | 1698.017 | 1.5102 | 82.8602 | 1 |
| 3 | 2988.72 | 2479.9 | 2199.033 | 909.7926 | 1.3204 | 73.8432 | −1 |
| 4 | 3032.24 | 2502.87 | 2233.367 | 1326.52 | 1.5334 | 73.8432 | −1 |
| 2184 | 2944.92 | 2450.76 | 2195.444 | 2914.179 | 1.5978 | 137.7844 | −1 |
Table 3. Model structure and hyperparameter values.

| Number of Modules | Output Size | DRSN-SVM |
|---|---|---|
| 1 | 1 × 448 × 1 | Input |
| 1 | 4 × 224 × 1 | Conv (4, 3, /2) |
| 1 | 4 × 112 × 1 | RBU (4, 3, /2) |
| 3 | 4 × 112 × 1 | RBU (4, 3) |
| 1 | 8 × 56 × 1 | RBU (8, 3, /2) |
| 3 | 8 × 56 × 1 | RBU (8, 3) |
| 1 | 16 × 28 × 1 | RBU (16, 3, /2) |
| 3 | 16 × 28 × 1 | RBU (16, 3) |
| 1 | 16 | BN, ReLU, GAP |
| 1 | 16 | FC |
| 1 | 1 | SVM (1, 0.0003) |
Table 4. Comparison results with other methods.

| Method | Number of Training Instances | Training Score |
|---|---|---|
| NBM | 1600 | 0.84 |
| SVM | 1600 | 0.98 |
| DRSN-SVM | 1170 | 1 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, Y.; Wang, Z.; Zhang, D.; Yang, M.; Gao, X.; Ba, L. Product Quality Anomaly Recognition and Diagnosis Based on DRSN-SVM-SHAP. Symmetry 2024, 16, 532. https://doi.org/10.3390/sym16050532

AMA Style

Liu Y, Wang Z, Zhang D, Yang M, Gao X, Ba L. Product Quality Anomaly Recognition and Diagnosis Based on DRSN-SVM-SHAP. Symmetry. 2024; 16(5):532. https://doi.org/10.3390/sym16050532

Chicago/Turabian Style

Liu, Yong, Zhuo Wang, Dong Zhang, Mingshun Yang, Xinqin Gao, and Li Ba. 2024. "Product Quality Anomaly Recognition and Diagnosis Based on DRSN-SVM-SHAP" Symmetry 16, no. 5: 532. https://doi.org/10.3390/sym16050532

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.
