Research on Micro-Fault Detection and Multiple-Fault Isolation for Gas Sensor Arrays Based on Serial Principal Component Analysis

Xu, Yonghui; Meng, Ruotong; Yang, Zixuan

doi:10.3390/electronics11111755


by Yonghui Xu ¹,*, Ruotong Meng ² and Zixuan Yang ³

¹ Institute of Automatic Testing and Control, Harbin Institute of Technology, Harbin 150001, China

² Beijing Institute of Mechanical and Electrical Engineering, Beijing 100074, China

³ China Aerospace Science and Industry Nanjing Chenguang Group, Nanjing 210006, China

* Author to whom correspondence should be addressed.

Electronics 2022, 11(11), 1755; https://doi.org/10.3390/electronics11111755

Submission received: 27 April 2022 / Revised: 20 May 2022 / Accepted: 27 May 2022 / Published: 31 May 2022

(This article belongs to the Section Artificial Intelligence)


Abstract:

Machine learning algorithms play an important role in fault detection and fault diagnosis of gas sensor arrays. Because a gas sensor array suffers stability degradation and shifts in output signal amplitude under long-term operation, it is very important to detect abnormal output signals of the array in time and to locate the fault accurately. To address the low detection accuracy of micro-faults in gas sensor arrays, this paper adopts the serial principal component analysis (SPCA) method, which combines the advantages of principal component analysis (PCA) in the linear part and of kernel principal component analysis (KPCA) in the nonlinear part. The experimental results show that this method is more sensitive to micro-faults and has better fault detection accuracy than PCA-based and KPCA-based fault detection. In addition, to address the low accuracy of multiple-fault isolation, an SPCA-based reconstruction contribution fault isolation method is proposed. The experimental results show that this method has higher fault isolation accuracy than the method based on contribution plots.

Keywords:

fault detection; fault isolation; gas sensor array; principal component analysis (PCA); kernel principal component analysis (KPCA); reconstruction contribution

1. Introduction

With the development of artificial intelligence and sensor technology, machine olfactory systems, as a kind of bionic sensor detection technology that uses electronic devices to simulate biological olfaction, have been widely used in gas qualitative detection and quantitative analysis [1], such as gas quality monitoring [2,3], biomedicine [4,5,6], and food industry [7,8]. Especially in some industrial environments, the detection of toxic, harmful, flammable and explosive gases is more related to human health and life safety [9,10,11,12,13]. Therefore, it is of great significance and practical application value to reasonably optimize the machine olfactory system to realize the timely and effective detection and recognition of gases in the environment.

Methane is a flammable and explosive gas and one of the most common hazardous gases in industrial environments. Machine olfactory systems enable real-time and effective monitoring of methane and of its stability. A machine olfactory system is composed of a gas sensor array unit, a signal processing unit and a pattern recognition unit [14,15]. Recent advances in nanofabrication, sensors, algorithms and micro-circuitry design have considerably improved machine olfactory systems [16,17,18]. Among these units, the gas sensor array is the source through which the machine olfactory system senses and acquires gas data from the target, and the reliability of its detection results plays an important role in the overall performance of the whole olfactory sensor system. In recent years, metal-oxide semiconductor (MOS) gas sensors have developed rapidly, and their sensitivity and selectivity have been effectively improved by improving oxide structures and using sensor arrays [19,20]. MOS sensors have become one of the most prominent choices for the sensing elements used in machine olfaction applications because of their high sensitivity, fast response, low cost, and ease of operation [21,22]. However, due to the characteristics of the gas-sensitive material and the design of the sensor structure, MOS gas sensors can malfunction during long-term use. The causes of failure include aging of gas-sensitive materials (sensing material degradation as a result of irreversible chemical reactions), circuit equipment failures (power supply instability, deficient mechanical manipulation), and changes in ambient temperature and humidity [23,24]. These causes affect the response characteristics of the sensor and can result in a sudden failure of the sensor output signal, which in turn makes the output signal of the sensor array abnormal [25,26,27,28]. Moreover, the gas sensor array is composed of multiple MOS gas sensors, which increases both the probability and the complexity of failure. Serious faults usually evolve from micro-faults. It is therefore necessary to monitor the working state of the gas sensor array, and in particular to use an effective machine learning algorithm to detect and locate its micro-faults. The acquired fault information can then inform subsequent system maintenance decisions.

Many algorithms have been proposed to solve sensor fault detection problems, such as principal component analysis (PCA), independent component analysis (ICA), support vector machine (SVM), kernel principal component analysis (KPCA), and nonnegative matrix factorization (NMF) [29,30,31,32,33]. To further improve the accuracy of fault detection, Cheng Zhang et al. [34] proposed a new fault detection index that forms a weighted combination of Hotelling's $T^{2}$ and the squared Euclidean distance, building on the squared prediction error statistic of the PCA algorithm. Although this method improves the fault detection index, it does not improve the feature extraction applied to the process data, so it improves the accuracy of fault detection only to a certain extent. Fu et al. [35] proposed a fault detection strategy called Trend Correlation based Fault Detection (TCFD). This strategy detects faulty sensor nodes by analyzing the trend correlation and the median value of neighboring nodes. However, since gain faults do not change the general trend of the output data, such faults are relatively difficult to detect. Jiang et al. [36] proposed a KPCA fault detection method based on two-parameter optimization. It constructs selection criteria for the kernel parameters and the principal component scores and considers each type of fault, which improves the fault detection performance. However, the KPCA algorithm is computationally expensive, and because the method considers all kinds of faults during parameter optimization, the computational cost increases further. In addition, researchers have proposed fault detection methods based on neural networks. For example, Chen Y et al. [37] used wavelet packet energy to extract features and a genetic algorithm to optimize a BP neural network for the fault diagnosis system, achieving diagnosis with small error. However, learning the BP structure parameters and adjusting the weights make the execution efficiency of this approach poor. At present, research on sensor micro-faults is very limited. Few studies consider the complex case of multiple faults in data with mixed linear and nonlinear characteristics, and most methods have low accuracy in that case.

After a fault in the gas sensor array has been detected, fault isolation is needed. Currently, the most classic fault isolation method is the contribution plots method based on process variable analysis proposed by P. Miller [38]. However, due to the fault smearing (towing) effect in this method, faulty variables affect the contributions of fault-free variables [39]. This problem also leads to poor isolation accuracy in the case of multiple faults. On this basis, researchers have proposed many improved algorithms based on contribution plots [40,41,42]. Although these methods improve the accuracy of fault isolation to some extent, the smearing effect among multivariate statistical variables cannot be completely eliminated. To reduce the loss of fault sensitivity caused by the interaction of different variables, Navi M. adopted a partially adaptive KPCA algorithm to realize fault isolation. This method constructs model information by performing KPCA on the simplified fault subset and then compares the structured residuals with the model information, using fault isolation characteristics similar to parity relations, to obtain the isolation results [43]. In addition, the reconstruction idea has been applied to the construction of isolation algorithms in recent years [44,45,46], and better fault isolation results have been achieved to some extent.

To improve the sensitivity to micro-faults and the ability to detect multiple faults in gas sensor arrays, principal component analysis based on a serial model structure (SPCA) [47] is used for fault detection. This method is a feature extraction method combining PCA and KPCA. More specifically, PCA is first applied to extract principal components (PCs) as linear features, and KPCA is then applied in the PCA residual space (RS) to extract nonlinear features. The statistics constructed with the SPCA algorithm contain both linear and nonlinear principal component features, describe the data better, and are highly sensitive to micro-faults. Because the data of complex industrial processes may not follow a specific distribution, the kernel density estimation (KDE) method is used to calculate the control limits of the statistics. On this basis, to isolate the detected faults and locate them, an SPCA-based reconstruction contribution fault isolation algorithm is proposed in this paper. It uses the idea of reconstruction contribution and the theory of the fault direction set to isolate multiple faults. To verify the effectiveness of the proposed fault detection and isolation method, a machine olfactory system based on a MOS gas sensor array is established and used to collect a dataset. Sufficient experimental samples are obtained under normal working conditions in a steady-state methane environment, and the fault signals are obtained by fault injection. The analysis shows that the SPCA method adopted in this paper outperforms the separate PCA and KPCA methods in both micro-fault diagnosis and multiple-fault isolation, which supports the effectiveness of the proposed method.

The structure of the rest of this paper is as follows: in Section 2, after summarizing the fault detection and isolation methods, the fault detection methods based on the SPCA algorithm and SPCA-based reconstruction contribution fault isolation algorithm are introduced in detail; in Section 3, the specific experimental analysis is carried out; and in Section 4, the conclusion is drawn.

2. Method

2.1. Overview of Fault Detection and Isolation Methods

The block diagram of the SPCA-based fault detection and isolation method is shown in Figure 1. The SPCA algorithm used here is a serial model structure composed of principal component analysis (PCA) and kernel principal component analysis (KPCA). In the fault detection stage, PCA is used to obtain the linear principal component and residual component of normal data. Then, KPCA is used to process the residual component of PCA to obtain the nonlinear principal component and KPCA residual component. The linear and nonlinear principal components and KPCA residual components are used to build the fault detection model. Second, the fault detection statistics of process observation data (test data) are calculated based on the model information as the fault detection information. According to whether the test data deviate from the fault detection model and the degree of deviation, we can judge whether the system operation state is abnormal. The detailed SPCA fault detection process is given in Section 2.2.

After the SPCA-based fault detection method has determined that a fault is present, the fault isolation stage begins. In this paper, an SPCA-based reconstruction contribution fault isolation algorithm is proposed. This method adopts the idea of reconstruction contribution proposed by Alcala et al. [44], which defines the reconstructed fault detection index along a variable direction as the contribution of that variable to fault isolation. The $SPE$ statistic of the SPCA algorithm from the fault detection stage is used as the stopping criterion of the iteration: once the $SPE$ of the reconstructed data falls below the threshold value, the faulty sensors can be determined from the reconstruction information. In addition, by iterating, the algorithm can adaptively determine the number of faults. Details are given in Section 2.3.

2.2. Fault Detection Stage

2.2.1. Serial Principal Component Analysis

SPCA is a hybrid linear and nonlinear statistical modeling method. It uses PCA and KPCA to form a serial model structure, which combines the linear and nonlinear characteristics of data.

The SPCA algorithm consists of two modeling processes. First, PCA is used to extract the linear feature information of the data, and the original data are divided into two subspaces: the principal component subspace $\hat{X}$ and the residual subspace $\tilde{X}$. Assume the original data matrix is $X \in R^{n \times m}$, where n is the number of samples, m is the number of variables, and n is far greater than m. The PCA decomposition is

$$X = \hat{X} + \tilde{X} = \sum_{i=1}^{k_{L}} t_{Li} p_{Li}^{T} + \tilde{X} \tag{1}$$

where $t_{Li}$ is the ith linear score vector, $p_{Li}$ is the ith linear loading vector, and $k_{L}$ is the number of principal components of the linear model. The loading vectors $p_{Li}$ are obtained by eigendecomposition of the covariance matrix of the original data $X$. For a test vector $x_{t} \in R^{m}$, the ith score is

$$t_{Li} = x_{t}^{T} p_{Li} \tag{2}$$

If the number of principal components retained for the test vector $x_{t}$ is $k_{L}$, its linear principal component score vector is $[t_{L1}, t_{L2}, \dots, t_{Lk_{L}}]^{T}$.

Therefore, the residual vector $\tilde{x}_{t}$ of the test vector $x_{t}$ is expressed as

$$\tilde{x}_{t} = x_{t} - \sum_{i=1}^{k_{L}} t_{Li} p_{Li} \tag{3}$$
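The linear stage of Equations (1)–(3) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and mean-centering of the data is an assumed preprocessing step that the text does not spell out.

```python
import numpy as np

def fit_linear_pca(X, k_l):
    """Fit the linear stage on normal data X (n samples x m variables)."""
    mean = X.mean(axis=0)
    Xc = X - mean                              # assumed preprocessing: centre each variable
    cov = Xc.T @ Xc / (X.shape[0] - 1)         # m x m sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k_l]    # keep the k_l largest
    return mean, eigvecs[:, order], eigvals[order]

def linear_scores_and_residual(x_t, mean, P):
    """Scores (Eq. 2) and residual vector (Eq. 3) of one test vector."""
    xc = x_t - mean
    t_l = P.T @ xc              # t_Li = x_t^T p_Li for every retained loading
    residual = xc - P @ t_l     # x_t minus its reconstruction from the scores
    return t_l, residual
```

The residual returned here is what the nonlinear stage would take as input; by construction it is orthogonal to every retained loading vector.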

In the second modeling step, KPCA is applied to the PCA residual space $\tilde{X}$ to extract the nonlinear features of the data. $\tilde{X}$ is mapped to a high-dimensional feature space $\Phi(\tilde{X}) \in R^{n \times F}$ through a nonlinear mapping $\Phi(\cdot)$, and the KPCA decomposition is

$$\Phi(\tilde{X}) = \sum_{i=1}^{k_{N}} t_{Ni} p_{Ni}^{T} + E \tag{4}$$

where $t_{Ni} \in R^{n}$ is the ith nonlinear score vector, $p_{Ni} \in F$ is the ith loading vector, $k_{N}$ is the number of nonlinear principal components, and the nonlinear principal component score matrix is $[t_{N1}, t_{N2}, \dots, t_{Nk_{N}}]^{T}$. $E \in R^{n \times F}$ is the KPCA residual matrix on the nonlinear residual subspace. To obtain the loading and score vectors of KPCA, the covariance matrix is eigendecomposed as

$$\frac{1}{n-1} \Phi^{T}(\tilde{X}) \Phi(\tilde{X}) p_{Ni} = \lambda_{Ni} p_{Ni} \tag{5}$$

where $\tilde{X} = [\tilde{x}_{1}, \tilde{x}_{2}, \dots, \tilde{x}_{n}]^{T}$ and $\lambda_{Ni}$ is the ith eigenvalue of $\frac{1}{n-1} \Phi^{T}(\tilde{X}) \Phi(\tilde{X})$. The corresponding eigenvector, the loading vector $p_{Ni}$, is expressed through the coefficient vector $\alpha_{i} = [\alpha_{i,1}, \alpha_{i,2}, \dots, \alpha_{i,n}]^{T}$ as

$$p_{Ni} = \sum_{j=1}^{n} \alpha_{i,j} \Phi(\tilde{x}_{j}) = \Phi^{T}(\tilde{X}) \alpha_{i} \tag{6}$$

Combining Equations (5) and (6), we obtain

$$(n-1) \lambda_{Ni} \alpha_{i} = K \alpha_{i} \tag{7}$$

where $K \in R^{n \times n}$ is the kernel matrix with elements $k_{i,j} = ker(\tilde{x}_{i}, \tilde{x}_{j}) = \Phi^{T}(\tilde{x}_{i}) \Phi(\tilde{x}_{j})$. By Equation (7), $(n-1)\lambda_{Ni}$ and $\alpha_{i}$ are the eigenvalues and eigenvectors of the kernel matrix $K$.

For the residual vector $\tilde{x}_{t}$ of the test vector, its ith nonlinear score is obtained by projecting $\Phi(\tilde{x}_{t})$ onto the nonlinear loading vector:

$$t_{Ni} = \Phi^{T}(\tilde{x}_{t}) p_{Ni} = \sum_{j=1}^{n} \alpha_{i,j} \Phi^{T}(\tilde{x}_{j}) \Phi(\tilde{x}_{t}) = K^{T}(x_{t}) \alpha_{i} \tag{8}$$

where $K(x_{t})$ is the test kernel vector, whose jth element is $[K(x_{t})]_{j} = ker(\tilde{x}_{t}, \tilde{x}_{j})$.

The rest of the detailed PCA and KPCA derivation process is shown in the literature [48,49].
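As a rough illustration of the nonlinear stage, the sketch below solves the eigenproblem of Equation (7) on the kernel matrix of the PCA residuals and evaluates the test scores of Equation (8). It rests on several assumptions that are not taken from the text: a Gaussian kernel of the form exp(-||a - b||² / c), no kernel centering at this point, and the common convention of scaling each $\alpha_{i}$ so that the loading vector $p_{Ni}$ has unit norm. All function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, c):
    """k(a, b) = exp(-||a - b||^2 / c) for every pair of rows of A and B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / c)    # clip tiny negative round-off

def fit_kpca(R, c, k_n):
    """R: n x m matrix of PCA residuals computed from normal data."""
    K = gaussian_kernel(R, R, c)
    eigvals, alphas = np.linalg.eigh(K)        # ascending eigenvalues of K
    order = np.argsort(eigvals)[::-1][:k_n]    # keep the k_n largest
    eigvals, alphas = eigvals[order], alphas[:, order]
    # Assumed convention: scale alpha_i so that alpha_i^T K alpha_i = 1,
    # i.e. the loading p_Ni = Phi^T alpha_i has unit norm.
    alphas = alphas / np.sqrt(eigvals)
    lam = eigvals / (R.shape[0] - 1)           # Eq. (7): K alpha = (n - 1) lambda alpha
    return alphas, lam

def nonlinear_scores(r_t, R, alphas, c):
    """Eq. (8): t_Ni = K(x_t)^T alpha_i for one test residual r_t."""
    k_t = gaussian_kernel(r_t[None, :], R, c)[0]   # test kernel vector
    return alphas.T @ k_t
```

Working with the n × n kernel matrix avoids ever forming the mapping $\Phi$ explicitly, which is the point of Equations (6) and (7).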

2.2.2. Fault Detection Based on SPCA

To use the SPCA algorithm for fault detection of gas sensor array process observation data, two statistics are constructed: the joint $T^{2}$ statistic [47] and the $SPE$ statistic [50]:

$$T^{2} = t_{SPCA}^{T} \Gamma^{-1} t_{SPCA} \tag{9}$$

$$SPE = \sum_{j=1}^{n} (t_{Nj})^{2} - \sum_{j=1}^{k_{N}} (t_{Nj})^{2} \tag{10}$$

where $t_{SPCA} = [t_{L1}, t_{L2}, \dots, t_{Lk_{L}}, t_{N1}, t_{N2}, \dots, t_{Nk_{N}}]^{T}$ contains the linear and nonlinear principal components, the PCA principal component vector is $t_{PCA} = [t_{L1}, t_{L2}, \dots, t_{Lk_{L}}]^{T}$, and the KPCA principal component vector is $t_{KPCA} = [t_{N1}, t_{N2}, \dots, t_{Nk_{N}}]^{T}$. $\Gamma$ is the matrix composed of the eigenvalues corresponding to the principal component score vector.
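A small worked example may help make Equations (9) and (10) concrete. The sketch assumes that $\Gamma$ is diagonal, so that the joint $T^{2}$ reduces to a sum of squared scores divided by their eigenvalues; the function names are hypothetical.

```python
def joint_t2(scores, eigenvalues):
    """Eq. (9) with Gamma assumed diagonal: sum of t_i^2 / lambda_i."""
    if len(scores) != len(eigenvalues):
        raise ValueError("one eigenvalue is needed per score")
    return sum(t * t / lam for t, lam in zip(scores, eigenvalues))

def spe(all_nonlinear_scores, k_n):
    """Eq. (10): energy of all n nonlinear scores minus that of the first k_n."""
    total = sum(t * t for t in all_nonlinear_scores)
    retained = sum(t * t for t in all_nonlinear_scores[:k_n])
    return total - retained

# Two retained scores 2.0 and 1.0 with eigenvalues 4.0 and 0.5:
# T^2 = 4/4 + 1/0.5 = 3.0
print(joint_t2([2.0, 1.0], [4.0, 0.5]))
# Three nonlinear scores, one retained: SPE = (9 + 4 + 1) - 9 = 5.0
print(spe([3.0, 2.0, 1.0], 1))
```

In the joint statistic the `scores` list would hold the linear scores followed by the nonlinear ones, with their eigenvalues in the same order.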

The statistics of the process observation data are used to determine whether the gas sensor array is currently faulty. Therefore, the historical data of normal conditions must be used to calculate the control limits of the statistics. The control limits of existing PCA and KPCA methods are usually calculated from the F distribution and the weighted $\chi^{2}$ distribution. However, for complex industrial processes it is difficult to ensure that the process data conform to specific distribution assumptions. Therefore, the kernel density estimation (KDE) method is used to calculate the control limits of the two statistics [51].

The KDE method is a nonparametric method for estimating the probability density function. It performs particularly well for univariate random processes, and the confidence interval obtained by the KDE method is closer to the real data. Given samples $x_{1}, \dots, x_{n}$ of a random variable x with probability density function f, the kernel density estimate is

$$\hat{f}_{h}(x) = \frac{1}{n} \sum_{i=1}^{n} K_{h}(x - x_{i}) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_{i}}{h}\right) \tag{11}$$

where $K(\cdot)$ is the kernel function, $h > 0$ is a smoothing parameter called the bandwidth, and $K_{h}(x) = \frac{1}{h} K(x/h)$ is the scaled kernel function. The bandwidth is determined as described in [52]. Since a probability is obtained by integrating the probability density function over a continuous range, when the significance level is $\alpha$, the parameter c is determined by

$$P(x < c) = \int_{-\infty}^{c} f(x)\,dx = \alpha \tag{12}$$

The KDE method is used to obtain the control limits of the joint $T^{2}$ statistic and the $SPE$ statistic. When a statistic of the process observation data exceeds its control limit, the gas sensor array is judged to be faulty. The detailed flowchart of fault detection is given in Figure 2.
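The control limit of Equations (11) and (12) can be computed numerically as sketched below. The sketch makes two assumptions that do not come from the text: a Gaussian kernel, whose integral has a closed form through the error function, and Silverman's rule of thumb as a stand-in for the bandwidth rule of [52]. The parameter `level` is the value on the right-hand side of Equation (12), and all names are hypothetical.

```python
import math
import statistics

def kde_cdf(c, samples, h):
    """Integral from -inf to c of the Gaussian-kernel estimate of Eq. (11)."""
    return sum(
        0.5 * (1.0 + math.erf((c - x) / (h * math.sqrt(2.0)))) for x in samples
    ) / len(samples)

def kde_control_limit(samples, level, h=None, tol=1e-9):
    """Solve P(x < c) = level for c by bisection (Eq. 12)."""
    if h is None:
        # Assumption: Silverman's rule of thumb replaces the rule of [52].
        h = 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)
    lo = min(samples) - 10.0 * h   # the estimated CDF is essentially 0 here
    hi = max(samples) + 10.0 * h   # and essentially 1 here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A typical use would pass the $T^{2}$ or $SPE$ values computed on normal data as `samples` and compare each new statistic with the returned limit.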

2.3. Fault Isolation Stage

Assume that the gas sensor array is composed of m gas sensors and that p gas sensors are faulty at time t ($0 < p < m$). At this time, the sensor array output is $x(t)$. According to the principle of fault reconstruction, the signal $z_{i}$ after reconstruction is expressed as

$$z_{i} = x(t) - \xi_{i}^{T} f_{i} \tag{13}$$

where $\xi_{i}$ represents the fault direction, and $f_{i}$ is the amplitude of the fault signal along the fault direction $\xi_{i}$. That is, the reconstructed signal is obtained by subtracting from the original signal the fault direction multiplied by the fault amplitude in that direction.

By definition, the $SPE$ statistic represents the distance between the original signal $x(t)$ and the estimated signal $\hat{x}(t)$. The smaller the $SPE$ statistic, the better the consistency between the estimated signal and the original signal. Therefore, the $SPE$ statistic is used as an indicator of the consistency between the reconstructed fault data and the data collected under normal system conditions. After reconstruction in the correct fault direction, the $SPE$ value of the reconstructed data should be minimal. Therefore, the main purpose of the SPCA-based reconstruction contribution fault isolation algorithm proposed in this paper is to find the $p, \Xi, F$ that solve

$$\arg \min_{p, \Xi, F} SPE(z_{i}(t)) \tag{14}$$

where p is the number of faulty sensors, $\Xi = [\xi_{1}, \xi_{2}, \dots, \xi_{p}]$ is the fault direction set, and $F = [f_{1}, f_{2}, \dots, f_{p}]$ is the fault amplitude set corresponding to the fault direction set $\Xi$.

Assuming the parameter p is known, fault isolation reduces to finding all fault directions and calculating the amplitude $f_{i}$ along each fault direction $\xi_{i}$. To obtain the reconstructed data with the minimum $SPE$ statistic, the first-order partial derivative of $SPE(z_{i})$ with respect to the amplitude $f_{i}$ along a given direction $\xi_{i}$ is set to zero. The derivation is as follows.

First, in the SPCA algorithm the $SPE$ statistic is obtained by taking the residual of the PCA algorithm as the input of the KPCA algorithm. Therefore, the expression of the reconstructed data is updated as

$$z_{i} = \tilde{x}(t) - \xi_{i}^{T} f_{i} \tag{15}$$

where $\tilde{x}(t)$ is the projection of the original data $x(t)$ on the PCA residual subspace, that is, $\tilde{x}(t) = x(t) - \hat{x}(t)$. $\hat{x}(t) = x(t) P P^{T}$ is the projection of the original data $x(t)$ on the PCA principal component subspace, and $P$ is the principal component loading matrix obtained in the PCA process.

The $SPE$ statistic is defined as the squared norm of the residual vector in the feature space, so the $SPE$ statistic of the SPCA algorithm can be rewritten as

$$SPE = \sum_{j=1}^{n} (t_{Nj})^{2} - \sum_{j=1}^{k_{N}} (t_{Nj})^{2} = \sum_{j=k_{N}+1}^{n} (t_{Nj})^{2} = \tilde{t}_{k_{N}} \tilde{t}_{k_{N}}^{T} \tag{16}$$

$$\tilde{t}_{k_{N}} = \tilde{P}_{f}^{T} \phi = [\upsilon_{k_{N}+1}, \upsilon_{k_{N}+2}, \dots, \upsilon_{n}]^{T} \phi = K(x_{t})^{T} \tilde{\alpha} \tag{17}$$

where $\tilde{t}_{k_{N}}$ is the nonlinear residual score vector, $\tilde{P}_{f}$ is the nonlinear residual loading matrix, and $\phi$ is the image of the PCA residual in the KPCA high-dimensional feature space. Since this mapping into the high-dimensional feature space is expressed only implicitly, the kernel function is introduced. In Equation (17), $\tilde{\alpha}$ contains the residual eigenvectors of the kernel matrix, and $K(x_{t})$ is the kernel vector between the normal data set $X$ and the current test data $x(t)$. The $SPE$ expression can be converted to

$$Index = SPE = \phi^{T} \tilde{P}_{f} \tilde{P}_{f}^{T} \phi = K(x_{t})^{T} \tilde{\alpha} \tilde{\alpha}^{T} K(x_{t}) \tag{18}$$

If the normal data are zero-mean, the $SPE$ of the reconstructed data is given by Equation (18). If not, the $SPE$ expression of the reconstructed data is further updated to

$$Index = \tilde{\bar{K}}(z_{i})^{T} \tilde{\alpha} \tilde{\alpha}^{T} \tilde{\bar{K}}(z_{i}) \tag{19}$$

where $K(z_{i})$ is the kernel vector between the normal data and the reconstructed test data, $\bar{K}$ denotes the kernel after centering, and $\tilde{K}$ denotes the kernel after normalization. The two are given by

$$\bar{K}(z_{i}) = K(z_{i}) - K I_{n} - E K(z_{i}) + E K I_{n} \tag{20}$$

$$\tilde{K}(z_{i}) = K(z_{i}) (n - 1) / trace(K) \tag{21}$$

Here, $E$ is the $n \times n$ matrix whose elements are all $1/n$, and $I_{n}$ is the $n \times 1$ vector whose elements are all $1/n$. Let $Tr = (n - 1) / trace(\bar{K})$, where $K = \Phi \Phi^{T}$ is the kernel matrix of the training data and $\Phi$ is the mapped training data. To obtain the amplitude of the reconstructed data, the derivative of $SPE$ with respect to $f_{i}$ is calculated as

$$\frac{\partial Index}{\partial f_{i}} = 2 \tilde{\bar{K}}(z_{i})^{T} \tilde{\alpha} \tilde{\alpha}^{T} \frac{\partial \tilde{\bar{K}}(z_{i})}{\partial f_{i}} \tag{22}$$

Further,

$$\frac{\partial \tilde{\bar{K}}(z_{i})}{\partial f_{i}} = Tr (I - E) \frac{\partial K(z_{i})}{\partial f_{i}} \tag{23}$$

By the chain rule, the derivative of $K(z_{i})$ with respect to $f_{i}$ can be written as

$$\frac{\partial K(z_{i})}{\partial f_{i}} = \frac{\partial K(z_{i})}{\partial z_{i}} \frac{\partial z_{i}}{\partial f_{i}} \tag{24}$$

Since $K(z_{i}) = [k(z_{i}, x_{1}), k(z_{i}, x_{2}), \dots, k(z_{i}, x_{n})]^{T}$ and the kernel function is the Gaussian kernel, the derivative of each kernel element with respect to $z_{i}$ is

$$\frac{\partial}{\partial z_{i}} k(z_{i}, x_{j}) = -2 k(z_{i}, x_{j}) \frac{(z_{i} - x_{j})^{T}}{c} \tag{25}$$

where c is the bandwidth of the kernel. Furthermore, from Equation (13), we obtain

$$\frac{\partial z_{i}}{\partial f_{i}} = -\xi_{i} \tag{26}$$
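The chain rule of Equations (24)–(26) can be checked numerically for a single kernel element. The sketch below assumes a Gaussian kernel of the form exp(-||z - x_j||² / c) and a reconstruction z = x - ξ f; under these assumptions the derivative of one kernel element with respect to the amplitude is (2/c) k(z, x_j) (z - x_j)ᵀ ξ, which is compared against a central finite difference. The function names and the numbers in the usage lines are illustrative only.

```python
import math

def gaussian_kernel(a, b, c):
    """k(a, b) = exp(-||a - b||^2 / c)."""
    return math.exp(-sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / c)

def reconstruct(x, xi, f):
    """z = x - xi * f: remove amplitude f along fault direction xi."""
    return [xv - d * f for xv, d in zip(x, xi)]

def dk_df(x, xi, f, x_j, c):
    """Analytic derivative: (dk/dz)(dz/df) = (2/c) k(z, x_j) (z - x_j)^T xi."""
    z = reconstruct(x, xi, f)
    proj = sum((zv - xv) * d for zv, xv, d in zip(z, x_j, xi))
    return 2.0 / c * gaussian_kernel(z, x_j, c) * proj

def dk_df_numeric(x, xi, f, x_j, c, eps=1e-6):
    """Central finite difference of k(z(f), x_j) with respect to f."""
    k_plus = gaussian_kernel(reconstruct(x, xi, f + eps), x_j, c)
    k_minus = gaussian_kernel(reconstruct(x, xi, f - eps), x_j, c)
    return (k_plus - k_minus) / (2.0 * eps)

# Fault on the second of three sensors; the two values should agree closely.
x, xi, x_j = [1.0, 2.0, 0.5], [0.0, 1.0, 0.0], [0.8, 1.5, 0.4]
print(dk_df(x, xi, 0.3, x_j, 2.0), dk_df_numeric(x, xi, 0.3, x_j, 2.0))
```

The two minus signs, one from the Gaussian kernel and one from Equation (26), cancel, which is why the stacked derivative vector carries a positive factor 2/c.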

Therefore, the derivative vector of $K(z_{i})$ with respect to $f_{i}$ is expressed as

$$\begin{aligned} \frac{\partial K(z_{i})}{\partial f_{i}} &= \frac{2}{c} \begin{bmatrix} k(z_{i}, x_{1}) (z_{i} - x_{1})^{T} \\ k(z_{i}, x_{2}) (z_{i} - x_{2})^{T} \\ \vdots \\ k(z_{i}, x_{n}) (z_{i} - x_{n})^{T} \end{bmatrix} \xi_{i} \\ &= \frac{2}{c} \begin{bmatrix} k(z_{i}, x_{1}) (\tilde{x}(t) - x_{1})^{T} - k(z_{i}, x_{1}) f_{i} \xi_{i}^{T} \\ k(z_{i}, x_{2}) (\tilde{x}(t) - x_{2})^{T} - k(z_{i}, x_{2}) f_{i} \xi_{i}^{T} \\ \vdots \\ k(z_{i}, x_{n}) (\tilde{x}(t) - x_{n})^{T} - k(z_{i}, x_{n}) f_{i} \xi_{i}^{T} \end{bmatrix} \xi_{i} \\ &= \frac{2}{c} [B \xi_{i} - K(z_{i}) f_{i}] \end{aligned} \tag{27}$$

where the second line substitutes $z_{i} = \tilde{x}(t) - \xi_{i} f_{i}$ from Equation (15), the third line uses $\xi_{i}^{T} \xi_{i} = 1$, and $B$ is

$$B = \begin{bmatrix} k(z_{i}, x_{1}) (\tilde{x}(t) - x_{1})^{T} \\ k(z_{i}, x_{2}) (\tilde{x}(t) - x_{2})^{T} \\ \vdots \\ k(z_{i}, x_{n}) (\tilde{x}(t) - x_{n})^{T} \end{bmatrix} \tag{28}$$

Finally, combining Equations (22), (23) and (27), the derivative of $SPE$ with respect to the amplitude $f_{i}$ is

$$\frac{\partial Index}{\partial f_{i}} = \frac{4\,Tr}{c} \tilde{\bar{K}}(z_{i})^{T} \tilde{\alpha} \tilde{\alpha}^{T} (I - E) [B \xi_{i} - K(z_{i}) f_{i}] \tag{29}$$

Setting the derivative to zero gives

$$f_{i} = \frac{\tilde{\bar{K}}(z_{i})^{T} \tilde{\alpha} \tilde{\alpha}^{T} (I - E) B \xi_{i}}{\tilde{\bar{K}}(z_{i})^{T} \tilde{\alpha} \tilde{\alpha}^{T} (I - E) K(z_{i})} \tag{30}$$

Equation (30) is an implicit expression for the amplitude $f_{i}$, because $z_{i}$ on the right-hand side itself depends on $f_{i}$. It is therefore iterated until $f_{i}$ converges, which determines the amplitude.
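The fixed-point iteration on Equation (30) could look roughly like the sketch below. It is deliberately simplified and should be read as an illustration only: the centering term (I − E) and the scaling are dropped (zero-mean assumption), the matrix `M` stands in for $\tilde{\alpha} \tilde{\alpha}^{T}$, the kernel is assumed to be exp(-||z - x_j||² / c), and $B$ is assumed to have rows k(z, x_j)(x − x_j)ᵀ built from the vector before reconstruction, which is the form that makes Bξ − K(z)f consistent with z = x − ξf for a unit-norm ξ. The function name and all parameters are hypothetical.

```python
import numpy as np

def fault_amplitude(x, xi, X_train, M, c, f0=0.0, tol=1e-10, max_iter=1000):
    """Iterate f <- (k^T M B xi) / (k^T M k), a simplified form of Eq. (30).

    x       : residual-space test vector, shape (m,)
    xi      : unit-norm fault direction, shape (m,)
    X_train : normal residual samples, shape (n, m)
    M       : n x n matrix standing in for alpha_tilde @ alpha_tilde.T
    c       : Gaussian kernel width
    """
    f = f0
    diff_proj = (x - X_train) @ xi            # (x - x_j)^T xi for every j
    for _ in range(max_iter):
        z = x - xi * f                        # reconstruction with current f
        k = np.exp(-((z - X_train) ** 2).sum(axis=1) / c)   # kernel vector K(z)
        b_xi = k * diff_proj                  # B xi
        w = k @ M
        denom = w @ k
        if denom == 0.0:                      # degenerate case: nothing to update
            break
        f_new = (w @ b_xi) / denom
        if abs(f_new - f) < tol:              # amplitude has converged
            return float(f_new)
        f = f_new
    return float(f)
```

Equation (30) only characterizes a stationary point, so the $SPE$ of the reconstructed data still has to be compared with the threshold afterwards, as the isolation procedure does.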

From Equation (13), the reconstructed data are obtained by subtracting from the original data the fault direction multiplied by the corresponding amplitude. All candidate fault direction sets can be enumerated with a permutation function, and the corresponding amplitudes of all fault directions are then calculated according to the above derivation. To find the correct fault direction and corresponding amplitude, define

$$\Omega(\Xi) = SPE(x(t)) - SPE(z_{i})_{\Xi} \tag{31}$$

where $SPE(z_{i})_{\Xi}$ is the $SPE$ statistic of the data reconstructed along the fault direction set $\Xi$. The closer the reconstructed data are to the normal data, the smaller their $SPE$ statistic and the larger the corresponding contribution $\Omega(\Xi)$.

Whether the SPCA-based reconstruction contribution isolation algorithm successfully achieves fault isolation depends on two conditions. First, the fault direction with the maximum contribution after reconstruction must be found. Second, after reconstruction along the correct fault direction, the $SPE$ statistic of the reconstructed data must fall below the threshold set for that statistic.

The above fault isolation algorithm can determine the most likely fault direction using the reconstruction idea when the number of faulty sensors is known. However, in the fault isolation of gas sensor arrays, the number of faulty sensors is often unknown. To determine the number of faults, an iterative method is used. If the reconstruction is performed accurately (that is, with the exact number of faults and the correct set of fault directions), the reconstructed data and the output signal of the sensor array in the normal state should be very similar, and the $SPE$ statistic of the correctly reconstructed data should be close to the $SPE$ statistic of the output signal in the normal state. If the number of faulty sensors is unknown, an initial number of faulty sensors is preset and then increased by one at a time, repeating the above process until the conditions are met. The algorithm flowchart is shown in Figure 3, and the pseudocode is given in Algorithm 1.

Algorithm 1: Gas sensor array multifault isolation algorithm pseudocode.

Input: Output $x(t)$ of the gas sensor array at time $t$; threshold $δ$; total number of gas sensors in the gas sensor array $p_{max}$; PCA principal component loading matrix $P$.
Output: Estimated number of faulty sensors $\hat{p}$, fault direction set $\hat{Ξ}$, and fault amplitude set $\hat{F}$.
Initialization: $\tilde{x}(t) = x(t)(I - P P^{T})$, set $\hat{p} = 1$.
for $\hat{p} = 1 : p_{max}$ do
    Use the permutation function to generate the $Q$ fault direction sets $[Ξ_{1}, Ξ_{2}, \dots, Ξ_{Q}]$;
    for $i = 1 : Q$ do
        Using Equation (30), calculate the fault amplitude set $F_{i}$ corresponding to fault direction set $Ξ_{i}$;
    end
    Using Equation (31), set the fault direction set $\hat{Ξ} = Ξ_{i}$ and the fault amplitude set $\hat{F} = F_{i}$;
    According to Equation (13), update $\tilde{x}(t)$ with fault direction set $\hat{Ξ}$ and fault amplitude set $\hat{F}$;
    Calculate the SPE statistic of the reconstructed data, $SPE(z_{i})$;
    if $SPE(z_{i}) < δ$ then
        Stop iteration.
    end
end
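The iteration in Algorithm 1 can be illustrated with a minimal sketch. Equations (13), (30) and (31) are not restated here, so the code assumes the standard linear-PCA reconstruction-based contribution: with residual projector C = I − PPᵀ and a candidate set of sensor directions Ξ, the amplitudes are f = (ΞᵀCΞ)⁻¹ΞᵀCx and the contribution is (ΞᵀCx)ᵀf. This is an assumption and may differ in detail from the SPCA formulation of this paper. The helper names (`matvec`, `solve`, `spe`, `isolate`) and the five-sensor toy model are hypothetical, and candidate sets are enumerated as unordered combinations of sensors.

```python
from itertools import combinations

def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def solve(A, b):
    """Solve A f = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(A[r]) + [b[r]] for r in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                k = M[r][c] / M[c][c]
                M[r] = [a - k * p for a, p in zip(M[r], M[c])]
    return [M[r][n] / M[r][r] for r in range(n)]

def spe(C, x):
    """Squared prediction error: squared norm of the residual C x."""
    return sum(v * v for v in matvec(C, x))

def isolate(x, C, delta, p_max):
    """Return (faulty sensor indices, amplitudes), or None if no set passes."""
    r = matvec(C, x)
    for p_hat in range(1, p_max + 1):            # assumed number of faults
        best = None
        for S in combinations(range(len(x)), p_hat):
            A = [[C[i][j] for j in S] for i in S]        # Xi^T C Xi
            b = [r[i] for i in S]                        # Xi^T C x
            f = solve(A, b)                              # estimated amplitudes
            rbc = sum(bi * fi for bi, fi in zip(b, f))   # contribution
            if best is None or rbc > best[0]:
                best = (rbc, S, f)
        _, S, f = best
        z = list(x)
        for i, fi in zip(S, f):
            z[i] -= fi                                   # reconstructed sample
        if spe(C, z) < delta:                            # termination test
            return S, f
    return None

# Toy model (assumption): 5 sensors sharing one principal direction (1,...,1)/sqrt(5).
m = 5
C = [[(1.0 if i == j else 0.0) - 1.0 / m for j in range(m)] for i in range(m)]
x = [2.0, 2.5, 2.0, 1.7, 2.0]   # biases of +0.5 and -0.3 on sensors 1 and 3 (0-based)
sensors, amplitudes = isolate(x, C, delta=1e-6, p_max=3)
print(sensors, [round(a, 3) for a in amplitudes])
```

In this toy case the single-fault pass leaves a residual above the threshold, so the loop moves on to two faults, where reconstructing sensors 1 and 3 removes the residual. The exhaustive search grows combinatorially with the number of assumed faults, which is acceptable for a 20-sensor array with up to three faults.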

3. Experiment and Results

3.1. Experimental Setup

To verify the effectiveness of the proposed SPCA-based fault detection and isolation method for gas sensor arrays, this paper uses a dataset acquired experimentally from a MOS gas sensor array. A sensor array consisting of 20 FIGARO commercial MOS gas sensors was used as the research object, and a data acquisition system was established, as shown in Figure 4. The experimental system consisted of a gas supply device, a gas sensor array, a data acquisition board, a power supply device and a computer. The gas sensor array was populated with four sensors of each of five models (TGS2600, TGS2602, TGS2610, TGS2611 and TGS2620), thus forming a 4 × 5 array. Different sensor models have different sensitivities to gases. The sensor array was placed in a sealed plexiglass gas chamber. Because the response characteristics of the sensor array are easily affected by temperature and humidity, the experimental conditions were set to a temperature of 15 °C and a relative humidity of 20%. An NI USB-6251 data acquisition board was used for data acquisition, a DH1718E-5 DC source powered the data acquisition board, and the sampling frequency was set to 100 Hz.

This paper mainly studies the reliability of the machine olfactory system in a methane gas environment. The focus is on monitoring the working state of the gas sensor array once the methane in the environment has stabilized and the sensor output signals have reached a steady level. The fault diagnosis method is used to judge the reliability of the MOS gas sensor array during use and to obtain the fault information needed for subsequent maintenance decisions. Therefore, methane was selected as the sample gas and injected into the chamber through the ejector. After gas injection, the system waited for the internal gas state and the output signal of the gas sensor array to stabilize. While the gas sensor array was working normally, sufficient experimental samples were collected to serve as the dataset for algorithm verification. Figure 5 shows the response output of the MOS gas sensor array after the methane gas reaches a steady state. The abscissa is the sampling point (1785 in total), and the ordinate is the voltage output by the sensor array. The figure shows that the steady-state output voltages of different sensor types in response to methane differ; moreover, even sensors of the same type produce slightly different outputs because of differences in manufacturing. Therefore, the sensor array can eliminate the influence of errors caused by the manufacturing process.

Based on the working principle of the MOS gas sensor, the causes of sensor failure during use include sensitivity changes caused by irreversible chemical reactions of the gas-sensitive materials over long working hours, as well as electrical failure, unstable power supply, or external influence from other equipment related to the design of the sensor structure. Different fault mechanisms may produce the same form of faulty output signal [23]. Typical fault types include bias fault, impact fault, constant output, power failure, etc. [53].

The fault characteristics of the gas sensor array output signal differ with the degree of the fault. When the fault amplitude is small, the corresponding fault features are not obvious, which makes fault detection difficult. However, serious faults usually develop from micro-faults, so it is of great significance to detect the micro-faults of MOS gas sensor arrays in a timely and effective manner. Among the typical fault types, impact fault, bias fault and constant output fault differ markedly in amplitude under different fault degrees. When the fault degree is small, the offset amplitude of the corresponding fault signal is also small, and such a fault can be considered a micro-fault. Micro-faults are defined relative to significant faults: micro-faults are usually defined as faults whose amplitude is approximately 0.1–1% of the mean amplitude of normal signals, whereas the amplitude of significant faults is approximately 1–10% of the mean amplitude of normal signals.

This paper focuses on the fault diagnosis method for abnormal output signals of the gas sensor array caused by gas sensor failure. Because fault sample data caused by real sensor damage are not easy to obtain, the fault injection method is used: abnormal data are superimposed on the normal data to obtain fault data that simulate fault samples. Specifically, a hardware fault simulation circuit generates the various fault signals, a signal superposition circuit superimposes them on the normal signal, and the data acquisition system then acquires the resulting fault signal to simulate a real fault. For details of the fault superposition approach, see [54,55,56].
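As a purely software analogue of the superposition idea (the experiments use a hardware fault simulation circuit), the following sketch adds a bias fault to a recorded trace and classifies a fault amplitude using the percentage ranges quoted above. The function names, the constant 2.0 V trace and the chosen fault window are hypothetical, and assigning exactly 1% to the micro-fault class is an assumption, since the two quoted ranges meet at that value.

```python
def fault_degree(percent):
    """Classify a fault amplitude given as a percentage of the normal-signal mean."""
    if 0.1 <= percent <= 1.0:        # 1% itself is counted as a micro-fault (assumption)
        return "micro-fault"
    if 1.0 < percent <= 10.0:
        return "significant fault"
    return "outside the stated ranges"

def inject_bias(signal, start, end, percent):
    """Add a constant offset of `percent` % of the signal mean to
    samples start..end (1-based, inclusive) and return a new list."""
    offset = sum(signal) / len(signal) * percent / 100.0
    return [v + offset if start <= k <= end else v
            for k, v in enumerate(signal, start=1)]

normal = [2.0] * 500                         # hypothetical steady-state trace
faulty = inject_bias(normal, 201, 500, 0.5)  # 0.5 % bias from sample 201 onward
print(fault_degree(0.5), faulty[199], faulty[200])
```

An impact fault can be imitated with the same function by choosing a window of only a few samples.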

3.2. Fault Detection

In order to verify the robustness of the SPCA algorithm-based fault detection method and its effectiveness in detecting micro-faults and multiple-faults in MOS gas sensor arrays, two experiments are conducted in this subsection. One is the algorithm robustness evaluation simulation experiment, and the other is the fault detection experiment for the sensor array in the machine olfactory system built in this paper.

In the algorithm robustness evaluation simulation experiment, the ROC curves of the fault detection algorithms based on PCA, KPCA and SPCA are drawn and the corresponding AUC values are calculated in order to compare their detection performance. The ROC curve is used to analyze the credibility of hypothesis testing results, and is used here to compare the detection performance of the different fault detection algorithms. The area under the curve (AUC) evaluates the accuracy of the detection. For further details on ROC curves, see [57]. To make the results more convincing, the output variables are constructed and calculated using a simulated nonlinear system [47], which is described as follows:

\begin{cases}
x_{1} = u_{1} + e_{1} \\
x_{2} = u_{2} + e_{2} \\
x_{3} = 2 u_{1} + 3 u_{2} + e_{3} \\
x_{4} = 5 u_{1} - 2 u_{2} + e_{4} \\
x_{5} = u_{1}^{2} - 3 u_{2} + e_{5} \\
x_{6} = - u_{1}^{3} + 3 u_{2}^{2} + e_{6}
\end{cases}

(32)

where u₁ and u₂ are independent variables uniformly distributed in [−1, 1], while e₁–e₆ are independent noise variables obeying the normal distribution with zero mean and variance 0.01.
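A minimal sketch of sampling the system in Equation (32) follows. The function name `simulate`, the fixed seed and the use of Python's `random` module are choices made for this illustration only.

```python
import random

def simulate(n, seed=0, noise_var=0.01):
    """Draw n samples [x1, ..., x6] from the system in Equation (32)."""
    rng = random.Random(seed)
    sd = noise_var ** 0.5                 # standard deviation of e1..e6
    data = []
    for _ in range(n):
        u1 = rng.uniform(-1.0, 1.0)
        u2 = rng.uniform(-1.0, 1.0)
        e = [rng.gauss(0.0, sd) for _ in range(6)]
        data.append([
            u1 + e[0],
            u2 + e[1],
            2 * u1 + 3 * u2 + e[2],
            5 * u1 - 2 * u2 + e[3],
            u1 ** 2 - 3 * u2 + e[4],
            -u1 ** 3 + 3 * u2 ** 2 + e[5],
        ])
    return data

samples = simulate(1000)
print(len(samples), len(samples[0]))
```

Setting `noise_var=0` gives a quick sanity check, because x₃ then equals 2x₁ + 3x₂ exactly.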

A normal operation dataset consisting of 1500 samples is simulated based on the model in Equation (32). Of these, 1000 samples are used to build the fault detection models and calculate the control limits of the statistics under normal conditions. The other 500 samples are used as test data, in which the variable x₃ has a step change of +1.5 after the 300th sample. Both KPCA and SPCA use the Gaussian kernel function with c = 50. The PCA and KPCA algorithms use the cumulative contribution rate method to determine the number of principal components, with the cumulative contribution rate set to 90%. Because the control limits of the statistics are obtained by the KDE method, the confidence level of the control limit is treated as a variable and varied in steps of 0.05 over the range (0.5, 1), which yields a set of control limits. Calculating the true positive rate and false positive rate under each control limit gives the ROC curve, from which the AUC value is computed.
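The threshold sweep can be sketched as follows. The control limits are passed in as plain numbers rather than derived from KDE, the end points (0, 0) and (1, 1) are added so that the trapezoidal area is well defined, and the function names and toy values are hypothetical.

```python
def roc_points(stats, is_fault, limits):
    """Return sorted (FPR, TPR) pairs, one per control limit, plus the end points."""
    pos = [s for s, f in zip(stats, is_fault) if f]        # faulty samples
    neg = [s for s, f in zip(stats, is_fault) if not f]    # normal samples
    points = {(0.0, 0.0), (1.0, 1.0)}
    for limit in limits:
        tpr = sum(s > limit for s in pos) / len(pos)
        fpr = sum(s > limit for s in neg) / len(neg)
        points.add((fpr, tpr))
    return sorted(points)

def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

stats = [1.0, 2.0, 3.0, 4.0]               # toy statistic values
is_fault = [False, True, False, True]
pts = roc_points(stats, is_fault, [0.5, 1.5, 2.5, 3.5, 4.5])
print(pts, auc(pts))
```

With only a few control limits the curve is coarse, so the resulting AUC is an approximation whose resolution depends on how finely the confidence level is stepped.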

The ROC curves of the T² and SPE statistics of the three fault detection methods are shown in Figure 6, Figure 7 and Figure 8, and Table 1 lists the corresponding AUC values. The experimental results show that the AUC values of the SPCA fault detection method are significantly better than those of the other two methods; in particular, the AUC value of the SPE statistic reaches 0.9911. The ROC curve reflects the overall performance of a fault detection algorithm. Therefore, the fault detection algorithm based on SPCA has better robustness, that is, it has better fault detection performance than the PCA algorithm or the KPCA algorithm alone.

In the second experiment, to verify the detection performance of the SPCA algorithm for micro-faults and multiple-faults in the MOS gas sensor array, the experimental data collected by the data acquisition system built in this paper are analyzed. The training set is composed of 1000 sample points obtained by continuous sampling over a period of time, and the test set is 500 continuous sampling points. One set of experimental samples is taken as an example. Sensors No. 2 and No. 5 are randomly selected as the faulty sensors, and an impact fault and a bias fault are superimposed on them, respectively. The multiple fault signals of the MOS gas sensor array are shown in Figure 9. The impact fault on sensor No. 2 starts at the 165th sampling point and lasts for 5 sampling points, with a fault amplitude of 0.95% of the normal data mean of sensor No. 2. The bias fault on sensor No. 5 starts at the 215th sampling point and continues to the 500th, with a fault amplitude of 0.23% of the normal data mean of sensor No. 5. The partially enlarged view in Figure 9 shows the superimposed effect on sensors No. 2 and No. 5: the impact fault and the bias fault appear at the corresponding positions in the test data set. The algorithm parameters were set as follows. KPCA and SPCA used the Gaussian kernel function with a bandwidth of 50. The PCA and KPCA algorithms used the cumulative contribution rate method to determine the number of principal components, with a cumulative contribution rate of 90%. In addition, the control limits of the statistics of the three algorithms were determined by kernel density estimation (KDE) with a confidence level of 99%.
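A KDE-based control limit can be sketched as the quantile of a Gaussian kernel density fitted to the statistic values from normal training data. The choice of a Gaussian kernel, the rule-of-thumb bandwidth and the bisection search are assumptions made for this illustration; the paper does not state these details in this section, and the function name is hypothetical.

```python
import math
import statistics

def kde_control_limit(values, alpha=0.99, bandwidth=None):
    """Return t such that the Gaussian-KDE cumulative distribution at t equals alpha."""
    n = len(values)
    if bandwidth is None:
        # Rule-of-thumb bandwidth (assumption): 1.06 * sigma * n^(-1/5)
        bandwidth = 1.06 * statistics.stdev(values) * n ** (-0.2)

    def cdf(t):
        # The KDE CDF is the average of the Gaussian CDFs centred on each sample.
        scale = bandwidth * math.sqrt(2.0)
        return sum(0.5 * (1.0 + math.erf((t - v) / scale)) for v in values) / n

    lo = min(values) - 10.0 * bandwidth     # CDF is essentially 0 here
    hi = max(values) + 10.0 * bandwidth     # CDF is essentially 1 here
    for _ in range(100):                    # bisection on the monotone CDF
        mid = (lo + hi) / 2.0
        if cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

train_stats = [i / 10.0 for i in range(100)]    # toy statistic values
limit_99 = kde_control_limit(train_stats, 0.99)
limit_95 = kde_control_limit(train_stats, 0.95)
print(limit_95 < limit_99)
```

A test sample would then be flagged as faulty when its statistic exceeds the limit obtained for the chosen confidence level.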

The fault detection results for the MOS gas sensor array are shown in Figure 10. Each column in Figure 10 corresponds to one algorithm (PCA, KPCA and SPCA), and each row corresponds to one statistic (T² and SPE).

According to the results in Figure 10, the impact fault, as a significant fault, is detected well by the SPE statistics of all three algorithms, while the detection results of the T² statistics of PCA and KPCA are relatively poor. The bias fault, however, is a micro-fault whose amplitude is not significant enough, and the PCA and KPCA models cannot effectively detect all of the bias-fault samples with either the SPE or the T² statistic. This is mainly because the SPE statistic is built from the residual space of the original data, which is more sensitive to noise interference, whereas the T² statistic is built from the principal component subspace, so the volatility of the T² statistics of PCA and KPCA mainly comes from the fluctuation of the original data itself. In contrast, the SPCA modeling method adopted in this paper effectively detects both the impact fault and the bias fault with both the T² and SPE statistics. In addition, the experimental results show that the statistic values calculated by SPCA are several orders of magnitude higher than those of PCA and KPCA. The SPCA algorithm evidently amplifies the statistics of fault data with small amplitudes, so that the corresponding micro-faults can be detected effectively. In general, because of its sensitivity to noise interference, the SPE statistic has a higher error detection rate than the T² statistic. The better detection rate of the SPCA T² statistic arises because it is composed of both linear and nonlinear principal components and therefore captures the characteristics of the data better.
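To make the difference between the two subspaces concrete, the sketch below evaluates both statistics for plain linear PCA, using the textbook definitions T² = Σ tₖ²/λₖ with scores t = Pᵀx, and SPE = ‖x − PPᵀx‖². These definitions are assumed here and are not the joint SPCA statistics of this paper; the two-variable model with a single loading vector is hypothetical.

```python
def t2_and_spe(x, loadings, eigenvalues):
    """T^2 and SPE of sample x for linear PCA.

    loadings    -- list of loading vectors (the columns of P)
    eigenvalues -- variance of each retained principal component
    """
    scores = [sum(p * xi for p, xi in zip(col, x)) for col in loadings]
    t2 = sum(t * t / lam for t, lam in zip(scores, eigenvalues))
    # Projection of x back onto the principal component subspace
    x_hat = [sum(t * col[j] for t, col in zip(scores, loadings))
             for j in range(len(x))]
    spe = sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))
    return t2, spe

s = 0.5 ** 0.5
P = [[s, s]]                  # one principal direction: both variables move together
lam = [2.0]

along = t2_and_spe([1.0, 1.0], P, lam)     # deviation inside the PC subspace
across = t2_and_spe([1.0, -1.0], P, lam)   # deviation in the residual space
print(along, across)
```

A deviation along the principal direction changes only T², whereas a deviation that breaks the correlation between the variables appears only in SPE, which is why the two statistics are monitored together.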

In addition to the T² and SPE statistics used for fault detection, the Wilks statistic is used to evaluate the separability of the principal component features that the fault detection model extracts from normal and fault data. The Wilks statistic is used in multivariate statistical analysis to test the discriminant effect between each variable and the population. The idea is to decompose the total deviation matrix into intra-group deviation and inter-group deviation [58]. From the experimental samples in the above example, 30 normal data and fault data are randomly selected, and the fault detection model is used to extract their principal component features. When the PCA and KPCA algorithms are used to determine the principal components, the cumulative contribution rate is set to 90%. The principal component features extracted from the two kinds of data are regarded as two populations, and the Wilks statistic is calculated to discriminate between them. The Wilks statistics of the three algorithms are shown in Table 2. When the Wilks statistic is used to distinguish categories, a smaller value indicates a greater difference between the categories. The principal components obtained from the SPCA fault detection model have a smaller Wilks statistic, so the normal data and fault data are easier to distinguish. Therefore, this algorithm performs better when applied to fault detection.
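The decomposition can be illustrated with the usual form of Wilks' lambda, Λ = det(W)/det(T), where W is the pooled within-group scatter matrix and T is the total scatter matrix. This common definition is assumed here and may differ in detail from the variant in [58]; the helper names and the two-dimensional toy features are hypothetical.

```python
def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def scatter(rows, center):
    """Sum of outer products of (row - center)."""
    d = len(center)
    S = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [a - c for a, c in zip(row, center)]
        for i in range(d):
            for j in range(d):
                S[i][j] += diff[i] * diff[j]
    return S

def det(A):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in A]
    n, value = len(A), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        if A[piv][c] == 0.0:
            return 0.0
        if piv != c:
            A[c], A[piv] = A[piv], A[c]
            value = -value
        value *= A[c][c]
        for r in range(c + 1, n):
            k = A[r][c] / A[c][c]
            A[r] = [a - k * b for a, b in zip(A[r], A[c])]
    return value

def wilks_lambda(group_a, group_b):
    """Ratio det(W) / det(T); values near 0 mean well-separated groups."""
    Wa = scatter(group_a, mean_vector(group_a))
    Wb = scatter(group_b, mean_vector(group_b))
    W = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(Wa, Wb)]
    pooled = group_a + group_b
    T = scatter(pooled, mean_vector(pooled))
    return det(W) / det(T)

normal = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fault = [[10.0, 0.0], [11.0, 0.0], [10.0, 1.0], [11.0, 1.0]]
print(wilks_lambda(normal, fault), wilks_lambda(normal, normal))
```

The shifted group gives a value close to 0, while comparing a group with itself gives 1, consistent with the rule that a smaller statistic indicates a greater difference between categories.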

To comprehensively evaluate the performance of the SPCA method in detecting micro-faults, 100 groups of samples were analyzed in a simulation experiment. For each group, 500 continuous sampling points were obtained while the gas sensor array was working normally. One to three sensors were randomly selected as faulty sensors, and one of the typical fault types described above was randomly superimposed on each. All faults were micro-faults, with amplitudes of approximately 0.1–1% of the mean value of the selected sensor, drawn randomly according to a normal distribution. The fault start time was also random. Table 3 shows the mean fault detection rate (FDR), the mean error detection rate (EDR) and the corresponding standard deviations of each statistic over the 100 experimental samples for the fault detection methods based on PCA, KPCA and SPCA.
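The two rates and their spread over repeated runs can be computed as in the sketch below. It assumes that FDR is the fraction of faulty samples whose statistic exceeds the control limit and that EDR is the fraction of normal samples that exceed it (a false-alarm rate); the exact definitions are not restated in this section, so this reading is an assumption, and the function names and toy values are hypothetical.

```python
from statistics import mean, stdev

def detection_rates(stats, limit, is_fault):
    """Return (FDR, EDR) for one run, given statistic values and ground-truth flags."""
    alarms = [s > limit for s in stats]
    on_fault = [a for a, f in zip(alarms, is_fault) if f]
    on_normal = [a for a, f in zip(alarms, is_fault) if not f]
    return sum(on_fault) / len(on_fault), sum(on_normal) / len(on_normal)

def summarize(runs):
    """Mean and standard deviation of FDR and EDR over several runs."""
    fdrs, edrs = zip(*runs)
    return (mean(fdrs), stdev(fdrs)), (mean(edrs), stdev(edrs))

run = detection_rates([1.0, 1.0, 9.0, 1.0, 8.0, 9.0], 5.0,
                      [False, False, False, True, True, True])
summary = summarize([(1.0, 0.0), (0.5, 0.1)])
print(run, summary)
```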

As seen in the experimental results in Table 3, the fault detection methods based on PCA and KPCA perform poorly in detecting random micro-faults, and the corresponding standard deviations are also large. In contrast, the fault detection method based on the SPCA algorithm has a high fault detection rate, and its error detection rate is lower than that of the other algorithms, with better stability.

Based on the above experiments and analysis, compared with using the PCA or KPCA algorithm alone for fault detection, the fault detection method based on SPCA adopted in this paper has better robustness, with a higher fault detection rate and a lower error detection rate in the detection of micro-faults and multiple-faults. The reason is that the SPCA algorithm models the data with PCA and then KPCA in succession, extracts both the linear and the nonlinear features, and highlights the micro-faults, making the statistics of the fault data several orders of magnitude higher than those of the normal signal and thereby achieving good fault detection.

3.3. Fault Isolation

To verify the superiority of the SPCA-based reconstruction contribution fault isolation algorithm under multiple faults, the machine olfactory system built in Section 3.1 was used to collect data under normal working conditions of the gas sensor array to obtain sufficient experimental samples. The experimental data with simulated faults were acquired by the superposition method. In total, 100 independent repeated samplings were performed to obtain 100 sets of experimental samples. In each set, one to three sensors were randomly selected as faulty sensors, and the fault type was randomly selected. Faults of different amplitudes were superimposed on the normal signal, with randomly set start times, which yielded enough simulated multiple-fault experimental samples.

The following example illustrates the isolation process for a single fault, two simultaneous faults and three simultaneous faults. There were 500 test samples, and MOS gas sensors No. 2, 3, and 5 were randomly selected as faulty sensors. The fault settings were as follows: a bias fault with an amplitude of 0.65% of the normal mean was superimposed on samples 301–500 of sensor No. 2, a bias fault with an amplitude of 0.77% of the normal mean was superimposed on samples 351–500 of sensor No. 3, and a bias fault with an amplitude of 0.67% of the normal mean was superimposed on samples 381–500 of sensor No. 5. In this way, one fault, two faults and three faults were present in different time periods.

According to the fault isolation flowchart in Figure 3, the control limit of the SPE statistic of the SPCA fault detection model should be obtained first. Figure 11 shows the SPE statistics based on the SPCA algorithm for the above example.

The red dotted line in Figure 11 is the SPE control limit $SPE_{a} = 111.3$ calculated with the SPCA algorithm. The SPCA-based SPE statistic begins to exceed the control limit at the 301st sample and increases further at the 351st and 381st samples, indicating that the sensors failed in succession. However, the statistic plot alone cannot reveal which sensors are faulty. Fault isolation follows, with the SPE control limit used as the iteration termination condition of the isolation algorithm. The contribution of each fault direction set is then calculated for the cases of one, two and three faulty sensors. The results are shown in Figure 12, where the abscissa is the fault direction set label and the ordinate is the contribution of each reconstruction; the fault direction set labels with the maximum contribution are marked in the figure. The fault direction set label is obtained by permutation and combination of the sensor numbers. Its definition is given in Equation (33), and the calculation method is described below.

The fault direction set label is defined here to compare the actual results with the simulation results. For a MOS gas sensor array with m gas sensors, of which p are faulty, there are $Q = Z(m, p)$ candidate fault direction sets $[Ξ_{1}, Ξ_{2}, \dots, Ξ_{Q}]$, where $Z$ is the permutation function; it counts the ways of choosing p sensors out of m, so that, for example, $Z(20, 2) = 190$. Let $Γ(Ξ)$ be the label of a fault direction set $Ξ = [ξ_{1}, ξ_{2}, \dots, ξ_{p}]$ among the candidates $Ξ_{i}$, $i = 1, 2, \dots, Q$, where the directions correspond to the sensor numbers $i_{1} < i_{2} < \dots < i_{p}$. The formula is

\begin{aligned}
Γ(Ξ) = {} & \sum_{j = 1}^{i_{1} - 1} Z(m - j, p - 1) + \sum_{j = i_{1} + 1}^{i_{2} - 1} Z(m - j, p - 2) \\
& + \dots + \sum_{j = i_{p - 2} + 1}^{i_{p - 1} - 1} Z(m - j, 1) + i_{p} - i_{p - 1}
\end{aligned}

(33)

For example, consider the sensor array of 20 MOS gas sensors used here, each of which may malfunction. In the case of one faulty sensor, the permutation and combination operation gives 20 fault direction set labels, that is, Q = 20, namely 1, 2, 3, …, 19, 20. A fault direction set label of 1 means that sensor 1 has failed, a label of 2 means that sensor 2 has failed, and so on. In the case of two faulty sensors, there are 190 fault direction set labels, that is, Q = 190, namely 1, 2, 3, …, 189, 190, which correspond to the faulty sensor combinations (1,2), (1,3), (1,4), …, (18,20), (19,20). A label of 1 means that both sensor 1 and sensor 2 have failed, a label of 2 means that both sensor 1 and sensor 3 have failed, and so on.

According to the preset settings of this experiment, one fault, two faults and three faults were obtained at different time periods. Taking sensor No. 2, 3, and 5 as an example, Equation (33) was used to calculate the fault direction set label as 173, that is, the fault sensor combination is (2,3,5). This value is consistent with the preset fault combination. As identified by the algorithm in Figure 12, the fault direction label with the maximum contribution was 173 when three faults occurred. This value is consistent with the preset fault direction label, and it was the same for one and two faults. This verifies that the proposed fault isolation algorithm can accurately identify the labels of faulty sensors.
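The label calculation in Equation (33) can be checked with a short sketch. It assumes that Z(m, p) is the binomial coefficient, which matches Q = 20 and Q = 190 for a 20-sensor array, and that labels follow the lexicographic order of the sorted sensor combinations; the function name `fault_label` is hypothetical.

```python
from itertools import combinations
from math import comb

def fault_label(sensors, m):
    """Label of a set of faulty sensors (1-based numbers) in an m-sensor array,
    following Equation (33) with Z taken as the binomial coefficient."""
    idx = sorted(sensors)
    p = len(idx)
    label, prev = 0, 0
    for k, i_k in enumerate(idx[:-1], start=1):
        # Count the combinations whose k-th element is smaller than i_k
        label += sum(comb(m - j, p - k) for j in range(prev + 1, i_k))
        prev = i_k
    return label + idx[-1] - prev

print(fault_label((2, 3, 5), 20))

# Cross-check against direct enumeration of all three-sensor combinations
triples = list(combinations(range(1, 21), 3))
assert all(fault_label(c, 20) == n for n, c in enumerate(triples, start=1))
```

For sensors (2, 3, 5) the first sum contributes Z(19, 2) = 171, the second sum is empty, and the final term is 5 − 3 = 2, which gives the label 173 quoted above.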

Note that when three or more faults are reconstructed simultaneously, non-negligible reconstruction errors are inevitably introduced. As a result, the SPE statistic of the reconstructed data may fail to fall below the preset threshold even if the data are reconstructed along the correct fault direction. Therefore, provided that the fault direction is correctly identified, the threshold can be set slightly larger than the control limit of the statistic. In this experiment, the threshold was set to 1.5 times the SPE control limit.

The SPCA-based reconstruction contribution fault isolation algorithm is in essence an iterative method that can adaptively determine the number of faults and locate the faulty sensors in the MOS gas sensor array. Figure 13 shows the SPE statistics reconstructed according to the number of faulty sensors in the three fault cases. The first panel is the original fault detection result without reconstruction. It shows clearly that the SPCA-based SPE statistic begins to exceed the threshold at the 301st sample and increases further at the 351st and 381st samples, indicating that multiple failures occurred in the sensor array. The second, third, and fourth panels show the SPE statistics of the reconstructed data under the assumption that the number of faulty sensors is 1, 2, and 3, respectively. As the values on the vertical axis show, the SPE statistic of the reconstructed data gradually decreases as the number of faulty sensors involved in the reconstruction increases. When only one or two sensors are assumed to have failed, the SPE statistic of the reconstructed data still significantly exceeds the threshold from the 351st sample in the second panel and from the 381st sample in the third panel. The termination condition has therefore not been reached, and the iteration continues. Once all three faults have been reconstructed, the SPE statistic of the reconstructed data falls below the set threshold, the iteration stops and the fault isolation is complete. This indicates that a total of three sensors in the sensor array have failed and that the fault direction set label is 173, as shown in Figure 12; that is, the faulty sensors are No. 2, 3, and 5. The result of the algorithm is consistent with the actual situation, and the fault location is successfully realized.

To comprehensively evaluate the performance of the SPCA-based reconstruction contribution fault isolation algorithm for the MOS gas sensor array proposed in this paper, the proposed method is compared with the commonly used PCA fault isolation method based on contribution plots and the SPCA fault isolation method based on contribution plots.

According to the results in Table 4, all three isolation methods achieve good isolation accuracy when a single fault occurs. However, because of the trailing effect of contribution plots, the isolation accuracy of the contribution-plot-based PCA and SPCA methods decreased to approximately 30% when the number of faulty sensors increased to 3, whereas the SPCA-based reconstruction contribution fault isolation algorithm proposed in this paper still achieved an accuracy of 96.1%. This is because the proposed algorithm adaptively determines the number of faults through iteration, and the reconstruction contribution reduces the trailing effect of the contribution plot method. Therefore, the SPCA-based reconstruction contribution fault isolation algorithm has clear advantages.

4. Conclusions

In this paper, a new fault detection and isolation scheme is proposed to solve the problems of micro-fault detection and multiple-fault location in gas sensor arrays. The SPCA fault detection method adopts a serial model structure that combines the PCA and KPCA algorithms to obtain a joint T² statistic, which characterizes the combined linear and nonlinear principal component features, and an SPE statistic, which characterizes the residual space. The new combined T² and SPE statistics are more sensitive to micro-faults in the gas sensor array and have a lower misdetection rate. On this basis, the SPCA-based reconstruction contribution fault isolation algorithm is proposed using the reconstruction contribution and fault direction set theory. Compared with the traditional contribution plot method, this method effectively reduces the influence of the trailing effect and accurately and adaptively isolates multiple faults. The experimental results show that the SPCA fault detection method achieves high accuracy in micro-fault detection for gas sensor arrays, and that the proposed SPCA-based reconstruction contribution fault isolation algorithm retains high isolation accuracy when multiple sensor faults occur.

Author Contributions

Methodology, Y.X.; software, R.M.; supervision, Y.X.; validation, Z.Y.; writing—original draft preparation, R.M.; writing—review and editing, Z.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

Zhang, D.; Yang, Z.; Yu, S.; Mi, Q.; Pan, Q. Diversiform metal oxide-based hybrid nanostructures for gas sensing with versatile prospects. Coord. Chem. Rev. 2020, 413, 213272. [Google Scholar] [CrossRef]
Buono, P.; Balducci, F. A Web App for Visualizing Electronic Nose Data. In Proceedings of the 2018 22nd International Conference Information Visualisation (IV), Fisciano, Italy, 10–13 July 2018. [Google Scholar]
Zarra, T.; Cimatoribus, C.; Naddeo, V.; Reiser, M.; Belgiorno, V.; Kranert, M. Environmental odour monitoring by electronic nose. Glob. Nest J. 2018, 20, 664–668. [Google Scholar]
Tiele, A.; Wicaksono, A.; Ayyala, S.K.; Covington, J.A. Development of a Compact, IoT-Enabled Electronic Nose for Breath Analysis. Electronics 2020, 9, 84. [Google Scholar] [CrossRef] [Green Version]
Tkaczyk, M. Development of a Low-Cost Electronic Nose for Detection of Pathogenic Fungi and Applying It to Fusarium oxysporum and Rhizoctonia solani. Sensors 2021, 21, 5868. [Google Scholar]
Licht, J.C.; Grasemann, H. Potential of the Electronic Nose for the Detection of Respiratory Diseases with and without Infection. Int. J. Mol. Sci. 2020, 21, 9416. [Google Scholar] [CrossRef]
Zambotti, G.; Soprani, M.; Gobbi, E.; Capuano, R.; Ponzoni, A. Portable Electronic Nose Device for the Identification of Food Degradation. In Sensors and Microsystems; Springer: Cham, Switzerland, 2020. [Google Scholar]
Rusinek, R.; Kmiecik, D.; Gawrysiak-Witulska, M.; Malaga-Tobola, U.; Gancarz, M. Identification of the Olfactory Profile of Rapeseed Oil as a Function of Heating Time and Ratio of Volume and Surface Area of Contact with Oxygen Using an Electronic Nose. Sensors 2021, 21, 303. [Google Scholar] [CrossRef] [PubMed]
Licen, S.; Di Gilio, A.; Palmisani, J.; Petraccone, S.; de Gennaro, G.; Barbieri, P. Pattern Recognition and Anomaly Detection by Self-Organizing Maps in a Multi Month E-nose Survey at an Industrial Site. Sensors 2020, 20, 1887. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Kang, M.; Cho, I.; Park, J.; Jeong, J.; Lee, K.; Lee, B.; Del Orbe Henriquez, D.; Yoon, K.; Park, I. High Accuracy Real-Time Multi-Gas Identification by a Batch-Uniform Gas Sensor Array and Deep Learning Algorithm. ACS Sens. 2022, 7, 430–440. [Google Scholar] [CrossRef]
Xie, Z.; Raju, M.V.R.; Brown, B.S.; Stewart, A.C.; Fu, X.A. Electronic nose for detection of toxic volatile organic compounds in air. In Proceedings of the International Conference on Solid-State Sensors, Kaohsiung, Taiwan, 18–22 June 2017. [Google Scholar]
Addabbo, T.; Bertocci, F.; Fort, A.; Mugnaini, M.; Shahin, L.; Vignoli, V.; Spinicci, R.; Rocchi, S.; Gregorkiewitz, M. An Artificial Olfactory System (AOS) for Detection of Highly Toxic Gases in Air Based on YCoO₃. Procedia Eng. 2014, 87, 1095–1098. [Google Scholar] [CrossRef] [Green Version]
Lim, S.H.; Liang, F.; Kemling, J.W.; Musto, C.J.; Suslick, K.S. An Optoelectronic Nose for Detection of Toxic Gases. Nat. Chem. 2009, 1, 562–567. [Google Scholar] [CrossRef] [Green Version]
Green, G.C.; Chan, A.D.C.; Dan, H.; Min, L. Using a metal oxide sensor (MOS)-based electronic nose for discrimination of bacteria based on individual colonies in suspension. Sens. Actuators B Chem. 2011, 152, 21–28. [Google Scholar] [CrossRef]
Zou, Y.; Lv, J. Using Recurrent Neural Network to Optimize Electronic Nose System with Dimensionality Reduction. Electronics 2020, 9, 2205.
John, A.T.; Murugappan, K.; Nisbet, D.R.; Tricoli, A. An Outlook of Recent Advances in Chemiresistive Sensor-Based Electronic Nose Systems for Food Quality and Environmental Monitoring. Sensors 2021, 21, 2271.
Shekhirev, M.; Lipatov, A.; Torres, A.; Vorobeva, N.S.; Sinitskii, A. Highly Selective Gas Sensors Based on Graphene Nanoribbons Grown by Chemical Vapor Deposition. ACS Appl. Mater. Interfaces 2020, 12, 7392–7402.
Lee, J.; Jung, Y.; Sung, S.; Lee, G.; Kim, J.; Seong, J.; Shim, Y.; Jun, S.; Jeon, S. High-performance gas sensor array for indoor air quality monitoring: The role of Au nanoparticles on WO₃, SnO₂, and NiO-based gas sensors. J. Mater. Chem. A 2021, 9, 1159–1167.
Liu, X.; Ma, T.; Pinna, N.; Zhang, J. Two-Dimensional Nanostructured Materials for Gas Sensing. Adv. Funct. Mater. 2017, 27, 1702168.
Joshi, N.; Braunger, M.L.; Shimizu, F.M.; Riul, A., Jr.; Oliveira, O.N. Insights into nano-heterostructured materials for gas sensing: A review. Multifunct. Mater. 2021, 4, 032002.
Binions, R. Metal Oxide Semiconductor Gas Sensors in Environmental Monitoring. In Semiconductor Gas Sensors; Elsevier Science: Amsterdam, The Netherlands, 2013; pp. 433–466.
Hammer, C.; Warmer, J.; Maurer, S.; Kaul, P.; Thoelen, R.; Jung, N. A Compact 16 Channel Embedded System with High Dynamic Range Readout and Heater Management for Semiconducting Metal Oxide Gas Sensors. Electronics 2020, 9, 1855.
Korotcenkov, G.; Han, S.D.; Cho, B.K.; Brinzari, V. Grain Size Effects in Sensor Response of Nanostructured SnO₂- and In₂O₃-Based Conductometric Thin Film Gas Sensor. Crit. Rev. Solid State Mater. Sci. 2009, 34, 1–17.
Korotcenkov, G.; Cho, B.K. Instability of metal oxide-based conductometric gas sensors and approaches to stability improvement (short survey). Sens. Actuators B Chem. 2011, 156, 527–538.
Padilla, M.; Perera, A.; Montoliu, I.; Chaudry, A.; Marco, S. Fault detection, identification, and reconstruction of faulty chemical gas sensors under drift conditions, using Principal Component Analysis and Multiscale-PCA. In Proceedings of the International Joint Conference on Neural Networks, IJCNN 2010, Barcelona, Spain, 18–23 July 2010.
Pardo, M.; Faglia, G.; Sberveglieri, G.; Corte, M.; Masulli, F.; Riani, M. Monitoring Reliability of Sensors in an Array by Neural Networks. Sens. Actuators B Chem. 2002, 67, 128–133.
Fonollosa, J.; Vergara, A.; Huerta, R. Algorithmic mitigation of sensor failure: Is sensor replacement really necessary? Sens. Actuators B Chem. 2013, B183, 211–221.
Martinelli, E.; Magna, G.; Vergara, A.; Natale, C.D. Cooperative classifiers for reconfigurable sensor arrays. Sens. Actuators B Chem. 2014, B199, 83–92.
Qiu, T.; Zhu, X. Statistics Analysis of PCA-Based Sensor Fault Detection. Appl. Mech. Mater. 2011, 121–126, 1085–1089.
Ajami, A.; Daneshvar, M. Data driven approach for fault detection and diagnosis of turbine in thermal power plant using Independent Component Analysis (ICA). Int. J. Electr. Power Energy Syst. 2012, 43, 728–735.
Lin, S.L. Application of Machine Learning to a Medium Gaussian Support Vector Machine in the Diagnosis of Motor Bearing Faults. Electronics 2021, 10, 2266.
Xu, P.; Wang, Y.C.; Wang, K.; Wang, Q.Y. Fault Detection and Diagnosis for Sensor in Complex Control System Based on KPCA. Appl. Mech. Mater. 2014, 623, 202–210.
Li, X.B.; Yang, Y.P.; Zhang, W.D. Fault detection method for non-Gaussian processes based on non-negative matrix factorization. Asia Pac. J. Chem. Eng. 2012, 8, 362–370.
Cheng, Z.; Gao, X.; Tao, X.; Yuan, L.; Pang, Y. Fault detection and diagnosis strategy based on a weighted and combined index in the residual subspace associated with PCA. J. Chemom. 2018, 32, e2981.
Fu, X.; Wang, Y.; Li, W.; Yang, Y.; Postolache, O. Lightweight Fault Detection Strategy for Wireless Sensor Networks Based on Trend Correlation. IEEE Access 2021, 9, 9073–9083.
Jiang, H.; Xu, G.; Gao, Z.; Li, Y. A dual-parameter optimization KPCA method for process fault diagnosis. In Proceedings of the 2015 Annual Reliability and Maintainability Symposium (RAMS), Palm Harbor, FL, USA, 26–29 January 2015.
Chen, Y.; Cong, Z.; Zhang, Q.; Xia, H. UAV fault detection based on GA-BP neural network. In Proceedings of the 2017 32nd Youth Academic Annual Conference of Chinese Association of Automation (YAC), Hefei, China, 19–21 May 2017.
Miller, P.; Swanson, R.E.; Heckler, C.E. Contribution plots: A missing link in multivariate quality control. Appl. Math. Comput. Sci. 1998, 8, 775–792.
Westerhuis, J.A.; Gurden, S.P.; Smilde, A.K. Generalized contribution plots in multivariate statistical process monitoring. Chemom. Intell. Lab. Syst. 2000, 51, 95–114.
Wang, Y.; He, W.; Liu, Z.; Yang, C. The research of fault diagnosis method based on weighted Q contribution plot and SDG. In Proceedings of the Control Conference, Washington, DC, USA, 17–19 July 2013.
Wang, J.; Ge, W.; Zhou, J.; Wu, H.; Jin, Q. Fault isolation based on residual evaluation and contribution analysis. J. Frankl. Inst. 2016, 354, 2591–2612.
Babamoradi, H.; Frans, V.D.B.; Rinnan, S. Confidence limits for contribution plots in multivariate statistical process control using bootstrap estimates. Anal. Chim. Acta 2016, 908, 75–84.
Navi, M.; Meskin, N.; Davoodi, M. Sensor fault detection and isolation of an industrial gas turbine using partial adaptive KPCA. IFAC Pap. 2018, 64, 37–48.
Carlos, F.; Alcala, S.; Joe, Q. Reconstruction-Based Contribution for Process Monitoring with Kernel Principal Component Analysis. Ind. Eng. Chem. Res. 2010, 49, 7849–7857.
Guo, X.; Yang, M.; Li, Y. Modified reconstruction-based contribution plots for fault isolation. Chin. J. Sci. Instrum. 2015, 36, 1193–1200.
Mourot, G.; Kallas, M.; Anani, K.; Maquin, D. Sparse Reconstruction-Based Contribution for Multiple Fault Isolation by KPCA. In Proceedings of the Mediterranean Conference on Control & Automation, Zadar, Croatia, 19–22 June 2018.
Deng, X.; Tian, X.; Chen, S.; Harris, C.J. Nonlinear Process Fault Diagnosis Based on Serial Principal Component Analysis. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 560–572.
Dunia, R.; Qin, S.J.; Edgar, T.F.; Mcavoy, T.J. Identification of Faulty Sensors Using Principal Component Analysis. AIChE J. 2010, 42, 2797–2812.
Sang, W.C.; Lee, C.; Lee, J.M.; Jin, H.P.; Lee, I.B. Fault detection and identification of nonlinear processes based on kernel PCA. Chemom. Intell. Lab. Syst. 2005, 75, 55–67.
Lee, J.M.; Yoo, C.K.; Sang, W.C.; Vanrolleghem, P.A.; Lee, I.B. Nonlinear process monitoring using kernel principal component analysis. Chem. Eng. Sci. 2004, 59, 223–234.
Samuel, R.T.; Cao, Y. Nonlinear process fault detection and identification using kernel PCA and kernel density estimation. Syst. Sci. Control Eng. Open Access J. 2016, 4, 165–174.
Liang, J. Multivariate Statistical Process Monitoring Using Kernel Density Estimation. Asia Pac. J. Chem. Eng. 2010, 13, 185–192.
Chen, Y.; Song, K.; Wang, Q.; Jiahao, L.U. Research on Self-Validating MOS Gas Sensor Array and Its Application. Chin. J. Sens. Actuators 2018.
Shen, Z.; Wang, Q. Failure Detection, Isolation, and Recovery of Multifunctional Self-Validating Sensor. IEEE Trans. Instrum. Meas. 2012, 61, 3351–3362.
Feng, Z.; Wang, Q.; Shida, K. Design and Implementation of a Self-Validating Pressure Sensor. IEEE Sens. J. 2009, 9, 207–218.
Han, Q.Z.; Yong, Y. A wavelet-based approach to abrupt fault detection and diagnosis of sensors. IEEE Trans. Instrum. Meas. 2001, 50, 1389–1396.
Marzban, C. The ROC Curve and the Area under It as Performance Measures. Weather Forecast. 2004, 19, 1106–1114.
Yin, Y.; Tian, X. Classification of Chinese drinks by a gas sensors array and combination of the PCA with Wilks distribution. Sens. Actuators B Chem. 2007, 124, 393–397.

Figure 1. Block diagram of fault detection and isolation method.

Figure 2. Flowchart of fault detection based on SPCA.

Figure 3. Flowchart of the multifault isolation algorithm for gas sensor array.

Figure 4. Gas sensor array data acquisition system.

Figure 5. MOS gas sensor array normal output signal.

Figure 6. ROC curves of the PCA algorithm statistics.

Figure 7. ROC curves of the KPCA algorithm statistics.

Figure 8. ROC curves of the SPCA algorithm statistics.

Figure 9. Effect of faults superimposed on sensor 2 and sensor 5.

Figure 10. Fault detection results of three algorithms.

Figure 11. Multiple-fault diagnosis results based on SPCA algorithm.

Figure 12. Reconstruction contribution rate for one fault, two faults and three faults.

Figure 13. $SPE$ statistics of reconstructed data with different numbers of faults.

Table 1. Comparison of AUC values of the three algorithms.

AUC Value	PCA	KPCA	SPCA
$T^{2}$ statistics	0.6740	0.6671	0.8883
$SPE$ statistics	0.9219	0.9265	0.9911

Table 2. Wilks statistics of three comparison algorithms.

	PCA	KPCA	SPCA
Wilks statistics	0.5612	0.5424	0.1805

Table 3. Detection rate and error detection rate of $T^{2}$ statistics and $SPE$ statistics of the three algorithms.

Algorithm	$T^{2}$ FDR (Mean ± Std)	$T^{2}$ EDR (Mean ± Std)	$SPE$ FDR (Mean ± Std)	$SPE$ EDR (Mean ± Std)
PCA	$62.21\% \pm 11.36\%$	$2.70\% \pm 3.81\%$	$87.56\% \pm 9.24\%$	$4.86\% \pm 4.21\%$
KPCA	$45.24\% \pm 10.91\%$	$6.85\% \pm 3.92\%$	$79.46\% \pm 12.51\%$	$5.74\% \pm 2.69\%$
SPCA	$97.26\% \pm 4.15\%$	$0.72\% \pm 0.52\%$	$98.79\% \pm 2.46\%$	$1.72\% \pm 1.48\%$

Table 4. Isolation accuracy of the three algorithms.

Fault Number	PCA Contribution Plots Method	KPCA Contribution Plots Method	SPCA Reconstruction Isolation Method
1	$100\%$	$99.98\%$	$100\%$
2	$55.7\%$	$54.68\%$	$99.64\%$
3	$31.26\%$	$30.95\%$	$96.1\%$

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).