*4.2. Mode Discrimination Method*

If the training data has *q* patterns *R*<sub>1</sub>, *R*<sub>2</sub>, ··· , *R*<sub>*q*</sub>, the JS divergence set

$$\{JS(\mathbf{Z}, \mathbf{R}\_1), JS(\mathbf{Z}, \mathbf{R}\_2), \dots, JS(\mathbf{Z}, \mathbf{R}\_q) \}$$

between the testing data *Z* and the different modes *R* can be calculated using Equation (51). If *i*<sub>0</sub> is the pattern label corresponding to the minimum JS divergence, that is,

$$i\_0 = \arg\min \left\{ JS(\mathbf{Z}, \mathbf{R}\_1), JS(\mathbf{Z}, \mathbf{R}\_2), \dots, JS(\mathbf{Z}, \mathbf{R}\_q) \right\}. \tag{53}$$

it is reasonable to assume that the testing data *Z* and the training data *R*<sub>*i*0</sub> belong to the same mode. However, if a new, unknown failure mode appears in the application, Equation (50) still assigns the testing data *Z* to the known failure mode *i*<sub>0</sub>, which is obviously unreasonable.
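As an illustration of the discrimination rule in Equation (53), the sketch below selects the pattern label with minimum JS divergence. The function `js_divergence` is a hypothetical stand-in that operates on discrete distributions rather than the KDE-based divergence of Equation (51); all names are illustrative.

```python
import math

def js_divergence(p, q):
    # Jensen-Shannon divergence of two discrete distributions: a hypothetical
    # stand-in for the KDE-based JS(Z, R_i) of Equation (51).
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):  # Kullback-Leibler divergence KL(a || b)
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def nearest_mode(z_dist, mode_dists):
    # Equation (53): label i0 of the training pattern closest to the test data.
    divs = [js_divergence(z_dist, r) for r in mode_dists]
    i0 = min(range(len(divs)), key=divs.__getitem__)
    return i0, divs
```

For example, `nearest_mode([0.65, 0.35], [[0.7, 0.3], [0.2, 0.8]])` assigns the test distribution to the first pattern, since its JS divergence to that pattern is smallest.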

If *JS*(*Z*, *R*<sub>*i*0</sub>) is too large, we believe that the testing data *Z* come from an unknown new failure mode, whose label is *q* + 1. However, how to obtain the threshold *JS*<sub>high</sub> for *JS*(*Z*, *R*<sub>*i*0</sub>) is a problem that should be investigated. A method to determine *JS*<sub>high</sub> is provided below.

For the training data *R*<sub>*i*0</sub> = [*r*<sub>1</sub>, *r*<sub>2</sub>, ··· , *r*<sub>*m*</sub>] of mode *i*<sub>0</sub>, the density estimate of the data set can be obtained using Equation (16).

$$\hat{f}\_{K,R}(\mathbf{x}) = \frac{1}{m(h\_m)^n} \sum\_{i=1}^m K\left(\frac{r\_i - \mathbf{x}}{h\_m}\right) \tag{54}$$

In addition, if the length of the sampling window is fixed at *p* (*p* < *m*), sliding the window yields the new sample data *R*<sup>(*j*)</sup> = [*r*<sub>*j*</sub>, *r*<sub>*j*+1</sub>, ··· , *r*<sub>*j*+*p*</sub>] ⊂ *R*<sub>*i*0</sub>, *j* = 1, 2, ··· , *m* − *p*. For each *R*<sup>(*j*)</sup>, the density of the data set can be estimated as

$$\hat{f}\_{K,\mathbf{R}^{(j)}}(\mathbf{x}) = \frac{1}{p\left(h\_p\right)^n} \sum\_{i=j}^{j+p} K\left(\frac{r\_i - \mathbf{x}}{h\_p}\right). \tag{55}$$

Using Equation (52), the divergence between the training data *R* and the sample data *R*<sup>(*j*)</sup> can be obtained as

$$\begin{split} JS\_j &= JS\left(\mathbf{R}, \mathbf{R}^{(j)}\right) \\ &= H\left(\left(\hat{f}\_{K,\mathbf{R}} + \hat{f}\_{K,\mathbf{R}^{(j)}}\right)/2\right) - \frac{1}{2}H\left(\hat{f}\_{K,\mathbf{R}}\right) - \frac{1}{2}H\left(\hat{f}\_{K,\mathbf{R}^{(j)}}\right). \end{split} \tag{56}$$

Using Equation (56), we can obtain a set of JS divergence values

$$JS = \left\{ JS\_1, JS\_2, \dots, JS\_{m-p} \right\}.$$
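The window-scanning procedure of Equations (54)–(56) can be sketched in pure Python for the one-dimensional case (*n* = 1). The Gaussian kernel, the bandwidths `h_m`/`h_p`, and the integration grid are all illustrative assumptions, not the paper's prescribed choices.

```python
import math

def gaussian_kde(data, h):
    # One-dimensional kernel density estimate, Equations (54)/(55) with n = 1.
    m = len(data)
    c = 1.0 / (m * h * math.sqrt(2.0 * math.pi))
    return lambda x: c * sum(math.exp(-0.5 * ((r - x) / h) ** 2) for r in data)

def js_on_grid(f, g, grid, dx):
    # JS divergence between two densities, integrated numerically on a grid.
    eps = 1e-12
    total = 0.0
    for x in grid:
        p, q = f(x), g(x)
        mid = (p + q) / 2.0
        total += 0.5 * p * math.log((p + eps) / (mid + eps))
        total += 0.5 * q * math.log((q + eps) / (mid + eps))
    return total * dx

def window_js_set(r, p, h_m, h_p, grid, dx):
    # Equations (55) and (56): JS divergence between the full training
    # record r and each sliding window r[j : j + p + 1], j = 1, ..., m - p.
    f_full = gaussian_kde(r, h_m)
    return [js_on_grid(f_full, gaussian_kde(r[j:j + p + 1], h_p), grid, dx)
            for j in range(len(r) - p)]
```

Each returned value is nonnegative (up to numerical error), and there are *m* − *p* of them, matching the set above.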

We use this set to estimate the density function *f*<sub>*JS*</sub>(*x*) of the JS divergence, yielding the estimator

$$\hat{f}\_{JS}(x) = \frac{1}{(m-p)\,h\_{m-p}} \sum\_{j=1}^{m-p} K\left(\frac{JS\_j - x}{h\_{m-p}}\right). \tag{57}$$

If the significance level is *α*, the probability that the JS divergence exceeds the threshold *JS*<sub>high</sub> satisfies

$$P\left\{ JS > JS\_{\text{high}} \right\} = \int\_{JS\_{\text{high}}}^{+\infty} \hat{f}\_{JS}(x)dx < \alpha. \tag{58}$$

Because the distribution of the JS divergence is not a common random distribution, the quantile cannot be obtained from a look-up table; it can only be obtained by numerical integration. If *h* is the step size, and

$$\int\_{h\ast i}^{+\infty} \hat{f}\_{JS}(x)dx \le \alpha \le \int\_{h\ast(i-1)}^{+\infty} \hat{f}\_{JS}(x)dx,\tag{59}$$

it is reasonable to deduce that

$$JS\_{\text{high}} = h \ast i. \tag{60}$$
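The tail search of Equations (57)–(60) can be sketched as follows. The Gaussian kernel, the bandwidth, and the finite integration upper limit standing in for +∞ are assumptions made for illustration only.

```python
import math

def js_threshold(js_values, h, alpha, bandwidth):
    # Equations (57)-(60): estimate the density of the JS sample by a 1-D
    # Gaussian KDE, then step the lower integration limit by h until the
    # upper-tail mass first drops to the significance level alpha.
    n = len(js_values)
    c = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def f_hat(x):  # Equation (57), one-dimensional KDE of the JS sample
        return c * sum(math.exp(-0.5 * ((v - x) / bandwidth) ** 2)
                       for v in js_values)

    upper = max(js_values) + 10.0 * bandwidth  # practical stand-in for +infinity

    def tail(a, dx=h / 50.0):  # numerical integral of f_hat over [a, upper]
        steps = max(1, int((upper - a) / dx))
        return sum(f_hat(a + (k + 0.5) * dx) for k in range(steps)) * dx

    i = 1
    while tail(h * i) > alpha:  # Equation (59)
        i += 1
    return h * i                # Equation (60): JS_high = h * i
```

The loop stops at the first multiple of *h* whose upper-tail mass is at most *α*, which is exactly the bracketing condition of Equation (59).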

The following fault detection and isolation criterion is constructed from Equation (58).

**Criterion 1.** *Suppose i*<sub>0</sub> *is the pattern label corresponding to the minimum JS divergence—see Equation (53)—R*<sub>*i*0</sub> = [*r*<sub>1</sub>, *r*<sub>2</sub>, ··· , *r*<sub>*m*</sub>] *is the training data of mode i*<sub>0</sub>*, and JS*<sub>high</sub> *is the upper bound of the JS divergence—see Equation (60). If the testing data Z* = [*z*<sub>1</sub>, *z*<sub>2</sub>, ..., *z*<sub>*l*</sub>] *satisfy*

$$JS\left(\mathbf{Z}, \mathbf{R}\_{i\_0}\right) \le JS\_{\text{high}}.\tag{61}$$

*then the testing data Z and the training data R*<sub>*i*0</sub> *belong to the same failure mode; otherwise, the testing data Z are considered to originate from an unknown new failure mode, and their label is marked as q* + 1*.*
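Criterion 1 reduces to a one-line decision. The sketch below assumes the divergence `js_z_i0` = *JS*(*Z*, *R*<sub>*i*0</sub>) and the threshold `js_high` have already been computed; the function name is hypothetical.

```python
def classify(js_z_i0, i0, q, js_high):
    # Criterion 1, Equation (61): keep the label i0 when JS(Z, R_i0) is within
    # the threshold; otherwise declare an unknown new failure mode, label q + 1.
    return i0 if js_z_i0 <= js_high else q + 1
```

With *q* = 4 known modes, a divergence below the threshold keeps the matched label, while one above it returns the new-mode label 5.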

In conclusion, the fault diagnosis method based on the optimal bandwidth is summarized in Algorithm 2, and the corresponding flowchart is shown in Figure 2.
