**1. Introduction**

To ensure production safety and product quality, process monitoring technology has become an indispensable ingredient of industrial processes in recent years. It is commonly divided into model-based methods and data-driven methods. Compared with the former, the latter can take advantage of routine measurements and do not rely on process prior knowledge or precise mechanism models, which are at times unavailable or cost-intensive to obtain [1,2]. Therefore, they are widely used in modern industrial processes.

During the past decades, many data-driven process monitoring methods have been published [3–6]. Kim et al. [7] proposed a probabilistic PCA to monitor industrial processes, which first extracts redundant information from the variables and constructs a feature distribution for monitoring, but it only extracts features of the input space. Zhao et al. [8] proposed the probabilistic PLSR process monitoring method to monitor quality-related faults, which can simultaneously consider the fault characteristics of the input and output spaces. Furthermore, Chen et al. [9] proposed a probability-related PCA method for detecting incipient faults, which can greatly improve the detection ability of minor faults. Probabilistic framework modeling can overcome process noise [10]. However, the process monitoring methods mentioned above are all static methods, whereas actual production processes are dynamic and feature variable time lags [11,12].

Process dynamics refers to the mutual influence between samples before and after the current sampling instant [13]. To deal with process dynamics, Ku et al. [14] built an augmented matrix and extended the static PCA model to the dynamic PCA (DPCA) for process monitoring.


However, the introduction of the augmented matrix increases the parameter dimension, leading to the so-called curse of dimensionality [15]. Motivated by DPCA, Li et al. [16] proposed a dynamic latent variable model for monitoring the Tennessee Eastman process. In this model, an autoregressive model is used to extract the dynamic information of the data, and PCA is performed to reduce redundancy between variables. However, it divides variable order reduction and dynamic information extraction into two stages, which makes the system complex and hard to tune. In addition, compared with process variables, quality variables also contain useful fault information [17]. For this reason, Ge et al. [18] proposed a supervised linear dynamic system model for process monitoring. This method uses a first-order autoregressive equation to model first-order dynamics [19] but does not take variable time lags into account.

Variable time lag refers to the delay between the effects of variables [20]. Existing monitoring methods that consider time lags are usually divided into two categories [21]. The first is to find the time lag between variables, shift the data to eliminate the lag, and then establish a static process monitoring model on the processed data. For example, Wang et al. [22] proposed a spatial reconstruction method to identify the system time lag, then aligned the data and established a monitoring model; however, the alignment operation destroys the data structure and causes data loss. The second idea is to treat the time lag as an unknown parameter of the process monitoring model and identify it through a data-driven method. For example, Huber et al. [23] proposed to take the time lag as a parameter of a high-order state space model and solve it jointly with the model parameters, but this approach relies on the setting of the time lag parameter and the parameter identification method.

From the above discussion, it can be observed that the variable time lag characteristic of a process limits the effectiveness of previous work. However, this characteristic is common in industrial processes [24,25]. To deal with this problem, this paper proposes a process monitoring method based on a dynamic autoregressive latent variable model. Firstly, from the data point of view, a linear dynamic model is constructed between process variables and quality variables, and the dynamic information of the process input and output is compressed into latent variables; a dynamic autoregressive latent variable model (DALM) is then constructed on the latent variables to extract variable time lag information. In addition, an algorithm fusing Bayesian filtering, smoothing and expectation maximization is used to identify the model parameters. Then, the DALM is applied to industrial process monitoring: the process variables are filtered through improved Bayesian filtering to obtain the latent space distribution of the current state, and the T<sup>2</sup> statistic of the latent space is constructed and monitored [26] to realize the process monitoring task. The main contributions can be summarized as follows: (1) a process monitoring method based on a dynamic autoregressive latent variable model is proposed; (2) a dynamic autoregressive latent variable model (DALM) is developed to extract variable time lag information; (3) an algorithm fusing Bayesian filtering, smoothing and expectation maximization is developed to identify the model parameters; (4) based on the DALM, the T<sup>2</sup> statistic of the latent space is constructed to realize the process monitoring task.

The main structure of the paper is arranged as follows. In the second section, a dynamic autoregressive latent variable model is proposed, and the parameter identification algorithm of the model is derived in detail. A process monitoring method based on the DALM is proposed in the third section. The fourth section applies the monitoring method to the sintering process of ternary cathode materials to verify its monitoring performance. Finally, the last section concludes the paper.

### **2. Modeling Method Based on Dynamic Autoregressive Latent Variable Model**

This section proposes a dynamic autoregressive latent variable modeling method for the accurate modeling of dynamic industrial processes. Bayesian filtering and smoothing inference are used to obtain the spatial distribution of the latent variables, and the parameters of the model are identified in combination with the EM algorithm.

### *2.1. Dynamic Autoregressive Latent Variable Model Structure*

In order to consider the dynamic characteristics of the process, the traditional probability latent variable model [27] establishes the relationship between the current moment and the previous moment data, as shown in (1).

$$\begin{aligned} \mathbf{z}\_{t} &= \mathbf{A}\mathbf{z}\_{t-1} + \boldsymbol{\eta}\_{t}^{\mathbf{z}}, \\ \mathbf{x}\_{t} &= \mathbf{B}\_{\mathbf{x}} \mathbf{z}\_{t} + \boldsymbol{\eta}\_{t}^{\mathbf{x}}, \\ \mathbf{y}\_{t} &= \mathbf{B}\_{\mathbf{y}} \mathbf{z}\_{t} + \boldsymbol{\eta}\_{t}^{\mathbf{y}}, \end{aligned} \tag{1}$$

where the structure consists of a linear Gaussian dynamic equation and two linear Gaussian observation equations; **z***t* is the latent variable of the process state at time *t*; **x***t* and **y***t* are the process variable and quality variable at time *t*, respectively; and **A**, **Bx** and **By** are the corresponding loading matrices. The Gaussian dynamic equation describes the dynamic relationship of the process data, and the observation equations compress the information of the process data into low-dimensional latent variables. Therefore, an accurate mathematical model can be established for the dynamic process, but this structure does not consider the time lag characteristics of the process.

In order to further consider the time lag characteristics of the process, on the basis of the dynamic probabilistic latent variable model (DPLVM) [13], the trend similarity analysis algorithm [22] is first used to obtain the time lag coefficient L of the current process, and an autoregressive equation is then constructed for the latent variables to describe the variable time lag information. The autoregressive equation models the process dynamics and time lag characteristics, while the linear observation equations model the cross-correlation of the data. The probability graph model is shown in Figure 1, and the mathematical expression is given in (2).

$$\begin{aligned} \mathbf{z}\_{t} &= \mathbf{A}\mathbf{h}\_{t-1} + \boldsymbol{\eta}\_{t}^{\mathbf{z}}, \\ \mathbf{x}\_{t} &= \mathbf{B}\_{\mathbf{x}} \mathbf{z}\_{t} + \boldsymbol{\eta}\_{t}^{\mathbf{x}}, \\ \mathbf{y}\_{t} &= \mathbf{B}\_{\mathbf{y}} \mathbf{z}\_{t} + \boldsymbol{\eta}\_{t}^{\mathbf{y}}, \end{aligned} \tag{2}$$

where $\mathbf{z}_t \in R^d$ represents the latent variable of the process state at time $t$; $\mathbf{h}_{t-1} = [\mathbf{z}_{t-1}^T\ \mathbf{z}_{t-2}^T\ \cdots\ \mathbf{z}_{t-L}^T]^T \in R^{dL}$ is the augmented state variable containing the latent variables of the past $L$ time steps; $\mathbf{x}_t \in R^v$ is the observed value of the process variable at time $t$; $\mathbf{y}_t \in R^k$ is the observed value of the quality variable at time $t$; and $d$, $v$, $k$ correspond to the dimensions of the latent variables, process variables and quality variables, respectively. $\mathbf{A} \in R^{d \times dL}$ is the state transition matrix, $L$ is the time lag value of the process, and $\mathbf{B}_{\mathbf{x}} \in R^{v \times d}$, $\mathbf{B}_{\mathbf{y}} \in R^{k \times d}$ are the state divergence (emission) matrices. $\boldsymbol{\eta}_t^{\mathbf{z}} \in R^d$, $\boldsymbol{\eta}_t^{\mathbf{x}} \in R^v$ and $\boldsymbol{\eta}_t^{\mathbf{y}} \in R^k$ are Gaussian noise terms of the latent variables, process variables and quality variables, respectively. Assuming the noise terms are independent of each other, their distributions are $\boldsymbol{\eta}_t^{\mathbf{z}} \sim N(\mathbf{0}, \Sigma_{\mathbf{z}})$, $\boldsymbol{\eta}_t^{\mathbf{x}} \sim N(\mathbf{0}, \Sigma_{\mathbf{x}})$ and $\boldsymbol{\eta}_t^{\mathbf{y}} \sim N(\mathbf{0}, \Sigma_{\mathbf{y}})$, respectively. The latent variables represent the current state of the process, and this model is an extension of the traditional DPLVM.
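To illustrate the generative structure of (2), a minimal simulation sketch is given below; all dimensions, the lag order and the parameter values are arbitrary placeholders rather than quantities identified from process data.

```python
import numpy as np

rng = np.random.default_rng(0)
d, v, k, L, T = 3, 6, 1, 3, 500          # latent dim, process vars, quality vars, lag, samples (placeholders)

A  = [0.5 * rng.standard_normal((d, d)) / (i + 1) for i in range(L)]   # A = [A_1, ..., A_L]
Bx = rng.standard_normal((v, d))
By = rng.standard_normal((k, d))
Sz, Sx, Sy = 0.01 * np.eye(d), 0.05 * np.eye(v), 0.05 * np.eye(k)      # noise covariances (placeholders)

z = [rng.multivariate_normal(np.zeros(d), np.eye(d)) for _ in range(L)]  # initial states z_0, ..., z_{1-L}
X, Y = [], []
for t in range(T):
    # z_t = A h_{t-1} + eta_z, where h_{t-1} stacks the last L latent states
    z_t = sum(A[i] @ z[-(i + 1)] for i in range(L)) + rng.multivariate_normal(np.zeros(d), Sz)
    X.append(Bx @ z_t + rng.multivariate_normal(np.zeros(v), Sx))   # x_t = Bx z_t + eta_x
    Y.append(By @ z_t + rng.multivariate_normal(np.zeros(k), Sy))   # y_t = By z_t + eta_y
    z.append(z_t)

X, Y = np.array(X), np.array(Y)   # simulated process data (T x v) and quality data (T x k)
```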

**Figure 1.** Probability graph model of dynamic autoregressive latent variable model.

### *2.2. Parameter Identification Based on EM Algorithm*

Since only **x***t* and **y***t* can be observed in the process, and the latent variables are abstract quantities describing the process state and are unobservable, the EM algorithm was used to identify the parameters of the model [28]. Each iteration of the EM algorithm consists of two steps: the E step, which computes an expectation, and the M step, which performs a maximization. This section uses Bayesian filtering and smoothing to infer the spatial distribution of the latent variables, so as to solve the difficult problem of calculating the latent variable statistics.

Under the probabilistic framework, the model assumes that the latent variables at the initial moments obey a Gaussian distribution with mean $\mathbf{u}_0$ and variance $\mathbf{V}_0$, that is, $\mathbf{z}_0, \mathbf{z}_{-1}, \cdots, \mathbf{z}_{1-L} \sim N(\mathbf{u}_0, \mathbf{V}_0)$. From probability theory [16], it is easy to obtain that the latent variable $\mathbf{z}_t$, the process variable $\mathbf{x}_t$ and the quality variable $\mathbf{y}_t$ obey Gaussian distributions, as shown in (3).

$$\begin{aligned} \mathbf{z}\_{t}|\mathbf{z}\_{t-1}, \mathbf{z}\_{t-2}, \dots, \mathbf{z}\_{t-L} &\sim N(\mathbf{A}\_{1}\mathbf{z}\_{t-1} + \mathbf{A}\_{2}\mathbf{z}\_{t-2} + \dots + \mathbf{A}\_{L}\mathbf{z}\_{t-L}, \Sigma\_{\mathbf{z}}),\\ \mathbf{x}\_{t}|\mathbf{z}\_{t} &\sim N(\mathbf{B}\_{\mathbf{X}}\mathbf{z}\_{t}, \Sigma\_{\mathbf{x}}),\\ \mathbf{y}\_{t}|\mathbf{z}\_{t} &\sim N(\mathbf{B}\_{\mathbf{Y}}\mathbf{z}\_{t}, \Sigma\_{\mathbf{y}}).\end{aligned} \tag{3}$$

The parameters to be identified are denoted as $\Theta = \{\mathbf{A}, \mathbf{B}_{\mathbf{x}}, \mathbf{B}_{\mathbf{y}}, \mathbf{u}_0, \mathbf{V}_0, \Sigma_{\mathbf{z}}, \Sigma_{\mathbf{x}}, \Sigma_{\mathbf{y}}\}$, of which $\mathbf{A} = [\mathbf{A}_1, \mathbf{A}_2, \cdots, \mathbf{A}_L]$. Following the maximum likelihood principle, the parameter identification problem is transformed into maximizing the likelihood of the observed data $\mathbf{x}_{1:T}, \mathbf{y}_{1:T}$. The log-likelihood function of the parameters $\Theta$ is shown in (4), where $\mathbf{x}_{1:T}$ represents the observation sequence $\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_T$ of the process variable, $\mathbf{y}_{1:T}$ represents the observation sequence $\mathbf{y}_1, \mathbf{y}_2, \cdots, \mathbf{y}_T$ of the quality variable, and $T$ is the total number of training samples, namely,

$$
\Theta^{new} = \arg\max\_{\Theta} \log P(\mathbf{x}\_{1:T}, \mathbf{y}\_{1:T} | \Theta). \tag{4}
$$

The EM algorithm [29] was used to solve the optimization problem in Equation (4). In the E step of the EM algorithm, the expectation of the complete-data log-likelihood function $\log P(\mathbf{x}_{1:T}, \mathbf{y}_{1:T}, \mathbf{z}_{1-L:T}|\Theta)$ with respect to the conditional distribution of the latent variables $\mathbf{z}_{1-L:T}$ has to be calculated to obtain the objective cost function (Q function), as shown in (5).

$$Q\left(\Theta|\Theta^{old}\right) = E\_{\mathbf{z}\_{1-L:T}\mid\left(\mathbf{x}\_{1:T},\mathbf{y}\_{1:T},\Theta^{old}\right)}\left\{\log P(\mathbf{x}\_{1:T},\mathbf{y}\_{1:T},\mathbf{z}\_{1-L:T}|\Theta)\right\}.\tag{5}$$

Actually, the likelihood function can be formulated by the application of the product rule of probability. From the model structure, the log-likelihood function of the complete data was expanded, as expressed by (6).

$$\begin{split} & \log P\left(\mathbf{x}\_{1:T}, \mathbf{y}\_{1:T}, \mathbf{z}\_{(-L+1):T} | \Theta \right) \\ &= \log \left\{ P(\mathbf{z}\_{0}, \mathbf{z}\_{-1}, \cdots, \mathbf{z}\_{-L+1}) \prod\_{t=1}^{T} P(\mathbf{z}\_{t} | \mathbf{z}\_{t-1}, \mathbf{z}\_{t-2}, \cdots, \mathbf{z}\_{t-L}) P(\mathbf{x}\_{t} | \mathbf{z}\_{t}) P(\mathbf{y}\_{t} | \mathbf{z}\_{t}) \right\} \\ &= \log P(\mathbf{z}\_{0}, \mathbf{z}\_{-1}, \cdots, \mathbf{z}\_{-L+1}) + \sum\_{t=1}^{T} \log P(\mathbf{z}\_{t} | \mathbf{z}\_{t-1}, \mathbf{z}\_{t-2}, \cdots, \mathbf{z}\_{t-L}) + \sum\_{t=1}^{T} \log P(\mathbf{x}\_{t} | \mathbf{z}\_{t}) \\ &+ \sum\_{t=1}^{T} \log P(\mathbf{y}\_{t} | \mathbf{z}\_{t}). \end{split} \tag{6}$$
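To make the factorization in (6) concrete, the sketch below evaluates the complete-data log-likelihood for a given latent trajectory and given parameters. For simplicity, it treats the L initial latent states as independent draws from N(u0, V0); this is an assumption of the illustration, not a claim about the paper's exact initial distribution.

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def complete_loglik(Z, X, Y, A, Bx, By, u0, V0, Sz, Sx, Sy, L):
    """Complete-data log-likelihood (6): initial term + dynamic terms + two observation terms.
    Z has shape (T + L, d) and stacks z_{1-L}, ..., z_0, z_1, ..., z_T; X is (T, v), Y is (T, k)."""
    T = X.shape[0]
    # log P(z_0, z_{-1}, ..., z_{1-L}); the initial states are treated as independent N(u0, V0) here
    ll = sum(mvn.logpdf(Z[i], mean=u0, cov=V0) for i in range(L))
    for t in range(T):
        h = np.concatenate([Z[L + t - 1 - i] for i in range(L)])   # h_{t-1} = [z_{t-1}; ...; z_{t-L}]
        ll += mvn.logpdf(Z[L + t], mean=A @ h, cov=Sz)              # log P(z_t | z_{t-1}, ..., z_{t-L})
        ll += mvn.logpdf(X[t], mean=Bx @ Z[L + t], cov=Sx)          # log P(x_t | z_t)
        ll += mvn.logpdf(Y[t], mean=By @ Z[L + t], cov=Sy)          # log P(y_t | z_t)
    return ll
```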

For clarity, we denote $E_{\mathbf{z}_T}(f(\mathbf{x}_t, \mathbf{y}_t, \mathbf{z}_t)) = E_{\mathbf{z}_{1-L:T}|(\mathbf{x}_{1:T}, \mathbf{y}_{1:T}, \Theta^{old})}(f(\mathbf{x}_t, \mathbf{y}_t, \mathbf{z}_t))$; then the expectation of the complete-data likelihood function $\log P(\mathbf{x}_{1:T}, \mathbf{y}_{1:T}, \mathbf{z}_{(1-L):T}|\Theta)$ with respect to the latent variable distribution $P(\mathbf{z}_{(1-L):T}|\mathbf{x}_{1:T}, \mathbf{y}_{1:T}, \Theta^{old})$ is shown in (7).

$$\begin{aligned}
Q\left(\Theta|\Theta^{old}\right) &= E\_{\mathbf{z}\_{T}}\{\log p(\mathbf{x}\_{1:T}, \mathbf{y}\_{1:T}, \mathbf{z}\_{1-L:T}|\Theta)\} \\
&= -\frac{1}{2}\left\{\log|\mathbf{V}\_{0}| + E\_{\mathbf{z}\_{T}}\left(\begin{bmatrix}\mathbf{z}\_{0}\\ \vdots\\ \mathbf{z}\_{-L+1}\end{bmatrix}^{T}\mathbf{V}\_{0}^{-1}\begin{bmatrix}\mathbf{z}\_{0}\\ \vdots\\ \mathbf{z}\_{-L+1}\end{bmatrix}\right) - 2E\_{\mathbf{z}\_{T}}\left(\begin{bmatrix}\mathbf{z}\_{0}\\ \vdots\\ \mathbf{z}\_{-L+1}\end{bmatrix}^{T}\right)\mathbf{V}\_{0}^{-1}\mathbf{u}\_{0} + \mathbf{u}\_{0}^{T}\mathbf{V}\_{0}^{-1}\mathbf{u}\_{0}\right\} \\
&\quad -\frac{1}{2}\left\{T\log|\Sigma\_{\mathbf{z}}| + \sum\_{t=1}^{T}\left\{E\_{\mathbf{z}\_{T}}\left(\mathbf{z}\_{t}^{T}\Sigma\_{\mathbf{z}}^{-1}\mathbf{z}\_{t}\right) - 2E\_{\mathbf{z}\_{T}}\left(\begin{bmatrix}\mathbf{z}\_{t-1}\\ \vdots\\ \mathbf{z}\_{t-L}\end{bmatrix}^{T}\mathbf{A}^{T}\Sigma\_{\mathbf{z}}^{-1}\mathbf{z}\_{t}\right) + E\_{\mathbf{z}\_{T}}\left(\begin{bmatrix}\mathbf{z}\_{t-1}\\ \vdots\\ \mathbf{z}\_{t-L}\end{bmatrix}^{T}\mathbf{A}^{T}\Sigma\_{\mathbf{z}}^{-1}\mathbf{A}\begin{bmatrix}\mathbf{z}\_{t-1}\\ \vdots\\ \mathbf{z}\_{t-L}\end{bmatrix}\right)\right\}\right\} \\
&\quad -\frac{1}{2}\left\{T\log|\Sigma\_{\mathbf{x}}| + \sum\_{t=1}^{T}\left[\mathbf{x}\_{t}^{T}\Sigma\_{\mathbf{x}}^{-1}\mathbf{x}\_{t} - 2E\_{\mathbf{z}\_{T}}\left(\mathbf{z}\_{t}^{T}\right)\mathbf{B}\_{\mathbf{x}}^{T}\Sigma\_{\mathbf{x}}^{-1}\mathbf{x}\_{t} + E\_{\mathbf{z}\_{T}}\left(\mathbf{z}\_{t}^{T}\mathbf{B}\_{\mathbf{x}}^{T}\Sigma\_{\mathbf{x}}^{-1}\mathbf{B}\_{\mathbf{x}}\mathbf{z}\_{t}\right)\right]\right\} \\
&\quad -\frac{1}{2}\left\{T\log|\Sigma\_{\mathbf{y}}| + \sum\_{t=1}^{T}\left[\mathbf{y}\_{t}^{T}\Sigma\_{\mathbf{y}}^{-1}\mathbf{y}\_{t} - 2E\_{\mathbf{z}\_{T}}\left(\mathbf{z}\_{t}^{T}\right)\mathbf{B}\_{\mathbf{y}}^{T}\Sigma\_{\mathbf{y}}^{-1}\mathbf{y}\_{t} + E\_{\mathbf{z}\_{T}}\left(\mathbf{z}\_{t}^{T}\mathbf{B}\_{\mathbf{y}}^{T}\Sigma\_{\mathbf{y}}^{-1}\mathbf{B}\_{\mathbf{y}}\mathbf{z}\_{t}\right)\right]\right\} + \mathrm{constant}.
\end{aligned} \tag{7}$$

Appendix A provides the detailed update of all parameters in the M step. From (7), the statistics of the latent variables involved in the Q function are $E_{\mathbf{z}_T}(\mathbf{z}_t)$, $E_{\mathbf{z}_T}(\mathbf{z}_t\mathbf{z}_t^T)$ and $E_{\mathbf{z}_T}(\mathbf{z}_t\mathbf{z}_{t-i}^T)$, where $t = 0, 1, \cdots, T$ and $i = 1, 2, \cdots, L$. In fact, these statistics can be computed from the posterior probability distribution of the latent variables obtained in the E step of the EM algorithm. The calculation results of these statistics are shown in (8), and the detailed derivation is given in Appendix B.

$$\begin{aligned}
E\_{\mathbf{z}\_{T}}(\mathbf{z}\_{t}) &= E\left(\mathbf{z}\_{t}|\mathbf{x}\_{1:T}, \mathbf{y}\_{1:T}, \Theta^{old}\right) = \mathbf{m}\_{t}^{1}, \\
E\_{\mathbf{z}\_{T}}\left(\mathbf{z}\_{t}\mathbf{z}\_{t}^{T}\right) &= E\left(\mathbf{z}\_{t}\mathbf{z}\_{t}^{T}|\mathbf{x}\_{1:T}, \mathbf{y}\_{1:T}, \Theta^{old}\right) = \mathbf{M}\_{t}^{11} + \mathbf{m}\_{t}^{1}\left(\mathbf{m}\_{t}^{1}\right)^{T}, \\
E\_{\mathbf{z}\_{T}}\left(\mathbf{z}\_{t}\mathbf{z}\_{t-i}^{T}\right) &= E\left(\mathbf{z}\_{t}\mathbf{z}\_{t-i}^{T}|\mathbf{x}\_{1:T}, \mathbf{y}\_{1:T}, \Theta^{old}\right) = \mathbf{M}\_{t}^{1(i+1)} + \mathbf{m}\_{t}^{1}\left(\mathbf{m}\_{t}^{(i+1)}\right)^{T}, \\
E\_{\mathbf{z}\_{T}}\left(\mathbf{z}\_{t}\mathbf{z}\_{t-L}^{T}\right) &= E\left(\mathbf{z}\_{t}\mathbf{z}\_{t-L}^{T}|\mathbf{x}\_{1:T}, \mathbf{y}\_{1:T}, \Theta^{old}\right) = \sum\_{i=1}^{L}\mathbf{A}\_{i}\left(\mathbf{M}\_{t-1}^{iL} + \mathbf{m}\_{t-1}^{i}\left(\mathbf{m}\_{t-1}^{L}\right)^{T}\right),
\end{aligned} \tag{8}$$

where $\mathbf{m}_t$ and $\mathbf{M}_t$ are the mean and covariance of the posterior probability distribution of the latent variable. Therefore, by iterating the E step and M step until the parameters converge, the optimized parameter set $\Theta^{opt} = \{\mathbf{A}, \mathbf{B}_{\mathbf{x}}, \mathbf{B}_{\mathbf{y}}, \mathbf{u}_0, \mathbf{V}_0, \Sigma_{\mathbf{z}}, \Sigma_{\mathbf{x}}, \Sigma_{\mathbf{y}}\}$ can be obtained.
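To summarize the identification procedure, a high-level sketch of the EM loop is given below. The helpers `initialize_parameters`, `kalman_filter_smoother` and `m_step_update` are hypothetical placeholders for the initialization, the filtering/smoothing recursions of Appendix B and the closed-form updates of Appendix A; they are not library functions.

```python
import numpy as np

def em_identify(X, Y, L, d, n_iter=50, tol=1e-4):
    """Sketch of the EM loop for the DALM; the E/M step details follow the appendices."""
    theta = initialize_parameters(X, Y, L, d)                  # assumed helper: random or PCA-based initial guess
    prev_loglik = -np.inf
    for _ in range(n_iter):
        # E step: run filtering/smoothing on the augmented state h_t = [z_t; ...; z_{t-L+1}]
        # to obtain the smoothed means m_t, covariances M_t and the statistics in (8).
        stats, loglik = kalman_filter_smoother(X, Y, theta)    # assumed helper
        # M step: maximize the Q function (7) w.r.t. A, Bx, By, u0, V0, Sigma_z, Sigma_x, Sigma_y.
        theta = m_step_update(stats, X, Y, theta)              # assumed helper
        if abs(loglik - prev_loglik) < tol:                    # stop when the likelihood converges
            break
        prev_loglik = loglik
    return theta
```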

### **3. Process Monitoring Method Based on Dynamic Autoregressive Latent Variable Model**

In this section, the established dynamic autoregressive latent variable model is used for industrial process monitoring. First, the DALM is used to model the process data so that the current state information is reflected in the latent variables; then, the latent space distribution at the current time is obtained by filtering the process data, and statistics are constructed on this distribution and monitored. The monitoring procedure is introduced in detail below.

Although the latent space is unobservable, establishing a data-driven DALM based on the characteristics of the process data extracts the information of the process variables into the spatial distribution of the latent variables. The process input $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_N]$ and output $\mathbf{Y} = [\mathbf{y}_1, \mathbf{y}_2, \cdots, \mathbf{y}_N]$ need to be pre-processed by the normalization shown in (9).

$$\begin{aligned} \mathbf{X}^{q} &= (\mathbf{X} - \mathbf{u}\_{\mathbf{x}}) \cdot \mathbf{std}\_{\mathbf{x}}^{-1}, \\ \mathbf{Y}^{q} &= (\mathbf{Y} - \mathbf{u}\_{\mathbf{y}}) \cdot \mathbf{std}\_{\mathbf{y}}^{-1}, \end{aligned} \tag{9}$$

where **ux** and **uy** are the means of the variables **X** and **Y**, and **stdx** and **stdy** are their standard deviations. The preprocessed data are filtered through the filtering algorithm to obtain the spatial distribution of the latent variables, as shown in (10).

$$\left(\mathbf{z}\_{t}^{q}, \mathbf{z}\_{t-1}^{q}, \cdots, \mathbf{z}\_{t-L+1}^{q}\right)|\mathbf{x}\_{1:t}^{q}, \mathbf{y}\_{1:t}^{q} \sim N\left(\mathbf{u}\_{t}^{q}, \mathbf{V}\_{t}^{q}\right) = N\left(\begin{bmatrix} \mathbf{u}\_{t}^{1q} \\ \vdots \\ \mathbf{u}\_{t}^{Lq} \end{bmatrix}, \begin{bmatrix} \mathbf{V}\_{t}^{11q} & \cdots & \mathbf{V}\_{t}^{1Lq} \\ \vdots & \ddots & \vdots \\ \mathbf{V}\_{t}^{L1q} & \cdots & \mathbf{V}\_{t}^{LLq} \end{bmatrix}\right). \tag{10}$$

Among them, $\mathbf{u}_t^{1q}$ and $\mathbf{V}_t^{11q}$ are the mean and variance of the latent space distribution, respectively, which are obtained from (11). The detailed derivation is given in (A12)–(A16).

$$\begin{aligned}
\mathbf{u}\_{t}^{1q} &= \mathbf{g}\_{t}^{1} + \begin{bmatrix} \mathbf{G}\_{t}^{1(L+1)} & \mathbf{G}\_{t}^{1(L+2)} \end{bmatrix} \begin{bmatrix} \mathbf{G}\_{t}^{(L+1)(L+1)} & \mathbf{G}\_{t}^{(L+1)(L+2)} \\ \mathbf{G}\_{t}^{(L+2)(L+1)} & \mathbf{G}\_{t}^{(L+2)(L+2)} \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{x}\_{t} - \mathbf{g}\_{t}^{L+1} \\ \mathbf{y}\_{t} - \mathbf{g}\_{t}^{L+2} \end{bmatrix}, \\
\mathbf{V}\_{t}^{1q} &= \begin{bmatrix} \mathbf{G}\_{t}^{11} & \cdots & \mathbf{G}\_{t}^{1L} \end{bmatrix} - \begin{bmatrix} \mathbf{G}\_{t}^{1(L+1)} & \mathbf{G}\_{t}^{1(L+2)} \end{bmatrix} \begin{bmatrix} \mathbf{G}\_{t}^{(L+1)(L+1)} & \mathbf{G}\_{t}^{(L+1)(L+2)} \\ \mathbf{G}\_{t}^{(L+2)(L+1)} & \mathbf{G}\_{t}^{(L+2)(L+2)} \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{G}\_{t}^{1(L+1)} & \mathbf{G}\_{t}^{1(L+2)} \\ \vdots & \vdots \\ \mathbf{G}\_{t}^{L(L+1)} & \mathbf{G}\_{t}^{L(L+2)} \end{bmatrix}^{T}.
\end{aligned} \tag{11}$$
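Equations (10) and (11) are an instance of conditioning a joint Gaussian distribution on its observed block. A generic sketch of this step is given below, where `g` and `G` stand for the joint predictive mean and covariance of the stacked latent and observed variables, and `idx_lat`, `idx_obs` select the latent and observed components; these names are illustrative rather than the paper's notation.

```python
import numpy as np

def condition_gaussian(g, G, idx_lat, idx_obs, obs):
    """Posterior N(u, V) of the latent block given the observed block of a joint Gaussian N(g, G)."""
    g_l, g_o = g[idx_lat], g[idx_obs]
    G_ll = G[np.ix_(idx_lat, idx_lat)]
    G_lo = G[np.ix_(idx_lat, idx_obs)]
    G_oo = G[np.ix_(idx_obs, idx_obs)]
    K = G_lo @ np.linalg.inv(G_oo)      # gain, as in (11)
    u = g_l + K @ (obs - g_o)           # conditional mean
    V = G_ll - K @ G_lo.T               # conditional covariance
    return u, V
```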

It can be seen from (A24) that the information of the data $\mathbf{X}_t^q = [\mathbf{x}_1^q, \mathbf{x}_2^q, \cdots, \mathbf{x}_t^q]$ and $\mathbf{Y}_t^q = [\mathbf{y}_1^q, \mathbf{y}_2^q, \cdots, \mathbf{y}_t^q]$ at the current and previous moments is filtered into the current latent variable, and the latent variable distribution at the current moment is shown in (12).

$$\mathbf{z}\_{t}^{q}|\mathbf{x}\_{1:t}^{q}, \mathbf{y}\_{1:t}^{q} \sim N\left(\mathbf{u}\_{t}^{1q}, \mathbf{V}\_{t}^{11q}\right). \tag{12}$$

Because the latent space contains the current state of the process dynamics and variable time lag information, the process statistic T<sup>2</sup> was constructed for the current latent variable at time *t*, as shown in (13).

$$T\_{t,q}^{2} = E\left(\mathbf{z}\_{t}^{q}|\mathbf{x}\_{1:t}^{q}, \mathbf{y}\_{1:t}^{q}\right)^{T} \mathrm{Cov}\left(\mathbf{z}\_{t}^{q}|\mathbf{x}\_{1:t}^{q}, \mathbf{y}\_{1:t}^{q}\right)^{-1} E\left(\mathbf{z}\_{t}^{q}|\mathbf{x}\_{1:t}^{q}, \mathbf{y}\_{1:t}^{q}\right). \tag{13}$$

Among them, the mathematical expectation and variance of the latent variable given the observation data at the current moment are shown in (14).

$$\begin{aligned} E\left(\mathbf{z}\_{t}^{q}|\mathbf{x}\_{1:t}^{q}, \mathbf{y}\_{1:t}^{q}\right) &= \mathbf{u}\_{t}^{1q}, \\ \mathrm{Cov}\left(\mathbf{z}\_{t}^{q}|\mathbf{x}\_{1:t}^{q}, \mathbf{y}\_{1:t}^{q}\right) &= \mathbf{V}\_{t}^{11q}. \end{aligned} \tag{14}$$

The latent variable follows a Gaussian distribution; therefore, according to the definition of the chi-square distribution, this statistic obeys a chi-square distribution with *d* degrees of freedom after data preprocessing. Then, combining the latent variable dimension *d* of the model with the significance level *α* required by the industry, the control threshold *T*<sup>2</sup><sub>lim</sub> of the process monitoring method is obtained as *χ*<sup>2</sup><sub>*α*</sub>(*d*). The statistic of each online sample is then calculated and compared with the control threshold to determine whether the process has deviated from the normal state. The process monitoring logic is given by (15).

$$T\_{t,q}^{2} < T\_{\lim}^{2} = \chi\_{\alpha}^{2}(d). \tag{15}$$

Too large an *α* value leads to a high false alarm rate, while too small an *α* leads to a high missed alarm rate; therefore, in practice, it is a balance between false alarms and missed alarms. This paper chose *α* = 0.01, which means that the false positive rate on normal data is 0.01. If *T*<sup>2</sup><sub>*t*,*q*</sub> < *T*<sup>2</sup><sub>lim</sub>, the system is in a normal state; otherwise, the process is in a fault state, and further diagnosis and identification of the fault are required for process maintenance. The procedure of DALM modeling and online process monitoring is shown in Figure 2.
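The decision logic of (13)–(15) can be written compactly as below, a minimal sketch assuming the filtered mean u<sub>t</sub><sup>1q</sup> and covariance V<sub>t</sub><sup>11q</sup> are already available from the filtering step; only NumPy and SciPy are used.

```python
import numpy as np
from scipy.stats import chi2

def t2_alarm(u_t, V_t, d, alpha=0.01):
    """Compute T^2 of the current latent state and compare it with the chi-square control threshold."""
    t2 = float(u_t.T @ np.linalg.inv(V_t) @ u_t)   # T^2 statistic, Equation (13)
    t2_lim = chi2.ppf(1.0 - alpha, df=d)           # control threshold chi^2_alpha(d), Equation (15)
    return t2, t2_lim, t2 > t2_lim                 # True indicates a fault alarm
```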

The main steps of the process monitoring method based on the DALM model are as follows (a sketch of the online steps is given after the list):

Step 1: Collect process data, divide the training and test data sets and standardize them.

Step 2: Use the training data set to learn the parameters of the DALM model.

Step 3: Build the model and determine the control threshold.

Step 4: Filter the process data online to get the latent space distribution at the current moment.

Step 5: Calculate the statistic and compare it with the control threshold to determine whether the process is abnormal.
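A compact sketch of the online phase (Steps 4 and 5) follows; `filter_latent` is an assumed recursive helper returning the mean and covariance of (12) together with the updated belief, and `t2_alarm` is the sketch given earlier.

```python
import numpy as np

def monitor_online(X_new, Y_new, theta, ux, stdx, uy, stdy, d, alpha=0.01):
    """Online phase: normalize each sample, filter to the latent distribution (12), test T^2 as in (13)-(15)."""
    belief = None                                                  # running filtered belief over the augmented latent state
    t2_values, alarms = [], []
    for x_t, y_t in zip(X_new, Y_new):
        xq, yq = (x_t - ux) / stdx, (y_t - uy) / stdy              # scale with the training statistics from (9)
        u_t, V_t, belief = filter_latent(xq, yq, theta, belief)    # Step 4: assumed recursive filtering helper
        t2, t2_lim, is_fault = t2_alarm(u_t, V_t, d, alpha)        # Step 5: reuse the T^2 sketch above
        t2_values.append(t2)
        alarms.append(is_fault)
    return np.array(t2_values), np.array(alarms)
```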

**Figure 2.** Flowchart of the DALM-based process monitoring method.

### **4. Case Study on the Sintering Process of Ternary Cathode Materials**

In this section, the proposed process monitoring method based on the dynamic autoregressive latent variable model is used to monitor the sintering process of ternary cathode materials to verify its effectiveness. First, the sintering process technology of the ternary cathode material is introduced; then, the model structure and parameter determination are described in detail; finally, the performance of the model is evaluated.

### *4.1. Introduction to the Sintering Process of Ternary Cathode Materials*

The rapid development of the new energy industry has led to an extremely urgent demand for high-quality ternary cathode materials, and the sintering process of battery materials is the core and key step of battery preparation. This process consists of a series connection of a heating section, a constant temperature section and a cooling section, as shown in Figure 3. The optimal production state of a single temperature section cannot guarantee that the product performance indicators of the entire sintering process are within the optimal range; at the same time, changes in the sintering environment, such as humidity or temperature, also affect the stability of the product performance indicators. In order to keep the product performance indicators as stable as possible while reducing energy and material consumption, the sintering parameters of the kiln must be adjusted in real time according to the sintering state. This leads to many variables in each temperature zone and serial coupling between zones, so the process data present complex characteristics [29].

**Figure 3.** Structure diagram of sintering furnace for battery sintering.

The temperature field in the sintering process has a significant effect on the material properties. Over-firing causes changes in the material morphology and internal structure, while under-firing does not provide sufficient activation energy for the chemical reactions. Moreover, the decomposition reaction that occurs in the heating section is endothermic and requires a sufficient heat supply; otherwise a reverse reaction will occur, resulting in inefficient water removal, which affects the subsequent oxidation reaction. Therefore, the state of the heating section is very important to the sintering process. At the same time, the residual lithium content directly reflects the quality of the product. In order to monitor the process status in real time, a monitoring model is established for the heating-section temperatures and the residual lithium content.

Huang et al. [30] established a temperature field monitoring model based on the PBF equipment equation to monitor the dynamic sintering process of parts, but this method requires precise grinding tool structure parameters and can only monitor uniformly distributed temperature fields. Egorova et al. [31] combined neural networks with a PCA diagnosis method to monitor and diagnose the sintering process. This method can locate the fault and diagnose its cause; however, the introduction of neural networks increases the time and space complexity of the system, and the method ignores the dynamics and time lag of the system.

Due to the severe temperature interval coupling, the process variables exhibit complex characteristics, making the traditional static monitoring methods unable to achieve accurate monitoring results. The dynamic autoregressive latent variable model proposed in this section considers the dynamic and time lag information of the process at the same time, so it is more in line with the sintering process.

### *4.2. Determination of Model Parameters*

This section establishes a monitoring model for the temperature and product quality in the heating section of the sintering process. The heating section contains seven temperature zones, and each temperature zone has an upper and a lower temperature measuring point, but the temperature changes in the 4th to 7th temperature zones are not obvious. Therefore, the temperatures of the first three temperature zones were selected as the process variables **x***t* of the model. At the same time, the residual lithium content of the product, which reflects the quality of the battery, was selected as the quality variable **y***t* of the model. Table 1 lists the physical meaning of these variables.

**Table 1.** Selected variables in the sintering process.


To test the monitoring effect of the model under different faults, a total of 2200 continuous-time data samples were collected on site with a sampling period of five minutes. The process included three types of faults: over-temperature, under-temperature and shutdown. Detailed status information is given in Table 2.



First, the dynamics of the data and the time lag characteristics of the variables were analyzed from the data point of view. Figure 4 shows the autocorrelation and cross-correlation plots of the process data.

**Figure 4.** Correlation plots for the first four process variables.

Figure 4 shows the autocorrelation and cross-correlation between the first four process variables. The value at lag 0 in each plot represents the cross-correlation between variables; the values at non-zero lags show the correlation between variables under different time lags. It is worth mentioning that the cross-correlation index measures the redundancy of variable information, while the autocorrelation index indirectly measures the dynamic and time-delay information between variables. It can be seen that the cross-correlation between the variables was above 0.5, indicating strong redundant information between the variables. At the same time, even with a difference of 10 sampling instants, the autocorrelation between the variables was still very high, indicating that there were time lags and dynamic characteristics between the variables. Therefore, a DALM can be established for the process: the emission equation of the model extracts the redundant information of the data, and the autoregressive equation extracts the dynamic and time lag information of the variables. This paper uses the trend similarity algorithm, which constructs and solves a trend similarity function according to the time lag feature, to determine the time lag coefficient, that is, L = 3.
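The correlation analysis of Figure 4 can be reproduced with a short sketch; the data matrix below is a random placeholder standing in for the normalized temperature measurements, since the plant data are not reproduced here.

```python
import numpy as np

# placeholder for the normalized data of the first four process variables (2200 samples assumed)
X = np.random.default_rng(0).standard_normal((2200, 4))

def corr_at_lag(a, b, lag):
    """Pearson correlation between series a and series b delayed by `lag` samples."""
    if lag == 0:
        return np.corrcoef(a, b)[0, 1]
    return np.corrcoef(a[lag:], b[:-lag])[0, 1]

# cross-correlations at lag 0 and lagged correlations up to 10 sampling periods, as in Figure 4
corr = {(i, j): [corr_at_lag(X[:, i], X[:, j], lag) for lag in range(11)]
        for i in range(4) for j in range(4)}
```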

To verify the rationality of this time lag coefficient, dynamic autoregressive latent variable monitoring models were established under different time lag coefficients. Note that, in order to prevent the latent variable dimension from interfering with the selection of the time lag coefficient, the latent variable dimension selected by the Akaike information criterion (AIC) was temporarily used [32]. The false alarm rate (FAR) and fault detection rate (FDR) were defined as performance indicators to evaluate the monitoring performance, as given in (16).

$$FAR = \frac{N\_{FAR}}{N\_{n}}, \quad FDR = \frac{N\_{FDR}}{N\_{f}}. \tag{16}$$

*N*<sub>FAR</sub> represents the number of normal samples mistakenly detected as abnormal by the monitoring method, and *N*<sub>n</sub> is the number of all normal samples; *N*<sub>FDR</sub> represents the number of fault samples correctly detected by the monitoring method, and *N*<sub>f</sub> is the number of all abnormal samples. Therefore, the closer the FAR is to the significance level and the closer the FDR is to 1, the better. The significance level in this work was set to 0.01.
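Given ground-truth fault labels and the alarm decisions of the monitor, FAR and FDR in (16) can be computed as in the following sketch.

```python
import numpy as np

def far_fdr(alarm, fault):
    """alarm, fault: boolean arrays; FAR over normal samples, FDR over faulty samples, as in (16)."""
    normal = ~fault
    far = np.sum(alarm & normal) / np.sum(normal)   # N_FAR / N_n
    fdr = np.sum(alarm & fault) / np.sum(fault)     # N_FDR / N_f
    return far, fdr
```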

The first 1000 normal samples were selected to train the model, and the fault 1 data were used to test the monitoring effect of the model. Table 3 shows the monitoring result indicators of the new method under different time lag coefficients.

**Table 3.** FAR and FDR under different time lag.


The model did not converge when the time lag coefficient was 5, and when the time lag coefficient was 3, the fault detection rate and false alarm rate of the model were the best. Therefore, the model gave the best performance with a time lag coefficient of 3. To visualize the monitoring results of the model, Figure 5 shows the monitoring T<sup>2</sup> charts for time lag coefficients of 2, 3 and 4.

**Figure 5.** Monitoring performance under different time lag. (**a**) Monitoring result at order 2; (**b**) monitoring result at order 3; (**c**) monitoring result at order 4.

It can be seen from Figure 5 that when the time lag coefficient was 2 or 4, samples were easily misclassified. In particular, in the fault interval from the 201st to the 400th sample, both the normal and the abnormal samples were close to the monitoring threshold, which shows that the robustness of the model with these time lags is low. When the time lag coefficient was 3, the model was insensitive to noise and produced the fewest false alarms; hence, its FAR and FDR were the best. Therefore, the time lag coefficient obtained by the trend similarity identification algorithm enabled the model to obtain a better monitoring effect.

Next, the latent variable dimension was determined. The latent variable dimension is the result of comprehensively considering the complexity and accuracy of the model. The root mean square error (RMSE) is an indicator to measure the accuracy of the model. The expression is shown in (17).

$$RMSE = \sqrt{\sum\_{i=1}^{N} \frac{\left(\hat{\mathbf{y}}\_{i} - \mathbf{y}\_{i}\right)^{2}}{N}}, \tag{17}$$

where *N* is the number of test samples, $\hat{\mathbf{y}}_i$ is the prediction of the true value $\mathbf{y}_i$, and $\mathbf{y}_i$ is the true value of the *i*-th test sample. Samples from the 1st to the 600th were used to train the model, and samples from the 601st to the 1000th were used as the test set. Table 4 shows the root mean square error of the model prediction under different latent variable dimensions.
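As a reference, the RMSE of (17) can be computed as below; `y_true` and `y_pred` stand for the test-set quality values and the model predictions, respectively.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square prediction error over the N test samples, Equation (17)."""
    return np.sqrt(np.mean((y_pred - y_true) ** 2))
```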

**Table 4.** RMSE under different latent variable dimensions.


Table 4 shows that the prediction performance of the model tends to be stable once the latent variable dimension reaches 3, which is the balance point between model complexity and accuracy. It is worth mentioning that, under this time lag coefficient, the latent variable dimension selected by the AIC algorithm was also 3, so the latent variable dimension was determined to be 3.
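As a complement, the AIC-based selection of the latent dimension mentioned above can be sketched as follows; `loglik` and `n_params` are assumed callables returning the maximized log-likelihood and the number of free parameters of a DALM with the given latent dimension, since their exact form depends on the Section 2 implementation.

```python
def select_latent_dim(candidates, loglik, n_params):
    """Pick the latent dimension minimizing AIC = 2*k - 2*max log-likelihood."""
    aic = {d: 2 * n_params(d) - 2 * loglik(d) for d in candidates}
    return min(aic, key=aic.get), aic

# usage sketch: best_d, scores = select_latent_dim(range(1, 7), loglik, n_params)
```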
