#### *2.3. Chen Dynamics*

This method is a combination of the two previously described methods. It assumes that matrix M does not change over time; therefore, the time derivative of M is zero. If we multiply Equation (8) by M and then sum it with Equation (14), we obtain a new dynamical system [39]:

$$M\dot{X}(t) = -\gamma M^T M \left( M X(t) - I \right) - \gamma \left( M X(t) - I \right) = -\gamma \left( M^T M + I \right) \left( M X(t) - I \right) \tag{17}$$

Solving Equation (17) leads to the following solution:

$$X(t) = -\mathbb{C}\,M^{-1}e^{-(M^TM+I)t} + M^{-1} \tag{18}$$

Differentiating Equation (18) leads to the following dynamical system:

$$\dot{X}(t) = \mathbb{C}\left(M^T + M^{-1}\right)e^{-(M^TM + I)t} \tag{19}$$

Combining Equation (17) with Equation (18) leads to Equation (19).

If $M^T M > 0$, this method has a better convergence rate than the first and second methods described above [39]. As seen in Equation (19), t is multiplied by $(M^T M + I)$ in the exponent term, which produces a larger exponent value than for the previous two methods. This model therefore has a very good convergence rate and a very low sensitivity to noise. On the other hand, its implementation is difficult, as it has more coefficients to compute than the other methods, and it is not suitable for real-time matrix inversion (i.e., for inverting a time-varying matrix).
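For illustration, Equation (17) can be simulated numerically. The following minimal Python sketch integrates the Chen dynamics with a simple forward-Euler scheme; the matrix M, the gain γ, the step size, and the iteration count are arbitrary demonstration choices, not values taken from [39]:

```python
# Minimal sketch of the Chen dynamics of Equation (17) for a constant matrix M.
# All numerical values below are illustrative assumptions.
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])      # example nonsingular matrix (assumption)
gamma = 1.0                     # convergence gain (assumption)
dt, steps = 1e-3, 20000         # Euler step size and horizon (assumption)

I = np.eye(2)
X = np.zeros_like(M)            # arbitrary initial value X(0)
A = M.T @ M + I                 # the (M^T M + I) factor of Equation (17)

for _ in range(steps):
    # M Xdot = -gamma (M^T M + I)(M X - I); solve for Xdot at each step
    Xdot = -gamma * np.linalg.solve(M, A @ (M @ X - I))
    X = X + dt * Xdot

print("residual ||X - M^{-1}|| =", np.linalg.norm(X - np.linalg.inv(M)))
```

Solving for $\dot{X}$ with `np.linalg.solve` only mirrors the implicit left-hand side $M\dot{X}$ of Equation (17); an RNN realization resolves this implicitly through its interconnection structure.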

#### *2.4. Summary of the Main Previous/Traditional Methods*

By comparing the properties of the methods presented above, one big difference amongst them emerges. Table 1 shows the major differences between these models using four different criteria. The convergence rate refers to the convergence over time during the solving process. The "time-varying matrix inversion" criterion refers to the ability of a given model to be used for inverting a time-varying matrix. The "implementation" criterion refers to how easy it is to implement a given model on RNN machines/processors. The last criterion in Table 1 refers to how sensitive a given model is to noise present in the time-varying matrix values. This last-mentioned criterion is very important because it expresses the resilience of a given model to noise, which is always present in analog computing signals. Although noise is relatively low in digital systems, digital systems do introduce a noise-equivalent signal distortion originating from the computational rounding of numbers as they are digitally represented and processed with fixed-size digital arithmetic.

The Zhang model has a fixed convergence rate over time, whereas the gradient descent model contains a coefficient that can be changed to influence (increase or decrease) the convergence rate. The Chen model, however, is much better than the two previous ones, as it potentially provides a much higher convergence rate.

The convergence rate has a direct impact on the noise sensitivity of a given model: increasing the convergence rate decreases the noise sensitivity. Therefore, the Chen model has the highest level of stability with respect to noise, although it is the most complex to implement.

Furthermore, amongst the three models listed in Table 1, only the Zhang model offers the capability of inverting a time-varying matrix (i.e., real-time matrix inversion).

**Table 1.** Comparison of different types of DNN (dynamic neural network) concepts (the traditional ones) for matrix inversion.

| Criterion | Gradient Descent Model | Zhang Model | Chen Model |
|---|---|---|---|
| Convergence rate | Tunable via its coefficient | Fixed over time | Highest of the three |
| Time-varying matrix inversion | No | Yes | No |
| Implementation | Simpler | Simpler | More complex (more coefficients) |
| Noise sensitivity | Higher | Higher | Lowest |


#### **3. Our Concept: The Novel RNN Method**

According to Chen [39], his model converges to the solution of Equation (1) from any initial value. One can, however, re-formulate the Chen model [39] as the result of the following goal function:

$$\min Z = \left\| X - A^{-1} \right\|^2 + \left\| AX - I \right\|^2 \tag{20}$$

We can add another positive term to Equation (20); thus, we multiply the last term of Equation (20), i.e., the term $\|AX - I\|^2$, by the matrix $A^T$ and add the result to the function Z.

$$\min Z = \left\| X - A^{-1} \right\|^2 + \left\| AX - I \right\|^2 + \left\| A^T A X - A^T \right\|^2 \tag{21}$$

After adding this new term to the right-hand side of Equation (20) and solving for Z according to Equation (21), one obtains the following dynamical system:

$$M\dot{X}(t) = -\gamma\left(\left(M^T M\right)^2 + M^T M + I\right)(MX - I) - \dot{M}X \tag{22}$$

The solution of this equation (see Equation (22)) can be expressed as follows:

$$X(t) = -\mathbb{C}\, M(t)^{-1} e^{-\gamma \int_0^t \left( \left(M^T M\right)^2 + M^T M + I \right) dz} + M(t)^{-1} \tag{23}$$

In Equation (23), when time goes to infinity, the limit of X converges to $M(t)^{-1}$, thereby providing the solution of Equation (1). C is a constant matrix, which is introduced while solving Equation (22); see Table 2 for an illustration. The newly added terms $(M^T M)^2 + M^T M$ produce a better convergence rate. The main reason for this convergence rate is the positive value of the integral, which provides additional factors when compared to the previous time-varying models. Therefore, by adding further terms of this form to the right-hand side of Equation (22), we obtain the following equation:

$$M\dot{X}(t) = -\gamma \left(\sum_{i=0}^{n} \left(M^T M\right)^i\right) \left(MX - I\right) - \dot{M}X \tag{24}$$

Equation (24) is more general and contains the model of Equation (22) as a specific configuration; to recover it, we just need to set the parameter *n* to 2.
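For illustration, the generalized dynamics of Equation (24) can be simulated for a time-varying matrix. The following minimal Python sketch integrates Equation (24) with SciPy's `solve_ivp`; the matrix M(t), its derivative, the gain γ, and the order n are arbitrary demonstration choices:

```python
# Minimal sketch of the generalized dynamics of Equation (24) for a
# time-varying matrix M(t). All concrete values are illustrative assumptions.
import numpy as np
from scipy.integrate import solve_ivp

def M_of_t(t):
    # example time-varying, nonsingular matrix (assumption)
    return np.array([[2.0 + np.sin(t), 0.5],
                     [0.5, 2.0 + np.cos(t)]])

def Mdot_of_t(t):
    # analytical derivative of M(t)
    return np.array([[np.cos(t), 0.0],
                     [0.0, -np.sin(t)]])

gamma, n = 10.0, 2              # gain and series order; n = 2 recovers Eq. (22)
I = np.eye(2)

def rhs(t, x):
    X = x.reshape(2, 2)
    M, Md = M_of_t(t), Mdot_of_t(t)
    S = sum(np.linalg.matrix_power(M.T @ M, i) for i in range(n + 1))
    # M Xdot = -gamma * S * (M X - I) - Mdot X   (Equation (24))
    Xdot = np.linalg.solve(M, -gamma * S @ (M @ X - I) - Md @ X)
    return Xdot.ravel()

sol = solve_ivp(rhs, (0.0, 5.0), np.zeros(4))
X_end = sol.y[:, -1].reshape(2, 2)
print("tracking error:", np.linalg.norm(X_end - np.linalg.inv(M_of_t(5.0))))
```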

**Theorem 1.** *For any given nonsingular matrix $M \in \mathbb{R}^{n \times n}$ and state matrix $X(t) \in \mathbb{R}^{n \times n}$, starting from any initial value $X(0) \in \mathbb{R}^{n \times n}$ (initial value problem, IVP), Equation (24) will achieve global convergence to $X^*(t) = M^{-1}(t)$.*

**Proof of Theorem 1.** Let $E(t) = X(t) - X^*(t)$ be the error value during the process of finding the solution. Multiplying this equation by $M$ leads to $M(t)E(t) = M(t)X(t) - M(t)X^*(t)$, i.e., $M(t)E(t) = M(t)X(t) - I$. Differentiating the error function then leads to $M\dot{E}(t) + \dot{M}E(t) = M\dot{X}(t) + \dot{M}X(t)$. By replacing this in Equation (24), we obtain the following expression:

$$M\dot{E}(t) = -\gamma \left(\sum_{i=0}^{n} \left(M^T M\right)^i\right) M E(t) - \dot{M} E(t) \tag{25}$$

Let us define the Lyapunov function $\epsilon(t) = \frac{1}{2}\|ME(t)\|^2$, which is always a positive function (for $E(t) \neq 0$). The derivative of this function can be obtained as follows:

$$\dot{\epsilon}(t) = E(t)^T M^T M \frac{dE(t)}{dt} + E(t)^T M^T \frac{dM(t)}{dt} E(t) \tag{26}$$

Substituting Equation (25) into Equation (26) leads to the following:

$$\dot{\epsilon}(t) = -\gamma E(t)^T \left(\sum_{i=0}^n \left(M^T M\right)^{i+1}\right) E(t) \tag{27}$$

Hence:

$$\dot{\epsilon}(t) = -\gamma E(t)^T M^T \left(\sum_{i=0}^n \left(M^T M\right)^i\right) M E(t) \tag{28}$$

One can bound the middle term $\sum_{i=0}^{n}\left(M^T M\right)^i$ from below by a positive constant $\mu$ (its smallest eigenvalue); therefore:

$$\dot{\epsilon}(t) \leq -\gamma \mu \left\| ME(t) \right\|^2 \leq 0 \tag{29}$$

Thus, it appears that $\dot{\epsilon}(t)$ is always negative; furthermore, $\dot{\epsilon}(t) = 0$ if and only if $X(t) = X^*(t)$ is satisfied. Therefore, our differential equation globally converges towards a point (matrix), which is the equilibrium point of this function. □
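The Lyapunov argument can also be checked numerically. The following minimal sketch (a sanity check, not part of the proof) integrates Equation (24) for an arbitrarily chosen constant matrix M and verifies that the Lyapunov value $\|ME(t)\|^2$ decreases monotonically along the trajectory (the ½ factor is irrelevant for monotonicity):

```python
# Numerical sanity check of the Lyapunov decrease along Equation (24).
# Matrix, gain, order, and step size are illustrative assumptions.
import numpy as np

M = np.array([[1.0, 2.0],
              [0.0, 1.5]])      # example nonsingular matrix (assumption)
gamma, n, dt, steps = 1.0, 2, 1e-3, 5000
I, Minv = np.eye(2), np.linalg.inv(M)
S = sum(np.linalg.matrix_power(M.T @ M, i) for i in range(n + 1))

X = np.random.default_rng(0).normal(size=(2, 2))  # random initial value X(0)
eps_prev = np.inf
for _ in range(steps):
    X += dt * np.linalg.solve(M, -gamma * S @ (M @ X - I))  # Mdot = 0 here
    eps = np.linalg.norm(M @ (X - Minv)) ** 2               # Lyapunov value
    assert eps <= eps_prev + 1e-12, "Lyapunov value failed to decrease"
    eps_prev = eps

print("final Lyapunov value:", eps_prev)
```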

Equation (30) is the result of analytically solving Equation (24). Increasing t in this equation leads to the solution of the algebraic equation (i.e., of Equation (1)).

$$X(t) = -\mathbb{C}\, M^{-1} e^{-\gamma \sum_{i=0}^{n} \int_{0}^{t} \left(M^{T} M\right)^{i} dz} + M^{-1} \tag{30}$$

In this equation, C is a constant matrix that is introduced while solving the differential equation. Obviously, this equation has a much better rate of convergence when compared to the previous implementations and, as for the previous solutions/concepts, the minimum eigenvalue of $M^T M$ should be positive.
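This eigenvalue condition is straightforward to check numerically; a minimal sketch (with an arbitrarily chosen example matrix) is given below. The smallest eigenvalue of $M^T M$ is the square of the smallest singular value of M, so it is positive exactly when M is nonsingular:

```python
# Minimal check of the eigenvalue condition for Equation (30).
# The example matrix M is an illustrative assumption.
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
eigvals = np.linalg.eigvalsh(M.T @ M)   # eigenvalues of the symmetric M^T M
print("min eigenvalue of M^T M:", eigvals.min())  # positive iff M is nonsingular
```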

Also, following the Chen model, if Equation (24) is extended by introducing a monotonically increasing function $F$ where $F(0) = 0$, the system will again converge to the solution of Equation (1). Thus, by introducing such a function F into Equation (24), the following new equation is obtained; see Equation (31):

$$M\dot{X}(t) = -\gamma \left(\sum_{i=0}^{n} \left(M^T M\right)^i\right) F(MX - I) - \dot{M}X \tag{31}$$

**Theorem 2.** *For any given nonsingular matrix $M \in \mathbb{R}^{n \times n}$ and state matrix $X(t) \in \mathbb{R}^{n \times n}$, starting from any initial value $X(0) \in \mathbb{R}^{n \times n}$ (initial value problem, IVP) and with a monotonically increasing function $F$ where $F(0) = 0$, Equation (31) will achieve global convergence to $X^*(t) = M^{-1}(t)$.*

**Proof of Theorem 2.** Let $E(t) = X(t) - X^*(t)$ be the error value during the process of finding the solution. Multiplying this equation by $M$ leads to $M(t)E(t) = M(t)X(t) - M(t)X^*(t)$, i.e., $M(t)E(t) = M(t)X(t) - I$. Differentiating the error function then leads to $M\dot{E}(t) + \dot{M}E(t) = M\dot{X}(t) + \dot{M}X(t)$. By replacing this in Equation (31), we obtain the following expression:

$$M\dot{E}(t) = -\gamma \left(\sum_{i=0}^{n} \left(M^T M\right)^i\right) F(ME(t)) - \dot{M}E(t) \tag{32}$$

Let us define the Lyapunov function $\epsilon(t) = \frac{1}{2}\|ME(t)\|^2$, which is always a positive function (for $E(t) \neq 0$). The derivative of this function can be obtained as follows:

$$\dot{\epsilon}(t) = E(t)^T M^T M \frac{dE(t)}{dt} + E(t)^T M^T \frac{dM(t)}{dt} E(t) \tag{33}$$

Substituting Equation (32) into Equation (33) leads to the following:

$$\dot{\epsilon}(t) = -\gamma E(t)^T M^T \left(\sum_{i=0}^n \left(M^T M\right)^i\right) F(ME(t)) \tag{34}$$


One can bound the middle term $\sum_{i=0}^{n}\left(M^T M\right)^i$ from below by a positive constant $\mu$ (its smallest eigenvalue); therefore:

$$\dot{\epsilon}(t) \leq -\gamma \mu \, E(t)^T M^T F(ME(t)) \leq 0 \tag{36}$$

In the last equation, $E(t)^T M^T F(ME(t))$ is always positive because if $ME(t)$ becomes negative, $F(ME(t))$ also becomes negative, and vice versa.

Thus, it appears that $\dot{\epsilon}(t)$ is always negative, and $\dot{\epsilon}(t) = 0$ if and only if $X(t) = X^*(t)$ is satisfied. Therefore, our differential Equation (31) or (32) globally converges towards a point (matrix), which is the equilibrium point of this function. □

By choosing different forms of the function F, one can obtain various dynamical properties for this model.

Examples of functions for F are: sigmoid, linear, square, cubic, arcos, etc. All these functions are suitable for use in Equation (31), as all of them are monotonically increasing functions and they all satisfy the $F(0) = 0$ condition (see Figure 1).

**Figure 1.** Illustrative examples of monotonic functions which can be used for solving the inversion of a time-varying matrix through Equation (31).
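For illustration, the following minimal Python sketch plugs a few such activation functions F (all monotonically increasing with F(0) = 0) into Equation (31) for an arbitrarily chosen constant matrix M and reports the final residual; the exact function set of Figure 1 is not reproduced here. Note that polynomial activations such as the cubic slow down near the solution, where the error entries become small:

```python
# Minimal sketch: Equation (31) with different elementwise activations F.
# Matrix, gain, order, and step sizes are illustrative assumptions.
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
gamma, n, dt, steps = 1.0, 2, 1e-3, 20000
I, Minv = np.eye(2), np.linalg.inv(M)
S = sum(np.linalg.matrix_power(M.T @ M, i) for i in range(n + 1))

activations = {
    "linear":  lambda e: e,
    "cubic":   lambda e: e ** 3,
    "sigmoid": lambda e: 2.0 / (1.0 + np.exp(-e)) - 1.0,  # shifted so F(0) = 0
}

for name, F in activations.items():
    X = np.zeros_like(M)
    for _ in range(steps):
        # M Xdot = -gamma * S * F(M X - I)   (Equation (31), constant M)
        X += dt * np.linalg.solve(M, -gamma * S @ F(M @ X - I))
    print(f"{name:8s} residual: {np.linalg.norm(X - Minv):.2e}")
```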

#### **4. Model Implementation in SIMULINK**

Equation (24) or Equation (31) can be implemented directly in SIMULINK (see Figure 2). This dynamic model consists of the components shown in Table 2.

If M is not a time-varying matrix, we can simply feed zero values into the $\dot{M}$ block; otherwise, the $\dot{M}$ block must supply the corresponding derivative of the matrix M. Executing the model produces the SIMULINK output shown in Figure 3 (see also Figure 2 and Table 2), which is the solution of Equation (1). The model therefore works as expected and yields the solution of Equation (24).

**Figure 2.** The RNN block diagram corresponding to Equation (24). Note that the matrix to be inverted is M.




**Figure 3.** Result of model simulation (just for illustration) for the matrix M indicated in Table 2 (see first row of Table 2), SIMULINK output.
