*Article* **Advances in the Approximation of the Matrix Hyperbolic Tangent**

**Javier Ibáñez 1, José M. Alonso 1, Jorge Sastre 2, Emilio Defez <sup>3</sup> and Pedro Alonso-Jordá 4,\***


**Abstract:** In this paper, we introduce two approaches to compute the matrix hyperbolic tangent. While one of them is based on its own definition and uses the matrix exponential, the other one is focused on the expansion of its Taylor series. For this second approximation, we analyse two different alternatives to evaluate the corresponding matrix polynomials. This resulted in three stable and accurate codes, which we implemented in MATLAB and numerically and computationally compared by means of a battery of tests composed of distinct state-of-the-art matrices. Our results show that the Taylor series-based methods were more accurate, although somewhat more computationally expensive, compared with the approach based on the exponential matrix. To avoid this drawback, we propose the use of a set of formulas that allows us to evaluate polynomials in a more efficient way compared with that of the traditional Paterson–Stockmeyer method, thus, substantially reducing the number of matrix products (practically equal in number to the approach based on the matrix exponential), without penalising the accuracy of the result.

**Keywords:** matrix functions; matrix hyperbolic tangent; matrix exponential; Taylor series; matrix polynomial evaluation

### **1. Introduction and Notation**

Matrix functions have been an increasing focus of attention due to their applications to new and interesting problems related, e.g., to statistics [1], Lie theory [2], differential equations (the matrix exponential function *eAt* can be considered as a classical example for its application in the solution of first order differential systems *Y* (*t*) = *AY*(*t*) with *<sup>A</sup>* <sup>∈</sup> <sup>C</sup>*n*×*<sup>n</sup>* and for its use in the development of exponential integrators for nonlinear ODEs and PDEs, see [3] for example), approximation theory, and many other areas of science and engineering [4].

There are different ways to define the notion of the function *<sup>f</sup>*(*A*) of a square matrix *<sup>A</sup>*. The most common are via the Jordan canonical form, via the Hermite interpolation, and via the Cauchy integral formula. The equivalence among the different definitions of a matrix function can be found in [5]. Several general methods have been proposed for evaluating matrix functions, among which, we can highlight the Taylor or Padé approximations and methods based on the Schur form of a matrix [4].

Among the most well-known matrix functions, we have the matrix hyperbolic cosine and the matrix hyperbolic sine functions, respectively defined in terms of the matrix exponential function *e<sup>A</sup>* by means of the following expressions:

**Citation:** Ibáñez, J.; Alonso, J.M.; Sastre, J.; Defez, E.; Alonso-Jordá, P. Advances in the Approximation of the Matrix Hyperbolic Tangent. *Mathematics* **2021**, *9*, 1219. https:// doi.org/10.3390/math9111219

Academic Editor: Mariano Torrisi

Received: 23 March 2021 Accepted: 20 May 2021 Published: 27 May 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

$$\cosh\left(A\right) = \frac{1}{2}\left(\varepsilon^A + \varepsilon^{-A}\right),\\\sinh\left(A\right) = \frac{1}{2}\left(\varepsilon^A - \varepsilon^{-A}\right). \tag{1}$$

These matrix functions are applied, e.g., in the study of the communicability analysis in complex networks [6], or to construct the exact series solution of coupled hyperbolic systems [7]. Precisely due to their applicability, the numerical computation of these functions has received remarkable and growing attention in recent years. A set of state-of-the-art algorithms to calculate these functions developed by the authors can be found in [8–11].

On the other hand, we have the matrix hyperbolic tangent function, defined as

$$\tanh\left(A\right) = \sinh\left(A\right)\left(\cosh\left(A\right)\right)^{-1} = \left(\cosh\left(A\right)\right)^{-1}\sinh\left(A\right),\tag{2}$$

and used, for instance, to give an analytical solution of the radiative transfer equation [12], in the heat transference field [13,14], in the study of symplectic systems [15,16], in graph theory [17], and in the development of special types of exponential integrators [18,19].

In this work, we propose and study two different implementations that compute the matrix hyperbolic tangent function: the first uses the matrix exponential function whereas the second is based on its Taylor series expansion. In addition, for the second approach, we use and compare two different alternatives to evaluate the matrix polynomials involved in the series expansion.

#### *1.1. The Matrix Exponential Function-Based Approach*

This first option is derived from the matrix hyperbolic tangent function definition as expressed in Equations (1) and (2), from which the following matrix rational expression is immediately deduced:

$$\tanh\left(A\right) = \left(\varepsilon^{2A} - I\right)\left(\varepsilon^{2A} + I\right)^{-1} = \left(\varepsilon^{2A} + I\right)^{-1}\left(\varepsilon^{2A} - I\right),\tag{3}$$

where *I* denotes the identity matrix with the same order as *A*. Equation (3) reduces the approximation of the matrix hyperbolic tangent function to the computation of the matrix exponential *e*2*A*.

There exists profuse literature (see e.g., [4,20]) about the approximation of the matrix exponential function and the inconveniences that its calculation leads to [21]. The most competitive methods used in practice are either those based on polynomial approximations or those based on Padé rational approaches, with the former, in general, being more accurate and with lower computational costs [22].

In recent years, different polynomial approaches to the matrix exponential function have been proposed, depending on the type of matrix polynomial used. For example, some approximations use the Hermite matrix polynomials [23], while others derived from on Taylor polynomials [22,24]. More recently, a new method based on Bernoulli matrix polynomials was also proposed in [25].

All these methods use the scaling and squaring method based on the identity

$$e^A = \left(e^{2^{-s}A}\right)^{2^s}{\,}$$

which satisfies the matrix exponential. In the scaling phase, an integer scaling factor *s* is taken, and the approximation of *e*2−*sA* is computed using any of the proposed methods so that the required precision is obtained with the lowest possible computational cost. In the squaring phase, we obtain *e<sup>A</sup>* by *s* repeated squaring operations.

#### *1.2. The Taylor Series-Based Approach*

The other possibility for computing the matrix hyperbolic tangent function is to use its Taylor series expansion

$$\tanh\left(z\right) = \sum\_{k\geq 1} \frac{2^{2k}(2^{2k}-1)\mathcal{B}\_{2k}}{(2k)!} z^{2k-1}, |z| < \frac{\pi}{2}.$$

where B2*<sup>k</sup>* are the Bernoulli's numbers.

As in the case of the matrix exponential, it is highly recommended to use the scaling and squaring technique to reduce the norm of the matrix to be computed and, thus, to obtain a good approximation of the matrix hyperbolic tangent with an acceptable computational cost. Due to the double angle formula for the matrix hyperbolic tangent function

$$\tanh(2A) = 2\left(I + \tanh^2(A)\right)^{-1} \tanh(A)\_\prime \tag{4}$$

which is derived from the scalar one

$$
\tanh(2z) = \frac{2\tanh(z)}{1 + \tanh^2(z)}.
$$

it is possible to compute *Ts* <sup>=</sup> tanh(*A*) by using the following recurrence:

$$\begin{aligned} T\_0 &= \tanh(2^{-s}A), \\ T\_i &= 2\left(I + T\_{i-1}^2(A)\right)^{-1} T\_{i-1}(A), i = 1, \dots, s. \end{aligned} \tag{5}$$

Throughout this paper we will denote by *<sup>σ</sup>*(*A*) the set of eigenvalues of matrix *<sup>A</sup>* <sup>∈</sup> <sup>C</sup>*n*×*<sup>n</sup>* and by *In* (or *<sup>I</sup>*) the matrix identity of order *<sup>n</sup>*. In addition, *<sup>ρ</sup>*(*A*) refers to the spectral radius of *A*, defined as

$$\rho(A) = \max\{|\lambda|; \lambda \in \sigma(A)\}.$$

With *x*, we denote the value obtained by rounding *x* to the nearest integer greater than or equal to *x*, and *x* is the value obtained rounding *x* to the nearest integer less than or equal to *x*. The matrix norm || · || will stand for any subordinate matrix norm and, in particular, || · ||<sup>1</sup> denotes the 1-norm.

This work is organised as follows. First, Section 2 incorporates the algorithms corresponding to the different approaches previously described for approximating the matrix hyperbolic tangent and for computing the scaling parameter and the order of the Taylor polynomials. Next, Section 3 details the experiments carried out to compare the numerical properties of the codes to be evaluated. Finally, in Section 4, we present our conclusions.

#### **2. Algorithms for Computing the Matrix Hyperbolic Tangent Function**

#### *2.1. The Matrix Exponential Function-Based Algorithm*

The first algorithm designed, called Algorithm 1, computes the matrix hyperbolic tangent by means of the matrix exponential according to Formula (3). In Steps 1 and 2, Algorithm 2 from [26] is responsible for computing *e*2−*sB* by means of the Taylor approximation of order *mk*, being *<sup>B</sup>* = <sup>2</sup>*A*. In Step 3, *<sup>T</sup>* ! tanh(2<sup>−</sup>*sB*) is worked out using Formula (3). In this phase, *T* is computed by solving a system of linear equations, with *<sup>e</sup>*2−*sB* + *<sup>I</sup>* being

the coefficient matrix and *<sup>e</sup>*2−*sB* <sup>−</sup> *<sup>I</sup>* being the right hand side term. Finally, through Steps 4–8, tanh(*A*) is recovered from *<sup>T</sup>* by using the squaring technique and the double angle Formula (5).

**Algorithm 1:** Given a matrix *<sup>A</sup>* <sup>∈</sup> <sup>C</sup>*n*×*n*, this algorithm computes *<sup>T</sup>* = tanh(*A*) by means of the matrix exponential function.

$$\begin{array}{c} 1 \ B = 2A \\ 2 \circ \text{Calculat} \end{array}$$

**<sup>2</sup>** Calculate the scaling factor *<sup>s</sup>* <sup>∈</sup> <sup>N</sup> <sup>∪</sup> {0}, the order of Taylor polynomial *mk* ∈ {2, 4, 6, 9, 12, 16, 20, 25, 30} and compute *<sup>e</sup>*2−*sB* by using the Taylor approximation /\* Phase I (see Algorithm 2 from [26]) \*/ **<sup>3</sup>** *<sup>T</sup>* = *<sup>e</sup>*2−*sB* + *<sup>I</sup>* −<sup>1</sup> *<sup>e</sup>*2−*sB* <sup>−</sup> *<sup>I</sup>* /\* Phase II: Work out tanh(2−*sB*) by (3) \*/ **<sup>4</sup> for** *<sup>i</sup>* = <sup>1</sup> **to** *<sup>s</sup>* **do** /\* Phase III: Recover tanh(*A*) by (5) \*/ **<sup>5</sup>** *<sup>B</sup>* = *<sup>I</sup>* + *<sup>T</sup>*<sup>2</sup> **<sup>6</sup>** Solve for *<sup>X</sup>* the system of linear equations *BX* = <sup>2</sup>*<sup>T</sup>* **<sup>7</sup>** *<sup>T</sup>* = *<sup>X</sup>* **<sup>8</sup> end**

**Algorithm 2:** Given a matrix *<sup>A</sup>* <sup>∈</sup> <sup>C</sup>*n*×*n*, this algorithm computes *<sup>T</sup>* = tanh(*A*) by means of the Taylor approximation Equation (8) and the Paterson– Stockmeyer method.

**<sup>1</sup>** Calculate the scaling factor *<sup>s</sup>* <sup>∈</sup> <sup>N</sup> <sup>∪</sup> {0}, the order of Taylor approximation *mk* <sup>∈</sup> {2, 4, 6, 9, 12, 16, 20, 25, 30}, 2−*sA* and the required matrix powers of 4−*sB* /\* Phase I (Algorithm 4) \*/ **<sup>2</sup>** *<sup>T</sup>* <sup>=</sup> <sup>2</sup>−*sAPmk* (4−*sB*) /\* Phase II: Compute Equation (8) \*/ **<sup>3</sup> for** *<sup>i</sup>* = <sup>1</sup> **to** *<sup>s</sup>* **do** /\* Phase III: Recover tanh(*A*) by Equation (5) \*/ **<sup>4</sup>** *<sup>B</sup>* = *<sup>I</sup>* + *<sup>T</sup>*<sup>2</sup> **<sup>5</sup>** Solve for *<sup>X</sup>* the system of linear equations *BX* = <sup>2</sup>*<sup>T</sup>* **<sup>6</sup>** *<sup>T</sup>* = *<sup>X</sup>* **<sup>7</sup> end**

*2.2. Taylor Approximation-Based Algorithms*

Let

$$f(z) = \sum\_{n\geq 1} \frac{2^{2n}(2^{2n} - 1)\mathcal{B}\_{2n}}{(2n)!} z^{2n-1} \tag{6}$$

be the Taylor series expansion of the hyperbolic tangent function, with the radius of convergence *<sup>r</sup>* <sup>=</sup> *<sup>π</sup>*/2, where <sup>B</sup>2*<sup>n</sup>* are the Bernoulli's numbers, defined by the recursive expression

$$\mathcal{B}\_0 = 1 \; , \; \mathcal{B}\_k = -\sum\_{i=0}^{k-1} \binom{k}{i} \frac{\mathcal{B}\_i}{k+1-i} \; k \ge 1.$$

The following proposition is easily obtained:

**Proposition 1.** *Let <sup>A</sup>* <sup>∈</sup> <sup>C</sup>*n*×*<sup>n</sup> be a matrix satisfying <sup>ρ</sup>*(*A*) <sup>&</sup>lt; *<sup>π</sup>*/2*. Then, the matrix hyperbolic tangent* tanh(*A*) *can be defined for A as the Taylor series*

$$\tanh(A) = f(A) = \sum\_{n\geq 1} \frac{2^{2n}(2^{2n}-1)\mathcal{B}\_{2n}}{(2n)!} A^{2n-1}.\tag{7}$$

From [4] Theorem 4.7, this series in Equation (7) converges if the distinct eigenvalues *λ*1, *λ*2, ··· , *λ<sup>t</sup>* of *A* satisfy one of these conditions:


To simplify the notation, we denote with

$$\tanh(A) = \sum\_{k \ge 0} q\_{2k+1} A^{2k+1}{}\_{\prime}$$

the Taylor series (7), and with

$$T\_{2m+1}(A) = \sum\_{k=0}^{m} q\_{2k+1} A^{2k+1} = A \sum\_{k=0}^{m} p\_k B^k = A P\_m(B),\tag{8}$$

the Taylor approximation of order 2*<sup>m</sup>* <sup>+</sup> 1 of tanh(*A*), where *<sup>B</sup>* <sup>=</sup> *<sup>A</sup>*2.

There exist several alternatives that can be applied to obtain *Pm*(*B*), such as the Paterson–Stockmeyer method [27] or the Sastre formulas [28], with the latter being more efficient, in terms of matrix products, compared with the former.

Algorithm <sup>2</sup> works out tanh(*A*) by means of the Taylor approximation of the scaled matrix 4−*sB* Equation (8). In addition, it uses the Paterson–Stockmeyer method for the matrix polynomial evaluation, and finally it applies the recurrence Equation (5) for recovering tanh(*A*).

Phase I of Algorithm 2 is in charge of estimating the integers *m* and *s* so that the Taylor approximation of the scaled matrix *B* is computed accurately and efficiently. Then, in Phase II, once the integer *mk* has been chosen from the set

$$\mathbb{M} = \{2, 4, 6, 9, 12, 16, 20, 25, 30, \dots\}\_{\prime}$$

and powers *B<sup>i</sup>* , 2 <sup>≤</sup> *<sup>i</sup>* <sup>≤</sup> *<sup>q</sup>* are calculated, with *<sup>q</sup>* <sup>=</sup> <sup>D</sup>√*mk* E or *<sup>q</sup>* = <sup>√</sup>*mk* as an integer divisor of *mk*, the Paterson–Stockmeyer method computes *Pmk* (*B*) with the necessary accuracy and with a minimal computational cost as

$$\begin{split} P\_{m\_k}(B) &= (((p\_{m\_k}B^q + p\_{m\_k-1}B^{q-1} + p\_{m\_k-2}B^{q-2} + \dots + p\_{m\_k-q+1}B + p\_{m\_k-q}I)B^q) \\ &+ p\_{m\_k-q-1}B^{q-1} + p\_{m\_k-q-2}B^{q-2} + \dots + p\_{m\_k-2q+1}B + p\_{m\_k-2q}I)B^q \\ &+ p\_{m\_k-2q-1}B^{q-1} + p\_{m\_k-2q-2}B^{q-2} + \dots + p\_{m\_k-3q+1}B + p\_{m\_k-3q}I)B^q \\ &\dots \\ &+ p\_{q-1}B^{q-1} + p\_{q-2}B^{q-2} + \dots + p\_1B + p\_0I. \end{split} \tag{9}$$

Taking into account Equation (9), the computational cost of Algorithm 2 is *O* 2*<sup>k</sup>* + <sup>4</sup> + <sup>8</sup>*<sup>s</sup>* 3 *n*3 flops.

Finally, in Phase III, the matrix hyperbolic tangent of matrix *A* is recovered by squaring and repeatedly solving a system of linear equations equivalent to Equation (4).

With the purpose of evaluating *Pm*(*B*) in Equation (8) in a more efficient way compared with that offered by the Paterson–Stockmeyer method, as stated in the Phase II of Algorithm 2, the formulas provided in [28] were taken into consideration into the design of Algorithm 3. Concretely, we use the evaluation formulas for Taylor-based matrix polynomial approximations of orders *<sup>m</sup>* = 8, 14+, and 21+ in a similar way to the evaluation described in [22] (Sections 3.1–3.3) for the matrix exponential function. Nevertheless, the Paterson–Stockmeyer method is still being used for orders equal to *<sup>m</sup>* = 2 and *<sup>m</sup>* = 4.

Following the notation given in [22] (Section 4) for an order *<sup>m</sup>*, the suffix "+" in *<sup>m</sup>* = <sup>14</sup>+ and *<sup>m</sup>* = <sup>21</sup>+ means that these Taylor approximations are more accurate than those approximations of order *<sup>m</sup>* = 14 and *<sup>m</sup>* = 21, respectively, since the former will be composed of a few more polynomial terms. The coefficients of these additional terms will be similar but not identical to the corresponding traditional Taylor approximation ones. It is convenient to clarify that we have used the order *<sup>m</sup>* = <sup>14</sup>+, instead of the order *<sup>m</sup>* = <sup>15</sup>+, because we have not found a real solution for the coefficients of the corresponding evaluation formula with order *<sup>m</sup>* = <sup>15</sup>+.

The evaluation formulas for the order *<sup>m</sup>* = 8 that comprise the system of non-linear equations to be solved for determining the unknown coefficients *ci*, *<sup>i</sup>* = 1, . . . , 6, are:

$$\begin{array}{rcl} B & = & -A^2, \\ B\_2 & = & B^2, \\ y\_0(B) & = & B\_2(c\_1B\_2 + c\_2B), \\ y\_1(B) & = & A((y\_0(B) + c\_3B\_2 + c\_4B)(y\_0(B) + c\_5B\_2) + c\_6y\_0(B) \\ & & + 2B\_2/15 + B/3 + I), \end{array} \tag{10}$$

where

$$y\_1(B) = T\_{17}(A) = AP\_8(B)\_{\prime\prime}$$

and *<sup>T</sup>*17(*A*), or *AP*8(*B*), refers to the Taylor polynomial of order 17 of function tanh(*A*) given by Equation (8).


**<sup>1</sup>** Calculate the scaling factor *<sup>s</sup>* <sup>∈</sup> <sup>N</sup> <sup>∪</sup> {0}, the order of Taylor approximation *mk* <sup>∈</sup> {2, 4, 8, 14, 21}, 2−*sA* and the required matrix powers of 4−*sB* /\* Phase I (Algorithm 5) \*/ **<sup>2</sup>** *<sup>T</sup>* <sup>=</sup> <sup>2</sup>−*sAPmk* (4−*sB*) /\* Phase II: Compute Equation (8) \*/ **<sup>3</sup> for** *<sup>i</sup>* = <sup>1</sup> **to** *<sup>s</sup>* **do** /\* Phase III: Recover tanh(*A*) by Equation (5) \*/ **<sup>4</sup>** *<sup>B</sup>* = *<sup>I</sup>* + *<sup>T</sup>*<sup>2</sup> **<sup>5</sup>** Solve for *<sup>X</sup>* the system of linear equations *BX* = <sup>2</sup>*<sup>T</sup>* **<sup>6</sup>** *<sup>T</sup>* = *<sup>X</sup>* **<sup>7</sup> end**

Regarding the non-linear equations for order *<sup>m</sup>* = <sup>14</sup>+ and its unknown coefficients *ci*, *<sup>i</sup>* = 1, . . . , 13, we have

$$\begin{array}{rcl} y\_0(B) &=& B\_2(c\_1B\_2 + c\_2B), \\ y\_1(B) &=& (y\_0(B) + c\_3B\_2 + c\_4B)(y\_0(B) + c\_5B\_2) + c\_6y\_0(B), \\ y\_2(B) &=& A((y\_1(B) + c\_7y\_0(B) + c\_8B\_2 + c\_9B)(y\_1(B) + c\_{10}B\_2 + c\_{11}B)) \\ &+ c\_{12}y\_1 + c\_{13}B\_2 + B/3 + I), \end{array} \tag{11}$$

where

$$y\_2(B) = A\left(P\_{14} + b\_{15}B^{15} + b\_{16}B^{16}\right)r$$

and *AP*<sup>14</sup> represents the Taylor polynomial of order 29 of function tanh(*A*) given by Equation (8). If we denote as *p*<sup>15</sup> and *p*<sup>16</sup> the Taylor polynomial coefficients corresponding to the powers *B*<sup>15</sup> and *B*16, respectively, the relative error of coefficients *b*<sup>15</sup> and *b*<sup>16</sup> with respect to them, with two decimal digits, are:

$$\begin{array}{rcl}|(b\_{15} - p\_{15})/p\_{15}| &=& 0.38, \\ |(b\_{16} - p\_{16})/p\_{16}| &=& 0.85. \end{array}$$

Taking

*<sup>B</sup>*<sup>3</sup> <sup>=</sup> *<sup>B</sup>*2*B*,

the evaluation formulas related to the system of non-linear equations for order *<sup>m</sup>* = <sup>21</sup>+ with the coefficients *ci*, *<sup>i</sup>* = 1, . . . , 21 to be determined are

$$\begin{aligned} y\_0(B) &= \ &B\_3(c\_1B\_3 + c\_2B\_2 + c\_3B)\_\epsilon \\ y\_1(B) &= &(y\_0(B) + c\_4B\_3 + c\_5B\_2 + c\_6B)(y\_0(B) + c\_7B\_3 + c\_8B\_2) \\ &+ c\_9y\_0(B) + c\_{10}B\_3, \\ y\_2(B) &= &A((y\_1(B) + c\_{11}B\_3 + c\_{12}B\_2 + c\_{13}B)(y\_1(B) + c\_{14}y\_0(B) \\ &+ c\_{15}B\_3 + c\_{16}B\_2 + c\_{17}B) + c\_{18}y\_1 + c\_{19}y\_0(B) + c\_{20}B\_3 \\ &+ c\_{21}B\_2 + B/3 + I), \end{aligned} \tag{12}$$

where

$$y\_2(B) = A(P\_{21} + b\_{22}B^{22} + b\_{23}B^{23} + b\_{24}B^{24}),$$

and *AP*<sup>21</sup> stands for the Taylor polynomial of order 43 of the function tanh(*A*) given by Equation (8). With two decimal digits of accuracy, the relative error made by the coefficients *b*22, *b*23, and *b*<sup>24</sup> with respect to their corresponding Taylor polynomial coefficients *p*22, *p*23, and *p*<sup>24</sup> that accompany their respective powers *B*22, *B*23, and *B*24, are the following:

$$\begin{array}{rcl}|(b\_{22} - p\_{22})/p\_{22}| &=& 0.69, \\ |(b\_{23} - p\_{23})/p\_{23}| &=& 0.69, \\ |(b\_{24} - p\_{24})/p\_{24}| &=& 0.70. \end{array}$$

Similarly to [22] (Sections 3.1–3.3), we obtained different sets of solutions for the coefficients in Equations (10)–(12) using the vpasolve function (https://es.mathworks.com/ help/symbolic/vpasolve.html, accessed on 7 March 2020) from the MATLAB Symbolic Computation Toolbox with variable precision arithmetic. For the case of *<sup>m</sup>* = <sup>21</sup>+, the random option of vpasolve has been used, which allowed us to obtain different solutions for the coefficients, after running it 1000 times. From all the sets of real solutions provided, we selected the most stable ones according to the stability check proposed in [28] (Ex. 3.2).

#### *2.3. Polynomial Order m and Scaling Value s Calculation*

The computation of *m* and *s* from Phase I in Algorithms 2 and 3 is based on the relative forward error of approximating tanh(*A*) by means of the Taylor approximation Equation (8). This error, defined as *Ef* <sup>=</sup> tanh (*A*) −1 (*<sup>I</sup>* <sup>−</sup> *<sup>T</sup>*2*m*+1) , can be expressed as

$$E\_f = \sum\_{k \ge 2m+2} c\_k A^k$$

,

, .

and it can be bounded as (see Theorem 1.1 from [29]):


$$E\_f = \left\| \sum\_{k \ge 2m+2} c\_k A^k \right\| = \left\| \sum\_{k \ge m+1} \mathbb{E}\_k B^k \right\| \le \sum\_{k \ge m+1} |\mathbb{E}\_k| \beta\_m^k \equiv h\_m(\beta\_m),$$

where *<sup>β</sup><sup>m</sup>* <sup>=</sup> max<sup>+</sup>

Let Θ*<sup>m</sup>* be

$$\Theta\_{\mathfrak{M}} = \max \left\{ \theta \ge 0 \, : \, \sum\_{k \ge m+1} |c\_k^{\varepsilon}| \theta^k \le u \right\},$$

where *<sup>u</sup>* <sup>=</sup> <sup>2</sup>−<sup>53</sup> is the unit roundoff in IEEE double precision arithmetic. The values of <sup>Θ</sup>*<sup>m</sup>* can be computed with any given precision by using symbolic computations as is shown in Tables 1 and 2, depending on the polynomial evaluation alternative selected.

Algorithm <sup>4</sup> provides the Taylor approximation order *mk* <sup>∈</sup> <sup>M</sup>, *lower* <sup>≤</sup> *<sup>k</sup>* <sup>≤</sup> *upper*, where *mlower* and *mupper* are, respectively, the minimum and maximum order used, the scaling factor *s*, together with 2−*sA*, and the necessary powers of 4−*sB* for computing *T* in

Phase II of Algorithm 2. To simplify reading this algorithm, we use the following aliases: *β<sup>k</sup>* ≡ *βmk* and Θ*<sup>k</sup>* ≡ Θ*mk* .

**Algorithm 4:** Given a matrix *<sup>A</sup>* <sup>∈</sup> <sup>C</sup>*n*×*n*, the values <sup>Θ</sup> from Table 1, a minimum order *mlower* <sup>∈</sup> <sup>M</sup>, a maximum order *mupper* <sup>∈</sup> <sup>M</sup>, with <sup>M</sup> <sup>=</sup> {2, 4, 6, 9, 12, 16, 20, 25, 30}, and a tolerance *tol*, this algorithm computes the order of Taylor approximation *<sup>m</sup>* <sup>∈</sup> <sup>M</sup>, *mlower* <sup>≤</sup> *mk* <sup>≤</sup> *mupper*, and the scaling factor *<sup>s</sup>*, together with 2−*sA* and the necessary powers of 4−*sB* for computing *Pmk* (4−*sB*) from (9).

**<sup>1</sup>** *<sup>B</sup>*<sup>1</sup> <sup>=</sup> *<sup>A</sup>*2; *<sup>k</sup>* <sup>=</sup> *lower*; *<sup>q</sup>* <sup>=</sup> <sup>√</sup>*mk*; *<sup>f</sup>* = <sup>0</sup> **<sup>2</sup> for** *<sup>j</sup>* = <sup>2</sup> **to** *<sup>q</sup>* **do <sup>3</sup>** *Bj* <sup>=</sup> *Bj*−1*B*<sup>1</sup> **<sup>4</sup> end <sup>5</sup>** Compute *<sup>β</sup><sup>k</sup>* ≈ *Bmk*+<sup>1</sup> 1/(*mk*+1) from *<sup>B</sup>*<sup>1</sup> and *Bq* ; /\* see [30] \*/ **<sup>6</sup> while** *<sup>f</sup>* = <sup>0</sup> **and** *<sup>k</sup>* <sup>&</sup>lt; *upper* **do <sup>7</sup>** *<sup>k</sup>* = *<sup>k</sup>* + <sup>1</sup> **<sup>8</sup> if** mod (*k*, 2) = <sup>1</sup> **then <sup>9</sup>** *<sup>q</sup>* <sup>=</sup> <sup>D</sup>√*mk* E **<sup>10</sup>** *Bq* <sup>=</sup> *Bq*−1*B*<sup>1</sup> **<sup>11</sup> end <sup>12</sup>** Compute *<sup>β</sup><sup>k</sup>* <sup>≈</sup> *<sup>B</sup>mk*+<sup>1</sup> 1/(*mk*+<sup>1</sup>) from *<sup>B</sup>*<sup>1</sup> and *Bq* ; /\* see [30] \*/ **<sup>13</sup> if** |*β<sup>k</sup>* − *<sup>β</sup>k*−1| < *tol* **and** *<sup>β</sup><sup>k</sup>* < <sup>Θ</sup>*<sup>k</sup>* **then <sup>14</sup>** *<sup>f</sup>* = <sup>1</sup> **<sup>15</sup> end <sup>16</sup> end <sup>17</sup>** *<sup>s</sup>* = max 0, F 1 <sup>2</sup> log2(*βk*/Θ*k*) G **<sup>18</sup> if** *s* > 0 **then <sup>19</sup>** *<sup>s</sup>*<sup>0</sup> <sup>=</sup> max 0, F 1 <sup>2</sup> log2(*βk*−1/Θ*k*−1) G **<sup>20</sup> if** *<sup>s</sup>*<sup>0</sup> <sup>=</sup> *<sup>s</sup>* **then <sup>21</sup>** *<sup>k</sup>* = *<sup>k</sup>* <sup>−</sup> <sup>1</sup> **<sup>22</sup>** *<sup>q</sup>* <sup>=</sup> <sup>D</sup>√*mk* E **<sup>23</sup> end <sup>24</sup>** *<sup>A</sup>* = <sup>2</sup>−*sA* **<sup>25</sup> for** *<sup>j</sup>* = <sup>1</sup> **to** *<sup>q</sup>* **do <sup>26</sup>** *Bj* <sup>=</sup> <sup>4</sup>−*sjBj* **<sup>27</sup> end <sup>28</sup> end <sup>29</sup>** *<sup>m</sup>* <sup>=</sup> *mk*

In Steps 1–4 of Algorithm 4, the required powers of *<sup>B</sup>* for working out *Pmk* (*B*) are computed. Then, in Step 5, *β<sup>k</sup>* is obtained by using Algorithm 1 from [30].

As lim *<sup>t</sup>*→<sup>∞</sup> ||*B<sup>t</sup>* ||1/*<sup>t</sup>* <sup>=</sup> *<sup>ρ</sup>*(*B*), where *<sup>ρ</sup>* is the spectral radius of matrix *<sup>B</sup>*, then lim*m*→<sup>∞</sup> <sup>|</sup>*β<sup>m</sup>* <sup>−</sup> *<sup>β</sup>m*−1<sup>|</sup> = 0. Hence, given a small tolerance value *tol*, Steps 6–16 test if there is a value *<sup>β</sup><sup>k</sup>* such that |*β<sup>k</sup>* − *<sup>β</sup>k*−1| < *tol* and *<sup>β</sup><sup>k</sup>* < <sup>Θ</sup>*k*. In addition, the necessary powers of *<sup>B</sup>* for computing *Pmk* (*B*) are calculated. Next, the scaling factor *<sup>s</sup>* <sup>≥</sup> 0 is provided in Step 17:

$$s = \max\left(0, \left\lceil \frac{1}{2} \log\_2 \frac{\beta\_k}{\Theta\_k} \right\rceil\right).$$

With those values of *mk* and *s*, we guarantee that:

$$\Pr\_f(2^{-s}A) \le h\_{m\_k}(4^{-s}\beta\_k) < h\_{m\_k}(\Theta\_k) \prec u\_\prime \tag{13}$$

i.e., the relative forward error of *<sup>T</sup>*2*mk*+1(2−*sA*) is lower than the unit roundoff *<sup>u</sup>*.

Step 18 checks whether the matrices *A* and *B* should be scaled or not. If so, the algorithm analyses the possibility of reducing the order of the Taylor polynomial, but at the same time ensuring that Equation (13) is verified (Steps 19–23). For this purpose, the scaling value corresponding to the order of the Taylor polynomial immediately below the one previously obtained is calculated as well. If both values are identical, the polynomial order reduction is performed. Once the optimal scaling parameter *s* has been determined, the matrices *A* and *B* are scaled (Steps 24–27).

Algorithm <sup>5</sup> is an adaptation of Algorithm 4, where the orders in the set <sup>M</sup> = {2, 4, 8, 14, 21} are used. Steps 1–15 of Algorithm 5 are equivalent to Steps 1–16 of Algorithm 4. Both values *β*<sup>1</sup> and *β*<sup>2</sup> are computed in the same way in both algorithms while values *β*3, *β*4, and *β*<sup>5</sup> are worked out in Algorithm 5 for the polynomial orders *<sup>m</sup>*<sup>3</sup> <sup>=</sup> 8, *<sup>m</sup>*<sup>4</sup> <sup>=</sup> 14, and *<sup>m</sup>*<sup>5</sup> <sup>=</sup> 21, respectively. Steps 16–31 of Algorithm <sup>5</sup> correspond to Steps 17–29 of Algorithm 4.

**Algorithm 5:** Given a matrix *<sup>A</sup>* <sup>∈</sup> <sup>C</sup>*n*×*n*, the values <sup>Θ</sup> from Table 2, <sup>M</sup> <sup>=</sup> {2, 4, 8, 14, 21} and a tolerance *tol*, this algorithm computes the order of Taylor approximation *mk* <sup>∈</sup> <sup>M</sup> and the scaling factor *s*, together with 2−*sA* and the necessary powers of 4−*sB* for computing 2−*sAPmk* (4−*sB*) from (10), (11) or (12).

```
1 B1 = −A2; B2 = B2
                  1
2 Compute β1 ≈ 
               B3

                   1/3 from B1 and B2
3 f = 0; k = 1
4 while f = 0 and k < 5 do
5 k = k + 1
6 if k < 5 then
7 Compute βk ≈ Bmk+1 1/(mk+1) from B1 and B2
8 else
9 B3 = B1B2
10 Compute β5 ≈ 
                      B22

                           1/22 from B1 and B3
11 end
12 if |βk − βk−1| < tol and βk < Θk then
13 f = 1; s = 0
14 end
15 end
16 if f = 0 then
17 s = max

             0, F
                 1
                 2 log2(βk/Θk)
                             G
18 end
19 if s > 0 then
20 s0 = max

              0, F
                 1
                 2 log2(βk−1/Θk−1)
                                  G
21 if s0 = s then
22 k = k − 1
23 end
24 A = 2−sA
25 B1 = 4−sB1
26 B2 = 16−sB2
27 if k = 5 then
28 B3 = 64−sB3
29 end
30 end
31 m = mk
```


**Table 1.** Values of Θ*mk* , 1 ≤ *k* ≤ 9, for polynomial evaluation by means of the Paterson–Stockmeyer method.

**Table 2.** Values of Θ*mk* , 1 ≤ *k* ≤ 5, for polynomial evaluation using the Sastre formulas.


#### **3. Numerical Experiments**

The following codes have been implemented in MATLAB to test the accuracy and the efficiency of the different algorithms proposed:


Three types of matrices with distinct features were used to build a battery of tests that enabled us to compare the numerical performance of these codes. The MATLAB Symbolic Math Toolbox with 256 digits of precision was used to compute the *"exact"* matrix hyperbolic tangent function using the vpa (variable-precision floating-point arithmetic) function. The test battery featured the following three sets:


equal to 128 were chosen because of their highly different and significant characteristics from each other. We decided to scale these matrices so that they had 1-norm not exceeding 512. As a result, we obtained that 1 ≤ *A* <sup>1</sup> ≤ 489.3. The *"exact"* matrix hyperbolic tangent was calculated by using the two following methods together and the vpa function:


The *"exact"* matrix hyperbolic tangent is considered only if

$$\frac{\|\|T\_1 - T\_2\|\|}{\|\|T\_1\|\|} < u.$$

Although MCT and EMP are really comprised of 72 matrices, only 53 matrices of them, 42 from MCT and 11 from EMP, were considered for our purposes. On the one hand, matrix 6 from MCT and matrix 10 from EMP were rejected because the relative error made by some of the codes to be evaluated was greater or equal to unity. This was due to the ill-conditioning of these matrices for the hyperbolic tangent function. On the other hand, matrices 4, 12, 17, 18, 23, 35, 40, 46, and 51 from MCT and matrices 7, 9, 16, and 17 from EMP were not generated because they did not satisfy the described criterion to obtain the *"exact"* matrix hyperbolic tangent. Finally, matrices 8, 13, 15, and 18 from EMP were refused as they are also part of MCT.

For each of the three previously mentioned sets of matrices, one test was respectively and independently carried out, which indeed corresponds to an experiment to analyse the numerical properties and to account for the computational cost of the different implemented codes. All these experiments were run on an HP Pavilion dv8 Notebook PC with an Intel Core i7 CPU Q720 @1.60 Ghz processor and 6 GB of RAM, using MATLAB R2020b. First, Table 3 shows the percentage of cases in which the normwise relative errors of tanh\_expm are lower than, greater than, or equal to those of tanh\_tayps and tanh\_pol. These normwise relative errors were obtained as

$$\text{Er} = \frac{||\tanh(A) - \tanh(A)||\_1}{||\tanh(A)||\_1}.$$

where tanh(*A*) represents the exact solution and ˜ tanh(*A*) stands for the approximate one.

**Table 3.** The relative error comparison, for the three tests, between tanh\_expm vs. tanh\_tayps and tanh\_expm vs tanh\_pol.


As we can appreciate, from the point of view of the accuracy of the results, tanh\_tayps outperformed tanh\_expm in 68% of the matrices for Test 1 and 100% and 77.36% of them for Tests 2 and 3. On the other hand, tanh\_pol obtained slightly more modest results with improvement percentages equal to 56%, 100%, and 69.81% for Tests 1, 2, and 3, respectively.

Table 4 reports the computational complexity of the algorithms. This complexity was expressed as the number of matrix products required to calculate the hyperbolic tangent of the different matrices that make up each of the test cases. This number of products includes the number of matrix multiplications and the cost of the systems of linear equations that were solved in the recovering phase, by all the codes, together with one more in Step 2 of Algorithm <sup>1</sup> by tanh\_expm.

The cost of each system of linear equations with *n* right-hand side vectors, where *n* denotes the size of the square coefficient matrices, was taken as 4/3 matrix products. The cost of other arithmetic operations, such as the sum of matrices or the product of a matrix by a vector, was not taken into consideration. As can be seen, the lowest computational cost corresponded to tanh\_expm, followed by tanh\_pol and tanh\_tayps. For example, tanh\_expm demanded 1810 matrix products to compute the matrices belonging to Test 1, compared to 1847 by tanh\_pol and 2180 by tanh\_tayps.

**Table 4.** Matrix products (P) corresponding to the tanh\_expm, tanh\_tayps, and tanh\_pol functions for Tests 1–3.


Respectively, for the three experiments, Figures 1–3 illustrate the normwise relative errors (a), the performance profiles (b), the ratio of the relative errors (c), the lowest and highest relative error rate (d), the ratio of the matrix products (e), and the polynomial orders (f) for our three codes to be evaluated.

Figures 1a, 2a and 3a correspond to the normwise relative error. As they reveal, the three methods under evaluation were numerically stable for all the matrices that were computed, and all of them provided very accurate results, where the relative errors incurred were always less than 10−11. The solid line that appears in these figures traces the function *k*tanh*u*, where *k*tanh (or *cond*) stands for the condition number of the matrix hyperbolic tangent function [4] (Chapter 3), and *u* represents the unit roundoff.

It is clear that the errors incurred by all the codes were usually quite close to this line for the three experiments and even below it, as was largely the case for Tests 1 and 3. For the sake of readability in the graphs, normwise relative errors lower than 10−<sup>20</sup> were plotted with that value in the figures. Notwithstanding, their original quantities were maintained for the rest of the results.

Figures 1b, 2b and 3b depict the performance profile of the codes. In them, the *α* coordinate on the *x*-axis ranges from 1 to 5 in steps equal to 0.1. For a concrete *α* value, the *p* coordinate on the *y*-axis indicates the probability that the considered algorithm has a relative error lower than or equal to *α*-times the smallest relative error over all the methods on the given test.

The implementation of tanh\_tayps always achieved the results with the highest accuracy, followed closely by tanh\_pol. For Tests 1 and 2, Figures 1b and 2b indicate that the results provided by them are very similar, although the difference in favour of tanh\_tayps is more remarkable in Test 3. As expected from the percentages given in Table 3, tanh\_expm computed the hyperbolic tangent function with the worst accuracy, most notably for the matrices from Tests 2 and 3.

**Figure 1.** Experimental results for Test 1.

Precisely, the relationship among the normwise relative errors incurred by the codes to be examined is displayed in Figures 1c, 2c and 3c. All these ratios are presented in decreasing order with respect to Er(tanh\_tayps)/Er(tanh\_expm). This factor is less than 1 for the great majority of the matrices, which indicates that tanh\_tayps and tanh\_pol are more accurate codes than tanh\_expm for estimating the hyperbolic tangent function.

These data are further corroborated by the results shown in Figures 1d, 2d and 3d. These graphs report the percentages of the matrices, for each of the tests, in which each code resulted in the lowest or highest normwise relative error among the errors provided by all of them. Thus, tanh\_tayps exhibited the smallest relative error in 44% of the matrices for Test 1 and in 47% of them for Test 3, followed by tanh\_pol in 29% and 36% of the cases, respectively. For all the other cases, tanh\_expm was the most reliable method. For Test 2, tanh\_pol was the most accurate code in 53% of the matrices, and tanh\_tayps was the most accurate in 47% of them. In line with these results, tanh\_expm was found to be the approach that led to the largest relative errors in 51% of the cases in Test 1, in 100% of them in Test 2, and in 64% for Test 3.

**Figure 2.** Experimental results for Test 2.

In contrast, although tanh\_expm proved to be the most inaccurate code, it is also evident that its computational cost was usually the lowest one, as Table 4 and Figures 1e, 2e and 3e reported. The ratio between the number of tanh\_tayps and tanh\_expm matrix products ranged from 0.82 to 1.43 for Test 1, was equal to 1.20 for Test 2, and ranged from 0.82 to 1.8 for Test 3. Regarding tanh\_pol and tanh\_expm, this quotient varied from 0.82 to 1.13 for Test 1, from 0.68 to 1.2 for Test 3, and was equal to 1 for Test 2.

Table 5 lists, in order for Tests 1, 2, and 3, the minimum, maximum, and average values of the degree of the Taylor polynomials *m* and the scaling parameter *s* employed by the three codes. Additionally, and in a more detailed way, Figures 1f, 2f and 3f illustrate the order of the polynomial used in the calculation of each of the matrices that compose the testbed. The value of *<sup>m</sup>* allowed to be chosen was between 2 and 30 for tanh\_expm and tanh\_tayps and between 2 and 21 for tanh\_pol. As we can see, the average value of *<sup>m</sup>* that was typically used varied from 25 and 30 for tanh\_expm. It was around 25 for tanh\_tayps, and it ranged from 14 to 21 for tanh\_pol.

**Figure 3.** Experimental results for Test 3.


**Table 5.** The minimum, maximum, and average parameters *m* and *s* employed for Tests 1–3, respectively.

Having concluded the first part of the experimental results, we continue by comparing tanh\_tayps and tanh\_pol, the Taylor series-based codes that returned the best accuracy in the results. Table <sup>6</sup> presents the percentage of cases in which tanh\_tayps gave place to relative errors that were lower than, greater than, or equal to those of tanh\_pol. According to the exposed values, both methods provided very similar results, and the percentage of cases in which each method was more accurate than the other was approximately equal to 50% for the different tests.


**Table 6.** The relative error comparison for the three tests between tanh\_tayps vs. tanh\_pol.

Figures 4–6 incorporate the normwise relative errors (a), the ratio of relative errors (b), and the ratio of matrix products (c) between tanh\_tayps and tanh\_pol. The graphs corresponding to the performance profiles and the polynomial orders are not included now since the results match with those already shown in the previous figures. All this information is also complemented by Table 7, which collects, respectively for each test, the minimum, maximum, and average relative errors incurred by both methods to be analysed, together with the standard deviation.

As Table <sup>6</sup> details, for Tests 1 and 3, tanh\_tayps slightly improved on tanh\_pol in the percentage of matrices in which the relative error committed was lower, although it is true that the difference between the results provided by the two methods was small in most cases. However, when such a difference occurred, it was more often in favour of tanh\_tayps than the other way around, in quantitative terms.

With all this, we can also appreciate that the mean relative error and the standard deviation incurred by tanh\_pol were lower than those of tanh\_tayps. For matrices that were part of Test 2, the numerical results achieved by both methods were almost identical, and the differences between them were not significant.

Er

(**c**) Ratio of matrix products.

**Figure 5.** Experimental results for Test 2.

**Figure 6.** Experimental results for Test 3.

To conclude the analysis and with regard to the computational cost of both codes, such as depicted in Figures 4c, 5c and 6c, it simply remains to note that tanh\_tayps performed between 1 and 1.43 more matrix products compared with tanh\_pol for Test 1, 1.2 more for Test 2, and between 1.06 and 1.5 more for Test 3. Therefore, tanh\_pol computed the matrix

tangent function with a very similar accuracy in the results compared to tanh\_tayps but with a considerably lower computational cost.


**Table 7.** The minimum, maximum, and average values and the standard deviation of the relative errors committed by tanh\_tayps and tanh\_pol for Tests 1–3.

#### **4. Conclusions**

Two alternative methods to approximate the matrix hyperbolic tangent were addressed in this work. The first was derived from its own definition and reduced to the computation of a matrix exponential. The second method deals with its Taylor series expansion. In this latter approach, two alternatives were developed and differed on how the evaluation of the matrix polynomials was performed. In addition, we provided algorithms to determine the scaling factor and the order of the polynomial. As a result, we dealt with a total of three MATLAB codes (tanh\_expm, tanh\_tayps, and tanh\_pol), which were evaluated on a complete testbed populated with matrices of three different types.

The Taylor series-based codes reached the most accurate results in the tests, a fact that is in line with the recommendation suggested in [33] of using a Taylor development against other alternatives whenever possible. However, codes based on Taylor series can be computationally expensive if the Paterson–Stockmeyer method is employed to evaluate the polynomial, as we confirmed with the tanh\_tayps implementation. One idea to reduce this problem is to use Sastre formulas, as we did in our tanh\_pol code, resulting in an efficient alternative that significantly reduced the number of matrix operations without affecting the accuracy.

The results found in this paper demonstrated that the three codes were stable and provided acceptable accuracy. However, and without underestimating the other two codes, the tanh\_pol implementation proposed here offered the best ratio of accuracy/computational cost and proved to be an excellent method for the computation of the matrix hyperbolic tangent.

**Author Contributions:** Conceptualization, E.D.; methodology, E.D., J.I., J.M.A. and J.S.; software, J.I., J.M.A. and J.S.; validation, J.M.A.; formal analysis, J.I. and J.S.; investigation, E.D., J.I., J.M.A. and J.S.; writing—original draft preparation, E.D., J.I., J.M.A. and J.S.; writing—review and editing, P.A.-J. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Spanish Ministerio de Ciencia e Innovación under grant number TIN2017-89314-P.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.
