1. Introduction and Notation
The calculus of matrix functions has long been an area of interest in applied mathematics due to its multiple applications in many branches of science and engineering; see [1] and the references therein. Among these matrix functions, the matrix exponential stands out due to both its applications and the difficulties of its effective calculation. Basically, given a square matrix $A \in \mathbb{C}^{n \times n}$, its exponential function is defined by the matrix series:

$e^A = \sum_{k=0}^{\infty} \frac{A^k}{k!}$.
Directly related to the exponential function, we find the matrix logarithm. Specifically, given a nonsingular matrix $A \in \mathbb{C}^{n \times n}$ whose eigenvalues lie in $\mathbb{C} \setminus (-\infty, 0]$, we define the matrix logarithm of A as any matrix X satisfying the matrix equation:

$e^X = A$. (1)

Of course, there are infinitely many solutions of Equation (1), but we only focus on the principal matrix logarithm or the standard branch of the logarithm, denoted by $\log(A)$, which is the unique logarithm of matrix A (see Theorem 1.31 of [1]) whose eigenvalues all lie in the strip $\{z \in \mathbb{C} : -\pi < \mathrm{Im}(z) < \pi\}$. Indeed, this principal matrix logarithm is the most used in applications in many fields of research from pure science to engineering [
2], such as quantum chemistry and mechanics [
3,
4], buckling simulation [
5], biomolecular dynamics [
6], machine learning [
7,
8,
9,
10], graph theory [
11,
12], the study of Markov chains [
13], sociology [
14], optics [
15], mechanics [
16], computer graphics [
17], control theory [
18], computer-aided design (CAD) [
19], optimization [
20], the study of viscoelastic fluids [
21,
22], the analysis of the topological distances between networks [
23], the study of brain–machine interfaces [
24], and also in statistics and data analysis [
25], among other areas. Just considering various branches of engineering, the matrix logarithm can be employed to compute the time-invariant component of the state transition matrix of ordinary differential equations with periodic time-varying coefficients [
26] or to recover the coefficient matrix A of a differential system governed by the linear differential equation $y'(t) = A\,y(t)$ from observations of the state vector y [27].
The applicability of the matrix logarithm in so many distinct areas has motivated different approaches for its evaluation. One of the most widely used methods is that proposed by Kenney and Laub in [28], based on the inverse scaling and squaring method and a Padé approximation, exploiting the matrix identity:

$\log(A) = 2^s \log\bigl(A^{1/2^s}\bigr)$.

This method finds an integer s such that $A^{1/2^s}$ is close to the identity matrix, approximates $\log(A^{1/2^s})$ by a [p/q] Padé approximant $r_{pq}(A^{1/2^s} - I)$, and, finally, computes $\log(A) \approx 2^s\, r_{pq}(A^{1/2^s} - I)$.
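The following MATLAB sketch illustrates this idea in its simplest form; the closeness threshold of 0.25 and the fixed [1/1] Padé approximant are illustrative choices only, whereas the actual algorithm of [28] selects s and the Padé degrees [p/q] from rigorous error bounds.

% Minimal sketch of inverse scaling and squaring (illustrative threshold
% and fixed [1/1] Pade approximant; not the optimized algorithm of [28]).
function X = logm_iss_sketch(A)
    n = size(A, 1);
    s = 0;
    while norm(A - eye(n), 1) > 0.25   % until A^(1/2^s) is close to I
        A = sqrtm(A);                  % principal matrix square root
        s = s + 1;
    end
    B = A - eye(n);
    % [1/1] diagonal Pade approximant: log(I + B) ~ B*(I + B/2)^(-1)
    X = 2^s * (B / (eye(n) + B/2));    % undo scaling: log(A) = 2^s*log(A^(1/2^s))
end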
These same authors developed in [27] an algorithm that consists of obtaining the Schur decomposition $A = QTQ^*$, where Q is a unitary matrix, T is an upper triangular matrix, and $Q^*$ is the conjugate transpose of Q. Then, $\log(T)$ is computed as follows: the main diagonal is calculated by applying the scalar logarithm function to its elements, and the upper diagonals are computed by using the Fréchet derivative of the logarithm function. Finally, $\log(A)$ is computed as $\log(A) = Q \log(T) Q^*$.
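In MATLAB terms, the overall scheme can be sketched as follows; for brevity, the triangular logarithm is delegated here to logm, whereas the algorithm of [27] fills in the diagonal with scalar logarithms and the superdiagonals by means of Fréchet derivatives.

% Sketch of the Schur-based approach (sample matrix; the triangular
% logarithm is computed by logm instead of the scheme of [27]).
A = pascal(5);                  % sample matrix with positive eigenvalues
[Q, T] = schur(A, 'complex');   % A = Q*T*Q', Q unitary, T upper triangular
X = Q * logm(T) * Q';           % log(A) = Q*log(T)*Q'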
Later, most of the algorithms developed basically applied the inverse scaling and squaring method with Padé approximants to dense or triangular matrices; see [
29,
30,
31,
32,
33,
34,
35]. Nevertheless, some new algorithms are based on other methods, among which the following can be highlighted:
An algorithm based on the arithmetic–geometric mean iteration; see [
36];
The use of contour integrals; see [
37];
Methods based on different quadrature formulas, proposed in [
38,
39].
Finally, we should mention the built-in MATLAB function, called
logm, that computes the principal matrix logarithm by means of the algorithms described in [
32,
33].
Throughout this paper, we refer to the identity matrix of order n as $I_n$, or simply I. In addition, we denote by $\sigma(A)$ the set of eigenvalues of a matrix $A \in \mathbb{C}^{n \times n}$, whose spectral radius $\rho(A)$ is defined as:

$\rho(A) = \max\{|\lambda| : \lambda \in \sigma(A)\}$.
With $\lceil x \rceil$, we denote the result reached after rounding x to the nearest integer greater than or equal to x, and $\lfloor x \rfloor$ is the result reached after rounding x to the nearest integer less than or equal to x. The matrix norm $\|\cdot\|$ stands for any subordinate matrix norm; in particular, $\|\cdot\|_1$ is the usual 1-norm. Recall that if A is a matrix in $\mathbb{C}^{n \times n}$, its 2-norm or Euclidean norm, represented by $\|A\|_2$, satisfies [40]:

$\|A\|_2 \le \sqrt{\|A\|_1 \|A\|_\infty}$.
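This bound can be checked numerically with a few lines of MATLAB (a sketch on an arbitrary random matrix):

% Sketch: verify ||A||_2 <= sqrt(||A||_1*||A||_inf) on a random matrix.
A = randn(8) + 1i*randn(8);
fprintf('rho(A)  = %.4f\n', max(abs(eig(A))));   % spectral radius
fprintf('||A||_2 = %.4f <= %.4f\n', norm(A, 2), ...
        sqrt(norm(A, 1)*norm(A, inf)));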
All the implemented codes in this paper are intended for IEEE double-precision arithmetic, where the unit roundoff is $u = 2^{-53} \approx 1.11 \times 10^{-16}$. Their implementation for other precisions is straightforward.
This paper is organized as follows:
Section 2 describes an inverse scaling and squaring Taylor algorithm based on efficient evaluation formulas [
41] to approximate the matrix logarithm, including an error analysis.
Section 3 includes the results corresponding to the experiments performed in order to compare the numerical and computational performance of different codes against a test battery composed of distinct types of matrices. Finally,
Section 4 presents the conclusions.
3. Numerical Experiments
In this section, we compare the following five MATLAB codes in terms of accuracy and efficiency:
logm_iss_full: It calculates the matrix logarithm using the transformation-free form of the inverse scaling and squaring method with Padé approximation. Matrix square roots are computed by the product form of the Denman–Beavers iteration ([
1], Equation (6.29)). It corresponds to Algorithm 5.2 described in [
32], denoted as the
iss_new code;
logm_new: It starts with the transformation of the input matrix
A to the Schur triangular form $A = QTQ^*$. Thereafter, it computes the logarithm of the triangular matrix
T using the inverse scaling and squaring method with Padé approximation. The square roots of the upper triangular matrix T are computed by the Björck and Hammarling algorithm ([
1,
49], Equation (6.3)), solving one column at a time. It is an implementation of Algorithm 4.1 explained in [
32], designated as the
iss_schur_new function;
logm: It is a MATLAB built-in function that computes the matrix logarithm by means of the algorithms included in [
32,
33]. MATLAB R2015b incorporated an improved version that applies inverse scaling and squaring and Padé approximants to the whole triangular Schur factor, whereas the previous
logm applied this technique to the individual diagonal blocks in conjunction with the Parlett recurrence. Although it follows a similar main scheme to the
logm_new code, the square roots of the upper triangular matrix
T are computed using a recursive blocking technique derived from the Björck and Hammarling recurrence, which works out the square roots of increasingly smaller triangular matrices [50]. The technique is rich in matrix multiplications and requires solving Sylvester equations. In addition, it allows parallel architectures to be exploited simply by using threaded BLAS;
logm_polf: This is an implementation of Algorithm 1, where the Taylor matrix polynomials are evaluated by means of the Sastre formulas previously detailed; a plain, unoptimized evaluation is sketched after this list. Values from the set were used as the approximation polynomial order for all the tests carried out in this section, and forward relative errors were considered;
logm_polb: It is an implementation identical to the previous code, but based on relative backward errors.
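To fix ideas, the sketch below combines the inverse scaling and squaring strategy with a plain Horner evaluation of the Taylor polynomial of log(I + B); the function name, the closeness threshold, and the Horner scheme are illustrative only, since Algorithm 1 chooses m and s from error bounds and evaluates the polynomial far more cheaply through the Sastre formulas [41].

% Hedged sketch of a Taylor-based logarithm (illustrative threshold;
% Algorithm 1 uses the Sastre evaluation formulas instead of Horner).
function X = logm_taylor_sketch(A, m)
    n = size(A, 1); I = eye(n);
    s = 0;
    while norm(A - I, 1) > 0.25        % scale until A^(1/2^s) is near I
        A = sqrtm(A); s = s + 1;
    end
    B = A - I;
    % Horner evaluation of p(B) = sum_{k=1}^{m} (-1)^(k+1) B^k / k
    X = ((-1)^(m+1)/m) * I;
    for k = m-1:-1:1
        X = X*B + ((-1)^(k+1)/k) * I;
    end
    X = 2^s * (B*X);                   % log(A) ~ 2^s * p(B)
end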
A testbed consisting of the following three sets of different matrices was employed in all the numerical experiments performed. In addition, the MATLAB Symbolic Math Toolbox, with 256 digits of precision, was the tool chosen to obtain the “exact” matrix logarithm, using the vpa (variable-precision floating-point arithmetic) function:
Set 1: One hundred diagonalizable complex matrices of the form $A = V D V^T$, where D is a diagonal matrix with complex eigenvalues and V is an orthogonal matrix obtained as $V = H/\sqrt{n}$, H being a Hadamard matrix and n the matrix order. The 2-norm of these matrices varies from to 300. The “exact” matrix logarithm was calculated as $\log(A) = V \log(D) V^T$ (a generation sketch is given after this list);
Set 2: One hundred nondiagonalizable complex matrices computed as $A = V J V^{-1}$, where J is a Jordan matrix with complex eigenvalues whose algebraic multiplicity is randomly generated between 1 and 3, and V is an orthogonal random matrix with elements in progressively wider intervals from , for the first matrix, to , for the last one. The 2-norm of these matrices ranges from to . The “exact” matrix logarithm was computed as $\log(A) = V \log(J) V^{-1}$;
Set 3: The 52 matrices A from the Matrix Computation Toolbox (MCT) [51] and the 20 from the Eigtool MATLAB Package (EMP) [52], all of them of the same order. The “exact” matrix logarithm was calculated by means of the following three-step procedure:
Calling the MATLAB function eig, which provides, as a result, a matrix V and a diagonal matrix D such that $A = V D V^{-1}$, using the vpa function. If any of the elements of matrix D, i.e., the eigenvalues of matrix A, has a real value less than or equal to zero, it will be replaced by the sum of its absolute value and a random number in the interval . Thus, a new diagonal matrix $\tilde{D}$ will be generated. Next, the matrices $\tilde{A}$ and $\log(\tilde{A})$ will be calculated in the form $\tilde{A} = V \tilde{D} V^{-1}$ and $\log(\tilde{A}) = V \log(\tilde{D}) V^{-1}$;
Computing the matrix logarithm by means of the logm and vpa functions, that is, $\log(\tilde{A}) = \mathrm{logm}(\mathrm{vpa}(\tilde{A}))$;
Considering, as the “exact” matrix logarithm of $\tilde{A}$, the result of the previous step only if the logarithms obtained in the two previous steps agree to within a prescribed tolerance.
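As announced in the description of Set 1, its matrices and their reference logarithms can be generated along the following lines (a sketch: the dimension and the eigenvalue distribution are arbitrary choices, and the paper obtains the reference values with 256-digit vpa arithmetic rather than in double precision).

% Sketch of a Set 1 test matrix: A = V*D*V' with V orthogonal, built
% from a Hadamard matrix (illustrative size and eigenvalues).
n = 16;                               % a valid Hadamard order
V = hadamard(n) / sqrt(n);            % orthogonal: V*V' = I
d = 1 + 9*rand(n, 1) + 1i*rand(n, 1); % eigenvalues away from (-inf, 0]
A = V * diag(d) * V';                 % diagonalizable complex test matrix
L_ref = V * diag(log(d)) * V';        % reference logarithm (vpa in the paper)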
A total of forty-six matrices, forty from the MCT and six from the EMP, were considered for the numerical experiments. The others were excluded for the following reasons:
- Matrices 17, 18, and 40 from the MCT and Matrices 7, 9, and 14 from the EMP did not pass the previous procedure for the “exact” matrix logarithm computation;
- The relative error incurred by some of the codes under comparison was greater than or equal to unity for Matrices 4, 6, 12, 26, 35, and 38 belonging to the MCT and Matrices 1, 4, 10, 19, and 20 pertaining to the EMP. The reason was the ill-conditioning of these matrices for the matrix logarithm function;
- Matrices 2, 9, and 16, which are part of the MCT, and Matrices 15 and 18, incorporated in the EMP, caused the logm_iss_full code to fail. More specifically, the error was due to the fact that the function sqrtm_dbp, responsible for computing the principal square root of a matrix using the product form of the Denman–Beavers iteration, did not converge within the maximum allowed number of 25 iterations;
- Matrices 8, 11, 13, and 16 from the EMP were already incorporated in the MCT.
If we had not taken the previously described precaution of modifying all those matrices with eigenvalues less than or equal to zero, they would have been directly discarded, and, evidently, this set would have contained far fewer matrices.
The accuracy of the distinct codes for each matrix A was tested by computing the normwise relative error as:

$\mathrm{Er} = \dfrac{\bigl\| \log(A) - \widetilde{\log}(A) \bigr\|_2}{\bigl\| \log(A) \bigr\|_2}$,

where $\log(A)$ denotes the “exact” solution and $\widetilde{\log}(A)$ the approximate one. All the tests were carried out on a Microsoft Windows 10 x64 PC with an Intel Core i7-6700HQ CPU @ 2.60 GHz and 16 GB of RAM, using MATLAB R2020b.
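In code, this measurement can be sketched as follows (the test matrix is illustrative, and the reference logarithm is obtained, as in the text, through logm and vpa with 256 digits):

% Sketch of the accuracy measurement used throughout this section.
A = gallery('lehmer', 8);               % sample test matrix
L_exact  = double(logm(vpa(A, 256)));   % "exact" reference via vpa
L_approx = logm(A);                     % any of the compared codes
Er = norm(L_exact - L_approx, 2) / norm(L_exact, 2)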
First of all, three experiments were performed, one for each of the three sets of matrices described above, in order to evaluate the influence of the type of error employed by comparing the numerical and computational performance of
logm_polf and
logm_polb.
Table 4 shows the percentage of cases in which the normwise relative error of logm_polf was lower than, greater than, or equal to that of logm_polb. As expected from the values reported in Table 3, both codes gave almost identical results, with a practically imperceptible improvement for logm_polf in the case of Sets 1 and 3.
The computational cost of logm_polf and logm_polb was basically identical, although slightly lower in the case of the latter code, since the logm_polb function required, in some cases, approximation polynomials of lower order than those of logm_polf. Both performed an identical number of matrix square roots.
After analyzing these results, we can conclude that there are no significant differences between the two codes, and either of them could be used in the remainder of this section for the comparison with the other ones. Be that as it may, the code chosen for this purpose was logm_polf, designated from here on as logm_pol, a decision justified by its numerical and computational equivalence to logm_polb.
Having decided which code based on the Taylor approximation to use, we now assess its numerical peculiarities and those of the other methods for our test battery.
Table 5 sets out a comparison, in percentage terms, of the relative errors incurred by
logm_pol with respect to
logm_iss_full,
logm_new, and
logm.
As we can see, logm_pol outperformed logm_iss_full in accuracy in 100% of the matrices in Sets 1 and 2 and 97.83% of them in Set 3. logm_pol achieved better accuracy than logm_new and logm in 97%, 89%, and 89.13% of the matrices for Sets 1, 2, and 3, respectively.
Graphically,
Figure 1,
Figure 2 and
Figure 3 provide the results obtained in the experiments carried out for each of the sets of matrices. In more detail, they show the normwise relative errors (a), the performance profiles (b), the ratio of the relative errors (c), the lowest and highest relative error rates (d), the polynomial or diagonal Padé orders (e), the execution times (f), and the ratio of the execution times (g) for the four codes analyzed.
Figure 1a,
Figure 2a and
Figure 3a illustrate the normwise relative errors incurred by each of the codes when calculating the logarithm of the different matrices that comprise each of the sets. The solid line that appears in them depicts the function $\kappa_{\log}(A)\,u$, where $\kappa_{\log}(A)$ represents the condition number of the matrix logarithm function ([1], Chapter 3) and u is the unit roundoff. The numerical stability of each method is evidenced if its errors are not much higher than this solid line. The matrices were ordered by decreasing value of $\kappa_{\log}(A)$.
Roughly speaking, logm_pol obtained the lowest error values for most of the test matrices, whereas the highest values corresponded to logm_iss_full, for Sets 1 and 2, or to logm_new and logm, for Set 3. Indeed, in the case of the first two sets, the vast majority of the errors incurred by logm_pol lie below the solid line, while those of logm_new and logm are around this line and those of logm_iss_full clearly above it. The results provided by logm_new and logm were identical in the different experiments performed, with the exception of Set 3, where they were very similar. With regard to Set 3, the high condition number of the logarithm function for some test matrices meant that the maximum relative error committed by logm_pol was, in one isolated case, considerably higher, closely followed by the error recorded by logm_iss_full. Nevertheless, for this third set of matrices, and except for a few concrete exceptions, all the methods provided results whose errors usually lie below the line.
Accordingly, these results showed that the numerical methods on which the different codes are based are stable, especially in the case of the Taylor method and its implementation in the logm_pol function, whose relative errors, as mentioned before, always occupy the lowest positions on the graphs, thus delivering excellent results. It should be clarified that Matrices 19, 21, 23, 27, 51, and 52 from the MCT and Matrix 17 from the EMP are not plotted in Figure 3a because of the excessive condition number of the logarithm function, although they were taken into account in all the other results.
Figure 1b,
Figure 2b and
Figure 3b trace the performance profiles of the four codes where, on the x-axis, $\alpha$ takes values from 1 to 5 with increments of 0.1. Given a specific value of the $\alpha$ coordinate, the corresponding quantity p on the y-axis denotes the probability that the relative error committed by a particular algorithm is less than or equal to $\alpha$ times the smallest relative error incurred by all of them.
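For reference, a performance profile of this kind can be built with a few lines of MATLAB (a sketch over a placeholder error matrix E, whose rows correspond to test matrices and whose columns correspond to codes):

% Sketch: performance profile from a matrix E of relative errors.
E = abs(randn(46, 4)) * 1e-14;          % placeholder error data
alphaGrid = 1:0.1:5;
best = min(E, [], 2);                   % smallest error per matrix
p = zeros(numel(alphaGrid), size(E, 2));
for i = 1:numel(alphaGrid)
    % fraction of matrices whose error is <= alpha times the best error
    p(i, :) = mean(E <= alphaGrid(i) * best);
end
plot(alphaGrid, p); xlabel('\alpha'); ylabel('p');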
In consonance with the results given in
Table 5, the performance profiles showed that
logm_pol was the most accurate function for the three matrix sets. In the case of the matrices belonging to the first two sets, the probability reached practically 100% over most of the profile, while it tended to be around 95% for the matrices comprising the third set. Regarding the other codes,
logm_new and
logm offered practically identical results in accuracy, with a few more noticeable differences in the initial part of
Figure 3b in favor of
logm_new. On the other hand,
logm_iss_full was significantly surpassed by
logm_new and
logm in Sets 1 and 2, with the opposite being true for Set 3, although in a more modest manner.
The ratios in the normwise relative errors between
logm_pol and the other tested codes are presented, in decreasing order according to Er(
logm_iss_full)/Er(
logm_pol), in
Figure 1c,
Figure 2c and
Figure 3c. For the vast majority of the matrices in Sets 1 and 2, the values plotted are always greater than one, which again confirmed the superiority in terms of accuracy of
logm_pol over the other codes, especially for
logm_iss_full. Similar conclusions can be extracted from the analysis of the results for Set 3, although, in this case, the improvement of
logm_pol was more remarkable over
logm_new and
logm for a small group of matrices.
In the form of a pie chart, the percentage of matrices in our test battery in which each method yielded the smallest or the largest normwise relative error over the other three ones is displayed in
Figure 1d,
Figure 2d and
Figure 3d. Thus, on the positive side,
logm_pol resulted in the lowest error in 94%, 86%, and 89% of the cases for each of the matrix sets, respectively. Additionally,
logm_pol never gave the most inaccurate result in the first two matrix sets and only gave the worst result in 2% of the matrices in Set 3. As can be seen,
logm_iss_full gave the highest relative errors in 94% and 88% of the matrices belonging, respectively, to Set 1 and Set 2. In contrast, the highest error rates were shared between
logm_new and
logm in Set 3.
The value of parameter
m, or, in other words, the Taylor approximation polynomial order, in the case of
logm_pol, or the Padé approximant degree, in the case of
logm_iss_full,
logm_new, and
logm, is shown in
Figure 1e,
Figure 2e and
Figure 3e, respectively, for the three sets. It should be clarified that the value and meaning of
m are not comparable between
logm_pol and the rest of the codes. Additionally, and for the three matrix sets,
Table 6 collects those values of
m, together with those ones of
s, i.e., the number of matrix square roots computed, in the form of maxima, minima, means, and medians. Whereas the
logm_pol function always required values of
m much higher than the others, the values of
s were more similar among the different codes, always reaching the smallest values on average in the case of
logm_pol. Consequently,
logm_pol needed to compute fewer square roots than the other codes. Moreover, it should be pointed out that
logm_iss_full,
logm_new, and
logm were forced to use an atypically high value of
s for a few matrices in Set 3. In all the experiments performed,
logm_new and
logm always employed an identical value of
m and
s.
On the other hand,
Figure 1f,
Figure 2f and
Figure 3f and
Table 7 inform us about the execution times spent by the different codes analyzed on computing the logarithm of all the matrices that constituted our testbed. To obtain these overall times, the runs were launched six times, discarding the first one and averaging the remaining five. As we can appreciate,
logm was always the fastest function, followed by
logm_iss_full, for Sets 1 and 2, or by
logm_pol, for Set 3. In contrast,
logm_new always took the longest time in our three experiments. Numerically speaking, and according to the times recorded for Set 3,
logm_pol was 1.74- and 2.04-times faster than
logm_iss_full and
logm_new, respectively. This notwithstanding,
logm_pol and the other codes were clearly outperformed by
logm, whose speed of execution in computing the logarithm of this third set of matrices was 2.71-times higher than that of
logm_pol.
In more detail,
Figure 1g,
Figure 2g and
Figure 3g provide the ratio between the computation time of the other codes and
logm_pol, individually for each of the matrices, in a decreasing order consistent with the factor T(
logm_iss_full)/T(
logm_pol). As an example, for Set 3, this ratio took values from 0.28 to 15.11 for
logm_iss_full, from 0.32 to 28.03 for
logm_new, and from 0.05 to 3.93 for
logm.
Lastly, it is worth mentioning that, after profiling the execution times, it could be appreciated that much of the time spent by all the codes was devoted to the square root computation. Indeed, observe that logm achieved better execution times than logm_new, despite the two codes sharing a similar theoretical scheme. This was mainly due to two reasons: first, the recursive approach that logm follows to compute the square root of a triangular matrix is much more efficient than that of logm_new; second, implementation issues, since, among others, logm is composed of distinct built-in functions in charge of, for example, calculating the Schur form, solving the Sylvester equation, or solving systems of linear equations, resulting in a very cost-effective code that cannot be beaten in speed unless all of its improvements are also implemented in its competitors.
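To make the dominant cost concrete, the following sketch reproduces the product form of the Denman–Beavers iteration ([1], Equation (6.29)) with which logm_iss_full computes its square roots; the convergence tolerance is illustrative, and the iteration cap of 25 matches the limit mentioned above for sqrtm_dbp.

% Sketch of the product form of the Denman-Beavers iteration for the
% principal square root A^(1/2) ([1], Eq. (6.29)); illustrative tolerance.
function X = sqrtm_db_product_sketch(A)
    n = size(A, 1); I = eye(n);
    X = A; M = A;
    for k = 1:25                        % iteration cap quoted in the text
        invM = inv(M);
        X = X * (I + invM) / 2;         % X_{k+1} = X_k*(I + M_k^{-1})/2
        M = (I + (M + invM)/2) / 2;     % M_{k+1} -> I at convergence
        if norm(M - I, 1) <= 1e-14, return, end
    end
    warning('DB product-form iteration did not converge in 25 steps');
end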