1. Introduction
Calculating the inverse of a matrix, especially one of large size, is a difficult task with a high computational cost. An alternative is the use of iterative algorithms to estimate it. In a vast range of fields, such as image and signal processing [1,2,3,4], encryption [5,6], control system analysis [7,8], etc., it is necessary to calculate the inverse, or different generalized inverses, in order to solve the problems posed.
In recent years, many iterative schemes of different orders of convergence have been designed to estimate the inverse of a complex matrix A or some generalized inverse (Moore–Penrose inverse, Drazin inverse, etc.). In 2013, Weiguo et al. [9] constructed a sequence of third-order iterations converging to the Moore–Penrose inverse. In the same year, Toutounian and Soleymani [10] presented a high-order method for approximating inverses and pseudo-inverses of complex matrices, based on Homeier's scheme with a derivative-free composition. More recently, Stanimirović et al. [11] designed efficient transformations of the hyperpower iterative method for computing generalized inverses, with the aim of minimizing the number of matrix products required per cycle. In 2020, Kaur et al. [12] established new formulations of the fifth-order hyperpower method to compute the weighted Moore–Penrose inverse, improving the efficiency indices. Such approximations were found to be robust and effective when implemented as preconditioners for solving linear systems. All these schemes were designed starting from iterative procedures without memory, that is, procedures in which each new iterate is calculated using only the information provided by the previous one.
Iterative procedures that use more than one previous iterate to calculate the next one are called methods with memory. In the context of matrix inverse approximation, a secant scheme was proposed by the authors in [13]. For a nonsingular matrix, the secant method gives an estimation of the inverse and, when the matrix is singular, an approximation of the pseudo-inverse and of the Drazin inverse. Furthermore, superlinear convergence was proven in all cases.
In this manuscript, we focus on constructing several iterative methods with memory, free of inverse operators and with different orders of convergence, for finding the inverse of a nonsingular complex matrix. We also analyze the proposed schemes for computing the Moore–Penrose inverse of complex rectangular matrices. Moreover, these procedures allow the approximation of other generalized inverses, such as the Drazin inverse or the group inverse, although this is beyond the scope of this work.
Let A be a complex nonsingular $n \times n$ matrix. The design of iterative algorithms without memory for estimating the inverse matrix, which we call Newton–Schulz-type methods, is mostly based on iterative solvers for the scalar equation $f(x) = 0$ applied to the nonlinear matrix equation:
$$F(X) = X^{-1} - A = 0,$$
where $F$ is a nonlinear matrix function. The most-used iterative scheme to approximate $A^{-1}$ is the Newton–Schulz scheme [14]:
$$X_{k+1} = X_k (2I - A X_k), \quad k = 0, 1, 2, \ldots,$$
where I is the identity matrix of size $n \times n$.
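As an illustration of the scheme above, the following sketch implements the Newton–Schulz iteration. The initialization $X_0 = A^*/(\|A\|_1 \|A\|_\infty)$ is a common sufficient choice assumed here for the example; it is not prescribed by the text.

```python
import numpy as np

def newton_schulz(A, tol=1e-12, max_iter=100):
    """Estimate A^{-1} via the quadratically convergent Newton-Schulz scheme."""
    n = A.shape[0]
    I = np.eye(n)
    # Common safe initialization (assumed): X0 = A* / (||A||_1 * ||A||_inf)
    X = A.conj().T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
    for k in range(max_iter):
        X_new = X @ (2 * I - A @ X)          # X_{k+1} = X_k (2I - A X_k)
        if np.linalg.norm(X_new - X) < tol:
            return X_new, k + 1
        X = X_new
    return X, max_iter

A = np.array([[4.0, 1.0], [2.0, 3.0]])
X, iters = newton_schulz(A)
print(np.allclose(X @ A, np.eye(2)))  # True: X approximates A^{-1}
```

Since the residual is squared at every step, a handful of iterations already reaches machine precision for well-conditioned matrices.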
On the other hand, in the context of iterative procedures with memory, the authors presented in [13] a secant-type method (SM), whose iterative expression is:
$$X_{k+1} = X_k + X_{k-1} - X_{k-1} A X_k, \quad k = 1, 2, \ldots, \qquad (3)$$
where $X_0$ and $X_1$ are the starting guesses. With a particular choice of the initial approximations, the authors proved that the sequence $\{X_k\}$, obtained by (3), converges to $A^{-1}$ with order of convergence $\frac{1+\sqrt{5}}{2} \approx 1.6180$.
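A minimal sketch of such a secant-type update follows, assuming the arrangement $X_{k+1} = X_k + X_{k-1} - X_{k-1} A X_k$ (an assumption made here for illustration; the precise arrangement in [13] may differ). Its error satisfies $E_{k+1} = -E_{k-1} A E_k$, which yields the golden-ratio order quoted above.

```python
import numpy as np

def secant_type(A, X0, X1, tol=1e-12, max_iter=100):
    """Hypothetical secant-type update X_{k+1} = X_k + X_{k-1} - X_{k-1} A X_k.
    Its error obeys E_{k+1} = -E_{k-1} A E_k, i.e., superlinear convergence."""
    Xp, X = X0, X1
    for k in range(max_iter):
        X_new = X + Xp - Xp @ A @ X
        if np.linalg.norm(X_new - X) < tol:
            return X_new, k + 1
        Xp, X = X, X_new
    return X, max_iter

A = np.array([[4.0, 1.0], [2.0, 3.0]])
scale = np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf)
X0 = A.T / scale                        # assumed safe first seed
X1 = X0 @ (2 * np.eye(2) - A @ X0)      # one Newton-Schulz step as second seed
X, iters = secant_type(A, X0, X1)
print(np.allclose(X @ A, np.eye(2)))
```

Note that each step needs only two matrix products, one fewer than a Newton–Schulz step plus its residual evaluation.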
To analyze the convergence order of iterative methods with memory for solving nonlinear equations $f(x) = 0$, the R-order is used (see [15]), which we summarize below.
Theorem 1. Let ψ be an iterative method with memory that generates a sequence $\{x_k\}$ of approximations to the root α of $f(x) = 0$, and let this sequence converge to α. If there exist a nonzero constant η and nonnegative numbers $m_i$, $i = 0, 1, \ldots, s$, such that the inequality:
$$|e_{k+1}| \le \eta \prod_{i=0}^{s} |e_{k-i}|^{m_i}$$
holds, then the R-order of convergence of the iterative method ψ satisfies the inequality:
$$O_R(\psi, \alpha) \ge p,$$
where p is the unique positive root of the equation:
$$p^{s+1} - \sum_{i=0}^{s} m_i\, p^{s-i} = 0.$$
Here, $e_k = x_k - \alpha$ denotes the error of the approximation in the kth iterative step.
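Theorem 1 reduces the R-order computation to finding a polynomial root. For instance, an error bound $|e_{k+1}| \le \eta |e_k||e_{k-1}|$ (exponents $m_0 = m_1 = 1$) gives the golden ratio. A short sketch, with the bisection bracket chosen as a simple assumption:

```python
def r_order(m):
    """Unique positive root of p^{s+1} - m_0 p^s - ... - m_s = 0 (Theorem 1),
    found by bisection; m = [m_0, ..., m_s] are the error-equation exponents."""
    def poly(p):
        s = len(m) - 1
        return p ** (s + 1) - sum(mi * p ** (s - i) for i, mi in enumerate(m))
    lo, hi = 1.0, 1.0 + sum(m)   # the positive root lies in (1, 1 + sum(m))
    for _ in range(200):
        mid = (lo + hi) / 2
        if poly(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

print(round(r_order([1, 1]), 4))  # secant-type: golden ratio, 1.618
print(round(r_order([2, 1]), 4))  # e.g. e_{k+1} ~ e_k^2 e_{k-1}: 2.4142
```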
Continuing with the idea of designing iterative procedures with memory, in this work, we use the scalar iterative methods of Kurchatov and Steffensen with memory, which we adapt to the context of matrix equations. In the scalar case, this kind of procedure can reach a higher order of convergence than schemes without memory with the same number of functional evaluations, and it usually has better stability properties. However, as far as we know, only the secant scheme has been extended to matrix equations with good results, in [13].
Kurchatov's scheme is an iterative algorithm with memory, with quadratic convergence, for solving scalar equations $f(x) = 0$ (see [16]). It is deduced from Newton's method by replacing $f'(x_k)$ with Kurchatov's divided difference, that is:
$$x_{k+1} = x_k - \frac{f(x_k)}{f[2x_k - x_{k-1}, x_{k-1}]}, \quad k = 1, 2, \ldots$$
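A compact sketch of this scalar iteration (the seeds and tolerance are illustrative assumptions):

```python
def kurchatov(f, x0, x1, tol=1e-12, max_iter=50):
    """Kurchatov's method with memory: Newton's scheme with f'(x_k) replaced
    by the divided difference f[2x_k - x_{k-1}, x_{k-1}]."""
    xp, x = x0, x1
    for k in range(max_iter):
        y = 2 * x - xp
        dd = (f(y) - f(xp)) / (y - xp)      # Kurchatov divided difference
        x_new = x - f(x) / dd
        if abs(x_new - x) < tol:
            return x_new, k + 1
        xp, x = x, x_new
    return x, max_iter

root, iters = kurchatov(lambda t: t * t - 2.0, 1.0, 1.5)
print(abs(root - 2 ** 0.5) < 1e-10)  # converges to sqrt(2)
```

Only one new function evaluation per step beyond those reused from memory is required, which is what makes schemes of this kind efficient.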
Regarding Steffensen's scheme, it was initially developed in [17], where the derivative $f'(x_k)$ was replaced by the divided difference $f[x_k, x_k + f(x_k)]$. It still holds the original quadratic convergence but, in practice, the set of converging initial estimations is considerably smaller than that of Newton's scheme. It was Traub who, in [18], introduced an accelerating parameter γ in the divided difference, such that $w_k = x_k + \gamma f(x_k)$, deriving an iterative method whose error equation was:
$$e_{k+1} = (1 + \gamma f'(\alpha))\, c_2\, e_k^2 + O(e_k^3),$$
with $c_2 = \frac{f''(\alpha)}{2 f'(\alpha)}$. With the aim of increasing the order of convergence of the scheme, Traub defined an approximation of $\gamma = -1/f'(\alpha)$ by $\gamma_k = -1/f[x_k, x_{k-1}]$, obtaining the procedure:
$$x_{k+1} = x_k - \frac{f(x_k)}{f[x_k, w_k]}, \quad w_k = x_k + \gamma_k f(x_k),$$
which, given initial values $x_0$ and $\gamma_0$, has order of convergence $1 + \sqrt{2} \approx 2.4142$.
The stability of these schemes with memory for scalar problems was analyzed firstly in [19,20], showing very stable performance in both cases. In what follows, some definitions and properties of vectorial discrete dynamical systems are introduced. These tools will prove useful in the following sections.
Basic Concepts of Qualitative Studies of Schemes with Memory
Let us start with an iterative scheme with memory, used to solve the scalar problem $f(x) = 0$, that employs two previous iterates to calculate the next one:
$$x_{k+1} = \phi(x_{k-1}, x_k), \quad k = 1, 2, \ldots,$$
with $x_0$ and $x_1$ being its seeds. Therefore, a solution α is estimated whether $\phi(\alpha, \alpha) = \alpha$ or, equivalently, $f(\alpha) = 0$. This estimation can be obtained as a fixed point of a vectorial operator G by means of:
$$G(x_{k-1}, x_k) = (x_k, \phi(x_{k-1}, x_k)), \quad k = 1, 2, \ldots,$$
with again $x_0$ and $x_1$ being the seeds.
To study the qualitative behavior of a fixed point iterative scheme with memory, it is applied on a low-degree polynomial $p(x)$, generating a vectorial rational operator $G: \mathbb{R}^2 \rightarrow \mathbb{R}^2$. Then, the orbit of a point $(x_0, x_1)$ is the set:
$$\{(x_0, x_1), G(x_0, x_1), G^2(x_0, x_1), \ldots, G^m(x_0, x_1), \ldots\}.$$
Therefore, a point $(x_0, x_1)$ is a fixed point if it is the only point belonging to its orbit, and it is a T-periodic point if its orbit is composed only of the T points $(x_0, x_1), G(x_0, x_1), \ldots, G^{T-1}(x_0, x_1)$.
On the other hand, the orbit of a point can be classified depending on its asymptotic behavior, according to the following result.
Theorem 2 ([21], p. 558). Let $G: \mathbb{R}^n \rightarrow \mathbb{R}^n$ be $\mathcal{C}^2$. Let us also assume that $x^*$ is a period-k point. Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of the Jacobian matrix $G'(x^*)$: - (a)
If $|\lambda_j| < 1$ for all $j = 1, 2, \ldots, n$, then $x^*$ is an attractor. It is said to be a superattractor if $\lambda_j = 0$ for all $j = 1, 2, \ldots, n$.
- (b)
If there exists $j$ such that $|\lambda_j| > 1$, then the periodic point $x^*$ is unstable (repelling or saddle).
- (c)
If $|\lambda_j| > 1$ for all $j = 1, 2, \ldots, n$, then $x^*$ is a repulsor.
Moreover, a point is defined as a critical point of G if all the eigenvalues of the Jacobian matrix $G'$ vanish at it. Indeed, if a critical point is not a zero of $p(x)$, it is named a free critical point. In a similar way, if a fixed point is not a zero of $p(x)$, it is named a strange fixed point.
Denoting by $x^*$ an attracting fixed point of G, its basin of attraction $\mathcal{A}(x^*)$ is defined as:
$$\mathcal{A}(x^*) = \{(x_0, x_1) \in \mathbb{R}^2 : G^m(x_0, x_1) \rightarrow x^*, \ m \rightarrow \infty\}.$$
The union of all the basins of attraction of G defines the Fatou set, $\mathcal{F}$, and its complementary set in $\mathbb{R}^2$ is the Julia set, $\mathcal{J}$. The latter holds all the repelling fixed points and sets the boundary among the basins of attraction.
In Section 2 and Section 3, we build a Kurchatov-type and a Steffensen-type iterative method with memory, respectively, to estimate the inverse of a nonsingular complex matrix. Both schemes are free of inverse operators, and we prove their order of convergence and stability. In Section 4, we extend these schemes to calculate the Moore–Penrose inverse of a complex rectangular matrix. Section 5 is devoted to the numerical tests that analyze their performance and confirm the theoretical results. We end the work with some conclusions.
2. Kurchatov-Type Method
The Kurchatov divided difference was defined in [16] as:
$$f[2x_k - x_{k-1}, x_{k-1}] = \frac{f(2x_k - x_{k-1}) - f(x_{k-1})}{2(x_k - x_{k-1})}. \qquad (4)$$
For an equation $f(x) = 0$, Kurchatov's method has order of convergence two and iterative expression:
$$x_{k+1} = x_k - \frac{f(x_k)}{f[2x_k - x_{k-1}, x_{k-1}]}, \quad k = 1, 2, \ldots$$
If we apply this method to the matrix equation $F(X) = X^{-1} - A = 0$, where $X \in \mathbb{C}^{n \times n}$ and A is an $n \times n$ nonsingular matrix, not necessarily diagonalizable, we obtain:
$$X_{k+1} = X_k - [2X_k - X_{k-1}, X_{k-1}; F]^{-1} F(X_k), \quad k = 1, 2, \ldots \qquad (5)$$
Next, we show a technical lemma whose result we use later.
Lemma 1. Let be invertible and commutative, , then: Proof. Using algebraic manipulations,
and the lemma is proven. □
It is known that, for any nonsingular $n \times n$ complex matrix A, there exist $n \times n$ unitary matrices U and V such that $A = U \Sigma V^*$, $\Sigma$ being the diagonal matrix of the singular values of A, with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n > 0$. Moreover, $U^*$ is the conjugate transpose of U. Then, $A^{-1} = V \Sigma^{-1} U^*$, and the iterates can be transformed by means of U and V, which we apply in Equation (5). If the initial approximations $X_0$ and $X_1$ are chosen so that their transformed matrices are diagonal, then all the transformed iterates are diagonal for all $k \ge 0$. Applying several algebraic manipulations on the last equation, we can ensure:
Using Lemma 1, we obtain:
In this iterative expression, an inverse matrix still appears. Therefore, Kurchatov's scheme is not directly transferable to the calculation of matrix inverses. For this reason, we propose a slight change in the Kurchatov divided difference:
obtaining a new iterative scheme,
Now, we apply this iterative procedure to the matrix equation $F(X) = X^{-1} - A = 0$:
Using the transformations defined above and Lemma 1, we obtain:
and then,
From this expression, by using again the transformations defined above, we have
and
where three matrix products appear, but there are no inverse operators. Equation (8) corresponds to the iterative scheme of the modified Kurchatov-type method for matrix inversion, which we call MKTM.
2.1. Convergence Analysis
Let us consider now two $n \times n$ unitary matrices U and V satisfying $A = U \Sigma V^*$, where the singular values $\sigma_i$, $i = 1, 2, \ldots, n$, of A fulfill $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n > 0$. We define again the transformed iterates as in the previous section. By using Equation (7), we obtain:
Next, we demonstrate the order of convergence of the iterative procedure MKTM.
Theorem 3. Let $A \in \mathbb{C}^{n \times n}$ be a nonsingular matrix and $X_0$ and $X_1$ be the initial approximations such that their transformed matrices are diagonal. Then, the sequence $\{X_k\}$ obtained by means of the iterative expression (8), where $A = U \Sigma V^*$ is the singular-value decomposition of A, converges to the inverse $A^{-1}$, with convergence order 1.6180. Proof. By means of component-by-component calculations, we obtain:
By subtracting the corresponding exact value from both sides of the last equation, we obtain:
This result shows that, for each $i = 1, 2, \ldots, n$, the sequence in Equation (9) converges with order of convergence $\frac{1+\sqrt{5}}{2} \approx 1.6180$, which is the only positive root of $p^2 - p - 1 = 0$ (see Theorem 1). In this way, for each $i$, the componentwise errors are bounded and tend to zero when $k \rightarrow \infty$. Using this result, we obtain:
Therefore, we can affirm that $\{X_k\}$ converges to $A^{-1}$. □
To demonstrate the stability of the modified Kurchatov-type iterative scheme, we use the definition introduced by Higham in [14] for the stability of an iterative scheme $X_{k+1} = H(X_k)$ with a fixed point $X^*$. Assuming that H is Fréchet differentiable at the fixed point $X^*$, the process is stable in a neighborhood of $X^*$ if there exists a positive constant C such that the powers of the Fréchet derivative of H at $X^*$ are uniformly bounded by C.
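As a concrete check of this definition, the sketch below perturbs the fixed point $X^* = A^{-1}$ of the Newton–Schulz map $H(X) = X(2I - AX)$ (used here as a stand-in example, since for it one can verify by expansion that $H(X^* + E) - X^* = -EAE$, so its Fréchet derivative at $X^*$ is zero and perturbations are damped quadratically):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[4.0, 1.0], [2.0, 3.0]])
X_star = np.linalg.inv(A)                      # fixed point of H(X) = X(2I - AX)

damped = []
for eps in (1e-3, 1e-5):
    E = eps * rng.standard_normal((2, 2))      # small perturbation of X*
    H_X = (X_star + E) @ (2 * np.eye(2) - A @ (X_star + E))
    # H(X* + E) - X* = -E A E, so the residual is O(||E||^2)
    damped.append(np.linalg.norm(H_X - X_star)
                  <= 10 * np.linalg.norm(A) * np.linalg.norm(E) ** 2)
print(all(damped))
```

Bounded (here, vanishing) powers of the Fréchet derivative are exactly the condition in Higham's definition; the same kind of numerical check can be applied to any iteration map once its fixed point is known.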
Theorem 4. The modified Kurchatov-type iterative scheme to estimate the inverse of a nonsingular complex matrix defined by expression (8) is stable. Proof. The modified Kurchatov-type method can be written as follows:
Denoting by L the Fréchet derivative of this operator at the fixed point, we conclude that L is idempotent. Since an idempotent operator satisfies $L^k = L$ for every $k \ge 1$, its powers are uniformly bounded, and therefore (8) is a stable iterative process. □
2.2. Qualitative Analysis of Kurchatov-Type Scheme
Now, we consider the Kurchatov-type scheme as a scalar iterative procedure for solving nonlinear equations, analyze its convergence, and study its stability. In this analysis, stability is understood as the dependence on the initial estimations, and it is studied by applying vectorial real dynamics concepts to the Kurchatov-type scheme, whose expression is:
In the following result, we show that this scheme for solving nonlinear scalar equations has superlinear convergence.
Theorem 5. Let us consider a sufficiently differentiable function $f: I \subseteq \mathbb{R} \rightarrow \mathbb{R}$ in an open neighborhood I of the simple root α of the nonlinear equation $f(x) = 0$. Furthermore, let us assume that $f''$ is continuous at α and that the initial estimations $x_0$, $x_1$ and α are close enough to each other. Then, the sequence $\{x_k\}$, $k \ge 0$, generated by the Kurchatov-type scheme converges to α with order of convergence $\frac{1+\sqrt{5}}{2} \approx 1.6180$, its error equation being:
where $O_2(e_k, e_{k-1})$ denotes terms in the error equation that depend on powers of the errors $e_k$ and $e_{k-1}$ such that the sum of the exponents is at least two; moreover, $e_k = x_k - \alpha$ and $c_j = \frac{f^{(j)}(\alpha)}{j!\, f'(\alpha)}$, $j = 2, 3, \ldots$ Proof. It is known that the Taylor expansion of $f(x_k)$ around α is:
By using the Genocchi–Hermite formula (see [15]) and the expansion of $f'$ in the Taylor series around x, the expansion of the divided difference is:
Then, the expression of the divided difference as a function of the errors $e_k$ and $e_{k-1}$ is:
By applying Theorem 1, the only positive real root of $p^2 - p - 1 = 0$ (where the coefficients correspond to the exponents of $e_k$ and $e_{k-1}$ in the error equation) is the convergence order of the method, that is, $p = \frac{1+\sqrt{5}}{2} \approx 1.6180$. □
From now on, we denote by KT the fixed point operator associated with the Kurchatov-type method applied on a low-degree polynomial $p(x)$. As it does not use derivatives, it is not possible to establish a scaling theorem. Therefore, we cannot carry out the analysis on generic second-degree polynomials.
The fixed point operator depends on two variables: $x_{k-1}$ (denoted by x) and $x_k$ (denoted by z).
Let us analyze now the qualitative performance of the rational operator KT by means of the asymptotic behavior of its fixed points and the existence of free critical points.
Theorem 6. The only fixed points of the rational operator KT are the roots of the polynomial $p(x)$, both being superattracting. Moreover, KT has no free critical points.
Proof. We solve the fixed point equation:
finding that the only fixed points are the two roots of $p(x)$. To study their stability, we consider the Jacobian matrix of KT:
whose eigenvalues are:
and:
As both eigenvalues vanish at the fixed points, we conclude that these fixed points are superattracting.
As an immediate consequence of this analysis and of the eigenvalues $\lambda_1$ and $\lambda_2$, the only points at which both eigenvalues vanish are the fixed points themselves. Then, they are the only critical points, and there do not exist free critical points. □
From these results, we conclude that no performance other than convergence to the roots is possible when the Kurchatov-type scheme is applied on $p(x)$, as any other basin of attraction would need a free critical point inside.
In Figure 1, we show the dynamical plane of the KT operator, x and z being real and corresponding to the abscissa and ordinate axes, respectively (see [22] for the routines). The dynamical plane is generated on a mesh of initial points, with a maximum of 40 iterations; the stopping criterion is a distance to the root lower than a prescribed tolerance.
Then, each point of the mesh is considered as a seed of the method; when it converges to one of the roots of $p(x)$, it is represented in orange or green, depending on the root it has converged to. The brighter the color, the lower the number of iterations needed. When an initial estimation reaches 40 iterations without convergence, it is colored in black.
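The mesh-classification logic behind such dynamical planes can be sketched as follows. Since the explicit Kurchatov-type operator is given earlier in the section, the plain Kurchatov iteration on the polynomial $p(t) = t^2 - 1$ is used here as a stand-in, and the mesh size and tolerance are illustrative assumptions:

```python
def basin_label(x, z, max_iter=40, tol=1e-3):
    """Classify a seed (x_0, x_1) = (x, z): +1 or -1 if the orbit converges to a
    root of p(t) = t^2 - 1, or 0 ("black") if no convergence in 40 iterations."""
    p = lambda t: t * t - 1.0
    for _ in range(max_iter):
        if abs(z - 1.0) < tol:
            return 1
        if abs(z + 1.0) < tol:
            return -1
        denom = 2.0 * (z - x)                  # (2z - x) - x
        if denom == 0.0:
            return 0
        dd = (p(2.0 * z - x) - p(x)) / denom   # Kurchatov divided difference
        if dd == 0.0:
            return 0
        x, z = z, z - p(z) / dd                # shift memory, advance iterate
    return 0

# Coarse 11 x 11 mesh over [-3, 3]^2; each entry is the color of that seed
mesh = [[basin_label(-3.0 + 0.6 * i, -3.0 + 0.6 * j) for j in range(11)]
        for i in range(11)]
print(sum(row.count(1) + row.count(-1) for row in mesh))  # converged seeds
```

A production version would use a much finer mesh and record the iteration count per seed to modulate the brightness, as described above.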
In Figure 1, we observe that the basins of attraction of both roots of $p(x)$ have symmetric shapes. In spite of the only basins of attraction being those of the roots (Theorem 6), there are black areas in the dynamical planes. They correspond to areas of slow convergence, whose initial estimations require a higher number of iterations.
In [19,20], the stability analysis of the secant and Steffensen with memory schemes was performed, among other iterative schemes with memory. In both cases, the dynamical planes plotted showed full convergence to the roots, without black areas of slow convergence or divergence. Later on, we check how these differences in stability between the Kurchatov-type scheme and the secant or Steffensen with memory schemes affect their numerical performance.
5. Numerical Experiments
In this section, we present numerical tests of the behavior of the Steffensen method with memory (SMM) and the modified Kurchatov-type method (MKTM), designed to calculate the inverse and the Moore–Penrose inverse, applied to different matrices. For comparison, we used the Newton–Schulz method (NS) [14] and the secant method (SM) [13]. The numerical calculations were made with MATLAB 2022a (MathWorks, USA) using a 3 GHz 10-Core Intel Xeon W processor with 64 GB 2666 MHz DDR4 memory (iMac Pro). As stopping criteria for all numerical tests, we used a tolerance on the difference between consecutive iterates or on the residual, $F(X) = 0$ being the nonlinear matrix equation to be solved for estimating the inverse or pseudo-inverse of a complex matrix A.
In addition, when verifying the theoretical results numerically, we used the computational order of convergence (COC) introduced by Jay [25], defined as:
$$p \approx \frac{\ln(\|e_{k+1}\| / \|e_k\|)}{\ln(\|e_k\| / \|e_{k-1}\|)}, \quad e_k = X_k - A^{-1}.$$
Another numerical approximation of the theoretical order of convergence, presented by the authors in [26] and denoted by ACOC, is defined as:
$$p \approx \frac{\ln(\|X_{k+1} - X_k\| / \|X_k - X_{k-1}\|)}{\ln(\|X_k - X_{k-1}\| / \|X_{k-1} - X_{k-2}\|)}.$$
To show the order of convergence of the methods in the numerical tests, we used either of these estimates of the computational order. In the tables, we write "-" when the COC (or ACOC) vector is unstable. Furthermore, the mean elapsed time after 50 executions of the codes appears in the tables, calculated by using the cputime command.
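The ACOC estimate needs only the norms of consecutive differences, so it can be computed from the iterates alone; a sketch using the standard three-difference formula (the synthetic data below are an illustrative assumption):

```python
import math

def acoc(diffs):
    """Approximated computational order of convergence from the norms of
    consecutive differences d_k = ||X_{k+1} - X_k||:
    rho_k = ln(d_{k+1}/d_k) / ln(d_k/d_{k-1})."""
    return [math.log(diffs[k + 1] / diffs[k]) / math.log(diffs[k] / diffs[k - 1])
            for k in range(1, len(diffs) - 1)]

# Synthetic quadratically convergent differences, d_{k+1} ~ d_k^2
d = [1e-1, 1e-2, 1e-4, 1e-8, 1e-16]
print(acoc(d))  # each entry is close to 2, as expected for quadratic convergence
```

Unlike the COC, no knowledge of the exact inverse is required, which is why the ACOC is the practical choice for the rectangular (Moore–Penrose) experiments.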
Example 1. As a first example, we looked for the inverses of random matrices of size $n \times n$, with sizes up to $n = 500$. Newton–Schulz's method needs one initial approximation, while the rest of the methods need two initial approximations; suitable starting guesses were taken in each case.
Table 1 shows the results obtained by approximating the inverses of nonsingular random matrices of sizes up to 500 using the Newton–Schulz, secant, Steffensen with memory, and Kurchatov-type methods. The number of iterations, the residuals, and the COC value are shown. The results confirm the theoretical order of convergence of each method, and all of the methods provide an approximation of the inverse of A. In all cases, the Steffensen method with memory showed the best results in terms of the number of iterations and the computational time. The graphs shown in Figure 2 represent the results presented in Table 1.
Example 2. Now, we built square matrices of size $n \times n$ using different MATLAB functions, such as:
- (a) A symmetric and positive definite matrix.
- (b) A Riemann matrix.
- (c) A Hankel matrix.
- (d) A Toeplitz matrix.
- (e) A Leslie matrix, with application in problems of population models.
- (f) A Parter matrix.
Here, we used analogous stopping criteria and the same initial approximations as in Example 1. The numerical results obtained are shown in Table 2 and Figure 3. As in the previous example, the proposed methods showed good performance in terms of stability, precision, and number of iterations required.
Example 3. Finally, we tested the methods for computing the Moore–Penrose inverse of random $m \times n$ matrices for different values of m and n. The initial approximations were calculated in the same way as in the previous examples, and the stopping criterion was the one used in Example 1.
The results obtained for the number of iterations, the residuals, and the ACOC value in Example 3 are shown in Table 3 and Figure 4. The methods gave us an approximation of the Moore–Penrose inverse and showed the same behavior as in the previous examples.
6. Conclusions
In this manuscript, we widened the set of iterative methods that can be applied to estimate generalized inverses of complex matrices, by using schemes with memory that improve the Newton–Schulz scheme. Two procedures with memory were designed for approximating the inverses of nonsingular complex matrices, or pseudo-inverses in the case of singular matrices. Their order of convergence and stability were proven and, in the case of the Steffensen with memory scheme, its order of convergence improves that of the Newton–Schulz method.
The method using Kurchatov's divided differences cannot be directly adapted to estimate generalized inverses of complex matrices. To overcome this difficulty, a Kurchatov-type divided difference was used; in this process, there was a decrease in the order of convergence with respect to the starting method and a change in its behavior.
The technique used to adapt these iterative methods can be applied to the resolution of other types of matrix equations, which play an especially significant role in areas such as control theory, dynamic programming, ladder networks, and statistics.
This research opens new ways in the design of iterative procedures for solving this kind of nonlinear matrix equation, with promising numerical performance in agreement with the theoretical results.