Article

An Efficient Iterative Approach for Hermitian Matrices Having a Fourth-Order Convergence Rate to Find the Geometric Mean

by
Tao Liu
1,
Ting Li
1,
Malik Zaka Ullah
2,
Abdullah Khamis Alzahrani
2 and
Stanford Shateyi
3,*
1
School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China
2
Mathematical Modelling and Applied Computation (MMAC) Research Group, Department of Mathematics, Faculty of Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia
3
Department of Mathematics and Applied Mathematics, School of Mathematical and Natural Sciences, University of Venda, P. Bag X5050, Thohoyandou 0950, South Africa
*
Author to whom correspondence should be addressed.
Mathematics 2024, 12(11), 1772; https://doi.org/10.3390/math12111772
Submission received: 1 May 2024 / Revised: 28 May 2024 / Accepted: 4 June 2024 / Published: 6 June 2024
(This article belongs to the Special Issue Advances in Computational Mathematics and Applied Mathematics)

Abstract

This work presents a multiplication-based iterative method for computing the geometric mean of two Hermitian positive definite matrices. The method is constructed via the matrix sign function, and it is shown theoretically to possess a fourth order of convergence. The convergence is global under an appropriate choice of the initial matrix. Numerical experiments are reported for input matrices of different sizes and various stopping tolerances, with comparisons to methods of the same nature and the same number of matrix–matrix multiplications. The simulation results confirm the efficiency of the proposed scheme in contrast to its competitors of the same nature.

1. Introduction

1.1. The Sign for a Matrix

The matrix sign function (MSF) [1], alternatively referred to as the matrix signum function or the sign of a matrix, is an operation applied to a matrix that produces a matrix of identical dimensions. It originates from its scalar counterpart, which acts on real numbers, assigning +1 to positive scalars, −1 to negative scalars, and 0 to zero ([2], Chapter 11). Extending the sign function to matrices was a progression aimed at aiding the exploration of matrix theory and the development of novel algorithms for addressing matrix equations and systems [3]. The MSF of an invertible matrix W \in \mathbb{C}^{n \times n} can be written as
sign(W) = U,
and then computed by ([4], p. 107)
U = \frac{2}{\pi} W \int_{0}^{\infty} \left( t^2 I + W^2 \right)^{-1} dt,
where I is the identity matrix.
Since its inception, this function has been applied across diverse fields of mathematics and scientific computation, playing crucial roles in numerical analysis [4,5]. Recent investigations [6,7] have concentrated on enhancing methods for effectively finding the MSF, developing high-order techniques, and investigating its correlations with other matrix functions and properties. This function has emerged as a valuable instrument for manipulating and characterizing matrices in various domains.
For small to moderately sized matrices, it is feasible to compute the spectral factorization and subsequently evaluate f(W). Higham [4] details numerous methods for computing functions of matrices within this size range. Ref. [8] proposed a foundational framework for computing several matrix functions, including (1). However, for large matrices W, the computational cost of the spectral factorization may become prohibitively high. Similarly, other techniques that rely on factorizing W to compute f(W) may become impractical for large matrices lacking an exploitable structure. In such scenarios, iterative methods emerge as viable options.
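As a small illustration of the spectral route (this sketch is not taken from the paper): for a symmetric matrix with eigendecomposition W = V diag(λ₁, λ₂) Vᵀ, the MSF is sign(W) = V diag(sign λ₁, sign λ₂) Vᵀ. A pure-Python 2×2 version:

```python
# Illustrative sketch (not the paper's algorithm): evaluating sign(W) for a
# symmetric 2x2 matrix directly from its spectral factorization
# W = V diag(l1, l2) V^T, giving sign(W) = V diag(sign l1, sign l2) V^T.
from math import atan2, cos, sin

def sign2(W):
    a, b, c = W[0][0], W[0][1], W[1][1]   # W = [[a, b], [b, c]], symmetric
    th = 0.5 * atan2(2 * b, a - c)        # rotation angle diagonalizing W
    ct, st = cos(th), sin(th)
    l1 = a * ct * ct + 2 * b * ct * st + c * st * st   # eigenvalues of W
    l2 = a * st * st - 2 * b * ct * st + c * ct * ct
    s1 = 1.0 if l1 > 0 else -1.0
    s2 = 1.0 if l2 > 0 else -1.0
    return [[s1 * ct * ct + s2 * st * st, (s1 - s2) * ct * st],
            [(s1 - s2) * ct * st, s1 * st * st + s2 * ct * ct]]

U = sign2([[4.0, 1.0], [1.0, -2.0]])      # one positive, one negative eigenvalue
# U is involutory: U * U = I up to rounding
```

For large W, exactly this factorization step becomes the bottleneck, which motivates the iterative schemes below.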

1.2. Matrix Geometric Mean (MGM)

The geometric mean serves as a measure of central tendency for a finite set of real numbers, computed by taking the product of their values, and then, finding the nth root. When focusing on matrices, the MGM is a tool that averages a set of matrices with positive definiteness. In some recent papers [9,10], the geometric mean for two positive definite (PD) matrices has been identified as the midpoint of the geodesic joining the two matrices.
When faced with two matrices, the determination of the MGM necessitates the consideration of the following function:
\phi : \mathbb{A}_n \times \mathbb{A}_n \to \mathbb{A}_n,
wherein \mathbb{A}_n stands for the set of all n \times n Hermitian positive definite (HPD) matrices. Here, GMean(W, Z) can be given by [11]
W \# Z := GMean(W, Z) = W \left( W^{-1} Z \right)^{1/2},
which is the special case t = 1/2 of the following formulation for t \in \mathbb{R} [12]:
W \#_t Z := W \left( W^{-1} Z \right)^{t}.
In [13] (p. 105), the following formulation was provided for computing the MGM:
W \# Z = W^{1/2} \left( W^{-1/2} Z W^{-1/2} \right)^{1/2} W^{1/2},
for HPD matrices W and Z of suitable dimensions. For GMean(\cdot, \cdot), we have
GMean(V, I) := diag\left( v_1^{1/2}, v_2^{1/2}, \ldots, v_n^{1/2} \right),
wherein V = diag(v_1, v_2, \ldots, v_n) stands for a diagonal matrix with v_i > 0, and I stands for the identity matrix. It can be asserted that W \# Z possesses all the attributes essential for a geometric mean [14], such as
Z \# W = W \# Z.
If Z and W commute with each other, then we have
W \# Z = (W Z)^{1/2}.
Here, X = W^{1/2} stands for the principal matrix square root of W, defined as the solution of the matrix equation
X^2 = W,
where W has no real non-positive eigenvalues. In fact, the matrix W \# Z solves the following Riccati equation ([13], p. 106):
Z = X W^{-1} X.
Additionally, by using the characteristics of the principal square root, one has
W \left( W^{-1} Z \right)^{1/2} = \left( Z W^{-1} \right)^{1/2} W = Z \left( Z^{-1} W \right)^{1/2} = \left( W Z^{-1} \right)^{1/2} Z = W \# Z.
The MGM has several important features, as follows:
W \# W = W, \qquad (W \# Z)^{-1} = W^{-1} \# Z^{-1}, \qquad W \# Z \preceq \tfrac{1}{2} (W + Z).
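The definitions above can be exercised concretely. The following pure-Python sketch (illustrative only, not the paper's solver) computes W # Z for 2×2 SPD matrices via W # Z = W^{1/2}(W^{-1/2} Z W^{-1/2})^{1/2} W^{1/2}, using the Cayley–Hamilton closed form sqrt(M) = (M + sqrt(det M) I) / sqrt(tr M + 2 sqrt(det M)); the test matrices W and Z are hypothetical examples:

```python
# Pure-Python sketch: geometric mean of 2x2 SPD matrices via the explicit
# formula W # Z = W^{1/2} (W^{-1/2} Z W^{-1/2})^{1/2} W^{1/2}, then a check
# that W # Z solves the Riccati equation Z = X W^{-1} X.
from math import sqrt

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(A):
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def sqrtm(A):
    # principal square root of a 2x2 SPD matrix via Cayley-Hamilton
    s = sqrt(A[0][0] * A[1][1] - A[0][1] * A[1][0])   # sqrt(det A)
    t = sqrt(A[0][0] + A[1][1] + 2.0 * s)             # sqrt(tr A + 2 sqrt(det A))
    return [[(A[i][j] + (s if i == j else 0.0)) / t for j in range(2)]
            for i in range(2)]

def gmean(W, Z):
    Wh, Whi = sqrtm(W), inv(sqrtm(W))
    return mul(Wh, mul(sqrtm(mul(Whi, mul(Z, Whi))), Wh))

W = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical SPD test matrices
Z = [[3.0, 0.0], [0.0, 1.0]]
X = gmean(W, Z)
R = mul(X, mul(inv(W), X))     # should reproduce Z
```

The symmetry property Z # W = W # Z can be checked numerically with the same helpers.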

1.3. Goals

The objective of this article is to introduce a novel approach for computing (11) for two appropriate matrices by initially determining the MSF.
  • It is demonstrated that this iterative technique achieves global convergence for this purpose, provided a suitable initial approximation, with a fourth-order convergence rate.
  • Detailed convergence proofs and numerical simulations are provided.
  • It can be inferred that the proposed scheme serves as an effective tool for computing (4) of two HPD matrices.
  • An advantage of the proposed method is its ability to obtain larger attraction basins, resulting in a larger convergence radius compared to similar methods for computing the matrix sign function. This leads to faster convergence, thereby reducing the total number of matrix multiplications.
From both practical and theoretical standpoints, the quest for the computation of the MGM holds significance. This endeavor often relies on iterative techniques, prominently leveraging various matrix–matrix products.

1.4. Structure

The rest of this work is structured as follows. Section 2 furnishes some techniques for determining the MSF. Subsequently, Section 3 elucidates the utility of high-order schemes and introduces a solver tailored for addressing nonlinear equations. The iterative approach is extended to handle matrices and its efficacy is substantiated via analysis, showing a convergence order of four in Section 4. In Section 5, we investigate the attraction basins to guarantee the global convergence behavior compared to analogous methods. The stability of the proposed matrix iteration is discussed in Section 6. Section 7 discusses the extension of the scheme for computing the MGM for two HPD matrices. Section 8 presents the outcomes of our numerical investigation, validating our theoretical insights and highlighting the practicality of our approach. Finally, Section 9 offers our concluding remarks.

2. Several Existing Iterations

Let f represent a real-valued function, and consider the nonlinear problem [15]
f(t) = 0.
In the case where f(\xi) = 0, \xi is identified as a root of f. Given that (13) typically lacks an exact solution in a general context, it becomes imperative to seek an approximate solution through iterative approaches [16,17]. Newton's method stands out as a foundational iterative technique for this purpose, with a convergence order of 2 and an efficiency index of 2^{1/2} \approx 1.41. Alternatively, the root can be sought employing a fixed-point scheme of the form below:
k_{q+1} = g(k_q), \quad q = 0, 1, 2, \ldots.
The field of iterative approaches finds fruitful application in solving matrix-related challenges, including the computation of matrix functions, as highlighted in works such as [18,19].
Let us recall an efficient general family of methods for finding the MSF. The authors in [20] provided a general framework as a family of methods for calculating (1). Considering \iota_1 + \iota_2 \geq 1, Ref. [20] showed that the iteration structure below,
k_{q+1} = k_q \, P_{\iota_1}\left(1 - k_q^2\right) Q_{\iota_2}\left(1 - k_q^2\right)^{-1} =: \psi_{2\iota_1 + 1, 2\iota_2},
converges with order \iota_1 + \iota_2 + 1 to \pm 1. Therefore, the quadratically convergent Newton solver can be obtained by
K_{q+1} = \frac{1}{2} \left( K_q^{-1} + K_q \right),
where
K 0 = W ,
is the initial guess and W represents the input matrix based on (1). Observing that the reciprocal Padé approximations can be formulated using the inverses of (15), we recognize that Newton’s method offers an iterative strategy for approximating (1); for more, see [21,22].
Employing (15), the following famous methods, specifically, the locally convergent Newton–Schulz solver that does not require matrix inversion,
K_{q+1} = \frac{1}{2} K_q \left( 3I - K_q^2 \right),
and the globally convergent Halley’s solver,
K_{q+1} = \left[ K_q \left( 3I + K_q^2 \right) \right] \left[ I + 3 K_q^2 \right]^{-1},
can be extracted. Further state-of-the-art developments can be observed in [23,24].

3. A Multi-Step Method for Nonlinear Equations

Initially, we introduce the following memory-free secant-type iterative technique to address (30); see also the discussions in [25,26,27]. Let us consider the following structure:
d_q = k_q - f'(k_q)^{-1} f(k_q), \quad q = 0, 1, \ldots,
y_q = k_q - \frac{i_1 f(k_q) - i_2 f(d_q)}{i_3 f(k_q) - i_4 f(d_q)} \cdot \frac{f(k_q)}{f'(k_q)},
k_{q+1} = y_q - f[y_q, k_q]^{-1} f(y_q),
which gives the following error equation:
\epsilon_{q+1} = \frac{a_2 (i_3 - i_1)}{i_3} \epsilon_q^2 + O(\epsilon_q^3),
where
a_j = \frac{f^{(j)}(\xi)}{j! \, f'(\xi)}, \quad \text{and} \quad \epsilon_q = k_q - \xi.
The relation (21) leads to the choice i_1 = i_3, thus converting the error equation to
\epsilon_{q+1} = \frac{a_2^2 (i_1 + i_2 - i_4)}{i_1} \epsilon_q^3 + O(\epsilon_q^4).
Hence, it is imperative to ascertain the remaining undetermined coefficients in such a manner that guarantees
i_1 + i_2 - i_4 = 0,
thereby annihilating the \epsilon_q^3 term in (22). Furthermore, the coefficients should be chosen to minimize the ensuing error term, specifically,
\frac{3 i_1 a_2 a_3 (i_1 + i_2 - i_4) - a_2^3 \left[ -i_4 (5 i_1 + i_2) + i_1 (3 i_1 + 5 i_2) + i_4^2 \right]}{i_1^2} \, \epsilon_q^4.
We now choose i_1 = i_3 = 29, i_2 = 30, and i_4 = 59, for which this coefficient reduces to -\frac{1}{29} a_2^3. The second substep outlined in (23) marks an advancement from the procedure outlined in [28]. Moreover, this methodology is devised to widen the attraction basins, offering a comparative advantage over other methods with similar attributes. Hence, we derive the following iterative method:
d_q = k_q - f'(k_q)^{-1} f(k_q),
y_q = k_q - \frac{29 f(k_q) - 30 f(d_q)}{29 f(k_q) - 59 f(d_q)} \cdot \frac{f(k_q)}{f'(k_q)},
k_{q+1} = y_q - f[y_q, k_q]^{-1} f(y_q),
where the divided difference operator (see, e.g., [29]) is given by f[l, j] := (f(j) - f(l))(j - l)^{-1}.
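A quick pure-Python sanity check of this scheme on the hypothetical test problem f(t) = t² − 2 (root √2); under the sign convention written above for the second substep, two steps from k₀ = 1 already reach near machine accuracy, consistent with fourth-order convergence:

```python
# Numerical check of the three-step scheme with i1 = i3 = 29, i2 = 30, i4 = 59.
from math import sqrt

f = lambda t: t * t - 2.0
fp = lambda t: 2.0 * t

def step(k):
    d = k - f(k) / fp(k)                                   # Newton substep
    r = (29 * f(k) - 30 * f(d)) / (29 * f(k) - 59 * f(d))  # weight factor
    y = k - r * f(k) / fp(k)                               # second substep
    dd = (f(y) - f(k)) / (y - k)                           # divided difference f[y, k]
    return y - f(y) / dd                                   # secant-type third substep

k = 1.0
for _ in range(2):
    k = step(k)
print(abs(k - sqrt(2)) < 1e-10)   # True
```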
Theorem 1.
Assume \xi \in D is a simple zero of the sufficiently differentiable function f : D \subseteq \mathbb{C} \to \mathbb{C}, and let k_0 be sufficiently close to the solution. Then, the iterates produced by (23) exhibit convergence of at least fourth order.
Proof. 
By expanding f(k_q) and f'(k_q) around \xi, we obtain
f(k_q) = f'(\xi) \left[ \epsilon_q + a_2 \epsilon_q^2 + a_3 \epsilon_q^3 + a_4 \epsilon_q^4 + a_5 \epsilon_q^5 + O(\epsilon_q^6) \right],
and
f'(k_q) = f'(\xi) \left[ 1 + 2 a_2 \epsilon_q + 3 a_3 \epsilon_q^2 + 4 a_4 \epsilon_q^3 + 5 a_5 \epsilon_q^4 + O(\epsilon_q^5) \right].
Now, from (24) and (25), one obtains
d_q = \xi + a_2 \epsilon_q^2 + 2 \left( a_3 - a_2^2 \right) \epsilon_q^3 + \left( 4 a_2^3 - 7 a_2 a_3 + 3 a_4 \right) \epsilon_q^4 + O(\epsilon_q^5).
By expanding f(d_q) around \xi and using (26), it is possible to write
y_q = \xi - \frac{1}{29} a_2^2 \epsilon_q^3 + \left( \frac{927 a_2^3}{841} - \frac{33 a_2 a_3}{29} \right) \epsilon_q^4 + O(\epsilon_q^5).
From (27) and (24), one obtains that
f[y_q, k_q] = f'(\xi) \left[ 1 + a_2 \epsilon_q + a_3 \epsilon_q^2 + \left( a_4 - \frac{a_2^3}{29} \right) \epsilon_q^3 + O(\epsilon_q^4) \right].
Now, by the use of (27) and (28) we attain
\epsilon_{q+1} = y_q - f[y_q, k_q]^{-1} f(y_q) - \xi = -\frac{1}{29} a_2^3 \epsilon_q^4 + O(\epsilon_q^5).
The proof is complete. □

4. Expanding to the Matrix Context

Applying (23) to solve
F(U) := U^2 - I = 0,
leads to the following scheme:
K_{q+1} = 2 K_q \left( 37 I + 72 K_q^2 + 7 K_q^4 \right) \left( 15 I + 146 K_q^2 + 71 K_q^4 \right)^{-1}.
Since, in the scalar case, each iteration of this class is a rational map converging to \pm 1, its reciprocal is also an iteration converging to the sign (to \mp 1 before relabeling). Due to this, one can derive the reciprocal version of (31) and express it as follows:
K_{q+1} = \left( 15 I + 146 K_q^2 + 71 K_q^4 \right) \left[ 2 K_q \left( 37 I + 72 K_q^2 + 7 K_q^4 \right) \right]^{-1}.
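The scalar form of this iteration can be sketched directly (an illustration, not a production implementation): the map g(k) = (15 + 146k² + 71k⁴)/(2k(37 + 72k² + 7k⁴)) drives any positive start to +1 and any negative start to −1.

```python
# Scalar form of the reciprocal iteration above: sign computation for reals.
def g(k):
    return (15 + 146 * k**2 + 71 * k**4) / (2 * k * (37 + 72 * k**2 + 7 * k**4))

for k0 in (0.1, 2.0, 1e3, -5.0):
    k = k0
    for _ in range(12):
        k = g(k)
    print(k0, k)   # last column is +1.0 or -1.0 up to rounding
```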
The process begins with an initial value and progressively refines the estimation with each iterate until reaching convergence. This iterative characteristic proves beneficial when handling intricate or sizable matrices, as direct methods might entail high computational costs. Currently, we are examining the convergence properties of (32) to establish a convergence outcome.
Theorem 2.
Let the matrix W have no eigenvalues on the imaginary axis, and let the initial approximation K_0, selected via (17), be sufficiently near to U; this choice ensures commutativity with W. Then, the scheme (32) (or, equivalently, (31)) converges to the sign matrix U with a convergence rate of four.
Proof. 
The method we introduce requires matrix multiplications, similar to its competitors. Much of the convergence theory, however, relies on tracking the eigenvalues (see, e.g., [30,31]) from one iteration to the next. Let us employ the Jordan block matrix J and the invertible matrix L to decompose W in the following manner:
W = L J L^{-1}.
Utilizing this, in conjunction with the iterative approach, results in an iteration structure akin to the original iteration structure, albeit focusing on the eigenvalues transitioning from step q to step q + 1 , as demonstrated below:
\lambda_{q+1}^i = \left( 15 + 146 (\lambda_q^i)^2 + 71 (\lambda_q^i)^4 \right) \left[ 2 \lambda_q^i \left( 37 + 72 (\lambda_q^i)^2 + 7 (\lambda_q^i)^4 \right) \right]^{-1}, \quad 1 \leq i \leq n,
where b_i = sign(\lambda_q^i) = \pm 1. Generally, (34) reveals that the eigenvalues tend to b_i = \pm 1, i.e.,
\lim_{q \to \infty} \left| \frac{\lambda_{q+1}^i - b_i}{\lambda_{q+1}^i + b_i} \right| = 0.
This indicates convergence and suggests that the eigenvalues approach ± 1 with each iteration, leading to eigenvalue clustering during the iterative process. Following the examination of convergence, determining the convergence rate becomes essential. For this purpose, it is taken into account that
\Theta_q = 2 K_q \left( 37 I + 72 K_q^2 + 7 K_q^4 \right).
Thus, we can write the following:
K_{q+1} - U = \left( 15 I + 146 K_q^2 + 71 K_q^4 \right) \Theta_q^{-1} - U
= \left[ 15 I + 146 K_q^2 + 71 K_q^4 - U \Theta_q \right] \Theta_q^{-1}
= \left[ 15 I + 146 K_q^2 + 71 K_q^4 - 74 K_q U - 144 K_q^3 U - 14 K_q^5 U \right] \Theta_q^{-1}
= \left[ 15 (K_q - U)^4 - 14 K_q U \left( K_q^4 - 4 K_q^3 U + 6 K_q^2 U^2 - 4 K_q U^3 + I \right) \right] \Theta_q^{-1}
= \left[ 15 (K_q - U)^4 - 14 K_q U (K_q - U)^4 \right] \Theta_q^{-1}
= (K_q - U)^4 \left[ 15 I - 14 K_q U \right] \Theta_q^{-1}.
By employing (37), one can derive the following:
\left\| K_{q+1} - U \right\| \leq \left\| \Theta_q^{-1} \right\| \, \left\| 15 I - 14 K_q U \right\| \, \left\| K_q - U \right\|^4,
indicating a convergence order of four for (32). This concludes the proof. The analysis of error for (31) can be inferred in a similar manner. □

5. Attraction Basins

It is essential to highlight how the suggested approach can be contrasted with its counterparts from the Padé iterations for computing the MSF. The fourth-order methods within the Padé family can be outlined as [20]
K_{q+1} = \left[ I + 6 K_q^2 + K_q^4 \right] \left[ 4 K_q \left( I + K_q^2 \right) \right]^{-1}, \quad \text{Padé } [1,2],
K_{q+1} = \left[ 4 K_q \left( I + K_q^2 \right) \right] \left[ I + 6 K_q^2 + K_q^4 \right]^{-1}, \quad \text{reciprocal of Padé } [1,2].
Constructing high-order schemes is only practical if they can rival existing methods in performance and computational cost. Hence, comparing (31) and (32) to (39) and (40) is crucial to demonstrate that the presented solver maintains the same convergence order as the established Padé methods while requiring similar computational resources. Moreover, its larger convergence radii, as evidenced by the corresponding basins of attractions, underscore its superiority.
To assess the global convergence and broader attraction basins of the presented solver compared to its counterparts, attraction basins are plotted. The region [-2, 2] \times [-2, 2] \subset \mathbb{C} is partitioned into a grid of initial points, each tested for convergence based on the criterion |k_q^2 - 1| \leq 10^{-2}. Diverging points are marked in black. The numerical results are depicted in Figure 1 and Figure 2, with shading indicating the number of iterations required for convergence.
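The experiment can be sketched in a few lines of pure Python (a simplified illustration of the procedure described above; grid size, iteration cap, and the divergence bookkeeping are assumptions):

```python
# Sketch of the attraction-basin experiment: iterate the scalar map over a grid
# on [-2, 2] x [-2, 2] and record how many steps |k^2 - 1| <= 1e-2 takes; in a
# real plot each pixel would be shaded by this count, with non-converged points black.
def pm2(k):
    return (15 + 146 * k**2 + 71 * k**4) / (2 * k * (37 + 72 * k**2 + 7 * k**4))

def basin(step, pts=40, kmax=25, tol=1e-2):
    counts = {}
    for i in range(pts):
        for j in range(pts):
            k = complex(-2 + 4 * i / (pts - 1), -2 + 4 * j / (pts - 1))
            for q in range(kmax):
                if abs(k * k - 1) <= tol:
                    break
                k = step(k)
            counts[(i, j)] = q   # q == kmax - 1 without convergence -> diverging
    return counts

c = basin(pm2)
# e.g. the grid point nearest (2, 0) converges in very few steps:
print(c[(39, 20)])
```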
While Newton’s solver and iterative methods (31) and (39) exhibit global convergence, the attraction basins for (31) and (32) show lighter areas, suggesting faster convergence compared to their Padé counterparts.

6. Stability

The stability analysis regarding (32) is presented in the following theorem. Theorem 3 extends a fundamental outcome discussed in [32] concerning the stability of pure matrix iterations. The proof additionally uses the identities:
  • U^2 = I;
  • U^{-1} = U.
Theorem 3.
Using (32) and considering that W does not possess any purely imaginary eigenvalues, we can conclude that the sequence { K q } q = 0 , with K 0 = W , remains asymptotically stable.
Proof. 
Suppose that β q represents a perturbation in the computational process of the iterative method at the q-th iteration, and express this as follows:
\tilde{K}_q = K_q + \beta_q.
Now, let us conduct a first-order error analysis, neglecting the higher powers of the perturbation, that is, assuming (\beta_q)^i \approx 0 for all i \geq 2. This holds well if \beta_q is sufficiently small, allowing one to express
\tilde{K}_{q+1} = \left[ 15 I + 146 \tilde{K}_q^2 + 71 \tilde{K}_q^4 \right] \left[ 2 \tilde{K}_q \left( 37 I + 72 \tilde{K}_q^2 + 7 \tilde{K}_q^4 \right) \right]^{-1}.
As we reach a large-enough value for q, indicating the convergence phase, we assume that K q is approximately equal to sign ( W ) , denoted as U. Through significant simplifications, we derive that
\tilde{K}_{q+1} \approx U + \frac{1}{2} \beta_q - \frac{1}{2} U \beta_q U.
Using
\beta_{q+1} = \tilde{K}_{q+1} - K_{q+1},
we can write
\beta_{q+1} \approx \frac{1}{2} \beta_q - \frac{1}{2} U \beta_q U.
This shows that the perturbation at iteration q + 1 remains bounded, namely
\left\| \beta_{q+1} \right\| \leq \frac{1}{2} \left\| \beta_0 - U \beta_0 U \right\|.
Hence, the sequence { K q } q = 0 generated by (32) achieves asymptotic stability. With that, the proof comes to a close. □

7. Extension to MGM

An efficient way to calculate the geometric mean of two HPD matrices W and Z, without having to form principal matrix square roots explicitly, relies on (refer to [4], p. 131)
sign \begin{pmatrix} 0 & W \\ Z^{-1} & 0 \end{pmatrix} = \begin{pmatrix} 0 & T \\ T^{-1} & 0 \end{pmatrix},
and therefore, the mean can be obtained as follows:
T = W \left( Z^{-1} W \right)^{-1/2} = W \left( W^{-1} Z \right)^{1/2} = W \# Z.
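A minimal scalar sketch of this block construction (an illustration with hypothetical inputs w, z, using the Newton sign iteration rather than the proposed scheme): for scalars w, z > 0, the matrix [[0, w], [1/z, 0]] has sign [[0, t], [1/t, 0]] with t = √(wz) = w # z, and the iteration recovers t by inversion and averaging alone.

```python
# Geometric mean of two positive scalars via the sign of a 2x2 block matrix,
# computed with the Newton sign iteration K <- (K + K^{-1}) / 2 in pure Python.
from math import sqrt, isclose

def inv2(A):
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

w, z = 3.0, 12.0
K = [[0.0, w], [1.0 / z, 0.0]]           # block analogue of [[0, W], [Z^{-1}, 0]]
for _ in range(40):
    Ki = inv2(K)
    K = [[(K[i][j] + Ki[i][j]) / 2.0 for j in range(2)] for i in range(2)]

t = K[0][1]                              # the (1,2) block is the geometric mean
print(isclose(t, sqrt(w * z)))           # True
```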
If the starting matrix is chosen correctly and the matrices do not have eigenvalues on the imaginary axis, there will be no breakdown when computing the matrix inverse in (31) or (16). It is worth mentioning that for any appropriate matrix E such that W + E is PD, we have
sign \begin{pmatrix} 0 & W + E \\ I & 0 \end{pmatrix}
as a fixed point of (32).

8. Computational Aspects

Various methods examined previously are contrasted under equivalent conditions within Mathematica [33]. To demonstrate the effectiveness of the innovative approach, we conduct computational simulations of various sizes. The following termination criterion in the \ell_\infty norm is employed:
\left\| K_{q+1} - K_q \right\|_\infty \leq \varepsilon.
The Cauchy stopping criterion (50) can be employed instead of the convergence criterion \| K_{q+1}^2 - I \| \approx 0, as it is significantly easier to implement in higher dimensions. This approach circumvents the need for additional matrix powers in the algorithmic step, eliminating one further matrix–matrix multiplication per cycle.
The globally convergent methods (31), (32), (39), (40), and (16) are denoted in this section as PM1, PM2, PD1, PD2, and NM2, respectively. All the fourth-order methods require four matrix products and one matrix inverse per cycle. Here, we tackle the Riccati problem (10) for two HPD matrices specified by
W = \begin{pmatrix} 2 & 0 & 1 & & \\ 0 & 2 & 0 & \ddots & \\ 1 & 0 & 2 & \ddots & 1 \\ & \ddots & \ddots & \ddots & 0 \\ & & 1 & 0 & 2 \end{pmatrix}_{n \times n},
Z = \begin{pmatrix} 3 & \frac{3}{2} & & \\ \frac{3}{2} & 3 & \ddots & \\ & \ddots & \ddots & \frac{3}{2} \\ & & \frac{3}{2} & 3 \end{pmatrix}_{n \times n}.
Several details are in order:
  • We consider different sizes and employ the same termination criterion.
  • The inverse of matrix W in (10) was calculated directly, after which both matrices were used in the iterative methods for comparative analysis.
  • The comparison outcomes for different iterative techniques are provided in Figure 3 and Figure 4.
  • All the iterative methods having fourth order examined here incur an equivalent computational expense concerning matrix–matrix products and inverse calculations.
The findings concerning the calculation of the geometric mean of the two HPD matrices demonstrate the superiority of PM1 and PM2 over their counterparts of similar order, showcasing their efficiency. Clearly, the computational CPU time for PM1 and PM2 is lower than for PD1 and PD2: all four use the same number of matrix–matrix products and inverses per computing cycle, but PM1 and PM2 have larger attraction basins, per the discussions in Section 5. The MGM is applicable only to HPD matrices, whose eigenvalues are real and positive. In contrast, the proposed method for the matrix sign function can be applied to all matrices with complex eigenvalues, provided none of them lie on the imaginary axis.
It is worth noting that such iterative approaches can be expedited (in a similar way as in Newton's method [4]) by computing an additional parameter at each iteration and substituting K_q with \mu_q K_q, as outlined below:
\mu_q = \begin{cases} |\det(K_q)|^{-1/n}, & \text{(determinantal scaling)}, \\ \left( \rho(K_q^{-1}) / \rho(K_q) \right)^{1/2}, & \text{(spectral scaling)}, \\ \left( \| K_q^{-1} \| / \| K_q \| \right)^{1/2}, & \text{(norm scaling)}. \end{cases}
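The effect of determinantal scaling is easy to see in the scalar case n = 1, where μ_q = 1/|k_q| (an illustrative sketch with a hypothetical badly scaled start, again using the Newton sign iteration):

```python
# Scalar sketch of determinantal scaling: rescaling throws the iterate onto the
# unit circle and sharply cuts the step count of the Newton sign iteration.
def newton_steps(k0, scaled, tol=1e-14):
    k, steps = k0, 0
    while abs(k - 1.0) > tol:
        if scaled:
            k = k / abs(k)          # mu_q * k_q with mu_q = |det k_q|^{-1/1}
        k = 0.5 * (k + 1.0 / k)
        steps += 1
    return steps

plain, accel = newton_steps(1e6, False), newton_steps(1e6, True)
print(plain, accel)                 # scaling needs far fewer iterations
```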
We conclude this section by emphasizing the significance of learning procedures within the realm of artificial intelligence and machine learning models [34,35,36]. A strategy based on machine learning tools could efficiently accelerate the convergence of such iterative structures by developing a model that quickly transitions the initial matrix into the convergence phase. This could be a focus of future works in this field.

9. Conclusions

The concept of the geometric mean, initially defined for positive scalars, can be extended to HPD matrices in multiple ways. These extensions aim to capture essential properties akin to those expected of a mean, to varying degrees. A practical use of the MSF arises in determining the MGM of two HPD matrices. This is particularly necessary in addressing a specific category of nonlinear matrix equations like (10).
In this study:
  • We introduced a multiplication-based approach for determining the sign of a matrix, which was subsequently demonstrated to exhibit fourth-order convergence.
  • The new method demonstrates global convergence and competes favorably against prominent alternatives from the Padé solvers.
  • The stability of the scheme was brought forward.
  • Computational experiments were conducted to show the efficacy of our iterative technique (and its reciprocal) across various test scenarios.
Forthcoming research can concentrate on two aspects. First, it would be more efficient if a sharper initial matrix could be designed so as to place the iterative approach much closer to the convergence phase, leading to faster convergence. Second, it would be favorable to extend the results to higher orders while retaining larger attraction basins compared to the existing multiplication-rich methods from the Padé family.

Author Contributions

Conceptualization, T.L. (Tao Liu), T.L. (Ting Li), M.Z.U., A.K.A. and S.S.; formal analysis, T.L. (Tao Liu), T.L. (Ting Li), M.Z.U., A.K.A. and S.S.; funding acquisition, T.L. (Tao Liu) and S.S.; investigation, T.L. (Tao Liu), T.L. (Ting Li), M.Z.U., A.K.A. and S.S.; methodology, T.L. (Tao Liu), M.Z.U., A.K.A. and S.S.; supervision, T.L. (Tao Liu), M.Z.U., A.K.A. and S.S.; validation, T.L. (Tao Liu), M.Z.U., A.K.A. and S.S.; writing—original draft, T.L. (Tao Liu), T.L. (Ting Li), M.Z.U., A.K.A. and S.S.; writing—review and editing, T.L. (Tao Liu), T.L. (Ting Li), M.Z.U., A.K.A. and S.S. All authors have read and agreed to the published version of the manuscript.

Funding

The research of the first author was funded by the Research Project on Graduate Education and Teaching Reform of Hebei Province of China (YJG2024133).

Data Availability Statement

No new data were created during this work; hence, data sharing is not applicable to this article.

Acknowledgments

The fourth author states that: This research work was funded by Institutional Fund Projects under grant no. (IFPIP: 1331-130-1443). The authors gratefully acknowledge technical and financial support provided by the Ministry of Education and King Abdulaziz University, DSR, Jeddah, Saudi Arabia.

Conflicts of Interest

The authors affirm that there are no personal relationships or identifiable conflicting financial interests that could be perceived to influence the research presented in this manuscript.

References

  1. Denman, E.D.; Beavers, A.N. The matrix sign function and computations in systems. Appl. Math. Comput. 1976, 2, 63–94. [Google Scholar] [CrossRef]
  2. Hogben, L. Handbook of Linear Algebra; Chapman and Hall/CRC: Boca Raton, FL, USA, 2007. [Google Scholar]
  3. Roberts, J.D. Linear model reduction and solution of the algebraic Riccati equation by use of the sign function. Int. J. Cont. 1980, 32, 677–687. [Google Scholar] [CrossRef]
  4. Higham, N.J. Functions of Matrices: Theory and Computation; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2008. [Google Scholar]
  5. Li, X.-P.; Nunes, R.W.; Vanderbilt, D. Density-matrix electronic-structure method with linear system-size scaling. Phys. Rev. B 1993, 47, 10891–10894. [Google Scholar] [CrossRef] [PubMed]
  6. Shi, L.; Zaka Ullah, M.; Kumar Nashine, H.; Alansari, M.; Shateyi, S. An Enhanced Numerical Iterative Method for Expanding the Attraction Basins When Computing Matrix Signs of Invertible Matrices. Fractal Fract. 2023, 7, 684. [Google Scholar] [CrossRef]
  7. Soleymani, F.; Kumar, A. A fourth-order method for computing the sign function of a matrix with application in the Yang—Baxter-like matrix equation. Comput. Appl. Math. 2019, 38, 64. [Google Scholar] [CrossRef]
  8. Al-Mohy, A.; Higham, N. A scaling and squaring algorithm for the matrix exponential. SIAM J. Matrix Anal. Appl. 2009, 31, 970–989. [Google Scholar] [CrossRef]
  9. Soleymani, F.; Sharifi, M.; Shateyi, S.; Khaksar Haghani, F. An algorithm for computing geometric mean of two Hermitian positive definite matrices via matrix sign. Abstr. Appl. Anal. 2014, 2014, 978629. [Google Scholar] [CrossRef]
  10. Jebreen, H.B.; Akgül, A. A fast iterative method to find the matrix geometric mean of two HPD matrices. Math. Meth. Appl. Sci. 2019, 42, 5615–5625. [Google Scholar] [CrossRef]
  11. Pusz, G.; Woronowicz, S.L. Functional calculus for sesquilinear forms and the purification map. Rep. Math. Phys. 1975, 8, 159–170. [Google Scholar] [CrossRef]
  12. Lawson, J.D.; Lim, Y. The geometric mean, matrices, metrics and more. Amer. Math. Month. 2001, 108, 797–812. [Google Scholar] [CrossRef]
  13. Bhatia, R. Positive Definite Matrices, Princeton Series in Applied Mathematics; Princeton University Press: Princeton, NJ, USA, 2007. [Google Scholar]
  14. Iannazzo, B. The geometric mean of two matrices from a computational viewpoint. Numer. Lin. Alg. Appl. 2016, 23, 208–229. [Google Scholar] [CrossRef]
  15. McNamee, J.M.; Pan, V.Y. Numerical Methods for Roots of Polynomials—Part I; Academic Press: Cambridge, MA, USA; Elsevier: Amsterdam, The Netherlands, 2007. [Google Scholar]
  16. McNamee, J.M.; Pan, V.Y. Numerical Methods for Roots of Polynomials—Part II; Academic Press: Cambridge, MA, USA; Elsevier: Amsterdam, The Netherlands, 2013. [Google Scholar]
  17. Shil, S.; Nashine, H.K.; Soleymani, F. On an inversion-free algorithm for the nonlinear matrix problem XαA*XβA + B*XγB = I. Int. J. Comput. Math. 2022, 99, 2555–2567. [Google Scholar] [CrossRef]
  18. Byers, R.; Xu, H. A new scaling for Newton’s iteration for the polar decomposition and its backward stability. SIAM J. Matrix Anal. Appl. 2008, 30, 822–843. [Google Scholar] [CrossRef]
  19. Soheili, A.R.; Soleymani, F. Iterative methods for nonlinear systems associated with finite difference approach in stochastic differential equations. Numer. Algor. 2016, 71, 89–102. [Google Scholar] [CrossRef]
  20. Kenney, C.S.; Laub, A.J. Rational iterative methods for the matrix sign function. SIAM J. Matrix Anal. Appl. 1991, 12, 273–291. [Google Scholar] [CrossRef]
  21. Greco, F.; Iannazzo, B.; Poloni, F. The Padé iterations for the matrix sign function and their reciprocals are optimal. Lin. Algebra Appl. 2012, 436, 472–477. [Google Scholar] [CrossRef]
  22. Soleymani, F.; Stanimirović, P.S.; Shateyi, S.; Haghani, F.K. Approximating the matrix sign function using a novel iterative method. Abstr. Appl. Anal. 2014, 2014, 105301. [Google Scholar] [CrossRef]
  23. Jung, D.; Chun, C.; Wang, X. Construction of stable and globally convergent schemes for the matrix sign function. Lin. Alg. Appl. 2019, 580, 14–36. [Google Scholar] [CrossRef]
  24. Sharma, P.; Kansal, M. Extraction of deflating subspaces using disk function of a matrix pencil via matrix sign function with application in generalized eigenvalue problem. J. Comput. Appl. Math. 2024, 442, 115730. [Google Scholar] [CrossRef]
  25. Haghani, F.K.; Soleymani, F. An improved Schulz-type iterative method for matrix inversion with application. Trans. Inst. Meas. Control. 2014, 36, 983–991. [Google Scholar] [CrossRef]
  26. Ogbereyivwe, O.; Atajeromavwo, E.J.; Umar, S.S. Jarratt and Jarratt-variant families of iterative schemes for scalar and system of nonlinear equations. Iran. J. Numer. Anal. Optim. 2024, 14, 391–416. [Google Scholar]
  27. Dehghani-Madiseh, M. Moore-Penrose inverse of an interval matrix and its application. J. Math. Model. 2024, 12, 145–155. [Google Scholar]
  28. Zaka Ullah, M.; Muaysh Alaslani, S.; Othman Mallawi, F.; Ahmad, F.; Shateyi, S.; Asma, M. A fast and efficient Newton-type iterative scheme to find the sign of a matrix. Aims Math. 2023, 8, 19264–19274. [Google Scholar] [CrossRef]
  29. Khdhr, F.W.; Soleymani, F.; Saeed, R.K.; Akgül, A. An optimized Steffensen-type iterative method with memory associated with annuity calculation. Eur. Phys. J. Plus 2019, 134, 146. [Google Scholar] [CrossRef]
  30. Cordero, A.; Soleymani, F.; Torregrosa, J.R.; Zaka Ullah, M. Numerically stable improved Chebyshev–Halley type schemes for matrix sign function. J. Comput. Appl. Math. 2017, 318, 189–198. [Google Scholar] [CrossRef]
  31. Liu, T.; Zaka Ullah, M.; Alshahrani, K.M.A.; Shateyi, S. From fractal behavior of iteration methods to an efficient solver for the sign of a matrix. Fractal Fract. 2023, 7, 32. [Google Scholar] [CrossRef]
  32. Iannazzo, B. Numerical Solution of Certain Nonlinear Matrix Equations. Ph.D. Thesis, Universita degli studi di Pisa, Pisa, Italy, 2007. [Google Scholar]
  33. Hoste, J. Mathematica Demystified; McGraw-Hill: New York, NY, USA, 2009. [Google Scholar]
  34. Larijani, A.; Dehghani, F. An efficient optimization approach for designing machine models based on combined algorithm. FinTech 2024, 3, 40–54. [Google Scholar] [CrossRef]
  35. Mohammad, M.; Trounev, A.; Cattani, C. Stress state and waves in the lithospheric plate simulation: A 3rd generation AI architecture. Results Phys. 2023, 53, 106938. [Google Scholar] [CrossRef]
  36. Mohammadabadi, S.M.S.; Yang, L.; Yan, F.; Zhang, J. Communication-efficient training workload balancing for decentralized multi-agent learning. arXiv 2024, arXiv:2405.00839. [Google Scholar]
Figure 1. Basins of attraction shaded based upon the number of iterations required to fulfill the convergence criterion for (16) (left) and (39) (right).
Figure 2. Basins of attraction shaded based upon the number of iterations required to fulfill the convergence criterion for (31) (left) and (32) (right).
Figure 3. Simulation results for different tolerances in the stopping criterion. PM1 and PM2 arrive at the convergence phase more quickly than their competitors of the same order, in a smaller number of iterations.
Figure 4. Simulation results for different dimensions. PM1 and PM2 arrive at the convergence phase more quickly than their competitors of the same order, in a smaller number of iterations.
