Article

An Enhanced Numerical Iterative Method for Expanding the Attraction Basins When Computing Matrix Signs of Invertible Matrices

1 School of Mathematics and Statistics, Anyang Normal University, Anyang 455002, China
2 Mathematical Modelling and Applied Computation (MMAC) Research Group, Department of Mathematics, Faculty of Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia
3 Mathematics Division, School of Advanced Sciences and Languages, VIT Bhopal University, Bhopal-Indore Highway, Kothrikalan, Sehore 466114, Madhya Pradesh, India
4 Department of Mathematics and Applied Mathematics, School of Mathematical and Natural Sciences, University of Venda, Private Bag X5050, Thohoyandou 0950, South Africa
* Author to whom correspondence should be addressed.
Fractal Fract. 2023, 7(9), 684; https://doi.org/10.3390/fractalfract7090684
Submission received: 22 August 2023 / Revised: 7 September 2023 / Accepted: 12 September 2023 / Published: 14 September 2023

Abstract: The computation of the sign function of a matrix plays a crucial role in various mathematical applications. It provides a matrix-valued mapping that determines the sign of each eigenvalue of a nonsingular matrix. In this article, we present a novel iterative algorithm designed to efficiently calculate the sign of an invertible matrix, emphasizing the enlargement of attraction basins. The proposed solver exhibits convergence of order four, making it highly efficient for a wide range of matrices. Furthermore, the method demonstrates global convergence properties. We validate the theoretical outcomes through numerical experiments, which confirm the effectiveness and efficiency of our proposed algorithm.

1. Introductory Notes

The matrix sign function, also known as the matrix signum function or simply the matrix sign, is a mathematical function that operates on square matrices, returning a matrix of the same size determined by the signs of the eigenvalues of the input matrix. The concept of the sign function can be traced back to the scalar sign function, which operates on individual real numbers ([1], Chapter 11). The scalar sign function returns +1 for positive numbers, −1 for negative numbers, and 0 for zero.
The extension of the sign function to matrices emerged as a natural generalization. This function was introduced to facilitate the study of matrix theory and to develop new algorithms for solving matrix equations and systems. It provides valuable information about the structure and properties of matrices. The earliest references to the matrix sign can be found in the mathematical literature of the late 1960s and early 1970s [2]. This function, for a square nonsingular matrix M ∈ C^{n×n}, i.e.,
sign(M) = N,   (1)
can be expressed as follows [3] (p. 107):
N = (2/π) M ∫₀^∞ (t² I + M²)^{−1} dt,   (2)
where I is the identity matrix.
Several authors who contributed to the development and study of the matrix sign function include Nicholas J. Higham, Charles F. Van Loan, and Gene H. Golub; see the textbook [3]. Since its introduction, it has found applications in various areas of mathematics and scientific computing. It has been utilized in numerical analysis, linear algebra algorithms, control theory, graph theory, and stochastic differential equations; see, e.g., [4]. In recent years, research has focused on refining algorithms for efficiently computing the matrix sign function, higher-order iterative methods, and exploring its connections to other matrix functions and properties. This function has proven to be a fruitful tool for characterizing and manipulating matrices in a wide range of disciplines. Expanding the discussion, the work by Al-Mohy and Higham [5], while primarily centered on the matrix exponential, introduced a highly influential algorithm that serves as a fundamental basis for computing various other matrix functions, including (1).
Here are some areas where this function finds utility:
1. Stability analysis: In control theory and dynamical systems, the matrix sign function is employed to analyze the stability of linear time-invariant systems. By examining the signs of the eigenvalues of a system's matrix representation, the matrix sign function helps determine stability properties, such as asymptotic stability or instability [6].
2. Matrix fraction descriptions: This function plays a key role in representing matrix fraction descriptions of linear time-invariant systems. Matrix fractions are used in system theory to describe transfer functions, impedance matrices, and other related quantities; (1) is then employed in the realization theory of these matrix fractions [7].
3. Robust control: Robust control techniques aim to design control systems that can handle uncertainties and disturbances. This function is utilized in robust control algorithms to assess the stability and performance of uncertain systems. It helps analyze the worst-case behavior of the system under varying uncertain conditions [2].
4. Discrete-time systems: In the analysis and design of discrete-time systems, the matrix sign function aids in studying the stability and convergence properties of difference equations. It allows one to examine the spectral radius of a matrix and to determine the long-term behavior of discrete-time systems [3].
5. Matrix equations: This function is employed in solving matrix equations. For example, it can be employed in the calculation of the matrix square root or the matrix logarithm, which have applications in numerical methods, optimization, and signal processing [8].
A natural approach is to compute (1) by numerical iterative methods, which are mainly constructed by solving the following nonlinear matrix equation:
F(H) := H² − I = 0.   (3)
The matrix H = N from (1) is a solution of (3) since it satisfies N² = I. In this work, we focus on proposing a new solver for (3) in the scalar format and then extending it to the matrix environment, so as to be computationally economical compared to the well-known iterative solvers of the same type for finding (1) of a nonsingular matrix.
The remainder of this work is arranged as follows. In Section 2, we look at some important methods for figuring out the sign of a matrix. Then, in Section 3, we explain why higher-order methods are useful and introduce a solver for solving nonlinear scalar equations. We extend this solver to work with matrices and prove its effectiveness through detailed analysis, showing that it has a convergence order of four. We also examine the attraction basins to ensure the solver works globally and covers a wider range compared to similar methods. We discuss its stability as well. In Section 4, we present the results of our numerical study to validate our theoretical findings, demonstrating the usefulness of our method. Finally, in Section 5, we provide our conclusions.

2. Iteration Methods

Kenney and Laub in [9] presented a general and important family of iterations for finding (1) via the application of Padé approximants to f(α) = (1 − α)^{−1/2}. We assume now that the (ι₁, ι₂)-Padé approximant to f(α) is given as P_{ι₁}(α)/Q_{ι₂}(α), where P, Q are polynomials of suitable degrees in the Padé approximation and ι₁ + ι₂ ≥ 1. Then, [9] showed that the iteration method below,
h_{l+1} = h_l P_{ι₁}(1 − h_l²) / Q_{ι₂}(1 − h_l²) =: ψ_{2ι₁+1, 2ι₂},   (4)
converges with order ι₁ + ι₂ + 1 to ±1. Thus, the second-order Newton iteration method (NIM) can be constructed as follows:
H_{l+1} = (1/2)(H_l^{−1} + H_l),   (5)
where
H_0 = M   (6)
is the starting matrix and M represents the input matrix as given in (1). We note that the reciprocal Padé-approximations can be defined based on reciprocals of (4). Newton’s method provides an iterative approach to approximating (1). It starts with an initial matrix and then improves the approximation in each iteration until convergence is achieved. This iterative nature makes it useful when dealing with complex matrices or large matrices where direct methods may be computationally expensive.
Newton’s iterative scheme plays an important role in finding (1) due to its effectiveness in approximating the solution iteratively. Some reasons highlighting the importance of Newton’s iteration for computing (1) lie in the advantages of iterative approximation (see [10]). It has quadratic convergence properties, especially when the initial matrix is close to the desired solution. Also, it needs the calculation of the derivative of the function being approximated. In the case of finding (1), this involves the derivative of the sign function, which can be derived analytically. Utilizing the derivative information can help guide the iterative process towards the solution.
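To make this concrete, the following is a minimal Mathematica sketch of the Newton iteration (5) started from (6); the function name, the default tolerance, and the stopping rule are our own illustrative choices, not part of the original method:

(* Newton iteration (5) for the matrix sign function, with H0 = M as in (6). *)
newtonSign[M_?MatrixQ, tol_: 10^-8, maxIter_: 100] :=
  FixedPoint[(# + Inverse[#])/2 &, N[M], maxIter,
    SameTest -> (Norm[#1 - #2, "Frobenius"] <= tol &)]

(* Example: newtonSign[{{2., 1.}, {0., -3.}}] *)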
Employing (4), the subsequent renowned techniques, namely the locally convergent inversion-free Newton–Schulz solver,
H_{l+1} = (1/2) H_l (3I − H_l²),   (7)
and Halley’s solver,
H_{l+1} = [3H_l² + I][H_l (3I + H_l²)]^{−1},   (8)
can be extracted, as in [11]. A quartically convergent solver is proposed in [12] as follows:
H_{l+1} = H_l [−(1 + 6δ)I + 2(7 + 2δ)H_l² + (3 + 2δ)H_l⁴] × [−(1 + 2δ)I + 2(3 − 2δ)H_l² + (11 + 6δ)H_l⁴]^{−1}.   (9)
Parameter δ is an independent real constant. Additionally, two alternative quartically convergent methods are derived from (4) with global convergence, which can be expressed as follows:
H_{l+1} = [4H_l (I + H_l²)][I + 6H_l² + H_l⁴]^{−1},   (reciprocal of Padé [1,2])   (10)
H_{l+1} = [H_l⁴ + 6H_l² + I][4H_l (I + H_l²)]^{−1}.   (Padé [1,2])   (11)
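Members of this family can be reproduced symbolically. The short Mathematica sketch below assumes the generator f(α) = (1 − α)^{−1/2} used in (4); the name psi and the displayed members are our own illustration:

(* Build the map psi_{2 i1 + 1, 2 i2} of (4): substitute a -> 1 - h^2 into
   the (i1, i2)-Padé approximant of (1 - a)^(-1/2); a and h stay symbolic. *)
psi[i1_, i2_] := Simplify[
  h (PadeApproximant[(1 - a)^(-1/2), {a, 0, {i1, i2}}] /. a -> 1 - h^2)]

psi[1, 1]  (* h (3 + h^2)/(1 + 3 h^2): Halley's map, whose reciprocal is (8) *)
psi[1, 2]  (* 4 h (1 + h^2)/(1 + 6 h^2 + h^4), cf. (10) *)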

3. A New Numerical Method

Higher-order iteration solvers to compute (1) offer several advantages compared to lower-order methods [13,14]. To discuss further, higher-order iterative methods typically converge faster than lower-order methods. By incorporating more information from the matrix and its derivatives, these methods can achieve faster convergence rates, leading to fewer iterations required to reach a desired level of accuracy. This can improve the computational efficiency of the matrix sign function computation.
Higher-order iterative methods may provide higher accuracy in approximating the matrix sign function. This is particularly beneficial when dealing with matrices that have small eigenvalues or require high precision computations. Such methods often exhibit improved robustness compared to lower-order methods. They are typically more stable and less sensitive to variations in the input matrix. This robustness is particularly advantageous in situations where the matrix may have a non-trivial eigenvalue distribution or when dealing with ill-conditioned matrices. Higher-order methods can provide more reliable and accurate results in such cases. In addition, such iterations are applicable to a wide range of matrix sizes and types. They can handle both small and large matrices efficiently. Moreover, these methods can be adapted to handle specific matrix structures or properties, such as symmetric or sparse matrices, allowing for versatile applications across different domains.
It is significant to note that the choice of the appropriate iterative method relies on various factors, including the characteristics of the matrix, computational resources, desired accuracy, and specific application requirements. While higher-order iterative methods offer advantages, they may also come with increased memory requirements if not treated well. Thus, a careful consideration of these trade-offs is necessary and an efficient method must be constructed. To propose an efficient one, we proceed as follows.
We now examine the scalar form of Equation (3), that is to say, f(h) = h² − 1 = 0. Here, f(h) is the scalar version of the nonlinear matrix Equation (3), whose roots are h = ±1. In this paper, we employ the uppercase letter "H" when addressing matrices, while utilizing the lowercase letter "h" to denote scalar inputs. We propose a refined adaptation of Newton's scheme, comprising three sequential steps as outlined below:
d_l = h_l − f'(h_l)^{−1} f(h_l),   l = 0, 1, …,
x_l = h_l − [21 f(h_l) − 22 f(d_l)] / [21 f(h_l) − 43 f(d_l)] · f(h_l)/f'(h_l),
h_{l+1} = x_l − f[x_l, h_l]^{−1} f(x_l),   (12)
where f [ x l , h l ] is a divided difference operator. For further insights into solving nonlinear scalar equations using high-order iterative methods, refer to works [15,16,17]. Additionally, valuable information can be found in modern textbooks such as [18,19] or classical textbooks like [20]. The second substep in (12) represents a significant improvement over the approach presented in [21]. The coefficients involved in this substep are computed through a method of unknown coefficients, which involves rigorous computations. Initially, these coefficients are considered unknown, and then they are carefully determined to achieve a fourth order of convergence. Furthermore, this process is designed to expand the basins of attraction, providing an advantage over other solvers with similar characteristics.
Theorem 1. 
Assume that ρ ∈ D is a simple root of f : D ⊆ C → C, a sufficiently smooth function. For an initial guess h₀ sufficiently close to ρ, the method (12) converges to ρ, and the rate of convergence is four.
Proof. 
The proof entails a meticulous derivation of Taylor’s expansion for each sub-step of the iterative process around the simple root ρ . Nonetheless, it is observed that (12) satisfies the subsequent error equation,
α_{l+1} = −(1/21) v₂³ α_l⁴ + O(α_l⁵),   (13)
where v_j = f^{(j)}(ρ)/(j! f'(ρ)) and α_l = h_l − ρ. Considering that the method (12) belongs to the category of fixed-point-type methods, similar to Newton's method, its convergence is local. This implies that the initial guess h₀ should be sufficiently close to the root to guarantee convergence. This completes the proof.    □
To shed more light on the mathematical derivation of the second substep of (12), we first consider the following generalization of (12):
d_l = h_l − f'(h_l)^{−1} f(h_l),   l = 0, 1, …,
x_l = h_l − [a₁ f(h_l) − a₂ f(d_l)] / [a₃ f(h_l) − a₄ f(d_l)] · f(h_l)/f'(h_l),
h_{l+1} = x_l − f[x_l, h_l]^{−1} f(x_l),   (14)
which offers the following error equation:
α_{l+1} = v₂ (a₃ − a₁)/a₃ · α_l² + O(α_l³).   (15)
The relationship (15) results in selecting a₁ = a₃, consequently transforming the error equation into
α_{l+1} = v₂² (a₁ + a₂ − a₄)/a₁ · α_l³ + O(α_l⁴).   (16)
Therefore, we need to determine the remaining unspecified coefficients in a way that ensures a₁ + a₂ − a₄ = 0, annihilating the leading term in (16). Moreover, their selection should aim to minimize the subsequent error term, namely {3a₁v₂v₃(a₁ + a₂ − a₄) − v₂³[−a₄(5a₁ + a₂) + a₁(3a₁ + 5a₂) + a₄²]} α_l⁴ / a₁². This leads us to the choice a₁ = a₃ = 21, a₂ = 22, and a₄ = 43.
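This error analysis can be checked symbolically. The Mathematica sketch below assumes a generic analytic f with a simple root placed at 0, expanded as f(h) = c1 h + c2 h² + ⋯, and runs one cycle of (12):

(* One cycle of (12) applied to a generic analytic f with simple root 0. *)
f[h_] := c1 h + c2 h^2 + c3 h^3 + c4 h^4 + c5 h^5;
d  = e - f[e]/f'[e];                       (* first substep *)
x  = e - ((21 f[e] - 22 f[d])/(21 f[e] - 43 f[d])) f[e]/f'[e];
h1 = x - f[x]/((f[x] - f[e])/(x - e));     (* divided-difference substep *)
Series[h1, {e, 0, 4}] // Simplify
(* expected leading term: -(c2^3/(21 c1^3)) e^4, i.e., -(1/21) v2^3 e^4 *)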
Now, we can solve (3) by the iterative method (12). Pursuing this yields
H_{l+1} = 2H_l (27I + 52H_l² + 5H_l⁴)(11I + 106H_l² + 51H_l⁴)^{−1},   (17)
with the initial value (6). To clarify the derivation in detail, we provide a simple yet efficient Mathematica code for this purpose as follows:
ClearAll["Global`*"]
f[x_] := x^2 - 1
fh = f[h];    (* f(h) *)
fh1 = f'[h];  (* f'(h) *)
y = h - fh/fh1;                                  (* Newton substep d_l *)
fy = f[y];
x = h - ((21 fh - 22 fy)/(21 fh - 43 fy)) fh/fh1 // FullSimplify;
ddo1 = (x - h)^-1 (f[x] - f[h]);                 (* divided difference f[x_l, h_l] *)
h1 = x - f[x]/ddo1 // FullSimplify
Likewise, we acquire the reciprocal expression of (17) through the following procedure:
H_{l+1} = (11I + 106H_l² + 51H_l⁴)[2H_l (27I + 52H_l² + 5H_l⁴)]^{−1}.   (18)
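In matrix arithmetic, one step of (17) and of (18) can be sketched in Mathematica as follows; the function names and the reuse of the computed powers are our own choices:

(* One step of P1 (17) and of its reciprocal P2 (18); H2 and H4 are reused. *)
p1Step[H_?MatrixQ] := Module[{H2 = H.H, H4, Id = IdentityMatrix[Length[H]]},
  H4 = H2.H2;
  2 H.(27 Id + 52 H2 + 5 H4).Inverse[11 Id + 106 H2 + 51 H4]]

p2Step[H_?MatrixQ] := Module[{H2 = H.H, H4, Id = IdentityMatrix[Length[H]]},
  H4 = H2.H2;
  (11 Id + 106 H2 + 51 H4).Inverse[2 H.(27 Id + 52 H2 + 5 H4)]]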
Theorem 2. 
Let H₀ be an appropriate initialization and let M be nonsingular. Then, the iterates of (18) (or (17)) tend to N with fourth order of convergence.
Proof. 
We consider B to be a nonsingular (not necessarily unique) transformation matrix, and represent the Jordan canonical form of M as follows:
J = B^{−1} M B = diag(J_{m₁}, J_{m₂}, …, J_{m_p}),   (19)
where the distinct eigenvalues of M are σ(M) = {η₁, η₂, …, η_k} and m₁ + m₂ + ⋯ + m_p = n. For a function f that is m_j − 1 times differentiable at η_j for j = 1, 2, …, k [22] and is defined on the spectrum of M, the matrix function f(M) can be expressed by
f(M) = B f(J) B^{−1},   (20)
wherein
f(J) = diag(f(J_{m₁}(η₁)), …, f(J_{m_k}(η_k))),   (21)
with
f(J_{m_j}(η_j)) = Σ_{υ=0}^{m_j − 1} (1/υ!) f^{(υ)}(η_j) · S_{m_j}^υ.   (22)
Considering m_j as the size of the jth Jordan block associated with η_j, we have the upper bidiagonal block

J_{m_j}(η_j) =
[ η_j   1             ]
[       η_j   ⋱       ]
[             ⋱    1  ]
[                 η_j ]   =: η_j I + S_{m_j},   (23)

where S_{m_j} denotes the shift matrix with ones on the superdiagonal and zeros elsewhere.
To continue, we employ the Jordan block matrix J and decompose M by utilizing a nonsingular matrix B of identical dimensions, leading to the following decomposition:
M = B J B^{−1}.   (24)
Through the utilization of this decomposition and an analysis of the solver’s structure, we derive an iterative sequence for the eigenvalues from iterate l to iterate l + 1 in the following manner:
η_{l+1}^i = [11 + 106(η_l^i)² + 51(η_l^i)⁴] × [2η_l^i (27 + 52(η_l^i)² + 5(η_l^i)⁴)]^{−1},   i = 1, 2, …, n,   (25)
where
n_i = sign(η_l^i) = ±1.   (26)
In a broad sense and upon performing certain matrix simplifications, the iterative process (25) reveals that the eigenvalues converge to n i = ± 1 , viz.,
lim_{l→∞} |η_{l+1}^i − n_i| / |η_{l+1}^i + n_i| = 0.   (27)
Equation (27) indicates the convergence of the eigenvalues toward ± 1 . With each iteration, the eigenvalues tend to cluster closer to ± 1 . Having analyzed the theoretical convergence of the method, we now focus on examining the rate of convergence. For this purpose, we consider
Λ_l = 2H_l (27I + 52H_l² + 5H_l⁴).   (28)
Utilizing (28), and recognizing that H_l is a rational function of M and therefore commutes with N in the same manner as M, we can express
H_{l+1} − N = (11I + 106H_l² + 51H_l⁴) Λ_l^{−1} − N
 = [11I + 106H_l² + 51H_l⁴ − N Λ_l] Λ_l^{−1}
 = [11I + 106H_l² + 51H_l⁴ − 2H_l (27N + 52H_l²N + 5H_l⁴N)] Λ_l^{−1}
 = [11(H_l − N)⁴ − 10H_lN (H_l⁴ − 4H_l³N + 6H_l²N² − 4H_lN³ + I)] Λ_l^{−1}
 = [11(H_l − N)⁴ − 10H_lN (H_l − N)⁴] Λ_l^{−1}
 = (H_l − N)⁴ [11I − 10H_lN] Λ_l^{−1}.   (29)
Using (29) and a 2-norm, it is possible to obtain
‖H_{l+1} − N‖ ≤ ‖Λ_l^{−1}‖ ‖10H_lN − 11I‖ ‖H_l − N‖⁴.   (30)
This shows the convergence rate of order four under a suitable selection of the initial matrix, such as (6).    □
We note that the proofs of the theorems in this section are new and were obtained for our proposed method.
Attraction basins are useful in understanding the global convergence behavior of iteration schemes to calculate (1). In the context of iterative methods, the basins of attraction refer to regions in the input space where the iterative process converges to a specific solution or behavior. When designing a solver for computing (1), it is crucial to ensure that the method tends to the desired solution regardless of the initial guess. The basins of attraction provide insights into the convergence behavior by identifying the regions of the input space that lead to convergence and those that lead to divergence or convergence to different solutions.
By studying the basins of attraction, one can analyze the stability and robustness of an iterative method for computing (1). In general, here is how basins of attraction help in understanding global convergence [23]:
  • Convergence Analysis: Basins of attraction provide information about the convergence properties of the iterative sequence. The regions in the input space that belong to the basin of attraction of a particular solution indicate where the method converges to that solution. By analyzing the size, shape, and location of these basins, one can gain insights into the convergence behavior and determine the conditions under which convergence is guaranteed.
  • Stability Assessment: Basins of attraction reveal the stability of the iterative method. If the basins of attraction are well-behaved and do not exhibit fractal or intricate structures, it indicates that the method is stable and robust. On the other hand, if the basins are complex and exhibit fractal patterns, it suggests that the method may be sensitive to initial conditions and can easily diverge or converge to different solutions.
  • Optimization and Method Refinement: Analyzing the basins of attraction can guide the optimization and refinement of the iterative method. By identifying regions of poor convergence or instability, adjustments can be made to the algorithm to improve its performance. This may involve modifying the iteration scheme, incorporating adaptive techniques, or refining the convergence criteria.
  • Algorithm Comparison: Basins of attraction can be used to compare various iteration methods for finding (1). By studying the basins associated with different methods, one can assess their convergence behavior, stability, and efficiency. This information can aid in selecting the most suitable method for a particular problem or in developing new algorithms that overcome the limitations of existing approaches.
We introduced the techniques (17) and (18) with the aim of expanding their attraction regions in the solution of f(h) = h² − 1 = 0. To provide greater clarity, we explore how the proposed methods exhibit global convergence and enhanced convergence radii by visually representing their respective attraction regions in the complex domain
[−2, 2] × [−2, 2],
when solving f(h) = 0. In pursuit of this objective, we partition this region of the complex plane into numerous points using a mesh and subsequently assess each point as an initial value to ascertain whether the iteration converges or diverges. Upon convergence, the point is shaded in accordance with the number of iterations, under the termination criterion |f(h_l)| ≤ 10^{−3}. Figure 1, Figure 2 and Figure 3 include the attraction basins for the different methods. We recall that the fractal behavior of iterative methods in the complex plane determines their local or global convergence under certain conditions. The basins reveal that (17) and (18) possess larger convergence radii in comparison to their same-order competitors from (4). The presence of a lighter, wider area indicates the expansion of the convergence radii when computing (1).
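Plots of this kind can be generated with a short Mathematica sketch such as the one below, here for the scalar map of (17); the mesh step, the iteration cap, and the coloring are our own choices:

(* Shade each start point in [-2,2]x[-2,2] by the number of iterations of the
   scalar map of (17) needed to reach |f(h_l)| <= 10^-3; 0 marks no convergence. *)
g[h_] := 2 h (27 + 52 h^2 + 5 h^4)/(11 + 106 h^2 + 51 h^4);
count[z0_] := Module[{z = N[z0], k = 0},
  While[Abs[z^2 - 1] > 10^-3 && k < 50, z = g[z]; k++];
  If[k == 50, 0, k]];
ArrayPlot[Table[count[x + I y], {y, -2, 2, 0.02}, {x, -2, 2, 0.02}],
  ColorFunction -> "SunsetColors"]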
Theorem 3. 
Assume that M is an invertible matrix. Then, the sequence {H_l}_{l=0}^∞ generated by (18) with H₀ = M is asymptotically stable.
Proof. 
We consider a perturbation U_l in the lth iterate of the numerical solver; for more, see [24]. We define the following relation per cycle:
H̃_l = H_l + U_l.   (31)
Throughout the remainder of the proof, we assume the validity of the relation (U_l)^i ≈ 0 for i ≥ 2, which holds true under a first-order error analysis, considering the smallness of U_l. We obtain
H̃_{l+1} = [11I + 106H̃_l² + 51H̃_l⁴] × [2H̃_l (27I + 52H̃_l² + 5H̃_l⁴)]^{−1}.   (32)
In the phase of convergence, it is considered that
H_l ≈ sign(M) = N;   (33)
utilizing the following established facts (pertaining to the invertible matrix H and an arbitrary matrix E) as referenced in [25] (p. 188),
(E + H)^{−1} ≈ H^{−1} − H^{−1} E H^{−1}.   (34)
We also utilize N^{−1} = N and N² = I (which are special cases of N^{2j} = I and N^{2j+1} = N, j ≥ 1) to obtain
H̃_{l+1} ≈ N + (1/2) U_l − (1/2) N U_l N.   (35)
By further simplifications and using U_{l+1} = H̃_{l+1} − H_{l+1}, we can find
U_{l+1} ≈ (1/2) U_l − (1/2) N U_l N.   (36)
This leads to the conclusion that the perturbation produced by (18) at iterate l + 1 is bounded, thus offering us
U_{l+1} ≈ (1/2)(U_0 − N U_0 N).   (37)
Therefore, the sequence {H_l}_{l=0}^∞ produced via (18) is stable.    □
We end this section by noting that our proposed method comes with error estimation techniques that enable the estimation of the approximation error during the computation. This provides valuable information about the quality of the computed solution. Its higher-order nature also allows for better control over the accuracy of the approximation by adjusting the order of the method or specifying convergence tolerances. Moreover, our method can take advantage of parallel computing architectures to accelerate the computation process: it can be parallelized to distribute the computational workload across multiple processing units, enabling faster computations for large matrices or in high-performance computing environments.

4. Computational Study

Here, we evaluate the performance of the proposed solvers for various types of problems. All implementations are run in Mathematica (see [26]). Computational issues, such as convergence detection, are considered. We divide the tests into two parts: tests on randomly generated matrices and a test on a practical problem. The following methods are compared: (5), denoted by NIM; (8), denoted by HM; (11), denoted by Padé; (17), denoted by P1; and (18), denoted by P2. We also compare the results with the method of Zaka Ullah et al. (ZM) [21], given as follows:
H_{l+1} = (5I + 42H_l² + 17H_l⁴)[H_l (23I + 38H_l² + 3H_l⁴)]^{−1}.   (38)
For all the compared iterative methods, we employ the initial matrix H₀ given by (6). The calculation of the absolute error is performed as follows:
E_{l+1} = ‖H_{l+1}² − I‖₂ ≤ ϵ.   (39)
Here, ϵ represents the stopping tolerance. The reported times are based on a single execution of the entire program; they encompass all calculations, including the computation of residual norms and other relevant operations.
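The benchmark driver assumed in these runs has roughly the following shape (an illustrative sketch only; step is any one-step map, such as p2Step from the sketch after (18)):

(* Iterate until E_{l+1} = ||H^2 - I||_2 <= eps, as in (39); return the
   approximate sign together with the iteration count. *)
signSolve[step_, M_?MatrixQ, eps_] := Module[{H = N[M], Id, l = 0},
  Id = IdentityMatrix[Length[M]];
  While[Norm[H.H - Id, 2] > eps && l < 100, H = step[H]; l++];
  {H, l}]

(* Example: {S, its} = signSolve[p2Step, RandomReal[{-1, 1}, {100, 100}], 10^-5] *)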
Example 1. 
Ten real random matrices are generated using the random seed SeedRandom[12345], and their matrix sign functions are computed and compared. We produce ten random matrices with entries in the interval [−1, 1] and of sizes 100 × 100 up to 1000 × 1000, under ϵ = 10^{−5}.
Numerical comparisons for Example 1 are presented in Table 1 and Table 2, substantiating the effectiveness of the methods proposed in this paper. Notably, both P1 and P2 contribute to a decrease in the required number of iterations for determining the matrix sign, resulting in a noticeable reduction in elapsed CPU time (measured in seconds). This reduction is evident in the average CPU times across ten randomly generated matrices of varying dimensions.
Example 2. 
In this test, the matrix sign function is computed for ten complex random matrices, with the same seed as in Example 1, under ϵ = 10^{−4}, via the piece of Mathematica code below:
number = 10;
Table[M[n] = RandomComplex[{-20 - 20 I, 20 + 20 I}, {100 n, 100 n}];, {n, number}];
Table 3 and Table 4 provide numerical comparisons for Example 2, further reaffirming the efficiency of the proposed method in determining the sign of ten randomly generated complex matrices. Comparable computational experiments conducted for various analogous problems consistently corroborate the aforementioned findings, with P1 emerging as the superior solver.

Stabilized Solution of a Riccati Equation

We take into account the algebraic Riccati equation (ARE), arising in optimal control problems in continuous/discrete time, as follows [27]:
L(Y) = Y A₁ + A₁ᵀ Y + P − Y A₂ L^{−1} A₂ᵀ Y = 0,   (40)
where L = Lᵀ ∈ R^{m×m} is positive definite, P = Pᵀ ∈ R^{n×n} is positive semi-definite, A₁ ∈ R^{n×n}, A₂ ∈ R^{n×m}, and Y ∈ R^{n×n} is the unknown matrix. Typically, we seek a solution that stabilizes the system, meaning the eigenvalues of A₁ − A₂L^{−1}A₂ᵀY have negative real parts. It is important to note that if the pair (A₁, A₂) is stabilizable and (A₁, P) is detectable, then there exists a unique stabilizing solution Y of Equation (40) in R^{n×n}. Furthermore, this solution Y is both symmetric and positive semi-definite. Assuming Y is the stabilizing solution of the ARE in Equation (40), all eigenvalues of A₁ − A₂L^{−1}A₂ᵀY have negative real parts, which can be seen from the following identity:
[ A₁   −A₂L^{−1}A₂ᵀ ] [ I  0 ]   [ I  0 ] [ A₁ − A₂L^{−1}A₂ᵀY   −A₂L^{−1}A₂ᵀ        ]
[ −P   −A₁ᵀ         ] [ Y  I ] = [ Y  I ] [ 0                   −A₁ᵀ + YA₂L^{−1}A₂ᵀ ].   (41)
We obtain
Z = [ Z₁₁  Z₁₂ ]            [ I  0 ] [ −I  K ] [ I  0 ]^{−1}
    [ Z₂₁  Z₂₂ ] = sign(H) = [ Y  I ] [  0  I ] [ Y  I ]        (42)
for a suitable matrix K, wherein
H = [ A₁   −A₂L^{−1}A₂ᵀ ]
    [ −P   −A₁ᵀ         ].   (43)
Now, we find Y as follows:
[ Z₁₁  Z₁₂ ] [ I ]     [ I ]
[ Z₂₁  Z₂₂ ] [ Y ] = − [ Y ],   (44)
and thus
[ Z₁₂ ]     [ Z₁₁ ]   [ I ]
[ Z₂₂ ] Y + [ Z₂₁ ] + [ Y ] = 0,   (45)
which implies
[ Z₁₂     ]       [ Z₁₁ + I ]
[ Z₂₂ + I ] Y = − [ Z₂₁     ].   (46)
After determining the sign of H, obtaining the required solution becomes feasible by addressing the overdetermined system (46). This can be achieved through standard algorithms designed for solving such systems. In our test scenario, we utilize P1 with the termination condition (39) in the infinity norm, along with a tolerance of 10^{−8}, to compute the sign of H during the solution of the ARE (40). As a practical instance, this procedure involves the following input matrices:
A₁ =
[ 3.   1.   0    0    0  ]
[ 1.   3.   1.   0    0  ]
[ 0    1.   3.   1.   0  ]
[ 0    0    1.   3.   1. ]
[ 0    0    0    1.   3. ],   (47)

A₂ =
[ 0.8  0    0    1.6  0   ]
[ 0    0.8  0    0    1.6 ]
[ 0    0    0.8  0    0   ]
[ 1.6  0    0    0.8  0   ]
[ 0    1.6  0    0    0.8 ],   (48)

P = diag(4.55719, 9.77826, 9.43215, 9.62216, 3.02348),   (49)

L =
[ 500.  100.  200.  0.    0.   ]
[ 100.  600.  100.  0.    200. ]
[ 200.  100.  500.  0.    200. ]
[ 0.    0.    0.    400.  0.   ]
[ 0.    200.  200.  0.    400. ].   (50)
The resulting matrix, which serves as the solution to (40), is
Y =
[ 2033.67   605.049   249.922   1641.71   372.572 ]
[ 605.049   1085.56   370.904   587.446   780.42  ]
[ 249.922   370.904   2798.35   301.419   457.111 ]
[ 1641.71   587.446   301.419   2079.22   583.461 ]
[ 372.572   780.42    457.111   583.461   1819.58 ].   (51)
To verify the accuracy of the matrix Y, we calculate the residual norm of (40) in the l∞ norm using (51), resulting in ‖L(Y)‖∞ = 1.58 × 10^{−6}. This value affirms the precision of the approximation we attained for (40) through the matrix sign function and the P1 approach.
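The block extraction and the solve of (46) can be sketched in Mathematica as follows (the function name and the use of LeastSquares for the overdetermined system are our own choices, not the exact production code):

(* Recover Y from Z = sign(H), partitioned into n x n blocks, via (46). *)
solveARE[Z_?MatrixQ, n_Integer] :=
 Module[{Z11, Z12, Z21, Z22, Id = IdentityMatrix[n]},
  {Z11, Z12} = {Z[[;; n, ;; n]], Z[[;; n, n + 1 ;;]]};
  {Z21, Z22} = {Z[[n + 1 ;;, ;; n]], Z[[n + 1 ;;, n + 1 ;;]]};
  LeastSquares[ArrayFlatten[{{Z12}, {Z22 + Id}}],
   -ArrayFlatten[{{Z11 + Id}, {Z21}}]]]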

5. Conclusions

In this paper, we provided details on why higher-order methods are useful for finding the sign of invertible matrices. It was then shown, via basins of attraction, that the proposed method for computing (1) has global convergence, and its convergence radii are wider than those of its competitors. The stability of the method was discussed in detail, and computational results supported the theoretical outcomes. Several other tests were conducted by the authors and upheld the efficiency of the presented iterative scheme in computing the matrix sign. Future work can pursue the extension of the presented solver to the matrix sector function. In fact, this function is a generalization of the sign function, and it is desirable to compute it by high-order iterative methods. On the other hand, it would be interesting and of practical use to propose more efficient variants of (17) that have larger attraction basins with the same number of matrix multiplications. These could be our future research directions.

Author Contributions

Conceptualization, L.S., M.Z.U. and H.K.N.; formal analysis, L.S., M.Z.U., M.A. and S.S.; funding acquisition, S.S.; investigation, L.S., M.Z.U. and M.A.; methodology, H.K.N., M.A. and S.S.; supervision, L.S., M.Z.U. and H.K.N.; validation, M.Z.U., M.A. and S.S.; writing—original draft, L.S. and M.Z.U.; writing—review and editing, L.S. and M.Z.U. All authors have read and agreed to the published version of the manuscript.

Funding

Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, Saudi Arabia, under grant no. (IFPIP: 383-247-1443).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Regarding the data availability statement, it is confirmed that data sharing does not apply to this article, as no new data were generated during the course of this paper.

Acknowledgments

This research work was funded by Institutional Fund Projects under grant no. (IFPIP: 383-247-1443). The authors gratefully acknowledge technical and financial support provided by the Ministry of Education and King Abdulaziz University, DSR, Jeddah, Saudi Arabia. The authors express their gratitude to the referees for providing valuable feedback, comments, and corrections on an earlier version of this submission. Their help has significantly improved the readability and reliability of this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hogben, L. Handbook of Linear Algebra; Chapman and Hall/CRC: Boca Raton, FL, USA, 2007. [Google Scholar]
  2. Roberts, J.D. Linear model reduction and solution of the algebraic Riccati equation by use of the sign function. Int. J. Cont. 1980, 32, 677–687. [Google Scholar] [CrossRef]
  3. Higham, N.J. Functions of Matrices: Theory and Computation; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2008. [Google Scholar]
  4. Soheili, A.R.; Amini, M.; Soleymani, F. A family of Chaplygin–type solvers for Itô stochastic differential equations. Appl. Math. Comput. 2019, 340, 296–304. [Google Scholar] [CrossRef]
  5. Al-Mohy, A.; Higham, N. A scaling and squaring algorithm for the matrix exponential. SIAM J. Matrix Anal. Appl. 2009, 31, 970–989. [Google Scholar] [CrossRef]
  6. Denman, E.D.; Beavers, A.N. The matrix sign function and computations in systems. Appl. Math. Comput. 1976, 2, 63–94. [Google Scholar] [CrossRef]
  7. Tsai, J.S.H.; Chen, C.M. A computer-aided method for solvents and spectral factors of matrix polynomials. Appl. Math. Comput. 1992, 47, 211–235. [Google Scholar] [CrossRef]
  8. Ringh, E. Numerical Methods for Sylvester-Type Matrix Equations and Nonlinear Eigenvalue Problems. Doctoral Thesis, Applied and Computational Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden, 2021. [Google Scholar]
  9. Kenney, C.S.; Laub, A.J. Rational iterative methods for the matrix sign function. SIAM J. Matrix Anal. Appl. 1991, 12, 273–291. [Google Scholar] [CrossRef]
  10. Soleymani, F.; Stanimirović, P.S.; Shateyi, S.; Haghani, F.K. Approximating the matrix sign function using a novel iterative method. Abstr. Appl. Anal. 2014, 2014, 105301. [Google Scholar] [CrossRef]
  11. Gomilko, O.; Greco, F.; Ziȩtak, K. A Padé family of iterations for the matrix sign function and related problems. Numer. Lin. Alg. Appl. 2012, 19, 585–605. [Google Scholar] [CrossRef]
  12. Cordero, A.; Soleymani, F.; Torregrosa, J.R.; Zaka Ullah, M. Numerically stable improved Chebyshev–Halley type schemes for matrix sign function. J. Comput. Appl. Math. 2017, 318, 189–198. [Google Scholar] [CrossRef]
  13. Rani, L.; Soleymani, F.; Kansal, M.; Kumar Nashine, H. An optimized Chebyshev-Halley type family of multiple solvers: Extensive analysis and applications. Math. Meth. Appl. Sci. 2022, in press. [Google Scholar] [CrossRef]
  14. Liu, T.; Zaka Ullah, M.; Alshahrani, K.M.A.; Shateyi, S. From fractal behavior of iteration methods to an efficient solver for the sign of a matrix. Fractal Fract. 2023, 7, 32. [Google Scholar] [CrossRef]
  15. Khdhr, F.W.; Soleymani, F.; Saeed, R.K.; Akgül, A. An optimized Steffensen-type iterative method with memory associated with annuity calculation. Eur. Phy. J. Plus 2019, 134, 146. [Google Scholar] [CrossRef]
  16. Dehghani-Madiseh, M. A family of eight-order interval methods for computing rigorous bounds to the solution to nonlinear equations. Iran. J. Numer. Anal. Optim. 2023, 13, 102–120. [Google Scholar]
  17. Ogbereyivwe, O.; Izevbizua, O. A three-free-parameter class of power series based iterative method for approximation of nonlinear equations solution. Iran. J. Numer. Anal. Optim. 2023, 13, 157–169. [Google Scholar]
  18. McNamee, J.M.; Pan, V.Y. Numerical Methods for Roots of Polynomials—Part I; Elsevier: Amsterdam, The Netherlands, 2007. [Google Scholar]
  19. McNamee, J.M.; Pan, V.Y. Numerical Methods for Roots of Polynomials—Part II; Elsevier: Amsterdam, The Netherlands, 2013. [Google Scholar]
  20. Traub, J.F. Iterative Methods for the Solution of Equations; Prentice-Hall: New York, NY, USA, 1964. [Google Scholar]
  21. Zaka Ullah, M.; Muaysh Alaslani, S.; Othman Mallawi, F.; Ahmad, F.; Shateyi, S.; Asma, M. A fast and efficient Newton-type iterative scheme to find the sign of a matrix. AIMS Math. 2023, 8, 19264–19274. [Google Scholar] [CrossRef]
  22. Bhatia, R. Matrix Analysis; Graduate Texts in Mathematics; Springer: New York, NY, USA, 1997; Volume 169. [Google Scholar]
  23. Soleymani, F.; Kumar, A. A fourth-order method for computing the sign function of a matrix with application in the Yang–Baxter-like matrix equation. Comput. Appl. Math. 2019, 38, 64. [Google Scholar] [CrossRef]
  24. Iannazzo, B. Numerical Solution of Certain Nonlinear Matrix Equations. Ph.D. Thesis, Universita Degli Studi di Pisa, Pisa, Italy, 2007. [Google Scholar]
  25. Stewart, G.W. Introduction to Matrix Computations; Academic Press: New York, NY, USA, 1973. [Google Scholar]
  26. Styś, K.; Styś, T. Lecture Notes in Numerical Analysis with Mathematica; Bentham eBooks: Sharjah, United Arab Emirates, 2014. [Google Scholar]
  27. Soheili, A.R.; Toutounian, F.; Soleymani, F. A fast convergent numerical method for matrix sign function with application in SDEs. J. Comput. Appl. Math. 2015, 282, 167–178. [Google Scholar] [CrossRef]
Figure 1. The attraction basins of (5) on the left and (11) on the right are illustrated.
Figure 2. Attraction basins of (10) on the left, and locally convergent Padé [1,3] on the right.
Figure 3. Attraction basins of (17) on the left, and (18) on the right.
Table 1. Computational comparisons based on the number of iterates in Example 1.

Matrix | n × n | NIM | HM | Padé | P1 | P2 | ZM
#1 | 100 × 100 | 10 | 7 | 5 | 5 | 5 | 5
#2 | 200 × 200 | 15 | 9 | 8 | 7 | 7 | 7
#3 | 300 × 300 | 17 | 11 | 9 | 8 | 8 | 8
#4 | 400 × 400 | 13 | 8 | 7 | 6 | 6 | 6
#5 | 500 × 500 | 14 | 9 | 7 | 6 | 6 | 6
#6 | 600 × 600 | 14 | 9 | 7 | 6 | 6 | 6
#7 | 700 × 700 | 13 | 8 | 7 | 6 | 6 | 6
#8 | 800 × 800 | 17 | 11 | 9 | 7 | 8 | 8
#9 | 900 × 900 | 20 | 13 | 10 | 9 | 9 | 9
#10 | 1000 × 1000 | 14 | 9 | 7 | 6 | 6 | 6
Mean | | 14.7 | 9.4 | 7.6 | 6.6 | 6.7 | 6.7
Table 2. Computational comparisons based on the CPU times (in seconds) for Example 1.

Matrix | n × n | NIM | HM | Padé | P1 | P2 | ZM
#1 | 100 × 100 | 0.009 | 0.006 | 0.007 | 0.007 | 0.007 | 0.007
#2 | 200 × 200 | 0.05 | 0.04 | 0.04 | 0.03 | 0.03 | 0.03
#3 | 300 × 300 | 0.14 | 0.11 | 0.10 | 0.11 | 0.11 | 0.10
#4 | 400 × 400 | 0.22 | 0.20 | 0.19 | 0.19 | 0.17 | 0.16
#5 | 500 × 500 | 0.39 | 0.36 | 0.32 | 0.30 | 0.30 | 0.30
#6 | 600 × 600 | 0.62 | 0.54 | 0.47 | 0.45 | 0.46 | 0.47
#7 | 700 × 700 | 0.83 | 0.68 | 0.67 | 0.63 | 0.66 | 0.68
#8 | 800 × 800 | 1.50 | 1.28 | 1.19 | 1.01 | 1.15 | 1.17
#9 | 900 × 900 | 2.32 | 2.02 | 1.80 | 1.70 | 1.72 | 1.73
#10 | 1000 × 1000 | 2.15 | 1.88 | 1.73 | 1.53 | 1.55 | 1.56
Mean | | 0.82 | 0.71 | 0.65 | 0.60 | 0.62 | 0.62
Table 3. Computational comparisons based on the number of iterates in Example 2.

Matrix | n × n | NIM | HM | Padé | P1 | P2 | ZM
#1 | 100 × 100 | 16 | 10 | 8 | 7 | 7 | 7
#2 | 200 × 200 | 22 | 14 | 11 | 10 | 10 | 10
#3 | 300 × 300 | 19 | 12 | 10 | 8 | 8 | 8
#4 | 400 × 400 | 20 | 13 | 10 | 9 | 9 | 9
#5 | 500 × 500 | 20 | 13 | 10 | 9 | 9 | 9
#6 | 600 × 600 | 22 | 14 | 11 | 10 | 10 | 10
#7 | 700 × 700 | 20 | 13 | 10 | 9 | 9 | 9
#8 | 800 × 800 | 23 | 14 | 12 | 10 | 10 | 10
#9 | 900 × 900 | 21 | 14 | 11 | 9 | 10 | 10
#10 | 1000 × 1000 | 22 | 14 | 11 | 10 | 10 | 10
Mean | | 20.5 | 13.1 | 10.4 | 9.1 | 9.2 | 9.2
Table 4. Computational comparisons based on CPU times (in seconds) for Example 2.

Matrix | n × n | NIM | HM | Padé | P1 | P2 | ZM
#1 | 100 × 100 | 0.07 | 0.03 | 0.02 | 0.02 | 0.03 | 0.03
#2 | 200 × 200 | 0.14 | 0.13 | 0.12 | 0.12 | 0.13 | 0.11
#3 | 300 × 300 | 0.36 | 0.31 | 0.30 | 0.28 | 0.30 | 0.27
#4 | 400 × 400 | 0.73 | 0.68 | 0.62 | 0.64 | 0.61 | 0.64
#5 | 500 × 500 | 1.25 | 1.20 | 1.07 | 1.109 | 1.05 | 1.09
#6 | 600 × 600 | 2.16 | 1.96 | 1.85 | 1.89 | 1.81 | 1.84
#7 | 700 × 700 | 2.93 | 2.74 | 2.41 | 2.55 | 2.45 | 2.41
#8 | 800 × 800 | 4.87 | 4.09 | 4.15 | 3.79 | 3.80 | 3.78
#9 | 900 × 900 | 6.33 | 5.75 | 5.18 | 4.63 | 5.12 | 5.03
#10 | 1000 × 1000 | 9.61 | 8.03 | 7.30 | 6.94 | 7.09 | 7.08
Mean | | 2.85 | 2.49 | 2.30 | 2.19 | 2.24 | 2.23