*Article* **Eigenvalue Estimates via Pseudospectra †**

**Georgios Katsouleas \*,‡, Vasiliki Panagakou ‡ and Panayiotis Psarrakos ‡**

> Department of Mathematics, Zografou Campus, National Technical University of Athens, 15773 Athens, Greece; vpanagakou@mail.ntua.gr (V.P.); ppsarr@math.ntua.gr (P.P.)

**\*** Correspondence: g\_katsouleas@mail.ntua.gr

† This paper is dedicated to Mr. Constantin M. Petridi.

‡ These authors contributed equally to this work.

**Abstract:** In this note, given a matrix $A \in \mathbb{C}^{n \times n}$ (or a general matrix polynomial $P(z)$, $z \in \mathbb{C}$) and an arbitrary scalar $\lambda\_0 \in \mathbb{C}$, we show how to define a sequence $\{\mu\_k\}\_{k \in \mathbb{N}}$ which converges to some element of its spectrum. The scalar $\lambda\_0$ serves as the initial term ($\mu\_0 = \lambda\_0$), while additional terms are constructed through a recursive procedure, exploiting the fact that each term $\mu\_k$ of this sequence is in fact a point lying on the boundary curve of some pseudospectral set of $A$ (or $P(z)$). Then, the next term in the sequence is detected in the direction which is normal to this curve at the point $\mu\_k$. Repeating the construction for additional initial points, it is possible to approximate peripheral eigenvalues, localize the spectrum and even obtain spectral enclosures. Hence, as a by-product of our method, a computationally cheap procedure for approximate pseudospectra computations emerges. An advantage of the proposed approach is that it does not make any assumptions on the location of the spectrum. The fact that all computations are performed on dynamically chosen locations on the complex plane which converge to the eigenvalues, rather than on a large number of predefined points on a rigid grid, can be used to accelerate conventional grid algorithms. Parallel implementation of the method or use in conjunction with randomization techniques can lead to further computational savings when applied to large-scale matrices.

**Keywords:** pseudospectra; eigenvalues; matrix polynomial; perturbation; Perron root; large-scale matrices; approximation algorithm

## **1. Introduction**

The theory of pseudospectra originates in numerical analysis and can be traced back to Landau [1], Varah [2], Wilkinson [3], Demmel [4], and Trefethen [5], motivated by the need to obtain insights into systems evolving in ways that the eigenvalues alone could not explain. This is especially true in problems where the underlying matrices or linear operators are non-normal or exhibit in some sense large deviations from normality. A better understanding of such systems can be gained through the concept of pseudospectrum, which, for a matrix $A \in \mathbb{C}^{n \times n}$ and a positive parameter $\varepsilon > 0$, was introduced as the subset of the complex plane that is bounded by the $\varepsilon^{-1}$–level set of the norm of the resolvent $(\mu I - A)^{-1}$. A second definition stated in terms of perturbations characterizes the elements of this set as eigenvalues of some perturbation $A + E$ with $\|E\| \le \varepsilon$. In this sense, the notion of pseudospectrum provides information that goes beyond eigenvalues, while retaining the advantage of being a natural extension of the spectral set. In fact, for different values of the magnitude $\varepsilon$, the pseudospectrum provides a global perspective on the effects of perturbations; this is in stark contrast to the concept of condition number, where only the worst-case scenario is considered.

On one hand, pseudospectrum may be used as a visualization tool to reveal information regarding the matrix itself and the sensitivity of its eigenvalues. Applications within numerical analysis include convergence of nonsymmetric matrix iterations [6], backward error analysis of eigenvalue algorithms [7], and stability of spectral methods [8]. On the other hand, it is a versatile tool that has been used to obtain quantitative bounds on the transient behavior of differential equations in finite time, which may deviate from the long-term asymptotic behavior [9]. Important results involving pseudospectra have also been obtained in the context of spectral theory and spectral properties of banded Toeplitz matrices [10,11]. Although emphasis has been placed on the standard eigenproblem, attention has also been drawn to matrix pencils [12] and more general matrix polynomials [13,14] arising in vibrating systems, control theory, etc. For a comprehensive overview of this research field and its applications, the interested reader may refer to [15].

In this note, we propose an application of pseudospectral sets as a means to obtain eigenvalue estimates in the vicinity of some complex scalar. In particular, given a matrix (or a general matrix polynomial) and a scalar $\lambda\_0 \in \mathbb{C}$, we construct a sequence $\{\mu\_k\}\_{k \in \mathbb{N}}$ that converges to some element of its spectrum. The scalar $\lambda\_0$ serves as the initial term ($\mu\_0 = \lambda\_0$), while additional terms are constructed through an iterative procedure, exploiting the fact that each term $\mu\_k$ of this sequence is in fact a point lying on the boundary curve of some pseudospectral set. Then, the next term in the sequence is detected in the direction perpendicular to the tangent line at the point $\mu\_k$. Repeating the construction for a tuple of initial points encircling the spectrum, several peripheral eigenvalues are approximated. Since the pseudospectrum may be disconnected, this procedure allows the identification of individual connected components and, as a by-product, a convenient and numerically efficient procedure for approximate pseudospectrum computation emerges. Moreover, this approach is clearly amenable to parallelization or randomization and can lead to significant computational savings when applied to problems involving large–scale matrices.

Our paper is organized as follows. In Section 2, we provide the necessary theoretical background on the method and give examples for the constant matrix case. As confirmed by numerical experiments, the method can provide a sufficiently accurate pseudospectrum computation at a much-reduced computational cost, especially in cases where the spectrum is convexly independent (i.e., each eigenvalue does not lie in the convex hull of the others) or exhibits large eigenvalue gaps. A second application of the method, to Perron root approximation for non–negative matrices, is also presented. Then, Section 3 shows how the procedure may be modified to estimate the spectrum of more general matrix polynomials. Numerical experiments showcasing the application of the method to damped mass–spring and gyroscopic systems conclude the paper.

#### **2. Eigenvalues via Pseudospectra**

Let $A \in \mathbb{C}^{n \times n}$ be a matrix with spectrum $\sigma(A) = \{\mu \in \mathbb{C} : \det(\mu I - A) = 0\}$, where $\det(\cdot)$ denotes the *determinant* of a matrix. With respect to the $\|\cdot\|\_2$–norm, the *pseudospectrum* of $A$ is defined by

$$\begin{aligned} \sigma\_{\varepsilon}(A) &= \left\{ \mu \in \mathbb{C} : \frac{1}{\|(\mu I - A)^{-1}\|\_{2}} \le \varepsilon \right\} \\ &= \left\{ \mu \in \mathbb{C} : \mu \in \sigma(A + E) \text{ for some } E \in \mathbb{C}^{n \times n} \text{ with } \|E\|\_2 \le \varepsilon \right\} \\ &= \{ \mu \in \mathbb{C} : s\_{\min}(\mu I - A) \le \varepsilon \}, \end{aligned}$$

where $s\_{\min}(\cdot)$ denotes the smallest *singular value* of a matrix and $\varepsilon > 0$ is the maximum norm of admissible perturbations.
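Computationally, the third characterization above is the operative one: a point $\mu$ belongs to $\sigma\_{\varepsilon}(A)$ precisely when $s\_{\min}(\mu I - A) \le \varepsilon$. A minimal sketch in Python illustrates the test (the function name is ours, for illustration only):

```python
import numpy as np

def in_pseudospectrum(A, mu, eps):
    """Test whether mu lies in the 2-norm pseudospectrum sigma_eps(A),
    using the equivalent condition s_min(mu*I - A) <= eps."""
    M = mu * np.eye(A.shape[0]) - A
    smin = np.linalg.svd(M, compute_uv=False)[-1]  # singular values sorted descending
    return smin <= eps
```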

For every choice of increasing positive parameters $0 < \varepsilon\_1 < \varepsilon\_2 < \varepsilon\_3 < \cdots$, the corresponding closed, strictly nested sequence of pseudospectra

$$
\sigma\_{\varepsilon\_1}(A) \subset \sigma\_{\varepsilon\_2}(A) \subset \sigma\_{\varepsilon\_3}(A) \subset \cdots
$$

is obtained. In fact, the respective boundaries satisfy the inclusions

$$\begin{aligned} \partial \sigma\_{\varepsilon\_1}(A) &\subseteq \{ \mu \in \mathbb{C} : s\_{\min}(\mu I - A) = \varepsilon\_1 \} \\ \partial \sigma\_{\varepsilon\_2}(A) &\subseteq \{ \mu \in \mathbb{C} : s\_{\min}(\mu I - A) = \varepsilon\_2 \} \\ \partial \sigma\_{\varepsilon\_3}(A) &\subseteq \{ \mu \in \mathbb{C} : s\_{\min}(\mu I - A) = \varepsilon\_3 \} \\ &\ \ \vdots \end{aligned}$$

It is also clear that, for any $\lambda \in \sigma(A)$, $s\_{\min}(\lambda I - A) = 0$.

Our objective now is to exploit the properties of these sets to detect an eigenvalue of $A$ in the vicinity of a given scalar $\lambda\_0 \in \mathbb{C} \setminus \sigma(A)$. This given point of interest may be considered to lie on the boundary of some pseudospectral set; i.e., there exists some positive parameter $\hat{\varepsilon}\_1 > 0$ such that

$$
\lambda\_0 \in \partial \sigma\_{\hat{\varepsilon}\_1}(A) \subseteq \{ \mu \in \mathbb{C} : s\_{\min}(\mu I - A) = \hat{\varepsilon}\_1 \}. \tag{1}
$$

Indeed, points satisfying the equality $s\_{\min}(\mu I - A) = \varepsilon$ for some $\varepsilon > 0$ and lying in the interior of $\sigma\_{\varepsilon}(A)$ are finite in number. Thus, in the generic case, we may think of the inclusion (1) as an equality.

We consider the real–valued function $g\_A : \mathbb{C} \to \mathbb{R}\_+$ with $g\_A(z) = s\_{\min}(zI - A)$. In the process of formulating a curve-tracing algorithm for pseudospectrum computation [16], Brühl analyzed $g\_A(z)$ and, identifying $\mathbb{C} \equiv \mathbb{R}^2$, noted that its differentiability is explained by the following theorem in [17]:

**Theorem 1.** *Let the matrix valued function $P(\chi) : \mathbb{R}^d \to \mathbb{C}^{n \times n}$ be real analytic in a neighborhood of $\chi\_0 = (x\_{10}, \dots, x\_{d0})$ and let $\sigma\_0$ be a simple nonzero singular value of $P(\chi\_0)$ with $u\_0$, $v\_0$ its associated left and right singular vectors, respectively.*

*Then, there exists a neighborhood $\mathcal{N}$ of $\chi\_0$ on which a simple nonzero singular value $\sigma(\chi)$ of $P(\chi)$ is defined with corresponding left and right singular vectors $u(\chi)$ and $v(\chi)$, respectively, such that $\sigma(\chi\_0) = \sigma\_0$, $u(\chi\_0) = u\_0$, $v(\chi\_0) = v\_0$, and the functions $\sigma, u, v$ are real analytic on $\mathcal{N}$. The partial derivatives of $\sigma(\chi)$ are given by*

$$\frac{\partial \sigma(\chi\_0)}{\partial x\_j} = \operatorname{Re}\left(u\_0^\* \frac{\partial P(\chi\_0)}{\partial x\_j} v\_0\right), \quad j = 1, \dots, d.$$

Hence, recalling (1) and assuming $\hat{\varepsilon}\_1$ is a simple singular value of the matrix $P(\lambda\_0) = \lambda\_0 I - A$, Theorem 1 applies with $\partial P / \partial x = I$ and $\partial P / \partial y = \mathbf{i} I$, so that

$$\left. \nabla s\_{\min}(zI - A) \right|\_{z=\lambda\_0} = \left( \operatorname{Re}(v\_{\min}^\* u\_{\min}),\ \operatorname{Im}(v\_{\min}^\* u\_{\min}) \right) \equiv v\_{\min}^\* u\_{\min},$$

where $u\_{\min}$ and $v\_{\min}$ denote the left and right singular vectors of $\lambda\_0 I - A$ associated to $\hat{\varepsilon}\_1 = s\_{\min}(\lambda\_0 I - A)$, respectively [16] (Corollary 2.2).

On the other hand, if $\lambda$ is an eigenvalue of $A$ near $\lambda\_0$, it holds that $|\lambda - \lambda\_0| \ge \hat{\varepsilon}\_1$; that is, the step length $\hat{\varepsilon}\_1$ cannot overshoot the nearest eigenvalue. The latter observation follows from the fact that

$$\begin{aligned} \sigma\_{\varepsilon}(A) &\supseteq \sigma(A) + D(0, \varepsilon) \\ &= \{ z \in \mathbb{C} : \operatorname{dist}(z, \sigma(A)) \le \varepsilon \}, \end{aligned}$$

where $D(0, \varepsilon) = \{z \in \mathbb{C} : |z| \le \varepsilon\}$ and equality holds for normal matrices. So, the scalar

$$\begin{aligned} \mu\_1 &= \lambda\_0 - \left. \hat{\varepsilon}\_1 \cdot \frac{\nabla s\_{\min}(zI - A)}{|\nabla s\_{\min}(zI - A)|} \right|\_{z = \lambda\_0} \\ &= \lambda\_0 - s\_{\min}(\lambda\_0 I - A) \cdot \frac{v\_{\min}(\lambda\_0)^\* u\_{\min}(\lambda\_0)}{|v\_{\min}(\lambda\_0)^\* u\_{\min}(\lambda\_0)|} \end{aligned}$$

can be considered to be an estimate of the eigenvalue $\lambda$. In particular, $\lambda\_0 \in \partial \sigma\_{\hat{\varepsilon}\_1}(A)$ and $\mu\_1$ lies in the interior of $\sigma\_{\hat{\varepsilon}\_1}(A)$. Moreover, the sequence

$$\begin{aligned} \mu\_0 &= \lambda\_0\\ \mu\_1 &= \mu\_0 - s\_{\min}(\mu\_0 I - A) \cdot \frac{v\_{\min}(\mu\_0)^\* u\_{\min}(\mu\_0)}{|v\_{\min}(\mu\_0)^\* u\_{\min}(\mu\_0)|}\\ \mu\_2 &= \mu\_1 - s\_{\min}(\mu\_1 I - A) \cdot \frac{v\_{\min}(\mu\_1)^\* u\_{\min}(\mu\_1)}{|v\_{\min}(\mu\_1)^\* u\_{\min}(\mu\_1)|}\\ \vdots & \vdots\\ \mu\_k &= \mu\_{k-1} - s\_{\min}(\mu\_{k-1} I - A) \cdot \frac{v\_{\min}(\mu\_{k-1})^\* u\_{\min}(\mu\_{k-1})}{|v\_{\min}(\mu\_{k-1})^\* u\_{\min}(\mu\_{k-1})|} \end{aligned} \tag{2}$$

converges to *λ*.

The above process requires the computation of the triplet

$$\left(s\_{\min}(\mu\_k I - A),\ u\_{\min}(\mu\_k),\ v\_{\min}(\mu\_k)\right)$$

at every point *μk*; see [18].
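As an illustration, the recursion (2) can be sketched in a few lines of Python. The full SVD below is a simplifying assumption; for large matrices one would compute only the smallest singular triplet with iterative methods, as in [18]. The function name, tolerance, and stopping rule are ours:

```python
import numpy as np

def eigenvalue_estimate(A, lam0, tol=1e-12, max_iter=200):
    """Iterate the sequence (2): from mu_k, step a distance s_min(mu_k*I - A)
    in the direction -v_min^* u_min / |v_min^* u_min|, i.e., opposite the
    gradient of s_min(zI - A), until an eigenvalue is (numerically) reached."""
    n = A.shape[0]
    mu = complex(lam0)
    for _ in range(max_iter):
        U, s, Vh = np.linalg.svd(mu * np.eye(n) - A)
        smin = s[-1]                 # smallest singular value
        if smin <= tol:              # mu is (numerically) an eigenvalue
            break
        u = U[:, -1]                 # left singular vector of s_min
        v = Vh[-1, :].conj()         # right singular vector of s_min
        grad = np.vdot(v, u)         # v_min^* u_min: the gradient as a complex number
        mu -= smin * grad / abs(grad)
    return mu
```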

*Remark.* To avoid the computational burden of computing the (left and right) singular vectors, a cheaper alternative would be to consider at each iteration (*k* = 0, 1, 2, ... ) the canonical octagon with vertices

$$\nu\_{k,j} = \mu\_k + e^{\mathbf{i}\, j \pi / 4} \cdot s\_{\min}(\mu\_k I - A), \quad j = 0, 1, 2, \dots, 7,$$

instead and simply compute

$$\theta\_{k,j} = s\_{\min}\left( \nu\_{k,j} I - A \right), \quad j = 0, 1, 2, \dots, 7.$$

In this case, instead of (2), we can set

$$
\mu\_{k+1} = \mu\_k + e^{\mathbf{i}\, j\_0 \pi / 4} \cdot \theta\_{k,j\_0},
$$

with *j*0 such that

$$
\theta\_{k,j\_0} = \min\_{j=0,1,2,\dots,7} \theta\_{k,j} = \min\_{j=0,1,2,\dots,7} \left( s\_{\min}\left( \nu\_{k,j} I - A \right) \right).
$$
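A sketch of one step of this singular-vector-free variant follows (names are ours; the eight probe directions at angles $j\pi/4$ are read off the formulas above). Only singular *values* are required, so no singular vectors are ever formed:

```python
import numpy as np

def octagon_step(A, mu):
    """One step of the octagon variant: probe the eight vertices
    nu_{k,j} = mu + exp(i*j*pi/4) * s_min(mu*I - A), then move to the
    vertex with smallest s_min, using that value as the step length."""
    n = A.shape[0]
    smin = np.linalg.svd(mu * np.eye(n) - A, compute_uv=False)[-1]
    dirs = np.exp(1j * np.pi / 4 * np.arange(8))   # octagon directions
    thetas = np.array([np.linalg.svd((mu + d * smin) * np.eye(n) - A,
                                     compute_uv=False)[-1] for d in dirs])
    j0 = int(np.argmin(thetas))
    return mu + dirs[j0] * thetas[j0]
```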

#### *2.1. Numerical Experiments*

2.1.1. Pseudospectrum Computation

The approximating sequences in (2) may be utilized to implement a computationally cheap procedure to visualize matrix pseudospectra, at least in cases where the order of the matrix is small or when its spectrum exhibits large eigenvalue gaps. Several related techniques for pseudospectrum computation have appeared in the literature. These fall largely into two categories: grid [14] and path-following algorithms [16,19–21]. Grid algorithms begin by evaluating the function $s\_{\min}(zI - A)$ on a predefined grid on the complex plane and lead to a graphical visualization of the boundary $\partial \sigma\_{\varepsilon}(A)$ by plotting the $\varepsilon$-contours of $s\_{\min}(zI - A)$. This approach faces two severe challenges; namely, the requirement of a priori information on the location of the spectrum to correctly identify a suitable region to discretize, as well as the typically large number of grid points on which the computations have to be performed. Path-following algorithms, on the other hand, require an initial step to detect a starting point on the curve $\partial \sigma\_{\varepsilon}(A)$ and then proceed to compute additional boundary points for each connected component of $\sigma\_{\varepsilon}(A)$. The main drawbacks of this latter approach lie in the difficulty of performing the initial step and the need to correctly identify every connected component of $\sigma\_{\varepsilon}(A)$ in order to repeat the procedure and properly trace its boundary. Moreover, cases where pseudospectrum computation is required for a whole tuple of parameter values $\varepsilon$ drastically compromise the efficiency of path-following algorithms.

Our approach is to use the approximating sequences (2) to decrease the number of singular value evaluations and therefore speed up the computation of pseudospectra. The basic steps are outlined as follows:


i. Select initial points $\{\mu\_0^j\}\_{j=1}^{s}$ on a curve encircling the region of the complex plane under investigation.

ii. For each $j = 1, \dots, s$, construct the approximating sequence $\{\mu\_k^j\}\_{k=0}^{n\_j}$ as in (2), its length $n\_j$ determined so that $s\_{\min}(\mu\_{n\_j}^j I - A) \le \varepsilon\_0$ for all $j = 1, \dots, s$, where $\varepsilon\_0$ is some prefixed parameter value. In other words, $\varepsilon\_0$ indicates the tolerance with which the eigenvalues approached by the constructed sequences should be approximated and corresponds to the minimum parameter $\varepsilon$ for which pseudospectra will be computed.

iii. Organize the sequences into clusters, grouping together those sequences that approximate the same element of $\sigma(A)$.

iv. Record the parameter values $\varepsilon\_k^j$ for which $\mu\_k^j \in \partial \sigma\_{\varepsilon\_k^j}(A)$ and compute

$$u = \min\_{j=1,\dots,s}\ \max\_{k=1,\dots,n\_j} \varepsilon\_k^j\ (> \varepsilon\_0) \quad \text{and} \quad \ell = \max\_{j=1,\dots,s}\ \min\_{k=1,\dots,n\_j} \varepsilon\_k^j\ (< \varepsilon\_0).$$

v. If necessary, repeat the procedure for $t$ additional points between the centroids of the detected clusters, constructing additional sequences, so that

$$\min\_{j=s+1,\dots,s+t}\ \max\_{k=1,\dots,n\_j} \varepsilon\_k^j > u \quad \text{and} \quad \max\_{j=s+1,\dots,s+t}\ \min\_{k=1,\dots,n\_j} \varepsilon\_k^j < \ell.$$


The proposed method successfully localizes the spectrum, initiating the procedure with a restricted number of points. Then, singular value computations are kept to a minimum by considering points only on the constructed sequences. Pseudospectrum components corresponding to *peripheral* eigenvalues $\lambda \notin \operatorname{co}(\sigma(A) \setminus \{\lambda\})$, i.e., those not in the convex hull of the other eigenvalues, are thus extremely easy to identify. This approach is also well suited to cases where the matrix has a *convexly independent* spectrum; i.e., when $\lambda \notin \operatorname{co}(\sigma(A) \setminus \{\lambda\})$ for every $\lambda \in \sigma(A)$. Moreover, it is clearly amenable to parallelization, which could lead to significant computational savings in cases of large matrices.
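To make the outline above concrete, here is a minimal end-to-end sketch under our own simplifying assumptions: the enclosing circle has radius $\|A\|\_1 \ge \rho(A)$, and clustering of the final terms (by k-means with a silhouette criterion, as in the experiments below) is left to the caller. All names are illustrative:

```python
import numpy as np

def localize_spectrum(A, n_starts=10, eps0=0.5):
    """Run the recursion (2) from n_starts equidistributed points on the
    circle |z| = ||A||_1, which encloses sigma(A); each trajectory stops
    once s_min(mu*I - A) <= eps0. The final terms approximate (peripheral)
    eigenvalues and can subsequently be clustered."""
    n = A.shape[0]
    radius = np.linalg.norm(A, 1)
    starts = radius * np.exp(2j * np.pi * np.arange(n_starts) / n_starts)
    trajectories = []
    for mu in starts:
        path = [mu]
        while True:
            U, s, Vh = np.linalg.svd(mu * np.eye(n) - A)
            smin = s[-1]
            if smin <= eps0:                  # tolerance eps0 reached
                break
            u, v = U[:, -1], Vh[-1, :].conj()
            grad = np.vdot(v, u)              # v_min^* u_min
            mu = mu - smin * grad / abs(grad)
            path.append(mu)
        trajectories.append(np.array(path))
    return trajectories
```

Note that the parameter values $\varepsilon\_k^j$ of step iv are exactly the $s\_{\min}$ values computed along each trajectory, so they are available at no extra cost for the interpolation of steps iv–v.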

**Application 1.** *We consider a random matrix $A \in \mathbb{C}^{6 \times 6}$, the sole constraint being that its eigenvalues are distant from each other; the real and imaginary parts of its entries follow the standardized normal distribution scaled by $10^4$. For the proposed procedure, we select initial points $\{\mu\_0^j\}\_{j=1}^{10} \subset \{z \in \mathbb{C} : |z| = \|A\|\_1\}$ and exploit the fact that the corresponding sequences $\{\mu\_k^j\}\_{k=0}^{n\_j}$ generated as in (2) converge to some element of $\sigma(A)$. The number $n\_j$ of terms in each sequence ($j = 1, \dots, 10$) is determined so that all values $\{s\_{\min}(\mu\_{n\_j}^j I - A)\}\_{j=1}^{10}$ do not exceed $\varepsilon\_0 = 0.5$. The sequences are organized into distinct clusters, grouping together those sequences which approximate the same element of $\sigma(A)$. This grouping is performed using a k-means clustering algorithm, where the optimal number of clusters is evaluated via the silhouette criterion and using a distance metric based on the sum of absolute differences between points. Since six different groups are identified, clearly all elements of $\sigma(A)$ have been sufficiently approximated by at least one of the sequences. For an illustration, refer to Figure 1a; different colors have been used to differentiate between polygonal chains corresponding to distinct clusters. The construction so far required 914 singular value computations. Having calculated all parameter values $\varepsilon\_k^j$ such that $\mu\_k^j \in \partial \sigma\_{\varepsilon\_k^j}(A)$ during the previous procedure, it is possible to interpolate between these known points along the trajectories formed by $\{\mu\_k^j\}\_{k=0}^{n\_j}$ ($j = 1, \dots, 10$) to approximate boundary points of $\partial \sigma\_{\varepsilon}(A)$ for selected values $\varepsilon > \varepsilon\_0 = 0.5$. Since all ten trajectories converge to eigenvalues from points encircling $\sigma(A)$, to obtain better pseudospectra approximations, it is necessary to repeat the procedure for additional suitably selected points. Hence, for each cluster we consider three additional points; see Figure 1b. In particular, denoting $c\_1, \dots, c\_6$ the centroids of the clusters, for each $j$ we consider the three centroids $\{c\_{j,k}\}\_{k=1}^{3}$ which lie closest to $c\_j$ and take the convex combinations*

$$p\_k^j = \frac{1}{6}(5c\_j + c\_{j,k}), \ k = 1,2,3.$$

*Then, additional sequences corresponding to these extra points are constructed, so that the desired parameter values of $\varepsilon$ for which pseudospectra should be computed (in this instance, the triple of values $\varepsilon = 1, 5, 10 \in [\ell, u]$) may be interpolated within these trajectories, as for the ten initial ones. This imposes an extra cost of 1170 additional singular value computations (2084 in total). The resulting approximations of the pseudospectra components identified by the upper left corner trajectories for $\varepsilon = 1, 5, 10$ are depicted in greater detail in Figure 1c; the relevant eigenvalue is indicated by "\*".*

An advantage of this procedure is that it does not require a priori knowledge of the initial region $\Omega$ on the complex plane where the spectrum is located. In fact, the very nature of this specific example, whose spectrum covers a wide area $\Omega$, would render computations on a suitable grid impractical. Another way in which this method differs from conventional grid algorithms is that the computations are performed on a dynamically chosen set of points, iteratively selected as the corresponding trajectories converge to peripheral eigenvalues and identify the relevant pseudospectrum components, rather than on a large number of predefined points on a rigid grid.

**Application 2.** *To demonstrate how the procedure works in cases of larger matrices, in this application we examine the matrix $A = 10^{-7} \cdot$ Pores2, where Pores2 is a $1024 \times 1024$ matrix from the Harwell-Boeing sparse matrix collection [22] related to a non–symmetric computational fluid dynamics problem. Here, the factor $10^{-7}$ is used for scaling purposes and is related to the order of the $\|\cdot\|\_1$–norm of the matrix under consideration. Initiating the procedure with 30 equidistributed points on the circle $\{z \in \mathbb{C} : |z| = \frac{1}{2}\|A\|\_1\}$, the method required a total of 810 singular value computations for a minimum parameter value of $\varepsilon\_0 = 0.005$; the resulting pseudospectra visualizations for $\varepsilon = 10^{-1}, 10^{-1.5}, 10^{-2}$ are depicted in Figure 2. For this example, we have opted not to introduce additional points.*

(**a**) Trajectories of 10 sequences converging to $\sigma(A)$.

(**b**) Additional interior points (red circles) and relevant trajectories (solid black lines).

(**c**) Pseudospectrum component in the upper left side for $\varepsilon = 1, 5, 10$.

**Figure 1.** Pseudospectrum computation for random $A \in \mathbb{C}^{6 \times 6}$ with spectral gaps, using 10 initial points.

**Figure 2.** Pseudospectra computations for a non–symmetric sparse matrix of order 1024 from the Harwell-Boeing collection and $\varepsilon = 10^{-1}, 10^{-1.5}, 10^{-2}$.

*Perron root computation.* Applications of non–negative matrices, i.e., matrices with exclusively non–negative real entries, abound in such diverse fields as probability theory, dynamical systems, Internet search engines, tournament matrices etc. In this context, the dominant eigenvalue of non–negative matrices, also referred to as *Perron root*, is of central importance. Localization of the Perron root has been extensively studied in the literature; relevant bounds can be found in [23–27]. Its computation is typically carried out using the power method; the convergence rate of this approach depends on the relative magnitudes of the two dominant eigenvalues. Relevant methods have appeared in [28–30], among others. As a second application of the approximating sequences (2), the following experiment reports an elegant way of approximating Perron roots.

**Application 3.** *For this experiment, we considered a tuple of 50 non–negative matrices $\{A\_\ell\}\_{\ell=1}^{50} \subset \mathbb{R}\_+^{500 \times 500}$ with uniformly distributed entries in $(0, 50)$. The symmetry of $\sigma(A\_\ell)$ with respect to the real axis suggests that it suffices to restrict the computations exclusively to the closed upper half–plane. Hence, for each of the matrices $A\_\ell$, we initiated the construction of the sequences (2) from equidistributed initial terms $\{\mu\_0^{\ell,j}\}\_{j=1}^{10} \subset \{z \in \mathbb{C} : |z| = 10^4 \cdot \|A\_\ell\|\_1,\ \operatorname{Im}(z) \ge 0\}$ ($\ell = 1, \dots, 50$). As expected, the rightmost of these points formed sequences converging to the Perron root of $A\_\ell$, while each of the remaining ones approximated some other peripheral eigenvalue. In the generic case, the magnitude of the second highest eigenvalue of $A\_\ell$ was much smaller than the Perron root. Figure 3 is illustrative of this separation; the blue curve traces the boundary of the numerical range of such a matrix, red points indicate its eigenvalues, while the cyan lines correspond to the trajectories of the constructed sequences. Denoting $\{\mu\_k^{\ell,j}\}\_{k=0}^{n\_{\ell,j}}$ ($j = 1, \dots, s\_\ell$) those sequences approximating the Perron root $\lambda\_\ell \in \sigma(A\_\ell)$ ($\ell = 1, \dots, 50$), the relative error $\frac{|\mu\_k^{\ell,j} - \lambda\_\ell|}{|\lambda\_\ell|}$ in each iteration $k = 0, 1, \dots, \min\_j(n\_{\ell,j})$ decreases rapidly, even though the initial points $\mu\_0^{\ell,j}$ ($j = 1, \dots, s\_\ell$) were chosen to be extremely remote from $\sigma(A\_\ell)$. Averages*

$$\frac{1}{50} \sum\_{\ell=1}^{50} \frac{1}{s\_{\ell}} \sum\_{j=1}^{s\_{\ell}} \frac{\left| \mu\_{k}^{\ell,j} - \lambda\_{\ell} \right|}{|\lambda\_{\ell}|}$$

*of these relative approximation errors over the tuple of matrices for the first k* = 1, 2, ... , 5 *iterations are demonstrated in the first column of Table 1, verifying that a reliable estimate for the Perron root may in the generic case be obtained after the computation of as few as 3 terms in the corresponding trajectories.*

*The remaining ($10 - s\_\ell$) sequences converge to some other peripheral eigenvalues $\lambda\_{\ell,1}, \dots, \lambda\_{\ell,s\_\ell} \in \sigma(A\_\ell)$, reasonable approximations of which require a rather larger number of iterations, as can be seen from the second column of Table 1.*

**Figure 3.** Indicative numerical range of a 500 × 500 non–negative matrix and 10 approximating trajectories.

**Table 1.** Relative approximation errors for Perron root and other peripheral eigenvalues of 500 × 500 non–negative matrices.


Application 3 suggests that any reasonable upper bound $\mu\_0 \in \mathbb{R}$ suffices to yield reliable estimates for the Perron root after computation of only 2–3 terms in the sequence (2).
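For instance, a hypothetical run under the assumptions of Application 3 might look as follows, reusing the `eigenvalue_estimate` sketch from earlier in this section and starting from the real upper bound $\|A\|\_1 \ge \rho(A)$; the seed and sizes are illustrative:

```python
import numpy as np

# Non-negative test matrix as in Application 3 (entries uniform in (0, 50)).
rng = np.random.default_rng(0)
A = rng.uniform(0.0, 50.0, size=(500, 500))

mu0 = np.linalg.norm(A, 1)                  # real upper bound >= Perron root
approx = eigenvalue_estimate(A, mu0, max_iter=3)   # 3 terms of sequence (2)

perron = np.max(np.linalg.eigvals(A).real)  # reference value for comparison
print(abs(approx - perron) / perron)        # relative approximation error
```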

The previous experiment may seem excessively optimistic. Indeed, there can be instances when the situation is much more demanding.

**Application 4.** *The Frank matrix is well known to have ill-conditioned eigenvalues. For this application, we test the behavior of the proposed method on the Frank matrix of order 32, the normalized matrix of eigenvectors of which has condition number $7.81 \times 10^{11}$. Figure 4 depicts the resulting pseudospectra visualizations for $\varepsilon = 0.001, 0.005, 0.01, 0.02, 0.03$, initiating the procedure from 30 points located on the upper semiellipse centered at $(40, 0)$ with semi–major and semi–minor axes lengths equal to 70 and 15, respectively. The depicted trajectories were constructed so that the final terms in each polygonal chain lie within $\sigma\_{0.001}(A)$. Then, according to the distances of the final terms of consecutive sequences, at most two additional points are introduced on the line segment connecting these respective final terms. The necessary iterations for the construction of the relevant sequences are reported in Table 2 for different numbers of initial points.*

*The approximating quality of the sequences is much compromised when compared to the generic case, requiring many more iterations, especially for the eigenvalues with the smallest real parts; these are also the most ill-conditioned ones. In fact, the seven rightmost sequences converging to the Perron root (refer to Figure 4) display the fastest convergence, the second group of thirteen sequences leading to the intermediate eigenvalues is somewhat more compromised, while the leftmost sequences naturally exhibit even more diminished approximation quality. Mean relative approximation errors for these three groups are reported in Table 3.*

**Figure 4.** Pseudospectra computation for the Frank matrix of order 32 and $\varepsilon = 0.001, 0.005, 0.01, 0.02, 0.03$. Additional points selected between the endpoints of the initial sequences are denoted by red circles, while eigenvalues are denoted by red stars.

**Table 2.** Number of iterations for different numbers of initial points ($\varepsilon\_0 = 0.001$).


**Table 3.** Relative approximation errors for Perron root and other eigenvalues of the Frank matrix of order 32.


For the numerical experiments in this section, we have restricted ourselves to initial points encircling the spectrum. Another option would be to use our method in tandem with randomization techniques for the selection of initial points.

#### **3. Matrix Polynomials**

The derivation of eigenvalue approximating sequences may be readily extended to account for the general matrix polynomial case

$$P(\lambda) = A\_m \lambda^m + A\_{m-1} \lambda^{m-1} + \dots + A\_1 \lambda + A\_0,$$

where $\lambda \in \mathbb{C}$ and $A\_j \in \mathbb{C}^{n \times n}$ ($j = 0, 1, \dots, m$), with $A\_m \ne 0$. Recall that the *spectrum* of $P(\lambda)$ is the set of all its *eigenvalues*; i.e., $\sigma(P) = \{\lambda \in \mathbb{C} : \det P(\lambda) = 0\}$. For a scalar $\lambda\_0 \in \sigma(P)$, the nonzero solutions $v\_0 \in \mathbb{C}^n$ of the system $P(\lambda\_0)v\_0 = 0$ are the *eigenvectors* of $P(\lambda)$ corresponding to $\lambda\_0$.

The $\varepsilon$–*pseudospectrum* of $P(\lambda)$ was introduced in [14] for a given parameter $\varepsilon > 0$ and a set of nonnegative weights $\mathbf{w} \in \mathbb{R}\_+^{m+1}$ as the set

$$\sigma\_{\varepsilon,\mathbf{w}}(P) = \left\{ \lambda \in \mathbb{C} : \det P\_{\Delta}(\lambda) = 0,\ \left\| \Delta\_{j} \right\| \le \varepsilon w\_{j},\ j = 0, 1, \dots, m \right\} \tag{3}$$

of eigenvalues of all admissible perturbations $P\_{\Delta}(\lambda)$ of $P(\lambda)$ of the form

$$P\_{\Delta}(\lambda) = (A\_m + \Delta\_m)\lambda^m + (A\_{m-1} + \Delta\_{m-1})\lambda^{m-1} + \dots + (A\_1 + \Delta\_1)\lambda + (A\_0 + \Delta\_0),$$

where the norms of the matrices $\Delta\_j \in \mathbb{C}^{n \times n}$ ($j = 0, 1, \dots, m$) satisfy the specified $(\varepsilon, \mathbf{w})$-related constraints. In contrast to the constant matrix case, a whole tuple of perturbing matrices $\Delta\_j$ is involved, which explains the presence of the additional parameter vector $\mathbf{w}$ in the definition of $\sigma\_{\varepsilon,\mathbf{w}}(P)$. However, considering for some $A \in \mathbb{C}^{n \times n}$ the pencil $P(\lambda) = I\_n\lambda - A$, note that (3) reduces to the usual $\varepsilon$–*pseudospectrum* of the matrix $A$ for the choice $\mathbf{w} = \{w\_0, w\_1\} = \{1, 0\}$, since

$$\sigma\_{\varepsilon,\{1,0\}}(P) = \{\lambda \in \mathbb{C} : \det(I\_n\lambda - (A + \Delta\_0)) = 0,\ \|\Delta\_0\| \le \varepsilon\} = \sigma\_{\varepsilon}(A).$$

In the general case, the nonnegative weights $\{w\_j\}\_{j=0}^{m}$ allow freedom in how perturbations are measured; for example, in an absolute sense when $w\_0 = w\_1 = \dots = w\_m = 1$, or in a relative sense when $w\_j = \|A\_j\|$ ($j = 0, 1, \dots, m$). On the other hand, the choice $\varepsilon = 0$ leads to $\sigma\_{0,\mathbf{w}}(P) = \sigma(P)$.

From a computational viewpoint, a more convenient characterization [14] (Lemma 2.1) for this set is given by

$$\sigma\_{\varepsilon,\mathbf{w}}(P) = \{ \lambda \in \mathbb{C} : s\_{\min}(P(\lambda)) \le \varepsilon q\_{\mathbf{w}}(|\lambda|) \},$$

where $s\_{\min}(P(\lambda))$ is the minimum singular value of the matrix $P(\lambda)$ and the scalar polynomial

$$q\_{\mathbf{w}}(\lambda) = w\_{m}\lambda^{m} + w\_{m-1}\lambda^{m-1} + \dots + w\_{1}\lambda + w\_{0}$$

is defined in terms of the weights $\{w\_j\}\_{j=0}^{m}$ used in the definition (3) of $\sigma\_{\varepsilon,\mathbf{w}}(P)$. In fact, since the eigenvalues of $P\_{\Delta}(\lambda)$ are continuous with respect to the entries of its coefficient matrices, the boundary of $\sigma\_{\varepsilon,\mathbf{w}}(P)$ satisfies

$$\partial \sigma\_{\varepsilon,\mathbf{w}}(P) \subseteq \{\lambda \in \mathbb{C} : s\_{\min}(P(\lambda)) = \varepsilon q\_{\mathbf{w}}(|\lambda|) \}; \tag{4}$$

the equality $s\_{\min}(P(\lambda)) = \varepsilon q\_{\mathbf{w}}(|\lambda|)$ is satisfied for some $\varepsilon > 0$ only for a finite number of points $\lambda \in \operatorname{int}(\sigma\_{\varepsilon,\mathbf{w}}(P))$.

Suppose now that we want to approximate an eigenvalue of a matrix polynomial which lies in the neighborhood of some point of interest $\mu\_0 \in \mathbb{C} \setminus \sigma(P)$ on the complex plane. Expression (4) suggests that the derivation of a convergent sequence in Section 2 may be readily adapted for our purposes. Indeed, for every scalar $\mu\_0 \in \mathbb{C}$, there exists some $\hat{\varepsilon}\_1 > 0$ such that $\mu\_0 \in \partial \sigma\_{\hat{\varepsilon}\_1,\mathbf{w}}(P)$, and then (4) implies $\hat{\varepsilon}\_1 = \frac{s\_{\min}(P(\mu\_0))}{q\_{\mathbf{w}}(|\mu\_0|)}$. Moreover, assuming $s\_{\min}(P(\mu\_0))$ is a simple singular value of the matrix $P(\mu\_0)$, we may invoke Theorem 1 to conclude that the function $g\_P : \mathbb{C} \to \mathbb{R}\_+$ with $g\_P(z) = s\_{\min}(P(z))$ is real analytic in a neighborhood of $\mu\_0 = x\_0 + \mathbf{i}y\_0$. In fact,

$$\nabla g\_P(x\_0 + \mathbf{i}y\_0) = \left( \operatorname{Re}\left( u\_{\min}^\* \frac{\partial P(x\_0 + \mathbf{i}y\_0)}{\partial x} v\_{\min} \right),\ \operatorname{Re}\left( u\_{\min}^\* \frac{\partial P(x\_0 + \mathbf{i}y\_0)}{\partial y} v\_{\min} \right) \right),$$

where $u\_{\min}$ and $v\_{\min}$ denote the left and right singular vectors of $P(\mu\_0)$ associated to $s\_{\min}(P(\mu\_0)) = \hat{\varepsilon}\_1 q\_{\mathbf{w}}(|\mu\_0|)$, respectively [13] (Corollary 4.2).

As in the constant matrix case, moving from the initial point $\mu\_0 \in \partial \sigma\_{\hat{\varepsilon}\_1,\mathbf{w}}(P)$ towards the interior of $\sigma\_{\hat{\varepsilon}\_1,\mathbf{w}}(P)$ in the normal direction to the curve $\partial \sigma\_{\hat{\varepsilon}\_1,\mathbf{w}}(P)$, the scalar

$$\mu\_1 = \mu\_0 - \hat{\varepsilon}\_1 \cdot \frac{\nabla[s\_{\min}(P(z)) - \hat{\varepsilon}\_1 q\_\mathbf{w}(|z|)]}{|\nabla[s\_{\min}(P(z)) - \hat{\varepsilon}\_1 q\_\mathbf{w}(|z|)]|}\bigg|\_{z = x\_0 + iy\_0}$$

with $\hat{\varepsilon}\_1 = \frac{s\_{\min}(P(\mu\_0))}{q\_{\mathbf{w}}(|\mu\_0|)}$ can be considered to be an estimate of some eigenvalue $\lambda \in \sigma(P)$. In this way, a convergent sequence $\{\mu\_k\}\_{k \in \mathbb{N}}$ converging to the eigenvalue $\lambda \in \sigma(P)$ is recursively defined with initial point $\mu\_0$ and general term

$$\mu\_k = \mu\_{k-1} - \frac{s\_{\min}(P(\mu\_{k-1}))}{q\_{\mathbf{w}}(|\mu\_{k-1}|)} \cdot \frac{\nabla[s\_{\min}(P(z)) - \hat{\varepsilon}\_{k-1}\, q\_{\mathbf{w}}(|z|)]}{|\nabla[s\_{\min}(P(z)) - \hat{\varepsilon}\_{k-1}\, q\_{\mathbf{w}}(|z|)]|} \bigg|\_{z = \mu\_{k-1} = x\_{k-1} + \mathbf{i}y\_{k-1}} \tag{5}$$
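A sketch of the recursion (5) follows. Since the singular-vector formula for the full gradient $\nabla[s\_{\min}(P(z)) - \varepsilon q\_{\mathbf{w}}(|z|)]$ involves the partial derivatives of $P$ displayed above, the sketch below substitutes central finite differences for it; the step $h$ and all names are our illustrative choices, and coefficients are passed in ascending order $A\_0, \dots, A\_m$:

```python
import numpy as np

def polynomial_eigenvalue_estimate(coeffs, w, mu0, tol=1e-10,
                                   max_iter=100, h=1e-7):
    """Iterate (5) for P(lambda) = A_m lambda^m + ... + A_1 lambda + A_0.
    coeffs = [A_0, ..., A_m] (n x n arrays), w = [w_0, ..., w_m] (weights)."""
    P = lambda z: sum(Aj * z**j for j, Aj in enumerate(coeffs))
    smin = lambda z: np.linalg.svd(P(z), compute_uv=False)[-1]
    q = lambda r: sum(wj * r**j for j, wj in enumerate(w))
    mu = complex(mu0)
    for _ in range(max_iter):
        s = smin(mu)
        if s <= tol:
            break
        eps = s / q(abs(mu))          # level: mu lies on boundary of sigma_{eps,w}(P)
        f = lambda z: smin(z) - eps * q(abs(z))
        gx = (f(mu + h) - f(mu - h)) / (2 * h)            # partial in x
        gy = (f(mu + 1j * h) - f(mu - 1j * h)) / (2 * h)  # partial in y
        grad = complex(gx, gy)        # gradient identified with a complex number
        mu -= eps * grad / abs(grad)  # step of length eps in the normal direction
    return mu
```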
