Probabilistic Perturbation Bounds for Invariant, Deflating and Singular Subspaces

Petkov, Petko H.

doi:10.3390/axioms13090597


by

Petko H. Petkov

Department of Engineering Sciences, Bulgarian Academy of Sciences, 1040 Sofia, Bulgaria

Axioms 2024, 13(9), 597; https://doi.org/10.3390/axioms13090597

Submission received: 22 July 2024 / Revised: 17 August 2024 / Accepted: 21 August 2024 / Published: 2 September 2024

(This article belongs to the Special Issue New Trends in Discrete Probability and Statistics)


Abstract

In this paper, we derive new probabilistic bounds on the sensitivity of invariant subspaces, deflating subspaces and singular subspaces of matrices. The analysis exploits a unified method for deriving asymptotic perturbation bounds of the subspaces of interest and utilizes probabilistic approximations of the entries of random perturbation matrices based on the Markoff inequality. As a result of the analysis, we determine, with a prescribed probability, asymptotic perturbation bounds on the angles between the corresponding perturbed and unperturbed subspaces. It is shown that the proposed probabilistic asymptotic bounds are significantly less conservative than the corresponding deterministic perturbation bounds. The results obtained are illustrated by examples comparing the known deterministic perturbation bounds with the new probabilistic bounds.

Keywords:

perturbation analysis; probabilistic bounds; invariant subspaces; deflating subspaces; singular subspaces

MSC:

47A55; 15A18; 65F15; 65F25

1. Introduction

In this paper, we are concerned with the derivation of realistic perturbation bounds on the sensitivity of various important subspaces arising in matrix analysis. Such bounds are especially needed in the perturbation analysis of high-dimensional subspaces, where the known bounds may produce very pessimistic results. We show that much tighter bounds on the subspace sensitivity can be obtained by using a probabilistic approach based on the Markoff inequality.

The sensitivity of invariant, deflation and singular subspaces of matrices is considered in detail in the fundamental book of Stewart and Sun [1], as well as in the surveys of Bhatia [2] and Li [3]. In particular, perturbation analysis of the eigenvectors and invariant subspaces of matrices affected by deterministic and random perturbations is presented in several papers and books, see for instance [4,5,6,7,8,9,10,11,12]. Survey [13] is entirely devoted to the asymptotic (first-order) perturbation analysis of eigenvalues and eigenvectors. The algorithmic and software problems in computing invariant subspaces are discussed in [14]. The sensitivity of deflating subspaces arising in the generalized Schur decomposition is considered in [15,16,17,18], and numerical algorithms for analyzing this sensitivity are presented in [19]. Bounds on the sensitivity of singular values and singular spaces of matrices that are subject to random perturbations are derived in [20,21,22,23,24,25,26], to name a few. The stochastic matrix theory that can be used in case of stochastic perturbations is developed in the papers of Stewart [27], and Edelman and Rao [28].

In [29], the author proposed new componentwise perturbation bounds of unitary and orthogonal matrix decompositions based on probabilistic approximations of the entries of random perturbation matrices obtained with the Markoff inequality. It was shown that using such bounds it is possible to decrease significantly the asymptotic perturbation bounds of the corresponding similarity or equivalence transformation matrices. Based on the probabilistic asymptotic estimates of the entries of random perturbation matrices presented in [29], in this paper we derive new probabilistic bounds on the sensitivity of invariant subspaces, deflating subspaces and singular subspaces of matrices. The analysis and the examples given demonstrate that, in contrast to the known deterministic bounds, the probabilistic bounds are much tighter with a sufficiently high probability. The analysis exploits a unified method for deriving asymptotic perturbation bounds of the subspaces of interest, developed in [30,31,32,33,34], and utilizes probabilistic approximations of the entries of random perturbation matrices obtained with the Markoff inequality. As a result of the analysis, we determine, with a prescribed probability, asymptotic perturbation bounds on the angles between the perturbed and unperturbed subspaces. It is proved that the new probabilistic asymptotic bounds are significantly less conservative than the corresponding deterministic perturbation bounds. The results obtained are illustrated by examples comparing the deterministic perturbation bounds derived by Stewart [16,35] and Sun [11,18] with the probabilistic bounds derived in this paper.

The paper is structured as follows. In Section 2, we briefly present the main results concerning the derivation of bounds of lower magnitude on the entries of a random matrix using only its Frobenius norm. In Sections 3, 4 and 5, we apply this approach to derive probabilistic perturbation bounds for the invariant, deflating and singular subspaces of matrices, respectively. We illustrate the theoretical results by examples demonstrating that the probabilistic bounds for such subspaces are much tighter than the corresponding deterministic asymptotic bounds. We note that the known deterministic bounds for the invariant, deflating and singular subspaces are presented briefly as theorems without proof, solely for comparison with the new bounds.

All computations in the paper are performed with MATLAB® Version 9.9 (R2020b) [36] using IEEE double-precision arithmetic. M-files implementing the perturbation bounds described in the paper can be obtained from the author.

2. Probabilistic Bounds for Random Matrices

Consider an $m \times n$ random matrix, $δ A$, with uncorrelated elements. In the componentwise perturbation analysis of matrix decompositions, we have to use a matrix bound $Δ A = [Δ a_{i j}]$, $Δ a_{i j} > 0$, so that $| δ A | ⪯ Δ A$, i.e.,

| δ a_{i j} | \leq Δ a_{i j}, i = 1, 2, \dots, m, j = 1, 2, \dots, n,

(1)

where $Δ a_{i j} = ∥ δ A ∥$ and $∥ \cdot ∥$ is some matrix norm. However, if for instance we use the Frobenius norm of $δ A$, we have that $Δ a_{i j} = {∥ δ A ∥}_{F}$ and

{∥ Δ A ∥}_{F} = \sqrt{m n} {∥ δ A ∥}_{F},

which produces very pessimistic results for large $m$ and $n$. To reduce ${∥ Δ A ∥}_{F}$, it is proposed in [29] to decrease the entries of $Δ A$ by taking a bound with entries $Δ a_{i j} = {∥ δ A ∥}_{F} / Ξ$, where $Ξ > 1$. Of course, in the general case, such a bound will not satisfy (1) for all $i$ and $j$. However, we can allow some entries, $δ a_{i j}$, of the perturbation $δ A$ to exceed in magnitude, with some prescribed probability, the corresponding bound, $Δ a_{i j}$. This probability can be determined by the Markoff inequality ([37], Section 5-4)

P {ξ \geq a} \leq \frac{E {ξ}}{a},

(2)

where $P {ξ \geq a}$ is the probability that the nonnegative random variable $ξ$ is greater than or equal to a given number, $a$, and $E {ξ}$ is the average (or mean value) of $ξ$. Note that this inequality is valid for an arbitrary distribution of $ξ$, which makes it conservative for a specific probability distribution. Applying the Markoff inequality, with $ξ$ equal to the entry $| δ a_{i j} |$ and $a$ equal to the corresponding bound $Δ a_{i j}$, we obtain the following result [29].

Theorem 1.

For an

m \times n

random perturbation,

δ A

, and a desired probability

0 < P^{r e f} < 1

, the estimate

Δ A = [Δ a_{i j}]

, where

Δ a_{i j} = \frac{{∥ δ A ∥}_{F}}{Ξ},

and

Ξ = (1 - P^{r e f}) \sqrt{m n},

(3)

satisfies the inequality

P {| δ a_{i j} | < Δ a_{i j}} \geq P^{r e f}, i = 1, 2, \dots, m, j = 1, 2, \dots, n .
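A brief sketch of how the scaling factor (3) can arise from (2) may be helpful here. It rests on an assumption that is not stated explicitly above: the mean $E {| δ a_{i j} |}$ is identified with the average magnitude of the entries of $δ A$, which by the Cauchy-Schwarz inequality does not exceed ${∥ δ A ∥}_{F} / \sqrt{m n}$. The full argument is given in [29]; the lines below are only an outline under that assumption.

```latex
\begin{aligned}
E\{|\delta a_{ij}|\} &\approx \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}|\delta a_{ij}|
  \le \frac{\|\delta A\|_F}{\sqrt{mn}}, \\
P\{|\delta a_{ij}| \ge \Delta a_{ij}\} &\le \frac{E\{|\delta a_{ij}|\}}{\Delta a_{ij}}
  \le \frac{\|\delta A\|_F/\sqrt{mn}}{\|\delta A\|_F/\Xi}
  = \frac{\Xi}{\sqrt{mn}} = 1 - P^{ref}, \\
P\{|\delta a_{ij}| < \Delta a_{ij}\} &\ge P^{ref}.
\end{aligned}
```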

Theorem 1 allows us to decrease the mean value of the bound, $Δ A$, and hence the magnitude of its entries, by the factor $Ξ$, choosing the desired probability, $P^{ref}$, less than 1. The value $P^{ref} = 1$ corresponds to the case of the deterministic bound, $Δ A$, with entries $Δ a_{i j} = {∥ δ A ∥}_{F}$, when all entries of $Δ A$ are larger than or equal to the magnitudes of the corresponding entries of the perturbation, $δ A$. The value $P^{ref} < 1$ corresponds to $Δ a_{i j} = {∥ δ A ∥}_{F} / Ξ$, where $Ξ > 1$. As mentioned above, the probability bound produced by the Markoff inequality is very conservative, with the actual results being much better than the results predicted by the probability, $P^{ref}$. This is due to the fact that the Markoff inequality is valid for the worst possible distribution of the random variable $ξ$.

According to Theorem 1, using the scaling factor (3) guarantees that the inequality $| δ a_{i j} | < Δ a_{i j}$ holds for each $i$ and $j$ with a probability no less than $P^{ref}$. Since the entries of $δ A$ are uncorrelated, this means that, for sufficiently large $m$ and $n$, the number $P^{ref}$, expressed as a percentage, also gives a lower bound on the relative number of the entries that satisfy the above inequality.
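The statement above can be checked empirically. The following is a minimal Python sketch (standard library only) rather than the author's MATLAB code; the function names, the seed and the choice of standard normal entries are assumptions made for illustration. It computes $Ξ$ from (3) and counts the fraction of entries of a random matrix that stay below the bound ${∥ δ A ∥}_{F} / Ξ$.

```python
import math
import random


def scaling_factor(p_ref, m, n):
    """Scaling factor Xi from (3): Xi = (1 - P_ref) * sqrt(m * n)."""
    return (1.0 - p_ref) * math.sqrt(m * n)


def fraction_within_bound(p_ref, m, n, seed=0):
    """Fraction of entries of a random m-by-n matrix with |da_ij| < ||dA||_F / Xi."""
    rng = random.Random(seed)
    # Hypothetical perturbation: independent standard normal entries.
    entries = [rng.gauss(0.0, 1.0) for _ in range(m * n)]
    fro_norm = math.sqrt(sum(e * e for e in entries))
    bound = fro_norm / scaling_factor(p_ref, m, n)  # entry bound Delta a_ij
    return sum(1 for e in entries if abs(e) < bound) / (m * n)


if __name__ == "__main__":
    for p_ref in (0.9, 0.8, 0.6, 0.4):
        xi = scaling_factor(p_ref, 100, 100)
        frac = fraction_within_bound(p_ref, 100, 100)
        print(f"P_ref = {p_ref:.1f}  Xi = {xi:.2f}  observed fraction = {frac:.4f}")
```

For normally distributed entries, the observed fraction is expected to be well above $P^{ref}$, which is consistent with the remark that the Markoff inequality is conservative.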

In some cases, for instance in the perturbation analysis of the Singular Value Decomposition, tighter perturbation bounds are obtained if instead of the norm

{∥ δ A ∥}_{F}

we use the spectral norm

{∥ δ A ∥}_{2}

. The following result is an analogue of Theorem 1 that allows us to use

{∥ δ A ∥}_{2}

at the price of producing smaller values of

Ξ

.

Theorem 2.

For an

m \times n

random perturbation,

δ A

, and a desired probability

0 < P^{r e f} < 1

, the estimate

Δ A = [Δ a_{i j}]

, where

Δ a_{i j} = \frac{{∥ δ A ∥}_{2}}{Ξ_{2}},

and

Ξ_{2} = (1 - P^{r e f}) \sqrt{m},

(4)

satisfies the inequality

P {| δ a_{i j} | < Δ a_{i j}} \geq P^{r e f}, i = 1, 2, \dots, m, j = 1, 2, \dots, n .

(5)

This result follows directly from Theorem 1, replacing

{∥ δ A ∥}_{F}

by its upper bound

\sqrt{n} {∥ δ A ∥}_{2}

.
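Spelled out, the substitution reads as follows. The lines use only the inequality ${∥ δ A ∥}_{F} \leq \sqrt{n} {∥ δ A ∥}_{2}$ quoted above and the factor $Ξ$ from (3); they are meant as a short check of (4), not as a replacement for the proof in [29].

```latex
\Delta a_{ij} = \frac{\|\delta A\|_F}{\Xi}
\le \frac{\sqrt{n}\,\|\delta A\|_2}{(1 - P^{ref})\sqrt{mn}}
= \frac{\|\delta A\|_2}{(1 - P^{ref})\sqrt{m}}
= \frac{\|\delta A\|_2}{\Xi_2}.
```

Since the entry bound of Theorem 2 is never smaller than the entry bound of Theorem 1, the probability guarantee of Theorem 1 carries over to (5).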

Since ${∥ δ A ∥}_{2}$ is frequently of the order of ${∥ δ A ∥}_{F}$, for large $n$ Theorem 2 may produce pessimistic results, in the sense that the actual probability of fulfilling the inequality $| δ a_{i j} | < Δ a_{i j}$ is much larger than the value predicted by (5).

In several instances of the perturbation analysis, we have to determine a bound on the elements of the vector

x = M f,

(6)

where M is a given matrix and

f \in R^{p}

is a random vector with a known probabilistic bound on the elements. In accordance with (6), we have that the following deterministic asymptotic (linear) componentwise bound is valid,

| x_{ℓ} | \leq x_{ℓ}^{l i n} : = ∥ M_{ℓ, 1 : p} ∥_{2} {∥ f ∥}_{2}, ℓ = 1, 2, \dots, p .

(7)

A probability bound on

| x_{ℓ} |, ℓ = 1, 2, \dots, p

can be determined by the following theorem [29].

Theorem 3.

If the estimate of the parameter vector x is chosen as

x^{e s t} = | M f | / Ξ

, where Ξ is determined according to

Ξ = \sqrt{p} (1 - P^{r e f}),

(8)

then

P {| x_{ℓ} | \leq ∥ x^{e s t} ∥_{2}} \geq P^{r e f} .

(9)

Since

∥ x^{e s t} ∥_{2} \leq {∥ M ∥}_{2} {∥ f ∥}_{2} / Ξ,

the inequality (9) shows that the probability estimate of the component

| x_{ℓ} |

can be determined if in the linear estimate (7) we replace the perturbation norm

{∥ f ∥}_{2}

by the probability estimate

{∥ f ∥}_{2} / Ξ

, where the scaling factor,

Ξ

, is taken as shown in (8) for a specified probability,

P^{r e f}

. In this way, instead of the linear estimate,

x_{ℓ}^{l i n}

, we obtain the probabilistic estimate

x_{ℓ}^{e s t} = x_{ℓ}^{l i n} / Ξ = ∥ M_{ℓ, 1 : p} ∥_{2} {∥ f ∥}_{2} / Ξ, ℓ = 1, 2, \dots, p .
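The passage from the linear estimate (7) to the probabilistic estimate can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the matrix (the identity of order 100) and the norm of $f$ are placeholder values chosen so that the result is easy to verify by hand.

```python
import math


def linear_bounds(M, f_norm):
    """Deterministic bounds (7): x_lin[l] = ||M[l, :]||_2 * ||f||_2."""
    return [math.sqrt(sum(abs(v) ** 2 for v in row)) * f_norm for row in M]


def probabilistic_bounds(M, f_norm, p_ref):
    """Probabilistic estimates x_est[l] = x_lin[l] / Xi, with Xi taken from (8)."""
    p = len(M)
    xi = math.sqrt(p) * (1.0 - p_ref)
    return [b / xi for b in linear_bounds(M, f_norm)]


# Placeholder data: M is the identity of order p = 100, ||f||_2 = 1e-10.
p = 100
M = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
x_lin = linear_bounds(M, 1.0e-10)
x_est = probabilistic_bounds(M, 1.0e-10, p_ref=0.8)  # Xi = sqrt(100) * 0.2 = 2
```

Note that the estimate is smaller than the linear bound only when $\sqrt{p} (1 - P^{ref}) > 1$, which holds for sufficiently large $p$.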

3. Perturbation Bounds for Invariant Subspaces

3.1. Problem Statement

Let

U^{H} A U = [\begin{matrix} T_{11} & T_{12} \\ 0 & T_{22} \end{matrix}]

(10)

be the Schur decomposition of the matrix

A \in C^{n \times n}

, where

T_{11} \in C^{k \times k}

contains a given group of the eigenvalues of A ([38], Section 2.3). The matrix, U, of the unitary similarity transformation can be partitioned as

U = [U_{1}, U_{2}], U_{1} \in C^{n \times k},

where the columns of

U_{1}

are the basis vectors of the invariant subspace,

X

, associated with the eigenvalues of the block

T_{11}

, and

U_{2} \in C^{n \times (n - k)}

is the unitary complement of

U_{1}

,

U_{2}^{H} U_{1} = 0

. The invariant subspace satisfies the relation

A X \subset X

. Note that the eigenvalues of A can be reordered in the desired way on the diagonal of T (and hence of

T_{11}

) using unitary similarity transformations ([39], Chapter 7).

The invariant subspace,

X

, is called simple if the matrices

T_{11}

and

T_{22}

have no eigenvalues in common.

If matrix A is subject to a perturbation,

δ A

, then, instead of the decomposition (10), we have the decomposition

{\tilde{U}}^{H} \tilde{A} \tilde{U} = [\begin{matrix} {\tilde{T}}_{11} & {\tilde{T}}_{12} \\ 0 & {\tilde{T}}_{22} \end{matrix}], {\tilde{T}}_{11} \in C^{k \times k}, \tilde{A} = A + δ A

(11)

with a perturbed matrix of the unitary transformation

\tilde{U} = [{\tilde{U}}_{1}, {\tilde{U}}_{2}] .

The columns of matrix

{\tilde{U}}_{1}

are basis vectors of the perturbed invariant subspace,

\tilde{X}

. We shall assume that matrix A has distinct eigenvalues, so that $X$ is a simple invariant subspace; this ensures that the perturbations $δ U = \tilde{U} - U$ and $δ T = \tilde{T} - T$ remain finite for small perturbations of A.

Let

X = R (U_{X})

and

Y = R (U_{Y})

be two subspaces of dimension k, where

rank (U_{X}) = rank (U_{Y}) = k

. The distance between

X

and

Y

can be characterized by the gap between these subspaces, defined as [11]

{gap}_{F} (X, Y) = \frac{1}{\sqrt{2}} {∥ P_{X} - P_{Y} ∥}_{F},

(12)

where

P_{X}

and

P_{Y}

are the orthogonal projections onto

X

and

Y

, respectively. Further on, we shall measure the sensitivity of an invariant subspace of dimension k by the canonical angles

Θ_{1} \geq Θ_{2} \geq \dots \geq Θ_{k} \geq 0

between the perturbed and unperturbed subspaces ([35], Chapter 4). The maximum angle

Θ_{\max} = Θ_{1}

is related to the value of the gap between

\tilde{X}

and

X

by the relationship [40]

Θ_{\max} \leq \arcsin ({gap}_{F} (\tilde{X}, X)) .

(13)

We note that the maximum angle between

\tilde{X}

and

X

can be computed efficiently from [41]

Θ_{\max} (\tilde{X}, X) = \arcsin (∥ U_{2}^{H} {\tilde{U}}_{1} ∥_{2}) .

(14)
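For one-dimensional subspaces of the plane, relations (12)-(14) can be verified directly. The sketch below is an illustration under simplifying assumptions (real vectors, $k = 1$, $n = 2$, and a hypothetical rotation angle); in this special case the bound (13) holds with equality.

```python
import math


def projector(u):
    """Orthogonal projector u u^T onto span{u} for a real unit vector u."""
    return [[ui * uj for uj in u] for ui in u]


def gap_f(u, v):
    """gap_F from (12) for the one-dimensional subspaces span{u} and span{v}."""
    pu, pv = projector(u), projector(v)
    size = len(u)
    fro2 = sum((pu[i][j] - pv[i][j]) ** 2 for i in range(size) for j in range(size))
    return math.sqrt(fro2 / 2.0)


theta = 0.3                                   # hypothetical angle in radians
u1 = [1.0, 0.0]                               # basis of the unperturbed subspace
u2 = [0.0, 1.0]                               # basis of its orthogonal complement
u1_pert = [math.cos(theta), math.sin(theta)]  # basis of the perturbed subspace

# Formula (14): the angle from the norm of U_2^H * U_1_perturbed.
angle_from_14 = math.asin(abs(sum(a * b for a, b in zip(u2, u1_pert))))
# Right-hand side of (13): the arcsine of the gap.
angle_from_gap = math.asin(gap_f(u1_pert, u1))
```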

3.2. $\mathrm{sep}$-Based Global Bound

Define

E = U^{H} δ A U = [\begin{matrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{matrix}], E_{11} \in C^{k \times k}, E_{22} \in C^{(n - k) \times (n - k)}

and

M = I_{k} \otimes T_{22} - T_{11}^{T} \otimes I_{n - k},

where ⊗ denotes the Kronecker product ([42], Chapter 4). The norm of the matrix

M^{- 1}

is closely related to the separation between two matrices. The separation between given matrices

A \in C^{n \times n}

and

B \in C^{m \times m}

characterizes the distance between the spectra

λ (A)

and

λ (B)

and is defined as

sep (A, B) = \min_{X \neq 0} \frac{{∥ A X - X B ∥}_{F}}{{∥ X ∥}_{F}} .

(15)

Note that

sep (A, B) = 0

, if and only if A and B have eigenvalues in common.

In the given case, the separation

sep (T_{11}, T_{22}) = \min_{X \neq 0} \frac{∥ T_{22} X - X T_{11} ∥_{F}}{{∥ X ∥}_{F}}

(16)

between the two blocks

T_{11}

and

T_{22}

can be determined from

sep (T_{11}, T_{22}) = σ_{\min} (M)

(17)

which is equivalent to

∥ M^{- 1} ∥_{2} = 1 / sep (T_{11}, T_{22}) .
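Relation (17) is easy to inspect when both blocks are diagonal, because the matrix M is then diagonal as well and its singular values are the moduli of its diagonal entries. The following Python sketch builds M from the Kronecker products for two hypothetical diagonal blocks; it is restricted to this special case and is not a general-purpose routine for computing the separation.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def sylvester_matrix(T11, T22):
    """M = I_k (x) T22 - T11^T (x) I_{n-k}, as defined above."""
    k, r = len(T11), len(T22)
    T11_t = [list(col) for col in zip(*T11)]
    left = kron(identity(k), T22)
    right = kron(T11_t, identity(r))
    return [[a - b for a, b in zip(row_l, row_r)] for row_l, row_r in zip(left, right)]


# Hypothetical diagonal blocks, chosen only for illustration.
T11 = [[1.0, 0.0], [0.0, 2.0]]
T22 = [[4.0, 0.0, 0.0], [0.0, 7.0, 0.0], [0.0, 0.0, 2.5]]

M = sylvester_matrix(T11, T22)
size = len(M)
is_diagonal = all(M[i][j] == 0.0 for i in range(size) for j in range(size) if i != j)
# For a diagonal M, the smallest singular value is the smallest modulus on the diagonal.
sep = min(abs(M[i][i]) for i in range(size))
# Smallest distance between an eigenvalue of T11 and an eigenvalue of T22.
eig_gap = min(abs(mu - lam) for lam in (1.0, 2.0) for mu in (4.0, 7.0, 2.5))
```

For non-normal blocks the separation can be much smaller than the minimum distance between the eigenvalues, so the equality of `sep` and `eig_gap` seen here should not be expected in general.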

Assume that the spectra of

T_{11}

and

T_{22}

are disjoint, so that

sep (T_{11}, T_{22}) > 0

. Then the following theorem gives an estimate of the sensitivity of an invariant subspace of A.

Theorem 4

([16]). Let matrix A be decomposed, as in (10). Given the perturbation, E, set

\begin{matrix} γ & = & ∥ E_{21} ∥_{2}, \\ η & = & ∥ T_{12} + E_{12} ∥_{2}, \\ δ & = & sep (T_{11}, T_{22}) - ∥ E_{11} ∥_{2} - {∥ E_{22} ∥}_{2} . \end{matrix}

If

δ > 0

and

\frac{γ η}{δ^{2}} < \frac{1}{4},

(18)

then there is a unique matrix, P, satisfying

{∥ P ∥}_{2} \leq \frac{2 γ}{δ + \sqrt{δ^{2} - 4 γ η}} < 2 \frac{γ}{δ},

(19)

such that the columns of

{\tilde{U}}_{1} = (U_{1} + U_{2} P) {(I_{k} + P^{H} P)}^{- 1 / 2}

span a right invariant subspace of

A + δ A

.

It may be shown that the singular values of the matrix

U_{2}^{H} {\tilde{U}}_{1} = P {(I_{k} + P^{H} P)}^{- 1 / 2}

are the sines of the canonical angles

Θ_{1}, Θ_{2}, \dots, Θ_{k}

between the invariant subspaces

\tilde{X}

and

X

. Consequently, if P has singular values

σ_{1}, σ_{2}, \dots, σ_{k}

, then the singular values of

I_{k} + P^{H} P

are

1 + σ_{1}^{2}, 1 + σ_{2}^{2}, \dots, 1 + σ_{k}^{2}

and

\frac{σ_{i}}{\sqrt{1 + σ_{i}^{2}}} = \sin (Θ_{i}) .

Hence

σ_{i} = \tan (Θ_{i}), i = 1, 2, \dots, k .

Thus, Theorem 4 bounds the tangents of the canonical angles between the perturbed,

\tilde{X}

, and unperturbed,

X

, invariant subspaces of A. The maximum canonical angle satisfies

Θ_{\max} (\tilde{X}, X) \leq \arctan (\frac{2 γ}{δ + \sqrt{δ^{2} - 4 γ η}}) .

(20)
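The bound (20) is straightforward to evaluate once the norms in Theorem 4 are available. The Python sketch below is a hedged illustration: the function name and the numerical inputs are hypothetical, and the function returns `None` whenever the conditions $δ > 0$ and (18) are not met, since the theorem then gives no bound.

```python
import math


def sep_based_angle_bound(sep, norm_e11, norm_e22, norm_e21, norm_t12_plus_e12):
    """Evaluate the bound (20) on the maximum canonical angle, or return None."""
    gamma = norm_e21
    eta = norm_t12_plus_e12
    delta = sep - norm_e11 - norm_e22
    if delta <= 0.0 or gamma * eta / delta ** 2 >= 0.25:
        return None  # the assumptions of Theorem 4 are violated
    p_bound = 2.0 * gamma / (delta + math.sqrt(delta ** 2 - 4.0 * gamma * eta))
    return math.atan(p_bound)  # the bound on ||P||_2 is a bound on tan(Theta_max)


# Hypothetical values: sep = 1e-2, blocks of E of norm 1e-8, ||T12 + E12||_2 = 1.
angle_bound = sep_based_angle_bound(1e-2, 1e-8, 1e-8, 1e-8, 1.0)
# A perturbation as large as the separation leaves delta = 0, so no bound is returned.
no_bound = sep_based_angle_bound(1e-2, 1e-2, 0.0, 1e-8, 1.0)
```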

3.3. Perturbation Expansion Bound

A global perturbation bound for invariant subspaces is derived by Sun [11,40] using the perturbation expansion method. The essence of this method is to expand the perturbed basis

{\tilde{U}}_{X}

in infinite series in the powers of

E_{11}, E_{12}, E_{21}, E_{22}

and then estimate the series sum.

Theorem 5

([11]). Let matrix A be decomposed, as in (10). Given the perturbation

E = U^{H} δ A U

, set

\begin{matrix} β & = & ∥ E_{11} ∥_{2} + ∥ E_{22} ∥_{2}, β_{1} = ∥ E_{12} ∥_{F}, β_{2} = {∥ E_{21} ∥}_{F}, \\ α_{12} & = & ∥ T_{12} ∥_{2}, δ = sep (T_{11}, T_{22}), \\ γ_{1} & = & \frac{β_{2}}{δ}, γ_{2} = \frac{β γ_{1} + α_{12} γ_{1}^{2}}{δ} \end{matrix}

and

γ_{m} = \frac{β γ_{m - 1} + β_{1} \sum_{j = 1}^{m - 2} γ_{m - 1 - j} γ_{j} + α_{12} \sum_{j = 1}^{m - 1} γ_{m - j} γ_{j}}{δ} .

(21)

Then the simple invariant subspace, $\tilde{X}$, of the matrix $A + δ A$ has the following $q$th-order perturbation estimate for any natural number, $q$:

d_{F} (\tilde{X}, X) \leq \sum_{m = 1}^{q} γ_{m} + O ({∥ δ A ∥}_{F}^{q + 1}) .

(22)

Theorem 5 can be used to estimate the canonical angles between the perturbed and unperturbed invariant subspaces. Taking into account (13), we obtain the bound

Θ_{\max} (\tilde{X}, X) \leq \arcsin (\sum_{m = 1}^{q} γ_{m}) .

(23)

Using Theorem 5 to estimate the sensitivity of an invariant subspace shows that the bound (23) tends to overestimate severely the true value of the angle

Θ_{\max}

for

q > 1

and large n. In practice, it is possible to obtain reasonable results if we use only the first-order term (

q = 1

) in the expansion (23), i.e., if we use the linear bound

Θ_{\max} (\tilde{X}, X) \leq \arcsin (γ_{1}) = \arcsin (\frac{∥ E_{21} ∥_{F}}{δ}) .

(24)
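The recursion (21) of Theorem 5 can be written down directly. The following Python sketch uses a hypothetical function name and made-up input values; it only evaluates the coefficients $γ_{m}$ as they are defined above and does not reproduce the computations of the paper.

```python
def gamma_sequence(q, beta, beta1, beta2, alpha12, delta):
    """Return [gamma_1, ..., gamma_q] following the definitions in Theorem 5."""
    g = [0.0] * (q + 1)  # g[m] stores gamma_m; g[0] is unused
    for m in range(1, q + 1):
        if m == 1:
            g[1] = beta2 / delta
            continue
        # Sum over j = 1, ..., m-2 of gamma_{m-1-j} * gamma_j (empty for m = 2).
        s1 = sum(g[m - 1 - j] * g[j] for j in range(1, m - 1))
        # Sum over j = 1, ..., m-1 of gamma_{m-j} * gamma_j.
        s2 = sum(g[m - j] * g[j] for j in range(1, m))
        g[m] = (beta * g[m - 1] + beta1 * s1 + alpha12 * s2) / delta
    return g[1:]


# Hypothetical inputs, chosen so that the terms are easy to check by hand.
gammas = gamma_sequence(3, beta=0.2, beta1=0.1, beta2=0.1, alpha12=2.0, delta=1.0)
```

For $m = 2$ the recursion reduces to the expression for $γ_{2}$ given in the theorem, so a single loop covers all orders.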

3.4. Bound by the Splitting Operator Method

The essence of the splitting operator method for perturbation analysis of matrix problems [31] consists in deriving perturbation bounds on $| δ U |$ and $| δ T |$ separately. To this end, we introduce the perturbation parameter vector

x = vec (Low (δ W)) \in C^{p}, p = n (n - 1) / 2,

where the components of x are the entries of the strictly lower triangular part of the matrix

δ W = U^{H} δ U

. This vector is then used to find bounds on the various elements of the Schur decomposition.

Let

F = - U^{H} (δ A) U

and construct the vector

f = vec (Low (F)) \in C^{p} .

The equation for the perturbation parameters represents a linear system of equations [32]

M x = f + Δ^{x},

(25)

where

M \in C^{p \times p}

is a matrix whose elements are determined from the entries of T, and the components of the vector

Δ^{x} \in C^{p}

contain higher-order terms in the perturbations

δ u_{i}, i = 1, 2, \dots, n

. Specifically, matrix M is determined by

M = Ω (I_{n} \otimes T - T^{T} \otimes I_{n}) Ω^{T} \in C^{p \times p},

(26)

where

\begin{matrix} Ω & : = & [diag (Ω_{1}, Ω_{2}, \dots, Ω_{n - 1}), 0_{p \times n}] \in R^{p \times n^{2}}, \\ Ω_{k} & : = & [0_{(n - k) \times k}, I_{n - k}] \in R^{(n - k) \times n}, k = 1, 2, \dots, n - 1 . \end{matrix}

(27)

Note that matrix M is non-singular although the matrix

I_{n} \otimes T - T^{T} \otimes I_{n}

is not of full rank.
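The action of the selection matrix $Ω$ in (27) can be made concrete with a short Python sketch. Each row of $Ω$ contains a single unit entry that picks one strictly lower triangular element from the column-wise vectorization of an $n \times n$ matrix. The function names are hypothetical, and the index formula is the one used later in this section for the parameters $x_{ℓ}$.

```python
def omega_selected_columns(n):
    """For each row of Omega in (27), the 1-based column holding its unit entry."""
    cols = []
    for k in range(1, n):                 # block Omega_k acts on column k of the matrix
        for i in range(k + 1, n + 1):     # rows k+1, ..., n lie strictly below the diagonal
            cols.append((k - 1) * n + i)  # position of entry (i, k) in the column-wise vec
    return cols


def parameter_index(i, j, n):
    """Index l of the entry (i, j), i > j, given by l = i + (j - 1) n - j (j + 1) / 2."""
    return i + (j - 1) * n - j * (j + 1) // 2


n = 5
cols = omega_selected_columns(n)
# Enumerate the strictly lower triangular entries column by column.
indices = [parameter_index(i, j, n) for j in range(1, n) for i in range(j + 1, n + 1)]
```

The list `indices` runs through 1, 2, ..., p in order, which confirms that the row ordering of $Ω$ agrees with the numbering of the perturbation parameters.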

Equation (25) is independent of the equations that determine the perturbations of the elements of the Schur form T. This allows us first to solve (25) and estimate

| δ U |

and then to use the solution obtained to determine bounds on the elements of

| δ T |

.

Neglecting the second-order term

Δ^{x}

in (25), we obtain the first-order (linear) approximation of x,

x = M^{- 1} f .

(28)

Since

{∥ f ∥}_{2} \leq {∥ δ A ∥}_{F}

, we have that

| x_{ℓ} | ⪯ x_{ℓ}^{l i n}, ℓ = 1, 2, \dots, p,

where

x_{ℓ}^{l i n} = ∥ M_{ℓ, 1 : p}^{- 1} ∥_{2} {∥ δ A ∥}_{F}

(29)

is the asymptotic bound on

| x_{ℓ} |

.

The matrix

| δ W |

can be estimated as

| δ W | ⪯ δ W^{l i n} + Δ^{W},

(30)

where

δ W^{l i n} = [\begin{matrix} 0 & x_{1}^{l i n} & x_{2}^{l i n} & \dots & x_{n - 1}^{l i n} \\ x_{1}^{l i n} & 0 & x_{n}^{l i n} & \dots & x_{2 n - 3}^{l i n} \\ x_{2}^{l i n} & x_{n}^{l i n} & 0 & \dots & x_{3 n - 6}^{l i n} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ x_{n - 1}^{l i n} & x_{2 n - 3}^{l i n} & x_{3 n - 6}^{l i n} & \dots & 0 \end{matrix}] \in R^{n \times n}

is a first-order approximation of

| δ W |

, and

Δ^{W}

contains higher-order terms in x. Thus, an asymptotic (linear) approximation of the matrix

| δ U |

can be determined as

| δ U | ⪯ δ U^{l i n} = | U | | U^{H} δ U | = | U | δ W^{l i n} .

(31)

Since

{\tilde{U}}_{1} = U_{1} + δ U_{1}, U_{2}^{H} U_{1} = 0,

one has that

\sin (Θ_{\max} (\tilde{X}, X)) = ∥ U_{2}^{H} δ U_{1} ∥_{2} = {∥ δ W_{k + 1 : n, 1 : k} ∥}_{2} .

(32)

Equation (32) shows that the sensitivity of the invariant subspace,

X

, of dimension k is connected to the values of the perturbation parameters

x_{ℓ} = u_{i}^{H} δ u_{j}, ℓ = i + (j - 1) n - \frac{j (j + 1)}{2}, i > k, j = 1, 2, \dots, k

. Consequently, if the perturbation parameters are known, it is possible to find at once sensitivity estimates for all invariant subspaces with dimension

k = 1, 2, \dots, n - 1

. More specifically, let

δ W = [\begin{matrix} * & * & * & \dots & * \\ x_{1} & * & * & \dots & * \\ x_{2} & x_{n} & * & \dots & * \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ x_{n - 1} & x_{2 n - 3} & x_{3 n - 6} & \dots & * \end{matrix}],

where * is an unspecified entry. Then we have that the maximum angle between the perturbed and unperturbed invariant subspaces of dimension k is

Θ_{\max} (\tilde{X}, X) = \arcsin (∥ δ W_{k + 1 : n, 1 : k} ∥_{2}) .

(33)

In this way, we obtain the following result.

Theorem 6.

Let matrix A be decomposed, as in (10), and assume that the Frobenius norm of the perturbation

δ A

is known. Set

L = [\begin{matrix} * & * & * & \dots & * \\ M_{1, 1 : p}^{- 1} & * & * & \dots & * \\ M_{2, 1 : p}^{- 1} & M_{n, 1 : p}^{- 1} & * & \dots & * \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ M_{n - 1, 1 : p}^{- 1} & M_{2 n - 3, 1 : p}^{- 1} & M_{3 n - 6, 1 : p}^{- 1} & \dots & * \end{matrix}] \in C^{n \times n p},

where matrix M is determined by (26). Then the following asymptotic estimate holds,

Θ_{\max}^{l i n} (\tilde{X}, X) \leq \arcsin (∥ L_{k + 1 : n, 1 : k p} ∥_{2} {∥ f ∥}_{2}) \leq \arcsin (∥ L_{k + 1 : n, 1 : k p} ∥_{2} {∥ δ A ∥}_{F}) .

(34)

The proof of Theorem 6 follows directly from (33), replacing the matrix

δ W

by its linear approximation,

δ W^{l i n}

, and substituting each

x_{ℓ}^{l i n}

by its approximation (29). Note that, as always in the case of perturbation bounds, the equality can be achieved only for specially constructed perturbation matrices.

Denote by

δ λ = [\begin{matrix} δ t_{11} \\ δ t_{22} \\ ⋮ \\ δ t_{n n} \end{matrix}] \in C^{n}

the changes of the diagonal elements of T, i.e., the perturbations of the eigenvalues

λ_{i}

of A. Then the first-order eigenvalue perturbations satisfy

| δ λ_{i} | \leq δ λ_{i}^{l i n}, i = 1, 2, \dots, n,

where

δ λ_{i}^{l i n} = ∥ Z_{i, 1 : p + n} ∥_{2} {∥ δ A ∥}_{F},

(35)

and

Z = [N M^{- 1}, I_{n}] \in C^{n \times (p + n)},

N = Π (I_{n} \otimes T - T^{T} \otimes I_{n}) Ω^{T} \in C^{n \times p},

Π = diag (e_{1}^{T}, e_{2}^{T}, \dots, e_{n}^{T}) = [e_{1} e_{1}^{T}, e_{2} e_{2}^{T}, \dots, e_{n} e_{n}^{T}] \in R^{n \times n^{2}} .

The obtained linear bound (35) coincides numerically with the well-known asymptotic bounds from the literature [12,35,39]. The quantity

∥ Z_{i, 1 : p + n} ∥_{2}

is equal to the condition number of the eigenvalue

λ_{i}

.

3.5. Probabilistic Perturbation Bound

The idea of determining tighter perturbation bounds of matrix subspaces consists in replacing the Frobenius or 2-norm of the matrix perturbation by a much smaller probabilistic estimate of the perturbation entries, obtained using Theorems 1 and 3. This allows us to decrease, with a specified probability, the perturbation bounds for the different subspaces, achieving better results for higher dimensional problems. In simple terms, we replace $∥ δ A ∥$ in the corresponding asymptotic estimate by the ratio $∥ δ A ∥ / Ξ$, thus decreasing the perturbation bound by the factor $Ξ > 1$, which is determined by the desired probability, $P^{ref}$. We shall illustrate this idea by considering first the case of invariant subspaces.

Using Theorem 3, the probabilistic perturbation bounds of x and

δ U

in the case of the Schur decomposition can be found from (29) and (31), respectively, replacing in (29) the perturbation norm

{∥ δ A ∥}_{F}

by the quantity

{∥ δ A ∥}_{F} / Ξ

, where

Ξ

is determined according to (3) from the desired probability,

P^{r e f}

, and the problem order, n. In this way, we obtain the probabilistic asymptotic estimate

Θ_{\max}^{e s t} (\tilde{X}, X) = \arcsin (∥ L_{k + 1 : n, 1 : k p} ∥_{2} {∥ δ A ∥}_{F} / Ξ)

(36)

of the maximum angle between the perturbed and unperturbed invariant subspaces of dimension k. In the same way, from (35), we obtain a probabilistic asymptotic estimate

δ λ_{i}^{e s t} = ∥ Z_{i, 1 : p + n} ∥_{2} {∥ δ A ∥}_{F} / Ξ, i = 1, 2, \dots, n,

(37)

of the eigenvalue perturbations.
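The relation between the linear bound (34) and the probabilistic estimate (36) can be sketched as follows. The function name and the input values are hypothetical; in particular, the norm of the relevant block of L is treated as a given number, since computing it requires the inverse of M.

```python
import math


def angle_estimates(l_block_norm, da_fro, n, p_ref):
    """Return (linear bound (34), probabilistic estimate (36)) for given norms."""
    xi = (1.0 - p_ref) * n  # scaling factor (3) with m = n
    # min(1.0, ...) keeps the argument inside the domain of arcsin.
    linear = math.asin(min(1.0, l_block_norm * da_fro))
    estimate = math.asin(min(1.0, l_block_norm * da_fro / xi))
    return linear, estimate


# Hypothetical inputs: block norm 8e4, ||dA||_F = 1e-10, n = 100, P_ref = 90 %.
theta_lin, theta_est = angle_estimates(8.0e4, 1.0e-10, 100, 0.9)
```

For small arguments the arcsine is nearly linear, so the estimate is smaller than the linear bound by almost exactly the factor $Ξ$.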

3.6. Bound Comparison

In the next example, we compare the invariant subspace deterministic perturbation bounds, obtained by the $\mathrm{sep}$-based approach, the perturbation expansion method and the splitting operator method, with the probabilistic bound obtained by using the Markoff inequality.

Example 1.

Consider a

100 \times 100

matrix A, taken as

A = Q_{0} J_{0} Q_{0}^{- 1},

where

J_{0} = [\begin{matrix} 0.1 & 0.1 & τ & τ & \dots & τ & τ \\ - 0.1 & 0.1 & τ & τ & \dots & τ & τ \\ & & 0.2 & 0.2 & \dots & τ & τ \\ & & - 0.2 & 0.2 & \dots & τ & τ \\ & & & & ⋱ & ⋮ & ⋮ \\ & & & & & 5.0 & 5.0 \\ & & & & & - 5.0 & 5.0 \end{matrix}], τ = 0.4

and the matrix

Q_{0}

is constructed as [43]

\begin{matrix} Q_{0} & = & H_{2} Σ H_{1}, Q_{0}^{- 1} = H_{1} Σ^{- 1} H_{2}, \\ H_{1} & = & I_{n} - 2 u u^{T} / n, H_{2} = I_{n} - 2 v v^{T} / n, \\ u & = & {[1, 1, 1, \dots, 1]}^{T}, v = {[1, - 1, 1, \dots, {(- 1)}^{n - 1}]}^{T}, \\ Σ & = & diag (1, σ, σ^{2}, \dots, σ^{n - 1}), \end{matrix}

where

H_{1}, H_{2}

are elementary reflections, σ is taken equal to

1.065

and

cond (Q_{0}) = 510.0481

. The eigenvalues of A,

λ_{i} = 0.1 \pm 0.1 j_{0}, 0.2 \pm 0.2 j_{0}, \dots, 5.0 \pm 5.0 j_{0},

occur in complex conjugate pairs. The perturbation of A is taken as

δ A = 10^{- 12} \times A_{0}

, where

A_{0}

is a matrix with random entries with normal distribution and

{∥ δ A ∥}_{F} = 9.91429 \times 10^{- 11}

. The matrix M in (25) is of order $p = n (n - 1) / 2 = 4950$ and its inverse satisfies $∥ M^{- 1} ∥_{2} = 8.0431 \times 10^{4}$, which shows that the eigenvalue problem for A is ill-conditioned, since the perturbations of A can be “amplified” $10^{4}$ times in x and, consequently, in $δ U$ and $Θ (\tilde{X}, X)$.
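Some of the quantities quoted in this example can be cross-checked with a few lines of Python. The sketch below is an independent check under stated assumptions, not the author's MATLAB code: it builds $Q_{0}$ and $Q_{0}^{- 1}$ for a small hypothetical order ($n = 6$) to confirm that the two factorizations are inverse to each other, and it uses the fact that $H_{1}$ and $H_{2}$ are orthogonal, so that the 2-norm condition number of $Q_{0}$ equals that of $Σ$, namely $σ^{n - 1}$.

```python
def reflection(w):
    """Elementary reflection I - 2 w w^T / n for a vector w with entries +1 or -1."""
    n = len(w)
    return [[(1.0 if i == j else 0.0) - 2.0 * w[i] * w[j] / n for j in range(n)]
            for i in range(n)]


def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def build_q0(n, sigma):
    """Return Q0 = H2 * Sigma * H1 and its inverse H1 * Sigma^{-1} * H2."""
    u = [1.0] * n
    v = [(-1.0) ** i for i in range(n)]
    h1, h2 = reflection(u), reflection(v)
    s = [[sigma ** i if i == j else 0.0 for j in range(n)] for i in range(n)]
    s_inv = [[sigma ** (-i) if i == j else 0.0 for j in range(n)] for i in range(n)]
    return matmul(matmul(h2, s), h1), matmul(matmul(h1, s_inv), h2)


# Small hypothetical order, used only to check that the two factors are inverses.
q0, q0_inv = build_q0(6, 1.065)
prod = matmul(q0, q0_inv)
residual = max(abs(prod[i][j] - (1.0 if i == j else 0.0))
               for i in range(6) for j in range(6))

# Quantities for the order used in the example.
n, sigma = 100, 1.065
cond_q0 = sigma ** (n - 1)  # condition number of Sigma, hence of Q0
p = n * (n - 1) // 2        # order of the matrix M in (25)
```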

In Figure 1, we show the mean value of the matrix

[Δ a_{i j} / | δ a_{i j} |]

and the relative number

N {Δ a_{i j} \geq | δ a_{i j} |} / (n^{2})

of the entries of the matrix

| δ A |

for which

| δ a_{i j} | \leq | Δ a_{i j} |

, obtained for normal and uniform distribution of the entries of

δ A

and for different values of the desired probability,

P^{r e f}

. For the case of normal distribution and

P^{r e f} = 90 %

, the size of the probability entry bound,

Δ a_{i j}

, decreases 10 times in comparison with the size of the entry bound

{∥ δ A ∥}_{F}

, which allows the decrease of the mean value of the ratio

Δ a_{i j} / | δ a_{i j} |

from

638.38

to

63.838

(Table 1). For

P^{r e f} = 40 %

, the probability bound,

Δ a_{i j}

, is 60 times smaller than the bound

{∥ δ A ∥}_{F}

, and even for this small desired probability the number of entries for which

| δ a_{i j} | \leq | Δ a_{i j} |

is still

90.34 %

.

In Figure 2, we compare the asymptotic bound,

δ λ_{i}^{l i n}

, and the probabilistic estimate,

δ λ_{i}^{e s t}

, with the actual eigenvalue perturbations

| δ λ_{i} |

for normal distribution of perturbation entries and probabilities

P^{r e f} = 90 %, 80 %

and

60 %

. (For clarity, the perturbations of the superdiagonal elements of T are not shown.) The probabilistic bound,

δ λ_{i}^{e s t}

, is much tighter than the linear bound,

δ λ_{i}^{l i n}

, and the inequality

| δ λ_{i} | \leq δ λ_{i}^{e s t}

is satisfied for all eigenvalues and all chosen probabilities. In particular, the size of the estimate,

δ λ_{i}^{e s t}

, is 10 times smaller than the linear estimate,

δ λ_{i}^{l i n}

, for

P^{r e f} = 90 %

, 20 times for

P^{r e f} = 80 %

and 40 times for

P^{r e f} = 60 %

.

In Figure 3, we show the asymptotic bound,

δ Θ_{k}^{l i n}

, and the probabilistic estimate,

δ Θ_{k}^{e s t}

, along with the actual value

| δ Θ_{k} |

of the maximum angle between the perturbed and unperturbed invariant subspaces of dimensions

k = 1, 2, \dots, n - 1

for the same probabilities

P^{r e f} = 90 %, 80 %

and

60 %

. The probability estimate satisfies

| δ Θ_{k} | < δ Θ_{k}^{e s t}

for all

1 \leq k < 100

. For comparison, we give the global bound on the maximum angle between the perturbed and unperturbed invariant subspaces computed by (20) and the first-order bound determined by (24). (The computation of the $\mathrm{sep}$-based estimate is performed by using the numerical algorithm presented in [14].) The global bound is slightly larger than the asymptotic bounds, and the asymptotic bounds (24) and (34) coincide.

4. Perturbation Bounds for Deflating Subspaces

4.1. Problem Statement

Let

A - λ B

be a square matrix pencil [35]. Then there exist unitary matrices U and V, such that

T = U^{H} A V

and

R = U^{H} B V

are both upper triangular. The pair

(T, R)

constitutes the generalized Schur form of the pair

(A, B)

. The eigenvalues of

A - λ B

are then equal to

t_{i i} / r_{i i}

, the ratios of the diagonal entries of T and R. Further on, we consider the case of regular pencils, for which $\det (A - λ B)$ does not vanish identically in $λ$, and in addition we assume that the matrix B is non-singular, i.e., the generalized eigenvalues are finite.

Suppose that the generalized Schur form of the pencil

A - λ B

is reordered as

U^{H} A V = [\begin{matrix} T_{11} & T_{12} \\ 0 & T_{22} \end{matrix}], U^{H} B V = [\begin{matrix} R_{11} & R_{12} \\ 0 & R_{22} \end{matrix}],

(38)

where

T_{11} - λ R_{11}

contains k specified eigenvalues. Partitioning conformally the unitary matrices of the equivalence transformation as

U = [U_{1}, U_{2}], V = [V_{1}, V_{2}]

, we obtain orthonormal bases

U_{1} \in C^{n \times k}

and

V_{1} \in C^{n \times k}

of the left,

Y

, and right,

X

, deflating subspace, respectively. The deflating subspaces satisfy the relations

A X \subset Y

and

B X \subset Y

.

Further on, we are interested in the sensitivity of the deflating subspaces corresponding to specified generalized eigenvalues of the pair

(T_{11}, R_{11})

. Suppose that the matrices of the pencil

A - λ B

are perturbed as

\tilde{A} = A + δ A

and

\tilde{B} = B + δ B

. The upper triangular matrices in the generalized Schur decomposition are represented as

{\tilde{U}}^{H} \tilde{A} \tilde{V} = [\begin{matrix} {\tilde{T}}_{11} & {\tilde{T}}_{12} \\ 0 & {\tilde{T}}_{22} \end{matrix}], {\tilde{U}}^{H} \tilde{B} \tilde{V} = [\begin{matrix} {\tilde{R}}_{11} & {\tilde{R}}_{12} \\ 0 & {\tilde{R}}_{22} \end{matrix}],

where

\tilde{U} = [\begin{matrix} {\tilde{U}}_{1}, & {\tilde{U}}_{2} \end{matrix}], \tilde{V} = [\begin{matrix} {\tilde{V}}_{1}, & {\tilde{V}}_{2} \end{matrix}]

are the modified equivalence transformation matrices.

4.2. dif-Based Global Bound

Let the unperturbed left and right deflation subspaces corresponding to the first k eigenvalues be denoted by

Y = R (U_{1})

and

X = R (V_{1})

, respectively, and their perturbed counterparts by

\tilde{Y}

and

\tilde{X}

.

If

∥ δ A ∥

and

∥ δ B ∥

are small, we may expect that the spaces

\tilde{X}

and

\tilde{Y}

will be close to the spaces

X

and

Y

, respectively.

Define the matrices

E = U^{H} δ A V = [\begin{matrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{matrix}], F = U^{H} δ B V = [\begin{matrix} F_{11} & F_{12} \\ F_{21} & F_{22} \end{matrix}]

and

M = [\begin{matrix} I_{k} \otimes T_{22} & - T_{11}^{T} \otimes I_{n - k} \\ I_{k} \otimes R_{22} & - R_{11}^{T} \otimes I_{n - k} \end{matrix}] .

Note that matrix M is invertible if and only if the pencils

T_{11} - λ R_{11}

and

T_{22} - λ R_{22}

have no eigenvalues in common.

Define the difference between the spectra of

T_{11} - λ R_{11}

and

T_{22} - λ R_{22}

as [15]

dif (T_{11}, R_{11}; T_{22}, R_{22}) = \min_{[S, W] \neq 0} \frac{{||[\begin{matrix} T_{22} S - W T_{11} \\ R_{22} S - W R_{11} \end{matrix}]||}_{F}}{{∥ [S, W] ∥}_{F}} .

It is possible to prove the relationship

dif (T_{11}, R_{11}; T_{22}, R_{22}) = σ_{\min} (M),

so that

∥ M^{- 1} ∥_{2} = 1 / dif (T_{11}, R_{11}; T_{22}, R_{22}) .
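The relationship between dif and the matrix M can be checked numerically. The following NumPy sketch is an illustration only: the helper names `build_M` and `vec` and the small test blocks are assumptions made for this example and do not come from the paper. It assembles M from its Kronecker blocks, takes dif as the smallest singular value of M, and verifies that M acts on the stacked vectorizations of S and W as the operator appearing in the definition of dif (with column-major vectorization).

```python
import numpy as np

def build_M(T11, T22, R11, R22):
    """Assemble the 2k(n-k) x 2k(n-k) matrix M from its Kronecker blocks."""
    k, nk = T11.shape[0], T22.shape[0]
    return np.block([
        [np.kron(np.eye(k), T22), -np.kron(T11.T, np.eye(nk))],
        [np.kron(np.eye(k), R22), -np.kron(R11.T, np.eye(nk))],
    ])

def vec(X):
    """Column-major vectorization, matching the Kronecker identities used here."""
    return X.flatten(order="F")

# Illustrative data (assumed): k = 2, n - k = 3, well separated spectra.
rng = np.random.default_rng(0)
k, nk = 2, 3
T11 = np.triu(rng.standard_normal((k, k)), 1) + np.diag([1.0, 2.0])
T22 = np.triu(rng.standard_normal((nk, nk)), 1) + np.diag([5.0, 6.0, 7.0])
R11 = np.triu(rng.standard_normal((k, k)), 1) + np.eye(k)
R22 = np.triu(rng.standard_normal((nk, nk)), 1) + np.eye(nk)

M = build_M(T11, T22, R11, R22)
dif = np.linalg.svd(M, compute_uv=False)[-1]   # smallest singular value of M

# M applied to [vec(S); vec(W)] reproduces the stacked operator in the definition of dif.
S = rng.standard_normal((nk, k))
W = rng.standard_normal((nk, k))
lhs = M @ np.concatenate([vec(S), vec(W)])
rhs = np.concatenate([vec(T22 @ S - W @ T11), vec(R22 @ S - W @ R11)])
print(dif, np.allclose(lhs, rhs))
```

Forming M explicitly costs memory of order k²(n-k)², so this direct approach is only reasonable for small blocks; for larger problems an estimator of dif would be used instead.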

Assume that the spectra of

T_{11} - λ R_{11}

and

T_{22} - λ R_{22}

are disjoint so that

dif (T_{11}, R_{11}; T_{22}, R_{22}) > 0 .

The following theorem gives an estimate of the sensitivity of a deflating subspace of a regular pencil.

Theorem 7

([1,15]). Let the regular pencil

A - λ B

be represented as in (38). Given the perturbation

(E, F)

, set

\begin{matrix} γ & = & ∥ (E_{21}, F_{21}) ∥_{F}, \\ η & = & ∥ (T_{12} + E_{12}, R_{12} + F_{12}) ∥_{F}, \\ δ & = & dif (T_{11}, R_{11}; T_{22}, R_{22}) - \max \{ ∥ E_{11} ∥_{F} + ∥ E_{22} ∥_{F}, ∥ F_{11} ∥_{F} + ∥ F_{22} ∥_{F} \} . \end{matrix}

If

δ > 0

and

\frac{η γ}{δ^{2}} < \frac{1}{4},

there are matrices P and Q satisfying

{∥ (P, Q) ∥}_{F} \leq \frac{2 γ}{δ + \sqrt{δ^{2} - 4 γ η}},

(39)

such that the columns of

{\tilde{V}}_{1} = V_{1} + V_{2} P

and

{\tilde{U}}_{1} = U_{1} + U_{2} Q

span right and left deflating subspaces for

A + δ A - λ (B + δ B)

. Note that

{∥ (P, Q) ∥}_{F} = \sqrt{{∥ P ∥}_{F}^{2} + {∥ Q ∥}_{F}^{2}} .

It is possible to show that Equation (39) bounds the tangents of the canonical angles between

\tilde{X}

and

X

or

\tilde{Y}

and

Y

, similarly to the ordinary eigenvalue problem. Specifically, we have that

\max \{ Φ_{\max} (\tilde{X}, X), Θ_{\max} (\tilde{Y}, Y) \} \leq \arctan (\frac{2 γ}{δ + \sqrt{δ^{2} - 4 γ η}}) .

(40)

Theorem 7 can be considered as a generalization of Theorem 4 for the ordinary eigenvalue problem.
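Once γ, η and δ are available, the bound (40) is a one-line computation. The sketch below is a hedged illustration: the function name `dif_global_bound` and the convention of returning `None` when the hypotheses of Theorem 7 are violated are choices made for this example.

```python
import math

def dif_global_bound(gamma, eta, delta):
    """Right-hand side of (40), in radians.

    Returns None when delta <= 0 or eta*gamma/delta**2 >= 1/4,
    i.e. when Theorem 7 does not apply.
    """
    if delta <= 0 or eta * gamma / delta**2 >= 0.25:
        return None
    t = 2.0 * gamma / (delta + math.sqrt(delta**2 - 4.0 * gamma * eta))
    return math.atan(t)

# Illustrative values (assumed): a small perturbation and a moderate gap.
bound = dif_global_bound(gamma=1e-3, eta=1.0, delta=0.5)
print(bound)
```

Since the denominator lies between δ and 2δ, the tangent bound always lies between γ/δ and 2γ/δ whenever the theorem applies.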

4.3. Perturbation Expansion Bound

Global perturbation bounds for deflating subspaces that produce individual perturbation bounds for each subspace in a pair of deflating subspaces are presented in [18].

Theorem 8.

Let

(A, B)

be an

n \times n

regular matrix pair represented as in (38), and let

X = R (V_{1})

and

Y = R (U_{1})

. Moreover, let

κ_{X}, κ_{Y}

be defined by

κ_{X} = ∥ P_{X} ∥_{2}, κ_{Y} = {∥ P_{Y} ∥}_{2},

where

\begin{matrix} P_{X} & = & [(R_{11}^{T} \otimes I_{n - k}) S^{- 1}, (- T_{11}^{T} \otimes I_{n - k}) S^{- 1}], \\ P_{Y} & = & [(I_{k} \otimes R_{22}) S^{- 1}, (- I_{k} \otimes T_{22}) S^{- 1}], \\ S & = & T_{11} \otimes R_{22} - R_{11}^{T} \otimes T_{22} \end{matrix}

and let

κ = \sqrt{κ_{X}^{2} + κ_{Y}^{2}}, γ = {||[\begin{matrix} E_{21} \\ F_{21} \end{matrix}]||}_{F}, η = {||[\begin{matrix} T_{12} + E_{12} \\ R_{12} + F_{12} \end{matrix}]||}_{F},

(41)

ε = \sqrt{∥ (E_{11}, F_{11}) ∥_{F}^{2} + {∥ (E_{22}, F_{22}) ∥}_{F}^{2}} .

(42)

If

κ (2 \sqrt{γ η} + ε) < 1,

then there exists a pair

\tilde{X} = R ({\tilde{V}}_{1})

and

\tilde{Y} = R ({\tilde{U}}_{1})

of k-dimensional deflating subspaces of

(A + δ A, B + δ B)

such that

\begin{matrix} Φ_{\max} (\tilde{X}, X) \leq \arctan (\frac{2 κ_{X} γ}{1 - κ ε + \sqrt{{(1 - κ ε)}^{2} - 4 κ^{2} γ η}}), \end{matrix}

(43)

\begin{matrix} Θ_{\max} (\tilde{Y}, Y) \leq \arctan (\frac{2 κ_{Y} γ}{1 - κ ε + \sqrt{{(1 - κ ε)}^{2} - 4 κ^{2} γ η}}) . \end{matrix}

(44)

4.4. Bound by the Splitting Operator Method

The application of the splitting operator method to the perturbation analysis of the generalized Schur decomposition is performed in [34].

Consider again the generalized Schur decomposition (38). To derive perturbation bounds of the matrices

δ U_{1} = {\tilde{U}}_{1} - U_{1}

and

δ V_{1} = {\tilde{V}}_{1} - V_{1}

, we introduce the perturbation parameter vectors

x = vec (Low (δ W_{U})) \in C^{p}, p = n (n - 1) / 2,

and

y = vec (Low (δ W_{V})) \in C^{p},

where

δ W_{U} = U^{H} δ U

and

δ W_{V} = V^{H} δ V

. Let

F = - U^{H} (δ A) V, G = - U^{H} (δ B) V

and construct the vectors

f = vec (Low (F)) \in C^{p}, g = vec (Low (G)) \in C^{p} .

Then, asymptotic bounds of x and y can be found by solving the linear system of equations

[\begin{matrix} M_{x f} & - M_{x g} \\ M_{y f} & - M_{y g} \end{matrix}] [\begin{matrix} x \\ y \end{matrix}] = [\begin{matrix} f \\ g \end{matrix}],

(45)

where

\begin{matrix} M_{x f} = Ω (T^{T} \otimes I_{n}) Ω^{T}, & M_{x g} = Ω (I_{n} \otimes T) Ω^{T}, \end{matrix}

(46)

\begin{matrix} M_{y f} = Ω (R^{T} \otimes I_{n}) Ω^{T}, & M_{y g} = Ω (I_{n} \otimes R) Ω^{T} \end{matrix}

(47)

and

\begin{matrix} Ω & : = & [diag (Ω_{1}, Ω_{2}, \dots, Ω_{n - 1}), 0_{p \times n}] \in R^{p \times n^{2}}, \\ Ω_{k} & : = & [0_{(n - k) \times k}, I_{n - k}] \in R^{(n - k) \times n}, k = 1, 2, \dots, n - 1 . \end{matrix}

(48)
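The role of the matrix Ω in (48) is to select the strictly lower triangular entries of a matrix from its column-major vectorization, so that Ω vec(X) = vec(Low(X)). A minimal NumPy sketch of this selection follows; the names `build_omega` and `vec_low` are assumptions introduced for illustration.

```python
import numpy as np

def build_omega(n):
    """Omega = [diag(Omega_1, ..., Omega_{n-1}), 0_{p x n}] with p = n(n-1)/2."""
    p = n * (n - 1) // 2
    Omega = np.zeros((p, n * n))
    row = 0
    for k in range(1, n):                      # k = 1, ..., n-1
        col0 = (k - 1) * n                     # start of column k inside vec(X)
        # Omega_k = [0_{(n-k) x k}, I_{n-k}] keeps rows k+1..n of column k
        Omega[row:row + n - k, col0 + k:col0 + n] = np.eye(n - k)
        row += n - k
    return Omega

def vec_low(X):
    """Strictly lower triangular entries of X, stacked column by column."""
    n = X.shape[0]
    return np.concatenate([X[k:, k - 1] for k in range(1, n)])

n = 5
X = np.arange(1.0, n * n + 1).reshape(n, n)
Omega = build_omega(n)
selected = Omega @ X.flatten(order="F")        # Omega vec(X)
restored = (Omega.T @ selected).reshape((n, n), order="F")
print(np.allclose(selected, vec_low(X)))
```

Multiplying by the transpose of Ω places the selected entries back into an otherwise zero matrix, which is why products of the form Ω (·) Ωᵀ restrict a Kronecker-structured operator to the strictly lower triangular unknowns.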

Hence, linear approximations of the elements of x and y can be determined from

\begin{matrix} x_{ℓ}^{l i n} & = & ∥ M_{ℓ, 1 : 2 p}^{- 1} ∥_{2} \sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}}, \end{matrix}

(49)

\begin{matrix} y_{ℓ}^{l i n} & = & ∥ M_{ℓ + p, 1 : 2 p}^{- 1} ∥_{2} \sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}}, \end{matrix}

(50)

\begin{matrix} ℓ = 1, 2, \dots, p, \end{matrix}

(51)

where

M = [\begin{matrix} M_{x f} & - M_{x g} \\ M_{y f} & - M_{y g} \end{matrix}] .
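The construction (45)–(51) can be transcribed almost literally. The following NumPy sketch is an illustration under stated assumptions: the function names, the small test pencil and the use of an explicit inverse are choices made here for clarity, not the author's implementation. It builds M from (46)–(48) and evaluates the row norms of its inverse as in (49) and (50).

```python
import numpy as np

def build_omega(n):
    """Selection matrix of (48): Omega vec(X) = vec(Low(X))."""
    p = n * (n - 1) // 2
    Omega = np.zeros((p, n * n))
    row = 0
    for k in range(1, n):
        col0 = (k - 1) * n
        Omega[row:row + n - k, col0 + k:col0 + n] = np.eye(n - k)
        row += n - k
    return Omega

def splitting_operator_matrix(T, R):
    """Matrix M = [[M_xf, -M_xg], [M_yf, -M_yg]] with blocks from (46), (47)."""
    n = T.shape[0]
    Om, I = build_omega(n), np.eye(n)
    M_xf = Om @ np.kron(T.T, I) @ Om.T
    M_xg = Om @ np.kron(I, T) @ Om.T
    M_yf = Om @ np.kron(R.T, I) @ Om.T
    M_yg = Om @ np.kron(I, R) @ Om.T
    return np.block([[M_xf, -M_xg], [M_yf, -M_yg]])

def linear_estimates(T, R, norm_dA, norm_dB):
    """Row-wise estimates (49), (50): ||row of M^{-1}||_2 * sqrt(||dA||_F^2 + ||dB||_F^2)."""
    Minv = np.linalg.inv(splitting_operator_matrix(T, R))
    p = Minv.shape[0] // 2
    row_norms = np.linalg.norm(Minv, axis=1)
    scale = np.hypot(norm_dA, norm_dB)
    return row_norms[:p] * scale, row_norms[p:] * scale

# For n = 2 the matrix M reduces to [[t11, -t22], [r11, -r22]].
T2 = np.array([[2.0, 1.0], [0.0, 3.0]])
R2 = np.array([[1.0, 0.5], [0.0, 1.0]])
M2 = splitting_operator_matrix(T2, R2)

# Illustrative 4 x 4 upper triangular pair (assumed data).
rng = np.random.default_rng(0)
T = np.triu(rng.standard_normal((4, 4)), 1) + np.diag([1.0, 2.0, 3.0, 4.0])
R = np.triu(0.1 * rng.standard_normal((4, 4)), 1) + np.eye(4)
x_lin, y_lin = linear_estimates(T, R, norm_dA=1e-8, norm_dB=1e-8)
```

In the 2 × 2 case the determinant of M equals t22 r11 − t11 r22, which vanishes exactly when the two generalized eigenvalues coincide; this illustrates why the estimates grow as eigenvalues approach each other.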

The matrices

| δ W_{U} |

and

| δ W_{V} |

can be estimated as

\begin{matrix} | δ W_{U} | & ⪯ & δ W_{U}^{l i n} + Δ^{W_{U}}, \end{matrix}

(52)

\begin{matrix} | δ W_{V} | & ⪯ & δ W_{V}^{l i n} + Δ^{W_{V}}, \end{matrix}

(53)

where

δ W_{U}^{l i n} = [\begin{matrix} 0 & x_{1}^{l i n} & x_{2}^{l i n} & \dots & x_{n - 1}^{l i n} \\ x_{1}^{l i n} & 0 & x_{n}^{l i n} & \dots & x_{2 n - 3}^{l i n} \\ x_{2}^{l i n} & x_{n}^{l i n} & 0 & \dots & x_{3 n - 6}^{l i n} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ x_{n - 1}^{l i n} & x_{2 n - 3}^{l i n} & x_{3 n - 6}^{l i n} & \dots & 0 \end{matrix}] \in R^{n \times n},

δ W_{V}^{l i n} = [\begin{matrix} 0 & y_{1}^{l i n} & y_{2}^{l i n} & \dots & y_{n - 1}^{l i n} \\ y_{1}^{l i n} & 0 & y_{n}^{l i n} & \dots & y_{2 n - 3}^{l i n} \\ y_{2}^{l i n} & y_{n}^{l i n} & 0 & \dots & y_{3 n - 6}^{l i n} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ y_{n - 1}^{l i n} & y_{2 n - 3}^{l i n} & y_{3 n - 6}^{l i n} & \dots & 0 \end{matrix}] \in R^{n \times n}

and

Δ^{W_{U}}, Δ^{W_{V}}

contain higher-order terms in

x, y

. Thus, asymptotic approximation of the matrices

| δ U |

and

| δ V |

can be determined as

\begin{matrix} | δ U | & ⪯ & δ U^{l i n} = | U | | U^{H} δ U | = | U | δ W_{U}^{l i n}, \end{matrix}

(54)

\begin{matrix} | δ V | & ⪯ & δ V^{l i n} = | V | | V^{H} δ V | = | V | δ W_{V}^{l i n} . \end{matrix}

(55)

Let

{\tilde{V}}_{1}

and

V_{1}

be the orthonormal bases of the perturbed and unperturbed right deflation subspace,

X

, of dimension k, and

{\tilde{U}}_{1}

and

U_{1}

be, respectively, the orthonormal bases of the perturbed and unperturbed left deflation subspace,

Y

, of the same dimension. Since

\begin{matrix} {\tilde{V}}_{1} & = & V_{1} + δ V_{1}, V_{2}^{H} V_{1} = 0, \\ {\tilde{U}}_{1} & = & U_{1} + δ U_{1}, U_{2}^{H} U_{1} = 0, \end{matrix}

we have that

\begin{matrix} \sin (Φ_{\max} (\tilde{X}, X)) & = & ∥ V_{2}^{H} δ V_{1} ∥_{2}, \\ \sin (Θ_{\max} (\tilde{Y}, Y)) & = & ∥ U_{2}^{H} δ U_{1} ∥_{2} . \end{matrix}

Using these expressions, it is possible to show that

\begin{matrix} Φ_{\max} (\tilde{X}, X) & = & \arcsin (∥ {δ W_{V}}_{k + 1 : n, 1 : k} ∥_{2}), \\ Θ_{\max} (\tilde{Y}, Y) & = & \arcsin (∥ {δ W_{U}}_{k + 1 : n, 1 : k} ∥_{2}) . \end{matrix}

Using the asymptotic approximations of the elements of the vectors x and y, we obtain the following result.

Theorem 9.

Let the pair

(A, B)

be decomposed, as in (38), and assume that the Frobenius norms of the perturbations

δ A

and

δ B

are known. Set

L_{X} = [\begin{matrix} * & * & * & \dots & * \\ M_{1 + p, 1 : 2 p}^{- 1} & * & * & \dots & * \\ M_{2 + p, 1 : 2 p}^{- 1} & M_{n + p, 1 : 2 p}^{- 1} & * & \dots & * \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ M_{n - 1 + p, 1 : 2 p}^{- 1} & M_{2 n - 3 + p, 1 : 2 p}^{- 1} & M_{3 n - 6 + p, 1 : 2 p}^{- 1} & \dots & * \end{matrix}] \in C^{n \times 2 n p},

L_{Y} = [\begin{matrix} * & * & * & \dots & * \\ M_{1, 1 : 2 p}^{- 1} & * & * & \dots & * \\ M_{2, 1 : 2 p}^{- 1} & M_{n, 1 : 2 p}^{- 1} & * & \dots & * \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ M_{n - 1, 1 : 2 p}^{- 1} & M_{2 n - 3, 1 : 2 p}^{- 1} & M_{3 n - 6, 1 : 2 p}^{- 1} & \dots & * \end{matrix}] \in C^{n \times 2 n p} .

Then, the following asymptotic bounds of the angles between the perturbed and unperturbed deflation subspaces of dimension

1 \leq k < n

are valid,

\begin{matrix} Φ_{\max}^{l i n} (\tilde{X}, X) & = & \arcsin (∥ {L_{X}}_{k + 1 : p, 1 : 2 k p} ∥_{2} \sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}}), \end{matrix}

(56)

\begin{matrix} Θ_{\max}^{l i n} (\tilde{Y}, Y) & = & \arcsin (∥ {L_{Y}}_{k + 1 : p, 1 : 2 k p} ∥_{2} \sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}}) . \end{matrix}

(57)

Consider the sensitivity of the generalized eigenvalues, i.e., the sensitivity of a simple finite generalized eigenvalue,

λ_{i}

, under perturbations of the matrices A and B. If the perturbed pencil is denoted by

\tilde{A} - \tilde{λ} \tilde{B} = A + δ A - \tilde{λ} (B + δ B),

we want to know how the difference between

\tilde{λ}

and

λ

depends on the size of the perturbation measured by the quantity

ε = \sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}} .

The distance between two generalized eigenvalues

λ = α / β

and

μ = γ / δ

is measured by the so-called chordal distance, defined as [15]

χ (λ, μ) = \frac{| α δ - β γ |}{\sqrt{{| α |}^{2} + {| β |}^{2}} \sqrt{{| γ |}^{2} + {| δ |}^{2}}} .

Substituting

α = t_{i i}, \tilde{α} = {\tilde{t}}_{i i}, β = r_{i i}, \tilde{β} = {\tilde{r}}_{i i},

we find that

χ (λ, \tilde{λ}) = \frac{| t_{i i} {\tilde{r}}_{i i} - r_{i i} {\tilde{t}}_{i i} |}{\sqrt{| t_{i i} |^{2} + {| r_{i i} |}^{2}} \sqrt{| {\tilde{t}}_{i i} |^{2} + {| {\tilde{r}}_{i i} |}^{2}}}

or, in a first-order approximation,

χ^{l i n} (λ, \tilde{λ}) = \frac{| r_{i i} δ t_{i i} + t_{i i} δ r_{i i} |}{| t_{i i} |^{2} + {| r_{i i} |}^{2}}

(58)

The asymptotic approximations of the diagonal element perturbations of the matrices R and T satisfy

\begin{matrix} δ t_{i i} & = & cond (t_{i i}) \sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}}, \end{matrix}

(59)

\begin{matrix} δ r_{i i} & = & cond (r_{i i}) \sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}}, \end{matrix}

(60)

where the numbers

cond (t_{i i})

and

cond (r_{i i})

are determined from

cond (t_{i i}) = ∥ Z_{i, 1 : 2 n} ∥_{2}, cond (r_{i i}) = {∥ Z_{i + n, 1 : 2 n} ∥}_{2},

Z = [Q M^{- 1}, I_{2 n}], Q = [\begin{matrix} N_{a 1} & N_{a 2} \\ N_{b 1} & N_{b 2} \end{matrix}],

\begin{matrix} N_{a 1} & = & - Π (T^{T} \otimes I_{n}) Ω^{T}, N_{a 2} = Π (I_{n} \otimes T) Ω^{T}, \\ N_{b 1} & = & - Π (R^{T} \otimes I_{n}) Ω^{T}, N_{b 2} = Π (I_{n} \otimes R) Ω^{T} . \end{matrix}

Replacing the expressions for the perturbations

δ t_{i i}

and

δ r_{i i}

in (58), we find that

χ (\tilde{λ}, λ) \leq χ^{l i n} (\tilde{λ}, λ) = cond (λ_{i}) \sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}}

(61)

where the number

cond (λ_{i}) = \frac{| r_{i i} | cond (t_{i i}) + | t_{i i} | cond (r_{i i})}{| t_{i i} |^{2} + {| r_{i i} |}^{2}}

can be considered as a condition number of the generalized eigenvalue,

λ_{i}

.

4.5. Probabilistic Perturbation Bound

Applying Theorem 3 again, probabilistic bounds on the perturbation parameter vectors x and y can be found from (49) and (50), respectively, by replacing the quantity

\sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}}

by the ratio

\sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}} / Ξ

, where

Ξ

is determined according to (3) from the desired probability,

P^{r e f}

, and the problem order, n. This means that the probabilistic asymptotic estimates of

| δ U |

and

| δ V |

can be obtained from (54) and (55), replacing the linear estimates

x^{l i n}

,

y^{l i n}

by

x^{e s t} = x^{l i n} / Ξ, y^{e s t} = y^{l i n} / Ξ,

respectively. As a result, we obtain the probabilistic bounds on the angles between the perturbed and unperturbed deflating subspaces as

\begin{matrix} Φ_{\max}^{e s t} (\tilde{X}, X) & = & \arcsin (∥ {L_{X}}_{k + 1 : p, 1 : 2 k p} ∥_{2} \sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}} / Ξ), \end{matrix}

(62)

\begin{matrix} Θ_{\max}^{e s t} (\tilde{Y}, Y) & = & \arcsin (∥ {L_{Y}}_{k + 1 : p, 1 : 2 k p} ∥_{2} \sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}} / Ξ), \\ k = 1, 2, \dots, n - 1 . \end{matrix}

(63)

In the same way, we may obtain a probabilistic asymptotic estimate of the eigenvalue perturbations, replacing in (61) the expression

\sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}}

by

\sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}} / Ξ

. This yields

χ (\tilde{λ}, λ) \leq χ^{e s t} (\tilde{λ}, λ) = cond (λ_{i}) \sqrt{{∥ δ A ∥}_{F}^{2} + {∥ δ B ∥}_{F}^{2}} / Ξ .

(64)

4.6. Bound Comparison

Example 2.

Consider an

80 \times 80

matrix pencil

A - λ B

, where

A = Q_{A} J_{A} Q_{A}^{- 1}, B = Q_{B} J_{B} Q_{B}^{- 1},

\begin{matrix} J_{A} = [\begin{matrix} 0.1 & 0.1 & τ & τ & \dots & τ & τ \\ - 0.1 & 0.1 & τ & τ & \dots & τ & τ \\ & & 0.2 & 0.2 & \dots & τ & τ \\ & & - 0.2 & 0.2 & \dots & τ & τ \\ & & & & ⋱ & ⋮ & ⋮ \\ & & & & & 5.0 & 5.0 \\ & & & & & - 5.0 & 5.0 \end{matrix}], & J_{B} = [\begin{matrix} 1 & ρ & ρ & ρ & \dots & ρ & ρ \\ & 1 & ρ & ρ & \dots & ρ & ρ \\ & & 1 & ρ & \dots & ρ & ρ \\ & & & 1 & \dots & ρ & ρ \\ & & & & ⋱ & ⋮ & ⋮ \\ & & & & & 1 & ρ \\ & & & & & & 1 \end{matrix}], \\ τ = 0.05, & ρ = 0.1, \end{matrix}

the matrices

Q_{A}

and

Q_{B}

are constructed as in Example 1,

\begin{matrix} Q_{A} & = & H_{2} Σ_{A} H_{1}, Q_{B} = H_{2} Σ_{B} H_{1}, \\ H_{1} & = & I_{n} - 2 u u^{T} / n, H_{2} = I_{n} - 2 v v^{T} / n, \\ u & = & {[1, 1, 1, \dots, 1]}^{T}, v = {[1, - 1, 1, \dots, {(- 1)}^{n - 1}]}^{T}, \\ Σ_{A} & = & diag (1, σ_{A}, σ_{A}^{2}, \dots, σ_{A}^{n - 1}), \\ Σ_{B} & = & diag (1, σ_{B}, σ_{B}^{2}, \dots, σ_{B}^{n - 1}), \end{matrix}

H_{1}, H_{2}

are elementary reflections,

σ_{A}

is taken equal to

1.02

and

σ_{B}

is taken equal to

1.01

.
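The transformation matrices of this example are fully specified by the formulas above and can be generated in a few lines. The NumPy sketch below is illustrative; the function name `build_Q` is an assumption. Because H_1 and H_2 are orthogonal reflections, the singular values of Q_A are the diagonal entries of Σ_A, so the spectral condition number of Q_A equals σ_A^{n-1}.

```python
import numpy as np

def build_Q(n, sigma):
    """Q = H2 * diag(1, sigma, ..., sigma^(n-1)) * H1 with elementary reflections H1, H2."""
    u = np.ones(n)
    v = (-1.0) ** np.arange(n)                 # [1, -1, 1, ..., (-1)^(n-1)]
    H1 = np.eye(n) - 2.0 * np.outer(u, u) / n
    H2 = np.eye(n) - 2.0 * np.outer(v, v) / n
    Sigma = np.diag(sigma ** np.arange(n))
    return H2 @ Sigma @ H1

n = 80
Q_A = build_Q(n, 1.02)
Q_B = build_Q(n, 1.01)
cond_A = np.linalg.cond(Q_A)                   # equals 1.02**(n-1) up to rounding
cond_B = np.linalg.cond(Q_B)                   # equals 1.01**(n-1) up to rounding
print(cond_A, cond_B)
```

The parameters σ_A and σ_B therefore control how far the pencil is from one obtained by an orthogonal transformation of (J_A, J_B), and hence how sensitive the example is.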

In Figure 4, we give the mean value of the ratios

Δ a_{i j} / | δ a_{i j} |

and

Δ b_{i j} / | δ b_{i j} |

obtained for two random distributions and different values of

P^{r e f}

between 100% and 0%, where

Δ A

is the probabilistic bound of

| δ A |

and

Δ B

is the probabilistic bound of

| δ B |

. For each

P^{r e f}

, the scaling factor, Ξ, is determined from (3). The same ratios are represented numerically in Table 2 for the perturbation with normal distribution and three values of

P^{r e f}

. The results show that if, instead of the deterministic estimates of

δ A

and

δ B

, we use the corresponding probabilistic estimates

Δ A

and

Δ B

with

P^{r e f} = 90

%, then the mean values of

[Δ a_{i j} / | δ a_{i j} |]

and

[Δ b_{i j} / | δ b_{i j} |]

decrease by a factor of eight. A further reduction in

P^{r e f}

to 80% reduces the mean values by a factor of 16.

In Figure 5, we show the relative numbers

N {Δ a_{i j} \geq | δ a_{i j} |} / (n^{2})

and

N {Δ b_{i j} \geq | δ b_{i j} |} / (n^{2})

of the entries for which

Δ a_{i j} \geq | δ a_{i j} |

and

Δ b_{i j} \geq | δ b_{i j} |

, respectively. In both cases, the relative number of entries for which

Δ a_{i j} \geq | δ a_{i j} |

and

Δ b_{i j} \geq | δ b_{i j} |

remains 100%, which shows that the estimates of

δ A

and

δ B

can be decreased safely.

In Figure 6, we show the asymptotic chordal metric bound,

χ_{i}^{l i n}

, of the generalized eigenvalue perturbations along with the probabilistic estimates,

χ_{i}^{e s t}

, and the actual eigenvalue perturbations

| χ_{i} |

for normal distribution of perturbation entries and probabilities

P^{r e f} = 90 %, 80 %

and

60 %

. The probabilistic bound,

χ_{i}^{e s t}

, is much tighter than the linear bound,

χ_{i}^{l i n}

, and the inequality

| χ_{i} | \leq χ_{i}^{e s t}

is satisfied for all eigenvalues. In particular, the size of the estimate,

χ_{i}^{e s t}

, is 8 times smaller than the linear estimate,

χ_{i}^{l i n}

, for

P^{r e f} = 90 %

, 16 times for

P^{r e f} = 80 %

and 32 times for

P^{r e f} = 60 %

.

In Figure 7 and Figure 8, we show the asymptotic bounds,

δ Φ_{k}^{l i n}

and

δ Θ_{k}^{l i n}

, of the maximum angles between the perturbed and unperturbed deflating subspaces of dimension

k = 1, 2, \dots, n - 1

along with the probabilistic estimates,

δ Φ_{k}^{e s t}

,

δ Θ_{k}^{e s t}

, and the actual values of these angles for the same probabilities

P^{r e f} = 90 %, 80 %

and

60 %

. The probabilistic estimates satisfy

| δ Φ_{k} | < δ Φ_{k}^{e s t}

,

| δ Θ_{k} | < δ Θ_{k}^{e s t}

for all

1 \leq k < 80

. For comparison, we also give the global bounds on the maximum angles between the perturbed and unperturbed deflating subspaces computed by (40) and (43), (44). Since the determination of these bounds requires the norms of blocks of the perturbations E and F, which are unknown, these norms are replaced by the Frobenius norms of the whole corresponding matrices. The global bounds practically coincide with the corresponding asymptotic bounds obtained by the splitting operator method.

5. Perturbation Bounds for Singular Subspaces

5.1. Problem Statement

Let

A \in R^{m \times n}

. The factorization

U^{H} A V = Σ,

(65)

where

U \in R^{m \times m}, V \in R^{n \times n}

are orthogonal matrices and

Σ

is a diagonal matrix, is called the singular value decomposition of A ([38], Section 2.6). If

m \geq n

, matrix

Σ

has the form

Σ = [\begin{matrix} Σ_{n} \\ 0_{(m - n) \times n} \end{matrix}], Σ_{n} = diag (σ_{1}, σ_{2}, \dots, σ_{n}),

(66)

where the numbers

σ_{i} \geq 0

are the singular values of A. If

U = [U_{1}, U_{2}]

and

V = [V_{1}, V_{2}]

are partitioned such that

U_{1} \in R^{m \times k}, V_{1} \in R^{n \times k}, k < n

, then

X = R (V_{1})

,

Y = R (U_{1})

form a pair of singular subspaces for A. These subspaces satisfy

A X \subset Y

and

A^{T} Y \subset X

.

If matrix A is subject to a perturbation,

δ A

, then there exists another pair of orthogonal matrices,

\tilde{U} = [{\tilde{U}}_{1}, {\tilde{U}}_{2}]

,

\tilde{V} = [{\tilde{V}}_{1}, {\tilde{V}}_{2}]

, and a diagonal matrix,

\tilde{Σ}

, such that

{\tilde{U}}^{H} (A + δ A) \tilde{V} = \tilde{Σ}

(67)

where

\begin{matrix} \tilde{Σ} & = & [\begin{matrix} {\tilde{Σ}}_{n} \\ 0_{(m - n) \times n} \end{matrix}], {\tilde{Σ}}_{n} = diag ({\tilde{σ}}_{1}, {\tilde{σ}}_{2}, \dots, {\tilde{σ}}_{n}), \\ {\tilde{Σ}}_{n} & = & Σ_{n} + δ Σ_{n}, δ Σ_{n} = diag (δ σ_{1}, δ σ_{2}, \dots, δ σ_{n}) . \end{matrix}

The aim of the perturbation analysis of singular subspaces consists in bounding the angles between the perturbed,

\tilde{X} = R ({\tilde{V}}_{1})

, and unperturbed,

X = R (V_{1})

, right singular subspaces, and the angles between the perturbed,

\tilde{Y} = R ({\tilde{U}}_{1})

, and unperturbed,

Y = R (U_{1})

, left singular subspaces.

5.2. Global Bound

Consider first the global perturbation bound on the singular subspaces derived by Stewart [16].

Theorem 10.

Let matrix A be decomposed, as in (65), and let

X

and

Y

form a pair of singular subspaces for A. Let

δ A \in R^{m \times n}

be given and partition

U^{T} δ A V

conformingly with U and V as

E = U^{T} δ A V = [\begin{matrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{matrix}], E_{11} \in R^{k \times k} .

Let

η = γ = ∥ (E_{21}, E_{12}^{T}) ∥_{F}

and let

δ = \min_{\begin{matrix} 1 \leq j \leq k \\ k + 1 \leq q \leq m \end{matrix}} | σ_{j} - σ_{q} | - ∥ E_{11} ∥_{2} - {∥ E_{22} ∥}_{2},

where, if

m > n

, Σ is understood to have

m - n

zero singular values. Set

κ = \frac{2 k_{2}}{1 - 2 k_{2} + \sqrt{1 - 4 k_{2}}}, k_{2} = \frac{γ η}{δ^{2}} .

If

\frac{γ}{δ} < \frac{1}{2},

then there are matrices

P \in R^{(n - k) \times k}

and

Q \in R^{(m - k) \times k}

satisfying

{∥ (P, Q) ∥}_{F} \leq (1 + κ) \frac{γ}{δ} < 2 \frac{γ}{δ}

(68)

such that

\tilde{X} = R (V_{1} + V_{2} P)

and

\tilde{Y} = R (U_{1} + U_{2} Q)

form a pair of singular subspaces of

A + δ A

.

Theorem 10 bounds the tangents of the angles between the perturbed,

\tilde{X}

, and unperturbed,

X

, right singular subspaces and the angles between the perturbed,

\tilde{Y}

, and unperturbed,

Y

, left singular subspaces of A. Thus, the maximum angles between perturbed and unperturbed singular subspaces fulfil

\begin{matrix} Φ_{\max} (\tilde{X}, X) \leq \arctan ((1 + κ) \frac{γ}{δ}), \end{matrix}

(69)

\begin{matrix} Θ_{\max} (\tilde{Y}, Y) \leq \arctan ((1 + κ) \frac{γ}{δ}) . \end{matrix}

(70)

Note that Theorem 10 produces equal bounds on the maximum angles between the perturbed and unperturbed right and left singular subspaces.

5.3. Perturbation Expansion Bound

Global perturbation bounds for singular subspaces that produce individual perturbation bounds for each subspace in a pair of singular subspaces are presented in [18].

Theorem 11.

Let

A \in R^{m \times n} (m \geq n)

and let

U = [U_{1}, U_{2}] \in R^{m \times m}, V = [V_{1}, V_{2}] \in R^{n \times n}

be orthogonal matrices with

U_{1} \in R^{m \times k}, V_{1} \in R^{n \times k}

such that

U^{H} A V = [\begin{matrix} Σ_{1} & 0 \\ 0 & Σ_{2} \end{matrix}],

(71)

Σ_{1} = diag (σ_{1}, \dots, σ_{k}), Σ_{2} = [\begin{matrix} Σ_{21} \\ 0 \end{matrix}], Σ_{21} = diag (σ_{k + 1}, \dots, σ_{n}) .

Assume that

σ_{j} \geq 0

for each j and the singular values of

Σ_{1}

are different from the singular values of

Σ_{2}

. Let

X = R (V_{1})

and

Y = R (U_{1})

. For

δ A \in R^{m \times n}

, let

E = U^{T} δ A V = [\begin{matrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{matrix}], E_{11} \in R^{k \times k} .

Moreover, let

\begin{matrix} κ_{X} & = & \max_{\begin{matrix} 1 \leq j \leq k \\ k + 1 \leq q \leq n \end{matrix}} \frac{\sqrt{σ_{j}^{2} + σ_{q}^{2}}}{| σ_{j}^{2} - σ_{q}^{2} |}, \\ κ_{Y} & = & \max_{\begin{matrix} 1 \leq j \leq k \\ k + 1 \leq q \leq m \end{matrix}} \frac{\sqrt{σ_{j}^{2} + σ_{q}^{2}}}{| σ_{j}^{2} - σ_{q}^{2} |}, \end{matrix}

where we define

σ_{n + 1} = \dots = σ_{m} = 0

if

m > n

and let

κ = \sqrt{κ_{X}^{2} + κ_{Y}^{2}}, γ = {||[\begin{matrix} E_{12}^{T} \\ E_{21} \end{matrix}]||}_{F}, ε = ∥ E_{11} ∥_{2} + {∥ E_{22} ∥}_{2} .

(72)

If

κ (2 γ + ε) < 1,

then there exists a pair,

\tilde{X} = R ({\tilde{V}}_{1})

and

\tilde{Y} = R ({\tilde{U}}_{1})

, of k-dimensional singular subspaces of

A + δ A

, such that

\begin{matrix} Φ_{\max} (\tilde{X}, X) \leq \arctan (\frac{2 κ_{X} γ}{1 - κ ε + \sqrt{{(1 - κ ε)}^{2} - 4 κ^{2} γ^{2}}}), \end{matrix}

(73)

\begin{matrix} Θ_{\max} (\tilde{Y}, Y) \leq \arctan (\frac{2 κ_{Y} γ}{1 - κ ε + \sqrt{{(1 - κ ε)}^{2} - 4 κ^{2} γ^{2}}}) . \end{matrix}

(74)
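The quantities of Theorem 11 depend only on the singular values and on the blocks of E, so the bounds (73) and (74) are cheap to evaluate. The NumPy sketch below is a hedged illustration: the function name `singular_subspace_bounds`, the convention of returning `None` when κ(2γ + ε) ≥ 1, and the small test matrix are assumptions made here. The index j is taken over all k selected singular values, and γ is formed from the off-diagonal blocks E_12 and E_21 of E.

```python
import numpy as np

def singular_subspace_bounds(sigma, m, k, E):
    """Bounds (73), (74) in radians.

    sigma : the n singular values, with the k selected ones first
    E     : U^T dA V, an m x n matrix partitioned after row/column k
    """
    sigma = np.asarray(sigma, dtype=float)
    n = sigma.size
    s1 = sigma[:k]
    s2_right = sigma[k:]                                     # q = k+1, ..., n
    s2_left = np.concatenate([sigma[k:], np.zeros(m - n)])   # q = k+1, ..., m

    def kappa(s2):
        a, b = s1[:, None], s2[None, :]
        return np.max(np.sqrt(a**2 + b**2) / np.abs(a**2 - b**2))

    kX, kY = kappa(s2_right), kappa(s2_left)
    kap = np.hypot(kX, kY)
    E11, E12, E21, E22 = E[:k, :k], E[:k, k:], E[k:, :k], E[k:, k:]
    gamma = np.linalg.norm(np.vstack([E12.T, E21]))          # Frobenius norm
    eps = np.linalg.norm(E11, 2) + np.linalg.norm(E22, 2)
    if kap * (2.0 * gamma + eps) >= 1.0:
        return None                                          # theorem not applicable
    den = 1.0 - kap * eps + np.sqrt((1.0 - kap * eps) ** 2 - 4.0 * kap**2 * gamma**2)
    return np.arctan(2.0 * kX * gamma / den), np.arctan(2.0 * kY * gamma / den)

# Illustrative test (assumed data): A is already in the form (71) with U = I, V = I.
rng = np.random.default_rng(1)
m, n, k = 5, 3, 2
sigma = np.array([3.0, 2.0, 1.0])
A = np.zeros((m, n))
A[:n, :n] = np.diag(sigma)
dA = 1e-6 * rng.standard_normal((m, n))
phi_bound, theta_bound = singular_subspace_bounds(sigma, m, k, dA)

# Actual maximum angles from the SVD of the perturbed matrix.
Ut, _, Vh = np.linalg.svd(A + dA)
phi_actual = np.arcsin(np.linalg.norm(Vh.T[k:, :k], 2))
theta_actual = np.arcsin(np.linalg.norm(Ut[k:, :k], 2))
print(phi_actual, phi_bound, theta_actual, theta_bound)
```

In this small experiment the computed angles can be compared directly with the bounds, since the leading k singular values of the perturbed matrix remain well separated from the rest.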

5.4. Bound by the Splitting Operator Method

Similarly to the perturbation analysis of the generalized Schur decomposition, performed by using the splitting operator method, it is appropriate first to find bounds on the entries of the matrices

δ W_{U} : = U^{T} δ U_{n}

and

δ W_{V} : = V^{T} δ V

, where

U_{n}

is a matrix that consists of the first n columns of U. The matrices

δ W_{U}

and

δ W_{V}

are related to the corresponding perturbations

δ U_{n} = U δ W_{U}

and

δ V = V δ W_{V}

by orthogonal transformations.

Let us define the vectors of the subdiagonal entries of the matrices

δ W_{U}

and

δ W_{V}

,

x = vec (Low (δ W_{U})), y = vec (Low (δ W_{V})) .

We have that

\begin{matrix} x & = & {[\underbrace{u_{2}^{T} δ u_{1}, u_{3}^{T} δ u_{1}, \dots, u_{m}^{T} δ u_{1}}_{m - 1}, \underbrace{u_{3}^{T} δ u_{2}, \dots, u_{m}^{T} δ u_{2}}_{m - 2}, \dots, \underbrace{u_{n + 1}^{T} δ u_{n}, \dots, u_{m}^{T} δ u_{n}}_{m - n}]}^{T} \\ & : = & {[x_{1}, x_{2}, \dots, x_{p}]}^{T} \in R^{p}, \\ y & = & {[\underbrace{v_{2}^{T} δ v_{1}, v_{3}^{T} δ v_{1}, \dots, v_{n}^{T} δ v_{1}}_{n - 1}, \underbrace{v_{3}^{T} δ v_{2}, \dots, v_{n}^{T} δ v_{2}}_{n - 2}, \dots, \underbrace{v_{n}^{T} δ v_{n - 1}}_{1}]}^{T} \\ & : = & {[y_{1}, y_{2}, \dots, y_{q}]}^{T} \in R^{q}, \end{matrix}

where

p = \sum_{i = 1}^{n} (m - i) = n (n - 1) / 2 + (m - n) n = n (2 m - n - 1) / 2, q = n (n - 1) / 2 .

The entries of x and y are indexed as

\begin{matrix} x_{k} & = & u_{i}^{T} δ u_{j}, k = i + (j - 1) m - \frac{j (j + 1)}{2}, \\ & & 1 \leq j \leq n, j < i \leq m, 1 \leq k \leq p, \\ y_{ℓ} & = & v_{i}^{T} δ v_{j}, ℓ = i + (j - 1) n - \frac{j (j + 1)}{2}, \\ & & 1 \leq j \leq n - 1, j < i \leq n, 1 \leq ℓ \leq q . \end{matrix}

Further on the quantities

x_{k}, k = 1, 2, \dots, p

and

y_{ℓ}, ℓ = 1, 2, \dots, q

will be considered as perturbation parameters since they determine the perturbations

δ U_{1}

and

δ V

of the singular vectors.
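The index formulas for x and y can be verified by direct enumeration. The sketch below is illustrative (the function name `low_index` and the sizes m = 6, n = 4 are assumptions). It runs through the strictly subdiagonal positions column by column, taking for y the columns j = 1, …, n − 1 and the rows i = j + 1, …, n, and checks that the formulas produce the consecutive positions 1, …, p and 1, …, q.

```python
def low_index(i, j, rows):
    """1-based position of entry (i, j), i > j, in vec(Low(.)) for a matrix with `rows` rows."""
    return i + (j - 1) * rows - j * (j + 1) // 2

m, n = 6, 4
p = n * (2 * m - n - 1) // 2      # number of entries of x
q = n * (n - 1) // 2              # number of entries of y

# x: columns j = 1..n of an m x n matrix, rows i = j+1..m
x_positions = [low_index(i, j, m) for j in range(1, n + 1) for i in range(j + 1, m + 1)]
# y: columns j = 1..n-1 of an n x n matrix, rows i = j+1..n
y_positions = [low_index(i, j, n) for j in range(1, n) for i in range(j + 1, n + 1)]

print(x_positions == list(range(1, p + 1)), y_positions == list(range(1, q + 1)))
```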

Let us represent the matrix

δ E = U^{T} δ A V

as

\begin{matrix} δ E & = & [\begin{matrix} δ E_{1} \\ δ E_{2} \end{matrix}], δ E_{1} \in R^{n \times n}, δ E_{2} \in R^{(m - n) \times n} . \end{matrix}

Define the vectors

x_{(1)} = Ω_{1} x \text{ and } x_{(2)} = Ω_{2} x,

where

\begin{matrix} Ω_{1} & : = & [diag (ω_{1}, ω_{2}, \dots, ω_{n})] \in R^{q \times p}, \\ ω_{k} & : = & [I_{n - k}, 0_{(n - k) \times (m - n)}] \in R^{(n - k) \times (m - k)}, k = 1, 2, \dots, n, \\ & & Ω_{1} Ω_{1}^{T} = I_{q}, {∥ Ω_{1} ∥}_{2} = 1, \\ Ω_{2} & : = & [diag (ω_{1}, ω_{2}, \dots, ω_{n})] \in R^{(m - n) n \times p}, \\ ω_{k} & : = & [0_{(m - n) \times (n - k)}, I_{m - n}] \in R^{(m - n) \times (m - k)}, k = 1, 2, \dots, n, \\ & & Ω_{2} Ω_{2}^{T} = I_{(m - n) n}, {∥ Ω_{2} ∥}_{2} = 1 . \end{matrix}

It is possible to show that

x = Ω_{1}^{T} x_{(1)} + Ω_{2}^{T} x_{(2)} .
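The splitting of x into x_(1) and x_(2) can be made concrete with a small example. The NumPy sketch below is an illustration under stated assumptions: the helper names `block_diag` and `build_omegas` and the sizes m = 6, n = 4 are chosen here. It builds Ω_1 and Ω_2 from their diagonal blocks and confirms that the two parts recombine to x.

```python
import numpy as np

def block_diag(blocks):
    """Block diagonal matrix; blocks with zero rows still contribute their columns."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

def build_omegas(m, n):
    """Omega_1 (q x p) and Omega_2 ((m-n)n x p) from their blocks, k = 1, ..., n."""
    om1 = [np.hstack([np.eye(n - k), np.zeros((n - k, m - n))]) for k in range(1, n + 1)]
    om2 = [np.hstack([np.zeros((m - n, n - k)), np.eye(m - n)]) for k in range(1, n + 1)]
    return block_diag(om1), block_diag(om2)

m, n = 6, 4
p = n * (2 * m - n - 1) // 2
q = n * (n - 1) // 2
Omega1, Omega2 = build_omegas(m, n)

x = np.arange(1.0, p + 1)                  # x_k = k, so the selection is easy to read
x1 = Omega1 @ x                            # entries with row index i <= n
x2 = Omega2 @ x                            # entries with row index i > n
x_back = Omega1.T @ x1 + Omega2.T @ x2
print(x1[:3], x2[:2], np.allclose(x_back, x))
```

For the first column, x_(1) collects the entries in rows 2, …, n and x_(2) those in rows n + 1, …, m, which is the separation between the square part δE_1 and the rectangular part δE_2 used in the subsequent equations.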

Following the analysis performed in [30], it can be shown that the unknown vectors

x_{(1)}

and y satisfy the system of linear equations

[\begin{matrix} - S_{1} & S_{2} \\ S_{2} & - S_{1} \end{matrix}] [\begin{matrix} x_{(1)} \\ y \end{matrix}] = - [\begin{matrix} f \\ g \end{matrix}] + [\begin{matrix} Δ_{1} \\ Δ_{2} \end{matrix}] .

(75)

where

\begin{matrix} S_{1} & = & diag (\underbrace{σ_{1}, σ_{1}, \dots, σ_{1}}_{n - 1}, \underbrace{σ_{2}, \dots, σ_{2}}_{n - 2}, \dots, \underbrace{σ_{n - 1}}_{1}), \end{matrix}

(76)

\begin{matrix} S_{2} & = & diag (\underbrace{σ_{2}, σ_{3}, \dots, σ_{n}}_{n - 1}, \underbrace{σ_{3}, \dots, σ_{n}}_{n - 2}, \dots, \underbrace{σ_{n}}_{1}), \\ & & S_{i} \in R^{q \times q}, i = 1, 2, \end{matrix}

(77)

f = vec (Low (δ E_{1})) \in R^{q}, g = vec ({(Up (δ E_{1}))}^{T}) \in R^{q},

and

Δ_{1}, Δ_{2}

contain higher-order terms in

δ U_{n}, δ A

and

δ V

.

In this way, determining the vectors

x_{(1)}

and y reduces to the solution of the system of symmetric coupled equations (75) with diagonal matrices of size

q \times q

. The vector

x_{(2)}

can be found from the separate equation

x_{(2)} = vec ((δ E_{2} + Δ_{3}) Σ_{n}^{- 1}),

(78)

where

Δ_{3}

contains higher-order terms in

δ u_{j}, δ v_{j}, δ A

and

Σ_{n}

, defined in (66).

Neglecting the higher-order terms

Δ_{1}, Δ_{2}

in (75), we obtain

[\begin{matrix} x_{(1)} \\ y \end{matrix}] = - [\begin{matrix} S_{x f} & S_{x g} \\ S_{y f} & S_{y g} \end{matrix}] [\begin{matrix} f \\ g \end{matrix}],

where, taking into account that

S_{1}

and

S_{2}

commute, we have that

\begin{matrix} S_{x f} & = & {(S_{2}^{2} - S_{1}^{2})}^{- 1} S_{1} \in R^{q \times q}, S_{y g} = S_{x f}, \\ S_{x g} & = & {(S_{2}^{2} - S_{1}^{2})}^{- 1} S_{2} \in R^{q \times q}, S_{y f} = S_{x g} . \end{matrix}
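The closed-form blocks above can be checked against a direct inversion of the coefficient matrix of (75). The NumPy sketch below is illustrative; the function name `build_S` and the four distinct singular values are assumptions.

```python
import numpy as np

def build_S(sigma):
    """Diagonal matrices S1, S2 of (76), (77) for singular values sigma_1, ..., sigma_n."""
    sigma = np.asarray(sigma, dtype=float)
    n = sigma.size
    # S1: sigma_j repeated n-j times; S2: sigma_{j+1}, ..., sigma_n, for j = 1, ..., n-1
    s1 = np.concatenate([np.full(n - j, sigma[j - 1]) for j in range(1, n)])
    s2 = np.concatenate([sigma[j:] for j in range(1, n)])
    return np.diag(s1), np.diag(s2)

sigma = [4.0, 2.5, 1.0, 0.3]               # distinct singular values (assumed data)
S1, S2 = build_S(sigma)
q = S1.shape[0]

D = np.linalg.inv(S2 @ S2 - S1 @ S1)       # (S2^2 - S1^2)^{-1}, diagonal
S_xf, S_xg = D @ S1, D @ S2
coupled = np.block([[-S1, S2], [S2, -S1]])             # coefficient matrix of (75)
closed_form = np.block([[S_xf, S_xg], [S_xg, S_xf]])   # claimed inverse

# Solve (75) without the higher-order terms for an arbitrary right-hand side.
rng = np.random.default_rng(0)
fg = rng.standard_normal(2 * q)
xy_direct = np.linalg.solve(coupled, -fg)
xy_closed = -closed_form @ fg
print(np.allclose(xy_direct, xy_closed))
```

Because all blocks are diagonal, the 2q × 2q system decouples into q independent 2 × 2 systems, one for each pair of singular values.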

The matrices

S_{x f}

and

S_{x g}

are diagonal matrices whose nontrivial entries are determined by the singular values

σ_{1}, \dots, σ_{n}

.

Hence, the components of the vectors

x_{(1)}

and y satisfy

\begin{matrix} | {x_{(1)}}_{ℓ} | & \leq & ∥ [{S_{x f}}_{ℓ, 1 : q}, {S_{x g}}_{ℓ, 1 : q}] ∥_{2} {∥[\begin{matrix} f \\ g \end{matrix}]∥}_{2}, \\ | y_{ℓ} | & \leq & ∥ [{S_{y f}}_{ℓ, 1 : q}, {S_{y g}}_{ℓ, 1 : q}] ∥_{2} {∥[\begin{matrix} f \\ g \end{matrix}]∥}_{2}, \\ & & ℓ = 1, 2, \dots, q . \end{matrix}

Taking into account the diagonal form of

S_{x f}, S_{x g}, S_{y f}, S_{y g}

, we obtain that

\begin{matrix} | {x_{(1)}}_{ℓ} | & \leq & | {S_{x f}}_{ℓ, ℓ} | | f_{ℓ} | + | {S_{x g}}_{ℓ, ℓ} | | g_{ℓ} |, \end{matrix}

(79)

\begin{matrix} | y_{ℓ} | & \leq & | {S_{y f}}_{ℓ, ℓ} | | f_{ℓ} | + | {S_{y g}}_{ℓ, ℓ} | | g_{ℓ} |, \\ ℓ = 1, 2, \dots, q . \end{matrix}

(80)

Since only one element of f and g participates in (79) and (80), these elements can be replaced by

{∥ δ A ∥}_{2}

, and we find that the linear approximations of the vectors

x_{(1)}

and y fulfil

\begin{matrix} | x_{(1)} | & ⪯ & x_{(1)}^{l i n}, | y | ⪯ y^{l i n}, \end{matrix}

(81)

\begin{matrix} {x_{(1)}^{l i n}}_{ℓ} & = & (| {S_{x f}}_{ℓ, ℓ} | + | {S_{x g}}_{ℓ, ℓ} |) {∥ δ A ∥}_{2}, \end{matrix}

(82)

\begin{matrix} {y^{l i n}}_{ℓ} & = & (| {S_{y f}}_{ℓ, ℓ} | + | {S_{y g}}_{ℓ, ℓ} |) {∥ δ A ∥}_{2}, \\ & & ℓ = 1, 2, \dots, q . \end{matrix}

(83)

where

S_{y g} = S_{x f}

,

S_{y f} = S_{x g}

.
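As a short worked step (added here for illustration and not part of the original derivation), consider the entry ℓ associated with the pair of indices (i, j), j < i \leq n. The corresponding diagonal entries of S_{1} and S_{2} are σ_{j} and σ_{i}, respectively, so that, assuming σ_{i} \neq σ_{j},

| {S_{x f}}_{ℓ, ℓ} | + | {S_{x g}}_{ℓ, ℓ} | = \frac{σ_{j} + σ_{i}}{| σ_{i}^{2} - σ_{j}^{2} |} = \frac{1}{| σ_{i} - σ_{j} |} .

Under this assumption, the estimates (82) and (83) coincide and reduce to {∥ δ A ∥}_{2} / | σ_{i} - σ_{j} |, i.e., the linear estimates of the perturbation parameters are inversely proportional to the gap between the two singular values involved.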

An asymptotic estimate of the vector

x_{(2)}

is obtained from (78), neglecting the higher-order term

Δ_{3}

. Since each element of

x_{(2)}

depends only on one element of

δ E_{2}

, we have that

\begin{matrix} x_{(2)}^{l i n} & = & vec (Z), \\ Z & = & {∥ δ A ∥}_{2} \times [\begin{matrix} 1 / σ_{1} & 1 / σ_{2} & \dots & 1 / σ_{n} \\ 1 / σ_{1} & 1 / σ_{2} & \dots & 1 / σ_{n} \\ ⋮ & ⋮ & ⋮ & ⋮ \\ 1 / σ_{1} & 1 / σ_{2} & \dots & 1 / σ_{n} \end{matrix}] . \end{matrix}

(84)

As a result of determining the linear estimates (82)–(84), we obtain an asymptotic approximation of the vector x as

| x | ⪯ x^{l i n},

where

x^{l i n} = Ω_{1}^{T} x_{(1)}^{l i n} + Ω_{2}^{T} x_{(2)}^{l i n} .

(85)

The matrices

| δ W_{U} |

and

| δ W_{V} |

can be estimated as

\begin{matrix} | δ W_{U} | & ⪯ & δ W_{U}^{l i n} + Δ^{W_{U}}, \end{matrix}

(86)

\begin{matrix} | δ W_{V} | & ⪯ & δ W_{V}^{l i n} + Δ^{W_{V}}, \end{matrix}

(87)

where

\begin{matrix} δ W_{U}^{l i n} & = & [\begin{matrix} 0 & x_{1}^{l i n} & x_{2}^{l i n} & \dots & x_{n - 1}^{l i n} \\ x_{1}^{l i n} & 0 & x_{m}^{l i n} & \dots & x_{m + n - 3}^{l i n} \\ x_{2}^{l i n} & x_{m}^{l i n} & 0 & \dots & x_{2 m + n - 6}^{l i n} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ x_{n - 1}^{l i n} & x_{m + n - 3}^{l i n} & x_{2 m + n - 6}^{l i n} & \dots & 0 \\ x_{n}^{l i n} & x_{m + n - 2}^{l i n} & x_{2 m + n - 5}^{l i n} & \dots & x_{(n - 1) (2 m - n) / 2 + 1}^{l i n} \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ x_{m - 1}^{l i n} & x_{2 m - 3}^{l i n} & x_{3 m - 6}^{l i n} & \dots & x_{p}^{l i n} \end{matrix}], \end{matrix}

(88)

\begin{matrix} δ W_{V}^{l i n} & = & [\begin{matrix} 0 & y_{1}^{l i n} & y_{2}^{l i n} & \dots & y_{n - 1}^{l i n} \\ y_{1}^{l i n} & 0 & y_{n}^{l i n} & \dots & y_{2 n - 3}^{l i n} \\ y_{2}^{l i n} & y_{n}^{l i n} & 0 & \dots & y_{3 n - 6}^{l i n} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ y_{n - 1}^{l i n} & y_{2 n - 3}^{l i n} & y_{3 n - 6}^{l i n} & \dots & 0 \end{matrix}] \end{matrix}

(89)

and the matrices

Δ^{W_{U}}, Δ^{W_{V}}

contain higher-order terms in

δ U, δ V

. The matrices

δ W_{U}^{l i n}

and

δ W_{V}^{l i n}

are asymptotic approximations of the matrices

δ W_{U}

and

δ W_{V}

, respectively.
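
The index pattern of (88) and (89) can be made concrete with a short Python sketch. It assumes, reading the patterns above, that the estimates fill the strictly lower part column by column and are mirrored above the diagonal inside the leading n × n block; the helper name `build_w_lin` and the length check are illustrative, not part of the paper.

```python
def build_w_lin(est, rows, cols):
    """Arrange a vector of linear estimates into the pattern of (88)/(89).

    The strictly lower part is filled column by column; entries that fall
    inside the leading cols-by-cols block are mirrored above the diagonal.
    Use rows = m, cols = n for (88) and rows = cols = n for (89).
    """
    # Number of strictly lower entries of a rows-by-cols matrix (rows >= cols)
    expected = cols * (2 * rows - cols - 1) // 2
    if len(est) != expected:
        raise ValueError(f"expected {expected} estimates, got {len(est)}")
    w = [[0.0] * cols for _ in range(rows)]
    idx = 0
    for j in range(cols):
        for i in range(j + 1, rows):
            w[i][j] = est[idx]
            if i < cols:          # mirror only inside the square top block
                w[j][i] = est[idx]
            idx += 1
    return w


# Pattern of (89) for n = 3 (three estimates y_1, y_2, y_3)
w_v = build_w_lin([0.1, 0.2, 0.3], rows=3, cols=3)
# Pattern of (88) for m = 4, n = 2 (five estimates x_1, ..., x_5)
w_u = build_w_lin([0.1, 0.2, 0.3, 0.4, 0.5], rows=4, cols=2)
```

For m = 4 and n = 2, the entry in row 3, column 2 is x_4 = x_m, which agrees with the position of x_m in (88).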

We have that

\begin{matrix} | δ U_{n} | ⪯ | U | | U^{T} δ U_{n} | ⪯ δ U_{n}^{l i n} = | U | δ W_{U}^{l i n}, \end{matrix}

(90)

\begin{matrix} | δ V | ⪯ | V | | V^{T} δ V | ⪯ δ V^{l i n} = | V | δ W_{V}^{l i n} . \end{matrix}

(91)

Assume that the singular value decomposition of A is reordered as

U^{T} A V = [\begin{matrix} Σ_{1} & 0 \\ 0 & Σ_{2} \end{matrix}],

(92)

where

U = [U_{1}, U_{2}]

and

V = [V_{1}, V_{2}]

are orthogonal matrices with

U_{1} \in R^{m \times k}

and

V_{1} \in R^{n \times k}, k < n

, and

Σ_{1} \in R^{k \times k}

contains the desired singular values. The matrices

{\tilde{U}}_{1}

and

U_{1}

are the orthonormal bases of the perturbed and unperturbed left singular subspace,

X

, of dimension

k < n

, and

{\tilde{V}}_{1}

and

V_{1}

are the orthonormal bases of the perturbed and unperturbed right singular subspace,

Y

, of the same dimension. We have that

\begin{matrix} \sin (Φ_{\max} (\tilde{X}, X)) & = & ∥ V_{2}^{T} δ V_{1} ∥_{2} = {∥ V_{1 : n, k + 1 : n}^{T} δ V_{1 : n, 1 : k} ∥}_{2}, \\ \sin (Θ_{\max} (\tilde{Y}, Y)) & = & ∥ U_{2}^{T} δ U_{1} ∥_{2} = {∥ U_{1 : m, k + 1 : m}^{T} δ U_{1 : m, 1 : k} ∥}_{2}, \\ k = 1, \dots, n - 1 . \end{matrix}

Using the asymptotic approximations of the elements of the vectors x and y, we obtain the following result.

Theorem 12.

Let the matrix A be decomposed as in (92). Given the spectral norm of the perturbation

δ A

, set the matrices

δ W_{U}^{l i n}

and

δ W_{V}^{l i n}

in (88), (89) using the linear estimates of the perturbation parameters x and y determined from (81)–(84). Then, the asymptotic bounds of the angles between the perturbed and unperturbed singular subspaces of dimension k are given by

\begin{matrix} Φ_{\max}^{l i n} (\tilde{X}, X) & = & \arcsin (∥ {δ W_{V}^{l i n}}_{k + 1 : n, 1 : k} ∥_{2}), \end{matrix}

(93)

\begin{matrix} Θ_{\max}^{l i n} (\tilde{Y}, Y) & = & \arcsin (∥ {δ W_{U}^{l i n}}_{k + 1 : m, 1 : k} ∥_{2}), \\ k = 1, \dots, n - 1 . \end{matrix}

(94)
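
As an illustration of how (93) and (94) might be evaluated, the following Python sketch extracts the block with rows k + 1 to the end and columns 1 to k, computes its spectral norm, and takes the arcsine. The helper names are hypothetical. The spectral norm is obtained by a plain power iteration (adequate for the small nonnegative matrices used here, not a production routine), and clipping the norm at 1 before calling `asin` is an added safeguard, not part of the theorem.

```python
import math


def spectral_norm(block, iters=200):
    """Largest singular value of a small dense matrix via power iteration on B^T B."""
    rows, cols = len(block), len(block[0])
    v = [1.0] * cols  # positive start vector; suitable for entrywise nonnegative blocks
    for _ in range(iters):
        w = [sum(block[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        u = [sum(block[i][j] * w[i] for i in range(rows)) for j in range(cols)]
        norm_u = math.sqrt(sum(t * t for t in u))
        if norm_u == 0.0:
            return 0.0
        v = [t / norm_u for t in u]
    w = [sum(block[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(t * t for t in w))


def angle_bound(w_lin, k):
    """arcsin of the 2-norm of w_lin[k+1:, 1:k] (1-based indices as in (93), (94))."""
    block = [row[:k] for row in w_lin[k:]]
    return math.asin(min(spectral_norm(block), 1.0))


# Toy 3 x 3 matrix with the symmetric zero-diagonal pattern of (89)
w_v_lin = [[0.0, 0.01, 0.02],
           [0.01, 0.0, 0.03],
           [0.02, 0.03, 0.0]]
bounds = [angle_bound(w_v_lin, k) for k in (1, 2)]
print(bounds)
```

The same function applies to δW_U^{lin}, whose block has m − k rows instead of n − k.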

The perturbation of the matrix of singular values satisfies

δ Σ_{n} = diag (E) + diag (Δ_{4}),

where

E = U^{T} δ A V

and

Δ_{4}

contains higher-order terms in

δ U, δ V

and

δ A

. Neglecting the higher-order terms, we obtain for the singular value perturbation the asymptotic bound

| δ σ_{i} | = | δ E_{i i} |, i = 1, 2, \dots, n .

(95)

Bounding each diagonal element

| δ E_{i i} |

by

{∥ δ A ∥}_{2}

, we find the normwise estimate

| δ σ_{i} | \leq {∥ δ A ∥}_{2},

which is in accordance with Weyl’s theorem ([44], Chapter 1). We have in a first-order approximation that

| δ σ_{i} | \leq δ σ_{i}^{l i n}, δ σ_{i}^{l i n} : = {∥ δ A ∥}_{2}, i = 1, 2, \dots, n .

(96)

5.5. Probabilistic Perturbation Bound

Implementing a derivation similar to the one used in the proof of Theorem 3, the probabilistic estimates

x^{e s t}

and

y^{e s t}

of the parameter vectors x and y can be obtained from the deterministic estimates

x^{l i n}

and

y^{l i n}

. To this end, the value of

{∥ δ A ∥}_{2}

in the expressions (82) and (83) for

x^{l i n}

and

y^{l i n}

, respectively, is replaced by the value of

{∥ δ A ∥}_{2} / Ξ_{2}

, where

Ξ_{2}

is determined from (4) for the specified value of

P^{r e f}

. According to (85), the probabilistic perturbation bound of x fulfils

x^{e s t} = Ω_{1}^{T} x_{(1)}^{l i n} / Ξ_{2} + Ω_{2}^{T} x_{(2)}^{l i n} / Ξ_{2},

where the estimates

x_{(1)}^{l i n}

and

x_{(2)}^{l i n}

satisfy (81) and (84), respectively. The bound of y is found from

y^{e s t} = y^{l i n} / Ξ_{2},

where

y^{l i n}

satisfies (83).

The bounds on

| δ U_{n} |

and

| δ V |

are determined from (90) and (91), respectively. According to (96), the probabilistic bound on the singular value perturbations is found from

δ σ_{i}^{e s t} = {∥ δ A ∥}_{2} / Ξ_{2}, i = 1, 2, \dots, n .

(97)
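
Because the linear estimates are proportional to the norm of δA, the probabilistic estimates of this subsection amount to a division by Ξ_2. A minimal Python sketch of that step follows; the function name and argument list are hypothetical, and Ξ_2 is taken as an input rather than computed from (4).

```python
def probabilistic_estimates(x_lin, y_lin, norm_dA, xi2):
    """Scale the deterministic linear estimates by 1 / Xi_2.

    x_lin, y_lin : lists with the deterministic estimates x^lin and y^lin
    norm_dA      : spectral norm of delta A
    xi2          : scaling factor Xi_2 for the chosen P^ref (must be positive)
    """
    if xi2 <= 0.0:
        raise ValueError("Xi_2 must be positive")
    x_est = [v / xi2 for v in x_lin]
    y_est = [v / xi2 for v in y_lin]
    sigma_est = norm_dA / xi2  # bound (97), identical for every singular value
    return x_est, y_est, sigma_est


# Toy data; only the ratio between inputs and outputs matters here
x_est, y_est, sigma_est = probabilistic_estimates(
    [4e-4, 8e-4], [2e-4], norm_dA=3.38e-8, xi2=4.0
)
print(x_est, y_est, sigma_est)
```

The componentwise bounds (90) and (91) and the angle bounds of Theorem 12 can then be recomputed with `x_est` and `y_est` in place of the deterministic estimates.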

5.6. Bound Comparison

Example 3.

Consider a

400 \times 200

matrix, taken as

A = U_{0} [\begin{matrix} Σ_{0} \\ 0_{(m - n) \times n} \end{matrix}] V_{0}^{T},

where

\begin{matrix} Σ_{0} & = & diag ({σ_{0}}_{1}, {σ_{0}}_{2}, \dots, {σ_{0}}_{n}), \\ {σ_{0}}_{i} = (n - i + 1) / 30000, i = 1, 2, \dots, n, \end{matrix}

the matrices

U_{0}

and

V_{0}

are constructed as in Example 2,

\begin{matrix} U_{0} & = & M_{2} S_{U} M_{1}, \\ M_{1} & = & I_{m} - 2 e e^{T} / m, M_{2} = I_{m} - 2 f f^{T} / m, \\ e & = & {[1, 1, 1, \dots, 1]}^{T}, f = {[1, - 1, 1, \dots, {(- 1)}^{m - 1}]}^{T}, \\ S_{U} & = & diag (1, σ, σ^{2}, \dots, σ^{m - 1}), \\ V_{0} & = & N_{2} S_{V} N_{1}, \\ N_{1} & = & I_{n} - 2 g g^{T} / n, N_{2} = I_{n} - 2 h h^{T} / n, \\ g & = & {[1, 1, 1, \dots, 1]}^{T}, h = {[1, - 1, 1, \dots, {(- 1)}^{n - 1}]}^{T}, \\ S_{V} & = & diag (1, τ, τ^{2}, \dots, τ^{n - 1}), \end{matrix}

and the matrices

M_{1}, M_{2}, N_{1}, N_{2}

are elementary reflections. The condition numbers of

U_{0}

and

V_{0}

with respect to the inversion are controlled by the variables σ and τ and are equal to

σ^{m - 1}

and

τ^{n - 1}

, respectively. In the given case,

σ = 1.015

,

τ = 1.03

and

cond (U_{0}) = 380.1464

,

cond (V_{0}) = 358.5979

.
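
The quoted condition numbers can be checked with a few lines of Python. The sketch relies on the fact that the reflections M_1, M_2, N_1, N_2 are orthogonal, so the 2-norm condition numbers of U_0 and V_0 coincide with those of the diagonal factors S_U and S_V; the variable names are illustrative.

```python
m, n = 400, 200
sigma, tau = 1.015, 1.03

# Diagonal of Sigma_0: sigma0_i = (n - i + 1) / 30000, i = 1, ..., n
sigma0 = [(n - i + 1) / 30000 for i in range(1, n + 1)]

# cond(S_U) = sigma^(m-1) / 1 and cond(S_V) = tau^(n-1) / 1, since both
# diagonals start at 1 and grow geometrically
cond_U0 = sigma ** (m - 1)
cond_V0 = tau ** (n - 1)

print(f"cond(U0) = {cond_U0:.4f}, cond(V0) = {cond_V0:.4f}")
print(f"largest sigma0 = {sigma0[0]:.6e}, smallest sigma0 = {sigma0[-1]:.6e}")
```

The two condition numbers agree with the values 380.1464 and 358.5979 given above to the printed precision.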

The perturbation of A is taken as

δ A = 10^{- 9} \times A_{0}

, where

A_{0}

is a matrix with normally distributed random entries generated by the MATLAB® function randn. This perturbation satisfies

{∥ δ A ∥}_{2} = 3.3800 \times 10^{- 8}

. The linear estimates

x^{l i n}

and

y^{l i n}

, which are of size 19900, are found by using (49) and (50), respectively, computing in advance the diagonal matrices

S_{x f}

and

S_{x g}

. These matrices satisfy

∥ S_{x f} ∥_{2} = 9.0330 \times 10^{3}, {∥ S_{x g} ∥}_{2} = 8.3205 \times 10^{3},

which means that the perturbations in A can be magnified nearly

10^{4}

times in x and y.

In Figure 9, we represent the mean value of the matrix

[Δ a_{i j} / | δ a_{i j} |]

and the scaling factor

Ξ_{2}

as a function of

P^{r e f}

. Since in the given case we use

{∥ δ A ∥}_{2}

instead of

{∥ δ A ∥}_{F}

, the value of

Ξ_{2} = (1 - P^{r e f}) \sqrt{m}

for a given probability

P^{r e f}

is relatively small. For instance, if

P^{r e f} = 50

%, then we have that

Ξ_{2} = 10

and the mean value of

[Δ a_{i j} / | δ a_{i j} |]

is equal to 285.23 (Table 3).
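
The scaling factor used in this example is easy to tabulate. The Python sketch below evaluates Ξ_2 = (1 − P^{ref})√m for m = 400 and divides the mean ratio obtained for Ξ_2 = 1 (2852.3, taken from Table 3) by it. The lower limit of 1 in `xi2` is an assumption inferred from the first row of Table 3, not a formula quoted from (4), and the function name is hypothetical.

```python
import math


def xi2(p_ref, m):
    """Scaling factor Xi_2 = (1 - P_ref) * sqrt(m), ASSUMED to be limited below by 1."""
    return max(1.0, (1.0 - p_ref) * math.sqrt(m))


m = 400
base_mean = 2852.3  # mean ratio for Xi_2 = 1, taken from Table 3

rows = []
for p_ref in (0.9, 0.8, 0.5):
    scale = xi2(p_ref, m)
    rows.append((p_ref, scale, base_mean / scale))
    print(f"P_ref = {p_ref:.1f}: Xi_2 = {scale:.1f}, mean ratio = {base_mean / scale:.2f}")
```

For P^{ref} = 0.5, this gives Ξ_2 = 10 and a mean ratio of 285.23, as stated in the text.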

In Figure 10, we compare the actual perturbations,

δ σ_{i}

, of the singular values with the normwise bound (96) and the probabilistic bound (97) of the singular value perturbations for

P^{r e f} = 90 %, 80 %

and

50 %

. The probabilistic perturbation bound

δ σ_{i}^{e s t} = {∥ δ A ∥}_{2} / Ξ_{2}

is tighter than the normwise bound

{∥ δ A ∥}_{2}

. Specifically,

δ σ_{i}^{e s t}

is 2 times smaller than

{∥ δ A ∥}_{2}

for

P^{r e f} = 0.9

, 4 times for

P^{r e f} = 0.8

and 10 times for

P^{r e f} = 0.5

. The inequality

| δ σ_{i} | \leq δ σ_{i}^{e s t}

is satisfied for all singular values and probabilities due to the small values of

Ξ_{2}

. Note that tighter probabilistic estimates can be obtained if instead of

{∥ δ A ∥}_{2}

we use

{∥ δ A ∥}_{F}

and the scaling parameter

Ξ = (1 - P^{r e f}) \sqrt{m n}

.
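
A rough numerical comparison of the two scalings can be sketched as follows. The Frobenius norm of δA is not reported in the text, so the sketch replaces it by the root-mean-square estimate 10^{-9}√(mn) for a matrix with standard normal entries scaled by 10^{-9}; this value, the lower limit of 1 on the scaling factor and the function name are all assumptions made for illustration only.

```python
import math


def xi_scale(p_ref, count):
    """(1 - P_ref) * sqrt(count), ASSUMED to be limited below by 1."""
    return max(1.0, (1.0 - p_ref) * math.sqrt(count))


m, n = 400, 200
norm2_dA = 3.38e-8                  # spectral norm reported in the example
normF_dA = 1e-9 * math.sqrt(m * n)  # ASSUMED RMS estimate, not a reported value

results = {}
for p_ref in (0.9, 0.8, 0.5):
    est_2 = norm2_dA / xi_scale(p_ref, m)      # bound based on the 2-norm
    est_F = normF_dA / xi_scale(p_ref, m * n)  # bound based on the Frobenius norm
    results[p_ref] = (est_2, est_F)
    print(f"P_ref = {p_ref:.1f}: 2-norm bound = {est_2:.3e}, F-norm bound = {est_F:.3e}")
```

Under these assumptions, the Frobenius-based estimate is smaller than the 2-norm-based one for each of the three probabilities, which is consistent with the remark above.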

In Figure 11 and Figure 12, we show the actual values of the angles between the perturbed and unperturbed right and left singular subspaces, respectively, along with the corresponding linear bounds and probabilistic bounds. For comparison, we also give the global bounds (69), (70) and (73), (74). As in the case of the deflating subspace global bounds, since the norms of parts of the perturbation matrix E are unknown, these norms are approximated by the 2-norms of the whole corresponding matrices. Clearly, the probabilistic bounds outperform all deterministic bounds. For instance, if

P^{r e f} = 0.5

, the probabilistic bounds are 10 times smaller than the deterministic asymptotic bound, as predicted by the analysis.

6. Conclusions

The splitting operator approach used in this paper allows us to derive unified asymptotic perturbation bounds of invariant, deflating and singular matrix subspaces that are comparable with the known perturbation bounds. These unified bounds make it easy to find probabilistic perturbation estimates of the subspaces that are considerably less conservative than the corresponding deterministic bounds, especially for high-order problems.

The proposed probabilistic estimates have two disadvantages. First, they can be conservative due to the properties of the Markoff inequality, so that the actual probability of the results obtained can be much better than that predicted by these estimates. Second, in the case of high-order problems, their computation requires much more memory than the known bounds due to the use of Kronecker products.

Funding

This research received no external funding.

Notation

$C$ ,	the set of complex numbers;
$j_{0} = \sqrt{- 1}$ ,	the imaginary unit;
$R$ ,	the set of real numbers;
$C^{n \times m}$ ,	the space of $n \times m$ complex matrices;
$R^{n \times m}$ ,	the space of $n \times m$ real matrices;
$A = [a_{i j}]$ ,	a matrix with entries $a_{i j}$ ;
$a_{j}$ ,	the jth column of A;
${\bar{a}}_{i j}$ ,	the complex conjugate of $a_{i j}$ ;
$A_{i, 1 : n}$ ,	the ith row of an $m \times n$ matrix A;
$A_{i_{1} : i_{2}, j_{1} : j_{2}}$ ,	the part of matrix A from row $i_{1}$ to $i_{2}$ and from column $j_{1}$ to $j_{2}$ ;
$Low (A)$ ,	the strictly lower triangular part of A;
$Up (A)$ ,	the strictly upper triangular part of A;
$diag (A)$ ,	the diagonal of A;
$diag (a_{1}, a_{2}, \dots, a_{n})$ ,	the square matrix with diagonal elements equal to $a_{1}, a_{2}, \dots, a_{n}$ ;
$| A |$ ,	the matrix of absolute values of the elements of A;
$A^{T}$ ,	the transposed A;
$A^{H}$ ,	the Hermitian conjugate of A;
$A^{- 1}$ ,	the inverse of A;
$0_{m \times n}$ ,	the zero $m \times n$ matrix;
$I_{n}$ ,	the unit $n \times n$ matrix;
$e_{j}$ ,	the jth column of $I_{n}$ ;
$δ A$ ,	the perturbation of A;
$\det (A)$ ,	the determinant of A;
$σ_{i} (A)$ ,	the ith singular value of A;
${∥ A ∥}_{2}$ ,	the spectral norm of A;
${∥ A ∥}_{F}$ ,	the Frobenius norm of A;
$: =$ ,	equal by definition;
⪯,	relation of partial order. If $a, b \in R^{n}$ , then $a ⪯ b$ means $a_{i} \leq b_{i}, i = 1, 2, \dots, n$ ;
$X = R (X)$ ,	the subspace spanned by the columns of X;
$U^{⊥}$ ,	the unitary complement of U, $U^{H} U^{⊥} = 0$ ;
$A \otimes B$ ,	the Kronecker product of A and B;
$gap (X, Y)$ ,	the gap between the subspaces $X$ and $Y$ ;
$sep (A, B)$ ,	the separation between A and B;
$dif (A, B; C, D)$ ,	the difference between the spectra of $A - λ B$ and $C - λ D$ ;
$vec (A)$ ,	the vec mapping of $A \in C^{n \times m}$ . If A is partitioned columnwise as $A = [a_{1}, a_{2}, \dots, a_{m}]$ , then $vec (A) = {[a_{1}^{T}, a_{2}^{T}, \dots, a_{m}^{T}]}^{T}$ ;
$P_{v e c}$ ,	the vec-permutation matrix. $vec (A^{T}) = P_{v e c} vec (A)$ ;
$P {y > y}$ ,	the probability of the event ${y > y}$ ;
$E {ξ}$ ,	the average value or mean of the random variable $ξ$ ;
$N {a_{i j} \geq b_{i j}}$ ,	the number of the entries of A that are greater than or equal to the corresponding entries of B.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets generated during the current study are available from the author upon reasonable request.

Acknowledgments

The author is grateful to the anonymous reviewers whose remarks and suggestions helped to improve the manuscript.

Conflicts of Interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.

References

1. Stewart, G.W.; Sun, J.-G. Matrix Perturbation Theory; Academic Press: New York, NY, USA, 1990; ISBN 978-0126702309.
2. Bhatia, R. Matrix factorizations and their perturbations. Linear Algebra Appl. 1994, 197–198, 245–276.
3. Li, R. Matrix perturbation theory. In Handbook of Linear Algebra, 2nd ed.; Hogben, L., Ed.; CRC Press: Boca Raton, FL, USA, 2014; ISBN 978-1-4665-0729-6.
4. Adhikari, S.; Friswell, M.I. Random matrix eigenvalue problems in structural dynamics. Int. J. Numer. Methods Eng. 2006, 69, 562–591.
5. Benaych-Georges, F.; Enriquez, N.; Michail, A. Eigenvectors of a matrix under random perturbation. Random Matrices Theory Appl. 2021, 10, 2150023.
6. Benaych-Georges, F.; Nadakuditi, R.R. The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Adv. Math. 2011, 227, 494–521.
7. Cape, J.; Tang, M.; Priebe, C.E. Signal-plus-noise matrix models: Eigenvector deviations and fluctuations. Biometrika 2019, 106, 243–250.
8. Michaïl, A. Eigenvalues and Eigenvectors of Large Matrices under Random Perturbations. Ph.D. Thesis, Université Paris Descartes, Paris, France, 2018. Available online: https://theses.hal.science/tel-02468213 (accessed on 20 August 2024).
9. O’Rourke, S.; Vu, V.; Wang, K. Eigenvectors of random matrices: A survey. J. Combin. Theory Ser. A 2016, 144, 361–442.
10. O’Rourke, S.; Vu, V.; Wang, K. Optimal subspace perturbation bounds under Gaussian noise. In Proceedings of the 2023 IEEE International Symposium on Information Theory (ISIT), Taipei, Taiwan, 25–30 June 2023; pp. 2601–2606.
11. Sun, J.-G. Perturbation expansions for invariant subspaces. Linear Algebra Appl. 1991, 153, 85–97.
12. Wilkinson, J. The Algebraic Eigenvalue Problem; Clarendon Press: Oxford, UK, 1965; ISBN 978-0-19-853418-1.
13. Greenbaum, A.; Li, R.-C.; Overton, M.L. First-order perturbation theory for eigenvalues and eigenvectors. SIAM Rev. 2020, 62, 463–482.
14. Bai, Z.; Demmel, J.; Mckenney, A. On computing condition numbers for the nonsymmetric eigenproblem. ACM Trans. Math. Softw. 1993, 19, 202–223.
15. Stewart, G.W. On the sensitivity of the eigenvalue problem Ax=λBx. SIAM J. Numer. Anal. 1972, 9, 669–686.
16. Stewart, G.W. Error and perturbation bounds for subspaces associated with certain eigenvalue problems. SIAM Rev. 1973, 15, 727–764.
17. Sun, J.-G. Perturbation bounds for the generalized Schur decomposition. SIAM J. Matrix Anal. Appl. 1995, 16, 1328–1340.
18. Sun, J.-G. Perturbation analysis of singular subspaces and deflating subspaces. Numer. Math. 1996, 73, 235–263.
19. Kågström, B.; Poromaa, P. Computing eigenspaces with specified eigenvalues of a regular matrix pair (A,B) and condition estimation: Theory, algorithms and software. Numer. Algorithms 1996, 12, 369–407.
20. Benaych-Georges, F.; Nadakuditi, R.R. The singular values and vectors of low rank perturbations of large rectangular random matrices. J. Multivariate Anal. 2012, 111, 120–135.
21. Konstantinides, K.; Yao, K. Statistical analysis of effective singular values in matrix rank determination. IEEE Trans. Acoustic. Speech Signal Proc. 1988, 36, 757–763.
22. Liu, H.; Wang, R. An Exact sin Θ Formula for Matrix Perturbation Analysis and Its Applications; ArXiv e-prints in Statistics Theory [math.ST]; Cornell University Library: Ithaca, NY, USA, 2020; pp. 1–31.
23. O’Rourke, S.; Vu, V.; Wang, K. Random perturbation of low rank matrices: Improving classical bounds. Lin. Algebra Appl. 2018, 540, 26–59.
24. O’Rourke, S.; Vu, V.; Wang, K. Matrices with Gaussian noise: Optimal estimates for singular subspace perturbation. IEEE Trans. Inform. Theory 2024, 70, 1978–2002.
25. Wang, K. Analysis of Singular Subspaces under Random Perturbations; ArXiv e-Prints in Statistics Theory [math.ST]; Cornell University Library: Ithaca, NY, USA, 2024; pp. 1–68.
26. Wang, R. Singular vector perturbation under Gaussian noise. SIAM J. Matrix Anal. Appl. 2015, 36, 158–177.
27. Stewart, G.W. Stochastic perturbation theory. SIAM Rev. 1990, 32, 579–610.
28. Edelman, A.; Rao, N.R. Random matrix theory. Acta Numer. 2005, 14, 1–65.
29. Petkov, P. Probabilistic perturbation bounds of matrix decompositions. Numer. Linear Algebra Appl. 2024.
30. Angelova, V.; Petkov, P. Componentwise perturbation analysis of the Singular Value Decomposition of a matrix. Appl. Sci. 2024, 14, 1417.
31. Konstantinov, M.; Petkov, P. Perturbation Methods in Matrix Analysis and Control; NOVA Science Publishers, Inc.: New York, NY, USA, 2020; Available online: https://novapublishers.com/shop/perturbation-methods-in-matrix-analysis-and-control (accessed on 20 August 2024).
32. Petkov, P. Componentwise perturbation analysis of the Schur decomposition of a matrix. SIAM J. Matrix Anal. Appl. 2021, 42, 108–133.
33. Petkov, P. Componentwise perturbation analysis of the QR decomposition of a matrix. Mathematics 2022, 10, 4687.
34. Zhang, G.; Li, H.; Wei, Y. Componentwise perturbation analysis for the generalized Schur decomposition. Calcolo 2022, 59.
35. Stewart, G.W. Matrix Algorithms; Vol. II: Eigensystems; SIAM: Philadelphia, PA, USA, 2001; ISBN 0-89871-503-2.
36. The MathWorks, Inc. MATLAB, Version 9.9.0.1538559 (R2020b); The MathWorks, Inc.: Natick, MA, USA, 2020; Available online: http://www.mathworks.com (accessed on 20 August 2024).
37. Papoulis, A. Probability, Random Variables and Stochastic Processes, 3rd ed.; McGraw Hill, Inc.: New York, NY, USA, 1991; ISBN 0-07-048477-5.
38. Horn, R.; Johnson, C. Matrix Analysis, 2nd ed.; Cambridge University Press: Cambridge, UK, 2013; ISBN 978-0-521-83940-2.
39. Golub, G.H.; Van Loan, C.F. Matrix Computations, 4th ed.; The Johns Hopkins University Press: Baltimore, MD, USA, 2013; ISBN 978-1-4214-0794-4.
40. Sun, J.-G. Stability and Accuracy. Perturbation Analysis of Algebraic Eigenproblems; Technical Report; Department of Computing Science, Umeå University: Umeå, Sweden, 1998; pp. 1–210.
41. Davis, C.; Kahan, W. The rotation of eigenvectors by a perturbation. III. SIAM J. Numer. Anal. 1970, 7, 1–46.
42. Horn, R.; Johnson, C. Topics in Matrix Analysis; Cambridge University Press: Cambridge, UK, 1991; ISBN 0-521-30587-X.
43. Bavely, C.A.; Stewart, G.W. An algorithm for computing reducing subspaces by block diagonalization. SIAM J. Numer. Anal. 1979, 16, 359–367.
44. Stewart, G.W. Matrix Algorithms; Vol. I: Basic Decompositions; SIAM: Philadelphia, PA, USA, 1998; ISBN 0-89871-414-1.

Figure 1. Mean value of

[Δ a_{i j} / | δ a_{i j} |]

as a function of

P^{r e f}

(left) and the mean value of

N {Δ a_{i j} > | δ a_{i j} |} / n^{2}

(right) as a function of

P^{r e f}

for two random distributions of the entries of a

100 \times 100

matrix.

Figure 2. Eigenvalue perturbations and their linear and probabilistic bounds.

Figure 3. Angles between the perturbed and unperturbed invariant subspaces and their bounds.

Figure 4. Mean values of

[Δ a_{i j} / | δ a_{i j} |]

and

[Δ b_{i j} / | δ b_{i j} |]

for two random distributions as functions of

P^{r e f}

.

Figure 5. Mean values of

N {Δ a_{i j} > | δ a_{i j} |} / n^{2}

and

N {Δ b_{i j} > | δ b_{i j} |} / n^{2}

for two random distributions as functions of

P^{r e f}

.

Figure 6. Chordal metric perturbation bounds of the generalized eigenvalues.

Figure 7. Angles between perturbed and unperturbed right deflating subspaces.

Figure 8. Angles between perturbed and unperturbed left deflating subspaces.

Figure 9. Mean value of the matrix

[Δ a_{i j} / | δ a_{i j} |]

(left) and the scaling factor (right) as a function of

P^{r e f}

.

Figure 10. Singular value perturbations and their bounds.

Figure 11. Angles between the perturbed and unperturbed right singular subspaces.

Figure 12. Angles between the perturbed and unperturbed left singular subspaces.

Table 1. The mean value of the ratios

[Δ a_{i j} / | δ a_{i j} |]

and the relative number of entries for which

N {Δ a_{i j} > | δ a_{i j} |} / n^{2}

, obtained for five values of

P^{r e f}

,

n = 100

.

$P^{ref}$ (%)	$Ξ$	$E {[Δ a_{ij} / | δ a_{ij} |]}$	$N {Δ a_{ij} > | δ a_{ij} |} / n^{2}$ (%)
$100.0$	$1.0000 \times 10^{0}$	$6.3838 \times 10^{2}$	$100.0000$
$90.0$	$1.0000 \times 10^{1}$	$6.3838 \times 10^{1}$	$100.0000$
$80.0$	$2.0000 \times 10^{1}$	$3.1919 \times 10^{1}$	$100.0000$
$60.0$	$4.0000 \times 10^{1}$	$1.5959 \times 10^{1}$	$98.8200$
$40.0$	$6.0000 \times 10^{1}$	$1.0640 \times 10^{1}$	$90.3400$

Table 2. The mean value of the ratios

[Δ a_{i j} / | δ a_{i j} |]

and

[Δ b_{i j} / | δ b_{i j} |]

, obtained for three values of

P^{r e f}

,

n = 80

.

$P^{ref}$ (%)	$Ξ$	$E {[Δ a_{ij} / | δ a_{ij} |]}$	$E {[Δ b_{ij} / | δ b_{ij} |]}$
$100.0$	$1.0000 \times 10^{0}$	$5.5561 \times 10^{2}$	$7.9274 \times 10^{2}$
$90.0$	$8.0000 \times 10^{0}$	$6.9451 \times 10^{1}$	$9.9093 \times 10^{1}$
$80.0$	$1.6000 \times 10^{1}$	$3.4725 \times 10^{1}$	$4.9546 \times 10^{1}$

Table 3. The mean value of the ratios

[Δ a_{i j} / | δ a_{i j} |]

and the relative number of entries for which

N {Δ a_{i j} > | δ a_{i j} |} / n^{2}

, obtained for five values of

P^{r e f}

,

n = 100

.

$P^{ref}$ (%)	$Ξ_{2}$	$E {[Δ a_{ij} / | δ a_{ij} |]}$	$N {Δ a_{ij} > | δ a_{ij} |} / n^{2}$ (%)
$100.0$	$1.0000 \times 10^{0}$	$2.8523 \times 10^{3}$	$100.0000$
$90.0$	$2.0000 \times 10^{0}$	$1.4262 \times 10^{3}$	$100.0000$
$80.0$	$4.0000 \times 10^{0}$	$7.1309 \times 10^{2}$	$100.0000$
$50.0$	$1.0000 \times 10^{1}$	$2.8523 \times 10^{2}$	$100.0000$
$20.0$	$1.6000 \times 10^{1}$	$1.7827 \times 10^{2}$	$100.0000$

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
