1. Introduction
As is known ([1], Ch. 2; [2], Ch. 1; [3], Ch. 2), each $m \times n$ matrix $A$ can be represented by the singular value decomposition (SVD) in the factorized form
$$A = U \Sigma V^{T}, \qquad (1)$$
where the $m \times m$ matrix $U$ and the $n \times n$ matrix $V$ are orthogonal and the $m \times n$ matrix $\Sigma$ is diagonal. The numbers $\sigma_1, \sigma_2, \dots, \sigma_n \ge 0$ on the diagonal of $\Sigma$ are called singular values of the matrix $A$. The columns of $U$ are called left singular vectors, and the columns of $V$ are the right singular vectors. The subspaces spanned by sets of left and right singular vectors are called left and right singular subspaces, respectively.
The singular value decomposition has a long and interesting history, described in [4].
The SVD has many properties making it an invaluable tool in matrix analysis and matrix computations; see the references cited above. Among them is the fact that the rank $r$ of $A$ equals the number of its nonzero singular values. The usual assumption is that, by an appropriate ordering of the columns of $U$ and $V$, the singular values appear in the order
$$\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0.$$
If some of the singular values are equal to zero, then $A$ has $r < n$ linearly independent columns, and the matrix $\Sigma$ in (1) can be represented as
$$\Sigma = \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix}, \qquad \Sigma_r = \mathrm{diag}(\sigma_1, \dots, \sigma_r).$$
Further on, we shall consider the case $r = n$, i.e., assume that the matrix $A$ is of full column rank.
This paper concerns the case when the matrix $A$ is subject to an additive perturbation $\delta A$. In such a case, there exists another pair of orthogonal matrices $\widetilde U$ and $\widetilde V$ and a diagonal matrix $\widetilde\Sigma$ such that
$$A + \delta A = \widetilde U \widetilde\Sigma \widetilde V^{T}.$$
The perturbation analysis of the singular value decomposition consists in determining the changes in the quantities related to the elements of the decomposition due to the perturbation $\delta A$. This includes determining bounds on the changes of the entries of the orthogonal matrices that reduce the original matrix to diagonal form and bounds on the perturbations of the singular values. Hence, the analysis aims to find bounds on the sizes of $\delta U = \widetilde U - U$, $\delta V = \widetilde V - V$, and $\delta\Sigma = \widetilde\Sigma - \Sigma$ as functions of the size of $\delta A$. It should be emphasized that problems of this kind arise in most applications of the singular value decomposition. The most important application of the perturbation analysis is to assess the accuracy of the computed SVD, since each algorithm for its determination produces the SVD of $A + \delta A$, not of $A$, where the size of $\delta A$ depends on the properties of the floating point arithmetic used and on the corresponding algorithm. Knowing the size of $\delta A$, we may use the perturbation bounds to estimate the difference between the actual and computed elements of the decomposition. For a deep and systematic presentation of matrix perturbation theory and its use in accuracy estimation, the reader is referred to the books of Stewart [2], Stewart and Sun [5], and Wilkinson [6].
According to Weyl’s theorem ([2], Ch. 1), we have that
$$|\widetilde\sigma_i - \sigma_i| \le \|\delta A\|_2, \qquad i = 1, \dots, n,$$
which shows that the singular values are perturbed by no more than the 2-norm of the perturbation of $A$; i.e., the singular values are always well conditioned. The SVD perturbation analysis is well defined if the matrix $A$ is of full column rank $n$, i.e., $\sigma_n > 0$, since otherwise the corresponding left singular vector is undetermined.
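As a simple numerical illustration of Weyl's bound, the following MATLAB/Octave sketch compares the change in the singular values with the 2-norm of the perturbation; the test matrix, the perturbation, and all variable names are arbitrary choices made here for illustration and are not taken from the paper.

% Illustrative sketch: Weyl's bound on the singular value perturbations.
% The matrix A and the perturbation E are arbitrary test data.
m = 6; n = 4;
A = rand(m, n);
E = 1e-6 * rand(m, n);          % small additive perturbation

s  = svd(A);                    % singular values of A (sorted decreasingly)
st = svd(A + E);                % singular values of A + E

max_change = max(abs(st - s));  % largest change in a singular value
bound      = norm(E, 2);        % Weyl's bound
fprintf('max |sigma_i(A+E) - sigma_i(A)| = %.3e <= ||E||_2 = %.3e\n', ...
        max_change, bound);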
The size of the perturbations $\delta A$, $\delta U$, $\delta V$, and $\delta\Sigma$ is usually measured by some matrix norm, which leads to the so-called normwise perturbation analysis. In several cases, we are interested in the size of the perturbations of the individual entries of $U$, $V$, and $\Sigma$, so that it is necessary to implement a componentwise perturbation analysis [7]. This analysis has an advantage when the individual components of $\delta U$, $\delta V$, and $\delta\Sigma$ differ significantly in magnitude and the normwise estimates do not produce tight bounds on the perturbations.
The literature on the perturbation theory of the singular value decomposition is significant. The first results on this topic were obtained by Wedin [8] and Stewart [9], who developed estimates of the sensitivity of pairs of singular subspaces (see also ([5], Ch. V)). Other results concerning the sensitivity of the singular vectors and singular subspaces were obtained in [10,11], and a survey on the perturbation theory of the SVD up to 1990 can be found in [12]. Using the technique of perturbation expansions for invariant subspaces, Sun [13] derived perturbation bounds for a pair of singular subspaces that improved the bounds obtained in [9] and contained as a special case the bounds presented in [11]. A perturbation theory for the singular values and singular subspaces of diagonally dominant matrices, including graded matrices, is presented in [14], and optimal perturbation bounds for the case of structured perturbations are derived in [15]. The seminal paper by Demmel and his co-authors [16] contains a perturbation theory that provides relative perturbation bounds for the singular values and singular subspaces for different classes of matrices. The problem of backward error analysis of SVD algorithms is discussed in [17]. High-accuracy algorithms for computing the SVD are proposed by Drmač and Veselić in [18,19] and implemented in the LAPACK package [20]. An improvement of the results of [8] is presented in [21]. Several results concerning the sensitivity of the SVD are summarized in the survey [22], and a rich bibliography on the accurate computation of the SVD can be found in [23]. Finally, some recent applications of the SVD are described in [24,25]. It should be pointed out that the available SVD perturbation theory provides bounds on the sensitivity of the singular vectors and singular subspaces but does not provide perturbation bounds on the individual entries of the matrices U and V. Such bounds are important in several applications, and this fact justifies the view that, despite the large number of results on the sensitivity of the singular values and singular vectors, a complete componentwise perturbation analysis of the SVD is not yet available.
This paper presents a rigorous perturbation analysis of the orthogonal matrices, singular subspaces, and singular values of a real matrix of full column rank. It is proved that the SVD perturbation problem is well posed only in the case of distinct (simple) singular values. The analysis produces asymptotic (local) componentwise perturbation bounds on the entries of the orthogonal matrices $U$ and $V$ and on the singular values of the given matrix. Local bounds are derived for the sensitivity of a pair of singular subspaces, measured by the angles between the unperturbed and perturbed subspaces. An iterative scheme is described to find global bounds on the respective perturbations, and the results of numerical experiments are presented. The analysis performed in the paper follows the same methodology as the one used previously in [26] to determine componentwise perturbation bounds of the QR decomposition of a matrix. However, the SVD perturbation analysis has some distinctive features, making it a problem in its own right.
The paper is organized as follows. In Section 2, we derive the basic nonlinear algebraic equations used to perform the perturbation analysis of the SVD. After introducing in Section 3 the perturbation parameters that determine the perturbations of the matrices $U$ and $V$, we derive a symmetric system of coupled equations for these parameters in Section 4. The solution of the equations for the first-order terms of the perturbation parameters allows us to find asymptotic bounds on the parameters in Section 5, on the singular values in Section 6, and on the perturbations of the matrices $U$ and $V$ in Section 7. Using the bounds on the perturbation parameters, in Section 8 we derive bounds on the sensitivity of the singular subspaces. In Section 9, we develop an iterative scheme for finding global bounds on the perturbations, and in Section 10 we present the results of some numerical experiments illustrating the proposed analysis. Some conclusions are drawn in Section 11.
3. Perturbed Orthogonal Matrices and Perturbation Parameters
In the perturbation analysis of the SVD, it is convenient first to find componentwise bounds on the entries of two auxiliary matrices that are related to the corresponding perturbations $\delta U$ and $\delta V$ by orthogonal transformations. Working with these auxiliary matrices allows us to find bounds on $\delta U$ and $\delta V$ using orthogonal transformations without increasing the norms of $\delta U$ and $\delta V$. This helps to determine bounds on $\delta U$ and $\delta V$ that are as tight as possible.
First, consider the auxiliary matrix associated with the left orthogonal factor. Further on, we shall use the vector $x$ of the subdiagonal entries of this matrix. As will become clear later on, together with the orthogonality condition (10), the vector $x$ contains all of the information necessary to find the perturbation $\delta U$. This vector may be obtained by stacking the subdiagonal entries column by column or, equivalently, by applying a selection matrix that “pulls out” the $p$ elements of $x$ from the elements of the vectorized matrix (in the degenerate case, this selection matrix is considered an empty matrix). For small dimensions, the selection matrix can be written out explicitly, and it gives the relationship between the subdiagonal entries of the matrix and the parameter vector $x$.
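The following MATLAB/Octave sketch illustrates this kind of selection; the matrix W, its size, and the variable names are arbitrary choices made here for illustration and are not the paper's notation.

% Illustrative sketch: stacking the strictly lower triangular (subdiagonal)
% entries of a square matrix column by column, as the selection matrix
% described above does when applied to the vectorized matrix.
n = 4;
W = magic(n);                   % any n-by-n matrix

mask = tril(true(n), -1);       % logical mask of the strict lower triangle
x    = W(mask);                 % subdiagonal entries, taken column by column

% equivalent formulation through an explicit selection matrix
S = eye(n^2);
S = S(mask(:), :);              % keep only the rows picking subdiagonal entries
x_alt = S * W(:);               % same vector as x
disp(norm(x - x_alt));          % prints 0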
In a similar way, we introduce the vector $y$ of the subdiagonal entries of the corresponding auxiliary matrix associated with $V$ (note that $V$ is a square matrix). Relations analogous to those for $x$ hold for $y$, with a corresponding selection matrix. Further on, the quantities $x$ and $y$ will be referred to as perturbation parameters, since they determine the perturbations $\delta U$ and $\delta V$ as well as the sensitivity of the singular values and singular subspaces.
First, consider the matrix whose entries are determined by the perturbation parameters.

Lemma 1. The matrix with entries depending only on the perturbation parameters is the linear (asymptotic) approximation of the auxiliary matrix considered above; i.e., for a sufficiently small perturbation, the two matrices differ only by second-order terms.

Proof. Using the vector $x$ of the perturbation parameters, the auxiliary matrix is written in a form in which the diagonal and superdiagonal entries remain to be determined.
First, we determine the entries of the superdiagonal part. According to the orthogonality condition (10), the entries of the strictly upper triangular part can be expressed in terms of the subdiagonal and diagonal entries. Now consider how to determine the diagonal entries of the matrix from the elements of $x$. According to (10), each diagonal entry satisfies a relation showing that it is always negative and depends quadratically on the entries of $x$. On the other hand, in a linear setting, the superdiagonal elements are equal to the corresponding subdiagonal elements with opposite sign. From (12) and (11), we obtain a quadratic equation for each diagonal entry. From the two possible solutions to this equation, we take the root for which the diagonal entry tends to zero together with the perturbation. The expression (14) allows us to find an approximation of the diagonal entries from the entries of the matrix. For a small perturbation (small values of the perturbation parameters), we also have the estimate, following from (11), that each diagonal entry is of second order. So, for small perturbations, the diagonal entries depend quadratically on the perturbation parameters.
Thus, the auxiliary matrix can be represented as a sum in which, according to (8), the first term has entries depending only on the perturbation parameters $x$, and the second term contains only second-order terms. Since the remaining matrices contain only second-order terms in the entries of $x$, and these entries tend to zero together with the perturbation, it follows from (15) that the first term is the linear approximation of the auxiliary matrix. □
Similarly, for the auxiliary matrix associated with $V$, it is possible to show that it can be represented as the sum of a matrix whose elements depend only on the perturbation parameters $y$ and a matrix containing only second-order terms. The diagonal entries are determined as in the case considered above.
4. Equations for the Perturbation Parameters
In this section, we derive exact nonlinear equations for the perturbation parameters $x$ and $y$ and equations for their linear approximations. At this stage, we assume that the perturbation $\delta A$ is known, but in Section 5 we show how to use these equations to find asymptotic approximations of $x$ and $y$ knowing only the norm of the perturbation.
The elements of the perturbation parameter vectors $x$ and $y$ can be determined from Equation (5). For this aim, it is appropriate to transform this equation as follows. Using the orthogonality of the factors, the equation is represented in the form (17). After transposing (9), we obtain (18). Substituting in (17) the corresponding term with the expression on the right-hand side of (18), we obtain (19), where the remainder contains higher-order terms in the entries of the perturbations. Replacing the matrices in (19) by their representations from the previous section, the equation is rewritten as (20) or, equivalently, as (21). Note that the remainder matrices in (21) contain only higher-order terms in the perturbation parameters.
Equation (21) is the basic equation of the perturbation analysis of the SVD performed in this paper. This equation represents a diagonal system of linear equations with respect to the entries of the unknown matrices, allowing us to solve it efficiently even for high-order matrices. Neglecting the higher-order terms in this equation, we obtain in Section 5 asymptotic bounds for the elements of the SVD. The approximation of these terms makes it possible to determine global perturbation bounds in Section 9.
The entries of the unknown matrices can be substituted by the corresponding elements of the vectors $x$ and $y$, as shown in the previous section. This leads to the representation of Equation (21) as two matrix equations, (22) and (23), with respect to two groups of the entries of $x$. We note that the estimation of some of the quantities entering these equations requires an estimate of a quantity that is not determined at this stage. Equations (22) and (23) are the basic equations of the SVD perturbation analysis. They can be used to obtain asymptotic as well as global perturbation bounds on the elements of the vectors $x$ and $y$.
Let us introduce two vectors containing, respectively, the elements of the unknown vector $x$ participating in (22) and the elements of $x$ participating in (23). It is easy to establish the relationship between $x$ and these two subvectors. Taking into account the structure of the factors, the strictly lower part of (22) can be represented column-wise as the system of linear Equations (26) with respect to these unknown subvectors and $y$. Similarly, the strictly upper part of (22) is represented row-wise as the system of Equations (29). It should be noted that the operators used in (26) and (29) take only the entries of the strict lower and strict upper part of the corresponding matrix, respectively. These entries are then arranged column by column, excluding the zeros above or below the diagonal; for small dimensions, the correspondence between the elements of the vectors $f$ and $g$ and the matrix entries can be written out explicitly. In this way, the solution to (22) reduces to the solution of the two symmetric coupled Equations (26) and (29) with diagonal coefficient matrices. Equation (23) can be solved independently, yielding (30). Note that the elements of the first subvector of $x$ depend on the elements of $y$ and vice versa, while the second subvector depends neither on the first one nor on $y$.
5. Asymptotic Bounds on the Perturbation Parameters
In this section, we determine linear (asymptotic) bounds on the perturbation parameter vectors $x$ and $y$ using only information about the norm of the perturbation $\delta A$.
Equations (26) and (29) can be used to determine asymptotic approximations of the vectors $x$ and $y$. The exact solution to these equations satisfies (31) and (32), where, taking into account that the diagonal factors involved commute, the coefficient matrices are given by (33) and (34). Exploiting these expressions, explicit bounds on the solution can be derived. Let us now consider the conditions for the existence of a solution to Equations (26) and (29).
Theorem 1. Equations (26) and (29) have a unique solution if and only if the singular values of $A$ are distinct.

Proof. Equations (26) and (29) have a unique solution for $x$ and $y$ if and only if the corresponding square symmetric block matrix is nonsingular or, equivalently, three associated matrices are nonsingular. (For the solution of linear systems of equations with block matrices, see ([27], Ch. II).) Two of these matrices are nonsingular since the matrix $A$ has nonzero singular values. In turn, a condition for the nonsingularity of the third matrix can be found by taking into account the structure of the coefficient matrices in (33) and (34). Clearly, the denominators of the first group of diagonal entries will be different from zero if $\sigma_1$ is distinct from the remaining singular values. Similarly, the denominators of the next group of diagonal entries will be different from zero if $\sigma_2$ is distinct from the remaining singular values, and so on; finally, $\sigma_{n-1}$ should be different from $\sigma_n$. Thus, we conclude that the required inverses exist and Equations (26) and (29) have a unique solution if and only if the singular values of $A$ are distinct. □
We note that Theorem 1 is in accordance with the results obtained in [14,28]. Such a result should not come as a surprise, since $U$ is the matrix of the transformation of $AA^{T}$ to Schur (diagonal) form $\Sigma\Sigma^{T}$ and $V$ is the matrix of the transformation of $A^{T}A$ to diagonal form $\Sigma^{T}\Sigma$. On the other hand, the perturbation problem for the Schur form is well posed only when the matrix eigenvalues (the diagonal elements of $\Sigma\Sigma^{T}$ or $\Sigma^{T}\Sigma$) are distinct.
Neglecting the higher-order terms in (31) and (32) and approximating each element of $f$ and $g$ by the perturbation norm $\|\delta A\|_2$, we obtain the following result.

Lemma 2. The linear approximations of the vectors $x$ and $y$ satisfy (36) and (37), where the coefficient matrices are given by (33) and (34), respectively.

Clearly, if these matrices have large diagonal entries, then the estimates of the perturbation parameters will be large. Using the expressions for the coefficient matrices, explicit bounds on their norms can be derived. Note that the norms of these matrices can be considered as condition numbers of the vectors $x$ and $y$ with respect to changes in $A$.
An asymptotic estimate of the remaining subvector of $x$ is obtained by neglecting the higher-order term and approximating its elements according to (30).

Lemma 3. The linear approximation of this subvector satisfies (38).

Equation (38) shows that a group of $n$ elements of $x$ will be large if the singular value associated with the corresponding column of $Z$ is small. The presence of large elements in the vector $x$ leads to large entries in the corresponding approximation and consequently in the estimate of $\delta U$. This observation aligns with the well-known fact that the sensitivity of a singular subspace is inversely proportional to the smallest singular value associated with that subspace.
As a result of determining the linear estimates (36)–(38), we obtain an asymptotic approximation of the vector $x$, given in (39). It should be emphasized that the determination of the linear bounds on $x$ and $y$ requires knowing only the norm of the perturbation $\delta A$.
Thanks to the diagonal structure of the coefficient matrices, the solutions to Equations (26)–(30) are computed efficiently and with high accuracy. The computation of the diagonal elements of these matrices, the determination of the estimates of $x$ and $y$ according to (36) and (37), and the evaluation of (38) each require only a modest number of floating point operations (flops), since every unknown is obtained from a single scalar equation. Also, the solution of the diagonal systems of equations is performed very accurately in floating point arithmetic.
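The following MATLAB/Octave sketch illustrates why a diagonal system is both cheap and accurate to solve; the data are arbitrary, and the snippet is not the paper's actual equations, only an illustration of the general point.

% Illustrative sketch: a linear system with a diagonal coefficient matrix
% reduces to one division per unknown, so it costs O(p) flops and is solved
% to full working accuracy, with no elimination or fill-in involved.
p = 5;
d = [2; -3; 0.5; 7; 1.25];      % diagonal entries (assumed nonzero)
f = rand(p, 1);                 % arbitrary right-hand side

z = f ./ d;                     % elementwise solve of diag(d) * z = f

z_dense = diag(d) \ f;          % general dense solve, for comparison
disp(norm(z - z_dense));        % prints (essentially) 0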
Example 1. Consider a matrix $A$ and assume that it is perturbed by a matrix whose entries are scaled by a varying parameter $c$. (For convenience, the entries of the matrix are taken as integers.) The singular value decompositions of the matrices $A$ and $A + \delta A$ are computed by the function svd of MATLAB® [29]. The singular values of $A$ are distinct. In the given case, the matrices in Equations (27) and (28), as well as the matrices participating in (31) and (32), can be written out explicitly. The matrix which determines the solution for $x$ and $y$ has a condition number with respect to the 2-norm equal to 26.9234179. The exact parameters and their linear approximations computed by using (36) and (38) are shown to eight decimal digits for two perturbation sizes in Table 1. The differences between the exact and approximate values are due to the bounding of the elements of the vectors $f$ and $g$ by the perturbation norm and to taking the terms in (36)–(38) with positive signs. Both approximations are necessary to ensure valid bounds for arbitrarily small perturbations. Similarly, in Table 2, we show for the same perturbations of $A$ the exact perturbation parameters and their linear approximations obtained from (37).

7. Asymptotic Bounds on the Perturbations of U and V
Having componentwise estimates for the elements of $x$ and $y$, it is possible to easily find asymptotic bounds on the entries of the matrices $\delta U$ and $\delta V$.
Theorem 2. The asymptotic bound on the matrix $\delta U$ is given by (42), where the parameters are determined by (39).

Proof. The proof follows directly from Lemma 1 and Equation (6). □
The linear approximation (42) gives bounds on the perturbations of the individual elements of the orthogonal transformation matrix $U$. Note that (42) is strictly valid only for an infinitesimally small perturbation $\delta A$. Similarly, the linear approximation of the matrix $\delta V$ is given by (43). Hence, its entries give asymptotic bounds on the perturbations of the entries of $V$.
Consider the volume of operations necessary to determine the perturbation bounds on $\delta U$ and $\delta V$, provided that the SVD of the matrix $A$ is already computed. The computation of the bounds in (42) and (43), together with the determination of the estimates of the perturbation parameters, requires only a limited number of additional floating point operations, so the cost of the perturbation analysis is modest.
For the matrix $A$ of Example 1, we computed the absolute values of the exact changes of the entries of $U$ for a given perturbation, together with their asymptotic componentwise estimates found by using (42); the corresponding quantities for $V$ were obtained according to (43). It is seen that the magnitudes of the entries of the estimates correctly reflect the magnitudes of the corresponding exact perturbations of $U$ and $V$, respectively. Note that the perturbations of the columns of $U$ and $V$ tend to increase with increasing column number.
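For completeness, the following MATLAB/Octave sketch shows one way such exact componentwise changes can be computed for a test matrix; the data and the sign-alignment step are illustrative assumptions, not the paper's procedure.

% Illustrative sketch: exact componentwise changes of the SVD factors.
% Singular vectors are defined only up to sign, so the perturbed vectors are
% sign-aligned with the unperturbed ones first.  For m > n only the leading
% n left singular vectors are uniquely determined, so the comparison is
% restricted to them.
m = 6; n = 4;
A = rand(m, n);
E = 1e-6 * rand(m, n);

[U , S , V ] = svd(A);
[Ut, St, Vt] = svd(A + E);

U1 = U(:, 1:n);  Ut1 = Ut(:, 1:n);

s = sign(diag(Ut1' * U1));  s(s == 0) = 1;   % column-by-column sign alignment
Ut1 = Ut1 * diag(s);
t = sign(diag(Vt' * V));    t(t == 0) = 1;
Vt  = Vt * diag(t);

dU = abs(Ut1 - U1);             % exact componentwise changes in U (first n columns)
dV = abs(Vt  - V );             % exact componentwise changes in V
dS = abs(diag(St) - diag(S));   % exact changes of the singular values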
8. Sensitivity of Singular Subspaces
The sensitivity of the left or the right singular subspace of dimension $r$ is measured by the canonical angles between the corresponding unperturbed and perturbed subspaces ([2], Ch. 4; [5], Ch. V; [30]).
Let the unperturbed left singular subspace corresponding to the first $r$ singular values be denoted by $\mathcal{U}_r$ and its perturbed counterpart by $\widetilde{\mathcal{U}}_r$, and let $U_1$ and $\widetilde U_1$ be orthonormal bases for $\mathcal{U}_r$ and $\widetilde{\mathcal{U}}_r$, respectively. Further on, the sensitivity of the singular subspace will be characterized by the maximum canonical angle between $\mathcal{U}_r$ and $\widetilde{\mathcal{U}}_r$, defined through its cosine as in (44). The expression (44) has the disadvantage that if the angle is small, then its cosine is close to one and the angle is not well determined numerically. To avoid this difficulty, instead of the cosine, it is preferable to work with the sine of the angle. Let $\widetilde U_2$ be the orthogonal complement of $\widetilde U_1$. Then, it is possible to show that the sine of the maximum canonical angle can be expressed in terms of $\widetilde U_2$ and $U_1$ [31].
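As a reminder of the standard characterization (with the basis symbols as introduced above; the subspace notation is assumed here, since the original expressions are not reproduced), the maximum canonical angle satisfies

\[
  \cos\theta_{\max}(\mathcal{U}_r,\widetilde{\mathcal{U}}_r)
     = \sigma_{\min}\!\bigl(U_1^{T}\widetilde U_1\bigr),
  \qquad
  \sin\theta_{\max}(\mathcal{U}_r,\widetilde{\mathcal{U}}_r)
     = \bigl\|\widetilde U_2^{T} U_1\bigr\|_2 ;
\]

see ([2], Ch. 4; [5], Ch. V) for this classical characterization of the canonical angles.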
Equation (46) shows that the sensitivity of the left singular subspace is related to the values of the perturbation parameters $x$. In particular, for $r = 1$, the sensitivity of the first column of $U$ (the left singular vector corresponding to $\sigma_1$) is determined by the first group of parameters; for $r = 2$, the next group of parameters is added, and so on (see Figure 1), where the matrices corresponding to different values of $r$ are highlighted in boxes. Similarly, utilizing the matrix built from the parameters $y$, it is possible to find the sine of the maximum angle between the unperturbed and the perturbed right singular subspace (see Figure 2). Hence, if the perturbation parameters are determined, it is possible to find sensitivity estimates of the nested left and right singular subspaces. Specifically, the exact maximum angle between the unperturbed and perturbed left singular subspace of dimension $r$, and likewise the maximum angle between the unperturbed and perturbed right singular subspace of dimension $r$, can be expressed directly in terms of the corresponding perturbation parameters.
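The following MATLAB/Octave sketch shows how such maximum canonical angles can be evaluated numerically for a test matrix; the data, the dimension r, and the variable names are arbitrary choices made here for illustration.

% Illustrative sketch: maximum canonical angle between the leading
% r-dimensional left singular subspaces of A and A + E.
m = 8; n = 5; r = 2;
A = rand(m, n);
E = 1e-5 * rand(m, n);

[U , ~, ~] = svd(A);
[Ut, ~, ~] = svd(A + E);

U1  = U(:, 1:r);                 % unperturbed basis
Ut1 = Ut(:, 1:r);                % perturbed basis
Ut2 = Ut(:, r+1:end);            % orthogonal complement of the perturbed basis

sin_theta = norm(Ut2' * U1, 2);  % sine of the maximum canonical angle
theta     = asin(min(sin_theta, 1));

theta_check = subspace(U1, Ut1); % should agree with theta
fprintf('max canonical angle = %.3e rad (check: %.3e)\n', theta, theta_check);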
Thus, we obtain the following result.

Theorem 3. The asymptotic estimate of the angle between the unperturbed and perturbed left singular subspace of dimension $r$ satisfies (49), where the parameters are determined from (39).

In particular, for the sensitivity of the range of $A$, a corresponding estimate is obtained as a special case. Similarly, for the angles between the unperturbed and perturbed right singular subspaces of dimension $r$, we obtain the linear estimates (50), where the corresponding parameters are determined from (37). We note that using separate $x$ and $y$ parameters decouples the SVD perturbation problem and makes it possible to determine the sensitivity estimates of the left and right singular subspaces independently. This is important when the left or the right subspace in a pair of singular subspaces is much more sensitive than its counterpart.
Consider as an example the same perturbed matrix $A$ as in Example 1. Computing the corresponding estimate matrices for the left and right factors, it is possible to estimate the sensitivity of all four pairs of singular subspaces of dimensions 1, 2, 3, and 4 corresponding to the chosen ordering of the singular values. In Table 4, we show the actual values of the left and right singular subspace sensitivities and the computed asymptotic estimates (49) and (50) of these sensitivities. To determine the sensitivity of other singular subspaces, it is possible to reorder the singular values in the initial decomposition so that the desired subspace appears in the set of nested singular subspaces. Note that asymptotic estimates of the canonical angles between an arbitrary unperturbed singular subspace, spanned by specific singular vectors combined in a basis matrix, and its perturbed counterpart can be determined by computing the singular values of the product of the transposed orthogonal complement (consisting of all singular vectors that do not participate in the chosen basis) and the linear estimate obtained by using (42). Similarly, it is possible to obtain asymptotic perturbation bounds for a desired right singular subspace.
10. Numerical Experiments
In this section, we present the results of some numerical experiments illustrating the properties of the asymptotic and global estimates obtained in the paper. The computations are performed with MATLAB® Version 9.9 (R2020b) [29] using IEEE double-precision arithmetic and are verified by using GNU Octave, v. 5.2.0. The M-files implementing the linear and nonlinear SVD perturbation estimates, along with the example files, are available from the authors.
Example 2. This example illustrates the ill-conditioning of the singular subspaces in the case of close singular values of the given matrix.
Consider a matrix $A$ obtained as a product whose factors are orthogonal and symmetric matrices (elementary reflections, ([2], Ch. 4)) and a diagonal factor depending on a parameter $\tau$ that varies over a prescribed range of values. In Figure 5, we present the actual values of the perturbation and of the corresponding angle, along with the asymptotic normwise estimate, as functions of the difference between the close singular values. Similarly, in Figure 6, we show the corresponding quantities for the right singular subspace together with the asymptotic estimate as functions of $\tau$. Note that one of the displayed angles is the larger of the canonical angles between the corresponding pair of subspaces.

The results shown in Figure 5 and Figure 6 confirm that the sensitivity of the SVD increases with decreasing distance between the singular values. According to (33), (34), (36) and (37), with decreasing $\tau$ the conditioning of the singular subspaces worsens and the norms of the perturbations increase. Hence, a potential source of ill-conditioning of the singular subspaces is the closeness of the singular values of the matrix. Another cause of ill-conditioning may be the presence of small singular values, which, according to (38), leads to large elements of the corresponding part of the parameter vector and to large norms of the resulting estimates. The separation of the parameter vector $x$ into two parts reveals the independent importance of these two causes.
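The effect of close singular values can be reproduced with a toy computation; in the following MATLAB/Octave sketch, the 2-by-2 matrix, the gap tau, and the perturbation are arbitrary illustrative choices, not the matrices used in the example above.

% Illustrative sketch: close singular values make the singular vectors very
% sensitive, while the singular values themselves stay well conditioned.
tau = 1e-6;                          % gap between the two singular values
A   = diag([1 + tau, 1]);            % 2-by-2 matrix with close singular values
E   = 1e-8 * [0 1; 1 0];             % small perturbation coupling the two directions

[U , S , V ] = svd(A);
[Ut, St, Vt] = svd(A + E);

fprintf('change in singular values : %.2e\n', norm(diag(St) - diag(S)));
fprintf('angle of 1st left vector  : %.2e rad\n', subspace(U(:,1), Ut(:,1)));
% The angle is roughly 1e-8 / tau = 1e-2 rad, i.e., orders of magnitude
% larger than the perturbation itself, while the singular values barely move.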
Example 3. In this example, we compare the perturbation bounds of the singular subspaces derived in this paper with some known bounds from the literature.
For the matrix $A$ and the perturbation $E$ given in the previous example, we first compare the sensitivity of the singular vector associated with the minimum singular value of $A$ with the respective estimate presented in ([5], Example 2, p. 267). In Figure 7, we show the exact value of the corresponding angle, the estimate given in [5], and the linear estimate derived in this paper as functions of the parameter $\tau$ (the estimate from [5] is valid within its stated range of applicability). Both estimates are very close for all values of $\tau$. We now compare the sensitivity of the left and right singular subspaces for the same matrix with the bounds derived in [13] and the nonlinear estimates obtained in this paper. Since the estimates in [13] require knowing the norms of parts of the exact perturbation, these norms are substituted by the overall perturbation norm for a fair comparison. In Figure 8, we show the exact value of the left subspace angle, the respective estimate from [13], and the estimate obtained in this paper, expressed through the canonical angles between the corresponding subspaces. The corresponding comparison for the right subspace angle with the estimate from [13] and the estimate of this paper is given in Figure 9. It is seen from the figures that for larger values of $\tau$ the estimates from [13] produce slightly better results, but for smaller values of $\tau$ these estimates significantly exceed the nonlinear estimates obtained in this paper (note that the estimates derived in [13] are likewise valid only within their stated range). The comparison of the estimates shows that in the case of ill-conditioned problems (small values of $\tau$), the bound presented in this paper is less conservative than the estimate given in [13].

Example 4. This example illustrates the properties of the linear and nonlinear perturbation bounds obtained in the paper.
Consider a matrix $A$ taken as a product of factors constructed as proposed in [34], where the constituent orthogonal matrices are Householder reflections. The condition numbers of the factors with respect to inversion are controlled by the variables $\sigma$ and $\tau$. The perturbation of $A$ is taken as a random matrix scaled by a factor determined by a negative number $c$, with the random entries generated by the MATLAB® function rand.
In Figures 10–15, we show several results related to the perturbations of the singular value decomposition of $A$ for 30 values of $c$. As particular examples, in Figure 10 we display the perturbations of an individual entry of $U$, and in Figure 11, the perturbations of an individual entry of $V$, both as functions of the perturbation size. The componentwise linear bounds correctly reflect the behavior of the actual perturbations and remain valid over a wide range of perturbation sizes; note that this holds for all elements of $U$ and $V$. The global (nonlinear) bounds practically coincide with the linear bounds but do not exist for perturbations whose size exceeds a certain threshold. In Figure 12 and Figure 13, we show the angles between the perturbed and unperturbed left and right singular subspaces of dimension 50. Again, the linear bounds on the angles are valid over a wide range of perturbation magnitudes, and this also holds for singular subspaces of other dimensions. Note that for sufficiently large perturbations, the linear estimates also become invalid. In Figure 14 and Figure 15, we show the perturbations of the singular values and their nonlinear bounds for two different perturbation sizes. While in the first case the nonlinear bound is close to the actual change of the singular values, in the second case the bound becomes significantly greater than the actual change due to the overestimation of the higher-order term. From the results shown in Figures 10–13, it follows that for the matrix in the example under consideration, the overestimates of the perturbations of $U$ and $V$ are approximately 200 times, while the overestimates of the subspace angles are approximately 100 and 170 times, respectively. In Figure 16 and Figure 17, we show the corresponding ratios along with their mean values (averages). The overestimates are mainly due to overestimating the perturbation parameters in Equations (36)–(38). In these equations, each entry of the right-hand side is replaced by the perturbation norm, leading to an overestimate approximately proportional to $m$ in the case of random perturbations. To illustrate this point, in Figure 18 and Figure 19 we show the entries of the perturbations of $U$ and $V$, respectively, along with the corresponding componentwise estimates computed from the exact (non-approximated) values of the perturbation parameters, obtained by using the exact perturbation. These estimates are very close to the exact values, which confirms their good quality. The small differences between the actual quantities and their estimates are caused by taking absolute values when forming the asymptotic bounds. We note that the overestimates are inevitable if we want to obtain guaranteed asymptotic perturbation bounds of the SVD elements, and they can be significantly reduced only if we use probabilistic perturbation bounds [28].

Example 5. This example visualizes the componentwise estimates of the orthogonal matrices and the singular values for a larger matrix.
Consider a matrix $A$ with $n = 150$ columns, constructed as in the previous example for fixed values of $\sigma$ and $\tau$. The perturbation of $A$ is taken as a scaled matrix with random entries.
In Figures 20–22, we show the entries of the perturbation of $U$, the absolute values of the exact changes of all 150 singular values, and the entries of the perturbation of $V$, respectively, along with the corresponding componentwise estimates. The computation of the linear bounds requires the solution of one system of linear equations for the coupled parameters and a separate set of equations for the remaining parameters. The nonlinear bounds are found in only 7 iterations and are visually indistinguishable from the corresponding linear bounds. The derived perturbation bounds were also tested on examples of higher order.
The examples presented in this section confirm that the new componentwise perturbation bounds of the singular value decomposition can be used efficiently in the analysis of high-order problems and may compare favorably with known bounds in some particular cases, for instance in the sensitivity analysis of the singular subspaces. The componentwise bounds on the perturbations of the orthogonal matrices $U$ and $V$ in the last example clearly show that the perturbation magnitudes of specific entries may differ by a large factor, so that normwise perturbation bounds are not informative for such examples. It should also be emphasized that the asymptotic estimates, although not global, ensure valid perturbation bounds for sufficiently large perturbations of the given matrix.
11. Conclusions
The paper presents new results related to the perturbation analysis of the singular value decomposition of a real rectangular matrix of full column rank. New asymptotic componentwise perturbation bounds are derived for the orthogonal matrices participating in the decomposition, and an alternative method for computing the sensitivity of the singular subspaces is proposed. The possibility of finding non-local bounds is illustrated by using a simple iterative procedure.
A potential disadvantage of the proposed perturbation bounds is their conservatism, i.e., the large difference between the bounds and the corresponding perturbations, especially for large values of $m$ and $n$. This is due to the necessity of replacing the entries of the actual perturbation in the derived bounds by its 2-norm, which leads to pessimistic estimates. This conservatism can be removed by using probabilistic perturbation bounds, which is a matter of further research.
The singular value decomposition perturbation analysis presented in this paper has some peculiarities that make it a challenging problem. On the one hand, the SVD analysis is simpler than other problems, such as the perturbation analysis of the orthogonal decomposition to triangular form (the QR decomposition). This is due to the diagonal form of the decomposed matrix, which, among other things, allows the equations for the perturbation parameters to be solved easily, avoiding the use of the Kronecker product. On the other hand, the presence of two orthogonal matrices in the decomposition requires the introduction of two different parameter vectors, which are mutually dependent due to the relationship between the perturbations of the two orthogonal matrices. This makes it necessary to solve a coupled system of equations for the parameter vectors, which complicates the analysis.
The analysis performed in the paper reveals two reasons for the ill-conditioning of the singular subspaces of a matrix. The first cause is the closeness of some singular values of $A$, which leads to large elements of one part of the parameter vector and consequently to large entries of the perturbations of the orthogonal factors. The second reason is the presence of small singular values of $A$, which is reflected by large elements of the other part of the parameter vector and also leads to large values of the respective entries of the perturbations. Significantly, these two reasons are independent of each other.