**2. Preliminaries**

For a symmetric positive definite matrix $W \in \mathbb{R}^{N\times N}$, the *W*-inner product is defined as follows:

$$
\langle \mathbf{x}, \mathbf{y} \rangle_W := \mathbf{y}^T W \mathbf{x}, \quad \forall \mathbf{x}, \mathbf{y} \in \mathbb{R}^N.
$$

If $\langle \mathbf{x}, \mathbf{y}\rangle_W = 0$, we denote this by $\mathbf{x} \perp_W \mathbf{y}$ and say that $\mathbf{x}$ and $\mathbf{y}$ are *W*-orthogonal. The projector $\Pi_W$ is called the *W*-orthogonal projector onto a subspace $\mathcal{Y}$ if for any $y \in \mathbb{R}^N$,

$$
\Pi_W y \in \mathcal{Y}, \quad (I - \Pi_W) y \perp_W \mathcal{Y}.
$$
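
As a quick illustration (a minimal sketch in Python/NumPy; the matrix `W`, the basis `Y`, and all sizes below are made-up example data, not from this paper), the $W$-orthogonal projector onto $\mathcal{Y} = \mathcal{R}(Y)$ can be formed as $\Pi_W = Y(Y^TWY)^{-1}Y^TW$ and checked against the two defining conditions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, l = 8, 3

# A symmetric positive definite weight matrix W (hypothetical example data).
C = rng.standard_normal((N, N))
W = C @ C.T + N * np.eye(N)

def w_inner(x, y, W):
    """W-inner product <x, y>_W = y^T W x."""
    return y.T @ W @ x

# A subspace Y = R(Y) and the W-orthogonal projector onto it.
Y = rng.standard_normal((N, l))
Pi_W = Y @ np.linalg.solve(Y.T @ W @ Y, Y.T @ W)

y = rng.standard_normal(N)
print(np.allclose(Y.T @ W @ (y - Pi_W @ y), 0))  # (I - Pi_W) y is W-orthogonal to R(Y)
```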

For two subspaces $\mathcal{X}, \mathcal{Y} \subseteq \mathbb{R}^N$ with $k = \dim(\mathcal{X}) \le \dim(\mathcal{Y}) = \ell$, suppose $X \in \mathbb{R}^{N\times k}$ and $Y \in \mathbb{R}^{N\times \ell}$ are *W*-orthonormal bases of $\mathcal{X}$ and $\mathcal{Y}$, respectively, i.e.,

$$X^T W X = I_k, \ \mathcal{X} = \mathcal{R}(X) \quad \text{and} \quad Y^T W Y = I_\ell, \ \mathcal{Y} = \mathcal{R}(Y),$$

and let $\nu_j$ for $j = 1, \cdots, k$ with $\nu_1 \le \cdots \le \nu_k$ be the singular values of $Y^TWX$. Then the *W*-canonical angles $\theta_W^{(j)}(\mathcal{X}, \mathcal{Y})$ from $\mathcal{X}$ to $\mathcal{Y}$ are defined by

$$0 \le \theta_W^{(j)}(\mathcal{X}, \mathcal{Y}) = \arccos \nu_j \le \pi/2, \quad \text{for } j = 1, \dots, k.$$

If $k = \ell$, these angles are said to be between $\mathcal{X}$ and $\mathcal{Y}$. Obviously, $\theta_W^{(1)}(\mathcal{X},\mathcal{Y}) \ge \cdots \ge \theta_W^{(k)}(\mathcal{X},\mathcal{Y})$. Set

$$
\Theta_W(\mathcal{X}, \mathcal{Y}) = \operatorname{diag}(\theta_W^{(1)}(\mathcal{X}, \mathcal{Y}), \dots, \theta_W^{(k)}(\mathcal{X}, \mathcal{Y})).
$$

In particular, if $k = 1$ and $X$ is a vector, there is only one *W*-canonical angle from $\mathcal{X}$ to $\mathcal{Y}$. In the following, we may use a matrix in one or both arguments of $\Theta_W(\cdot, \cdot)$, i.e., $\Theta_W(X, Y)$, with the understanding that it means the subspace spanned by the columns of the matrix argument.
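
A corresponding numerical sketch (Python/NumPy, with made-up data; the helper names are ours) computes the $W$-canonical angles by $W$-orthonormalizing both bases via Cholesky factors and taking the singular values of $Y^TWX$:

```python
import numpy as np

def w_orthonormalize(A, W):
    """Return a W-orthonormal basis of R(A): B^T W B = I."""
    R = np.linalg.cholesky(A.T @ W @ A).T   # A^T W A = R^T R with R upper triangular
    return A @ np.linalg.inv(R)

def w_canonical_angles(Xs, Ys, W):
    """W-canonical angles from R(Xs) to R(Ys), assuming dim R(Xs) <= dim R(Ys)."""
    X = w_orthonormalize(Xs, W)
    Y = w_orthonormalize(Ys, W)
    nu = np.linalg.svd(Y.T @ W @ X, compute_uv=False)   # singular values of Y^T W X
    return np.arccos(np.clip(nu, 0.0, 1.0))[::-1]       # theta^(1) >= ... >= theta^(k)

rng = np.random.default_rng(1)
N = 10
C = rng.standard_normal((N, N))
W = C @ C.T + N * np.eye(N)
print(w_canonical_angles(rng.standard_normal((N, 2)), rng.standard_normal((N, 4)), W))
```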

The following two lemmas are important for our later analysis; for proofs and more details, the reader is referred to [12,16].

**Lemma 1.** ([12] Lemma 3.2). *Let $\mathcal{X}$ and $\mathcal{Y}$ be two subspaces in $\mathbb{R}^N$ of equal dimension $\dim(\mathcal{X}) = \dim(\mathcal{Y}) = k$. Suppose $\theta_W^{(1)}(\mathcal{X}, \mathcal{Y}) < \pi/2$. Then, for any set $y_1, y_2, \cdots, y_{k_1}$ of basis vectors in $\mathcal{Y}$, where $1 \le k_1 \le k$, there is a set $x_1, x_2, \cdots, x_{k_1}$ of linearly independent vectors in $\mathcal{X}$ such that $\Pi_W x_j = y_j$ for $1 \le j \le k_1$, where $\Pi_W$ is the W-orthogonal projector onto $\mathcal{Y}$.*

**Lemma 2.** ([16] Proposition 3.1). *The matrix $MK$ has $N$ positive eigenvalues $\lambda_1^2 \ge \lambda_2^2 \ge \cdots \ge \lambda_N^2$ with $\lambda_j > 0$. The corresponding right eigenvectors $\xi_1, \cdots, \xi_N$ can be chosen K-orthonormal, and the corresponding left eigenvectors $\eta_1, \cdots, \eta_N$ can be chosen M-orthonormal. In particular, for given $\{\xi_j\}$, one can choose $\eta_j = \lambda_j^{-1}K\xi_j$, and for given $\{\eta_j\}$, $\xi_j = \lambda_j^{-1}M\eta_j$, for $j = 1, \cdots, N$.*
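
Lemma 2 is easy to verify numerically; the sketch below (Python/NumPy, with random SPD matrices $K$ and $M$ standing in for the paper's test problems) reduces $MK$ to the symmetric matrix $L^TML$ via the Cholesky factorization $K = LL^T$ and then checks the stated $K$- and $M$-orthonormality:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 6
A = rng.standard_normal((N, N)); K = A @ A.T + N * np.eye(N)   # hypothetical SPD K
B = rng.standard_normal((N, N)); M = B @ B.T + N * np.eye(N)   # hypothetical SPD M

L = np.linalg.cholesky(K)                       # K = L L^T
lam2, Q = np.linalg.eigh(L.T @ M @ L)           # lambda_j^2 > 0
Xi = np.linalg.solve(L.T, Q)                    # right eigenvectors of MK, K-orthonormal
Eta = K @ Xi / np.sqrt(lam2)                    # eta_j = lambda_j^{-1} K xi_j

print(np.allclose(M @ K @ Xi, Xi * lam2))       # MK xi_j = lambda_j^2 xi_j
print(np.allclose(Xi.T @ K @ Xi, np.eye(N)))    # xi_j are K-orthonormal
print(np.allclose(Eta.T @ M @ Eta, np.eye(N)))  # eta_j are M-orthonormal
```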

#### **3. Weighted Block Golub-Kahan-Lanczos Algorithm**

#### *3.1. Weighted Block Golub-Kahan-Lanczos Algorithm*

In this section, we introduce the weighted block Golub-Kahan-Lanczos algorithm (**wbGKL**) for LREP, which is a block version of the weighted Golub-Kahan-Lanczos algorithm [16]. Algorithm 1 gives the process of recursively generating the $M$-orthonormal matrix $\mathcal{X}_n$, the $K$-orthonormal matrix $\mathcal{Y}_n$, and the block bidiagonal matrix $\mathcal{B}_n$. Given $Y_1 \in \mathbb{R}^{N\times n_b}$ with $Y_1^TKY_1 = I_{n_b}$, and denoting $E_n^T = [0_{n_b\times(n-1)n_b}, I_{n_b}]$ and

$$\mathcal{X}_n = [X_1, \dots, X_n], \quad \mathcal{Y}_n = [Y_1, \dots, Y_n], \quad \mathcal{B}_n = \begin{bmatrix} A_1 & B_1 & & \\ & A_2 & \ddots & \\ & & \ddots & B_{n-1} \\ & & & A_n \end{bmatrix},$$

then we have the relation from Algorithm 1:

$$K\mathcal{Y}_n = \mathcal{X}_n \mathcal{B}_n, \quad M\mathcal{X}_n = \mathcal{Y}_n \mathcal{B}_n^T + Y_{n+1} B_n^T E_n^T,\tag{1}$$


and

$$
\mathcal{X}_n^T M \mathcal{X}_n = I_{nn_b} = \mathcal{Y}_n^T K \mathcal{Y}_n.
$$

**Remark 1.** *In Algorithm 1, we only consider the case that $\mathrm{rank}(X_j') = \mathrm{rank}(Y_{j+1}') = n_b$; no further treatment is provided for the cases $\mathrm{rank}(X_j') < n_b$ or $\mathrm{rank}(Y_{j+1}') < n_b$. Because $K$ and $M$ are both symmetric positive definite, the two matrices $W$ produced in* **Step 2** *are both invertible.*

**Remark 2.** *As $j$ increases in* **Step 2***, the $M$-orthogonality of $\mathcal{X}_j$ and the $K$-orthogonality of $\mathcal{Y}_j$ are slowly lost. Thus, in practice, we can add a re-orthogonalization process in each iteration to remedy this defect. The same strategy is used in the following algorithms.*

#### **Algorithm 1: wbGKL**

**1.** Choose $Y_1$ satisfying $Y_1^TKY_1 = I_{n_b}$, and set $W = I_{n_b}$, $B_0 = I_{n_b}$, $X_0 = 0_{N\times n_b}$. Compute $F = KY_1$.

**2.** For $j = 1, 2, \cdots, n$
   - $X_j' = FW - X_{j-1}B_{j-1}$
   - $F = MX_j'$
   - Do the Cholesky decomposition $X_j'^TF = W^TW$
   - $A_j = W$, $W = \mathrm{inv}(W)$, $X_j = X_j'W$ &nbsp; % $W = \mathrm{inv}(W)$ means $W = W^{-1}$
   - $Y_{j+1}' = FW - Y_jA_j^T$
   - $F = KY_{j+1}'$
   - Do the Cholesky decomposition $Y_{j+1}'^TF = W^TW$
   - $B_j = W^T$, $W = \mathrm{inv}(W)$, $Y_{j+1} = Y_{j+1}'W$

   End
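
The recursion in Algorithm 1 can be sketched in a few lines; the listing below (Python/NumPy, without re-orthogonalization; the function name and interface are ours) follows the steps above and assumes, as in Remark 1, that the intermediate blocks have full column rank so that the Cholesky factorizations succeed.

```python
import numpy as np

def wbgkl(K, M, Y1, n):
    """Sketch of Algorithm 1 (wbGKL): returns the blocks X_j, Y_j, A_j, B_j."""
    N, nb = Y1.shape
    Xs, Ys, As, Bs = [], [Y1], [], []
    W, Xprev, Bprev = np.eye(nb), np.zeros((N, nb)), np.eye(nb)
    F = K @ Y1
    for j in range(n):
        Xp = F @ W - Xprev @ Bprev               # X'_j = F W - X_{j-1} B_{j-1}
        F = M @ Xp
        R = np.linalg.cholesky(Xp.T @ F).T       # X'_j^T M X'_j = R^T R
        A_j, W = R, np.linalg.inv(R)
        Xj = Xp @ W                              # X_j (M-orthonormal block)
        Yp = F @ W - Ys[-1] @ A_j.T              # Y'_{j+1} = F W - Y_j A_j^T
        F = K @ Yp
        R = np.linalg.cholesky(Yp.T @ F).T       # Y'_{j+1}^T K Y'_{j+1} = R^T R
        B_j, W = R.T, np.linalg.inv(R)
        Yj1 = Yp @ W                             # Y_{j+1} (K-orthonormal block)
        Xs.append(Xj); Ys.append(Yj1); As.append(A_j); Bs.append(B_j)
        Xprev, Bprev = Xj, B_j
    return Xs, Ys, As, Bs
```

Stacking the blocks as $\mathcal{X}_n = [X_1, \dots, X_n]$ and $\mathcal{Y}_n = [Y_1, \dots, Y_n]$, and assembling $\mathcal{B}_n$ from the $A_j$, $B_j$ blocks, reproduces relation (1) up to rounding error.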

From (1), we have 

$$
\begin{bmatrix} 0 & M \\ K & 0 \end{bmatrix} \begin{bmatrix} \mathcal{Y}_n & 0 \\ 0 & \mathcal{X}_n \end{bmatrix} = \begin{bmatrix} \mathcal{Y}_n & 0 \\ 0 & \mathcal{X}_n \end{bmatrix} \begin{bmatrix} 0 & \mathcal{B}_n^T \\ \mathcal{B}_n & 0 \end{bmatrix} + \begin{bmatrix} Y_{n+1} \\ 0 \end{bmatrix} B_n^T E_{2n}^T
$$

with $E_{2n}^T = [0_{n_b\times(2n-1)n_b}, I_{n_b}]$. Then, the approximate eigenpairs of **H** can be obtained by solving a small eigenvalue problem for $\begin{bmatrix} 0 & \mathcal{B}_n^T \\ \mathcal{B}_n & 0 \end{bmatrix}$. Suppose $\mathcal{B}_n$ has a singular value decomposition

$$\mathcal{B}\_n = \Phi \Sigma\_n \Psi^T,\tag{2}$$

where $\Phi = [\phi_1, \phi_2, \cdots, \phi_{nn_b}]$, $\Psi = [\psi_1, \psi_2, \cdots, \psi_{nn_b}]$, and $\Sigma_n = \operatorname{diag}(\sigma_1, \sigma_2, \cdots, \sigma_{nn_b})$ with $\sigma_1 \ge \cdots \ge \sigma_{nn_b} > 0$. Thus, we can take $\pm\sigma_j$ $(1 \le j \le nn_b)$ as the Ritz values of **H** and

$$\tilde{z}_j = \frac{1}{\sqrt{2}} \begin{bmatrix} \mathcal{Y}_n \psi_j \\ \pm \mathcal{X}_n \phi_j \end{bmatrix}, \quad 1 \le j \le nn_b,$$

as the corresponding **K**-orthonormal Ritz vectors, where $\mathbf{K} = \begin{bmatrix} K & 0 \\ 0 & M \end{bmatrix}$.
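
Given the output of the previous sketch, the Ritz approximations described above can be assembled as follows (again a hedged Python/NumPy sketch with our own helper names; it builds $\mathcal{B}_n$, takes its SVD, and stacks the Ritz vectors for the positive Ritz values $+\sigma_j$):

```python
import numpy as np

def ritz_pairs(Xs, Ys, As, Bs):
    """Assemble B_n from the wbGKL blocks and form the Ritz approximations of H."""
    n, nb = len(As), As[0].shape[0]
    Bn = np.zeros((n * nb, n * nb))
    for j in range(n):
        Bn[j*nb:(j+1)*nb, j*nb:(j+1)*nb] = As[j]          # diagonal blocks A_j
        if j + 1 < n:
            Bn[j*nb:(j+1)*nb, (j+1)*nb:(j+2)*nb] = Bs[j]  # superdiagonal blocks B_j
    Phi, sigma, PsiT = np.linalg.svd(Bn)                  # B_n = Phi Sigma_n Psi^T
    Xn, Yn = np.hstack(Xs), np.hstack(Ys[:n])
    Z = np.vstack([Yn @ PsiT.T, Xn @ Phi]) / np.sqrt(2.0) # Ritz vectors for +sigma_j
    return sigma, Z
```

The Ritz vectors for $-\sigma_j$ are obtained by flipping the sign of the lower block $\mathcal{X}_n\phi_j$, and the columns of `Z` are $\mathbf{K}$-orthonormal by construction.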

#### *3.2. Convergence Analysis*

In this section, we first consider the convergence analysis when using the first few $\sigma_j$ as approximations to the first few $\lambda_j$. Then, similar results are presented for the case of using the last few $\sigma_j$ as approximations to the last few $\lambda_j$. A block Lanczos method with a suitable block size, not smaller than the size of an eigenvalue cluster, can compute all eigenvalues in the cluster. We therefore consider the $i$-th to $(i+n_b-1)$-st eigenpairs of LREP, in which the $k$-th to $\ell$-th eigenvalues form a cluster as in the following figure, with $1 \le i \le k \le \ell \le i+n_b-1 \le nn_b$ and $k \le n$.

Here, the squares of the eigenvalues of LREP are listed. Hence, motivated by [12,17], we analyze the convergence of the clustered eigenvalues and their corresponding eigenspace, and give error bounds for the approximate eigenpairs belonging to the eigenvalue cluster together, instead of separately for each individual eigenpair.

We first give some notation and equations that are critical for our main theorem. Note that from (1), we get

$$MK\mathcal{Y}_n = \mathcal{Y}_n \mathcal{B}_n^T \mathcal{B}_n + Y_{n+1} B_n^T A_n E_n^T. \tag{3}$$

Since (2) is the singular value decomposition of $\mathcal{B}_n$, the eigenvalues of $\mathcal{B}_n^T\mathcal{B}_n$ are $\sigma_j^2$ with associated eigenvectors $\psi_j$ for $1 \le j \le nn_b$.

From Lemma 2, if we let $\Xi = [\xi_1, \cdots, \xi_N]$ and $\Gamma = [\eta_1, \cdots, \eta_N]$, then $\Gamma = K\Xi\Lambda^{-1}$, and

$$\mathsf{MK}\Xi = \Xi\Lambda^2.\tag{4}$$

Write $\Xi$ and $\Lambda^2$ in partitioned form as

$$
\Xi = \begin{bmatrix} \Xi_1 & \Xi_2 & \Xi_3 \end{bmatrix}, \qquad \Lambda^2 = \begin{bmatrix} \Lambda_1^2 & & \\ & \Lambda_2^2 & \\ & & \Lambda_3^2 \end{bmatrix},
$$

where $\Xi_2 = \Xi_{(:,i:i+n_b-1)}$ and $\Lambda_2^2 = \operatorname{diag}(\lambda_i^2, \cdots, \lambda_{i+n_b-1}^2)$, with $\Xi_1$, $\Xi_3$, $\Lambda_1^2$, $\Lambda_3^2$ the corresponding leading and trailing blocks.

Let $\check{\Xi}_2 = \Xi_{(:,k:\ell)}$ and $\check{\Lambda}_2^2 = \operatorname{diag}(\lambda_k^2, \cdots, \lambda_\ell^2)$. Denote by $\mathcal{C}_j$ the first-kind Chebyshev polynomial of degree $j$, $0 \le j \le n$.

In the following, we assume

$$
\theta\_K^{(1)}(Y\_1, \Xi\_2) < \pi/2,\tag{5}
$$

i.e., $\mathrm{rank}(Y_1^TK\Xi_2) = n_b$. Then from Lemma 1, there exists $Z \in \mathbb{R}^{N\times(\ell-k+1)}$ with $\mathcal{R}(Z) \subseteq \mathcal{R}(Y_1)$ such that

$$
\Xi_2 \Xi_2^T K Z = \check{\Xi}_2. \tag{6}
$$

**Theorem 1.** *Suppose $\theta_K^{(1)}(Y_1, \Xi_2) < \pi/2$ and $Z$ satisfies (6). Then we have*

$$\|\operatorname{diag}(\lambda_k^2 - \sigma_k^2, \cdots, \lambda_\ell^2 - \sigma_\ell^2)\|_F \le (\lambda_k^2 - \lambda_N^2) \frac{\pi_{i,k,\ell}^2}{\mathcal{C}_{n-k}^2(1 + 2\gamma_{i,\ell})} \|\tan^2 \Theta_K(\check{\Xi}_2, Z)\|_F \tag{7}$$

*with*

$$\gamma_{i,\ell} = \frac{\lambda_\ell^2 - \lambda_{i+n_b}^2}{\lambda_{i+n_b}^2 - \lambda_N^2}, \quad \pi_{i,k,\ell} = \frac{\max_{i+n_b \le j \le N} \prod_{m=1}^{k-1} |\sigma_m^2 - \lambda_j^2|}{\min_{k \le t \le \ell} \prod_{m=1}^{k-1} |\sigma_m^2 - \lambda_t^2|},$$

*and*

$$\|\sin\Theta_K(\check{\Xi}_2, \mathcal{Y}_n\Psi_{(:,k:\ell)})\|_F \le \frac{\pi_{i,k}\sqrt{1 + c^2\|A_n^TB_n\|_2^2/\delta^2}}{\mathcal{C}_{n-i}(1 + 2\gamma_{i,\ell})}\|\tan\Theta_K(\check{\Xi}_2, Z)\|_F \tag{8}$$

*where the constant $c$ lies between $1$ and $\pi/2$, $c = 1$ if $k = \ell$, and*

$$\delta = \min_{\substack{k \le j \le \ell \\ p < k \text{ or } p > \ell}} |\lambda_j^2 - \sigma_p^2|, \quad \pi_{i,k} = \prod_{j=1}^{i-1} \frac{\lambda_j^2 - \lambda_N^2}{\lambda_j^2 - \lambda_k^2}.$$

*In particular, if $\sigma_{k-1}^2 \ge \lambda_k^2$, then*

$$
\pi_{i,k,\ell} = \prod_{m=1}^{k-1} \frac{|\sigma_m^2 - \lambda_N^2|}{|\sigma_m^2 - \lambda_k^2|}.
$$


**Proof.** Multiplying (4) from the left by $L^T$ (recall the Cholesky factorization $K = LL^T$), it can be rewritten as $L^TML(L^T\Xi) = (L^T\Xi)\Lambda^2$; hence $(\lambda_j^2, L^T\xi_j)$ is an eigenpair of $L^TML$ for $j = 1, \cdots, N$, and $L^T\xi_1, \cdots, L^T\xi_N$ are orthonormal. Applying the same process to (3), we have

$$L^TML\mathcal{V}_n = \mathcal{V}_n \mathcal{B}_n^T \mathcal{B}_n + V_{n+1} B_n^T A_n E_n^T, \tag{9}$$

where $\mathcal{V}_n = L^T\mathcal{Y}_n$, $V_{n+1} = L^TY_{n+1}$, and $\mathcal{V}_n^T\mathcal{V}_n = I_{nn_b}$; this can be seen as the relation generated by applying the standard block Lanczos process to $L^TML$. Thus, $\sigma_1^2, \cdots, \sigma_{nn_b}^2$ are the Ritz values of $L^TML$, with corresponding orthonormal Ritz vectors $\mathcal{V}_n\psi_1, \cdots, \mathcal{V}_n\psi_{nn_b}$.

Premultiplying Equation (6) by $L^T$, we have $L^T\Xi_2\Xi_2^TL(L^TZ) = L^T\check{\Xi}_2$. Consequently, the conditions of the block Lanczos convergence results, Theorems 4.1 and 5.1 in [17], are satisfied. Thus, using the result of Theorem 5.1 in [17], one has

$$\|\operatorname{diag}(\lambda_k^2 - \sigma_k^2, \cdots, \lambda_\ell^2 - \sigma_\ell^2)\|_F \le (\lambda_k^2 - \lambda_N^2) \frac{\pi_{i,k,\ell}^2}{\mathcal{C}_{n-k}^2(1 + 2\gamma_{i,\ell})} \|\tan^2 \Theta(L^T\check{\Xi}_2, L^TZ)\|_F.$$

Then the bound (7) follows easily by using ([21] Theorem 4.2)

$$
\Theta(L^T\check{\Xi}_2, L^TZ) = \Theta_K(\check{\Xi}_2, Z). \tag{10}
$$

Let $\Pi_n = \mathcal{V}_n\mathcal{V}_n^T$; then $\Pi_n$ is the orthogonal projector onto $\mathcal{R}(\mathcal{V}_n) = \mathcal{K}_n(L^TML, L^TY_1)$. Thus, from (9), we have

$$\begin{aligned} \|\Pi_n L^TML(I - \Pi_n)\|_2 &= \|\mathcal{V}_n\mathcal{V}_n^T L^TML(I - \mathcal{V}_n\mathcal{V}_n^T)\|_2 \\ &= \|\mathcal{V}_n(\mathcal{B}_n^T\mathcal{B}_n\mathcal{V}_n^T + E_nA_n^TB_nV_{n+1}^T) - \mathcal{V}_n\mathcal{B}_n^T\mathcal{B}_n\mathcal{V}_n^T\|_2 \\ &= \|\mathcal{V}_nE_nA_n^TB_nV_{n+1}^T\|_2 \\ &= \|A_n^TB_n\|_2. \end{aligned}$$

Consequently, applying the result of Theorem 4.1 in [17], we get

$$\begin{split} \|\sin\Theta(L^T\check{\Xi}_2, \mathcal{V}_n\Psi_{(:,k:\ell)})\|_F &\le \frac{\pi_{i,k}\sqrt{1 + \|\Pi_nL^TML(I-\Pi_n)\|_2^2/\delta^2}}{\mathcal{C}_{n-i}(1+2\gamma_{i,\ell})}\|\tan\Theta(L^T\check{\Xi}_2, L^TZ)\|_F \\ &= \frac{\pi_{i,k}\sqrt{1 + \|A_n^TB_n\|_2^2/\delta^2}}{\mathcal{C}_{n-i}(1+2\gamma_{i,\ell})}\|\tan\Theta(L^T\check{\Xi}_2, L^TZ)\|_F. \end{split}$$

Then the bound (8) can be derived by using $\Theta(L^T\check{\Xi}_2, \mathcal{V}_n\Psi_{(:,k:\ell)}) = \Theta_K(\check{\Xi}_2, \mathcal{Y}_n\Psi_{(:,k:\ell)})$ and (10).

Theorem 1 bounds the errors of the approximate eigenvalues for an eigenvalue cluster, including the case of multiple eigenvalues. It can also be applied to the single-eigenvalue case; the following corollary is derived by setting $k = \ell = i$, except for the left equality of (11), which still needs to be proved.

**Corollary 1.** *Suppose $\theta_K^{(1)}(Y_1, \Xi_2) < \pi/2$. Then for $1 \le i \le nn_b$, there exists a vector $y \in \mathcal{R}(Y_1)$ such that $\Xi_2\Xi_2^TKy = \xi_i$, and*

$$
\lambda_i^2 - \sigma_i^2 \le (\lambda_i^2 - \lambda_N^2) \frac{\pi_{i,j}^2}{\mathcal{C}_{n-i}^2(1 + 2\gamma_i)} \tan^2\theta_K(\xi_i, y),
$$

*with*

$$\gamma\_i = \frac{\lambda\_i^2 - \lambda\_{i+n\_b}^2}{\lambda\_{i+n\_b}^2 - \lambda\_N^2}, \quad \pi\_{i,j} = \max\_{i+n\_b \le j \le N} \prod\_{m=1}^{i-1} \frac{|\sigma\_m^2 - \lambda\_j^2|}{|\sigma\_m^2 - \lambda\_i^2|}.$$

*and*

$$\left( \Big(1 - \frac{\sigma_i^2}{\lambda_i^2}\Big) + \frac{\sigma_i^2}{\lambda_i^2}\sin^2\theta_M(\eta_i, \mathcal{X}_n\phi_i) \right)^{1/2} = \sin\theta_K(\xi_i, \mathcal{Y}_n\psi_i) \le \frac{\pi_i\sqrt{1 + \|A_n^TB_n\|_2^2/\delta^2}}{\mathcal{C}_{n-i}(1+2\gamma_i)}\tan\theta_K(\xi_i, y) \tag{11}$$

*with*

$$\delta = \min_{j \ne i} |\lambda_j^2 - \sigma_i^2|, \quad \pi_i = \prod_{j=1}^{i-1} \frac{\lambda_j^2 - \lambda_N^2}{\lambda_j^2 - \lambda_i^2}.$$

**Proof.** We only prove the left equality of (11). From (4) and Lemma 2, we have $\Xi = MK\Xi\Lambda^{-2} = M\Gamma\Lambda^{-1}$. If we let $Z_1 = (\mathcal{Y}_n\psi_i)^TK\xi_i$ and $Z_2 = (\mathcal{X}_n\phi_i)^TM\eta_i$, then we get $Z_1 = \frac{\sigma_i}{\lambda_i}Z_2$ by using $K\mathcal{Y}_n\Psi = \mathcal{X}_n\mathcal{B}_n\Psi = \mathcal{X}_n\Phi\Sigma_n$. Thus

$$\begin{split} \sin^2\theta_K(\xi_i, \mathcal{Y}_n\psi_i) &= 1 - \cos^2\theta_K(\xi_i, \mathcal{Y}_n\psi_i) \\ &= 1 - Z_1^TZ_1 \\ &= 1 - \frac{\sigma_i^2}{\lambda_i^2}Z_2^TZ_2 \\ &= 1 - \frac{\sigma_i^2}{\lambda_i^2}\cos^2\theta_M(\eta_i, \mathcal{X}_n\phi_i) \\ &= 1 - \frac{\sigma_i^2}{\lambda_i^2} + \frac{\sigma_i^2}{\lambda_i^2}\sin^2\theta_M(\eta_i, \mathcal{X}_n\phi_i). \end{split}$$

Then,

$$\sin\theta_K(\xi_i, \mathcal{Y}_n\psi_i) = \left(1 - \frac{\sigma_i^2}{\lambda_i^2} + \frac{\sigma_i^2}{\lambda_i^2}\sin^2\theta_M(\eta_i, \mathcal{X}_n\phi_i)\right)^{1/2}.$$

Next, we consider using the last few $\sigma_j$ to approximate the last few $\lambda_{N-nn_b+j}$, $j = k, \cdots, \ell$, where $\lambda_{N-nn_b+k}$ to $\lambda_{N-nn_b+\ell}$ form a cluster within $\lambda_{\hat i}$ to $\lambda_{\hat i+n_b-1}$, as described in the following figure, where $N+1-nn_b \le \hat i \le \hat k \le \hat\ell \le \hat i+n_b-1 \le N$, $nn_b - \ell + 1 \le n$, $\hat k = N-nn_b+k$, and $\hat\ell = N-nn_b+\ell$.

Similar to the above discussion for the first few eigenvalues, we can also obtain error bounds for the approximate last few eigenpairs belonging to an eigenvalue cluster together. We use the same notation, except that $\hat\Lambda_2^2 = \operatorname{diag}(\lambda_{\hat k}^2, \cdots, \lambda_{\hat\ell}^2)$ and $\hat\Xi_2 = \Xi_{(:,\hat k:\hat\ell)}$. Assuming $\theta_K^{(1)}(Y_1, \Xi_2) < \pi/2$, from Lemma 1 there exists $\hat Z \in \mathbb{R}^{N\times(\ell-k+1)}$ with $\mathcal{R}(\hat Z) \subseteq \mathcal{R}(Y_1)$ such that

$$
\Xi_2 \Xi_2^T K \hat{Z} = \hat{\Xi}_2. \tag{12}
$$


**Theorem 2.** *Suppose $\theta_K^{(1)}(Y_1, \Xi_2) < \pi/2$ and $\hat Z$ satisfies (12). Then we have*

$$\|\operatorname{diag}(\sigma_k^2 - \lambda_{\hat k}^2, \cdots, \sigma_\ell^2 - \lambda_{\hat\ell}^2)\|_F \le (\lambda_1^2 - \lambda_{\hat\ell}^2) \frac{\hat\pi_{\hat i,\hat k,\hat\ell}^2}{\mathcal{C}_{n-N+\hat\ell-1}^2(1 + 2\hat\gamma_{\hat i,\hat\ell})} \|\tan^2\Theta_K(\hat\Xi_2, \hat Z)\|_F \tag{13}$$

*with*

$$
\hat\gamma_{\hat i,\hat\ell} = \frac{\lambda_{\hat i-1}^2 - \lambda_{\hat k}^2}{\lambda_1^2 - \lambda_{\hat i-1}^2}, \quad \hat\pi_{\hat i,\hat k,\hat\ell} = \frac{\max_{1 \le j \le \hat i-1} \prod_{m=\ell+1}^{nn_b} |\sigma_m^2 - \lambda_j^2|}{\min_{\hat k \le t \le \hat\ell} \prod_{m=\ell+1}^{nn_b} |\sigma_m^2 - \lambda_t^2|},
$$

*and*

$$\|\sin\Theta_K(\hat\Xi_2, \mathcal{Y}_n\Psi_{(:,k:\ell)})\|_F \le \frac{\hat\pi_{\hat i,\hat\ell}\sqrt{1 + \hat c^2\|A_n^TB_n\|_2^2/\hat\delta^2}}{\mathcal{C}_{n+\hat i+n_b-N-2}(1 + 2\hat\gamma_{\hat i,\hat\ell})}\|\tan\Theta_K(\hat\Xi_2, \hat Z)\|_F \tag{14}$$

*where the constant $\hat c$ lies between $1$ and $\pi/2$, $\hat c = 1$ if $k = \ell$, and*

$$\hat\delta = \min_{\substack{\hat k \le j \le \hat\ell \\ p < k \text{ or } p > \ell}} |\lambda_j^2 - \sigma_p^2|, \quad \hat\pi_{\hat i,\hat\ell} = \prod_{j=\hat i+n_b}^{N} \frac{\lambda_1^2 - \lambda_j^2}{\lambda_{\hat\ell}^2 - \lambda_j^2}.$$

**Remark 3.** *Similar to Corollary 1, Theorem 2 can also be applied to the single-eigenvalue case; we omit the details here.*

**Remark 4.** *In Theorems 1 and 2, we use the Frobenius norm to estimate the accuracy of the eigenpair approximations; in fact, any unitarily invariant norm can be used.*

**Remark 5.** *Compared with the single-vector weighted Golub-Kahan-Lanczos method in [16], our convergence results show the superiority of the block version. For instance, in Corollary 1, the convergence rate of the approximate eigenvalue $\sigma_i$ is proportional to $\mathcal{C}_{n-i}^{-2}(1+2\gamma_i)$ with $\gamma_i = \frac{\lambda_i^2 - \lambda_{i+n_b}^2}{\lambda_{i+n_b}^2 - \lambda_N^2}$, which is obviously better than $\mathcal{C}_{n-i}^{-2}(1+2\tilde\gamma_i)$ with $\tilde\gamma_i = \frac{\lambda_i^2 - \lambda_{i+1}^2}{\lambda_{i+1}^2 - \lambda_N^2}$ in ([16] Theorem 3.4), since $\gamma_i \ge \tilde\gamma_i$. The additional cost of the block version is compensated by the improvement brought by $\gamma_i$, especially when the desired eigenvalues lie in a well-separated cluster [12].*
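
To make the comparison in Remark 5 concrete, one can evaluate both Chebyshev factors for an illustrative spectrum (all numbers in the sketch below are invented, not taken from the paper's experiments); for $x \ge 1$ we use $\mathcal{C}_m(x) = \cosh(m\,\operatorname{arccosh} x)$.

```python
import numpy as np

def cheb(m, x):
    """First-kind Chebyshev polynomial C_m(x) evaluated for x >= 1."""
    return np.cosh(m * np.arccosh(x))

# Illustrative squared eigenvalues (descending), with a tight cluster at the top.
lam2 = np.array([121.0, 120.9, 120.8, 100.0, 50.0, 1.0])
i, nb, n = 0, 3, 10                                                  # 0-based index of lambda_i
gamma_block  = (lam2[i] - lam2[i + nb]) / (lam2[i + nb] - lam2[-1])  # gamma_i (block version)
gamma_single = (lam2[i] - lam2[i + 1]) / (lam2[i + 1] - lam2[-1])    # gamma~_i (single-vector version)
print(cheb(n - i, 1 + 2 * gamma_block) ** -2)    # much smaller convergence factor
print(cheb(n - i, 1 + 2 * gamma_single) ** -2)
```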

#### **4. Thick Restart**

As the number of iterations increases, Algorithm 1 may encounter the dilemma that the amount of computation and storage increases sharply while the numerical stability gradually deteriorates. In this section, we apply the thick-restart strategy [20] to improve the algorithm. After running $n$ iterations, Algorithm 1 yields the following relations for LREP:

$$\begin{cases} K\mathcal{Y}_n = \mathcal{X}_n\mathcal{B}_n, \\ M\mathcal{X}_n = \mathcal{Y}_n\mathcal{B}_n^T + Y_{n+1}B_n^TE_n^T, \end{cases} \tag{15}$$

with $\mathcal{X}_n^TM\mathcal{X}_n = I_{nn_b} = \mathcal{Y}_n^TK\mathcal{Y}_n$.

Recall the SVD (2), and let $\Phi_k$ and $\Psi_k$ be the first $kn_b$ columns of $\Phi$ and $\Psi$, respectively, i.e.,

$$\Phi_k = [\phi_1, \phi_2, \dots, \phi_{kn_b}], \quad \Psi_k = [\psi_1, \psi_2, \dots, \psi_{kn_b}].$$

Thus it follows that

$$\mathcal{B}\_n \Psi\_k = \Phi\_k \Sigma\_k \quad \text{and} \quad \mathcal{B}\_n^T \Phi\_k = \Psi\_k \Sigma\_k \tag{16}$$

where $\Sigma_k = \operatorname{diag}(\sigma_1, \cdots, \sigma_{kn_b})$.
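
Selecting the leading $kn_b$ singular triplets is a one-line truncation of the SVD; the sketch below (Python/NumPy, hypothetical helper) keeps them and checks the identities in (16). As noted in Remark 7, the trailing triplets would be kept instead when the smallest eigenvalues are wanted.

```python
import numpy as np

def truncate_svd(Bn, k, nb):
    """Keep the first k*nb singular triplets of B_n and verify relation (16)."""
    Phi, sigma, PsiT = np.linalg.svd(Bn)
    m = k * nb
    Phi_k, Psi_k, Sigma_k = Phi[:, :m], PsiT.T[:, :m], np.diag(sigma[:m])
    assert np.allclose(Bn @ Psi_k, Phi_k @ Sigma_k)     # B_n Psi_k = Phi_k Sigma_k
    assert np.allclose(Bn.T @ Phi_k, Psi_k @ Sigma_k)   # B_n^T Phi_k = Psi_k Sigma_k
    return Phi_k, Psi_k, Sigma_k
```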

To use the approximate eigenvectors of **H** for the thick restart, we post-multiply the two equations in (15) by $\Psi_k$ and $\Phi_k$, respectively, and get

$$\begin{cases} K\mathcal{Y}_n\Psi_k = \mathcal{X}_n\mathcal{B}_n\Psi_k, \\ M\mathcal{X}_n\Phi_k = \mathcal{Y}_n\mathcal{B}_n^T\Phi_k + Y_{n+1}B_n^TE_n^T\Phi_k. \end{cases} \tag{17}$$

From (16), letting $\hat{\mathcal{Y}}_k = \mathcal{Y}_n\Psi_k$, $\hat{\mathcal{X}}_k = \mathcal{X}_n\Phi_k$, $\hat{\mathcal{B}}_k = \Sigma_k$, $\hat Y_{k+1} = Y_{n+1}$, $U^T = E_n^T\Phi_k$, and $\hat B_k = B_n$, (17) can be rewritten as

$$\begin{cases} K\hat{\mathcal{Y}}_k = \hat{\mathcal{X}}_k\hat{\mathcal{B}}_k, \\ M\hat{\mathcal{X}}_k = \hat{\mathcal{Y}}_k\hat{\mathcal{B}}_k^T + \hat Y_{k+1}\hat B_k^TU^T, \end{cases} \tag{18}$$

and $\hat{\mathcal{X}}_k^TM\hat{\mathcal{X}}_k = I_{kn_b} = \hat{\mathcal{Y}}_k^TK\hat{\mathcal{Y}}_k$.

Next, $\hat X_{k+1}$ and $\hat Y_{k+2}$ will be generated. Firstly, we compute

$$\begin{aligned} X_{k+1}' &= K\hat Y_{k+1} - \hat{\mathcal{X}}_k\hat{\mathcal{X}}_k^TMK\hat Y_{k+1} \\ &= K\hat Y_{k+1} - \hat{\mathcal{X}}_kU\hat B_k. \end{aligned}$$

From the second equation in (18), we know that $X_{k+1}'^TM\hat{\mathcal{X}}_k = 0$. Do the Cholesky decomposition $X_{k+1}'^TMX_{k+1}' = W^TW$, and set $\hat A_{k+1} = W$, $W = \mathrm{inv}(W)$. Compute $\hat X_{k+1} = X_{k+1}'W$, and let

$$
\hat{\mathcal{X}}_{k+1} = [\hat{\mathcal{X}}_k, \hat X_{k+1}], \quad \hat{\mathcal{B}}_{k+1} = \begin{bmatrix} \hat{\mathcal{B}}_k & U\hat B_k \\ 0 & \hat A_{k+1} \end{bmatrix},
$$

and define $\hat{\mathcal{Y}}_{k+1} = [\hat{\mathcal{Y}}_k, \hat Y_{k+1}]$; we then have

$$
K\hat{\mathcal{Y}}_{k+1} = \hat{\mathcal{X}}_{k+1}\hat{\mathcal{B}}_{k+1} \quad \text{with} \quad \hat{\mathcal{X}}_{k+1}^TM\hat{\mathcal{X}}_{k+1} = I_{(k+1)n_b}. \tag{19}
$$

Secondly, from the above equation, we compute

$$\begin{aligned} Y_{k+2}' &= M\hat X_{k+1} - \hat{\mathcal{Y}}_k\hat{\mathcal{Y}}_k^TKM\hat X_{k+1} - \hat Y_{k+1}\hat Y_{k+1}^TKM\hat X_{k+1} \\ &= M\hat X_{k+1} - \hat Y_{k+1}\hat A_{k+1}^T. \end{aligned}$$

Again using (19), it is easily seen that $Y_{k+2}'^TK\hat{\mathcal{Y}}_{k+1} = 0$. Similarly, do the Cholesky decomposition $Y_{k+2}'^TKY_{k+2}' = W^TW$, and let $\hat B_{k+1} = W^T$, $W = \mathrm{inv}(W)$. Computing $\hat Y_{k+2} = Y_{k+2}'W$, we get

$$
M\hat{\mathcal{X}}_{k+1} = \hat{\mathcal{Y}}_{k+1}\hat{\mathcal{B}}_{k+1}^T + \hat Y_{k+2}\hat B_{k+1}^TE_{k+1}^T \quad \text{with} \quad \hat{\mathcal{Y}}_{k+1}^TK\hat{\mathcal{Y}}_{k+1} = I_{(k+1)n_b}.
$$

Continuing the same procedure for $\hat X_{k+2}, \cdots, \hat X_n$ and $\hat Y_{k+3}, \cdots, \hat Y_{n+1}$, we obtain the new $M$-orthonormal matrix $\hat{\mathcal{X}}_n \in \mathbb{R}^{N\times nn_b}$, the new $K$-orthonormal matrix $\hat{\mathcal{Y}}_n \in \mathbb{R}^{N\times nn_b}$, the new matrix $\hat{\mathcal{B}}_n \in \mathbb{R}^{nn_b\times nn_b}$, and the relations

$$\begin{cases} K\hat{\mathcal{Y}}_n = \hat{\mathcal{X}}_n\hat{\mathcal{B}}_n, \\ M\hat{\mathcal{X}}_n = \hat{\mathcal{Y}}_n\hat{\mathcal{B}}_n^T + \hat Y_{n+1}\hat B_n^TE_n^T, \end{cases} \tag{20}$$

with $\hat{\mathcal{X}}_n^TM\hat{\mathcal{X}}_n = I_{nn_b} = \hat{\mathcal{Y}}_n^TK\hat{\mathcal{Y}}_n$, and

$$
\hat{\mathcal{B}}_n = \begin{bmatrix}
\hat{\mathcal{B}}_k & U\hat B_k & & \\
& \hat A_{k+1} & \hat B_{k+1} & \\
& & \ddots & \hat B_{n-1} \\
& & & \hat A_n
\end{bmatrix}.
$$

Note that $\hat{\mathcal{B}}_n$ is no longer a block bidiagonal matrix. Algorithm 2 is our thick-restart weighted block Golub-Kahan-Lanczos algorithm for LREP.

**Remark 6.** *Actually, from the construction of $\hat{\mathcal{B}}_n$, we can see that the procedure for obtaining $\hat X_{k+2}, \cdots, \hat X_n$ and $\hat Y_{k+3}, \cdots, \hat Y_{n+1}$ is the same as applying Algorithm 1 to $\hat Y_{k+2}$ for $n-k-1$ iterations; thus we use Algorithm 1 directly in the restarting* **Step 2** *of the following Algorithm 2.*

#### **Algorithm 2: wbGKL-TR**

**1.** Given an initial guess $Y_1$ satisfying $Y_1^TKY_1 = I_{n_b}$, a tolerance $tol$, an integer $k$ (the number of blocks of approximate eigenvectors to be added to the solving subspace), an integer $n$ (the block dimension of the solving subspace), and $w$, the desired number of eigenpairs;

**2.** Apply Algorithm 1 from the current point to generate the rest of $\mathcal{X}_n$, $\mathcal{Y}_{n+1}$, and $\mathcal{B}_n$. If it is the first cycle, the current point is $Y_1$; otherwise it is $\hat Y_{k+2}$;

**3.** Compute an SVD of $\mathcal{B}_n$ as in (2), and select $w$ ($w \le nn_b$) wanted singular values $\sigma_j$ together with their associated left singular vectors $\phi_j$ and right singular vectors $\psi_j$. Form the approximate eigenpairs of **H**; if the stopping criterion is satisfied, stop, else continue;

**4.** Generate the new $\hat X_{k+1}$, $\hat Y_{k+2}$, and $\hat{\mathcal{B}}_{k+1}$:
   - Compute $\hat{\mathcal{Y}}_k = \mathcal{Y}_n\Psi_k$, $\hat{\mathcal{X}}_k = \mathcal{X}_n\Phi_k$, $\hat{\mathcal{B}}_k = \Sigma_k$, $\hat Y_{k+1} = Y_{n+1}$, $U^T = E_n^T\Phi_k$, $\hat B_k = B_n$;
   - Compute $X_{k+1}' = K\hat Y_{k+1} - \hat{\mathcal{X}}_kU\hat B_k$; do the Cholesky decomposition $X_{k+1}'^TMX_{k+1}' = W^TW$; set $\hat A_{k+1} = W$, $W = \mathrm{inv}(W)$, $\hat X_{k+1} = X_{k+1}'W$;
   - Compute $Y_{k+2}' = M\hat X_{k+1} - \hat Y_{k+1}\hat A_{k+1}^T$; do the Cholesky decomposition $Y_{k+2}'^TKY_{k+2}' = W^TW$; set $\hat B_{k+1} = W^T$, $W = \mathrm{inv}(W)$, $\hat Y_{k+2} = Y_{k+2}'W$;
   - Let $\mathcal{X}_{k+1} = \hat{\mathcal{X}}_{k+1} = [\hat{\mathcal{X}}_k, \hat X_{k+1}]$, $\mathcal{B}_{k+1} = \hat{\mathcal{B}}_{k+1} = \begin{pmatrix} \hat{\mathcal{B}}_k & U\hat B_k \\ 0 & \hat A_{k+1} \end{pmatrix}$, $\mathcal{Y}_{k+2} = \hat{\mathcal{Y}}_{k+2} = [\hat{\mathcal{Y}}_k, \hat Y_{k+1}, \hat Y_{k+2}]$, and go to **Step 2**.
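
Step 4 of Algorithm 2 can be sketched compactly as below (Python/NumPy; the function and variable names are ours, and the inputs — $\mathcal{X}_n$, $\mathcal{Y}_n$, $Y_{n+1}$, the last block $B_n$, and the truncated SVD factors from Step 3 — are assumed to be available from the current cycle; as in Remark 1, the Cholesky factorizations are assumed to succeed).

```python
import numpy as np

def thick_restart_step(K, M, Xn, Yn, Y_np1, Bn_last, Phi_k, Psi_k, Sigma_k, nb):
    """Sketch of Step 4 of Algorithm 2: build X-hat_{k+1}, Y-hat_{k+2} and the augmented blocks."""
    Xk = Xn @ Phi_k                           # X-hat_k = X_n Phi_k   (M-orthonormal)
    Yk = Yn @ Psi_k                           # Y-hat_k = Y_n Psi_k   (K-orthonormal)
    U = Phi_k[-nb:, :].T                      # U^T = E_n^T Phi_k
    Xp = K @ Y_np1 - Xk @ (U @ Bn_last)       # X'_{k+1} = K Y-hat_{k+1} - X-hat_k U B-hat_k
    R = np.linalg.cholesky(Xp.T @ M @ Xp).T   # X'^T M X' = W^T W
    Ak1, Xk1 = R, Xp @ np.linalg.inv(R)       # A-hat_{k+1}, X-hat_{k+1}

    Yp = M @ Xk1 - Y_np1 @ Ak1.T              # Y'_{k+2} = M X-hat_{k+1} - Y-hat_{k+1} A-hat_{k+1}^T
    R = np.linalg.cholesky(Yp.T @ K @ Yp).T   # Y'^T K Y' = W^T W
    Bk1, Yk2 = R.T, Yp @ np.linalg.inv(R)     # B-hat_{k+1}, Y-hat_{k+2}

    # Augmented quantities handed back to Step 2 for the next cycle.
    X_aug = np.hstack([Xk, Xk1])                                   # [X-hat_k, X-hat_{k+1}]
    B_aug = np.block([[Sigma_k, U @ Bn_last],
                      [np.zeros((nb, Sigma_k.shape[1])), Ak1]])    # augmented B-hat_{k+1}
    Y_aug = np.hstack([Yk, Y_np1, Yk2])                            # [Y-hat_k, Y-hat_{k+1}, Y-hat_{k+2}]
    return X_aug, Y_aug, B_aug
```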

**Remark 7.** *In* **Step 3***, we compute the harmonic Ritz pairs after $n$ iterations; in practice, we do the computation at each iteration $j = 1, \cdots, n$. When restarting, the information chosen to be added to the solving subspaces consists of the wanted $w$ singular values of $\mathcal{B}_n$ together with their corresponding left and right singular vectors. In practice, we use the MATLAB command "sort" to choose the $w$ smallest or the $w$ largest ones; which singular values to choose depends on the desired eigenvalues of* **H***.*

At the end of this section, we list the computational costs in a generic cycle of four algorithms: the weighted block Golub-Kahan-Lanczos algorithm, the thick-restart weighted block Golub-Kahan-Lanczos algorithm, the block Lanczos algorithm [12], and the thick-restart block Lanczos algorithm [12], denoted by **wbGKL**, **wbGKL-TR**, **BLan**, and **BLan-TR**, respectively. Detailed pseudocodes of **BLan** and **BLan-TR** can be found in [12].

The comparisons are presented in Tables 1 and 2. Here, a "block vector" denotes an $N\times n_b$ rectangular matrix; "mvb" denotes the number of products of an $N\times N$ matrix with a block vector; "dpb" denotes the number of dot products of two block vectors $X$ and $Y$, i.e., $X^TY$; "saxpyb" denotes the number of additions of two block vectors or of products of a block vector with an $n_b\times n_b$ small matrix; "Ep($2n\times 2n$) (with sorting)" denotes the number of $2n\times 2n$ eigenvalue problems, with sorting of the eigenvalues and their corresponding eigenvectors, per cycle; similarly, "Sp($n\times n$)" denotes the number of $n\times n$ singular value decompositions per cycle. Because **wbGKL** and **BLan** are non-restarted algorithms, we just count the first $n$ Lanczos iterations.


**Table 1.** Main computational costs per cycle **wbGKL** and **wbGKL-TR.**

**Table 2.** Main computational costs per cycle **BLan** and **BLan-TR.**


#### **5. Numerical Examples**

In this section, two numerical experiments are carried out using MATLAB 8.4 (R2014b) on a laptop with an Intel Core i5-6200U CPU at 2.3 GHz and 8 GB of memory under the Windows 10 operating system.

**Example 1.** *In this example, we check the bounds established in Theorems 1 and 2. For simplicity, we take $N = 100$, the number of weighted block Golub-Kahan-Lanczos steps $n = 20$, and $K = M$ as the diagonal matrix $\operatorname{diag}(\lambda_1, \lambda_2, \cdots, \lambda_N)$, where*

$$\begin{aligned} \lambda_1 &= 11 + \rho, \ \lambda_2 = 11, \ \lambda_3 = 11 - \rho, \\ \lambda_{N-2} &= 1 + \rho, \ \lambda_{N-1} = 1, \ \lambda_N = 1 - \rho, \\ \lambda_j &= 5 + \frac{5(N-j+1)}{N-3}, \ j = 4, \dots, N-3, \end{aligned}$$

*and $i = k = 1$, $\ell = 3$, $\hat i = \hat k = N-2$, $\hat\ell = N$, $n_b = 3$. There are three positive eigenvalue clusters: $\{\lambda_1, \lambda_2, \lambda_3\}$, $\{\lambda_4, \cdots, \lambda_{N-3}\}$, and $\{\lambda_{N-2}, \lambda_{N-1}, \lambda_N\}$. Obviously, $\Xi = \Gamma = K^{-\frac{1}{2}}$.*
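
This diagonal test matrix is straightforward to reproduce; a minimal sketch (Python/NumPy, with $\rho$ as a free parameter, here set to an arbitrary value) following the formulas above is:

```python
import numpy as np

def example1_matrix(N=100, rho=0.1):
    """K = M = diag(lambda_1, ..., lambda_N) with tight clusters at both ends of the spectrum."""
    lam = np.empty(N)
    lam[0], lam[1], lam[2] = 11 + rho, 11.0, 11 - rho
    lam[-3], lam[-2], lam[-1] = 1 + rho, 1.0, 1 - rho
    j = np.arange(4, N - 2)                       # j = 4, ..., N-3 (1-based indices)
    lam[3:N-3] = 5 + 5 * (N - j + 1) / (N - 3)
    return np.diag(lam)

K = M = example1_matrix()
```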

We seek two groups of approximate eigenpairs: the first is related to the first cluster and the second to the last cluster, i.e., $\{\sigma_1, \sigma_2, \sigma_3\}$ approximate $\{\lambda_1, \lambda_2, \lambda_3\}$, and $\{\sigma_{nn_b-2}, \sigma_{nn_b-1}, \sigma_{nn_b}\}$ approximate $\{\lambda_{N-2}, \lambda_{N-1}, \lambda_N\}$. In order to see the effect of $\rho$ on the upper bounds of the approximate eigenpair errors in the weighted block Golub-Kahan-Lanczos method for LREP, we vary the parameter $\rho > 0$, which controls the tightness of the eigenvalues within $\{\lambda_1, \lambda_2, \lambda_3\}$ and $\{\lambda_{N-2}, \lambda_{N-1}, \lambda_N\}$. First, we choose the same matrix $Y_0$ as in [12,17], i.e.,

$$\mathbf{Y}\_0 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \frac{1}{N} & \sin 1 & \cos 1 \\ \vdots & \vdots & \vdots \\ \frac{N-n\_b}{N} & \sin (N-n\_b) & \cos (N-n\_b) \end{bmatrix}$$


Obviously, $\mathrm{rank}(Y_0) = n_b$ and $\mathrm{rank}(Y_0^TK\Xi_{(:,1:3)}) = n_b$. Since $K$ is symmetric positive definite, we do the Cholesky decomposition $Y_0^TKY_0 = W^TW$ and let $Y_1 = Y_0W^{-1}$; hence $Y_1$ satisfies (5), i.e., $Y_1^TK\Xi_{(:,1:3)}$ is nonsingular. We take $Z = Y_1(\Xi_{(:,1:3)}^TKY_1)^{-1}$; then $Z$ satisfies (6). We execute the weighted block Golub-Kahan-Lanczos method with full re-orthogonalization for LREP in MATLAB and check the bounds in (7), (8), (13), and (14). Since the approximate eigenvalues are $\{\sigma_1, \sigma_2, \sigma_3\}$ and $\{\sigma_{nn_b-2}, \sigma_{nn_b-1}, \sigma_{nn_b}\}$, we have $\pi_{i,k,\ell} = \pi_{i,k} = \hat\pi_{\hat i,\hat k,\hat\ell} = \hat\pi_{\hat i,\hat\ell} = 1$ and $c = \hat c = 1$.
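
A minimal sketch of this construction (Python/NumPy; `Xi2` denotes $\Xi_{(:,1:3)}$, which in this example is available explicitly since $\Xi = K^{-1/2}$) is:

```python
import numpy as np

def build_Y1_and_Z(K, Y0, Xi2):
    """K-orthonormalize Y0 and form Z = Y1 (Xi2^T K Y1)^{-1}, so that Xi2 Xi2^T K Z = Xi2."""
    W = np.linalg.cholesky(Y0.T @ K @ Y0).T       # Y0^T K Y0 = W^T W, W upper triangular
    Y1 = Y0 @ np.linalg.inv(W)                    # Y1^T K Y1 = I_{nb}
    Z = Y1 @ np.linalg.inv(Xi2.T @ K @ Y1)        # satisfies (6) with k = i, l = i + nb - 1
    return Y1, Z

# Assembly of Y0 as in the display above (N rows, nb = 3 columns).
N, nb = 100, 3
j = np.arange(1, N - nb + 1)
Y0 = np.vstack([np.eye(nb), np.column_stack([j / N, np.sin(j), np.cos(j)])])
```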

We measure the following two groups of errors:

$$\begin{aligned} \varepsilon_{11} &= \|\operatorname{diag}(\lambda_1^2 - \sigma_1^2, \lambda_2^2 - \sigma_2^2, \lambda_3^2 - \sigma_3^2)\|_F, \\ \varepsilon_{21} &= \frac{\lambda_1^2 - \lambda_N^2}{\mathcal{C}_{n-1}^2(1 + 2\gamma_{1,3})}\|\tan^2\Theta_K(\Xi_{(:,1:3)}, Z)\|_F, \\ \varepsilon_{31} &= \|\sin\Theta_K(\Xi_{(:,1:3)}, \mathcal{Y}_n\Psi_{(:,1:3)})\|_F, \\ \varepsilon_{41} &= \frac{\sqrt{1 + \|A_n^TB_n\|_2^2/\delta^2}}{\mathcal{C}_{n-1}(1 + 2\gamma_{1,3})}\|\tan\Theta_K(\Xi_{(:,1:3)}, Z)\|_F, \end{aligned}$$

and

$$\begin{aligned} \varepsilon_{12} &= \|\operatorname{diag}(\sigma_{nn_b-2}^2 - \lambda_{N-2}^2, \sigma_{nn_b-1}^2 - \lambda_{N-1}^2, \sigma_{nn_b}^2 - \lambda_N^2)\|_F, \\ \varepsilon_{22} &= \frac{\lambda_1^2 - \lambda_N^2}{\mathcal{C}_{n-1}^2(1 + 2\hat\gamma_{N-2,N})}\|\tan^2\Theta_K(\Xi_{(:,N-2:N)}, Z)\|_F, \\ \varepsilon_{32} &= \|\sin\Theta_K(\Xi_{(:,N-2:N)}, \mathcal{Y}_n\Psi_{(:,nn_b-2:nn_b)})\|_F, \\ \varepsilon_{42} &= \frac{\sqrt{1 + \|A_n^TB_n\|_2^2/\hat\delta^2}}{\mathcal{C}_{n-1}(1 + 2\hat\gamma_{N-2,N})}\|\tan\Theta_K(\Xi_{(:,N-2:N)}, Z)\|_F. \end{aligned}$$

Here, $\varepsilon_{21}$ and $\varepsilon_{41}$ are upper bounds of $\varepsilon_{11}$ and $\varepsilon_{31}$, and $\varepsilon_{22}$ and $\varepsilon_{42}$ are upper bounds of $\varepsilon_{12}$ and $\varepsilon_{32}$. Tables 3 and 4 report the results for $\varepsilon_{ij}$, $i = 1, \cdots, 4$, $j = 1, 2$, as the parameter $\rho$ goes to 0. From the two tables, we can see that the bounds for eigenvalues lying in a cluster and for their corresponding eigenvectors are sharp, and that they are not sensitive to $\rho$ as $\rho$ goes to 0.

**Table 3.** *ε*11, *ε*31 together with their upper bounds *ε*21, *ε*41 of Example 1.


**Table 4.** *ε*12, *ε*32 together with their upper bounds *ε*22, *ε*42 of Example 1.


**Example 2.** *In this example, we test the effectiveness of our weighted block Golub-Kahan-Lanczos algorithms. Four algorithms are tested, i.e.,* **wbGKL***,* **wbGKL-TR***,* **BLan***, and* **BLan-TR***. We choose 3 test problems used in [12,13], which are listed in Table 5. All the matrices K and M in these problems are symmetric positive definite. Specifically, Test 1 and Test 2, which are derived via the turboTDDFT command in QUANTUM ESPRESSO [22], come from linear response studies of Na2 and the silane (SiH4) compound, respectively. The matrices K and M in Test 3 are from the University of Florida Sparse Matrix Collection [23], where the order of K is N* = 9604*, and M is the leading N* × *N principal submatrix of finan512.*

**Table 5.** The matrices *K* and *M* in Test 1–3.


We aim to compute the 5 smallest positive eigenvalues and the 5 largest eigenvalues, i.e., $\lambda_i$ for $i = 1, \cdots, 5, N-4, \cdots, N$, together with their associated eigenvectors. The initial guess is chosen as $Y_0 = eye(N, n_b)$ with block size $n_b = 3$, where $eye$ is the MATLAB command. As in Example 1, since $K$ is symmetric positive definite, we do the Cholesky decomposition $Y_0^TKY_0 = W^TW$ and let $Y_1 = Y_0W^{-1}$; hence $Y_1$ satisfies $Y_1^TKY_1 = I_{n_b}$. In **wbGKL-TR** and **BLan-TR**, we select $n = 30$, $k = 20$, i.e., the restart occurs once the dimension of the solving subspace exceeds 90, and the information of 60 Ritz vectors is kept. For **wbGKL** and **BLan**, because there is no restart, we compute the approximate eigenpairs when the number of Lanczos iterations equals $30 + 10\times(j-1)$, $j = 1, 2, \cdots$; hence the Lanczos iterations are of the same amount as in **wbGKL-TR** and **BLan-TR**. The following relative eigenvalue error and relative residual 1-norm are calculated for each of the 10 approximate eigenpairs:

$$\begin{aligned} e(\sigma_j) &:= \begin{cases} \dfrac{|\lambda_j - \sigma_j|}{\lambda_j}, & j = 1, \cdots, 5, \\ \dfrac{|\lambda_{N-nn_b+j} - \sigma_j|}{\lambda_{N-nn_b+j}}, & j = nn_b-4, \cdots, nn_b, \end{cases} \\ r(\sigma_j) &:= \frac{\|\mathbf{H}\tilde z_j - \sigma_j\tilde z_j\|_1}{(\|\mathbf{H}\|_1 + \sigma_j)\|\tilde z_j\|_1}, \quad j = 1, \cdots, 5,\ nn_b-4, \cdots, nn_b, \end{aligned}$$

where the "exact" eigenvalues *λj* are calculated by the MATLAB code *eig*. The calculated approximate eigenpair (*<sup>σ</sup>j*, *z*˜*j*) is regarded as converged if *<sup>r</sup>*(*<sup>σ</sup>j*) ≤ *tol* = 10−8.

Tables 6 and 7 give the number of Lanczos iterations (denoted by *iter*) and the CPU time in seconds (denoted by *CPU*) for the four algorithms; Table 6 is for the 5 smallest positive eigenvalues and Table 7 for the 5 largest eigenvalues. From the tables, one can see that, whether for the smallest or the largest eigenvalues, the iteration numbers of the four algorithms are competitive, but **wbGKL** and **wbGKL-TR** cost significantly less time than **BLan** and **BLan-TR**; in particular, **wbGKL-TR** consumes the least time. Because **BLan** and **BLan-TR** need to compute the eigenvalues of $\begin{bmatrix} 0 & T_n \\ D_n & 0 \end{bmatrix}$, which is a nonsymmetric matrix, these two algorithms are slower than **wbGKL** and **wbGKL-TR**. Due to the savings in the orthogonalization procedure and the much smaller $\mathcal{B}_n$ to be solved, **wbGKL-TR** is the fastest algorithm.

**Table 6.** Compute 5 smallest positive eigenvalues for Test 1–3.



**Table 7.** Compute 5 largest eigenvalues for Test 1–3.

The accuracy of the last two approximate eigenpairs in Test 1 is shown in Figure 1. From the figure, we can see that, for the last two eigenpairs, **wbGKL** and **BLan** require almost the same number of iterations to obtain the same accuracy, and **wbGKL-TR** and **BLan-TR** also need almost the same number of iterations, which amounts to one or two more restarts than **wbGKL** and **BLan**. On the one hand, without solving a nonsymmetric eigenproblem, **wbGKL** and **wbGKL-TR** save much more time than **BLan** and **BLan-TR**. On the other hand, since the dimension of the solving subspace for **wbGKL-TR** is bounded by $nn_b$, the savings in the orthogonalization process and the much smaller singular value decomposition problem are sufficient to cover the additional restart steps.

**Figure 1.** Errors and residuals of the 2 smallest positive eigenvalues for Test 1 in Example 2.
