**Proof.**

*UB1.*

The matrix $A^{-m}$ is symmetric because $A$ is symmetric, and it holds that

$$|\mathbf{x}^T A^{-m} b| = |(\mathbf{x}, A^{-m} b)| = |(A^{-m}\mathbf{x}, b)| \le \|A^{-m}\mathbf{x}\| \cdot \|b\|$$

by the Cauchy–Schwarz inequality. Moreover, we have

$$\|A^{-m}\mathbf{x}\| = \sqrt{(A^{-m}\mathbf{x}, A^{-m}\mathbf{x})} = \sqrt{(\mathbf{x}, A^{-2m}\mathbf{x})}.\tag{3}$$

Using the Kantorovich inequality for the matrix $A^{2m}$ and considering that $\lambda_{\min}(A^{2m}) = \lambda_{\min}^{2m}$ and $\lambda_{\max}(A^{2m}) = \lambda_{\max}^{2m}$, we have

$$\begin{aligned}
&\frac{(\mathbf{x}^T\mathbf{x})^2}{(\mathbf{x}^T A^{2m}\mathbf{x})(\mathbf{x}^T (A^{2m})^{-1}\mathbf{x})} \ge \frac{4\lambda_{\min}(A^{2m})\,\lambda_{\max}(A^{2m})}{(\lambda_{\min}(A^{2m}) + \lambda_{\max}(A^{2m}))^2} \\
\Rightarrow\quad &\frac{\|\mathbf{x}\|^4}{(\mathbf{x}^T A^{2m}\mathbf{x})(\mathbf{x}^T A^{-2m}\mathbf{x})} \ge \frac{4\lambda_{\min}^{2m}\lambda_{\max}^{2m}}{(\lambda_{\min}^{2m} + \lambda_{\max}^{2m})^2} \\
\Rightarrow\quad &\mathbf{x}^T A^{-2m}\mathbf{x} \le \frac{\|\mathbf{x}\|^4}{\mathbf{x}^T A^{2m}\mathbf{x}}\cdot\frac{(\lambda_{\min}^{2m} + \lambda_{\max}^{2m})^2}{4\lambda_{\min}^{2m}\lambda_{\max}^{2m}} \\
\Rightarrow\quad &\mathbf{x}^T A^{-2m}\mathbf{x} \le \frac{\|\mathbf{x}\|^4}{4\|A^{m}\mathbf{x}\|^2}\left(\frac{\lambda_{\min}^{m}}{\lambda_{\max}^{m}} + \frac{\lambda_{\max}^{m}}{\lambda_{\min}^{m}}\right)^2 = \frac{\|\mathbf{x}\|^4}{4\|A^{m}\mathbf{x}\|^2}\left(\frac{1}{\kappa^m} + \kappa^m\right)^2,
\end{aligned}$$

where $\kappa = \lambda_{\max}/\lambda_{\min}$ is the condition number of $A$. Therefore, the norm $\|A^{-m}\mathbf{x}\|$ given by (3) can be bounded by

$$||A^{-m}\mathbf{x}|| \le \frac{||\mathbf{x}||^2}{2||A^m\mathbf{x}||} \left(\frac{1}{\kappa^m} + \kappa^m\right). \tag{4}$$

Hence, we have

$$|\mathbf{x}^T A^{-m} b| \le \|A^{-m}\mathbf{x}\| \cdot \|b\| \le \frac{\|\mathbf{x}\|^2}{2\|A^m\mathbf{x}\|}\left(\frac{1}{\kappa^m} + \kappa^m\right)\|b\|.$$

*UB2.*

Due to the Cauchy–Schwarz inequality, it holds that

$$|\mathbf{x}^T A^{-m} b| = |(\mathbf{x}, A^{-m} b)| \le \|\mathbf{x}\| \cdot \|A^{-m} b\|.$$

Following a similar approach as above based on the Kantorovich inequality, we obtain

$$||A^{-m}b|| \le \frac{||b||^2}{2||A^m b||} \left(\frac{1}{\kappa^m} + \kappa^m\right).$$

So,

$$|\mathbf{x}^T \mathbf{A}^{-m} b| \le \frac{||\mathbf{x}|| \cdot ||b||^2}{2||\mathbf{A}^m b||} \left(\frac{1}{\kappa^m} + \kappa^m\right).$$

*UB3.*

It holds that

$$\begin{aligned} |\mathbf{x}^T A^{-m} b| &= |(A^{-\frac{m}{2}}\mathbf{x},\, A^{-\frac{m}{2}} b)| \le \|A^{-\frac{m}{2}}\mathbf{x}\| \cdot \|A^{-\frac{m}{2}} b\| \\ &= \sqrt{(A^{-\frac{m}{2}}\mathbf{x}, A^{-\frac{m}{2}}\mathbf{x})} \cdot \sqrt{(A^{-\frac{m}{2}} b, A^{-\frac{m}{2}} b)} = \sqrt{(\mathbf{x}, A^{-m}\mathbf{x})} \cdot \sqrt{(b, A^{-m} b)}. \end{aligned}$$

Applying the Kantorovich inequality to the matrix $A^m$ in a similar way as above, we immediately obtain the inequality

$$\mathbf{x}^T A^{-m}\mathbf{x} \le \frac{\|\mathbf{x}\|^4}{4\,\mathbf{x}^T A^m\mathbf{x}}\left(\frac{1}{\kappa^{\frac{m}{2}}} + \kappa^{\frac{m}{2}}\right)^2.$$

So, we have

$$\begin{aligned} |\mathbf{x}^T A^{-m} b| &\le \sqrt{\frac{\|\mathbf{x}\|^4}{4\,\mathbf{x}^T A^m\mathbf{x}}\left(\frac{1}{\kappa^{\frac{m}{2}}} + \kappa^{\frac{m}{2}}\right)^2 \frac{\|b\|^4}{4\,b^T A^m b}\left(\frac{1}{\kappa^{\frac{m}{2}}} + \kappa^{\frac{m}{2}}\right)^2} \\ &= \frac{\|\mathbf{x}\|^2\|b\|^2}{4\sqrt{\mathbf{x}^T A^m\mathbf{x}}\cdot\sqrt{b^T A^m b}}\left(\frac{1}{\kappa^{\frac{m}{2}}} + \kappa^{\frac{m}{2}}\right)^2. \end{aligned}$$

*UB4.*

Applying the Cauchy–Schwarz inequality, we obtain

$$|\mathbf{x}^T A^{-m} b| = |(\mathbf{x}, A^{-m} b)| \le \|\mathbf{x}\| \cdot \|A^{-m} b\| \le \|\mathbf{x}\|\,\frac{\|b\|}{\lambda_{\min}(A^m)} = \|\mathbf{x}\|\,\frac{\|b\|}{\lambda_{\min}^m}.$$


*UB5.*

Since $A$ is positive definite, so is $A^q$ for any integer $q$; hence, the angle between the vectors $v$ and $A^q v$ does not exceed $\pi/2$ for any $v$, i.e., $\angle(v; A^q v) \le \frac{\pi}{2}$.

Taking $v = A^{-m}\mathbf{x}$ and $q = p + m$, we obtain

$$\angle(A^{-m}\mathbf{x};\, A^{p+m}A^{-m}\mathbf{x}) \le \frac{\pi}{2} \quad\Rightarrow\quad \angle(A^{-m}\mathbf{x};\, A^{p}\mathbf{x}) \le \frac{\pi}{2}.$$

The assumption $\frac{(\mathbf{x},\, A^p\mathbf{x})}{(A^m\mathbf{x},\, A^p\mathbf{x})} < \alpha$ implies that

$$\begin{aligned} (\mathbf{x}, A^p\mathbf{x}) - \alpha(A^m\mathbf{x}, A^p\mathbf{x}) < 0 &\Rightarrow (\mathbf{x} - \alpha A^m\mathbf{x},\, A^p\mathbf{x}) < 0 \\ &\Rightarrow (-b,\, A^p\mathbf{x}) < 0 \Rightarrow \angle(A^p\mathbf{x};\, -b) \in \left(\frac{\pi}{2}, \pi\right]. \end{aligned}$$

Hence, we obtain

$$\angle(A^{-m}\mathbf{x}; -b) \ge \underbrace{\angle(A^p\mathbf{x}; -b)}_{\in\left(\frac{\pi}{2},\,\pi\right]} - \underbrace{\angle(A^{-m}\mathbf{x}; A^p\mathbf{x})}_{\in\left[0,\,\frac{\pi}{2}\right]} \ge \angle(A^p\mathbf{x}; -b) - \frac{\pi}{2} > 0.$$

At the same time, the assumption $\alpha\|\mathbf{x}\|^2 \le (\mathbf{x}, A^{-m}\mathbf{x})$ implies

$$(\mathbf{x}, \alpha\mathbf{x}) \le (\mathbf{x}, A^{-m}\mathbf{x}) \Rightarrow (A^{-m}\mathbf{x}, \alpha A^m\mathbf{x}) \le (A^{-m}\mathbf{x}, \mathbf{x}) \Rightarrow \Big(A^{-m}\mathbf{x},\, \underbrace{\mathbf{x} - \alpha A^m\mathbf{x}}_{-b}\Big) \ge 0,$$

so $\angle(A^{-m}\mathbf{x}; -b) \le \frac{\pi}{2}$. To summarize,

$$\frac{\pi}{2} \ge \angle(A^{-m}\mathbf{x}; -b) \ge \underbrace{\angle(A^p\mathbf{x}; -b)}_{\in\left(\frac{\pi}{2},\,\pi\right]} - \frac{\pi}{2} > 0.$$

Consequently,

$$0 \le \cos\angle(A^{-m}\mathbf{x}; -b) \le \cos\left(\angle(A^p\mathbf{x}; -b) - \frac{\pi}{2}\right) = \sin\angle(A^p\mathbf{x}; -b).$$

So, we have

$$|(A^{-m}\mathbf{x}, -b)| = \|A^{-m}\mathbf{x}\|\cdot\|-b\|\cdot|\cos\angle(A^{-m}\mathbf{x}; -b)| \le \|A^{-m}\mathbf{x}\|\cdot\|b\|\cdot|\sin\angle(A^p\mathbf{x}; -b)|.\tag{5}$$

The norm $\|A^{-m}\mathbf{x}\|$ can be bounded using the Kantorovich inequality, as shown in Relation (4). Regarding the factor $|\sin\angle(A^p\mathbf{x}; -b)|$, we have

$$\begin{split} |\sin\angle(A^p\mathbf{x}; -b)| &= \sqrt{1 - \cos^2\angle(A^p\mathbf{x}; -b)} = \sqrt{1 - \frac{(A^p\mathbf{x}, -b)^2}{\|A^p\mathbf{x}\|^2\|b\|^2}} \\ &= \sqrt{1 - \frac{(A^p\mathbf{x}, b)^2}{\|A^p\mathbf{x}\|^2\|b\|^2}} = \frac{\sqrt{\|A^p\mathbf{x}\|^2\|b\|^2 - (A^p\mathbf{x}, b)^2}}{\|A^p\mathbf{x}\|\cdot\|b\|}. \end{split}$$

Therefore, the relation (5) can be reformulated as

$$|(A^{-m}x,b)| \leq \frac{||x||^2}{2||A^mx|| \cdot ||A^px||} \left(\frac{1}{\kappa^m} + \kappa^m\right) \sqrt{||A^px||^2 ||b||^2 - (A^px, b)^2}.$$
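As a quick numerical sanity check of UB1–UB4 (UB5 additionally requires the assumptions on $\alpha$ and $p$), one can compare the exact absolute error with the four bounds on a small random example. The following MATLAB sketch does this; the matrix, the vector, and the choice of $\alpha$ are ours for illustration only.

```matlab
% Sanity check of the upper bounds UB1-UB4 on a random SPD example.
% Illustrative sketch; the data and the choice of alpha are ours.
n = 200; m = 2;
S = randn(n); A = S'*S + n*eye(n);         % symmetric positive definite
x = randn(n, 1);
alpha = (x'*(A^m*x)) / (x'*(A^(2*m)*x));   % a projection choice (k = m), cf. (6)
b = alpha*(A^m*x) - x;                     % so that x = alpha*A^m*x - b
err   = abs(x'*(A^(-m)*b));                % exact error |x' * A^(-m) * b|
kappa = cond(A);                           % = lambda_max/lambda_min for SPD A
UB1 = norm(x)^2/(2*norm(A^m*x)) * (kappa^(-m) + kappa^m) * norm(b);
UB2 = norm(x)*norm(b)^2/(2*norm(A^m*b)) * (kappa^(-m) + kappa^m);
UB3 = norm(x)^2*norm(b)^2/(4*sqrt(x'*(A^m*x))*sqrt(b'*(A^m*b))) ...
      * (kappa^(-m/2) + kappa^(m/2))^2;
UB4 = norm(x)*norm(b)/min(eig(A))^m;
fprintf('error %.3e <= UB1 %.3e, UB2 %.3e, UB3 %.3e, UB4 %.3e\n', ...
        err, UB1, UB2, UB3, UB4);
```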

#### **3. Estimate of $x^T A^{-m} x$ by the Projection Method**

Our goal is to find a number $\alpha$ such that $x^T A^{-m}x \approx \alpha\|x\|^2$ (cf. (1)). To that end, let us take a fixed $k \in \mathbb{N}_0 = \mathbb{N}\cup\{0\}$ and consider the following decomposition of $\mathbf{x}$,

$$\mathbf{x} = \alpha A^m\mathbf{x} - b,$$

where $b \perp A^k\mathbf{x}$. (That is, $\alpha A^m\mathbf{x}$ is a projection of $\mathbf{x}$ onto $A^m\mathbf{x}$ along the orthogonal complement of $A^k\mathbf{x}$.) Then, we have

$$(\mathbf{x}, A^k\mathbf{x}) = (\alpha A^m\mathbf{x}, A^k\mathbf{x}) - (b, A^k\mathbf{x}).$$

Using the assumption $b \perp A^k\mathbf{x}$, we obtain

$$(\mathbf{x}, A^k\mathbf{x}) = \alpha(A^m\mathbf{x}, A^k\mathbf{x}),$$

and so

$$\alpha = \frac{(\mathbf{x}, A^{k}\mathbf{x})}{(\mathbf{x}, A^{m+k}\mathbf{x})}.$$

Hence, we obtain a family of estimates for $x^T A^{-m}x$ as follows:

$$(\mathbf{x}, A^{-m}\mathbf{x}) \approx \frac{(\mathbf{x}, A^k\mathbf{x})}{(\mathbf{x}, A^{m+k}\mathbf{x})} ||\mathbf{x}||^2 \qquad (k \in \mathbb{N}\_0). \tag{6}$$


We denote these estimates by $est_{proj}(k)$, $k \in \mathbb{N}_0$. The computational implementation requires $\left\lceil\frac{m+k}{2}\right\rceil$ matrix-vector products (mvps).
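A minimal MATLAB realization of (6) might read as follows; the function name and the index bookkeeping are ours. The count of $\lceil\frac{m+k}{2}\rceil$ mvps follows from the symmetry of $A$: $(\mathbf{x}, A^j\mathbf{x}) = (A^{\lceil j/2\rceil}\mathbf{x}, A^{\lfloor j/2\rfloor}\mathbf{x})$.

```matlab
function est = est_proj(A, x, m, k)
% Projection estimate (6) of x'*A^(-m)*x for a symmetric matrix A.
% Only ceil((m+k)/2) matrix-vector products are formed.
jmax = ceil((m + k)/2);
V = zeros(length(x), jmax + 1);
V(:,1) = x;
for j = 1:jmax
    V(:,j+1) = A * V(:,j);                                % V(:,j+1) = A^j * x
end
num = V(:, ceil(k/2)+1)'     * V(:, floor(k/2)+1);        % (x, A^k x)
den = V(:, ceil((m+k)/2)+1)' * V(:, floor((m+k)/2)+1);    % (x, A^(m+k) x)
est = num/den * norm(x)^2;
end
```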

Let us now explore the error corresponding to the above choice of *α*. We have

$$(\mathbf{x}, A^{-m}\mathbf{x}) = (\alpha A^m\mathbf{x}, A^{-m}\mathbf{x}) - (b, A^{-m}\mathbf{x});$$

therefore, 
$$(\mathbf{x}, A^{-m}\mathbf{x}) = \alpha\|\mathbf{x}\|^2 - (\mathbf{x}, A^{-m}b).$$

Since $\alpha\|\mathbf{x}\|^2$ is the estimate (see (1)), the error term is $(\mathbf{x}, A^{-m}b)$. Bounds on its absolute value can be found using Proposition 1 with

$$b = \alpha A^m\mathbf{x} - \mathbf{x} = \frac{(\mathbf{x}, A^k\mathbf{x})}{(\mathbf{x}, A^{m+k}\mathbf{x})}\, A^m\mathbf{x} - \mathbf{x}.$$

**Remark 1.** *Let us comment on the choice of the parameter k.*


*In general, for any choice of k, the error of the estimate can be assessed using Proposition 1.*

#### **4. Estimate of $x^T A^{-m} x$ Using the Minimization Method**

The estimates that we present in this section stem from the upper bounds UB2 and UB3 for the absolute error $|(\mathbf{x}, A^{-m}b)|$, which are derived in Proposition 1. Our goal is to reduce the absolute error by finding the value $\alpha$ that minimizes these bounds.

Plugging $b = \alpha A^m\mathbf{x} - \mathbf{x}$ into the explicit formulas for UB2 and UB3, we can easily check that the two upper bounds in question attain their minimal values if and only if $\alpha$ minimizes the function

$$f(\alpha) = \frac{\alpha^2\|A^m\mathbf{x}\|^2 - 2\alpha(\mathbf{x}, A^m\mathbf{x}) + \|\mathbf{x}\|^2}{\sqrt{\alpha^2(\mathbf{x}, A^{3m+k}\mathbf{x}) - 2\alpha(\mathbf{x}, A^{2m+k}\mathbf{x}) + (\mathbf{x}, A^{m+k}\mathbf{x})}},$$

where $k = m$ corresponds to UB2 and $k = 0$ corresponds to UB3. By differentiating this expression with respect to $\alpha$, we find that the upper bounds UB2 and UB3 are minimized at $\hat{\alpha}$, the root of the equation

$$\begin{aligned}
&\|A^m\mathbf{x}\|^2(\mathbf{x}, A^{3m+k}\mathbf{x})\,\alpha^3 - 3\|A^m\mathbf{x}\|^2(\mathbf{x}, A^{2m+k}\mathbf{x})\,\alpha^2 \\
&\quad + \left(2\|A^m\mathbf{x}\|^2(\mathbf{x}, A^{m+k}\mathbf{x}) + 2(\mathbf{x}, A^m\mathbf{x})(\mathbf{x}, A^{2m+k}\mathbf{x}) - \|\mathbf{x}\|^2(\mathbf{x}, A^{3m+k}\mathbf{x})\right)\alpha \\
&\quad + \|\mathbf{x}\|^2(\mathbf{x}, A^{2m+k}\mathbf{x}) - 2(\mathbf{x}, A^m\mathbf{x})(\mathbf{x}, A^{m+k}\mathbf{x}) = 0,
\end{aligned}$$

where, as before, the values $k = m$ and $k = 0$ correspond to UB2 and UB3, respectively. With this value $\hat{\alpha}$, we obtain the estimate of $x^T A^{-m}x$ as

$$est_{min} = \hat{\alpha}\,\|\mathbf{x}\|^2.$$

For the sake of brevity, we adopt the notation $est_{min1}$ for $k = 0$ and $est_{min2}$ for $k = m$. The computational implementation requires $\left\lceil\frac{3m+k}{2}\right\rceil$ mvps.
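In MATLAB, the stationarity condition above can be solved with `roots`, after which one keeps the real root at which $f$ is smallest. The sketch below forms all powers $A^j\mathbf{x}$ for clarity; the pairing trick used in `est_proj` would reduce the work to the stated $\lceil\frac{3m+k}{2}\rceil$ mvps. The naming is ours.

```matlab
function est = est_min(A, x, m, k)
% Minimization estimate of x'*A^(-m)*x: k = 0 gives est_min1 (UB3),
% k = m gives est_min2 (UB2). Illustrative sketch.
jmax = 3*m + k;
V = zeros(length(x), jmax + 1);
V(:,1) = x;
for j = 1:jmax, V(:,j+1) = A * V(:,j); end
s = @(j) V(:,1)' * V(:,j+1);              % s(j) = (x, A^j x)
P = s(2*m);  Q = s(m);  R = norm(x)^2;    % ||A^m x||^2, (x, A^m x), ||x||^2
c = [P*s(3*m+k), -3*P*s(2*m+k), ...
     2*P*s(m+k) + 2*Q*s(2*m+k) - R*s(3*m+k), ...
     R*s(2*m+k) - 2*Q*s(m+k)];            % cubic from the stationarity condition
r = roots(c);
r = real(r(abs(imag(r)) < 1e-12));        % keep the real roots
f = @(a) (a^2*P - 2*a*Q + R) / sqrt(a^2*s(3*m+k) - 2*a*s(2*m+k) + s(m+k));
[~, i] = min(arrayfun(f, r));             % root minimizing the bound
est = r(i) * R;
end
```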

#### **5. The Heuristic Approach**

Let us consider the quantity

$$R\_m(\mathbf{x}) = \frac{||\mathbf{x}||^2 ||A^m \mathbf{x}||^2}{(\mathbf{x}, A^m \mathbf{x})^2}. \tag{7}$$

We refer to $R_m(x)$ as the generalized index of proximity.

**Lemma 1.** *Assume that $A \in \mathbb{R}^{n\times n}$ is a symmetric matrix. For any nonzero vector $x \in \mathbb{R}^n$, the value $R_m(x)$ satisfies $R_m(x) \ge 1$. The equality $R_m(x) = 1$ holds true if and only if $x$ is an eigenvector of $A$.*

**Proof.** By the Cauchy–Schwarz inequality, we have $(\mathbf{x}, A^m\mathbf{x})^2 \le \|\mathbf{x}\|^2\|A^m\mathbf{x}\|^2$; hence, $R_m(\mathbf{x}) \ge 1$. The equality $R_m(\mathbf{x}) = 1$ is equivalent to equality in the Cauchy–Schwarz inequality, which occurs if and only if the vector $A^m\mathbf{x}$ is a scalar multiple of the vector $\mathbf{x}$, in other words, when $A^m\mathbf{x} = \alpha\mathbf{x}$ for a certain $\alpha \in \mathbb{R}$. Given the assumption that $A$ is symmetric, this is further equivalent to $A\mathbf{x} = \lambda\mathbf{x}$ (with $\lambda$ satisfying $\lambda^m = \alpha$).

As a result of Lemma 1, the equality

$$R_m(A^{-m/2}\mathbf{x})^{n_1}\, R_m(A^{m/2}\mathbf{x})^{n_2} = R_m(\mathbf{x})^{n_1+n_2},$$

where $n_1, n_2 \in \mathbb{Z}$, holds identically for any eigenvector of $A$ (i.e., for any vector satisfying $R_m(\mathbf{x}) = 1$), and becomes approximately true for vectors $\mathbf{x}$ with the property $R_m(\mathbf{x}) \approx 1$. Therefore, if $R_m(\mathbf{x}) \approx 1$, we have

$$\begin{aligned}
&\frac{\|A^{-m/2}\mathbf{x}\|^{2n_1}\,\|A^m A^{-m/2}\mathbf{x}\|^{2n_1}}{(A^{-m/2}\mathbf{x},\, A^m A^{-m/2}\mathbf{x})^{2n_1}} \cdot \frac{\|A^{m/2}\mathbf{x}\|^{2n_2}\,\|A^m A^{m/2}\mathbf{x}\|^{2n_2}}{(A^{m/2}\mathbf{x},\, A^m A^{m/2}\mathbf{x})^{2n_2}} \approx \frac{\|\mathbf{x}\|^{2(n_1+n_2)}\,\|A^m\mathbf{x}\|^{2(n_1+n_2)}}{(\mathbf{x}, A^m\mathbf{x})^{2(n_1+n_2)}} \\[4pt]
\Rightarrow\quad &\frac{(\mathbf{x}, A^{-m}\mathbf{x})^{n_1}\,\|A^{m/2}\mathbf{x}\|^{2n_1}}{\|\mathbf{x}\|^{4n_1}} \cdot \frac{\|A^{m/2}\mathbf{x}\|^{2n_2}\,\|A^{3m/2}\mathbf{x}\|^{2n_2}}{\|A^m\mathbf{x}\|^{4n_2}} \approx \frac{\|\mathbf{x}\|^{2(n_1+n_2)}\,\|A^m\mathbf{x}\|^{2(n_1+n_2)}}{(\mathbf{x}, A^m\mathbf{x})^{2(n_1+n_2)}} \\[4pt]
\Rightarrow\quad &(\mathbf{x}, A^{-m}\mathbf{x})^{n_1} \approx \frac{\|\mathbf{x}\|^{6n_1+2n_2}\,\|A^m\mathbf{x}\|^{2n_1+6n_2}}{(\mathbf{x}, A^m\mathbf{x})^{3(n_1+n_2)}\,(\mathbf{x}, A^{3m}\mathbf{x})^{n_2}} \\[4pt]
\Rightarrow\quad &(\mathbf{x}, A^{-m}\mathbf{x}) \approx \sqrt[n_1]{\frac{\|\mathbf{x}\|^{6n_1+2n_2}\,\|A^m\mathbf{x}\|^{2n_1+6n_2}}{(\mathbf{x}, A^m\mathbf{x})^{3(n_1+n_2)}\,(\mathbf{x}, A^{3m}\mathbf{x})^{n_2}}}.
\end{aligned}$$

We refer to this estimate as $est_h$. If, in particular, $n_1 = 1$ and $n_2 = 0$, we denote the estimate by $est_{h1}$, and if $n_1 = n_2 = 1$, the corresponding estimate is denoted by $est_{h2}$. The computational implementation requires $\left\lceil\frac{3m}{2}\right\rceil$ mvps.
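A compact MATLAB version of the heuristic estimates (our naming) is given below; for $n_2 = 0$ only the first $m$ products are actually needed, while $n_2 \ne 0$ requires $(\mathbf{x}, A^{3m}\mathbf{x})$ as well.

```matlab
function est = est_heur(A, x, m, n1, n2)
% Heuristic estimate est_h of x'*A^(-m)*x; n1 = 1, n2 = 0 gives est_h1,
% n1 = n2 = 1 gives est_h2. Illustrative sketch.
v = x;  s = zeros(3*m, 1);
for j = 1:3*m
    v = A * v;                    % v = A^j * x
    s(j) = x' * v;                % s(j) = (x, A^j x)
end
est = ( norm(x)^(6*n1 + 2*n2) * s(2*m)^(n1 + 3*n2) ...   % s(2m) = ||A^m x||^2
      / ( s(m)^(3*(n1 + n2)) * s(3*m)^n2 ) )^(1/n1);
end
```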

#### **6. A Comparison with Other Methods**

In this section, we briefly describe two methods that were proposed in the literature for estimating quadratic forms of the type $x^T f(A)x$, where $A \in \mathbb{R}^{n\times n}$, $x \in \mathbb{R}^n$, and $f$ is a smooth function defined on the spectrum of $A$. The first method is an extrapolation procedure developed in [8], and the second is based on Gaussian quadrature ([5] (Chapter 7) and [6]).

#### *6.1. The Extrapolation Method*

We adjust the family of estimates for $x^T f(A)x$ given in [8] (Proposition 2) by setting $f(t) = t^{-m}$, $m \in \mathbb{N}$. Hence, we directly obtain the estimating formula given in the following lemma.

**Lemma 2.** *Let $A \in \mathbb{R}^{n\times n}$ be a symmetric matrix. An extrapolation estimate for the quadratic form $x^T A^{-m}x$, $m \in \mathbb{N}$, is given by*

$$\mathfrak{e}\_{\nu} = \rho^{-m\nu} \frac{\|\mathbf{x}\|^{2(m+1)}}{(\mathbf{x}, A\mathbf{x})^{m}}, \quad \rho = \frac{\|\mathbf{x}\|^{2} \|A\mathbf{x}\|^{2}}{(\mathbf{x}, A\mathbf{x})^{2}}, \quad \nu \in \mathbb{R}. \tag{8}$$

We refer to this estimate as $est_{extrap}(\nu)$. The computational implementation requires just one mvp.
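For reference, a one-mvp MATLAB sketch of (8) follows; the function name is ours.

```matlab
function est = est_extrap(A, x, m, nu)
% Extrapolation estimate (8) of x'*A^(-m)*x with a single matrix-vector product.
Ax  = A * x;
rho = (norm(x)^2 * norm(Ax)^2) / (x' * Ax)^2;
est = rho^(-m*nu) * norm(x)^(2*(m+1)) / (x' * Ax)^m;
end
```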

**Remark 2.** *In the special case of $m = 1$, some of the proposed estimates coincide with the corresponding extrapolation estimates for specific choices of the family parameter $\nu$. We have*


Notably, the extrapolation procedure provides estimates of the quadratic form $x^T A^{-m}x$, not bounds. The choice of the family parameter $\nu$ is arbitrary, and no bounds for the absolute error of the estimates are provided.

#### *6.2. Gaussian Techniques*

We consider the spectral factorization of $A$, which allows us to express the matrix as $A = \sum_{k=1}^{n}\lambda_k v_k v_k^T$, where $\lambda_k \in \mathbb{R}$ are the eigenvalues of $A$ with corresponding eigenvectors $v_k$. Therefore, the quadratic form $x^T A^{-m}x$ can be written as

$$\mathbf{x}^T A^{-m} \mathbf{x} = \sum\_{k=1}^n \lambda\_k^{-m} (\mathbf{x}, \upsilon\_k)^2. \tag{9}$$

The summation (9) can be viewed as a Riemann–Stieltjes integral of the form

$$\int_{\lambda_{\min}}^{\lambda_{\max}} \lambda^{-m}\, d\mu(\lambda),$$

where the measure *μ*(*λ*) is a piecewise constant function defined by

$$\mu(\lambda) = \begin{cases} 0, & \text{if } \lambda < \lambda_{\min}, \\ \sum_{i=1}^{j} (\mathbf{x}, v_i)^2, & \text{if } \lambda_j \le \lambda < \lambda_{j+1}, \\ \sum_{i=1}^{n} (\mathbf{x}, v_i)^2, & \text{if } \lambda_{\max} \le \lambda. \end{cases}$$

This Riemann–Stieltjes integral can be approximated using Gauss quadrature rules [5,6]. Hence, it is necessary to produce a sequence of orthogonal polynomials, which can be achieved by the Lanczos algorithm. The operation count for this procedure is dominated by the application of the Lanczos algorithm, which requires $k$ matrix-vector products (a cost of order $kn^2$ operations for a dense matrix), where $k$ is the number of Lanczos iterations. As the number of iterations increases, the estimates become more accurate, but the complexity and execution time grow as well.

We refer to this estimate as $est_{Gauss}$.
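A minimal sketch of this procedure is shown below: after $k$ Lanczos steps with starting vector $\mathbf{x}/\|\mathbf{x}\|$, the Gauss rule evaluates to $\|\mathbf{x}\|^2\, e_1^T T_k^{-m} e_1$, where $T_k$ is the Lanczos tridiagonal matrix (cf. [5,6]). No reorthogonalization is performed, and the naming is ours.

```matlab
function est = est_gauss(A, x, m, k)
% Gauss quadrature estimate of x'*A^(-m)*x via k Lanczos steps (cf. [5,6]).
n = length(x);
Q = zeros(n, k+1);  a = zeros(k, 1);  bta = zeros(k, 1);
Q(:,1) = x / norm(x);
for j = 1:k
    w = A * Q(:,j);
    if j > 1, w = w - bta(j-1) * Q(:,j-1); end
    a(j) = Q(:,j)' * w;
    w = w - a(j) * Q(:,j);
    bta(j) = norm(w);
    if bta(j) == 0, k = j; break; end     % invariant subspace reached
    Q(:,j+1) = w / bta(j);
end
T = diag(a(1:k));                          % Lanczos tridiagonal matrix T_k
for j = 1:k-1
    T(j,j+1) = bta(j);  T(j+1,j) = bta(j);
end
e1 = [1; zeros(k-1, 1)];
est = norm(x)^2 * (e1' * (T^m \ e1));      % ||x||^2 * e1' * T_k^(-m) * e1
end
```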

#### **7. Application in Estimating $x^T (AA^T + \lambda I_n)^{-m} x$**

In several applications, the matrix at hand has the form $B = AA^T + \lambda I_n$, $\lambda > 0$, which is a symmetric positive definite matrix. For instance, this type of matrix appears when specifying the regularization parameter in Tikhonov regularization. In this case, the estimation of quadratic forms of the type $x^T B^{-m}x$ is required. The estimates derived in the previous sections involve positive powers of $B$, i.e., $B^k$, $k \in \mathbb{N}$. However, since the direct computation of the matrix powers $B^k$ is not stable for every $\lambda$, our next goal is to develop an alternative approach to their evaluation. As we show below, the explicit computation of $B^k$ can be obviated.

Since the matrices $AA^T$ and $I_n$ commute, the binomial theorem applies,

$$B^m = (AA^T + \lambda I_n)^m = \sum_{j=0}^{m}\binom{m}{j}\lambda^j (AA^T)^{m-j}, \quad m \in \mathbb{N},$$

and hence

$$B^m\mathbf{x} = \sum_{j=0}^{m}\binom{m}{j}\lambda^j (AA^T)^{m-j}\mathbf{x}, \quad m \in \mathbb{N}.$$

The above representation of the vector $B^m\mathbf{x}$ effectively allows us to avoid computing the powers of the matrix $B = AA^T + \lambda I_n$ that appear in the estimates of the quadratic form $x^T B^{-m}x$. The vectors $(AA^T)^{m-j}\mathbf{x}$ can be evaluated successively as follows:

$$A^T \mathbf{x}, \quad A A^T \mathbf{x}, \quad A^T A A^T \mathbf{x}, \quad A A^T A A^T \mathbf{x}, \quad \dots$$
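The following MATLAB helper (name ours) implements exactly this scheme: it builds the vectors $(AA^T)^i\mathbf{x}$ by alternating products with $A^T$ and $A$ and then combines them with the binomial weights.

```matlab
function y = Bm_times_x(A, lambda, m, x)
% Compute (A*A' + lambda*I)^m * x via the binomial expansion,
% without ever forming B = A*A' + lambda*I or its powers.
W = zeros(length(x), m+1);
W(:,1) = x;                                % W(:,i+1) = (A*A')^i * x
for i = 1:m
    W(:,i+1) = A * (A' * W(:,i));          % successive products A'v, then A(A'v)
end
y = zeros(size(x));
for j = 0:m
    y = y + nchoosek(m, j) * lambda^j * W(:, m-j+1);
end
end
```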

#### **8. Numerical Examples**

Here, we present several numerical examples that illustrate the performance of the derived estimates. All computations were performed using MATLAB (R2018a). Throughout the numerical examples, we denote by $e_i$ the $i$th column of the identity matrix of appropriate order and by $\mathbf{1}_n$ the vector of length $n$ with all elements equal to one.

#### **Example 1.** *Upper bounds for the absolute error.*

In this example, we consider the symmetric positive definite matrix $A = B^T B \in \mathbb{R}^{1000\times 1000}$, where $B$ is the Parter matrix from the MATLAB gallery. The condition number of the matrix $A$ is $\kappa = 17.8983$. We choose the vector $x \in \mathbb{R}^{1000}$ as the 100th column of the identity matrix, i.e., $x = e_{100}$. We estimate the quadratic form $x^T A^{-2}x$, whose exact value is 0.0127. In Table 1, we present the estimates generated by the proposed approach and the upper bounds for the corresponding absolute error, which are given in Proposition 1.


**Table 1.** Estimating $x^T A^{-2}x = 0.0127$, where $A = B^T B$, $B$ = Parter, $x = e_{100}$.

**Example 2.** *Estimation of quadratic forms.*

We consider the Kac–Murdock–Szegö (KMS) matrix $A \in \mathbb{R}^{1000\times 1000}$, which is symmetric positive definite and Toeplitz. The elements of this matrix are $A_{ij} = r^{|i-j|}$, $i, j = 1, 2, \dots, 1000$, $0 < r < 1$. We tested this matrix for $r = 0.2$, for which the condition number of $A$ is $\kappa = 2.25$. We estimated the quadratic forms $x^T A^{-2}x = 1.2072$ and $x^T A^{-3}x = 296.8727$. The chosen vectors were $x = e_{1000} + \frac{1}{4}e_{120} \in \mathbb{R}^{1000}$ and $x = \mathbf{1}_n$, respectively. The results are provided in Tables 2 and 3. As shown, the derived estimates are satisfactory in both cases.

**Table 2.** Estimating $x^T A^{-2}x = 1.2072$, where $A$ = KMS, $x = e_{1000} + \frac{1}{4}e_{120}$.



**Table 3.** Estimating $x^T A^{-3}x = 296.8727$, where $A$ = KMS, $x = \mathbf{1}_n$.

**Example 3.** *Estimation of the whole diagonal of the covariance matrices.*

In this example, we consider the covariance matrices of order $n$ whose elements $A_{ij}$ are given by

$$A_{ij} = \begin{cases} 1 + i^{\alpha}, & i = j, \\ \dfrac{1}{|i-j|^{\beta}}, & i \ne j, \end{cases} \qquad i, j = 1, 2, \dots, n,$$

where $\alpha, \beta \in \mathbb{R}$ and $\beta \ge 1$ [9]. We estimated the whole diagonal of the inverse of the covariance matrices using the estimates derived in this work. Moreover, we used the two approaches presented in Section 6, which have been used in previous studies. We applied the Gauss quadrature using $k = 3$ Lanczos iterations. We chose the parameter values $(\alpha, \beta) = (3, 1)$. We assessed the quality of the generated estimates by computing the mean relative error (MRE) given by

$$MRE = \frac{1}{n}\sum_{i=1}^{n}\frac{|(A^{-1})_{ii} - est(i)|}{|(A^{-1})_{ii}|},$$

where $est(i)$ is the corresponding estimate of the diagonal element $(A^{-1})_{ii}$. The results are recorded in Table 4. Specifically, we analyzed the performance of the proposed estimates in terms of the MRE and the execution time (in seconds).
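As an illustration of how such a diagonal sweep can be coded, the following MATLAB sketch estimates the diagonal of $A^{-1}$ with the projection estimate `est_proj` from Section 3 (any of the other estimates could be substituted) and evaluates the MRE against the exact inverse; the setup is ours.

```matlab
% Sketch: estimate the whole diagonal of A^(-1) and measure the MRE.
n = size(A, 1);
d_exact = diag(inv(A));                    % exact values, for validation only
d_est   = zeros(n, 1);
for i = 1:n
    ei = zeros(n, 1);  ei(i) = 1;          % x = e_i picks out (A^(-1))(i,i)
    d_est(i) = est_proj(A, ei, 1, 0);      % m = 1, k = 0
end
MRE = mean(abs(d_exact - d_est) ./ abs(d_exact));
```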

**Table 4.** Mean relative errors and execution times for estimating the diagonal of the covariance matrices of order $n$ with $(\alpha, \beta) = (3, 1)$.


**Example 4.** *Network analysis.*

In this example, we tested the behavior of the proposed estimates in network analysis. Specifically, we estimated the whole diagonal of the resolvent matrix $(I_n - aA)^{-1}$, where $A$ is the adjacency matrix of the network. We chose the parameter $a = 0.85/\lambda_{\max}$. We considered three adjacency matrices of order $n = 4000$, selected from the CONTEST toolbox [10]. In Table 5, we provide the mean relative error for estimating the whole diagonal of the resolvent matrix, with the execution time in seconds given in brackets.

**Table 5.** Mean relative errors and execution times (seconds) for estimating the diagonal of the resolvent matrix.


**Example 5.** *Solution of ill-posed problems via the GCV method.*

Let us consider the least-squares problem $\min_{x\in\mathbb{R}^d}\|Ax - b\|^2$, where $A \in \mathbb{R}^{n\times d}$ and $b \in \mathbb{R}^n$. In ill-posed problems, the solution of this minimization problem is not satisfactory, and it is necessary to replace it with a penalized least-squares problem of the form

$$\min_{x\in\mathbb{R}^d}\left\{\|Ax - b\|^2 + \lambda\|x\|^2\right\},\tag{10}$$

where $\lambda > 0$ is the regularization parameter. This is the popular Tikhonov regularization. The solution of (10) is $x_\lambda = (A^TA + \lambda I_d)^{-1}A^Tb$. A major issue is the specification of the regularization parameter $\lambda$. This can be achieved by minimizing the GCV function. Following the expression of the GCV function $V(\lambda)$ in terms of quadratic forms presented in [11], we write

$$V(\lambda) = \frac{b^T B^{-2} b}{\left(\mathrm{Tr}(B^{-1})\right)^2},$$

where $B = AA^T + \lambda I_n \in \mathbb{R}^{n\times n}$.

In this example, we considered three test problems of order $n$ selected from the Regularization Tools package [12]: the Shaw, Tomo, and Baart problems. Each of these test problems generates a matrix $A$ and a solution $x$. We computed the error-free vector $b = Ax$. The perturbed data vector $b_{per} \in \mathbb{R}^n$ was computed by the formula $b_{per} = b + \frac{\sigma\|b\|}{\sqrt{n}}\,e$, where $\sigma$ is a given noise level and $e \in \mathbb{R}^n$ is Gaussian noise with mean zero and variance one. We estimated the GCV function using the estimate $est_{h1}$ without forming the matrix $B$, relying instead on the relations for $B^m x$ given in Section 7. We found the minimum of the corresponding estimate over a grid of values for $\lambda$ and computed the solution $x_\lambda$. Concerning the grid of $\lambda$, we considered 100 equally spaced values in log-scale in the interval $[10^{-12}, 10]$.
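The grid search itself might be sketched as follows, with $b_{per}$ in place of $b$. We use $est_{h1}$ with $m = 2$ for the numerator $b^T B^{-2}b$ and the helper `Bm_times_x` from Section 7 for the products $B^2 b$; purely to keep the sketch short, the trace $\mathrm{Tr}(B^{-1})$ is computed exactly here, which the full procedure would also avoid. All names and simplifications are ours.

```matlab
% Sketch of the GCV grid search (A and b_per come from a test problem).
lambdas = logspace(-12, 1, 100);           % 100 log-spaced values in [1e-12, 10]
V = zeros(size(lambdas));
for t = 1:numel(lambdas)
    la  = lambdas(t);
    B2b = Bm_times_x(A, la, 2, b_per);     % B^2 * b without forming B
    num = norm(b_per)^6 * norm(B2b)^2 / (b_per' * B2b)^3;  % est_h1 for b'*B^(-2)*b
    trB = trace(inv(A*A' + la*eye(size(A,1))));            % exact, for brevity only
    V(t) = num / trB^2;
end
[~, t_best] = min(V);
lambda_best = lambdas(t_best);             % minimizer of the estimated GCV
```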

In Figures 1–3, we plot the exact solution $x$ of each problem and the estimated solution $x_\lambda$ generated by Tikhonov regularization via the GCV function. Specifically, for each test problem, we depict two graphs. The left-hand-side graph corresponds to the determination of the regularization parameter via the GCV estimated using $est_{h1}$, and the right-hand-side graph concerns the exact computation of the GCV function. In Table 6, we list the characteristics of Figures 1–3. In particular, we provide the order $n$, the noise level $\sigma$, and the error norm of the derived solution $x_\lambda$ for each test problem.


**Table 6.** Characteristics of Figures 1–3.

**Figure 1.** Solution of the Shaw test problem via an estimation of GCV (**left**) and the exact GCV (**right**).

**Figure 2.** Solution of the Tomo test problem via an estimation of GCV (**left**) and the exact GCV (**right**).

**Figure 3.** Solution of the Baart test problem via an estimation of GCV (**left**) and the exact GCV (**right**).
