1. Introduction
Let $A \in \mathbb{R}^{n \times n}$ be a given symmetric positive definite matrix and $x \in \mathbb{R}^n$. We are interested in estimating quadratic forms of the type $x^T A^{-m} x$, $m \in \mathbb{N}$. Our main goal was to find an efficient and cheap approximate evaluation of the desired quadratic form without the direct computation of the matrix $A^{-m}$. To this end, we revisited the approach for estimating the quadratic form $x^T A^{-1} x$ developed in [1] and extended it to the case of an arbitrary negative power of $A$.
The computation of quadratic forms is a mathematical problem with many applications; we mention some typical ones below.
Statistics: The inverse of the covariance matrix, referred to as the precision matrix, appears frequently in statistics. The covariance matrix reveals marginal correlations between variables, whereas the precision matrix represents the conditional correlations between pairs of variables, given the remaining variables [2]. The diagonal of the inverse of a covariance matrix provides information about the quality of data in uncertainty quantification [3].
Network analysis: The determination of the importance of the nodes of a graph is a major issue in network analysis. This information can be extracted by evaluating the diagonal elements of the matrix $(I - \alpha A)^{-1}$, where $A$ is the adjacency matrix of the network, $0 < \alpha < 1/\rho(A)$, and $\rho(A)$ is the spectral radius of $A$. This matrix is referred to as the resolvent matrix; see, for example, [4] and the references therein.
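As an illustrative sketch (in Python/NumPy, with names of our choosing), the diagonal of the resolvent can be computed directly by solving linear systems against the columns of the identity; this serves as a reference value for the estimates discussed later.

```python
import numpy as np

def resolvent_diagonal(A, alpha):
    """Diagonal of the resolvent (I - alpha*A)^{-1} of an adjacency matrix A.

    Direct reference computation: factor I - alpha*A once and solve
    against all columns of the identity, keeping the diagonal entries.
    """
    n = A.shape[0]
    M = np.eye(n) - alpha * A
    return np.linalg.solve(M, np.eye(n)).diagonal().copy()

# Small example: path graph on 4 nodes; alpha chosen below 1/rho(A).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
alpha = 0.9 / max(abs(np.linalg.eigvalsh(A)))
d = resolvent_diagonal(A, alpha)
```

For large networks, this direct computation is exactly what the estimates in this work aim to avoid; here it only fixes the target quantity.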
Numerical analysis: Quadratic forms arise naturally in the computation of the regularization parameter in Tikhonov regularization for solving ill-posed problems. In this case, the matrix has the form $A^T A + \lambda I$, $\lambda > 0$. In the literature, many methods have been proposed for the selection of the regularization parameter $\lambda$, such as the discrepancy principle, cross-validation, generalized cross-validation (GCV), the L-curve, and so forth; see, for example, [5] (Chapter 15) and the references therein. These methods involve quadratic forms of the above type with the matrix $A^T A + \lambda I$ in place of $A$.
In practice, the exact computation of a quadratic form is often replaced by an estimate that is faster to evaluate. Owing to its numerous applications, the estimation of quadratic forms is an important practical problem that has been studied extensively in the literature. Let us mention some well-known methods. A widely used approach is based on Gaussian quadrature [5] (Chapter 7), [6]. Moreover, extrapolation procedures have been proposed: in [7], families of estimates for the bilinear form $x^T A^{-1} y$ were developed for an arbitrary nonsingular matrix, and in [8] for a Hermitian matrix.
In the present work, we consider alternative approaches to this problem. To begin, notice that the value of the quadratic form is proportional to the squared norm of $x$. Therefore, the task of estimating $x^T A^{-m} x$ consists of two steps:
1. Finding a value $\nu$ such that
$$x^T A^{-m} x \approx \nu; \quad (1)$$
2. Assessing the absolute error of the above estimate, i.e., determining a bound for the quantity
$$\left| x^T A^{-m} x - \nu \right|. \quad (2)$$
In Section 2, we present upper bounds for the absolute error (2) for any given estimate $\nu$. Section 3 is devoted to estimates of the value $\nu$ in (1) obtained using a projection method. In Section 4, we use the bounds from Section 2 as a stepping stone for estimating $\nu$ using a minimization method. A heuristic approach is outlined in Section 5. In Section 6, we briefly describe two methods that were used in previous studies, namely an extrapolation approach and one based on Gaussian quadrature. Section 7 focuses on adapting the proposed estimates to the case of a matrix of the form $A^T A + \lambda I$. Numerical examples illustrating the performance of the derived estimates are presented in Section 8. We end this work with several concluding remarks in Section 9.
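Throughout, the reference quantity $x^T A^{-m} x$ can be computed for validation without ever forming a negative matrix power, since applying $A^{-1}$ amounts to solving a linear system. The following Python sketch illustrates this (the experiments reported later use MATLAB; the function name and test data here are ours).

```python
import numpy as np

def quad_form_neg_power(A, x, m):
    """Reference value of x^T A^{-m} x for symmetric positive definite A.

    No negative matrix power is formed: applying A^{-1} amounts to
    solving a linear system, repeated m times.
    """
    y = x.copy()
    for _ in range(m):
        y = np.linalg.solve(A, y)   # y <- A^{-1} y
    return float(x @ y)

# Example: an SPD matrix and a random vector.
rng = np.random.default_rng(0)
C = rng.standard_normal((50, 50))
A = C.T @ C + np.eye(50)
x = rng.standard_normal(50)
val = quad_form_neg_power(A, x, 2)
```

Each application of $A^{-1}$ costs one linear solve; the estimates developed below aim to replace even these solves by a few matrix-vector products.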
3. Estimate of $x^T A^{-m} x$ by the Projection Method
Our goal is to find a number $\nu$ such that $x^T A^{-m} x \approx \nu$ (cf. (1)). To that end, let us take a fixed
and consider the following decomposition of
x,
where
. (That is,
is a projection of
x onto
along the orthogonal complement of
.) Then, we have
Using the assumption
, we obtain
and so
Hence, we obtain a family of estimates for
as follows:
We denote these estimates by , . The computational implementation requires matrix-vector products (mvps).
Let us now explore the error corresponding to the above choice of
. We have
therefore,
Since
is the estimate (see (
1)), the error term is given by
. Bounds on its absolute value can be found using Proposition 1 with
Remark 1. Let us comment on the choice of the parameter k.
Observe that upper bounds UB1 and UB4 from Proposition 1 are minimal for . In this case, we have ; thus, b has the smallest possible norm. Therefore, from the point of view of minimizing the upper bound on the error (more precisely, minimizing upper bounds UB1 and UB4), a convenient choice is .
However, if the goal is fast estimation, we can take for even m and for odd m, as these two choices provide and , respectively, which are both easy to evaluate.
In general, for any choice of k, the error of the estimate can be assessed using Proposition 1.
4. Estimate of $x^T A^{-m} x$ Using the Minimization Method
The estimates that we present in this section stem from the upper bounds UB2 and UB3 for the absolute error , which are derived in Proposition 1. Our goal is to reduce the absolute error by finding the value that minimizes these bounds.
Plugging
in the explicit formulas for UB2 and UB3, we can easily check that the two upper bounds in question attain their minimal values if and only if
minimizes the function
where
corresponds to UB2 and
corresponds to UB3. By differentiating this expression with respect to
, we find that the upper bounds UB2 and UB3 are minimized at
, which is the root of the equation
where, as before, the values
and
correspond to UB2 and UB3, respectively. With this value
, we obtain the estimate of
as
For the sake of brevity, we adopt the notation for and for . The computational implementation requires mvps.
5. The Heuristic Approach
Let us consider the quantity
We refer to as the generalized index of proximity.
Lemma 1. Assume that is a symmetric matrix. For any nonzero vector , the value satisfies . The equality holds true if and only if x is an eigenvector of A.
Proof. By the Cauchy–Schwarz inequality, we have ; hence, . The equality is equivalent to the equality in the Cauchy–Schwarz inequality, which occurs if and only if the vector is a scalar multiple of the vector x, in other words, when for a certain . This is further equivalent to (with satisfying ) given the assumption that A is symmetric. □
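The exact form of the generalized index is not reproduced above; as a minimal sketch, the following Python code checks the basic quantity that the Cauchy–Schwarz step of the proof is built on, namely $\rho(x) = (x^T A x)^2 / (\|x\|^2 \, \|Ax\|^2)$, which satisfies $\rho(x) \le 1$ with equality exactly for eigenvectors of a symmetric $A$. (Treating this as the index of proximity is our assumption.)

```python
import numpy as np

def proximity_index(A, x):
    """Index of proximity built on the Cauchy-Schwarz step of Lemma 1:

        rho(x) = (x^T A x)^2 / (||x||^2 * ||A x||^2) <= 1,

    with equality exactly when x is an eigenvector of the symmetric A.
    """
    Ax = A @ x
    return (x @ Ax) ** 2 / ((x @ x) * (Ax @ Ax))
```

Values of the index close to 1 thus signal that $x$ behaves almost like an eigenvector, which is the situation exploited by the heuristic estimate below.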
As a result of Lemma 1, the equality
where
,
, is identically true for any eigenvector of
A (i.e., for any vector satisfying
), and becomes approximately true for vectors
x with the property
.
Therefore, if
, we have
We refer to this estimate as . If, in particular, and , we denote the estimate by , and if , the corresponding estimate is denoted by . The computational implementation requires mvps.
7. Application in Estimating $x^T (A^T A + \lambda I)^{-m} x$
In several applications, the matrix has the form $B = A^T A + \lambda I$, $\lambda > 0$, which is a symmetric positive definite matrix. For instance, this type of matrix appears when specifying the regularization parameter in Tikhonov regularization. In this case, the estimation of quadratic forms of the type $x^T B^{-m} x$ is required. The estimates derived in the previous sections involve positive powers of $B$, i.e., vectors of the form $B^j x$. However, since the direct computation of the matrix powers of $B$ is not numerically stable for every $\lambda$, our next goal was to develop an alternative approach to their evaluation. As we show below, the explicit computation of powers of $B$ can be obviated.
Since the matrices $A^T A$ and $\lambda I$ commute, the binomial theorem applies,
$$B^j = (A^T A + \lambda I)^j = \sum_{i=0}^{j} \binom{j}{i} \lambda^{j-i} (A^T A)^i,$$
and hence
$$B^j x = \sum_{i=0}^{j} \binom{j}{i} \lambda^{j-i} (A^T A)^i x.$$
The above representation of the vector $B^j x$ effectively allows us to avoid the computation of the powers of the matrix $B$ that appear in the estimates of the quadratic form. The expressions of the type $(A^T A)^i x$ can be evaluated successively as follows:
$$(A^T A)^i x = A^T \left( A \left( (A^T A)^{i-1} x \right) \right), \qquad i = 1, 2, \ldots
$$
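The binomial expansion of $B^j x = (A^T A + \lambda I)^j x$ together with the successive evaluation of the vectors $(A^T A)^i x$ described above can be sketched in Python as follows (the function name is ours; the paper's experiments use MATLAB).

```python
import numpy as np
from math import comb

def Bj_times_x(A, lam, j, x):
    """Evaluate (A^T A + lam*I)^j x without forming B = A^T A + lam*I.

    Uses the binomial expansion
        B^j x = sum_{i=0}^{j} C(j, i) * lam^(j-i) * (A^T A)^i x,
    where the vectors y_i = (A^T A)^i x are built successively as
    y_i = A^T (A y_{i-1}), i.e., two matrix-vector products per step.
    """
    y = x.astype(float).copy()            # y_0 = x
    out = np.zeros_like(y)
    for i in range(j + 1):
        out += comb(j, i) * lam ** (j - i) * y
        if i < j:
            y = A.T @ (A @ y)             # y_{i+1} = (A^T A) y_i
    return out
```

Only products with $A$ and $A^T$ are needed, so neither $B$ nor any of its powers is ever formed explicitly.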
8. Numerical Examples
Here, we present several numerical examples that illustrate the performance of the derived estimates. All computations were performed using MATLAB (R2018a). Throughout the numerical examples, we denote by $e_i$ the $i$th column of the identity matrix of appropriate order and by $\mathbf{1}_n$ the vector of length $n$ with all elements equal to one.
Example 1. Upper bounds for the absolute error.
In this example, we consider the symmetric positive definite matrix
, where
B is the Parter matrix selected from the MATLAB gallery. The condition number of the matrix
A is
. We choose the vector
as the 100th column of the identity matrix, i.e.,
. We estimate the quadratic form
whose exact value is
. In
Table 1, we present the generated estimates following the proposed approach and the upper bounds for the corresponding absolute error, which are given in Proposition 1.
Example 2. Estimation of quadratic forms.
We consider the Kac–Murdock–Szegö (KMS) matrix
, which is symmetric positive definite and Toeplitz. The elements $a_{ij}$ of this matrix are given by $a_{ij} = \rho^{|i-j|}$ with $0 < \rho < 1$. We tested this matrix for
and the condition number of
A is
. We estimated both the quadratic forms
and
. The chosen vectors were
and
. The results are provided in
Table 2 and
Table 3. As shown, the derived estimates are satisfactory in both cases.
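A sketch of how the reference values of such quadratic forms can be produced in Python (the order $n = 200$, the value $\rho = 0.5$, and the all-ones test vector are assumptions for illustration; the parameters used in the experiment are not reproduced here):

```python
import numpy as np

# Kac-Murdock-Szego matrix with entries a_ij = rho^|i-j| (SPD Toeplitz).
n, rho = 200, 0.5
idx = np.arange(n)
A = rho ** np.abs(idx[:, None] - idx[None, :])

# Reference values of x^T A^{-1} x and x^T A^{-2} x via linear solves.
x = np.ones(n)                      # illustrative test vector
y1 = np.linalg.solve(A, x)          # A^{-1} x
y2 = np.linalg.solve(A, y1)        # A^{-2} x
qf1, qf2 = float(x @ y1), float(x @ y2)
```

Such reference values are what the entries of the comparison tables are measured against.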
Example 3. Estimation of the whole diagonal of the covariance matrices.
In this example, we consider the covariance matrices of order
n, whose elements
are given by
where
and
[
9]. We estimated the whole diagonal of the inverse of covariance matrices through the derived estimates presented in this work. Moreover, we used the two approaches presented in
Section 6, which were used in previous studies. We applied the Gauss quadrature using
Lanczos iterations. We chose the pair of values for the parameters
. We validated the quality of the generated estimates by computing the mean relative error (MRE) given by
$$\mathrm{MRE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| (A^{-1})_{ii} - \hat{d}_i \right|}{\left| (A^{-1})_{ii} \right|},$$
where $\hat{d}_i$ is the corresponding estimate for the diagonal element $(A^{-1})_{ii}$. The results are recorded in
Table 4. Specifically, we analyzed the performance of the proposed estimates in terms of the MRE and the execution time (in seconds).
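For completeness, a minimal sketch of the mean relative error computation (the helper name is ours; the MRE averages the entrywise relative errors of the diagonal estimates):

```python
import numpy as np

def mean_relative_error(exact_diag, est_diag):
    """MRE of a diagonal estimate: mean of |exact - estimate| / |exact|."""
    exact = np.asarray(exact_diag, dtype=float)
    est = np.asarray(est_diag, dtype=float)
    return float(np.mean(np.abs(exact - est) / np.abs(exact)))
```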
Example 4. Network analysis.
In this example, we tested the behavior of the proposed estimates in network analysis. Specifically, we estimated the whole diagonal of the resolvent matrix
, where
A is the adjacency matrix of the network. We chose the parameter
. We considered three adjacency matrices of order
, which were generated using the CONTEST toolbox [
10]. In
Table 5, we provide the mean relative error for estimating the whole diagonal of the resolvent matrix, together with the execution time in seconds (in brackets).
Example 5. Solution of ill-posed problems via the GCV method.
Let us consider the least-squares problem of the form
, where
and
. In ill-posed problems, the solution of the above minimization problem is not satisfactory, and it is necessary to replace it with a penalized least-squares problem of the form
where
is the regularization parameter. This is the popular Tikhonov regularization. The solution of (
10) is
. A major issue is the specification of the regularization parameter
. This can be achieved by minimizing the GCV function. Following the expression of the GCV function
in terms of quadratic forms presented in [
11], we write
where
.
In this example, we considered three test problems of order
n, which were selected from the Regularization Tools package [
12]. In particular, we considered the Shaw, Tomo, and Baart problems. Each of these test problems generates a matrix
A and a solution
x. We computed the error-free vector
b such that
. The perturbed data vector
was computed by the formula
, where
is a given noise level and
is Gaussian noise with mean zero and variance one. We estimated the GCV function using the estimate
without computing the matrix
B, but we used the relations for
given in
Section 7. We found the minimum of the corresponding estimate over a grid of values for
and we computed the solution
. Concerning the grid of
, we considered 100 equally spaced values in log-scale in the interval
.
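The "exact GCV" branch of this comparison can be sketched as follows, using the standard SVD-based evaluation of the GCV function over a grid of $\lambda$ values. The synthetic ill-conditioned problem below stands in for the Shaw, Tomo, and Baart data generated by the Regularization Tools package, which we do not reproduce; all names and parameter values here are assumptions for illustration.

```python
import numpy as np

def gcv_curve(A, b, lambdas):
    """Evaluate the standard GCV function on a grid of lambda values
    via the SVD of a square matrix A:

        G(lam) = ||b - A x_lam||^2 / trace(I - A (A^T A + lam I)^{-1} A^T)^2,

    using filter factors f_i = s_i^2 / (s_i^2 + lam).
    """
    U, s, Vt = np.linalg.svd(A)
    beta = U.T @ b
    G = []
    for lam in lambdas:
        r = lam / (s ** 2 + lam)          # r_i = 1 - f_i
        G.append(np.sum((r * beta) ** 2) / np.sum(r) ** 2)
    return np.array(G)

# Synthetic ill-conditioned problem (stand-in for the test problems).
rng = np.random.default_rng(1)
n = 64
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(np.logspace(0, -8, n)) @ V.T
x_true = V @ np.ones(n)
b = A @ x_true + 1e-4 * rng.standard_normal(n)   # noisy data

# 100 logarithmically spaced grid points, as in the experiment setup.
lambdas = np.logspace(-10, 0, 100)
lam_best = lambdas[np.argmin(gcv_curve(A, b, lambdas))]
x_reg = np.linalg.solve(A.T @ A + lam_best * np.eye(n), A.T @ b)
```

The regularized solution obtained this way is the baseline against which the solutions produced via the estimated GCV function are compared.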
In
Figure 1,
Figure 2 and
Figure 3, we plot the exact solution
x of the problem and the estimated solution
generated by Tikhonov regularization via the GCV function. Specifically, for each test problem, we depict two graphs. The left-hand-side graph corresponds to the determination of the regularization parameter via the estimated GCV using
, and the right-hand-side graph concerns the exact computation of the GCV function. In
Table 6, we list the characteristics of
Figure 1,
Figure 2 and
Figure 3. In particular, we provide the order
n, the noise level
, and the error norm of the derived solution
of each test problem.