Article

A Covariance-Free Strictly Complex-Valued Relevance Vector Machine for Reducing the Order of Linear Time-Invariant Systems

School of Mathematics and Statistics, Shaoguan University, Shaoguan 512000, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(19), 2991; https://doi.org/10.3390/math12192991
Submission received: 2 September 2024 / Revised: 20 September 2024 / Accepted: 24 September 2024 / Published: 25 September 2024
(This article belongs to the Special Issue Applied Mathematics in Data Science and High-Performance Computing)

Abstract

Multiple-input multiple-output (MIMO) linear time-invariant (LTI) systems exhibit enormous computational costs for high-dimensional problems. To address this problem, we propose a novel approach for reducing the dimensionality of MIMO systems. The method leverages the Takenaka–Malmquist basis and incorporates the strictly complex-valued relevance vector machine (SCRVM). We refer to this method as covariance-free maximum likelihood (CoFML). The proposed method avoids the explicit computation of the covariance matrix; instead, CoFML solves multiple linear systems to obtain the required posterior statistics. This is achieved by exploiting a preconditioning matrix and a matrix diagonal element estimation rule. We provide theoretical justification for this approximation and show why our method scales well in high-dimensional settings. By employing the CoFML algorithm, we approximate MIMO systems in parallel, resulting in significant computational time savings. The effectiveness of this method is demonstrated through three well-known examples.

1. Introduction

A reduction in dimensionality is crucial in multiple-input multiple-output (MIMO) system models due to the high computational costs associated with high-dimensional problems. A MIMO linear time-invariant (LTI) system with l outputs and k inputs can be represented by a matrix of transfer functions in the following form [1,2,3,4,5,6]:
$$F(s) = \frac{B_{m-1}s^{m-1} + \cdots + B_0}{d_m s^m + d_{m-1}s^{m-1} + \cdots + d_1 s + 1}. \qquad (1)$$
Here, $B_i \in \mathbb{R}^{l \times k}$, $m$ is the system order, and $d_i \in \mathbb{R}$, $i = 0, 1, 2, \ldots, m$.
The n-th approximate model is given by the following [1,2,3,4,5,6]:
$$\hat{F}_n(s) = \frac{\hat{B}_1 s^{n-1} + \cdots + \hat{B}_n}{\hat{d}_0 s^n + \hat{d}_1 s^{n-1} + \cdots + 1}, \qquad (2)$$
where $\hat{B}_i \in \mathbb{R}^{l \times k}$, $n$ is the reduced system order, and $\hat{d}_i \in \mathbb{R}$, $i = 0, 1, 2, \ldots, n$.
Numerous techniques have been developed for the model reduction of linear time-invariant systems. These techniques encompass a wide range of approaches, including linear matrix inequalities [7], error minimization [8], magnitude and phase criteria [9], balanced truncation [10,11], rational interpolation [12], the Krylov method [13], adaptive Fourier decomposition (AFD) [14], Routh approximations, and Padé-type model reductions [15,16]. However, these methods incur substantial computational costs and require prior knowledge of the actual system, which is typically unavailable in practice. Consequently, there is a persistent demand for fast and efficient methods to reduce the order of LTI systems. To address this research gap, this study explores novel MIMO system model reduction approaches that are computationally efficient and do not rely on explicit knowledge of the underlying system. By leveraging recent advances in the field, we seek to develop effective techniques that reliably reduce the order of LTI systems while preserving their essential dynamic characteristics.
AFD and SCRVM methods are effective in reducing LTI systems. These methods rely on the rational orthogonal basis, also called the Takenaka–Malmquist (TM) system [17]. On the open right-half plane $\Pi = \{ s \in \mathbb{C} : \Re(s) > 0 \}$, the TM system is defined as follows:
$$B_k(s) = \frac{\sqrt{2\Re(a_k)}}{s + \bar{a}_k} \prod_{l=1}^{k-1} \frac{s - a_l}{s + \bar{a}_l}, \quad k = 1, 2, \ldots, \qquad (3)$$
where $a_k \in \Pi$ and $\Re(\cdot)$ denotes the real part of a complex number. The system $\{ B_k \}_{k=0}^{\infty}$ (with $B_0 = 1$) forms a basis of the Hardy space $H^2(\Pi)$ if and only if
$$\sum_{k=1}^{\infty} \frac{2\Re(a_k)}{1 + |a_k|^2} = \infty. \qquad (4)$$
The shifted Cauchy kernel, denoted as $B(s) = e_a(s) = \frac{\sqrt{2\Re(a)}}{s + \bar{a}}$, is the building block of the TM system. These systems are commonly used in model reduction due to their linear-in-parameters model structure [18,19,20,21,22].
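To make the construction concrete, here is a minimal Python sketch (our own illustration; the function name and the pole grid are assumptions, not from the paper) that evaluates the shifted Cauchy kernels and assembles them into a dictionary of the kind used for model fitting below:

```python
import numpy as np

def build_dictionary(s_points, poles):
    """Dictionary of shifted Cauchy kernels: Omega[j, i] = e_{a_i}(s_j)
    = sqrt(2*Re(a_i)) / (s_j + conj(a_i)), for poles a_i in the right half-plane."""
    s = np.asarray(s_points, dtype=complex).reshape(-1, 1)   # N x 1
    a = np.asarray(poles, dtype=complex).reshape(1, -1)      # 1 x M
    return np.sqrt(2.0 * a.real) / (s + np.conj(a))          # N x M

# Example resembling the paper's settings: samples on the imaginary axis,
# real poles in [0.1, 10]
s_points = 1j * np.linspace(-5.0, 5.0, 251)
poles = np.linspace(0.1, 10.0, 14)
Omega = build_dictionary(s_points, poles)   # shape (251, 14)
```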
In recent years, the relevance vector machine (RVM) has emerged as a robust framework for solving sparse coding problems and providing uncertainty quantification [23,24,25]. RVM, also called sparse Bayesian learning (SBL), was proposed by Tipping [26] in 2001. RVM has achieved significant success in hyperspectral image classification [27,28,29] and reconstruction [30]. It has also found applications in various fields such as direction-of-arrival (DOA) estimation [31,32,33], classification [25], compressive sensing [34], feature selection [35], signal processing [36,37,38], image reconstruction [30,39], financial prediction [40], and more.
The RVM offers computational efficiency by progressively “pruning” irrelevant vectors during inference, reducing the computation time spent on covariance matrix inversion. This advantage is particularly valuable for high-dimensional problems. However, when dealing with large-scale datasets, RVM faces challenges due to the computational costs of the iterative process, which scales as $O(TM^3)$ in time and $O(M^2)$ in space, where $T$ is the number of iterations and $M$ is the number of parameters. Additionally, for complex-valued data, the computational time increases to $O(T(2M)^3)$ with a storage requirement of $O((2M)^2)$. Several methods exist to reduce the cost of the iterative process, such as iteratively reweighted least-squares (IRLS) [41], approximate message passing (AMP) [42], and variational inference (VI) [43]. A popular method that does not require computing the inverse matrix, called inverse-free sparse Bayesian learning (IFSBL) [33,43], is often faster in practice. IFSBL bypasses matrix inversion via a relaxed evidence lower bound (ELBO), employing a variational EM scheme for efficient, fast-converging SBL. However, these methods still lack scalability at very high dimensions $M$ and accuracy in recovering the sparse codes. Researchers have proposed approximate inference algorithms and acceleration techniques such as CoFEM [44], which has significant advantages in both scalability and accuracy over other SBL approaches. Unfortunately, CoFEM only handles real-valued data. This paper introduces a novel method called covariance-free maximum likelihood (CoFML) to address the complex-valued case in a manner similar to CoFEM. CoFML omits explicit covariance matrix calculations and obtains unbiased posterior estimates through linear system solutions and numerical linear algebra techniques, resulting in a fast, accurate, and sparse approximate model.
The paper is organized as follows: Section 2 presents an innovative approach, which we call CoFML, to enhance the computational efficiency of the strictly complex-valued relevance vector machine (SCRVM) algorithm in high-dimensional scenarios. Section 3 provides a theoretical analysis of CoFML. Numerical examples are presented in Section 4, followed by conclusions in Section 5.

2. Reduction in MIMO Systems

2.1. Strictly Complex-Valued Relevance Vector Machine Inference

We expand the MIMO LTI system $F(s)$ in (1) row by row into $L$ SISO LTI systems $F_l(s)$, with $F_l$ denoting the $l$-th system of $F(s)$, where $l = 1, 2, \ldots, L$ and $L = p \times q$. We define the single-input signal $s = [s_1, \ldots, s_N]^T$ and the single-output signal $z_l = [z_{l1}, \ldots, z_{lN}]^T$, where $N$ is the number of samples. The transfer function of the $l$-th system is represented as follows:
$$z_l = F_l(s; \theta_l) + v_l = \Omega_l \theta_l + v_l, \qquad (5)$$
where $\theta_l = [\theta_{l1}, \ldots, \theta_{lM}]^T \in \mathbb{C}^{M \times 1}$,
$$\Omega_l = \begin{bmatrix} e_{a_1}(s_1) & \cdots & e_{a_M}(s_1) \\ \vdots & \ddots & \vdots \\ e_{a_1}(s_N) & \cdots & e_{a_M}(s_N) \end{bmatrix} \in \mathbb{C}^{N \times M},$$
and $e_{a_i}(s_j) = \frac{\sqrt{2\Re(a_i)}}{s_j + \bar{a}_i}$, $i = 1, 2, \ldots, M$, $j = 1, 2, \ldots, N$. The noise vector $v_l = [v_{l1}, v_{l2}, \ldots, v_{lN}]^T$ is assumed to be complex Gaussian-distributed, with $v_{ln} \sim \mathcal{N}(v_{ln} \mid \mathbf{0}, \sigma^2 I_2)$. Here, $\mathbf{0}$ is the zero vector, $v_{ln}$ includes both the real and imaginary components of the complex noise, and $\sigma^2$ denotes the variance. Notably, these systems share the same denominator, so we have $\Omega_1 = \Omega_2 = \cdots = \Omega_L =: \Omega$. By considering the real and imaginary parts separately and assuming independence among the $z_l$, the likelihood function for the complete data set can be expressed as follows:
$$p(Z_l \mid \Theta_l, \sigma^2) = \mathcal{N}(Z_l \mid K\Theta_l, \sigma^2 I_{2N}) = (2\pi\sigma^2)^{-N} \exp\left\{ -\frac{1}{2} (Z_l - K\Theta_l)^T \sigma^{-2} I_{2N} (Z_l - K\Theta_l) \right\}, \qquad (6)$$
where
$$K = \begin{bmatrix} \Re(\Omega) & -\Im(\Omega) \\ \Im(\Omega) & \Re(\Omega) \end{bmatrix}, \quad Z_l = \begin{bmatrix} \Re(z_l) \\ \Im(z_l) \end{bmatrix}, \quad S = \begin{bmatrix} \Re(s) \\ \Im(s) \end{bmatrix}, \quad \Theta_l = \begin{bmatrix} \Re(\theta_l) \\ \Im(\theta_l) \end{bmatrix},$$
and $\Im(\cdot)$ is the imaginary part of a complex number.
The above notation represents a $2N$-dimensional Gaussian distribution over $Z_l$ with mean $K\Theta_l$ and covariance $\sigma^2 I_{2N}$. For simplicity, the implicit conditioning on the set of input vectors $S$ is omitted in Equation (6) and subsequent expressions during inference.
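For readers reproducing the real-composite representation, the following sketch (our own illustration, not the authors' code) builds $K$ and $Z_l$ from $\Omega$ and $z_l$ and verifies the equivalence numerically:

```python
import numpy as np

def realify(Omega, z):
    """Real composite form of the complex model z = Omega @ theta,
    i.e., the K and Z_l appearing in (6)."""
    K = np.block([[Omega.real, -Omega.imag],
                  [Omega.imag,  Omega.real]])   # 2N x 2M
    Z = np.concatenate([z.real, z.imag])        # 2N
    return K, Z

# Sanity check: K @ Theta reproduces the real/imaginary parts of Omega @ theta
rng = np.random.default_rng(0)
Omega = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))
theta = rng.normal(size=3) + 1j * rng.normal(size=3)
K, Z = realify(Omega, Omega @ theta)
Theta = np.concatenate([theta.real, theta.imag])
assert np.allclose(K @ Theta, Z)
```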
The prior distribution for the coefficients Θ l is modeled as a zero-mean Gaussian distribution:
$$p(\Theta_l \mid \alpha) = \mathcal{N}(\Theta_l \mid \mathbf{0}, A^{-1}),$$
where $\mathbf{0} = [0, \ldots, 0]^T \in \mathbb{R}^{2M \times 1}$, $A = \operatorname{diag}(\alpha)$, and $\alpha = [\alpha_1, \ldots, \alpha_{2M}]^T$. To enforce sparsity with equal sparsity patterns for the real and imaginary parts, we set $\alpha_l = \alpha_{M+l} > 0$ for $l = 1, 2, \ldots, M$. For convenience, we define $A_1 = \operatorname{diag}([\alpha_1, \ldots, \alpha_M]^T)$, so that $A = \begin{bmatrix} A_1 & O \\ O & A_1 \end{bmatrix}$, where $O$ is the zero matrix. Additionally, we assign Gamma distributions [24] as hyperpriors for $\alpha_m$ and the noise variance $\sigma^2$:
$$p(\alpha) = \prod_{m=1}^{2M} \Gamma(\alpha_m^{-1} \mid a, b), \quad p(\sigma^2) = \Gamma(\sigma^{-2} \mid c, d).$$
We set $a = b = c = d = 10^{-4}$ [26] to ensure flat priors.
By using Bayes’ rule, the posterior covariance and mean are given by
$$\Sigma = (\sigma^{-2} K^T K + A)^{-1},$$
and
$$U_l = \sigma^{-2} \Sigma K^T Z_l,$$
respectively. To facilitate subsequent inference, we introduce the notation $\mu_i^l = U_i^l + j U_{i+M}^l$, where $i = 1, 2, \ldots, M$. We define $\mu_l = (\mu_1^l, \mu_2^l, \ldots, \mu_M^l)^T$, leading to $U_l = \begin{bmatrix} \Re(\mu_l) \\ \Im(\mu_l) \end{bmatrix}$.
To infer the SCRVM, we use the augmented vectors $\underline{z}_l = \begin{bmatrix} z_l \\ z_l^* \end{bmatrix}$, $\underline{\theta}_l = \begin{bmatrix} \theta_l \\ \theta_l^* \end{bmatrix}$, $\underline{\mu}_l = \begin{bmatrix} \mu_l \\ \mu_l^* \end{bmatrix}$, and the augmented matrix $\underline{\Omega} = \begin{bmatrix} \Omega & O \\ O & \Omega^* \end{bmatrix}$ [45]. The composite augmented vectors and matrix satisfy the simple relations $\underline{z}_l = T Z_l$, $\underline{\theta}_l = T \Theta_l$, $\underline{\mu}_l = T U_l$, and $\underline{\Omega} = \frac{1}{2} T K T^H$, where $T = \begin{bmatrix} I & jI \\ I & -jI \end{bmatrix} \in \mathbb{C}^{2N \times 2N}$ and $T^H T = T T^H = 2I$. Hence, we have $Z_l = \frac{1}{2} T^H \underline{z}_l$, $\Theta_l = \frac{1}{2} T^H \underline{\theta}_l$, and $U_l = \frac{1}{2} T^H \underline{\mu}_l$. By using these simple transformations, we obtain the following:
$$p(\theta_l \mid z_l, \alpha, \sigma^2) = \frac{1}{\pi^M |\Sigma_{\theta_l \theta_l}|^{1/2}} \exp\left\{ -\frac{1}{2} (\underline{\theta}_l - \underline{\mu}_l)^H \Sigma_{\theta_l \theta_l}^{-1} (\underline{\theta}_l - \underline{\mu}_l) \right\}, \qquad (10)$$
where
$$\Sigma_{\theta_l \theta_l}^{-1} = \frac{1}{4} T \Sigma^{-1} T^H = \begin{bmatrix} \Lambda^{-1} & O \\ O & (\Lambda^*)^{-1} \end{bmatrix}, \quad |\Sigma_{\theta_l \theta_l}| = 2^{2M} |\Sigma|,$$
and
$$\underline{\mu}_l = T U_l = (2\sigma^2)^{-1} \Sigma_{\theta_l \theta_l} \underline{\Omega}^H \underline{z}_l = \begin{bmatrix} \mu_l \\ \mu_l^* \end{bmatrix},$$
where $\Lambda = \left( (2\sigma^2)^{-1} \Omega^H \Omega + \frac{A_1}{2} \right)^{-1}$ and $\mu_l = (2\sigma^2)^{-1} \Lambda \Omega^H z_l$.
To simplify the inference, we introduce the notation $c^2 = 2\sigma^2$ and $E = \frac{1}{2} A_1$ (with diagonal entries $b_i = \alpha_i / 2$), yielding the following:
$$\Lambda = (c^{-2} \Omega^H \Omega + E)^{-1}, \qquad (13)$$
and
$$\mu_l = c^{-2} \Lambda \Omega^H z_l. \qquad (14)$$
Therefore, the distribution in Equation (10) can also be rewritten as follows:
$$p(\theta_l \mid z_l, \alpha, \sigma^2) = \frac{1}{\pi^M |\Lambda|} \exp\left\{ -(\theta_l - \mu_l)^H \Lambda^{-1} (\theta_l - \mu_l) \right\}.$$
Then, we also have
$$p(z_l \mid \alpha, \sigma^2) = \pi^{-N} |D|^{-1} \exp\left\{ -z_l^H D^{-1} z_l \right\},$$
where $D = 2\sigma^2 I + 2\Omega A_1^{-1} \Omega^H = c^2 I + \Omega E^{-1} \Omega^H$ is a Hermitian positive definite matrix [46].
So, we can represent the log-likelihood of the MIMO LTI systems as follows:
$$\mathcal{L}(\alpha, \sigma^2) = \sum_{l=1}^{L} \log p(z_l \mid \alpha, \sigma^2) = -\left( L N \log \pi + L \log |D| + \sum_{l=1}^{L} z_l^H D^{-1} z_l \right).$$
Using the maximum likelihood method [24] to find its maximum, we obtain the following:
$$\alpha_i^{\mathrm{new}} = \frac{\gamma_i}{\frac{1}{L} \sum_{l=1}^{L} |\mu_i^l|^2}, \qquad (18)$$
and
$$(\sigma^2)^{\mathrm{new}} = \frac{1}{2N} \left( \frac{1}{L} \sum_{l=1}^{L} \| z_l - \Omega \mu_l \|^2 + \operatorname{Tr}(\Lambda \Omega^H \Omega) \right),$$
where $\gamma_i = 2 - \alpha_i \Lambda_{ii}$, $\mu_i^l$ is the $i$-th element of (14), and $\operatorname{Tr}(\cdot)$ is the trace of a matrix.
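For reference, these updates can be implemented directly with an explicit inverse; the following sketch (our own illustration, using the quantities as reconstructed above) is the $O(M^3)$ baseline that CoFML is designed to avoid:

```python
import numpy as np

def scrvm_step(Omega, Z, alpha, sigma2):
    """One explicit-covariance SCRVM update (reference baseline only).
    Omega: (N, M) complex dictionary; Z: (N, L) outputs z_1..z_L as columns."""
    N, M = Omega.shape
    c2 = 2.0 * sigma2
    G = Omega.conj().T @ Omega                          # Gram matrix
    Lam = np.linalg.inv(G / c2 + np.diag(alpha / 2.0))  # (13), costs O(M^3)
    Mu = Lam @ (Omega.conj().T @ Z) / c2                # (14); column l is mu_l
    gamma = 2.0 - alpha * np.real(np.diag(Lam))         # gamma_i = 2 - alpha_i * Lam_ii
    alpha_new = gamma / np.mean(np.abs(Mu) ** 2, axis=1)            # (18)
    resid = np.mean(np.sum(np.abs(Z - Omega @ Mu) ** 2, axis=0))
    sigma2_new = (resid + np.real(np.trace(Lam @ G))) / (2.0 * N)   # noise update
    return alpha_new, sigma2_new, Mu
```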

2.2. An Estimator for the Diagonal of the Covariance Matrix

We adopt the technique in [47] to estimate the diagonal components of Λ .
Proposition 1. 
Let $v_k = [v_{k,1}, v_{k,2}, \ldots, v_{k,M}]^T \in \mathbb{C}^{M \times 1}$ ($k = 1, 2, \ldots, 2M$) be random probe vectors whose elements are independent and identically distributed such that $\mathbb{E}\left[\sum_{k=1}^{2M} v_{k,i} \cdot v_{k,j}\right] = 0$ for all $i \neq j$, $i, j = 1, \ldots, M$. Moreover, assume that $\sum_{k=1}^{2M} v_{k,i} \cdot v_{k,j}$ is independent of $\sum_{k=1}^{2M} v_{k,i} \cdot v_{k,j'}$ for $j \neq j'$. For each $v_k$, let $r_k = \Lambda v_k$, where $\Lambda$ is given by (13) and $r_k = [r_{k,1}, \ldots, r_{k,M}]^T$. Consider the estimator $\tau \in \mathbb{R}^{M \times 1}$ defined, for each $i = 1, \ldots, M$, by
$$\tau_i = \frac{\sum_{k=1}^{2M} v_{k,i} \cdot r_{k,i}}{\sum_{k=1}^{2M} v_{k,i}^2}. \qquad (20)$$
Then, $\tau_i$ provides an unbiased estimate of $\Lambda_{ii}$.
Proof. 
The expected value of $\tau_i$ is given by the following:
$$\mathbb{E}[\tau_i] = \Lambda_{i,i} + \sum_{j \neq i} \Lambda_{i,j} \cdot \mathbb{E}\left[ \frac{\sum_{k=1}^{2M} v_{k,j} \cdot v_{k,i}}{\sum_{k=1}^{2M} v_{k,i}^2} \right].$$
Because $\mathbb{E}\left[\sum_{k=1}^{2M} v_{k,i} v_{k,j}\right] = 0$ for all $i \neq j$, we have $\mathbb{E}[\tau_i] = \Lambda_{i,i}$. The proof is completed.    □
In particular, we assume that $v_1, v_2, \ldots, v_{2M}$ have independent Rademacher entries, which means $P(v_{k,i} = 1) = P(v_{k,i} = -1) = 0.5$. Then, since $\sum_{k=1}^{2M} v_{k,i}^2 = 2M$, we can simplify (20) to the following form:
$$\tau_i = \frac{1}{2M} \sum_{k=1}^{2M} v_{k,i} \cdot r_{k,i}. \qquad (21)$$
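A minimal sketch (our own illustration) of the probe-based estimator (21): given any routine that applies $\Lambda$ to a set of vectors, the diagonal is estimated without ever forming $\Lambda$:

```python
import numpy as np

def estimate_diag(apply_Lam, M, n_probes, rng):
    """Unbiased diagonal estimate (21): tau = mean over probes of v_k * (Lam v_k),
    elementwise, with Rademacher probes v_k."""
    V = rng.choice([-1.0, 1.0], size=(M, n_probes))  # probes as columns
    R = apply_Lam(V)                                 # r_k = Lam v_k, all k at once
    return np.mean(V * R, axis=1)

# Quick check against a dense matrix (the paper uses n_probes = 2M)
rng = np.random.default_rng(1)
A = rng.normal(size=(50, 50))
Lam = np.linalg.inv(A @ A.T + np.eye(50))
tau = estimate_diag(lambda V: Lam @ V, M=50, n_probes=100, rng=rng)
err = np.max(np.abs(tau - np.diag(Lam)))             # small estimation error
```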
Theorem 1. 
Let $\tau_i$ be the estimator given by Equation (21), satisfying the conditions of Proposition 1. Then, $\tau_i$ is an optimal estimator for $\Lambda_{ii}$.
Proof. 
By Proposition 1 and the calculation method of [44], which confirms $\mathbb{E}[\tau_i] = \Lambda_{i,i}$, we can calculate the variance $\varsigma_i^2$ of $\tau_i$ as follows:
$$\varsigma_i^2 = \mathbb{E}\left[ (\tau_i - \mathbb{E}[\tau_i])^2 \right] = \mathbb{E}\left[ \left( \sum_{i' \neq i} \Lambda_{i,i'} \cdot \frac{\sum_{k=1}^{2M} v_{k,i'} \cdot v_{k,i}}{\sum_{k=1}^{2M} v_{k,i}^2} \right)^2 \right] = \mathbb{E}\left[ \sum_{i' \neq i} \sum_{i'' \neq i} \Lambda_{i,i'} \cdot \Lambda_{i,i''} \cdot \frac{e_{i',i}}{e_{i,i}} \cdot \frac{e_{i'',i}}{e_{i,i}} \right],$$
where $e_{i,l} := \sum_{k=1}^{2M} v_{k,i} \cdot v_{k,l}$. In the numerator, due to the independence of $e_{i',i}$ and $e_{i'',i}$ when $i' \neq i''$, we observe that $\mathbb{E}[e_{i',i} \cdot e_{i'',i}] = \mathbb{E}[e_{i',i}] \, \mathbb{E}[e_{i'',i}] = 0$. Consequently, $\tau_i$ serves as an optimal estimator for $\Lambda_{i,i}$. This completes the proof.    □
Then, we transform the inversion problem into the problem of solving a linear system; that is, solving the linear equation $C y = b$, where $C := \Lambda^{-1}$ and $b := v_k$. Furthermore, we can solve these systems concurrently by considering the matrix equation $CY = B$, where the inputs $C \in \mathbb{C}^{M \times M}$ and $B \in \mathbb{C}^{M \times (2M + L)}$ are defined as follows:
$$C := c^{-2} \Omega^H \Omega + E, \quad B := [b_1, b_2, \ldots, b_{2M+L}] = \left[ v_1, v_2, \ldots, v_{2M}, \, c^{-2} \Omega^H z_1, \ldots, c^{-2} \Omega^H z_L \right].$$
By labeling the columns of the solution matrix $Y \in \mathbb{C}^{M \times (2M + L)}$ as
$$Y := [y_1, \ldots, y_{2M}, y_{2M+1}, \ldots, y_{2M+L}] = [r_1, \ldots, r_{2M}, \mu_1, \ldots, \mu_L],$$
our desired quantities for CoFML, $\mu_l$ and $\tau$, can be calculated using (21). Then, we perform the update in (18) as
$$\alpha_i^{\mathrm{new}} = \frac{\gamma_i}{\frac{1}{L} \sum_{l=1}^{L} |\mu_i^l|^2},$$
where $\gamma_i = 2 - \alpha_i \tau_i$.
Then, we choose the conjugate gradient (CG) algorithm to solve the multiple linear systems $CY = B$. This only requires converting the matrix–vector multiplications of the single-system CG algorithm into matrix–matrix multiplications.
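The following sketch (our own code, not the authors' implementation) shows CG vectorized over all right-hand sides at once; each column keeps its own step sizes, and the per-iteration work becomes a single matrix–matrix product:

```python
import numpy as np

def cg_multi(matvec, B, tol=1e-8, max_iter=None):
    """Conjugate gradient on C Y = B for all columns of B simultaneously
    (C Hermitian positive definite, accessed only through matvec)."""
    B = np.asarray(B, dtype=complex)
    Y = np.zeros_like(B)
    R = B.copy()                 # per-column residuals
    D = R.copy()                 # per-column search directions
    rs = np.sum(np.abs(R) ** 2, axis=0)
    tiny = np.finfo(float).tiny  # guards columns that have already converged
    for _ in range(max_iter or B.shape[0]):
        Q = matvec(D)            # the only C-multiplication: a matrix-matrix product
        a = rs / np.maximum(np.real(np.sum(np.conj(D) * Q, axis=0)), tiny)
        Y = Y + D * a
        R = R - Q * a
        rs_new = np.sum(np.abs(R) ** 2, axis=0)
        if np.max(rs_new) < tol ** 2:
            break
        D = R + D * (rs_new / np.maximum(rs, tiny))
        rs = rs_new
    return Y
```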
Lemma 1. 
Consider the CG algorithm applied to solve $C y_i = b_i$ for $i = 1, 2, \ldots, 2M + L$, where $C \in \mathbb{C}^{M \times M}$ is a positive definite matrix and $B \in \mathbb{C}^{M \times (2M + L)}$; here, $y_i$ and $b_i$ denote the $i$-th columns of $Y$ and $B$, respectively. Let $y_i^0 \in \mathbb{C}^{M \times 1}$ be the initial solution, $y_i^*$ the exact solution, and $y_i^k$ the solution obtained by the CG algorithm at the $k$-th step. We can establish the following relationship:
$$\| y_i^k - y_i^* \|_C \leq 2 \left( \frac{\sqrt{K} - 1}{\sqrt{K} + 1} \right)^k \| y_i^0 - y_i^* \|_C, \qquad (23)$$
where $\| x \|_C := \sqrt{x^H C x}$ denotes the norm induced by the positive definite matrix $C$ for any $x \in \mathbb{C}^{M \times 1}$, and $K = \operatorname{cond}_2(C)$ [48]. From (23), we observe that when $C$ is ill-conditioned ($K \gg 1$), the convergence of the CG algorithm tends to be slow. In SBL iterations, however, many entries of $\alpha$ are pushed towards infinity [26], resulting in a large value of $K$. To address this issue, we incorporate a preconditioning matrix into the CG algorithm, which is discussed in the next section.

3. Preconditioned Conjugate Gradient Method for SCRVM

3.1. Preconditioned Matrix

Here, we would rather solve the equivalent system $\tilde{C}\tilde{Y} = \tilde{B}$ than $CY = B$, where $\tilde{C} := P^{-1/2} C P^{-1/2}$, $\tilde{B} := P^{-1/2} B$, and $\tilde{Y} := P^{1/2} Y$. The matrix $P$ is called the preconditioning matrix. Before presenting the convergence proof of the preconditioned conjugate gradient method, we outline the parallel conjugate gradient (PCG) algorithm, as shown in Algorithm 1.
Algorithm 1  PCG($C$, $B$, $P$, $T$, $2M$)
1: Initialize $\alpha_i^{(1)} \leftarrow 1$ for $i = 1, \ldots, M$.
2: for $t = 1, 2, \ldots, T$ do
3:  Define $C \leftarrow c^{-2} \Omega^H \Omega + E$.
4:  Draw $v_1, v_2, \ldots, v_{2M} \sim$ Rademacher distribution.
5:  Define $B \leftarrow [v_1, v_2, \ldots, v_{2M}, c^{-2} \Omega^H z_1, c^{-2} \Omega^H z_2, \ldots, c^{-2} \Omega^H z_L]$.
6:  $\tilde{C} \leftarrow P^{-1/2} C P^{-1/2}$, $\tilde{B} \leftarrow P^{-1/2} B$.
7:  $\tilde{Y} \leftarrow \mathrm{CG}(\tilde{C}, \tilde{B})$.
8:  $Y \leftarrow P^{-1/2} \tilde{Y}$.
9:  Compute $\tau_i \leftarrow \frac{1}{2M} \sum_{k=1}^{2M} v_{k,i} \cdot r_{k,i}$ for $i = 1, \ldots, M$.
10: if $\alpha_i^{(t+1)} = \infty$ and $t < T$ then
11:  Delete the corresponding columns of $\Omega$.
12: else
13:  Compute $\gamma_i \leftarrow 2 - \alpha_i^{(t)} \tau_i$.
14:  Update $\alpha_i^{(t+1)} \leftarrow \gamma_i / \left( \frac{1}{L} \sum_{l=1}^{L} |\mu_i^l|^2 \right)$ for $i = 1, \ldots, M$.
15: end if
16: end for
17: return $\alpha^{(T)}$, $\mu_1, \mu_2, \ldots, \mu_L$, $\tau$.
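Putting the pieces together, one iteration of the loop above can be sketched as follows (our own illustration, reusing cg_multi and the diagonal estimator from the earlier sketches; the pruning test is left to the caller):

```python
import numpy as np

def cofml_step(Omega, Z, alpha, sigma2, rng):
    """One CoFML iteration (steps 3-14 of Algorithm 1); Lambda is never formed.
    Omega: (N, M) complex; Z: (N, L) outputs; alpha: (M,) current hyperparameters."""
    N, M = Omega.shape
    c2 = 2.0 * sigma2
    e = alpha / 2.0                                  # diagonal of E
    pis = 1.0 / np.sqrt(1.0 / c2 + e)                # diagonal of P^{-1/2}, P = c^{-2} I + E

    def matvec(X):                                   # applies P^{-1/2} C P^{-1/2}
        Xp = pis[:, None] * X
        CX = Omega.conj().T @ (Omega @ Xp) / c2 + e[:, None] * Xp
        return pis[:, None] * CX

    V = rng.choice([-1.0, 1.0], size=(M, 2 * M))     # Rademacher probes (step 4)
    B = np.concatenate([V, Omega.conj().T @ Z / c2], axis=1)   # (step 5)
    Yt = cg_multi(matvec, pis[:, None] * B)          # preconditioned solves (steps 6-7)
    Y = pis[:, None] * Yt                            # undo preconditioning (step 8)
    Rr, Mu = Y[:, :2 * M], Y[:, 2 * M:]              # probe solutions and posterior means
    tau = np.real(np.mean(V * Rr, axis=1))           # diagonal estimate (step 9)
    gamma = 2.0 - alpha * tau                        # (step 13)
    alpha_new = gamma / np.mean(np.abs(Mu) ** 2, axis=1)   # (step 14)
    return alpha_new, Mu, tau                        # caller prunes where alpha_new > 1e12
```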
However, in practice, the condition $\alpha_i^{(t+1)} = \infty$ in Algorithm 1 is never met exactly, so we usually replace it with $\alpha_i^{(t+1)} > 10^{12}$. We call the resulting Algorithm 1 CoFML. After running Algorithm 1, we can obtain the approximate model (2). In particular, when $n = 2$ and the poles $a_1, a_2$ are real, the $l$-th entry of the second-order approximating model is given by the following:
$$\hat{F}_{2,l}(s) = \frac{\mu_{1,l} \sqrt{2\Re(a_1)}}{s + a_1} + \frac{\mu_{2,l} \sqrt{2\Re(a_2)}}{s + a_2} = \frac{\left( \mu_{1,l} \sqrt{2\Re(a_1)} + \mu_{2,l} \sqrt{2\Re(a_2)} \right) s + \left( \mu_{1,l} a_2 \sqrt{2\Re(a_1)} + \mu_{2,l} a_1 \sqrt{2\Re(a_2)} \right)}{s^2 + (a_1 + a_2) s + a_1 a_2} = \frac{\left( \mu_{1,l} \sqrt{2 a_1} + \mu_{2,l} \sqrt{2 a_2} \right) s + \left( \mu_{1,l} a_2 \sqrt{2 a_1} + \mu_{2,l} a_1 \sqrt{2 a_2} \right)}{s^2 + (a_1 + a_2) s + a_1 a_2}.$$
To ensure that the $l$-th steady-state value of the reduced model equals that of the original system (1), we impose the following constraint:
$$\frac{\mu_{1,l} a_2 \sqrt{2 a_1} + \mu_{2,l} a_1 \sqrt{2 a_2}}{a_1 a_2} = \frac{B_{0,l}}{b_0}.$$
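In code, forming the second-order model and its steady-state value is immediate; the sketch below (our own, with hypothetical $\mu$ values for illustration) mirrors the two expressions above:

```python
import numpy as np

def reduced_second_order(mu1, mu2, a1, a2):
    """Numerator/denominator coefficients of the second-order model above,
    F2(s) = (n1 s + n0) / (s^2 + d1 s + d0), for real poles a1, a2."""
    n1 = mu1 * np.sqrt(2 * a1) + mu2 * np.sqrt(2 * a2)
    n0 = mu1 * a2 * np.sqrt(2 * a1) + mu2 * a1 * np.sqrt(2 * a2)
    return [n1, n0], [1.0, a1 + a2, a1 * a2]

# Hypothetical coefficients, for illustration only
num, den = reduced_second_order(mu1=1.8, mu2=0.9, a1=2.3846, a2=6.9538)
steady_state = num[1] / den[2]   # F2(0), to be matched with B_{0,l}/b_0
```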

3.2. Convergence of CoFML

When the values of $a, b, c, d$ are small and $t \to \infty$, it is common for some $\alpha$ values to approach infinity, effectively “pruning” the associated “nuisance” parameters, while the remaining $\alpha$ values stay finite [26]. We designate these retained parameters as the “true” parameters.
Definition 1 
(CoFML Convergence). In Algorithm 1, let $\hat{\alpha} := \lim_{t \to \infty} \alpha^{(t)}$. We say that $(\mathcal{N}, \mathcal{T}, \hat{\alpha})$-convergence holds for CoFML if the index set $N_M := \{1, 2, \ldots, M\}$ can be partitioned into a “nuisance” set $\mathcal{N} \subseteq N_M$ and a “true” set $\mathcal{T} := N_M \setminus \mathcal{N}$ such that $\hat{\alpha}_i$ is finite for $i \in \mathcal{T}$, while $\hat{\alpha}_i = \infty$ for $i \in \mathcal{N}$.
Here, we denote by $\Omega_{\mathcal{T}}$ and $\Omega_{\mathcal{N}}$ the matrices obtained by retaining only the “true” and “nuisance” columns of $\Omega$, respectively. By leveraging the expression for the inverse of a partitioned matrix, we can demonstrate the following relationship:
$$\Lambda = \left( c^{-2} \Omega^H \Omega + E \right)^{-1} = \left( c^{-2} \begin{bmatrix} \Omega_{\mathcal{T}}^H \\ \Omega_{\mathcal{N}}^H \end{bmatrix} \begin{bmatrix} \Omega_{\mathcal{T}} & \Omega_{\mathcal{N}} \end{bmatrix} + \begin{bmatrix} E_{\mathcal{T}} & O \\ O & E_{\mathcal{N}} \end{bmatrix} \right)^{-1} = \begin{bmatrix} c^{-2} \Omega_{\mathcal{T}}^H \Omega_{\mathcal{T}} + E_{\mathcal{T}} & c^{-2} \Omega_{\mathcal{T}}^H \Omega_{\mathcal{N}} \\ c^{-2} \Omega_{\mathcal{N}}^H \Omega_{\mathcal{T}} & c^{-2} \Omega_{\mathcal{N}}^H \Omega_{\mathcal{N}} + E_{\mathcal{N}} \end{bmatrix}^{-1} \xrightarrow{t \to \infty} \begin{bmatrix} \Lambda_{\mathcal{T}} & O \\ O & O \end{bmatrix}.$$
Building upon the derivation above, we can establish the convergence theorem for the preconditioned matrix.
Theorem 2 
(PCG Convergence). Let $C^{(t)} := c^{-2} \Omega^H \Omega + E^{(t)}$ and $P^{(t)} := c^{-2} I + E^{(t)}$, respectively, denote the inverse-covariance matrix and the preconditioning matrix at the $t$-th iteration of Algorithm 1. Let $\tilde{C}^{(t)} := (P^{(t)})^{-1/2} C^{(t)} (P^{(t)})^{-1/2}$ and $\tilde{b}^{(t)} := (P^{(t)})^{-1/2} b^{(t)}$, let $y^0 \in \mathbb{C}^{M \times 1}$ be the initial solution, $y^*$ the exact solution, and $y^{k,(t)}$ the solution obtained by the algorithm at the $k$-th step of the $t$-th iteration. Then, given $(\mathcal{N}, \mathcal{T}, \hat{\alpha})$-convergence, it follows that
$$\lim_{t \to \infty} \| y^{k,(t)} - y^* \|_{\hat{C}} \leq 2 \exp\left\{ -k \sqrt{\frac{1 - \eta}{1 + \eta}} \right\} \| y^0 - y^* \|_{\hat{C}}, \qquad (25)$$
where $\eta = \| \Omega_{\mathcal{T}}^H \Omega_{\mathcal{T}} - I \|_2$.
Proof. 
From Lemma 1, we have the following bound on the residual:
$$\| y^{k,(t)} - y^* \|_{\tilde{C}^{(t)}} \leq 2 \left( \frac{\sqrt{K^{(t)}} - 1}{\sqrt{K^{(t)}} + 1} \right)^k \| y^0 - y^* \|_{\tilde{C}^{(t)}} \leq 2 \left( 1 - \frac{1}{\sqrt{K^{(t)}}} \right)^k \| y^0 - y^* \|_{\tilde{C}^{(t)}} \leq 2 \exp\left\{ -\frac{k}{\sqrt{K^{(t)}}} \right\} \| y^0 - y^* \|_{\tilde{C}^{(t)}}, \qquad (26)$$
where $K^{(t)} = \lambda_{\max}(\tilde{C}^{(t)}) / \lambda_{\min}(\tilde{C}^{(t)})$ and $\hat{K} = \lim_{t \to \infty} K^{(t)}$. From (26), we see that our goal is to bound $\hat{K} = \lambda_{\max}(\hat{C}) / \lambda_{\min}(\hat{C})$.
Let Ψ : = c 2 Ω H Ω c 2 I . So, we obtain the following:
C ( t ) = P ( t ) + Ψ C ( t ) = I + P ( t ) 1 2 Ψ P ( t ) 1 2 C ^ : = lim t C ( t ) = I + P ^ T , T 1 2 Ψ T , T P ^ T , T 1 2 O O O ,
where P ^ T , T : = c 2 I T + E T .
Equation (27) shows that if $\lambda$ is an eigenvalue of $\hat{P}_{\mathcal{T},\mathcal{T}}^{-1/2} \Psi_{\mathcal{T},\mathcal{T}} \hat{P}_{\mathcal{T},\mathcal{T}}^{-1/2}$, then $1 + \lambda$ is an eigenvalue of $\hat{C}$. Recall that a matrix $M$ is similar to $N$ if there exists an invertible matrix $T$ such that $N = T^{-1} M T$, and that similar matrices have the same eigenvalues [49]. Taking $M := \hat{P}_{\mathcal{T},\mathcal{T}}^{-1/2} \Psi_{\mathcal{T},\mathcal{T}} \hat{P}_{\mathcal{T},\mathcal{T}}^{-1/2}$, $N := \hat{P}_{\mathcal{T},\mathcal{T}}^{-1} \Psi_{\mathcal{T},\mathcal{T}}$, and $T := \hat{P}_{\mathcal{T},\mathcal{T}}^{1/2}$, it follows that $\lambda$ is also an eigenvalue of $\hat{P}_{\mathcal{T},\mathcal{T}}^{-1} \Psi_{\mathcal{T},\mathcal{T}}$. Since the absolute value of any eigenvalue of a matrix does not exceed its spectral norm, we have
$$| \lambda | \leq \| \hat{P}_{\mathcal{T},\mathcal{T}}^{-1} \Psi_{\mathcal{T},\mathcal{T}} \|_2 = \| \hat{P}_{\mathcal{T},\mathcal{T}}^{-1} ( c^{-2} \Omega_{\mathcal{T}}^H \Omega_{\mathcal{T}} - c^{-2} I ) \|_2 \leq \| ( c^{-2} I )^{-1} ( c^{-2} \Omega_{\mathcal{T}}^H \Omega_{\mathcal{T}} - c^{-2} I ) \|_2 = \| \Omega_{\mathcal{T}}^H \Omega_{\mathcal{T}} - I \|_2. \qquad (28)$$
It follows that $\lambda_{\max}(\hat{C}) \leq 1 + \| \Omega_{\mathcal{T}}^H \Omega_{\mathcal{T}} - I \|_2$ and $\lambda_{\min}(\hat{C}) \geq 1 - \| \Omega_{\mathcal{T}}^H \Omega_{\mathcal{T}} - I \|_2$, so we have
$$\hat{K} = \frac{\lambda_{\max}(\hat{C})}{\lambda_{\min}(\hat{C})} \leq \frac{1 + \| \Omega_{\mathcal{T}}^H \Omega_{\mathcal{T}} - I \|_2}{1 - \| \Omega_{\mathcal{T}}^H \Omega_{\mathcal{T}} - I \|_2} = \frac{1 + \eta}{1 - \eta}.$$
Letting $t \to \infty$ on both sides of (26) and substituting (28) then yields (25). The proof is completed. □
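The bound of Theorem 2 is easy to check numerically on a synthetic “true” dictionary; in the following sketch (our own illustration, not from the paper), the columns are orthonormal directions with mildly uneven gains so that $\eta < 1$:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, c2 = 60, 8, 2.0
# "True" columns: orthonormal directions with mildly uneven gains,
# so that eta = ||Omega_T^H Omega_T - I||_2 < 1 (required by the bound)
Q = np.linalg.qr(rng.normal(size=(N, M)) + 1j * rng.normal(size=(N, M)))[0]
Omega_T = Q @ np.diag(rng.uniform(0.8, 1.2, size=M))
alpha_T = rng.uniform(0.5, 2.0, size=M)

C = Omega_T.conj().T @ Omega_T / c2 + np.diag(alpha_T / 2.0)  # limit of C^(t)
p = 1.0 / c2 + alpha_T / 2.0                                  # diagonal of P
P_is = np.diag(1.0 / np.sqrt(p))
K_hat = np.linalg.cond(P_is @ C @ P_is)                       # condition number of C_hat
eta = np.linalg.norm(Omega_T.conj().T @ Omega_T - np.eye(M), 2)
assert K_hat <= (1 + eta) / (1 - eta) + 1e-9                  # bound of Theorem 2
```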

3.3. Computational Complexities of CoFML

After the convergence conditions above are satisfied, we analyze the algorithm's overall complexity. During each of the $T$ iterations of CoFML, at most $H$ ($H < M$) conjugate gradient (CG) steps are required, resulting in an overall time complexity of $O(TH)$ and a space complexity of $O(H)$ for each of the $L$ systems.

4. Examples

This section provides three examples to illustrate the algorithm described above. The $l$-th system impulse response is defined as follows:
$$h_l(t) = \frac{1}{2\pi j} \int_{Q - j\infty}^{Q + j\infty} e^{\zeta t} F_l(\zeta) \, d\zeta,$$
where $F_l(\zeta) \in H^2(\Pi)$ is a high-order transfer function, and $Q$ is chosen to be greater than the real part of the largest pole of $F_l(\zeta)$. The $l$-th system impulse response energy (IRE) is calculated as follows:
$$IRE_l = \int_0^{\infty} h_l^2(t) \, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} | F_l(j\omega) |^2 \, d\omega.$$
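In practice, the IRE can be approximated by truncating the frequency-domain integral; here is a sketch (our own, using SciPy's freqs and simpson, with an assumed truncation limit):

```python
import numpy as np
from scipy import signal, integrate

def ire(num, den, w_max=1e4, n_points=200001):
    """Impulse response energy (1/2pi) * int |F(jw)|^2 dw, with the integral
    truncated to [-w_max, w_max] and evaluated by Simpson's rule."""
    w = np.linspace(-w_max, w_max, n_points)
    _, F = signal.freqs(num, den, worN=w)
    return integrate.simpson(np.abs(F) ** 2, x=w) / (2.0 * np.pi)
```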
Then, we consider three well-known examples with $L = 1, 2, 4$, respectively.
Example 1. 
We first consider a SISO LTI system [16,50,51] as follows:
$$F_{10}(s) = \frac{540.70748 \times 10^{17}}{\prod_{l=1}^{10} (s + b_l)},$$
where $b_1 = 2.04$, $b_2 = 18.3$, $b_3 = 50.13$, $b_4 = 95.15$, $b_5 = 148.85$, $b_6 = 205.16$, $b_7 = 257.21$, $b_8 = 298.03$, $b_9 = 320.97$, and $b_{10} = 404.16$.
In this example, we use $N = 251$ frequency-domain measurements within the interval $[-5j, 5j]$ and $M = 14$ basis functions with poles within the interval $[0.1, 10]$. By applying the CoFML method, we obtain $a_1 = 2.3846$ and $a_2 = 6.9538$ and take the real part of $\mu$. Then, the 2nd partial sum is given by the following:
$$\hat{F}_2(s) = \frac{0.01053 s + 16.26}{s^2 + 9.338 s + 16.58}.$$
Figure 1 shows the step response comparison of the original system and the simplified model obtained by other methods, while Table 1 compares the IRE with different techniques. Figure 1 and Table 1 show that the CoFML method is effective.
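As an illustration of how the quantities in Figure 1 and Table 1 can be reproduced, the reduced model's IRE and step response follow from the ire helper above (our own verification sketch):

```python
from scipy import signal

# Reduced model from the CoFML run above
num, den = [0.01053, 16.26], [1.0, 9.338, 16.58]
energy = ire(num, den)                                # IRE, cf. Table 1

# Step response for a comparison plot like Figure 1
t, y = signal.step(signal.TransferFunction(num, den))
```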
Example 2. 
The next example, studied in [53], is a 4th-order system with two outputs, where
$$F_4(s) = \frac{\begin{bmatrix} 28 \\ 12 \end{bmatrix} s^3 + \begin{bmatrix} 496 \\ 528 \end{bmatrix} s^2 + \begin{bmatrix} 1800 \\ 1440 \end{bmatrix} s + \begin{bmatrix} 2400 \\ 4320 \end{bmatrix}}{2 s^4 + 36 s^3 + 204 s^2 + 360 s + 240}.$$
Here, we use $N = 334$ frequency-domain measurements in the interval $[-5j, 5j]$ and $M = 14$ basis functions with poles in the interval $[0.1, 10]$. Using the CoFML method, we obtain $a_1 = 0.8615$ and $a_2 = 1.6231$ and take the real part of $\mu$. Then, the 2nd partial sum is given by the following:
$$\hat{F}_2(s) = \frac{\begin{bmatrix} 12.3043 \\ 9.5249 \end{bmatrix} s + \begin{bmatrix} 14.4694 \\ 26.2957 \end{bmatrix}}{s^2 + 2.4846 s + 1.3983}.$$
Figure 2 shows the step responses of the original system and the other reduced models, while Table 2 compares the IREs of the different methods. Figure 2 and Table 2 show that the CoFML model is adequate.
Example 3. 
We finally consider the transfer function studied in [54]:
$$F_4(s) = \frac{\begin{bmatrix} 14.96 (s + 1.7)(s + 100) & 95150 (s + 1.898)(s + 10) \\ 85.20 (s + 1.44)(s + 100) & 124000 (s + 2.077)(s + 10) \end{bmatrix}}{(s + 1.338354)(s + 1.886647)(s + 10)(s + 100)}.$$
In this example, we use $N = 334$ frequency-domain measurements within the interval $[-5j, 5j]$ and $M = 8$ basis functions with poles within the interval $[0.1, 12]$. By applying the CoFML method, we obtain $a_1 = 1.8$ and $a_2 = 3.5$ and take the real part of $\mu$. With these values, the 2nd partial sum $\hat{F}_2(s)$ is given by:
$$\hat{F}_2(s) = \frac{\begin{bmatrix} 0.4517 s + 9.0010 & 724.0193 s + 6141.1035 \\ 3.6572 s + 46.0935 & 856.7064 s + 8562.8471 \end{bmatrix}}{s^2 + 7 s + 9.36}.$$
Figure 3 shows the step responses of the original system and the reduced models obtained using different methods, while Table 3 and Table 4 compare the reduced models and their IREs. From Figure 3, Table 3, and Table 4, we can observe that the CoFML method is effective.

5. Conclusions

In this paper, we developed the CoFML method to accelerate SCRVM. By solving the SCRVM inversion problem with unbiased estimates of the covariance matrix's diagonal elements, CoFML showed superior time and space efficiency compared to existing SCRVM techniques, especially when the number of unknowns M is large. We then theoretically analyzed the convergence of the CoFML algorithm, including its convergence under preconditioning. In addition, results on three well-known examples demonstrated that CoFML outperforms existing model reduction methods for MIMO systems, even with small data sets. Moreover, the applicability of CoFML extends to any scenario involving complex-valued sparse Bayesian methods where covariance computation is required. This versatility opens up opportunities for its use in fields such as compressed sensing, direction-of-arrival (DOA) estimation, multi-target tracking, and beyond.

Author Contributions

Methodology, W.X.; writing—original draft, J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by the Science Foundation of Shaoguan University (No. SY2021KJ11) and the Scientific Computing Research Innovation Team of Guangdong (No. 2021KCXTD052).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ogata, K. Discrete-Time Control Systems; Prentice-Hall, Inc.: Upper Saddle River, NJ, USA, 1995. [Google Scholar]
  2. d’Azzo, J.J.; Houpis, C.D. Linear Control System Analysis and Design: Conventional and Modern; McGraw-Hill Higher Education: New York, NY, USA, 1995. [Google Scholar]
  3. Khalil, I.S.; Doyle, J.C.; Glover, K. Robust and Optimal Control; Prentice Hall: Upper Saddle River, NJ, USA, 1996; Volume 2. [Google Scholar]
  4. Goodwin, G.C.; Graebe, S.F.; Salgado, M.E. Control System Design; Prentice Hall: Upper Saddle River, NJ, USA, 2001; Volume 240. [Google Scholar]
  5. Lathi, B.P.; Green, R.A. Linear Systems and Signals; Oxford University Press: New York, NY, USA, 2005; Volume 2. [Google Scholar]
  6. Albertos, P.; Antonio, S. Multivariable Control Systems: An Engineering Approach; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  7. Geromel, J.; Kawaoka, F.; Egas, R. Model reduction of discrete time systems through linear matrix inequalities. Int. J. Control 2004, 77, 978–984. [Google Scholar] [CrossRef]
  8. Mittal, A.; Prasad, R.; Sharma, S. Reduction of linear dynamics systems using an error minimization technique. J.-Inst. Eng. India Part Electr. Eng. Div. 2004, 84, 201–206. [Google Scholar]
  9. Sandberg, H.; Lanzon, A.; Anderson, B.D. Model approximation using magnitude and phase criteria: Implications for model reduction and system identification. Int. J. Robust Nonlinear Control-IFAC-Affil. J. 2007, 17, 435–461. [Google Scholar] [CrossRef]
  10. Gugercin, S.; Sorensen, D.; Antoulas, A. A modified low-rank Smith method for large-scale Lyapunov equations. Numer. Algorithms 2003, 32, 27–55. [Google Scholar] [CrossRef]
  11. Penzl, T. Algorithms for model reduction of large dynamical systems. Linear Algebra Its Appl. 2006, 415, 322–343. [Google Scholar] [CrossRef]
  12. Gugercin, S.; Antoulas, A.C.; Beattie, C. H2 model reduction for large-scale linear dynamical systems. SIAM J. Matrix Anal. Appl. 2008, 30, 609–638. [Google Scholar] [CrossRef]
  13. Magruder, C.; Beattie, C.; Gugercin, S. Rational Krylov methods for optimal L2 model reduction. In Proceedings of the 49th IEEE Conference on Decision and Control (CDC), Atlanta, GA, USA, 15–17 December 2010; pp. 6797–6802. [Google Scholar]
  14. Mi, W.; Qian, T.; Wan, F. A fast adaptive model reduction method based on Takenaka–Malmquist systems. Syst. Control Lett. 2012, 61, 223–230. [Google Scholar] [CrossRef]
  15. Freund, R.W. Padé–Type Model Reduction of Second-Order and Higher-Order Linear Dynamical Systems. In Dimension Reduction of Large-Scale Systems: Proceedings of the Workshop held in Oberwolfach, Germany, 19–25 October 2003; Benner, P., Sorensen, D.C., Mehrmann, V., Eds.; Lecture Notes in Computational Science and Engineering; Springer: Berlin/Heidelberg, Germany, 2004. [Google Scholar]
  16. Parmar, G.; Mukherjee, S.; Prasad, R. System reduction using eigen spectrum analysis and Padé approximation technique. Int. J. Comput. Math. 2007, 84, 1871–1880. [Google Scholar] [CrossRef]
  17. Walsh, J.L. Interpolation and Approximation by Rational Functions in the Complex Domain; American Mathematical Soc.: Providence, RI, USA, 1935; Volume 20. [Google Scholar]
  18. Heuberger, P.S.C.; Hof, P.M.J.V.D.; Bosgra, O.H. A generalized orthonormal basis for linear dynamical systems. In Proceedings of the 32nd IEEE Conference on Decision and Control, New Orleans, LA, USA, 13–15 December 1995. [Google Scholar]
  19. Ward, N.F.D.; Partington, J.R. Rational wavelet decompositions of transfer functions in hardy-sobolev classes. Math. Control Signals Syst. 1995, 8, 257–278. [Google Scholar] [CrossRef]
  20. Akçay, H.; Ninness, B. Orthonormal basis functions for modelling continuous-time systems. Signal Process. 1999, 77, 261–274. [Google Scholar] [CrossRef]
  21. Akçay, H.; Heuberger, P. A frequency-domain iterative identification algorithm using general orthonormal basis functions. Automatica 2001, 37, 663–674. [Google Scholar] [CrossRef]
  22. Hof, P.V.D.; Wahlberg, B.; Heuberger, P.; Ninness, B.; Bokor, J.; e Silva, T.O. Modelling and Identification with Rational Orthogonal Basis Functions. IFAC Proc. Vol. 2000, 33, 445–455. [Google Scholar]
  23. Bishop, C.M.; Nasrabadi, N.M. Pattern Recognition and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2006; Volume 4. [Google Scholar]
  24. Berger, J.O. Statistical Decision Theory and Bayesian Analysis; Springer: Berlin/Heidelberg, Germany, 1985. [Google Scholar]
  25. Luo, J.; Vong, C.M.; Wong, P.K. Sparse Bayesian extreme learning machine for multi-classification. IEEE Trans. Neural Netw. Learn. Syst. 2013, 25, 836–843. [Google Scholar]
  26. Tipping, M.E. Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 2001, 1, 211–244. [Google Scholar]
  27. Demir, B.; Erturk, S. Hyperspectral image classification using relevance vector machines. IEEE Geosci. Remote Sens. Lett. 2007, 4, 586–590. [Google Scholar] [CrossRef]
  28. Mianji, F.A.; Zhang, Y. Robust hyperspectral classification using relevance vector machine. IEEE Trans. Geosci. Remote Sens. 2011, 49, 2100–2112. [Google Scholar] [CrossRef]
  29. Liu, X.; Chen, X.; Li, J.; Zhou, X.; Chen, Y. Facies identification based on multikernel relevance vector machine. IEEE Trans. Geosci. Remote Sens. 2020, 58, 7269–7282. [Google Scholar] [CrossRef]
  30. Ospina-Acero, D.; Marashdeh, Q.M.; Teixeira, F.L. Relevance vector machine image reconstruction algorithm for electrical capacitance tomography with explicit uncertainty estimates. IEEE Sens. J. 2020, 20, 4925–4939. [Google Scholar] [CrossRef]
  31. Zhang, J.; Qiu, T.; Luan, S. An efficient real-valued sparse Bayesian learning for non-circular signal’s DOA estimation in the presence of impulsive noise. Digit. Signal Process. 2020, 106, 102838. [Google Scholar] [CrossRef]
  32. Dai, J.; So, H.C. Real-valued sparse Bayesian learning for DOA estimation with arbitrary linear arrays. IEEE Trans. Signal Process. 2021, 69, 4977–4990. [Google Scholar] [CrossRef]
  33. Lu, J.; Yang, Y.; Yang, L. An efficient off-grid direction-of-arrival estimation method based on inverse-free sparse Bayesian learning. Appl. Acoust. 2023, 211, 109521. [Google Scholar] [CrossRef]
  34. Ji, S.; Dunson, D.; Carin, L. Multitask compressive sensing. IEEE Trans. Signal Process. 2008, 57, 92–106. [Google Scholar] [CrossRef]
  35. Cheng, H.; Chen, H.; Jiang, G.; Yoshihira, K. Nonlinear feature selection by relevance feature vector machine. In Proceedings of the International Workshop on Machine Learning and Data Mining in Pattern Recognition, Leipzig, Germany, 18–20 July 2007; Springer: Berlin/Heidelberg, Germany, 2007; pp. 144–159. [Google Scholar]
  36. Wipf, D.P.; Rao, B.D. Bayesian learning for sparse signal reconstruction. In Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’03), Hong Kong, 6–10 April 2003; Volume 6, p. VI–601. [Google Scholar]
  37. Huang, K.; Aviyente, S. Sparse representation for signal classification. Adv. Neural Inf. Process. Syst. 2006, 19. [Google Scholar] [CrossRef]
  38. Zhang, Z.; Rao, B.D. Sparse signal recovery with temporally correlated source vectors using sparse Bayesian learning. IEEE J. Sel. Top. Signal Process. 2011, 5, 912–926. [Google Scholar] [CrossRef]
  39. Bilgic, B.; Goyal, V.K.; Adalsteinsson, E. Multi-contrast reconstruction with Bayesian compressed sensing. Magn. Reson. Med. 2011, 66, 1601–1615. [Google Scholar] [CrossRef]
  40. Hossain, A.; Nasser, M. Recurrent support and relevance vector machines based model with application to forecasting volatility of financial returns. J. Intell. Learn. Syst. Appl. 2011, 3, 230. [Google Scholar] [CrossRef]
  41. Wipf, D.; Nagarajan, S. Iterative Reweighted ℓ1 and ℓ2 Methods for Finding Sparse Solutions. IEEE Trans. Signal Process. 2010, 4, 317–329. [Google Scholar]
  42. Fang, J.; Zhang, L.; Li, H. Two-Dimensional Pattern-Coupled Sparse Bayesian Learning via Generalized Approximate Message Passing. IEEE Trans. Image Process. 2016, 25, 2920–2930. [Google Scholar] [CrossRef]
  43. Duan, H.; Yang, L.; Fang, J.; Li, H. Fast inverse-free sparse Bayesian learning via relaxed evidence lower bound maximization. IEEE Signal Process. Lett. 2017, 24, 774–778. [Google Scholar] [CrossRef]
  44. Lin, A.; Song, A.H.; Bilgic, B.; Ba, D. Covariance-free sparse Bayesian learning. IEEE Trans. Signal Process. 2022, 70, 3818–3831. [Google Scholar] [CrossRef]
  45. Boloix-Tortosa, R.; Murillo-Fuentes, J.J.; Velázquez, I.S.; Pérez-Cruz, F. Complex-Valued Kernel Methods for Regression. arXiv 2016, arXiv:1610.09915. [Google Scholar]
  46. Schreier, P.J.; Scharf, L.L. Statistical Signal Processing of Complex-Valued Data: The Theory of Improper and Noncircular Signals; Cambridge University Press: Cambridge, UK, 2010. [Google Scholar]
  47. Bekas, C.; Kokiopoulou, E.; Saad, Y. An estimator for the diagonal of a matrix. Appl. Numer. Math. 2007, 57, 1214–1229. [Google Scholar] [CrossRef]
  48. Young, D.M. Iterative Solution of Large Linear Systems; Elsevier: Amsterdam, The Netherlands, 2014. [Google Scholar]
  49. Strang, G. Linear Algebra and Its Applications; Pearson Education India: Bangalore, India, 2012. [Google Scholar]
  50. Mukherjee, S. Order Reduction of Linear Systems using Eigenspectrum Analysis. J. Inst. Eng. (India) Electr. Eng. Div. 1996, 77, 76–79. [Google Scholar]
  51. Therapos, C.P.; Diamessis, J.E. A new method for linear system reduction. J. Frankl. Inst. 1984, 317, 359–371. [Google Scholar] [CrossRef]
  52. Edgar, T.F. Least squares model reduction using step response. Int. J. Control 1975, 22, 261–270. [Google Scholar] [CrossRef]
  53. Hutton, M.; Friedland, B. Routh approximations for reducing order of linear, time-invariant systems. IEEE Trans. Autom. Control 1975, 20, 329–337. [Google Scholar] [CrossRef]
  54. Shamash, Y. Linear system reduction using Pade approximation to allow retention of dominant modes. Int. J. Control 1975, 21, 257–272. [Google Scholar] [CrossRef]
Figure 1. Step responses of the original and reduced models.
Figure 2. Step responses of the original and reduced models.
Figure 3. Step responses of the original and reduced models.
Table 1. IRE of the reduced models.

| Method | Reduced Model | IRE |
|---|---|---|
| Original system | $F_{10}(s)$ | 0.1503 |
| CoFML | $\hat{F}_2(s)$ | 0.1422 |
| AFD [14] | $\frac{0.5367 s + 20.96}{s^2 + 11.86 s + 20.97}$ | 0.1498 |
| G. Parmar et al. [16] | $\frac{28.367 s + 647.60193}{s^2 + 359.999 s + 647.60193}$ | 8.2056 |
| Edgar [52] | $\frac{0.93 s + 26.28}{s^2 + 14.92 s + 26.4961}$ | 0.1580 |
| Therapos and Diamessis [51] | $\frac{1.999638 s + 37.32915}{s^2 + 20.34 s + 37.332}$ | 0.2027 |
Table 2. IRE of the reduced models.

| Method | Reduced Model | IRE$_1$ | IRE$_2$ |
|---|---|---|---|
| Original system | $F_4(s)$ | 11.7242 | 20.5297 |
| CoFML | $\hat{F}_2(s)$ | 10.8311 | 19.4217 |
| Routh [53] | $\frac{\begin{bmatrix} 30 \\ 24 \end{bmatrix} s + \begin{bmatrix} 40 \\ 72 \end{bmatrix}}{3 s^2 + 6 s + 4}$ | 10.2140 | 20.6014 |
| S. Gugercin et al. [10,11] | $\frac{\begin{bmatrix} 0.7108 \\ 0.9026 \end{bmatrix} s^2 + \begin{bmatrix} 10.4003 \\ 6.4600 \end{bmatrix} s + \begin{bmatrix} 15.9178 \\ 28.6521 \end{bmatrix}}{s^2 + 2.2218 s + 1.5918}$ | 8.4488 | 17.9249 |
Table 3. Reduced models obtained by different methods.

| Model Reduction Method | Reduced Model |
|---|---|
| Original system | $F_4(s)$ |
| CoFML | $\hat{F}_2(s)$ |
| Padé [54] | $\frac{\begin{bmatrix} 2.5095 + 1.2406 s & 1782.9822 + 931.9822 s \\ 12.1064 + 7.2706 s & 2492.4439 + 1213.9036 s \end{bmatrix}}{s^2 + 3.1976 s + 2.4916}$ |
| S. Gugercin et al. [10,11] | $\frac{\begin{bmatrix} 0.241 s^2 + 0.7760 s + 2.8787 & 8.9938 s^2 + 932.5588 s + 2044.2005 \\ 0.6891 s^2 + 7.7785 s + 13.8874 & 10.4496 s^2 + 1202.2714 s + 2915.2559 \end{bmatrix}}{s^2 + 3.4801 s + 2.8581}$ |
Table 4. IRE of the reduced models.

| Method | IRE$_1$ | IRE$_2$ | IRE$_3$ | IRE$_4$ |
|---|---|---|---|---|
| Original system | 0.1293 | 3.4819 | $6.5224 \times 10^4$ | $1.2412 \times 10^5$ |
| CoFML | 0.1280 | 3.5463 | $6.9290 \times 10^4$ | $1.2854 \times 10^5$ |
| Padé [54] | 0.1348 | 3.7616 | $7.1415 \times 10^4$ | $1.3136 \times 10^5$ |
| S. Gugercin et al. [10,11] | 0.2164 | 6.0371 | $7.4435 \times 10^4$ | $1.3888 \times 10^5$ |