Article

Improved Variational Bayes for Space-Time Adaptive Processing

1 School of Electronic Information Engineering, Anhui University, Hefei 230601, China
2 Sun Create Electronics Co., Ltd., Hefei 230088, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Entropy 2025, 27(3), 242; https://doi.org/10.3390/e27030242
Submission received: 23 January 2025 / Revised: 19 February 2025 / Accepted: 24 February 2025 / Published: 26 February 2025
(This article belongs to the Section Signal and Data Analysis)

Abstract:
To tackle the challenge of enhancing moving target detection performance in environments characterized by small sample sizes and non-uniformity, methods rooted in sparse signal reconstruction have been incorporated into Space-Time Adaptive Processing (STAP) algorithms. Given the prominent sparse nature of clutter spectra in the angle-Doppler domain, adopting sparse recovery algorithms has proven to be a feasible approach for accurately estimating high-resolution spatio-temporal two-dimensional clutter spectra. Sparse Bayesian Learning (SBL) is a pivotal tool in sparse signal reconstruction and has been previously utilized, yet it has demonstrated limited success in enhancing sparsity, resulting in insufficient robustness in local fitting. To significantly improve sparsity, this paper introduces a hierarchical Bayesian prior framework and derives iterative parameter update formulas through variational inference techniques. However, this algorithm encounters significant computational hurdles during the parameter update process. To overcome this obstacle, the paper proposes an enhanced Variational Bayesian Inference (VBI) method that leverages prior information on the rank of the temporal clutter covariance matrix to refine the parameter update formulas, thereby significantly reducing computational complexity. Furthermore, this method fully exploits the joint sparsity of the Multiple Measurement Vector (MMV) model to achieve greater sparsity without compromising accuracy, and employs a first-order Taylor expansion to eliminate grid mismatch in the dictionary. The research presented in this paper enhances the moving target detection capabilities of STAP algorithms in complex environments and provides new perspectives and methodologies for the application of sparse signal reconstruction in related fields.

1. Introduction

Mounted on elevated platforms, airborne radar exhibits remarkable mobility and effectively addresses the shading limitations of ground-based radar. It provides a substantially longer detection range for ground and low-altitude targets compared to ground-based radar, enjoying superior electromagnetic wave propagation conditions. Consequently, it has garnered considerable attention. Within the diverse applications of airborne radar, moving target detection holds a pivotal position. Phased array radar is frequently employed in airborne systems due to its capacity to generate multiple beams concurrently, its adaptable beam steering, and its robust anti-jamming capabilities. Nevertheless, this radar operates in a downward-facing configuration, where clutter is abundant and intense. Furthermore, the aircraft’s movement results in considerable broadening of the clutter spectrum, frequently causing targets to be obscured within the clutter, significantly hindering the effectiveness of moving target detection [1]. Hence, the foremost objective for airborne radar is clutter suppression. Space-time adaptive processing leverages a combined space-time two-dimensional strategy to devise suitable filter weight vectors, effectively mitigating clutter and preserving target energy across Doppler and angular domains, ultimately bolstering moving target detection performance. STAP technology represents an advancement of array adaptive technology [2]. In 1973, Brennan and Reed pioneered the notion of space-time two-dimensional adaptive processing [3], applying the fundamental principles of array signal processing to the two-dimensional data space of pulse and array element samples, sparking a wave of enthusiasm in STAP research. However, optimal STAP necessitates a substantial quantity of training samples that adhere to the Independent and Identically Distributed (IID) criterion, posing challenges such as elevated computational demands, clutter non-uniformity, and non-stationarity. 
Consequently, the direct deployment of optimal STAP technology is infeasible.
To mitigate the aforementioned computational demands, recent scholarly endeavors have introduced a diverse array of methodologies. Dimensionality reduction techniques within the space-time adaptive processing framework decrease the system’s degrees of freedom by applying clutter-independent linear transformations. This, in turn, minimizes the requisite number of training samples and lessens computational intricacy [4]. Alternatively, rank reduction STAP methodologies employ clutter-related transformations to fashion adaptive space-time filters, thereby curbing the dependency on extensive training datasets [5,6]. For both dimensionality-reduced and rank-reduced STAP strategies, the needed training samples can be diminished to twice the reduced dimension or clutter rank, respectively; however, in non-uniform settings, the sample size remains considerable. Direct digital domain methodologies [7,8] circumvent the utilization of adjacent range cell training samples by leveraging detection cell data directly to construct the STAP processor. This theoretically tackles all non-uniformity challenges but at the expense of some space-time degrees of freedom and STAP output quality due to the exclusion of samples from other range cells in weight vector formulation. Knowledge-aided STAP approaches augment STAP efficacy by incorporating prior information encompassing environmental conditions, radar system specifics, and platform motion characteristics [9]. The efficacy of this technique heavily leans on the accuracy of the preliminary knowledge. Sparse Recovery STAP (SR-STAP) methodologies exploit the sparsity of clutter distribution in the space-time domain, facilitating the recovery of high-resolution clutter power spectra with limited training samples by estimating the clutter covariance matrix (CCM). This leads to enhanced clutter suppression capabilities. 
For stationary radar systems, several sparse recovery methods exist, including greedy algorithms, convex optimization techniques, sparse Bayesian learning algorithms, and more. Greedy algorithms iteratively select elements from a predefined set (basis or dictionary) and compute corresponding sparse coefficients, progressively minimizing the discrepancy between the linear combination of these elements and the observed data [10]. Common greedy algorithms encompass Matching Pursuit [11], Orthogonal Matching Pursuit [12], the Relaxed Greedy Algorithm [13], and the L1-norm Greedy Algorithm, among others. These algorithms are advantageous for their flexibility in incorporating constraints [12]; however, they may yield suboptimal sparse coefficient solutions in certain scenarios [13]. Convex optimization methods recast the optimization challenge into a convex framework and solve for the sparse coefficient vector by leveraging the properties of convex functions. Prominent convex optimization algorithms include interior point methods and gradient descent-based approaches. Interior point methods, being among the earliest convex optimization techniques for sparse problems, are mature and accompanied by readily accessible software tools. They are sensitive to solution sparsity and regularization parameters and suffer from high computational complexity, especially for high-dimensional signals. Gradient descent-based algorithms, such as Iterative Shrinkage-Thresholding (IST) [14], Two-step IST (TwIST), Fixed-Point Iteration (FPI), and others, depend closely on the magnitude of the regularization parameter. The burgeoning SBL approach, introduced by Tipping around 2001 [15], assumes a sparse prior distribution for the coefficient vector and employs a maximum a posteriori estimator to integrate prior knowledge with observations for sparse vector recovery.
In noiseless conditions, SBL yields the most accurate sparse solution and maintains robustness even when the sensing matrix columns are highly correlated. However, Bayesian learning entails matrix inversion in each iteration, substantially augmenting computational complexity and posing challenges for real-time applications [16]. In recent years, the application of Bayesian methods in the field of communications has become increasingly widespread, particularly showcasing its unique advantages in channel estimation and signal processing. By incorporating prior information, Bayesian methods can effectively handle the uncertainty and sparsity in high-dimensional data, thereby enhancing estimation accuracy and computational efficiency. For example, Cheng et al. [17] proposed a Bayesian channel estimation algorithm based on irregular array manifolds, which transforms the channel estimation problem into a tensor decomposition problem under missing data scenarios and incorporates Bayesian model order selection techniques to automatically estimate the number of channel paths and significantly improve estimation accuracy. Xu et al. [18] introduced a Bayesian multiband sparsity-based channel estimation framework to address the beam squint effect in millimeter-wave massive MIMO systems. By constructing a virtual channel model and leveraging the common sparse structure across sub-bands, combined with a first-order Taylor expansion to mitigate dictionary grid mismatch, their variational Expectation-Maximization (EM) algorithm can adaptively balance the likelihood function and sparse prior information, significantly enhancing channel estimation accuracy in dual-broadband scenarios. These works highlight the unique advantages of Bayesian methods in joint sparse modeling and computational efficiency optimization, providing important insights for clutter spectrum estimation in STAP.
The SR-STAP based on Bayesian learning necessitates matrix inversion during each iteration, resulting in a steep surge in computational complexity as the system’s dimensionality expands [19]. To mitigate this challenge, various prominent strategies have been proposed. Specifically, a rapid inversion-free approach for Sparse Bayesian Learning (SBL) was presented in [19], leveraging the fact that inverting diagonal matrices is significantly faster than traditional matrix inversion. Nonetheless, this method may suffer from performance decrement in scenarios with a relatively limited number of measurements, stemming from the relaxation of the evidence bound. In another study [20], the Spatial Alternating Variational Estimation (SAVE) method was introduced, which circumvents matrix updates by alternately optimizing each signal element. This approach significantly accelerates the reconstruction process for signals characterized by small-dimensional samples. However, its efficacy diminishes when dealing with signals of exceptionally high dimensionality. Furthermore, Al-Shoukairi et al. [21] put forward an SBL algorithm based on Generalized Approximate Message Passing (GAMP). By employing quadratic approximations and Taylor series expansions, GAMP furnishes approximations for the Maximum A Posteriori (MAP) estimates of the signal, thereby bypassing the need for matrix inversion. Nevertheless, the introduction of an iterative method to replace matrix inversion in SBL still fails to substantially alleviate the computational strain associated with large-scale datasets [22]. To address the limitations of existing methods, we introduce a novel approach that capitalizes on prior knowledge concerning the rank of the space-time clutter covariance matrix. 
Our contribution lies in an enhanced Variational Bayesian method, which optimizes the parameter update formulations to circumvent the inefficiencies associated with high-dimensional matrix inversion, thereby effectively minimizing computational complexity.

2. Signal Model

Consider an airborne side-looking uniform linear array (ULA) radar comprising N array elements, each spaced at a distance d equal to half the radar's operational wavelength. The height of the carrier platform is denoted H, the pulse repetition frequency is f_r, and the number of pulses within the coherent processing interval (CPI) is M. The geometry of this airborne radar setup is depicted in Figure 1.
With v denoting the velocity of the carrier platform along the x-axis, and α and β the elevation and azimuth angles of the ground reflection point, respectively, the t-th range ring's space-time snapshot, containing both clutter and noise and disregarding the impact of range ambiguity, can be formulated as follows:
x(t) = Σ_{i=1}^{N_c} δ_i V(f_{d,i}, f_{s,i}) + n_0    (1)
In the formula, N_c denotes the number of clutter patches contained within the specific range bin, while δ_i indicates the scattering power of each individual clutter patch. n_0 ∈ C^{NM×1} represents the thermal noise component. Furthermore, V(f_{d,i}, f_{s,i}) stands for the space-time steering vector associated with the i-th clutter patch and can be formulated as follows:
V(f_{d,i}, f_{s,i}) = V_d(f_{d,i}) ⊗ V_s(f_{s,i})    (2)
where
V_d(f_{d,i}) = [1, e^{j2π f_{d,i}}, …, e^{j2π(M−1) f_{d,i}}]^T    (3)
V_s(f_{s,i}) = [1, e^{j2π f_{s,i}}, …, e^{j2π(N−1) f_{s,i}}]^T    (4)
In the formula, ⊗ denotes the Kronecker product, while [·]^T stands for matrix transposition. f_{d,i} and f_{s,i} represent the normalized Doppler frequency and normalized spatial frequency, respectively, corresponding to the i-th clutter patch, and they can be formulated as outlined below:
f_{d,i} = 2v cos α_i cos β_i / (λ f_r)    (5)
f_{s,i} = d cos α_i cos β_i / λ    (6)
In the equation,  λ  represents the wavelength.
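As an illustration of Equations (2)-(4), the space-time steering vector can be assembled with a Kronecker product. The function name and parameter values below are illustrative, not from the paper:

```python
import numpy as np

def space_time_steering(f_d, f_s, M, N):
    """Space-time steering vector V(f_d, f_s) = V_d(f_d) kron V_s(f_s)
    for M pulses and N array elements (cf. Equations (2)-(4))."""
    v_d = np.exp(1j * 2 * np.pi * f_d * np.arange(M))  # temporal steering V_d
    v_s = np.exp(1j * 2 * np.pi * f_s * np.arange(N))  # spatial steering V_s
    return np.kron(v_d, v_s)                           # length M*N space-time vector

# Example: N = M = 8 and a clutter patch at f_d = f_s = 0.2
v = space_time_steering(0.2, 0.2, M=8, N=8)
print(v.shape)  # (64,)
```

Every entry has unit modulus, so the vector only encodes phase progression across pulses and elements.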
Assuming that the snapshot data of each range bin are independent and identically distributed (IID), the covariance matrix can be expressed as follows [23]:
R_C = E[x(t) x^H(t)]    (7)
In the equation,  E [ · ]  denotes the mathematical expectation operation, and  ( · ) H  represents the conjugate transpose of a matrix.
Based on the Linearly Constrained Minimum Variance (LCMV) criterion, the optimal STAP weight vector can be expressed as follows:
W_{OPT} = R_C^{-1} V_t(f_{d,t}, f_{s,t}) / (V_t^H(f_{d,t}, f_{s,t}) R_C^{-1} V_t(f_{d,t}, f_{s,t}))    (8)
In the equation, (·)^{-1} denotes the matrix inversion operation, and V_t(f_{d,t}, f_{s,t}) represents the space-time steering vector of the target, which can be expressed as follows:
V_t(f_{d,t}, f_{s,t}) = V_d(f_{d,t}) ⊗ V_s(f_{s,t})    (9)
where
V_d(f_{d,t}) = [1, e^{j2π f_{d,t}}, …, e^{j2π(M−1) f_{d,t}}]^T    (10)
V_s(f_{s,t}) = [1, e^{j2π f_{s,t}}, …, e^{j2π(N−1) f_{s,t}}]^T    (11)
with
f_{d,t} = 2 v_t cos α_t cos β_t / (λ f_r)    (12)
f_{s,t} = d cos α_t cos β_t / λ    (13)
In the equation, f_{d,t} and f_{s,t} represent the normalized Doppler frequency and normalized spatial frequency of the target, respectively, while α_t and β_t signify the elevation angle and azimuth angle of the target.
Finally, the filtered snapshot data is represented as follows:
s(t) = W_{OPT}^H x(t)    (14)
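A minimal numerical sketch of the LCMV weight computation in Equation (8) is given below. The function name and the toy covariance are illustrative; np.linalg.solve is used to avoid forming the explicit inverse:

```python
import numpy as np

def lcmv_weight(R_c, v_t):
    """LCMV-optimal STAP weight: w = R_c^{-1} v_t / (v_t^H R_c^{-1} v_t)."""
    r_inv_v = np.linalg.solve(R_c, v_t)     # R_c^{-1} v_t without explicit inversion
    return r_inv_v / (v_t.conj() @ r_inv_v)

# Toy example with NM = 4: the weight satisfies the distortionless
# constraint w^H v_t = 1 at the target steering vector.
R_c = np.eye(4) + 0.1 * np.ones((4, 4))     # Hermitian toy covariance
v_t = np.ones(4, dtype=complex)             # toy target steering vector
w = lcmv_weight(R_c, v_t)
print(np.allclose(np.vdot(w, v_t), 1.0))    # True
```

In the Hermitian case the denominator is real, so the constraint w^H v_t = 1 holds exactly.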
Given that the Doppler and spatial frequencies of the clutter space-time snapshot signals outlined in Equation (1) are confined within a specific range, a comprehensive set of these frequencies can be derived using an exhaustive approach, with a tolerance for a certain degree of quantization error. This set is denoted as follows:
Φ_0 = [V(f_{d,1}, f_{s,1}), V(f_{d,1}, f_{s,2}), …, V(f_{d,N_d}, f_{s,N_s})]    (15)
where N_d = ρ_d N and N_s = ρ_s M, with ρ_d ≫ 1 and ρ_s ≫ 1 indicating the resolution scales for the Doppler frequencies and spatial frequencies, respectively. These resolution scales serve to regulate the extent of quantization error. Consequently, the space-time snapshot data pertaining to the clutter in Equation (1) can alternatively be formulated as:
x(t) = Σ_{k=1}^{N_d} Σ_{i=1}^{N_s} δ_{k,i} V(f_{d,k}, f_{s,i}) + n_0 = Φ_0 y(t) + n_0    (16)
In the equation, y(t) = [y_{1,1}(t), y_{1,2}(t), …, y_{N_d,N_s}(t)]^T represents the complex amplitudes of the clutter space-time snapshot x(t) on the angle-Doppler grid, and can also be referred to as the angle-Doppler image.
Considering the space-time steering dictionary in Equation (15), its column vectors correspond to discretized normalized Doppler frequencies f_d and spatial frequencies f_s. The actual frequencies (f̃_d, f̃_s) of clutter scatterers may deviate from the predefined grid points. To address this, grid offsets δ_d ∈ R^{N_d} and δ_s ∈ R^{N_s} are introduced to model the true frequencies as f̃_d = f_d + δ_d and f̃_s = f_s + δ_s. During initialization, all elements of δ_d and δ_s are set to 0. Using a first-order Taylor expansion, the off-grid steering vector can be approximated as:
V(f̃_d, f̃_s) ≈ V(f_d, f_s) + (∂V/∂f_d) δ_d + (∂V/∂f_s) δ_s    (17)
In the equation, the partial derivative terms are:
∂V/∂f_d = j2π (diag(0, 1, …, M−1) ⊗ I_N) V(f_d, f_s)
∂V/∂f_s = j2π (I_M ⊗ diag(0, 1, …, N−1)) V(f_d, f_s)
Therefore, the off-grid dictionary set can be rewritten as follows:
Φ = Φ_0 + Φ_d diag(δ_d) + Φ_s diag(δ_s)    (18)
In the equation,  Φ d  and  Φ s  represent the Jacobians of  Φ 0  with respect to  f d  and  f s , respectively.
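To illustrate the first-order Taylor approximation underlying Equations (17)-(18), the sketch below (function names are illustrative) applies the pulse/element index weighting through the Kronecker structure and compares the result with the exact off-grid steering vector:

```python
import numpy as np

def steering(f_d, f_s, M, N):
    """On-grid space-time steering vector V_d kron V_s."""
    return np.kron(np.exp(2j * np.pi * f_d * np.arange(M)),
                   np.exp(2j * np.pi * f_s * np.arange(N)))

def off_grid_steering(f_d, f_s, delta_d, delta_s, M, N):
    """First-order Taylor approximation of V(f_d+delta_d, f_s+delta_s)."""
    v = steering(f_d, f_s, M, N)
    m_idx = np.kron(np.arange(M), np.ones(N))   # pulse index of each entry
    n_idx = np.kron(np.ones(M), np.arange(N))   # element index of each entry
    dv_d = 2j * np.pi * m_idx * v               # dV/df_d
    dv_s = 2j * np.pi * n_idx * v               # dV/df_s
    return v + dv_d * delta_d + dv_s * delta_s

# Small offsets: the linearization stays close to the exact steering vector
approx = off_grid_steering(0.2, 0.2, 1e-3, -1e-3, M=8, N=8)
exact = steering(0.2 + 1e-3, 0.2 - 1e-3, 8, 8)
print(np.max(np.abs(approx - exact)) < 1e-2)    # True
```

The residual shrinks quadratically with the offset magnitude, which is what justifies the first-order expansion for fine grids.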
Then, Equation (7) can be re-expressed as:
R_C = E[x(t) x^H(t)] = Φ diag(P) Φ^H + σ_n² I_{NM}    (19)
In the equation, σ_n² is the noise power and I_{NM} is the NM × NM identity matrix; P = [p_{1,1}, p_{1,2}, …, p_{N_d,N_s}]^T, where p_{k,i} = E[|δ_{k,i}|²] for k = 1, 2, …, N_d and i = 1, 2, …, N_s.
Similarly, for the Multiple Measurement Vector (MMV) scenario, the following equation applies:
X = Φ Y + N_0    (20)
In the equation,  X = [ x 1 , x 2 , , x L ]  and  Y = [ y 1 , y 2 , , y L ] , where L represents the number of snapshot data points, and each column of  N 0  denotes the received Gaussian white noise.
Assuming that the training samples satisfy the IID condition, the implementation for solving the sparse solution using the MMV method can be expressed as follows:
Ŷ = arg min ||Y||_{2,0},  s.t. ||X − Φ Y||_F² ≤ ε    (21)
In the equation, ||·||_{2,0} is a mixed norm defined as the number of non-zero elements in the vector formed by the L2 norms of each row vector; ||·||_F denotes the Frobenius norm.
Because Equation (21) involves the L0 norm, the optimization problem is NP-hard. To preserve sparsity while keeping the problem mathematically tractable, the L1 norm can be used as a convex surrogate for the L0 norm. The problem can therefore be rewritten as:
Ŷ = arg min ||Y||_{2,1},  s.t. ||X − Φ Y||_F² ≤ ε    (22)
In the equation,  | | · | | 2 , 1  is a mixed norm defined as the  L 1  norm of the vector formed by the  L 2  norms of each row vector.
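For concreteness, the two mixed norms in Equations (21) and (22) can be computed as follows (a small illustrative sketch; function names are not from the paper):

```python
import numpy as np

def mixed_norm_20(Y):
    """||Y||_{2,0}: number of rows of Y with non-zero L2 norm."""
    return int(np.count_nonzero(np.linalg.norm(Y, axis=1)))

def mixed_norm_21(Y):
    """||Y||_{2,1}: L1 norm of the vector of row-wise L2 norms."""
    return float(np.sum(np.linalg.norm(Y, axis=1)))

# Row-sparse example: two active rows out of three
Y = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.0, 1.0]])
print(mixed_norm_20(Y))  # 2
print(mixed_norm_21(Y))  # 5.0 + 0.0 + 1.0 = 6.0
```

Both norms act on whole rows, which is what enforces the joint (row-wise) sparsity of the MMV model.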

3. The Proposed Method

3.1. Bayesian Framework

Constructing an apt Bayesian model is indispensable and pivotal for SBL. In this paper, we incorporate hierarchical prior information into the latent variables to further augment sparsity. The likelihood associated with the observed variable  X  is expressed as follows:
p(X | Y, σ, δ_d, δ_s) = ∏_{t=1}^{L} CN(x(t) | Φ y(t), σ^{-1} I_{NM})    (23)
In the equation,  σ  represents the noise precision.
By imposing a Gaussian distribution prior on the latent variable  Y , we obtain:
p(Y | γ) = ∏_{t=1}^{L} CN(y(t) | 0, Σ)    (24)
In the equation, γ = [γ_1, γ_2, …, γ_{N_d N_s}] and Σ = diag(γ). Exploiting conjugacy with the Gaussian distribution, we adopt a Gamma prior for each element of γ, which can be expressed as:
p(γ | a, b) = ∏_{m=1}^{N_d N_s} b^a γ_m^{a−1} e^{−b γ_m} / Γ(a)    (25)
where Γ(a) = ∫_0^{+∞} x^{a−1} e^{−x} dx, with a being the shape parameter and b being the scale parameter. Similarly, assuming that σ follows a Gamma distribution, we obtain:
p(σ | c, d) = d^c σ^{c−1} e^{−d σ} / Γ(c)    (26)
where  c  and  d  are the corresponding shape and scale parameters, respectively. The directed acyclic graph for representing the Bayesian model is shown in Figure 2.
Based on the aforementioned Bayesian hierarchical model assumptions, the joint distribution of the signals can be obtained as:
p(X, Y, γ, σ | δ_d, δ_s) = p(X | Y, σ, δ_d, δ_s) p(Y | γ) p(γ | a, b) p(σ | c, d)    (27)

3.2. Minimization of KL Divergence (Variational E-Step)

According to Bayesian theory, the maximum posterior probability of the parameters to be estimated can be obtained, but its calculation usually involves high-dimensional and complex integrals, making it difficult to solve. Therefore, we introduce variational inference to address this issue of maximum posterior estimation.
In variational inference, the observed data  X  represents the data received by the array elements, and the set of latent variables  ξ = { Y , γ , σ }  consists of the parameters to be estimated. The following equation holds:
ln p(X | δ_d, δ_s) = L(q) + KL(q || p) = ∫ q(ξ) ln[p(X, ξ) / q(ξ)] dξ − ∫ q(ξ) ln[p(ξ | X) / q(ξ)] dξ    (28)
In the equation,  L ( q )  represents the Evidence Lower Bound (ELBO), and  KL ( q | | p )  denotes the Kullback-Leibler (KL) divergence, which measures the approximation degree between the probability distribution  q  and the posterior distribution  p . The smaller the KL divergence, the higher the degree of approximation. The goal of variational inference is to maximize the lower bound of  ln p ( X )  by finding the distribution of  q ( ξ )  that maximizes the ELBO. At this point,  q ( ξ )  can be used to approximate the posterior probability distribution of the latent variables.
To minimize KL Divergence, the probability distribution is decomposed into  q ( ξ ) = q ( Y ) q ( γ ) q ( σ ) , and the general expression for the optimal approximate distribution is provided as follows:
ln q_j(ξ_j) = E_{i≠j}[ln p(X, ξ | δ_d, δ_s)] + const    (29)
In the expression, E_{i≠j}[·] denotes the expectation taken with respect to the approximate distributions q(ξ_i) of all latent variables except ξ_j, which are held fixed.
By ignoring the terms unrelated to  Y , the optimal approximate posterior distribution for  q ( Y )  can be obtained from Equation (29) as follows:
ln q(Y) = ⟨ln p(X | Y, ρ, δ_d, δ_s) + ln p(Y | γ)⟩_{q(γ) q(ρ)} + const    (30)
where ⟨·⟩_{q(·)} denotes the expectation with respect to q(·), and ρ denotes the noise precision (written σ above). q(Y) is then found to follow a joint complex Gaussian distribution, which is expressed as:
q(Y) = ∏_{t=1}^{L} CN(y(t); μ(t), Σ_Y)    (31)
with
μ(t) = ρ Σ_Y Φ^H x(t)    (32)
Σ_Y = (ρ Φ^H Φ + Σ^{-1})^{-1}    (33)
Ignoring the terms unrelated to  γ , the optimal approximate posterior distribution for  q ( γ )  can be obtained from Equation (29) as follows:
ln q(γ) = ⟨ln p(Y | γ) + ln p(γ | a, b)⟩_{q(Y) q(ρ)} + const    (34)
q ( γ )  is solved to be a Gamma distribution, with its probability distribution expressed as:
q(γ) = ∏_{m=1}^{N_d N_s} Γ(γ_m; a + L/2, b + b_m^γ)    (35)
γ_m^{new} = ⟨γ_m⟩_{q(γ_m)} = (a + L/2) / (b + b_m^γ)    (36)
In the equation, b_m^γ = (1/2) Σ_{t=1}^{L} [|μ_m(t)|² + Σ_Y(m,m)], where μ_m(t) denotes the m-th element of the vector μ(t), and Σ_Y(m,m) represents the element in the m-th row and m-th column of the matrix Σ_Y.
Ignoring the terms unrelated to the noise precision ρ, the optimal approximate posterior distribution q(ρ) can be obtained from Equation (29) as follows:
ln q(ρ) = ⟨ln p(X | Y; ρ) + ln p(ρ | c, d)⟩_{q(Y) q(γ)} + const    (37)
q ( ρ )  is solved to be a Gamma distribution, with its probability distribution stated as follows:
q(ρ) = Γ(ρ; c + LNM/2, d + d^ρ)    (38)
ρ^{new} = ⟨ρ⟩_{q(ρ)} = (c + LNM/2) / (d + d^ρ)    (39)
In the equation, d^ρ = (1/2)||X − Φ U||_F² + (1/2) tr(Φ^H Σ_Y Φ), where U = [μ(1), μ(2), …, μ(L)] and tr(·) denotes the trace of a matrix.

3.3. Maximization of the Lower Bound (M-Step)

When  q ( ξ )  is fixed, the maximization of the lower bound of Equation (28) is:
max_{δ_d, δ_s} L(δ_d, δ_s) = ⟨ln p(X, ξ | δ_d, δ_s)⟩_{q(ξ)} + const    (40)
Ignoring the terms unrelated to  δ d  and  δ s , Equation (40) can be simplified as [18]:
min_{δ_d, δ_s} {−2 ℜ(v_d^T δ_d + v_s^T δ_s) + δ_d^T E_{dd} δ_d + δ_s^T E_{ss} δ_s + 2 δ_d^T ℜ(E_{ds}) δ_s}    (41)
In the equation, ℜ(·) denotes taking the real part, where:
v_d = Σ_{t=1}^{L} diag(μ*(t)) Φ_d^H [x(t) − Φ μ(t)]
v_s = Σ_{t=1}^{L} diag(μ*(t)) Φ_s^H [x(t) − Φ μ(t)]
E_{dd} = Σ_{t=1}^{L} diag(μ*(t)) Φ_d^H Φ_d diag(μ(t)) + L (Φ_d^H Φ_d) ⊙ Σ_Y^T
E_{ss} = Σ_{t=1}^{L} diag(μ*(t)) Φ_s^H Φ_s diag(μ(t)) + L (Φ_s^H Φ_s) ⊙ Σ_Y^T
E_{ds} = Σ_{t=1}^{L} diag(μ*(t)) Φ_d^H Φ_s diag(μ(t)) + L (Φ_d^H Φ_s) ⊙ Σ_Y^T
In the equations, (·)* denotes the complex conjugate and ⊙ the Hadamard (element-wise) product.
Then, by setting the derivatives of Equation (41) with respect to  δ d  and  δ s  to zero, the optimal solution for Equation (41) can be obtained.
[δ_d^{new}; δ_s^{new}] = [E_{dd}, ℜ(E_{ds}); ℜ(E_{ds})^T, E_{ss}]^{-1} [ℜ(v_d); ℜ(v_s)]    (42)
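The closed-form update of Equation (42) is a block linear solve over the stacked offsets. A sketch (taking real parts throughout, with illustrative random statistics in place of the quantities computed from μ(t) and Σ_Y) is:

```python
import numpy as np

def update_offsets(E_dd, E_ss, E_ds, v_d, v_s):
    """Solve the stacked system of Equation (42) for [delta_d; delta_s]."""
    A = np.block([[np.real(E_dd), np.real(E_ds)],
                  [np.real(E_ds).T, np.real(E_ss)]])
    rhs = np.concatenate([np.real(v_d), np.real(v_s)])
    sol = np.linalg.solve(A, rhs)
    n = len(v_d)
    return sol[:n], sol[n:]

# Illustrative well-conditioned statistics of size 3
rng = np.random.default_rng(1)
M1 = rng.standard_normal((3, 3)); E_dd = M1 @ M1.T + 3 * np.eye(3)
M2 = rng.standard_normal((3, 3)); E_ss = M2 @ M2.T + 3 * np.eye(3)
E_ds = 0.1 * rng.standard_normal((3, 3))
d_d, d_s = update_offsets(E_dd, E_ss, E_ds,
                          rng.standard_normal(3), rng.standard_normal(3))
print(d_d.shape, d_s.shape)   # (3,) (3,)
```

In practice the solved offsets would be clipped to half a grid cell so the first-order Taylor expansion of Equation (17) remains valid.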
The iterative algorithm for Variational Bayesian Inference is structurally simple, with the specific steps detailed in Algorithm 1.
Algorithm 1. VB-SR-STAP algorithm.
Step 1: Set initial values for the hyperparameters γ_0, noise precision σ_0, a predefined error tolerance ε, maximum iteration count k_max, and parameters a, b, c, d, δ_d, δ_s.
Step 2: Use Equations (32) and (33) to update μ(t) and Σ_Y, respectively.
Step 3: Use Equations (36) and (39) to update γ and σ, respectively.
Step 4: Use Equation (42) to update δ_d and δ_s, and use Equation (18) to update Φ.
Step 5: Increase k by 1.
Step 6: If ||γ^new − γ^old||_2 / ||γ^old||_2 ≥ ε and k ≤ k_max, return to Step 2; otherwise, continue.
Step 7: Output the final result μ(t).
Step 8: Calculate R_C = (1/L) Σ_{t=1}^{L} (Φ μ(t))(Φ μ(t))^H, and subsequently calculate w_OPT = R_C^{-1} V_t(f_d, f_s) / (V_t^H(f_d, f_s) R_C^{-1} V_t(f_d, f_s)).

3.4. Improved Variational Bayesian Inference

For a squint-looking uniform linear array pulse-Doppler radar, assuming a constant pulse repetition frequency, constant platform velocity, and idealized clutter conditions in which the clutter scatterers are stationary with no internal motion, the clutter subspace can be approximated by the subspace spanned by a set of space-time steering vectors. These space-time steering vectors satisfy the following conditions:
V̄_p = [1, …, e^{j2π f_{sp}(βn + m)}, …, e^{j2π f_{sp}(β(N−1) + (M−1))}]^T    (43)
with
f_{sp} = p / N_r,  p = 0, 1, …, N_r − 1
β = f_d / f_s = 2v / (d f_r)
In the equation, N_r represents the rank of the clutter, which can be estimated using the well-known Brennan's rule [23]. When β = 2v/(d f_r) is an integer less than M, the rank of the clutter covariance matrix satisfies the following equation:
Rank(R_C) = N_r = β(N − 1) + M    (44)
When β = 2v/(d f_r) is a non-integer less than M, the rank of the clutter covariance matrix satisfies the following relationship:
Rank(R_C) ≈ N_r = ⌊β(N − 1) + M⌋    (45)
In the formula, ⌊∙⌋ denotes the floor function. Then, the  t -th space-time snapshot data in Equation (1) can also be expressed as the following equation:
x(t) = Σ_{p=0}^{N_r − 1} δ̄_p V̄_p    (46)
In the equation, δ̄_p, for p = 0, 1, …, N_r − 1, are the complex coefficients corresponding to the space-time steering vectors V̄_p. When the space-time steering dictionary includes all steering vectors {V̄_p}_{p=0}^{N_r−1}, the clutter space-time snapshot data represented by Equation (46) can also be expressed in the form of Equation (16). In that case, the non-zero elements of the angle-Doppler image correspond to the complex coefficients δ̄_p. This indicates that the number of non-zero elements in the clutter angle-Doppler spectrum, N_r, can be much smaller than the number of clutter scatterers, N_c. Furthermore, the space-time snapshot data of the clutter can be fully recovered using only N_r space-time steering vectors from the dictionary.
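Brennan's rule in Equations (44)-(45) can be evaluated directly; the parameter values below are illustrative, chosen so that β = 1:

```python
import math

def brennan_rank(N, M, v, d, f_r):
    """Clutter rank estimate N_r = floor(beta*(N-1) + M), beta = 2v/(d*f_r)."""
    beta = 2.0 * v / (d * f_r)
    return math.floor(beta * (N - 1) + M)

# With v = 75 m/s, d = 0.1 m, f_r = 1500 Hz, beta = 1, so N_r = (N - 1) + M
print(brennan_rank(N=8, M=8, v=75.0, d=0.1, f_r=1500.0))  # 15
```

For N = M = 8 this gives N_r = 15, far smaller than the NM = 64 space-time dimension, which is what makes the selective update of the next subsection worthwhile.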
The previous analysis shows that μ(t) is a highly sparse signal, with most of its entries close to zero. Equation (36) indicates that when a term μ_m(t) of μ(t) is non-zero, the corresponding γ_m will be extremely small, since the parameters a and b are typically set to very small values. Conversely, if a term γ_m of γ is large, the probability that the corresponding μ_m(t) is zero is high. Since each entry of μ(t) corresponds to the complex coefficient of a space-time steering vector in the dictionary, μ_m(t) is non-zero only when clutter exists at the corresponding angle-Doppler bin; hence, when γ_m is small, the corresponding μ_m(t) (clutter in the angle-Doppler image) is non-zero. Based on the prior knowledge that μ(t) has approximately N_r non-zero entries, and inspired by the K-means clustering sparse Bayesian learning algorithm proposed in [24] as well as the rank pruning techniques presented in [17,18], we can significantly reduce the per-iteration computation by updating only the entries of μ(t) corresponding to the N_r smallest values of γ. The update rules for each iteration are as follows: (1) record the indices of the N_r smallest elements of the hyperparameter γ to form a heap set A; (2) update the entries of μ(t) whose indices lie in A. The corresponding update formulas then become:
μ_m(t) = ρ σ_m² Φ_m^H x(t)    (47)
σ_m² = (ρ Φ_m^H Φ_m + γ_m^{-1})^{-1}    (48)
where Φ_m denotes the m-th column of the matrix Φ, and μ_m(t) and σ_m² represent the mean and variance of the complex coefficient corresponding to the m-th space-time steering vector, respectively. The specific steps of the improved iterative algorithm are listed in Algorithm 2.
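The selective update of Equations (47)-(48) touches only the N_r entries indexed by the heap set A, and each update needs only a scalar reciprocal instead of a matrix inverse. A sketch with illustrative names and random complex data:

```python
import numpy as np

def ivb_selective_update(X, Phi, gamma, rho, N_r):
    """Update only the mu_m(t) whose gamma_m are among the N_r smallest."""
    A = np.argsort(gamma)[:N_r]                  # heap set A
    Mu = np.zeros((Phi.shape[1], X.shape[1]), dtype=complex)
    for m in A:
        phi_m = Phi[:, m]
        # Equation (48): scalar variance, no matrix inversion needed
        sigma2_m = 1.0 / (rho * np.real(np.vdot(phi_m, phi_m)) + 1.0 / gamma[m])
        # Equation (47): corresponding posterior mean for all L snapshots
        Mu[m, :] = rho * sigma2_m * (phi_m.conj() @ X)
    return Mu, A

rng = np.random.default_rng(2)
Phi = rng.standard_normal((6, 12)) + 1j * rng.standard_normal((6, 12))
X = rng.standard_normal((6, 4)) + 1j * rng.standard_normal((6, 4))
gamma = rng.uniform(0.1, 2.0, size=12)
Mu, A = ivb_selective_update(X, Phi, gamma, rho=1.0, N_r=3)
print(len(A))   # 3
```

Only the three rows of Mu indexed by A are populated; the remaining rows stay at zero, mirroring the sparsity of the angle-Doppler image.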
Algorithm 2. IVB-SR-STAP algorithm.
Step 1: Set initial values for the hyperparameters γ_0, noise precision σ_0, a predefined error tolerance ε, maximum iteration count k_max, and parameters a, b, c, d, δ_d, δ_s.
Step 2: Update all elements of μ(t) and Σ_Y using Equations (32) and (33), respectively, and identify the indices of the N_r smallest elements in γ to form the initial heap set A.
Step 3: If the index m of γ_m belongs to A, update σ_m² using Equation (48) and the corresponding μ_m(t) using Equation (47).
Step 4: Use Equations (36) and (39) to update γ and σ, respectively.
Step 5: Update set A.
Step 6: Use Equation (42) to update δ_d and δ_s, and use Equation (18) to update Φ.
Step 7: Increase k by 1.
Step 8: If ||γ^new − γ^old||_2 / ||γ^old||_2 ≥ ε and k ≤ k_max, return to Step 3; otherwise, continue.
Step 9: Output the final result μ(t).
Step 10: Calculate R_C = (1/L) Σ_{t=1}^{L} (Φ μ(t))(Φ μ(t))^H, and subsequently calculate w_OPT = R_C^{-1} V_t(f_d, f_s) / (V_t^H(f_d, f_s) R_C^{-1} V_t(f_d, f_s)).

3.5. Comparison of Computational Complexity

This subsection counts the multiplication and division operations performed in a single iteration of each algorithm and analyzes the resulting computational complexity. The proposed algorithm exhibits a computational complexity of O(Z N_r) when calculating the variables μ(t) and Σ_Y, where Z = MN, while the computational complexity of calculating the variable γ is O(N_r). As a result, the total computational complexity per iteration amounts to O(Z N_r). In comparison, the computational complexities reported in [15,20] are O(Z Z̃²) and O(Z Z̃), respectively, where Z̃ = N_s N_d > N_r. It is evident that the proposed algorithm significantly reduces computational complexity.
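To make the comparison concrete, the per-iteration scalings can be evaluated for an illustrative configuration (M = N = 8, ρ_d = ρ_s = 5, and N_r from Brennan's rule with β = 1; these values are assumptions, not the paper's simulation settings):

```python
# Illustrative per-iteration scaling in the Section 3.5 notation
M, N = 8, 8
Z = M * N                    # Z = M*N = 64
Z_tilde = (5 * M) * (5 * N)  # Z~ = N_d*N_s = 1600 with rho_d = rho_s = 5
N_r = (N - 1) + M            # clutter rank for beta = 1, i.e. 15

print(Z * Z_tilde ** 2)      # O(Z * Z~^2) -> 163840000
print(Z * Z_tilde)           # O(Z * Z~)   ->    102400
print(Z * N_r)               # O(Z * N_r)  ->       960
```

Even for this modest system, exploiting the clutter rank shrinks the dominant per-iteration term by several orders of magnitude.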

4. Experimental Simulation

4.1. Analysis of Clutter Power Spectrum

The performance of the IVB-SR-STAP algorithm was analyzed through simulation experiments and compared with the Homotopy-SR-STAP [25], LMSSE-SR-STAP [26], and VB-SR-STAP [20] algorithms. Table 1 presents the simulation parameters for a radar system with a side-looking uniform linear array.

4.1.1. Clutter Power Spectrum Under Ideal Conditions

The first experiment meticulously examined the clutter power spectra of the LMSSE-SR-STAP, Homotopy-SR-STAP, VB-SR-STAP, and IVB-SR-STAP algorithms under ideal conditions, with the results presented in Figure 3. The LMSSE-SR-STAP and Homotopy-SR-STAP algorithms require the setting of regularization parameters; however, it is difficult to determine an optimal value for these parameters. This has led to a noticeable broadening phenomenon in the clutter spectrum, resulting in the dispersion of clutter energy and subsequent degradation in clutter suppression performance. In comparison, the VB-SR-STAP algorithm does not require the setting of regularization parameters, and it continuously updates Equation (18) to address quantization errors. Moreover, its recovered clutter spectrum does not show significant broadening. The clutter energy is concentrated on the clutter ridge, demonstrating its superiority in maintaining spectral integrity. The IVB-SR-STAP algorithm, by updating only a selected few key hyperparameters, is also able to produce a clutter spectrum that compares favorably with the performance of the VB-SR-STAP algorithm. Both the VB-SR-STAP and IVB-SR-STAP algorithms overcome the challenges posed by the setting of regularization parameters, and the results they produce are very close to the optimal clutter spectrum, highlighting their exceptional accuracy and efficiency.

4.1.2. Clutter Power Spectrum Under Array Element Error Conditions

In this subsection, we consider the non-ideal scenario with gain-phase (GP) errors. GP errors arise from inconsistent amplitude and phase characteristics in the radio frequency (RF) amplifier components of the array channels. These errors manifest as variations in amplitude and phase across different channels. To describe this, we introduce an error matrix T into the steering vector modeling. By extending the signal model to account for GP errors, Equation (20) can be rewritten as follows:
X GP = T Φ Y + N 0
where error matrix T is:
T = I_M ⊗ diag([g_1 exp(jh_1), …, g_N exp(jh_N)])
where g_i ∈ [0, 0.1] and h_i ∈ [0, 0.1π], i = 1, …, N, denote the amplitude error and phase error of the i-th element, respectively.
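As an illustration, the error matrix T can be generated with the stated error ranges. This is a sketch with variable names of our choosing, taking the diagonal entries exactly as written in the equation above.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 10, 10                          # pulses, array elements (Table 1)
g = rng.uniform(0.0, 0.1, N)           # amplitude errors g_i in [0, 0.1]
h = rng.uniform(0.0, 0.1 * np.pi, N)   # phase errors h_i in [0, 0.1*pi]

# T = I_M (Kronecker product) diag(g_i * exp(j h_i)), as in the equation above;
# note that gain errors are also often modeled multiplicatively as (1 + g_i).
T = np.kron(np.eye(M), np.diag(g * np.exp(1j * h)))
```

Applying T to the steering dictionary (X_GP = TΦY + N_0) perturbs every spatial channel identically across pulses, which is the structure the ⊗ with I_M encodes.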
Analyzing the clutter spectra shown in Figure 4 under GP errors leads to the following conclusions. The clutter spectrum computed by the LMSSE-SR-STAP algorithm exhibits significant broadening, indicating that GP errors strongly degrade its clutter suppression performance. The clutter spectrum computed by the Homotopy-SR-STAP algorithm is slightly wider than those of the VB-SR-STAP and IVB-SR-STAP algorithms, so under the same GP error conditions it is slightly inferior in maintaining the compactness of the clutter spectrum. The performance of the VB-SR-STAP and IVB-SR-STAP algorithms is noteworthy: the clutter spectra calculated by these two algorithms remain close to the optimal clutter spectrum, demonstrating excellent clutter suppression even in the presence of GP errors. This shows that the VB-SR-STAP and IVB-SR-STAP algorithms exhibit greater robustness and adaptability in addressing GP errors.

4.2. Analysis of Signal-to-Clutter-and-Noise Ratio Loss (SCNR Loss)

4.2.1. Signal-to-Clutter-and-Noise Ratio Loss Under Ideal Conditions

The second experiment compares the SCNR Loss of the LMSSE-SR-STAP, Homotopy-SR-STAP, VB-SR-STAP, and IVB-SR-STAP algorithms to evaluate their clutter suppression performance. The SCNR Loss is calculated as follows:
SCNR Loss = σ_n² |w_OPT^H V_t|² / (NM · w_OPT^H R_C w_OPT)
In the equation,  σ n 2  represents the noise power.
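A minimal numerical sketch of the weight computation (w_OPT as in Step 10) and the SCNR Loss is given below; the function and variable names are ours. In the noise-only case R_C = σ_n² I the loss evaluates to exactly 1 (0 dB), which is a useful sanity check.

```python
import numpy as np

def stap_weight(R_c, v_t):
    # w_OPT = R_c^{-1} v_t / (v_t^H R_c^{-1} v_t)
    r = np.linalg.solve(R_c, v_t)
    return r / (v_t.conj() @ r)

def scnr_loss(w, v_t, R_c, sigma_n2, N, M):
    # SCNR Loss = sigma_n^2 |w^H v_t|^2 / (N M * w^H R_c w)
    num = sigma_n2 * np.abs(w.conj() @ v_t) ** 2
    den = N * M * np.real(w.conj() @ R_c @ w)
    return num / den

# Sanity check: with a noise-only covariance the loss is 1 (0 dB)
N = M = 4
sigma_n2 = 2.0
v = np.ones(N * M, dtype=complex)      # toy unit-modulus steering vector
R = sigma_n2 * np.eye(N * M)
w = stap_weight(R, v)
print(scnr_loss(w, v, R, sigma_n2, N, M))  # 1.0
```

Values below 1 then quantify the SCNR degradation caused by residual clutter after filtering.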
In ideal scenarios, the experimental results presented in Figure 5 show that the SCNR Loss curve of the LMSSE-SR-STAP algorithm exhibits the widest notch. The SCNR Loss curve of the Homotopy-SR-STAP algorithm also displays a relatively wide notch, albeit less pronounced than that of the LMSSE-SR-STAP algorithm. Within the mainlobe clutter region, both the IVB-SR-STAP and VB-SR-STAP algorithms form deeper notches than the LMSSE-SR-STAP and Homotopy-SR-STAP algorithms, with both trailing closely behind the optimal performance. It is worth emphasizing that the proposed algorithm achieves this clutter suppression capability while significantly reducing the computational burden.

4.2.2. Signal-to-Clutter-and-Noise Ratio Loss Under Array Element Error Conditions

In this subsection, we examine the SCNR Loss of the different algorithms in the presence of GP errors. Figure 6 shows that the SCNR Loss notch produced by the LMSSE-SR-STAP algorithm is significantly broader than those of the other algorithms. Although the Homotopy-SR-STAP algorithm is comparable to the VB-SR-STAP and IVB-SR-STAP algorithms in notch width, the notch it forms in the mainlobe clutter region is notably shallower than those of the VB-SR-STAP and IVB-SR-STAP algorithms. Based on this analysis, we conclude that in complex environments with array element errors, the proposed IVB-SR-STAP algorithm not only reduces the computational complexity of the VB-SR-STAP algorithm but also maintains excellent clutter suppression performance. This further validates the feasibility and advantages of the IVB-SR-STAP algorithm in practical applications.

5. Conclusions

In this paper, a hierarchical Bayesian prior framework is proposed, and iterative parameter update formulas are derived through variational inference. Leveraging prior information on the rank of the spatio-temporal clutter covariance matrix, an improved variational Bayesian approach is introduced that optimizes the parameter update formulas and effectively reduces computational complexity. Furthermore, the method fully exploits the joint sparsity of the multiple measurement vector model, achieving higher sparsity while maintaining high accuracy, and employs a first-order Taylor expansion to eliminate grid mismatch in the dictionary. Experimental results demonstrate that the proposed algorithm maintains excellent performance while effectively reducing computational complexity.

Author Contributions

Each author contributed to the manuscript. K.L. and J.L. wrote the manuscript; Z.H. and L.Y. derived the theoretical method; P.L. and G.L. provided raw materials. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Key Research and Development Program of Anhui Province under Grant 2023Z04020018.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Restrictions apply to the datasets.

Acknowledgments

The authors would like to thank the Editorial Board and anonymous reviewers for their careful reading and constructive comments which provide important guidance for our paper writing and research work.

Conflicts of Interest

Author Peng Li was employed by the company Sun Create Electronics Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Gao, Z.; Deng, W.; Huang, P.; Xu, W.; Tan, W. Airborne Radar Space–Time Adaptive Processing Algorithm Based on Dictionary and Clutter Power Spectrum Correction. Electronics 2024, 13, 2187.
2. Widrow, B.; Mantey, P.E.; Griffiths, L.J.; Goode, B.B. Adaptive antenna systems. Proc. IEEE 1967, 55, 2143–2159.
3. Brennan, L.E.; Reed, L.S. Theory of Adaptive Radar. IEEE Trans. Aerosp. Electron. Syst. 1973, AES-9, 237–252.
4. Melvin, W.L. A STAP overview. IEEE Aerosp. Electron. Syst. Mag. 2004, 19, 19–35.
5. Honig, M.L.; Goldstein, J.S. Adaptive reduced-rank interference suppression based on the multistage Wiener filter. IEEE Trans. Commun. 2002, 50, 986–994.
6. Wang, L.; Lamare, R.C.d. Adaptive reduced-rank LCMV beamforming algorithm based on the set-membership filtering framework. In Proceedings of the 2010 7th International Symposium on Wireless Communication Systems, York, UK, 19–22 September 2010; pp. 125–129.
7. Cristallini, D.; Burger, W. A Robust Direct Data Domain Approach for STAP. IEEE Trans. Signal Process. 2012, 60, 1283–1294.
8. Cristallini, D.; Rosenberg, L.; Wojaczek, P. Complementary direct data domain STAP for multichannel airborne passive radar. In Proceedings of the 2021 IEEE Radar Conference (RadarConf21), Atlanta, GA, USA, 8–14 May 2021; pp. 1–6.
9. Du, X.; Jing, Y.; Chen, X.; Cui, G.; Zheng, J. Clutter Covariance Matrix Estimation via KA-SADMM for STAP. IEEE Geosci. Remote Sens. Lett. 2024, 21, 1–5.
10. Tropp, J.A.; Wright, S.J. Computational Methods for Sparse Solution of Linear Inverse Problems. Proc. IEEE 2010, 98, 948–958.
11. Mallat, S.G.; Zhifeng, Z. Matching pursuits with time-frequency dictionaries. IEEE Trans. Signal Process. 1993, 41, 3397–3415.
12. Wang, J.; Shim, B. On the Recovery Limit of Sparse Signals Using Orthogonal Matching Pursuit. IEEE Trans. Signal Process. 2012, 60, 4973–4976.
13. Shaobo, L.; Yuanhua, R.; Xingping, S.; Zongben, X. Learning Capability of Relaxed Greedy Algorithms. IEEE Trans. Neural Netw. Learn. Syst. 2013, 24, 1598–1608.
14. Zou, J.; Fu, Y.; Xie, S. A Block Fixed Point Continuation Algorithm for Block-Sparse Reconstruction. IEEE Signal Process. Lett. 2012, 19, 364–367.
15. Tipping, M.E. Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 2001, 1, 211–244.
16. Zhang, L.; Dai, L. Image Reconstruction of Electrical Capacitance Tomography Based on an Efficient Sparse Bayesian Learning Algorithm. IEEE Trans. Instrum. Meas. 2022, 71, 1–14.
17. Cheng, L.; Xing, C.; Wu, Y.C. Irregular Array Manifold Aided Channel Estimation in Massive MIMO Communications. IEEE J. Sel. Top. Signal Process. 2019, 13, 974–988.
18. Xu, L.; Cheng, L.; Wong, N.; Wu, Y.C.; Poor, H.V. Overcoming Beam Squint in mmWave MIMO Channel Estimation: A Bayesian Multi-Band Sparsity Approach. IEEE Trans. Signal Process. 2024, 72, 1219–1234.
19. Duan, H.; Yang, L.; Fang, J.; Li, H. Fast Inverse-Free Sparse Bayesian Learning via Relaxed Evidence Lower Bound Maximization. IEEE Signal Process. Lett. 2017, 24, 774–778.
20. Thomas, C.K.; Slock, D. SAVE: Space alternating variational estimation for sparse Bayesian learning. In Proceedings of the 2018 IEEE Data Science Workshop (DSW), Lausanne, Switzerland, 4–6 June 2018; pp. 11–15.
21. Al-Shoukairi, M.; Schniter, P.; Rao, B.D. A GAMP-Based Low Complexity Sparse Bayesian Learning Algorithm. IEEE Trans. Signal Process. 2018, 66, 294–308.
22. Zhou, W.; Zhang, H.-T.; Wang, J. An Efficient Sparse Bayesian Learning Algorithm Based on Gaussian-Scale Mixtures. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 3065–3078.
23. Ward, J. Space-time adaptive processing for airborne radar. In Proceedings of the 1995 International Conference on Acoustics, Speech, and Signal Processing, Detroit, MI, USA, 9–12 May 1995; pp. 2809–2812.
24. Lv, X.; Yuan, L.; Cheng, Z.; Zuo, L.; Yin, B.; He, Y.; Ding, C. An Improved Bayesian Learning Method for Distribution Grid Fault Location and Meter Deployment Optimization. IEEE Trans. Instrum. Meas. 2024, 73, 1–10.
25. Yang, Z.; Nie, L.; Huo, K.; Wang, H.; Li, X. Sparsity-based space-time adaptive processing using complex-valued homotopy technique. In Proceedings of the 2013 IEEE Radar Conference (RadarCon13), Ottawa, ON, Canada, 29 April–3 May 2013; pp. 1–6.
26. Melvin, W.L.; Guerci, J.R. Knowledge-aided signal processing: A new paradigm for radar and other advanced sensors. IEEE Trans. Aerosp. Electron. Syst. 2006, 42, 983–996.
Figure 1. Schematic diagram of airborne radar.
Figure 2. Directed acyclic graph of the proposed Bayesian model.
Figure 3. Clutter Power Spectrum under Ideal Conditions. (a) OPT; (b) LMSSE-SR-STAP algorithm; (c) Homotopy-SR-STAP algorithm; (d) VB-SR-STAP algorithm; (e) IVB-SR-STAP algorithm.
Figure 4. Clutter Power Spectrum under Array Element Error Conditions. (a) OPT; (b) LMSSE-SR-STAP algorithm; (c) Homotopy-SR-STAP algorithm; (d) VB-SR-STAP algorithm; (e) IVB-SR-STAP algorithm.
Figure 5. Signal-to-Clutter-and-Noise Ratio Loss under Ideal Conditions.
Figure 6. Signal-to-Clutter-and-Noise Ratio Loss under Array Element Error Conditions.
Table 1. Simulation parameters of the radar system with a side-looking uniform linear array.
Parameters                         Value
Number of Array Elements           10
Number of Pulses                   10
Element Spacing (m)                0.1
Operating Wavelength (m)           0.2
Flight Speed (m/s)                 150
Flight Altitude (m)                4000
Pulse Repetition Frequency (Hz)    5000
ρ_d                                4
ρ_s                                4
Training Snapshot Number           10
SNR (dB)                           30
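The rank prior N_r exploited by the IVB method can be estimated from these parameters. One common choice, which we assume here for illustration since the paper's exact value is not shown in this excerpt, is Brennan's rule for a side-looking uniform linear array:

```python
import math

# Parameters from Table 1
d = 0.1          # element spacing (m)
v_p = 150.0      # platform flight speed (m/s)
f_r = 5000.0     # pulse repetition frequency (Hz)
N, M = 10, 10    # array elements, pulses

beta = 2.0 * v_p / (d * f_r)           # clutter-ridge slope
rank = math.ceil(N + (M - 1) * beta)   # Brennan's rule: N_r ≈ N + (M - 1) * beta
print(beta, rank)  # 0.6 16
```

Under these parameters the estimated clutter rank (16) is far smaller than the dictionary size, which is the regime in which restricting updates to the heap set A pays off.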

Share and Cite

Li, K.; Luo, J.; Li, P.; Liao, G.; Huang, Z.; Yang, L. Improved Variational Bayes for Space-Time Adaptive Processing. Entropy 2025, 27, 242. https://doi.org/10.3390/e27030242
