**Recent Advancements in Radar Imaging and Sensing Technology**

Editors

**Piotr Samczynski Elisa Giusti**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editors* Piotr Samczynski Warsaw University of Technology Poland

Elisa Giusti CNIT—National Inter-University Consortium for Telecommunications Italy

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Sensors* (ISSN 1424-8220) (available at: https://www.mdpi.com/journal/sensors/special_issues/Radar_Imaging_Sensing).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-0918-1 (Hbk) ISBN 978-3-0365-0919-8 (PDF)**

Cover image courtesy of Prof. Samczynski.

© 2021 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.


## **About the Editors**

**Piotr Samczynski** (Associate Professor) received his B.Sc. and M.Sc. degrees in electronics and Ph.D. and D.Sc. degrees in telecommunications, all from the Warsaw University of Technology (WUT), Warsaw, Poland, in 2004, 2005, 2010 and 2013, respectively. Since 2018, he has been an Associate Professor at the WUT, and since 2014, a member of the WUT's Faculty of Electronics and Information Technology Council. Prior to this, he was an Assistant Professor at WUT (2010–2018), a research assistant at the Przemyslowy Instytut Telekomunikacji S.A. (PIT S.A.) (2005–2010) and the head of PIT's Radar Signal Processing Department (2009–2010). Prof. Samczynski's research interests are in the areas of radar signal processing, passive radar, synthetic aperture radar and digital signal processing. He is the author of over 200 scientific papers. Prof. Samczynski was involved in several projects for the European Defence Agency (EDA), the Polish National Centre for Research and Development (NCBiR) and the Polish Ministry of Science and Higher Education (MNiSW), including projects on SAR, ISAR and passive radars. Since 2009, he has been a member of several research task groups under the NATO Science and Technology Organization, where he supports research work in the fields of radar signal processing, modern passive and active radar architectures and noise radars. Since 2018, he has been the Chair of the NATO SET-258 research task group (RTG) on Deployable Multiband Passive/Active Radar (DMPAR) deployment and assessment in military scenarios. Prof. Samczynski has been an IEEE member since 2003 and an IEEE Senior Member since 2016. He is a member of the IEEE AES, SP, and GRS Societies. In 2017–2019, he was the Chair of the Polish Chapter of the IEEE Signal Processing Society, and since 2019, he has been Vice-Chairman (AES) of the IEEE Poland APS/AESS/MTTS Joint Chapter. In 2017, he received the IEEE Fred Nathanson Memorial Award for outstanding contribution to the field of passive radar imaging, including systems design, experimentation and algorithm development.

**Elisa Giusti** received the Telecommunication Engineering Laurea (cum laude) and Ph.D. degrees from the University of Pisa (Italy) in 2006 and 2010, respectively. From June 2009 to November 2014, she was a researcher under contract at the Department of Information Engineering of the University of Pisa. From November 2014 to December 2015, she was a researcher under contract at the CNIT-RaSS National Laboratory. Since December 2015, she has been a permanent researcher at the CNIT-RaSS National Laboratory (Pisa, Italy). Since 2009, she has been involved as a researcher in several international projects funded by Italian ministries (Ministry of Defence, Ministry of Economic Development) and European organisations (EDA, ESA). She is a co-founder of a radar systems-related spin-off company, "ECHOES radar technologies", located in Pisa. She is a co-author of more than 60 papers and 3 book chapters. She is the editor of a book titled "Radar Imaging for Maritime Observation", CRC Press. She was the recipient of the 2016 Outstanding Information Research Foundation Book publication award for the book "Radar Imaging for Maritime Observation". Her research interests are mainly in the field of radar imaging, including active, passive, bistatic, multistatic, and polarimetric radar.

## **Preface to "Recent Advancements in Radar Imaging and Sensing Technology"**

During the last decades, radar imaging and sensing technology has made major scientific and technical progress. The first applications of this technology were devoted mostly to military uses. Nowadays, radar imaging and sensing techniques are widely used in many civilian applications, ranging from medicine, through security, to safety assistance sensors widely used in transportation, including cars, trains, and airplanes, among others. These technologies are starting to be present everywhere around us. With the fast development of new hardware platforms with advanced computational resources that are widely available on the market, novel signal processing techniques that enable enhanced functionalities of radar systems have been implemented. This, in turn, makes it possible to apply new technology in radar imaging, such as passive radar sensing. Just a few years ago, this type of sensing was at a very low technical readiness level, and today it has become a mature technology that will probably be offered on the market within the next few years. Moreover, the ever-wider bandwidth of the currently available receivers allows the creation of very high-resolution radar images utilizing both active and passive radar technology.

The aim of this Printed Edition of the Special Issue was to gather the latest research results in the area of modern radar technology using active and/or passive radar imaging and sensing techniques in different applications, including both military use and a broad spectrum of civilian applications. As a result, the 19 published papers highlight a variety of topics related to modern radar imaging and microwave sensing technology and cover a wide range of its applications. A brief review of the content of this book is presented below.

The first paper in this Special Issue, entitled "A Hybrid SAR/ISAR Approach for Refocusing Maritime Moving Targets with the GF-3 SAR Satellite", proposes combining Synthetic Aperture Radar (SAR) and Inverse SAR (ISAR) processing to refocus maritime moving targets. The authors describe a novel hybrid SAR/ISAR approach based on the improved rank-one phase estimation method (IROPE), which uses an iterative two-step convergence approach. As a result, the proposed method estimates the phase error accurately, maintains robustness to noise, and performs well in estimating various phase errors. The method's performance is compared with that of other focusing algorithms on simulated data and on real complex image data acquired by Gaofen-3 (GF-3) in spotlight mode. The results demonstrate the effectiveness of the proposed method and its potential for high-resolution, long-CPI spaceborne radar.

The authors of the paper "Azimuth Phase Center Adaptive Adjustment upon Reception for High-Resolution Wide-Swath Imaging" propose a method to set the proper pulse repetition frequency (PRF) in a multichannel SAR system composed of a number of receiving antennas, in order to effectively implement high-resolution wide-swath (HRWS) SAR imaging. When sub-apertures are used at the receiver, the PRF in use may not be optimal, which decreases the quality of the SAR image. In particular, nonuniform sampling of the received signal along the along-track dimension may cause grating lobes, higher sidelobes and even ghost targets. The authors propose a way to automatically adapt the optimum value of the PRF within a certain range by adjusting the phase center spacing of the sub-apertures.

The paper entitled "Focusing Bistatic Forward-Looking Synthetic Aperture Radar Based on an Improved Hyperbolic Range Model and a Modified Omega-K Algorithm" proposes improving the hyperbolic approximation of the range model with high-order terms to obtain a more accurate compensation result when focusing bistatic forward-looking SAR data. Additionally, the authors adopt a modified omega-K algorithm based on the new slant range model for parallel bistatic forward-looking SAR imaging. The paper presents several simulation results, which validate the effectiveness of the proposed imaging algorithm.

The paper "Microwave Staring Correlated Imaging Based on Unsteady Aerostat Platform" proposes an algorithm to form radar images of the observed area using an unsteady aerostat platform and the microwave staring correlated imaging (MSCI) algorithm. MSCI has been proven effective when SAR cannot be used (forward-looking or staring geometries). However, the MSCI algorithm relies on the hypothesis of a stationary platform, which is difficult to realize in practice, especially when using a tethered aerostat. The authors therefore propose an algorithm that includes the antenna motion and the dynamic beam coverage caused by the instability of the platform in the imaging model. The real-time position vectors of the antenna are used in the imaging model instead of a static position vector. A least-squares curve fitting is used to accurately estimate the translational speed and rotational velocity of the array at each pulse.

The paper "Strip-Mode Microwave Staring Correlated Imaging with Self-Calibration of Gain–Phase Errors" proposes to apply microwave staring correlated imaging (MSCI) with strip-mode self-calibration of gain–phase errors. The method solves the problem of MSCI with gain–phase error, which occurs in a large SAR scene. The problem exists in the multi-transmitter array, resulting in an imaging model mismatch and considerably degrading the imaging performance. The authors propose to divide the imaging SAR scene into multiple imaging strips. Then in the next step, the strip target scattering coefficient and the gain–phase errors are combined into a multi-parameter optimization problem that can be solved in the iteration procedure. The error estimation results in each iteration are set as the initial value for the next iteration. As a result, the whole SAR imaging in a large scene is achieved by multi-strip image splicing. The proposed method reduces the time required by the SAR imaging process and improves the imaging quality.

The paper "Geometrical Matching of SAR and Optical Images Utilizing ASIFT Features for SAR-based Navigation Aided Systems" proposes a new approach for the estimation of shift and rotation between optical and SAR images. The estimated shift and rotation can be used to calculate the navigation correction when the drift of the calculated SAR platform trajectory is expected. The method can be used in platforms where there is no satellite navigation signal present. In such a case, the trajectory is calculated only on the basis of an inertial navigation system, which is characterized by a significant error. The proposed method of estimating the navigation error utilizing Affine Scale-Invariant Feature Transform (ASIFT) and Structure from Motion (SfM) is described in this paper. The presented methodology was tested and verified using real-life SAR images. Merged techniques such as ASIFT-based keypoint extraction and SfM-based keypoints matching make this method robust and resistant to noise and interference. Thus, the presented methodology can be successfully integrated with existing systems to enhance their precision and dependability. Additionally, the authors provide a comparison of several filters, including their computational complexity and performances. The detailed results of this analysis are shown in the article.

The paper "Wavelength-Resolution SAR Ground Scene Prediction Based on Image Stack" presents five different statistical methods for ground scene prediction (GSP) in wavelength-resolution SAR images. The predictions are based on image stacks, which are composed of images of the same scene acquired at different instants with the same flight geometry. The authors considered the following methods in their study: autoregressive models, trimmed mean, median, intensity mean, and mean calculations. The authors indicate that the median method provided the most accurate representation of the true ground. Additionally, a change detection algorithm using the median ground scene as a reference image was considered to show the applicability of the GSP. The obtained results show competitive performance when compared with recently published works.
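The median-based prediction and its use as a change detection reference can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic image stack, the noise level, and the 0.5 detection threshold are assumptions chosen only to show why a pixel-wise median is less affected than a mean by targets that appear in a single image.

```python
import numpy as np

def ground_scene_median(stack):
    """Pixel-wise median over an image stack shaped (n_images, rows, cols)."""
    return np.median(stack, axis=0)

# Synthetic stack (an assumption for illustration): a fixed background seen
# seven times with small noise, plus targets present in single images only.
rng = np.random.default_rng(0)
background = rng.uniform(0.1, 0.3, size=(32, 32))
stack = np.stack([background + 0.01 * rng.standard_normal((32, 32))
                  for _ in range(7)])
stack[2, 10, 10] = 1.0
stack[5, 20, 5] = 1.0

median_gsp = ground_scene_median(stack)
mean_gsp = stack.mean(axis=0)

# Error of each prediction at a pixel that held a transient target.
err_median = abs(median_gsp[10, 10] - background[10, 10])
err_mean = abs(mean_gsp[10, 10] - background[10, 10])

# Change detection: subtract the median reference from a new image and
# threshold the difference (the threshold value is hypothetical).
surveillance = background + 0.01 * rng.standard_normal((32, 32))
surveillance[15, 15] = 1.0
detections = np.argwhere(surveillance - median_gsp > 0.5)
```

In this toy setting the transient target biases the mean by roughly one seventh of its amplitude, while the median stays close to the background value.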

The paper "A Multi-Scale U-Shaped Convolution Auto-Encoder Based on Pyramid Pooling Module for Object Recognition in Synthetic Aperture Radar Images" proposes another way to implement an auto-encoder that simultaneously extracts global and local target features. More specifically, the proposed approach learns multi-scale features at two levels: the modality-level features and the branch-level features. A modified objective function is also proposed to handle the degradation caused by speckle. Moreover, a new convolution layer and its counterpart are developed to reduce the number of trainable parameters in the model, in order to alleviate the overfitting caused by the limited number of training samples.

The paper "A MIMO-SAR Tomography Algorithm Based on Fully-Polarimetric Data" presents a fully-polarimetric unitary multiple signal classification (UMUSIC) tomography algorithm to acquire high-resolution 3D radar imagery for a multiple-input multiple-output (MIMO) SAR with a small number of baselines. The authors employ fully-polarimetric data and their conjugation to obtain the sample covariance matrix, which mitigates the effect of multi-looking on the range-azimuth resolution. Two algorithms, the popular distributed compressed sensing (DCS) and UMUSIC, are compared through numerical simulations of different point scatterers, in all cases using the fully-polarimetric data. The MIMO SAR algorithm has been validated using measured data of an aircraft model with six different baselines. The final results show the usefulness of the algorithm for 3D imagery of complex radar targets.

The authors of "Target Localization Using Double-Sided Bistatic Range Measurements in Distributed MIMO Radar Systems" propose a way to enhance the target localization performance of a distributed MIMO radar system. The main novelty with respect to the literature is the use of both the target-transmitter and target-receiver distances as auxiliary information to improve the calculation of the target time delays and, therefore, the estimation of the target coordinates.

The authors of the paper "Research of a Radar Imaging Algorithm Based on High Pulse Repetition Random Frequency Hopping Synthetic Wideband Waveform" propose a radar imaging algorithm specifically designed for high-PRF, random frequency hopping (RFH) waveforms. A high PRF has obvious advantages, especially on very fast moving platforms (e.g., supersonic or hypersonic aircraft). Moreover, RFH makes the radar system particularly robust to jamming. On the other hand, RFH makes it impossible to use Fourier-based approaches to form the radar image of the observed scene, because the received echoes are not uniformly spaced in the data domain. Algorithms for this problem have been proposed in the literature, but one branch imposes many constraints on the structural characteristics of the nonuniformly sampled signal, which limits its versatility, while the other branch, based on compressed sensing, performs well but carries a high computational burden. The authors therefore propose an algorithm based on Doppler preprocessing and a 2D generalized matched filter (GMF) to overcome these limitations. Moreover, several RFH modes are designed along with the corresponding imaging algorithms.

"Compressed Sensing Radar Imaging: Fundamentals, Challenges, and Advances" is a review of the modern concept of compressed sensing applied to radar imaging. The authors present the main technical achievements and the technical background of the most used techniques, such as minimum variance unbiased estimation, least squares (LS) estimation and Bayesian maximum a posteriori (MAP) estimation. Moreover, the main challenges and open problems are analyzed in this paper. These include the sampling scheme, the computational complexity, the sparse representation, the influence of clutter and the model error compensation.

"Compressive Sensing-Based Bandwidth Stitching for Multichannel Microwave Radars" addresses the formation of high-resolution range profiles (HRRPs) using different frequency bands and compressive sensing (CS). Phase errors due to incorrect timing synchronization and the relative locations of the antenna phase centers complicate the formation of HRRPs. The paper addresses this challenge with two methods based on CS theory, the pruned orthogonal matching pursuit (POMP) and an l1-norm regularization algorithm, to jointly estimate the range profile and the phase errors.

"Compressive Sensing for Tomographic Imaging of a Target with a Narrowband Bistatic Radar" proposes a method to form high-resolution 2D radar images using a narrowband radar with a large synthetic aperture and a compressive sensing (CS) based algorithm instead of the standard tomographic approach. In particular, the authors propose the use of the parameter refined orthogonal matching pursuit (PROMP) algorithm. A key feature of this algorithm is that it can address the dictionary mismatch problem that may arise because of the presence of off-grid scatterers. The performance of the algorithm is then compared with that of standard tomographic approaches and of the orthogonal matching pursuit (OMP), using simulated data and data acquired in an anechoic chamber in a fully controlled experiment.

"Two-Dimensional Augmented State–Space Approach with Applications to Sparse Representation of Radar Signatures" proposes a subspace-based method to reconstruct a sparse 2D radar image of man-made targets. Among the approaches proposed in the literature, one group consists of subspace-based approaches such as MUSIC and MEMP. These perform well but still leave some challenges open, such as the model order, which MUSIC requires and which is difficult to know a priori. The authors of this paper propose a 2D augmented state-space approach (ASSA) to address these challenges.

The paper "On the Slow-Time k-Space and its Augmentation in Doppler Radar Tomography" presents an enabling signal processing technique that combines Doppler Radar Tomography (DRT) with a sparse reconstruction technique such as Orthogonal Matching Pursuit (OMP), within a unifying mathematical framework based on the slow-time k-space. DRT relies on spatial diversity from the rotational motion of a target rather than spectral diversity from wide-bandwidth signals. The slow-time k-space is a novel form of the spatial frequency space generated by the relative rotational motion of a target at a single radar frequency, which can be exploited for high-resolution target imaging by a narrowband radar with Doppler tomographic signal processing. The authors demonstrate the ability to improve image resolution using a rotating target and an ultra-narrowband radar. The proposed technique has been validated using real measurements. It is shown that closely spaced scatterers can be resolved, which is illustrated with the creeping wave effect that appears when the scatterer size is similar to the radar wavelength. The proposed method exploits a unique and interesting characteristic of the slow-time k-space: it can be augmented by signal processing to significantly enhance imaging resolution and to provide more information for identifying unknown targets detected by the radar.
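Orthogonal Matching Pursuit, the sparse reconstruction technique named in this and the preceding summaries, can be sketched in a few lines. This is the generic textbook algorithm, not the slow-time k-space processing of the paper and not the PROMP or POMP variants; the random Gaussian dictionary and the three-scatterer scene are assumptions made only for illustration.

```python
import numpy as np

def omp(A, y, n_nonzero):
    """Greedy sparse recovery of x from y = A @ x with at most n_nonzero atoms."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the dictionary column most correlated with the residual.
        correlations = np.abs(A.conj().T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1], dtype=np.result_type(A, y))
    x[support] = coef
    return x

# Hypothetical noise-free test scene: 3 scatterers, 80 measurements, 128 atoms.
rng = np.random.default_rng(1)
n_meas, n_atoms = 80, 128
A = rng.standard_normal((n_meas, n_atoms))
A /= np.linalg.norm(A, axis=0)          # unit-norm dictionary columns
x_true = np.zeros(n_atoms)
true_support = [7, 50, 101]
x_true[true_support] = [1.0, -0.8, 1.2]
y = A @ x_true

x_hat = omp(A, y, n_nonzero=3)
```

The joint least-squares refit at every step is what distinguishes OMP from plain matching pursuit. In a radar setting the columns of `A` would be steering or signature vectors on a grid, which is where the off-grid mismatch mentioned above comes from.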

The paper "Target Doppler Rate Estimation Based on the Complex Phase of STFT in Passive Forward Scattering Radar" presents a novel approach to estimating target motion parameters in passive forward scattering radars (FSR). In the proposed method, the modulation factor, also called the Doppler rate, is estimated in the time-frequency (TF) domain. The approach utilizes the idea of the complex phase of the short-time Fourier transform (STFT) and its modification known from the literature. The authors validate the proposed method using simulations and signals collected during real measurements from a radar operated in passive FSR geometry. The accuracy of the considered estimators has been verified by statistical analysis and by comparison of the results with the Cramer-Rao lower bound (CRLB). The obtained results show the differences between the estimators as well as the expected accuracy.

The paper "The Use of the Reassignment Technique in the Time-Frequency Analysis Applied in VHF-Based Passive Forward Scattering Radar" presents the application of the time-frequency (TF) reassignment technique in passive forward scattering radar (FSR) using Digital Video Broadcasting – Terrestrial (DVB-T) transmitters of opportunity operating in the Very High Frequency (VHF) band. The authors propose to use this method to enhance the readability of the energy distribution in the TF domain, which improved the result of the Hough transform and finally the precision of the Doppler rate estimation in the passive FSR system. The algorithm has been validated using real-life signals collected by the passive radar demonstrator during a measurement campaign. Additionally, in the described experiment, the authors tested the possibility of utilizing FSR geometry in foliage penetration conditions taking advantage of the VHF band of a DVB-T illuminator of opportunity. The final obtained results show that the concentrated (reassigned) energy distribution of the signal in the TF domain allows a more precise target Doppler rate to be estimated using the Hough transform.

The paper "Noise Suppression for GPR Data Based on SVD of Window-Length-Optimized Hankel Matrix" presents a novel method based on the singular value decomposition (SVD) of a window-length-optimized Hankel matrix to improve the noise suppression performance in ground-penetrating radar (GPR). The effectiveness of the proposed method has been verified by the authors using simulated and real measurement GPR data. The experimental results show that the proposed method can effectively improve noise removal performance under different detection scenarios in GPR applications.
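The general idea of Hankel-matrix SVD denoising can be sketched as follows. The sketch assumes a fixed window length of half the trace and a known rank; the window-length optimization that is the paper's contribution is not reproduced, and the sinusoidal test trace is a stand-in for real GPR data.

```python
import numpy as np

def hankel_svd_denoise(trace, window, rank):
    """Low-rank approximation of the Hankel matrix of a 1D trace."""
    n = trace.size
    cols = n - window + 1
    # idx[i, j] = i + j, so H[i, j] = trace[i + j] (constant anti-diagonals).
    idx = np.arange(window)[:, None] + np.arange(cols)[None, :]
    H = trace[idx]
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    # Keep only the strongest singular components.
    H_low = (U[:, :rank] * s[:rank]) @ Vh[:rank]
    # Average along anti-diagonals to map the matrix back to a trace.
    out = np.zeros(n)
    counts = np.zeros(n)
    np.add.at(out, idx, H_low)
    np.add.at(counts, idx, 1.0)
    return out / counts

# Hypothetical test trace: one sinusoid (Hankel rank 2) in white noise.
rng = np.random.default_rng(2)
n = 256
clean = np.cos(2 * np.pi * 0.05 * np.arange(n))
noisy = clean + 0.3 * rng.standard_normal(n)
denoised = hankel_svd_denoise(noisy, window=n // 2, rank=2)

rms_before = np.sqrt(np.mean((noisy - clean) ** 2))
rms_after = np.sqrt(np.mean((denoised - clean) ** 2))
```

The window length sets the shape of the Hankel matrix and therefore how well signal and noise separate in the singular spectrum, which is why it is worth optimizing rather than fixing as done here.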

The "New Concept of Combined Microwave Delay Lines for Noise Radar-Based Remote Sensors" paper focuses on the implementation of an optimized analogue microwave tunable delay line to be used in a noise radar to detect micro-movements with range determination. Although the current development of noise radars mainly concerns the use of advanced digital signal processing techniques to obtain fully digital correlation receivers, an analog correlation-based receiver with a tunable reference delay line may still be an interesting solution, especially for detecting millimeter-scale movements such as the vital activity of the human body. To address this issue, the paper presents the concept of a digitally controlled delay line with a set of fine distance gates. This concept assumes the use of a combined set of three lines, including a new version of a tapped delay line.

In this section, a short introduction to the articles published in the Special Issue entitled "Recent Advancements in Radar Imaging and Sensing Technology" has been presented. All published papers show new directions in algorithm development in areas including high-resolution radar imaging, novel Synthetic Aperture Radar (SAR) and Inverse SAR (ISAR) imaging techniques, passive radar imaging technology, modern civilian applications of radar technology for sensing, multiple-input multiple-output (MIMO) SAR imaging, and tomographic imaging, among others.

> **Piotr Samczynski, Elisa Giusti** *Editors*

## *Article* **A Hybrid SAR/ISAR Approach for Refocusing Maritime Moving Targets with the GF-3 SAR Satellite**

#### **Zhishuo Yan <sup>1,2</sup>, Yi Zhang <sup>1</sup> and Heng Zhang <sup>1,</sup>\***


Received: 18 February 2020; Accepted: 2 April 2020; Published: 4 April 2020

**Abstract:** Due to self-motion and sea waves, moving ships are typically defocused in synthetic aperture radar (SAR) images. To focus non-cooperative targets, the inverse SAR (ISAR) technique is commonly used with motion compensation. The hybrid SAR/ISAR approach allows a long coherent processing interval (CPI), in which SAR targets are processed with ISAR processing, and exploits the advantages of both SAR and ISAR to generate well-focused images of moving targets. In this paper, based on hybrid SAR/ISAR processing, we propose an improved rank-one phase estimation method (IROPE). By using an iterative two-step convergence approach in the IROPE, the proposed method achieves accurate phase error, maintains robustness to noise and performs well in estimating various phase errors. The performance of the proposed method is analyzed by comparing it with other focusing algorithms in terms of processing simulated data and real complex image data acquired by Gaofen-3 (GF-3) in spotlight mode. The results demonstrate the effectiveness of the proposed method.

**Keywords:** synthetic aperture radar (SAR); moving targets; inverse SAR (ISAR); motion compensation; hybrid SAR/ISAR; improved rank-one phase estimation (IROPE); Gaofen-3 (GF-3)

#### **1. Introduction**

Synthetic aperture radar (SAR) provides all-weather, day–night, wide-range high-resolution imaging capabilities for a wide range of applications in Earth science and climate change research, marine detection and imaging, and disaster monitoring [1]. Nevertheless, SAR uses the motion of the radar, ignoring target motion, to coherently synthesize a large aperture that provides a narrow synthesized beam and thus a high cross-range resolution. Therefore, in the ship detection and classification scenario, ships are always defocused with conventional SAR processing because of their individual motion and sea waves. Inverse SAR (ISAR) uses the rotational motion of the targets, ignoring radar motion, to distinguish different relative velocities through coherent processing and Doppler effects and to form the synthetic aperture. Thus, ISAR processing is better suited to the moving scenario, since it is superior in imaging moving targets undergoing complex unknown motions [2]. However, due to the unpredictability of non-cooperative targets, motion compensation in ISAR imaging is a challenging task and usually includes two steps: range tracking and Doppler tracking, i.e., coarse phase compensation and fine phase compensation [3]. This paper focuses on the study of fine phase compensation, which is more sensitive than range tracking. Moreover, during the operation of SAR for a moving target, both the radar and the target are in motion, which means that the processing of the moving target must combine the processes of SAR (radar motion) and ISAR (target motion) [4,5]. Hybrid SAR/ISAR [6,7] processing is such an approach: it optimally processes SAR data by treating target and radar platform motions on an equal footing, and it takes advantage of ISAR processing to generate the focused image of the moving target in SAR.

Ideally, the phase of the echo signal in the range and azimuth profiles varies linearly during the processing time. However, due to various factors, there exist undesired phase changes in the echo signal, which are collectively referred to as phase error [8]. Phase error, which is divided into low-frequency, high-frequency, and random phase errors, causes geometric distortion, resolution degradation, false targets, and reduced signal-to-noise ratio (SNR), thus resulting in poor image quality. Low-frequency errors encompass linear phase errors, quadratic phase errors (QPEs), and so on. The low-frequency phase errors primarily affect the main lobe of the system impulse response, while high-frequency errors affect the sidelobe regions. Random phase errors cause multiple pairs of echoes around the target, and the main lobe energy is reduced [9].
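The different effects of low-frequency and high-frequency phase errors on the impulse response can be reproduced with a small numerical experiment. The sketch below is illustrative only: a single point target with unit amplitude is assumed, the error magnitudes (2π rad of quadratic phase at the aperture edges, a 0.8 rad sinusoidal error with 20 cycles across the aperture) are arbitrary choices, and azimuth compression is reduced to a zero-padded FFT.

```python
import numpy as np

n = 256      # pulses in the aperture
pad = 8      # zero-padding factor for a finely sampled response
k = np.arange(n)
t = np.linspace(-1.0, 1.0, n)   # normalized slow time

def impulse_response(phase_error):
    """Magnitude of the compressed point-target response, peak 1 if error-free."""
    signal = np.exp(1j * phase_error)
    spectrum = np.fft.fftshift(np.fft.fft(signal, pad * n))
    return np.abs(spectrum) / n

def samples_within_3db(response):
    """Rough spread measure: number of samples within 3 dB of the peak."""
    return int(np.count_nonzero(response >= response.max() / np.sqrt(2)))

ideal = impulse_response(np.zeros(n))

# Low-frequency error: quadratic phase, 2*pi rad at the aperture edges.
qpe = impulse_response(2.0 * np.pi * t**2)

# High-frequency error: sinusoidal phase, 20 cycles across the aperture.
cycles = 20
hf = impulse_response(0.8 * np.sin(2 * np.pi * cycles * k / n))

# Index where the first paired echo of the sinusoidal error is expected.
center = pad * n // 2
echo_bin = center + pad * cycles
```

With these settings the quadratic error lowers the peak and spreads the main lobe, while the sinusoidal error leaves the main lobe narrow but produces paired echoes in the sidelobe region, `cycles` resolution cells away from the target, where the error-free response has a null.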

To estimate the phase error and refocus the images, many fine phase compensation algorithms have been proposed, which are roughly divided into parametric and nonparametric algorithms. The parametric algorithms include the Mapdrift (MD) method [10,11], the phase difference (PD) method [12] and methods of parameter estimation [13]. MD and PD methods are easy to implement, but they only compensate for QPEs, which limits their applications. Furthermore, Chen et al. [14] proposed a parametric sparse representation method. The acceleration and third-order phase were considered in [15,16]. Tang et al. [17] achieved 2D velocity estimation of moving targets and refocusing based on back projection and velocity SAR (another multichannel SAR-GMTI technique). Nevertheless, these methods introduce nonlinear operations, which degrade the performance in the case of low SNR. The works in [18–21] presented methods for imaging moving targets via compressive sensing (CS), which are capable of generating images with better target focusing, especially with low SNR and high undersampling ratios.

The nonparametric algorithms mainly include the maximum contrast (MC) [22], minimum entropy (ME) [23,24], weighted least-squares (WLS) [25], sharpness optimization [26,27], Doppler centroid tracking (DCT) [28], phase gradient autofocus (PGA) [29] and rank-one phase estimation (ROPE) [30,31] methods. Because the MC, ME, WLS and sharpness optimization methods do not make any assumptions about the characteristics of the target itself, they are highly adaptable. However, since the synthetic aperture process is non-stationary and random, such algorithms generally have local extremum problems. The DCT, PGA, and ROPE methods align the range envelopes and successively adjust the phase to compensate for the translational motion. The DCT and PGA [32] methods are not model-based and exhibit robust performance. Nevertheless, their compensation accuracy is unsatisfactory if there are high-frequency and random phase errors. Furthermore, the model-based ROPE method assumes that each range bin contains no more than one scattering center. The main idea of the method is to use the phase finite difference to estimate the phase error, to find the phase average by alternating along the range direction and the azimuth direction, and to estimate both the phase error and the Doppler frequency. In addition, the high azimuth resolution in ISAR processing is obtained from the Doppler frequency gradient generated by the rotation of the target relative to the radar line of sight (RLOS) [33]. By averaging the estimated Doppler frequency over all range units, the phase error is obtained more accurately, with the influence of the rotational phase weakened, ultimately yielding a more precise compensation for the translational phase error. The most remarkable feature of the method is that it estimates not only the phase error of arbitrary order but also the wideband phase error. However, the ROPE method still has some flaws: the model-based ROPE algorithm strictly requires that there be at most one strong scattering point in each range bin, which limits its application to the many images that do not approximate the model; the performance of the method with respect to the phase error estimation is greatly reduced under low SNR; and the ROPE method uses zero as the initial estimate of the Doppler frequency, which is blind and may lead to unsatisfactory estimates.

Motivated by these observations, in this paper, a refocusing method named IROPE is proposed to solve the above problems existing in ROPE. Our contributions are summarized as follows:


The Gaofen-3 (GF-3) satellite is the first Chinese C-band multi-polarization high-resolution SAR imaging satellite [34]. As one of the most important satellites in China's Earth observation systems, GF-3's features include high resolution, large imaging swath, multiple imaging modes, and long operating life [35,36]. GF-3 plays an essential role in the fields of marine environment monitoring, land resource investigation, and disaster prevention, providing high-quality data for scientific experiments. This paper uses GF-3 ocean data to demonstrate the proposed method and to show its potential on real satellite data.

This paper is organized as follows. The moving signal model in SAR and ISAR systems is presented in Section 2. Section 3 proposes a phase estimation algorithm named IROPE and elaborates the performance. Extensive experimental results on both simulated and real data are presented in Section 4 to demonstrate the effectiveness and robustness of the proposed method. Finally, Section 5 concludes the paper.

#### **2. Moving Signal Model**

In this section, the signal models of SAR and ISAR are presented. When introducing the SAR signal model, the moving echo signal characteristics and the influence of motion parameters are analyzed. Furthermore, the translation and rotation Doppler shifts involved in the ISAR signal model are analyzed.

#### *2.1. SAR Signal Model*

#### 2.1.1. Analysis of Moving Echo Characteristics

The geometry of SAR imaging of the moving target is shown in Figure 1. The target is described in Cartesian coordinates with the initial position at *P*(*x*0, *y*0, 0). The SAR platform moves along the predetermined track, where *va* and *h* represent the velocity and height, respectively. *vx*, *ax*, *vy*, *ay*, *vr*, and *ar* are the observed target's velocities and accelerations in the azimuth, range and RLOS directions. *t* is the slow time, and the distance from point *P* to the radar platform is *Rc*, with *R<sub>c</sub>*<sup>2</sup> = *y*<sub>0</sub><sup>2</sup> + *h*<sup>2</sup>. *R*<sub>0</sub> is the distance between the SAR and the target at the initial time, with *R*<sub>0</sub><sup>2</sup> = *x*<sub>0</sub><sup>2</sup> + *R<sub>c</sub>*<sup>2</sup>. The target moves to *P*(*xt*, *yt*, 0) at time *t*, and the distance between the SAR and the target is *R*(*t*).

**Figure 1.** Moving target SAR geometry.

Letting *v̂* = *va* − *vx*, the square of the slant range *R*(*t*) is described as

$$\begin{split} R(t)^2 &= h^2 + (v\_a t - x\_0 - v\_x t - \frac{1}{2}a\_x t^2)^2 + (y\_0 + v\_y t + \frac{1}{2}a\_y t^2)^2 \\ &= R\_0^2 + (\hat{v}t - \frac{1}{2}a\_x t^2)^2 - 2x\_0(\hat{v}t - \frac{1}{2}a\_x t^2) + (v\_y t + \frac{1}{2}a\_y t^2)^2 + 2y\_0(v\_y t + \frac{1}{2}a\_y t^2) \end{split} \tag{1}$$

Equation (2) gives the Taylor series expansion of *R*(*t*), ignoring high-order terms (cubic or higher), where *v<sub>r</sub>*/*v<sub>y</sub>* = *y*<sub>0</sub>/*R*<sub>0</sub> and *a<sub>r</sub>*/*a<sub>y</sub>* = *y*<sub>0</sub>/*R*<sub>0</sub>. Equation (1) is simplified as

$$R(t) = R\_0 + \frac{1}{2R\_0}((\hat{v}^2 + v\_y^2 + a\_rR\_0 + x\_0a\_x)t^2 - 2x\_0\hat{v}t) + v\_rt\tag{2}$$

Accordingly, the Doppler phase *φ*(*t*) is

$$\begin{split} \phi(t) &= \frac{4\pi}{\lambda} R(t) \\ &= \frac{4\pi}{\lambda} (R\_0 + \frac{1}{2R\_0} ((\hat{v}^2 + v\_y^2 + a\_r R\_0 + x\_0 a\_x) t^2 - 2x\_0 \hat{v} t) + v\_r t) \end{split} \tag{3}$$

where *λ* represents the wavelength of the transmitted signal.

$$\begin{cases} f\_c = \left. \frac{-1}{2\pi} \frac{d\phi}{dt} \right|\_{t=0} = \frac{-2v\_r}{\lambda} + \frac{2x\_0 \hat{v}}{\lambda R\_0} \\ f\_r = \left. \frac{df\_c}{dt} \right|\_{t=0} = \frac{-2}{\lambda R\_0} (\hat{v}^2 + v\_y^2 + x\_0 a\_x + a\_r R\_0) \end{cases} \tag{4}$$

where *fc* represents the Doppler centroid frequency, and *fr* is the azimuth FM rate.

For stationary targets, *vx* = *vy* = *vr* = 0, *ax* = *ay* = *ar* = 0. Equation (4) is expressed as

$$\begin{cases} f\_{rc} = \frac{2x\_0 v\_a}{\lambda R\_0} \\ f\_{rr} = \frac{-2v\_a^2}{\lambda R\_0} \end{cases} \tag{5}$$

Then, the Doppler centroid frequency and the azimuth FM rate generated by the target's motion are

$$\begin{cases} f\_{lc} = \frac{-2v\_r R\_0 - 2x\_0 v\_x}{\lambda R\_0} \\ f\_{lr} = \frac{-2}{\lambda R\_0} (v\_x^2 - 2v\_a v\_x + v\_y^2 + a\_r R\_0 + x\_0 a\_x) \end{cases} \tag{6}$$
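Equations (4)–(6) can be checked numerically with a short sketch. The function names and all parameter values below are illustrative assumptions, not values from this paper; the only point is that the moving-target terms of Equation (6) equal the difference between Equation (4) and its stationary special case, Equation (5).

```python
def doppler_parameters(wavelength, R0, x0, va, vx=0.0, vy=0.0, vr=0.0, ax=0.0, ar=0.0):
    """Doppler centroid f_c and azimuth FM rate f_r following Equation (4)."""
    v_hat = va - vx  # relative azimuth velocity
    f_c = -2.0 * vr / wavelength + 2.0 * x0 * v_hat / (wavelength * R0)
    f_r = -2.0 / (wavelength * R0) * (v_hat**2 + vy**2 + x0 * ax + ar * R0)
    return f_c, f_r


def motion_induced_terms(wavelength, R0, x0, va, vx, vy, vr, ax, ar):
    """Centroid and FM-rate contributions of the target motion, Equation (6)."""
    d_fc = (-2.0 * vr * R0 - 2.0 * x0 * vx) / (wavelength * R0)
    d_fr = -2.0 / (wavelength * R0) * (vx**2 - 2.0 * va * vx + vy**2 + ar * R0 + x0 * ax)
    return d_fc, d_fr


# Hypothetical geometry and motion, chosen only to exercise the formulas.
geometry = dict(wavelength=0.056, R0=900e3, x0=500.0, va=7500.0)
motion = dict(vx=15.0, vy=3.0, vr=2.0, ax=2.0, ar=0.1)

fc_static, fr_static = doppler_parameters(**geometry)        # Equation (5)
fc_moving, fr_moving = doppler_parameters(**geometry, **motion)
d_fc, d_fr = motion_induced_terms(**geometry, **motion)
```

With these inputs, `fc_moving - fc_static` matches `d_fc` and `fr_moving - fr_static` matches `d_fr` up to floating-point rounding.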

#### 2.1.2. Analysis of Moving Target Response

Equation (7) gives Taylor series expansions of the phase error:

$$\begin{split} \Delta\phi(t) &= \frac{4\pi}{\lambda} \Delta R(t) \\ &\approx \frac{4\pi}{\lambda} \left(\Delta R(0) + \left.\frac{d\Delta R(t)}{dt}\right|\_{t=0} t + \left.\frac{d^2 \Delta R(t)}{dt^2}\right|\_{t=0} \frac{t^2}{2} + \dots\right) \end{split} \tag{7}$$

As mentioned in [37], first-order phase errors cause an azimuth positional offset of the target scattering point, i.e., position deviation. Quadratic phase errors cause target defocusing. Third-order phase errors mainly cause asymmetry of the sidelobe levels on the two sides of the main lobe; for strong targets, ghost-like artifacts appear in the image. Fourth-order phase errors mainly cause the sidelobe level to increase. Higher-order phase errors increase the integrated sidelobe level and have little effect on the main lobe width. Generally, due to the Doppler effect, the azimuthal motion of the ship results in blurred defocusing, and the range motion results in an additional shift of the image.
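The first two effects can be reproduced with a small numerical sketch. It assumes a unit-amplitude azimuth signal over a normalized aperture and arbitrary error magnitudes (10 cycles of linear phase, 4π rad of quadratic phase at the aperture edge); none of these values come from the paper.

```python
import numpy as np

n = 256
pad = 8
t = np.linspace(-0.5, 0.5, n, endpoint=False)  # normalized slow time


def azimuth_response(phase_error):
    """Magnitude of the zero-padded azimuth spectrum for a given phase error (rad)."""
    signal = np.exp(1j * phase_error)
    return np.abs(np.fft.fftshift(np.fft.fft(signal, pad * n)))


ideal = azimuth_response(np.zeros(n))
linear = azimuth_response(2 * np.pi * 10 * t)          # first-order error
quadratic = azimuth_response(4 * np.pi * (2 * t) ** 2)  # quadratic error

# First-order error: the peak moves, expressed here in unpadded Doppler bins.
peak_shift_bins = (np.argmax(linear) - np.argmax(ideal)) / pad
# Quadratic error: the peak drops because the energy is smeared (defocusing).
peak_loss_db = 20 * np.log10(quadratic.max() / ideal.max())
```

The linear term only displaces the peak, while the quadratic term lowers and broadens it, which is the defocusing that the refocusing method has to remove.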

#### *2.2. ISAR Signal Model*

Assuming that the number of scattering centers is K, the range-compressed data are represented as [37]

$$\mathbf{s}(\tau, t) = \sum\_{k=1}^{K} A\_k \rho\_r(\tau - 2r(t)/c) \omega\_a(t - t\_c) e^{-j\frac{4\pi f\_0 r(t)}{c}} \tag{8}$$

Here, *Ak* represents the backscattering coefficient of scatterer *k*. *f*<sub>0</sub> is the carrier frequency of the system, and *τ*, *t* and *tc* denote the fast time, slow time, and beam center offset time, respectively. The distance of point P from the radar is *r*(*t*). *ρ<sub>r</sub>* represents the range envelope (a sinc function) and *ω<sub>a</sub>* represents the azimuth envelope (a sinc-squared function) [38].

The Doppler effect of the target's motion is described in the geometry in Figure 2. From Equation (8), motion compensation removes the phase term *exp*(−*j*4*π f*0*r*(*t*)/*c*). Assuming that the middle of the target is the origin O, *r*(*t*) is expressed as

$$r(t) \approx R(t) + x \cos \theta(t) + y \sin \theta(t) \tag{9}$$

**Figure 2.** Geometry of the ISAR system.

Here, *R*(*t*) is the target's translational range distance from the radar and *θ*(*t*) represents the rotational angle of the target with respect to the RLOS axis, *u*. Equation (10) gives the Taylor series expansions of *R*(*t*) and *θ*(*t*) and ignores high-order items (cubic or higher):

$$\begin{aligned} R(t) &\approx R\_0 + v\_t t + \frac{1}{2}a\_t t^2 + \cdots \\ \theta(t) &\approx \theta\_0 + \omega\_r t + \frac{1}{2}\alpha\_r t^2 + \cdots \end{aligned} \tag{10}$$

*R*<sub>0</sub> is the initial range of the target, and *v<sub>t</sub>* and *a<sub>t</sub>* are the target's translational velocity and acceleration, respectively. Similarly, *θ*<sub>0</sub> is the initial angle of the target with respect to the RLOS axis. *ω<sub>r</sub>* and *α<sub>r</sub>* are the angular velocity and acceleration of the target, respectively.

The echoes of the *k*th range bin are expressed as:

$$s\_k(\tau, t) = A\_k \rho\_r(\tau - 2r(t)/c) \omega\_a(t - t\_c) \phi\_t \phi\_r \tag{11}$$

where *φ<sub>t</sub>* and *φ<sub>r</sub>* are the phase terms caused by the translational and rotational movement of the target, respectively.

$$\begin{cases} \phi\_t = e^{-j\frac{4\pi f\_0}{c}(R\_0 + v\_t t + \frac{1}{2}a\_t t^2 + \cdots)} \\ \phi\_r = e^{-j\frac{4\pi f\_0}{c}\sqrt{x^2 + y^2}\,\sin(\beta + \theta\_0 + \omega\_r t + \frac{1}{2}\alpha\_r t^2 + \cdots)} \end{cases} \tag{12}$$

Here, sin *β* = *x*/√(*x*<sup>2</sup> + *y*<sup>2</sup>). The imaging process of the following section removes the influence of the translational phase, including *v<sub>t</sub>* and *a<sub>t</sub>*, which offers no contribution to ISAR imaging [39].
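For completeness, the rotational phase in Equation (12) follows from Equation (9) by a standard angle-addition step. The short derivation below assumes, in addition to the stated definition of sin *β*, that cos *β* = *y*/√(*x*<sup>2</sup> + *y*<sup>2</sup>):

$$x\cos\theta + y\sin\theta = \sqrt{x^2 + y^2}\,(\sin\beta\cos\theta + \cos\beta\sin\theta) = \sqrt{x^2 + y^2}\,\sin(\beta + \theta)$$

Substituting the expansion of *θ*(*t*) from Equation (10) into the right-hand side gives the argument of the sine in Equation (12).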

#### **3. Improved Rank-One Phase Estimation Algorithm**

#### *3.1. Problems of the Rank-One Phase Estimation Algorithm*

The ROPE method, first developed in [30], estimates and removes the phase error, which guarantees that range-Doppler (RD) imaging can proceed in the usual manner. The algorithm has been modified by [31] to extend its scope of application (including ISAR processing), but some limitations remain. First, the model-based ROPE algorithm strictly requires that there be at most one strong scattering point for each range bin, which limits its application to many images that do not fit the model. Second, the performance of the algorithm decreases sharply under low SNR. Third, blind initialization of the Doppler frequency to zero results in inaccurate estimation. These problems limit the performance and application of the ROPE algorithm. To solve them, the IROPE algorithm is proposed and explained in detail as follows.

#### *3.2. Principle of IROPE*

This subsection explains, with the relevant formulas, the improvements introduced by IROPE and the reasons behind them.

#### I. Preliminary Phase Compensation

First, the range-aligned echo signal *e*(*r*) is subjected to preliminary phase estimation and compensation using the DCT algorithm.

Multiply the conjugate of the *i*th echo with the next echo to find the average phase difference between adjacent echoes, taken over all range units:

$$e^{j\varphi} = \frac{\int e\_i^\*(r)\,e\_{i+1}(r)\, dr}{\int |e\_i(r)\,e\_{i+1}(r)| \, dr} \tag{13}$$

Use *e*<sup>*jϕ*</sup> to adjust the phase of *e*<sub>*i*+1</sub>(*r*) so that the average phase shift between adjacent one-dimensional range profiles is zero. This is equivalent to aligning the target to a phase center about which the average Doppler shift of the rotating target is zero, thus eliminating the effect of the remaining phase difference.

Through the preliminary phase correction, the SNR is improved, the subsequent estimation becomes more accurate, and the processed image is more consistent with the signal model. After the initial phase correction in this step, the peak value of the special point obtained by Fourier transform will be sharper than the original, which improves the effect of the subsequent circular shifting step.
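The preliminary compensation of Equation (13) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name and array layout (range bins along rows, pulses along columns) are assumptions, and the sketch removes the estimated phase by multiplying with its conjugate, which is one reading of "adjust" in the text.

```python
import numpy as np


def preliminary_phase_compensation(echo):
    """DCT-style pulse-to-pulse phase alignment (sketch of Equation (13)).

    echo: complex array of shape (n_range, n_pulses), already range-aligned.
    Returns a compensated copy of the input.
    """
    out = np.array(echo, dtype=complex)
    for i in range(out.shape[1] - 1):
        cur, nxt = out[:, i], out[:, i + 1]
        den = np.sum(np.abs(cur * nxt))
        if den == 0:
            continue  # no energy in this pulse pair, nothing to align
        phasor = np.sum(np.conj(cur) * nxt) / den  # approximately e^{j*phi}
        phi = np.angle(phasor)
        # Assumption: compensate by removing phi from the next pulse.
        out[:, i + 1] = nxt * np.exp(-1j * phi)
    return out
```

Because each pulse is aligned to its already-compensated predecessor, a pure pulse-to-pulse phase error is removed completely in the noise-free case; with noise, the averaging over range bins in Equation (13) is what stabilizes the estimate.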

#### II. Two-step Convergence

Next, assume that there are *J* range cells and that each range cell contains no more than one scattering center. The signal of the *j*th range cell is given by

$$s\_{k,j} = a\_j \exp\left[i\left(\omega\_j k \Delta T + \varepsilon\_k + \alpha\_j\right)\right] + n\_{k,j} \tag{14}$$

where *a<sub>j</sub>* is the amplitude of the signal, *α<sub>j</sub>* is a constant phase term, *k* is the azimuth pulse number, Δ*T* is the pulse period, *ε<sub>k</sub>* is the phase error, *n*<sub>*k*,*j*</sub> is the additive complex noise, and *ω<sub>j</sub>*/2*π* is the Doppler position after imaging.

The phase in each pixel of the range-time array is

$$
\varphi\_{k,j} = \omega\_j k \Delta T + \varepsilon\_k + \alpha\_j \tag{15}
$$

The difference of Equation (15) between adjacent pulses is

$$
\hat{\varphi}\_{k,j} = \varphi\_{k+1,j} - \varphi\_{k,j} = \omega\_j \Delta T + \varepsilon\_{k+1} - \varepsilon\_k \tag{16}
$$

Let *ω̂<sub>j</sub>* = *ω<sub>j</sub>*Δ*T* so that

$$
\hat{\varphi}\_{k,j} = \varphi\_{k+1,j} - \varphi\_{k,j} = \hat{\omega}\_j + \varepsilon\_{k+1} - \varepsilon\_k \tag{17}
$$

When the SNR is infinite, i.e., *n*<sub>*k*,*j*</sub> = 0,

$$D\_{k,j} = \frac{s\_{k+1,j} s\_{k,j}^\*}{\left| s\_{k+1,j} \right| \left| s\_{k,j}^\* \right|} = \exp\left[i\left(\hat{\omega}\_j + \varepsilon\_{k+1} - \varepsilon\_k\right)\right] = \exp\left[i\hat{\varphi}\_{k,j}\right] = \exp\left[i\hat{\omega}\_j\right] \exp\left[i\left(\varepsilon\_{k+1} - \varepsilon\_k\right)\right] \tag{18}$$

Each *D*<sub>*k*,*j*</sub> is the product of an element of a column vector and an element of a row vector, so *D* = [*D*<sub>*k*,*j*</sub>] is a rank-one matrix.

Let *ε̂<sub>k</sub>*<sup>(*p*)</sup> denote the *p*th estimate of *ε*<sub>*k*+1</sub> − *ε<sub>k</sub>*. After initialization, the ROPE method consists of the following two operations:

$$\begin{cases} \hat{\varepsilon}\_k^{(p)} = \angle \sum\_{j=1}^{J} D\_{k,j} \exp\left(-i2\pi\hat{\omega}\_j^{(p-1)}\right) \\ \hat{\omega}\_j^{(p)} = \angle \sum\_{k=1}^{K} D\_{k,j} \exp\left(-i2\pi\hat{\varepsilon}\_k^{(p)}\right) \end{cases} \tag{19}$$

The superscript *p* denotes the *p*th operation. The process is considered to have converged when the largest change in the two estimates falls below a small threshold *T*. When the SNR is high, noise makes the rank of the matrix *D* only approximately equal to one, but the two-dimensional alternating estimation is still reasonable and feasible. Nevertheless, when the SNR is low, *n*<sub>*k*,*j*</sub> cannot be ignored, *D* is not a rank-one matrix, and the ROPE algorithm fails. The final phase error estimate is

$$\hat{\varepsilon} = \sum\_{k=1}^{K} \hat{\varepsilon}\_k^{(p)} \tag{20}$$

Equation (19) estimates the phase error and Doppler frequency simultaneously, which estimates the phase error more accurately and avoids the influence of the phase rotation component.

However, the initial phase error *ε̂*<sub>1</sub> in Equation (20) of the original ROPE algorithm is simply set to zero, which is relatively blind and results in unsatisfactory estimation.
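The alternating estimation of Equation (19) can be sketched in a few lines. This is a hedged illustration, not the reference implementation: the function name, the stopping rule on unit phasors and the iteration cap are assumptions, and all phases are kept in radians in line with Equation (18), so no additional scaling factor is applied inside the exponentials.

```python
import numpy as np


def rope_two_step(D, tol=1e-6, max_iter=100):
    """Alternating rank-one phase estimation (sketch of Equation (19)).

    D: complex array of shape (K, J) holding unit-modulus pulse-to-pulse
       phase differences, D[k, j] ~ exp(i * (omega_j + eps_{k+1} - eps_k)).
    Returns (eps_diff, omega) in radians.
    """
    K, J = D.shape
    eps_diff = np.zeros(K)
    omega = np.zeros(J)  # zero initialization, as in the original ROPE
    for _ in range(max_iter):
        # Average over range cells after removing the current Doppler estimate.
        eps_new = np.angle(D @ np.exp(-1j * omega))
        # Average over pulses after removing the current phase-error estimate.
        omega_new = np.angle(np.exp(-1j * eps_new) @ D)
        # Compare unit phasors so that 2*pi wraps do not count as changes.
        change = max(
            np.max(np.abs(np.exp(1j * eps_new) - np.exp(1j * eps_diff))),
            np.max(np.abs(np.exp(1j * omega_new) - np.exp(1j * omega))),
        )
        eps_diff, omega = eps_new, omega_new
        if change < tol:
            break
    return eps_diff, omega
```

Note that the two estimates are only determined up to a common constant: adding a constant to every `omega[j]` and subtracting it from every `eps_diff[k]` leaves `D` unchanged. In the noise-free rank-one case the outer product of the estimated phasors reproduces `D`.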

#### III. Circular Shifting

The maximum value of the range bin still represents the Doppler frequency corresponding to the strong scattering point, but the energy of the scattering point diffuses in the azimuth direction. Moving the strongest scattering point to zero Doppler frequency eliminates the deficiency of setting the initial Doppler frequency to zero to some extent. The circular shift operation not only aligns the strong scatterers but also improves the SNR of the phase compensation; subsequently, the processed data are more consistent with the model.
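The circular shift can be sketched as below. The function name and the array layout are assumptions made for illustration; the sketch forms the azimuth spectrum of each range bin with an inverse FFT, locates the strongest point, and rotates it to the zero-Doppler position.

```python
import numpy as np


def circular_shift_to_zero_doppler(echo):
    """Move the strongest scatterer of each range bin to zero Doppler.

    echo: complex array of shape (n_range, n_pulses).
    Returns (shifted_image, peak_indices); shifted_image is in the
    azimuth-frequency domain.
    """
    image = np.fft.ifft(echo, axis=1)           # azimuth compression
    peaks = np.argmax(np.abs(image), axis=1)    # prominent point per range bin
    shifted = np.empty_like(image)
    for j, p in enumerate(peaks):
        shifted[j] = np.roll(image[j], -p)      # peak goes to index 0
    return shifted, peaks


# Usage: transform back to the pulse domain before the two-step estimation.
# shifted, _ = circular_shift_to_zero_doppler(echo)
# echo_shifted = np.fft.fft(shifted, axis=1)
```

After the shift, the dominant Doppler term of every range bin is close to zero, so the zero initial Doppler estimate used by the two-step convergence becomes a much better starting point.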

#### IV. Iteration

Nevertheless, it is difficult to accurately align the Doppler circular shift when the SNR is low, which affects the estimation accuracy. Therefore, multiple iterations are then used to improve the SNR, thereby further improving the accuracy of the Doppler circular shift and the estimation of the phase error.

The algorithm is summarized in Algorithm 1.

**Algorithm 1:** The IROPE algorithm for phase compensation

```
Input: range-aligned echo e(r), [Nr, Na] = size(e(r)), p = 0, number of iterations l, threshold value T
I. Preliminary Phase Compensation
for i = 1 : Na − 1
    e_{i+1}(r) = e_{i+1}(r) .* e^{jϕ}        (ϕ from Equation (13))
end
IV. Iteration
for n = 1 : l        (image entropy is further applied to control the iteration process)
    II. Two-step Convergence
    III. Circular Shifting
    while max_k |ε̂_k^(p) − ε̂_k^(p−1)| > T
        for k = 1 : Na − 1
            update ε̂_k^(p) by Equation (19)
        end
        for j = 1 : Nr
            update ω̂_j^(p) by Equation (19)
        end
        p = p + 1
    end while
    ε̂ = Σ_{k=1}^{K} ε̂_k^(p)
    e(r) = e(r) .* exp(−1i * ε̂)
end
Output: compensated range-Doppler echo e(r), phase error ε̂
```
The flow chart of the IROPE procedure is shown in Figure 3 and described in detail as follows.

**Figure 3.** Block diagram of the IROPE procedure.

Step 1: Use the DCT method to perform initial phase correction on the echo data after range tracking for preprocessing.

Step 2: Perform an IFFT in the azimuth direction to generate an ISAR image.

Step 3: In each range bin, find the point of maximum amplitude and circularly shift this prominent point to the initial zero-Doppler position.

Step 4: By performing azimuth FFT, the data are transformed to the range-Doppler domain.

Step 5: Use the two-step convergence approach to obtain and compensate for the phase error.

Step 6: If the effect of refocusing is not sufficient, repeat the process from Step 2 to Step 5.

#### *3.3. Performance of IROPE*

This subsection presents the experimental results based on range-aligned echoes to illustrate the performance of IROPE. The radar operates in the X band. The transmitted signal bandwidth and the synthetic aperture time are 100 MHz and 3.32 s, respectively. The target undergoes translational motion with a range velocity of *vy* = 6 m/s, an azimuth velocity of *vx* = 15 m/s, and an azimuth acceleration of *ax* = 2 m/s<sup>2</sup>. A sinusoidal error term caused by the target's rotation is assumed and chosen as (0.5*π*/180) sin(0.6*t*) rad, where the rotational velocity is 0.6 rad/s and *t* represents the azimuth observation time. The pulse repetition frequency (PRF) and the radar velocity are 600 Hz and 250 m/s, respectively.

*Example 1* Figure 4 shows the experimental results corresponding to a single moving point target. It can be seen that the point is well focused by the ROPE and IROPE methods. Comparing the interpolated contour and the value of PSLR, IROPE exhibits slight superiority over ROPE. (The technical indicators shown in the figure are explained as follows: impulse response width (IRW), namely the 3 dB main lobe width of impulse response; peak sidelobe ratio (PSLR), the height ratio of the maximum sidelobe to the main lobe; and integrated sidelobe ratio (ISLR)).
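As an illustration of the PSLR indicator defined above, the following sketch measures it on an ideal, unweighted point response. The helper name, the null-search strategy and the test signal are assumptions made for illustration only; they are not the evaluation code used for Figure 4.

```python
import numpy as np


def pslr_db(profile):
    """Peak sidelobe ratio (dB) of a 1D response with a single main lobe."""
    mag = np.abs(profile)
    peak = int(np.argmax(mag))
    # Walk down both sides of the main lobe until the first local minimum.
    left = peak
    while left > 0 and mag[left - 1] < mag[left]:
        left -= 1
    right = peak
    while right < mag.size - 1 and mag[right + 1] < mag[right]:
        right += 1
    sidelobes = np.concatenate([mag[:left], mag[right + 1:]])
    return 20 * np.log10(sidelobes.max() / mag[peak])


# Ideal response of an unweighted aperture (zero-padded for interpolation).
n, pad = 64, 16
response = np.fft.fftshift(np.fft.fft(np.ones(n), n * pad))
ideal_pslr = pslr_db(response)
```

For this unweighted aperture the value lies close to the familiar −13.3 dB of a sinc-like response, which gives a reference point for the values reported for the refocused targets.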

*Example 2* Figure 5 shows the experimental results for a simulated ship, i.e., multiple-point targets, in which each range bin of the ship's hull has three strong scattering points with the same strength. It can be seen from Figure 5b that the ROPE algorithm fails because the data do not satisfy the model, in which each range cell contains no more than one scattering center. According to the analysis in the above subsection, IROPE compensates for the deficiency of ROPE and obtains a good focusing effect.

**Figure 4.** The interpolated contour and azimuth profile. (**a**,**d**) Conventional SAR processing; (**b**,**e**) ROPE; (**c**,**f**) IROPE.

**Figure 5.** Refocused performance. (**a**) Conventional SAR processing; (**b**) ROPE; (**c**) IROPE.

Figure 6a shows that the image entropy (defined in Equation (21)) decreases with the increase of IROPE's iteration times, which proves that iteration improves the image focusing effect. Figure 6b exhibits the image entropy processed by different algorithms, which changes with SNR. The results prove the robust performance with respect to the noise of the IROPE algorithm.

The phase error estimation performance is provided in the next section.

**Figure 6.** Variation in image entropy in terms of iterations (**a**) and SNRs (**b**).

#### *3.4. The Whole Process of the Refocusing Method*

The whole imaging flow chart is shown in Figure 7.

**Figure 7.** Block scheme of the whole process of the refocusing method.

First, the separated echo signal is obtained. The echo data are obtained from the original raw echo data or complex image data. The former approach implements range compression on the original raw echo data, while the latter performs azimuth inverse compression on the selected complex image data of the target of interest to acquire the required echo data for subsequent operations.

Second, range tracking is performed on echo data. Using the cross-correlation [40] of the average range profile, a real-time and efficient method, the ship echo is correlated with the first echo in the imaging time. In addition, through range alignment, the range units of the echoes are aligned, and the amplitude and phase changes of the echo range sequence of each range unit are normal. Eventually, the phase change process generated by the target translation is retained.
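A minimal sketch of cross-correlation range alignment is given below. It is an illustration under simplifying assumptions rather than the method of [40]: every pulse is aligned to the first pulse (no averaged reference profile), shifts are restricted to whole range bins, and the correlation is circular. The function name is hypothetical.

```python
import numpy as np


def align_to_first_pulse(echo):
    """Integer-bin range alignment by circular cross-correlation.

    echo: complex array of shape (n_range, n_pulses), range-compressed.
    Returns (aligned, shifts), where shifts[i] is the circular shift
    applied to pulse i.
    """
    ref_spec = np.fft.fft(np.abs(echo[:, 0]))
    aligned = np.array(echo, dtype=complex)
    shifts = np.zeros(echo.shape[1], dtype=int)
    for i in range(1, echo.shape[1]):
        spec = np.fft.fft(np.abs(echo[:, i]))
        # Circular cross-correlation of the magnitude profiles via the FFT.
        xcorr = np.fft.ifft(ref_spec * np.conj(spec)).real
        shifts[i] = int(np.argmax(xcorr))
        aligned[:, i] = np.roll(echo[:, i], shifts[i])
    return aligned, shifts
```

Only the envelopes are used for the correlation, so the pulse-to-pulse phase history is left untouched for the subsequent phase compensation.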

Furthermore, we apply IROPE to estimate the phase error *ε̂<sub>k</sub>* and obtain the compensated range-Doppler echo *e*(*r*).

Next, we combine the conventional RD algorithm and use the Hamming window to achieve well-focused images and suppress sidelobes. The conventional RD algorithm is applied to obtain focused ISAR images if the maritime target moves smoothly. However, if the target maneuvers or undergoes significant angular motions (roll, pitch, and yaw), the RD technique does not function properly, and the time-frequency analysis [41] method is a better choice.

#### **4. Experiments and Performance Comparisons**

In this section, the robustness and effectiveness of the proposed method are verified by simulation experiments. Then, the results based on the spaceborne SAR data acquired by the GF-3 SAR system are demonstrated.

#### *4.1. Results of Spotlight Simulation*

The proposed method is applied to spotlight simulation data and analyzed for different motions of the ship target by comparing it with other refocusing algorithms. The basic parameters of the spotlight simulation are shown in Table 1.


**Table 1.** Parameters of spotlight simulation.

#### 4.1.1. Ship Target with Velocity and Acceleration

The target is moving away from the radar with a range velocity of *vy* = 3 m/s, an azimuth velocity of *vx* = 15 m/s and an azimuth acceleration of *ax* = 2 m/s<sup>2</sup>. As mentioned in Sections 1 and 2, the quadratic phase errors caused by velocity lead to image defocusing; the cubic phase error introduced by acceleration mainly causes asymmetry of the sidelobe levels on the two sides of the main lobe [37]. The conventional SAR image and the recovered images of MD, ROPE and the proposed algorithm are shown in Figure 8. It can be seen from Figure 8c,d that the MD algorithm only compensates for the quadratic phase errors caused by velocity but cannot compensate for the cubic phase error caused by acceleration. The image processed by the ROPE algorithm has high sidelobe energy, as shown in Figure 8e,f. Based on the above analysis, the image quality of the proposed method is superior to that of the other methods.

Next, we discuss how much non-uniform motion can be handled effectively during the long CPI, i.e., the limitation in azimuth linear acceleration. The azimuth linear acceleration varies from −2 to 6 m/s<sup>2</sup> in steps of 2 m/s<sup>2</sup> [42]. The variations in the magnitudes of PSLR and ISLR are shown in Figure 9. It can be seen that the magnitudes of PSLR and ISLR fluctuate with increasing acceleration, and the overall trend is upward. The maximum magnitude of PSLR is below −14 dB, while the maximum magnitude of ISLR is below −9 dB, which means that the algorithm is suitable for practical situations.

**Figure 8.** Recovered images of defocused ship target with velocity and acceleration. (**a,b**) Conventional SAR processing system; (**c,d**) MD method; (**e,f**) ROPE method; (**g,h**) The proposed method.

**Figure 9.** Variations in the magnitudes of PSLR and ISLR in terms of the linear acceleration based on the proposed method. (**a**) PSLR; (**b**) ISLR.

#### 4.1.2. Ship Target with Translation and Rotation

The target is moving away from the radar with a range velocity of *vy* = 3 m/s and an azimuth velocity of *vx* = 15 m/s. We assume that there is a sinusoidal error term caused by the target's rotation, which is chosen as (0.5*π*/180) sin(0.6*t*) rad. High-frequency and wideband phase errors introduced by rotational velocity mainly affect the sidelobe region and increase the sidelobe level [37]. The conventional SAR image and the recovered images of MD, ROPE and the proposed algorithm are shown in Figure 10. It can be seen from Figure 10c,d that the MD algorithm only compensates for the quadratic phase errors caused by velocity and cannot compensate for the high-frequency and wideband phase errors introduced by rotational velocity. In Figure 10e,f, processed by the ROPE method, the main lobe is almost submerged by the sidelobes. Comparing the values of PSLR and ISLR for the different algorithms, the image quality of the proposed algorithm is again superior to that of the other approaches.

**Figure 10.** Recovered images of defocused ship target with translation and rotation speed. (**a**,**b**) Conventional SAR processing system; (**c**,**d**) MD method; (**e**,**f**) ROPE method; (**g**,**h**) The proposed method.

Moreover, assume that *ω<sub>r</sub>* = (0.5*π*/180) sin(*wt*) rad, where *w* varies from 0 to 1 rad/s in steps of 0.1 rad/s [43]. The experimental results are shown in Figure 11. It can be seen that the magnitudes of PSLR and ISLR fluctuate with increasing rotational angular velocity, and the overall trend is upward. The maximum magnitude of PSLR is below −14 dB, while the maximum magnitude of ISLR is below −9.5 dB, which means that the algorithm is suitable for practical situations.

**Figure 11.** Variation in the magnitudes of PSLR and ISLR in terms of the rotational angular velocity based on the proposed method. (**a**) PSLR; (**b**) ISLR.

#### *4.2. Spaceborne SAR Data Experiments*

In this subsection, the results based on the spaceborne SAR data acquired by the GF-3 SAR system are demonstrated. The parameters are shown in Table 2. The synthetic aperture time has a relatively large value of 8.58 s. In addition to visual inspection, image quality is also evaluated by entropy, contrast and the peak value of the intensity image [44], which are referred to as image quality evaluation metrics (IQEMs).


**Table 2.** Parameters of the GF-3 SAR System.

Let *I*(*m*, *n*) be the absolute value of a two-dimensional complex image, where *m* is the range sample number and *n* is the azimuth number. The image entropy (IE) [23] is written as follows:

$$IE(I) = -\sum\_{m=1}^{M} \sum\_{n=1}^{N} \frac{|I(m,n)|^2}{\alpha\_I} \ln \frac{|I(m,n)|^2}{\alpha\_I} \tag{21}$$

where *α<sup>I</sup>* is the total energy of the image, explained as follows:

$$\alpha\_I = \sum\_{m=1}^{M} \sum\_{n=1}^{N} |I(m,n)|^2 \tag{22}$$

When the image is well focused, the entropy value is small because the energy is concentrated in a few pixels rather than spread uniformly over the image.
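Equations (21) and (22) translate directly into a few lines of code. The sketch below is illustrative; the function name is an assumption, and zero-energy pixels are skipped because their contribution to the sum tends to zero.

```python
import numpy as np


def image_entropy(image):
    """Image entropy of Equation (21), with the total energy of Equation (22)."""
    power = np.abs(image) ** 2
    total = power.sum()                # alpha_I in Equation (22)
    p = power[power > 0] / total       # normalized energy distribution
    return float(-np.sum(p * np.log(p)))


# Two limiting cases: all energy in one pixel versus energy spread evenly.
focused = np.zeros((4, 4))
focused[1, 2] = 5.0
spread = np.ones((4, 4))
entropy_focused = image_entropy(focused)
entropy_spread = image_entropy(spread)
```

The single-pixel image gives an entropy of zero, while the evenly spread image gives ln(16), the largest possible value for 16 pixels, which matches the interpretation that lower entropy means better focus.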

The image contrast (IC) is defined as follows [45]:

$$IC(I) = \frac{\sqrt{E\{[I(m,n) - E\{I(m,n)\}]^2\}}}{E\{I(m,n)\}}\tag{23}$$

When the image is focused correctly, it comprises several significant peaks, which enhances the contrast.

The peak value of the intensity image (IP) is an indicator of the image focusing of a local area of the image, and the calculation expression is

$$IP(I) = 10 \cdot \log\_{10} (\max(I(m, n)))\tag{24}$$

The larger the IP is, the better the image focus is.
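The contrast and peak metrics of Equations (23) and (24) can be sketched in the same way. The function names are assumptions, the expectation *E*{·} is taken as the mean over all pixels, and the peak metric follows Equation (24) as written, i.e., it is applied to the absolute value *I*(*m*, *n*).

```python
import numpy as np


def image_contrast(image):
    """Image contrast of Equation (23): standard deviation over mean of |I|."""
    amplitude = np.abs(image)
    mean = amplitude.mean()
    return float(np.sqrt(np.mean((amplitude - mean) ** 2)) / mean)


def image_peak_db(image):
    """Peak value of Equation (24), in dB."""
    return float(10 * np.log10(np.abs(image).max()))


flat = np.ones((4, 4))
peaky = np.ones((4, 4))
peaky[0, 0] = 100.0
contrast_flat = image_contrast(flat)
contrast_peaky = image_contrast(peaky)
peak_db = image_peak_db(peaky)
```

A flat image has zero contrast, whereas a single dominant peak raises both the contrast and the IP, in line with the statement that larger values indicate better focusing.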

Hence, to present a more intuitive and quantitative comparison, Tables 3 and 4 provide the difference of IQEMs between the processed and the original images—contrast increase, entropy reduction, and IP increase. The higher the value is, the better the image quality is.

#### 4.2.1. Real Data Corrupted by Phase Error

Figures 12 and 13 show the experimental image results for phase error and demonstrate the capability of the proposed method to estimate the phase error. Quadratic, third-order and fourth-order phase errors of equal weight are superimposed on Figure 12a. Figure 12b shows the corrupted image. There is no improvement in the image after refocusing with the MD method, as can be seen from Figure 12c. The ROPE method exhibits a better focusing effect than the MD method does, but it is visually worse than the proposed method. A comparison of the image recovered using the proposed method with the original image shows very good agreement, although there are some slight differences. Figure 13 demonstrates that the phase error estimated by IROPE is closest to the introduced phase error. MD only compensates for the quadratic phase errors, so the estimated and true phase errors display considerable deviation. The phase error difference for ROPE also exhibits a large deviation because the actual image does not satisfy the model. The IROPE algorithm clearly succeeds; a bias exists, but, as Table 3 shows, the bias does not affect the quality of the recovered image.

**Figure 12.** Nominal, corrupted, and recovered images. (**a**) Nominal; (**b**) Corrupted by phase error; (**c**) Image recovered with MD method; (**d**) Image recovered with ROPE method; (**e**) Image recovered with the proposed method.

**Figure 13.** Phase error curve. (**a**) Introduced and estimated phase error; (**b**) Phase error difference.


**Table 3.** IQEMs of the corrupted and recovered images.

#### 4.2.2. Intrinsically Corrupted Real Data

Figure 14 illustrates C-band, spotlight GF-3 SAR images with an azimuth resolution of 1 m.

The imaged areas are centered at (E104.0, N1.3) and (E104.1, N1.3), respectively, located at the Port of Singapore near Changi Airport. This location is at the southern end of the Malay Peninsula, the entrance to the Straits of Malacca.

Three representative defocused ships, marked as Ship1, Ship2 and Ship3, are selected from Figure 14 for the experiment.

Figure 15 gives the original and refocused images, in which the first column shows the original images and the second, third, and fourth columns present the images obtained by MD, ROPE, and IROPE, respectively. In the defocused images of Ship1 and Ship2, the outlines of these large petroleum tankers are vague, and the details are unrecognizable. The wake of Ship3 is obvious due to the smooth sea conditions and the high speed. The visual quality of every original image is improved after refocusing; the details and contours of the targets are apparent, which is more conducive to subsequent use. Nevertheless, the images processed by the ROPE method are not well focused and remain slightly blurry. A detailed look reveals that the edge of every ship is processed worse by MD than by the proposed method. The values of the quality metrics in Table 4 indicate that the ship reconstructed by the proposed method is better focused than that reconstructed by the other methods. The entropy, contrast, and IP are superior in our case.

**Figure 14.** GF-3 SAR image of the Port of Singapore. The yellow rectangles are the enlarged defocused sub-images. (**a**) (E104.0, N1.3); (**b**) (E104.1, N1.3).

**Figure 15.** Original and refocused images. (**a**–**d**) Ship1; (**e**–**h**) Ship2; (**i**–**l**) Ship3.

The experimental results for real data demonstrate the effectiveness and superiority of the IROPE algorithm. The three ships used in the experiment are characterized by various phase errors, with multiple strong scattering points in each range bin under low SNR conditions. In these cases, the MD and ROPE algorithms fail, and the proposed method exhibits a better refocusing effect.

#### **5. Discussion**

In this paper, we studied how to refocus and accurately image a moving ship that cannot be focused by conventional SAR processing. The proposed method is compared with the MD [10,11] method and the ROPE [30,31] method, and the comparison result is shown in Table 5. Experimental results show that the phase error estimation performance and accuracy of MD and ROPE are unsatisfactory. The sub-aperture correlation operation of the MD method only compensates for quadratic phase errors, and real SAR images do not fit well with the narrowly defined ROPE model, which limits their applications.

In view of the above-mentioned problems, we have made improvements in preprocessing, circular shifting, and iteration based on two-step convergence. Specifically, preliminary phase compensation with the DCT method improves the SNR and the estimation accuracy; center-shifting the strongest scattering point of each range bin removes the shortcoming of having to set the initial zero Doppler; and several iterations further enhance the performance of the method. With these improvements, the proposed IROPE achieves more complete recovery of SAR images and extends the application scope of the ROPE method. Meanwhile, the experiment in which phase errors are added to the real data shows that IROPE estimates and compensates for arbitrary phase errors more accurately.

Moreover, many studies have been conducted on this topic [15,16,19,22,24]. Nevertheless, these methods are based on low-resolution, short-CPI simulation or airborne SAR data, and their application to high-resolution, long-CPI spaceborne SAR images requires further verification. In contrast, the proposed method processes high-resolution spaceborne SAR images and is verified with GF-3 satellite data. Our results suggest that it can be applied to spaceborne SAR systems with a long CPI (8.58 s). Furthermore, our work demonstrates its potential for high-resolution (1 m) SAR images.

However, the proposed method also has shortcomings. On the one hand, the dimensional classes of the ships in the experiments need further confirmation. On the other hand, although the proposed method allows a long coherent processing interval and performs well for maritime targets in stable sea conditions, it still needs further research and improvement for maneuvering targets in high sea states. Therefore, in the future, we will focus on solving these problems, for example by verifying the stated ship type against the AIS signals [27] and the optical photographs [46], and on exploring algorithms for maneuvering targets.


**Table 5.** Comparison with other refocusing algorithms.

#### **6. Conclusions**

During the detection of marine moving targets, both the SAR platform and the targets are in motion, and thus conventional SAR processing, which only images stationary targets, achieves unsatisfactory performance. In this paper, we combined SAR and ISAR techniques and proposed a hybrid SAR/ISAR approach, as the core of the IROPE, to refocus maritime moving targets. The proposed IROPE method overcomes the weakness of the original ROPE method. Experiments based on simulation and GF-3 measured data demonstrate the effectiveness of the proposed method. In summary, our work improves on the original ROPE method, and the proposed method is also suitable for high-resolution, long-CPI spaceborne radar.

**Author Contributions:** Conceptualization, Z.Y., H.Z. and Y.Z.; methodology, Z.Y., H.Z. and Y.Z.; software, Z.Y.; validation, Z.Y., H.Z. and Y.Z.; formal analysis, Z.Y.; investigation, Z.Y.; resources, Z.Y. and H.Z.; data curation, Z.Y. and H.Z.; writing—original draft preparation, Z.Y.; writing—review and editing, Z.Y., H.Z. and Y.Z.; visualization, Z.Y.; supervision, H.Z. and Y.Z.; project administration, H.Z. and Y.Z.; funding acquisition, Y.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported in part by the National Key Research and Development Program of China under Grant 2017YFB0502700 and in part by the Talent Fund of the Aerospace Information Research Institute for Distinguished Young Scholars under Grant Y9G01903AF.

**Acknowledgments:** The authors would like to thank the anonymous reviewers for their valuable suggestions and comments. The authors also thank the National Satellite Ocean Application Center in China for providing the GF-3 images. Many thanks to Da Liang, who provided great help and many valuable comments on the revision of this paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Azimuth Phase Center Adaptive Adjustment upon Reception for High-Resolution Wide-Swath Imaging**

#### **Wei Xu 1,2,\*, Jialuo Hu 1,2, Pingping Huang 1,2,\*, Weixian Tan 1,2 and Yifan Dong 1,2**


Received: 3 September 2019; Accepted: 30 September 2019; Published: 2 October 2019

**Abstract:** A spaceborne azimuth multichannel synthetic aperture radar (SAR) system can effectively realize high resolution wide swath (HRWS) imaging. However, the performance of this system is restricted by its two inherent defects. Firstly, non-uniform sampling is generated if the pulse repetition frequency (PRF) deviates from the optimum value. Secondly, multichannel systems are very sensitive to channel errors, which are difficult to completely eliminate. In this paper, we propose a novel receive antenna architecture with an azimuth phase center adaptive adjustment which adjusts the phase center position of each sub-aperture to improve multichannel SAR system performance. On one hand, the optimum value of the PRF can be adaptively adjusted within a certain range by adjusting receiving phase centers to obtain uniform azimuth sampling. On the other hand, false targets introduced by residual channel errors after azimuth multichannel error compensation can be further suppressed. The effectiveness of the proposed method to compensate for non-uniform sampling and suppress false targets is verified by simulation experiments.

**Keywords:** synthetic aperture radar (SAR); high resolution wide swath (HRWS); azimuth multichannel reconstruction; phase center adaptation; false targets suppression

#### **1. Introduction**

Synthetic aperture radar (SAR) is an extremely important tool for Earth observation, especially under cloud cover or poor atmospheric conditions [1–3]. It is also widely used in military surveillance and civilian remote sensing [4–7]. Azimuth multichannel spaceborne SAR systems, which usually work with a single transmit antenna and multiple receive sub-apertures, can effectively overcome the contradiction between azimuth resolution and range swath width [8–10]. However, in practical multichannel high resolution wide swath (HRWS) SAR systems, two problems affect SAR system performance. Firstly, azimuth non-uniform sampling arises if the pulse repetition frequency (PRF) is not set to the optimum value [11–13]. Secondly, the multichannel system is very sensitive to channel errors, which are difficult to completely eliminate.

Both issues have been studied extensively, but some problems remain. Firstly, azimuth non-uniform sampling in HRWS SAR can be overcome by recently proposed azimuth multichannel reconstruction algorithms, but these are designed only for azimuth band-limited signals [11,14–16]. The azimuth echo signal received by a practical SAR system is non-band-limited, so false targets still exist after reconstruction. To solve this problem, a novel transmit antenna architecture was proposed in [17]; it adjusts the transmit phase center position by activating a specific number of elements at the corresponding location of the transmit antenna. However, a decreased number of activated elements reduces the transmit antenna gain and the transmit signal power. Secondly, azimuth channel errors can be compensated for by many recently proposed compensation methods [18–20]. However, since factors that affect the antenna characteristics, such as the manufacturing process, temperature, and radiation, are not fixed, these methods are unable to perfectly estimate or compensate for multichannel errors, especially when the signal-to-noise ratio (SNR) of the obtained raw data is not high enough. The residual channel errors still cause strong false targets, which cannot be ignored and need to be further suppressed.

In this paper, a novel receiving antenna architecture is proposed that adjusts the receiving phase center position of each sub-aperture by closing the corresponding elements on the side of each sub-antenna. In this approach, the transmitted signal power is not reduced, and the reduced receive antenna gain is compensated for by a narrow transmit antenna beam with a high antenna gain. Compared with traditional multichannel SAR systems, the proposed approach brings two benefits. Firstly, the optimum value of the PRF can be adaptively adjusted within a certain range by adjusting the phase center spacing of the sub-apertures. Uniform samples can be obtained if the PRF lies in that range. Secondly, false targets caused by the residual channel errors can be further suppressed by adaptively adjusting the phase centers upon reception. The novel antenna structure is an improvement on the widely used conventional phased array antenna and hardly increases costs.

Based on the ideas above, this paper is structured as follows. Section 2 analyzes the influence of azimuthal non-uniform sampling and channel imbalance on azimuth multichannel SAR imaging. In Section 3, the basic principle of the proposed azimuth phase center adaptive adjustment upon reception is presented, and its effects on SAR system performance improvement and false targets suppression are analyzed. Simulation experiments are carried out to validate the proposed method in Section 4. Finally, this paper is concluded in Section 5.

#### **2. Influence of Azimuthal Non-Uniform Sampling and Channel Imbalance**

#### *2.1. Influence of Azimuthal Non-Uniform Sampling*

For simplicity, only the azimuth signal of multichannel SAR is analyzed in this paper. Assume that the number of azimuth receiving channels *N* is odd, that the slant range from the radar to the target is *R*<sub>0</sub>, and that the intermediate channel is used as the reference channel. The received azimuth echo signal of receiving channel *i* can then be written as

$$s\_i(t) = \exp\left[-j\frac{2\pi}{\lambda} \left(\sqrt{R\_0^2 + \left(v\_s t\right)^2} + \sqrt{R\_0^2 + \left(v\_s t - \Delta x\_i\right)^2}\right)\right] \tag{1}$$

where Δ*x<sub>i</sub>* = ((*N* + 1)/2 − *i*) · *d<sub>az</sub>*, *i* = 1, 2, ··· , *N*, is the phase center position of channel *i*, *d<sub>az</sub>* is the phase center spacing of the receiving channels, λ is the wavelength of the carrier, *v<sub>s</sub>* is the velocity of the SAR platform, and *t* is the azimuth time.
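As a minimal sketch of Equation (1), the snippet below evaluates the azimuth echo of one receive channel in plain Python. The function names and all numerical values (wavelength, slant range, platform velocity, phase center spacing) are illustrative assumptions and are not taken from Table 1.

```python
import cmath
import math


def phase_center_offset(i, n_channels, d_az):
    """Phase center position of channel i (1-based) relative to the reference channel."""
    return ((n_channels + 1) / 2 - i) * d_az


def channel_echo(t, i, n_channels, d_az, wavelength, r0, v_s):
    """Azimuth echo of channel i at azimuth time t, following Equation (1)."""
    dx = phase_center_offset(i, n_channels, d_az)
    r_tx = math.sqrt(r0 ** 2 + (v_s * t) ** 2)        # transmit path
    r_rx = math.sqrt(r0 ** 2 + (v_s * t - dx) ** 2)   # receive path of channel i
    return cmath.exp(-1j * 2 * math.pi / wavelength * (r_tx + r_rx))


# Hypothetical values, chosen only for illustration.
N, D_AZ, WAVELENGTH, R0, V_S = 7, 1.76, 0.031, 700e3, 7600.0
reference_echo = channel_echo(0.0, 4, N, D_AZ, WAVELENGTH, R0, V_S)
outer_echo = channel_echo(0.0, 1, N, D_AZ, WAVELENGTH, R0, V_S)
```

The echo has unit magnitude for every channel; only its phase depends on the phase center offset, which is zero for the middle (reference) channel.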

Figure 1 illustrates the principle of multichannel system sampling. The equivalent sampling position can be approximately regarded as the midpoint of the transmitting position and the receiving position. In order to uniformly distribute the equivalent single-channel sampling centers, the PRF must meet Equation (2).

$$\text{PRF}\_{\text{opt}} = \frac{2 \cdot v\_s}{N \cdot d\_{az}} \tag{2}$$

If (2) is violated, azimuth non-uniform sampling will be induced, and false targets in azimuth will be generated [21].
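The condition in Equation (2) can be checked numerically. The sketch below places each equivalent sample at the midpoint between the transmit and receive positions, as described above, and measures how uneven the resulting sample spacing is. The helper names and the numerical values are assumptions made for illustration.

```python
def optimum_prf(v_s, n_channels, d_az):
    """Equation (2): PRF that yields uniformly spaced equivalent samples."""
    return 2.0 * v_s / (n_channels * d_az)


def equivalent_sample_positions(v_s, prf, n_channels, d_az, n_pulses):
    """Along-track positions of the equivalent single-channel samples."""
    positions = []
    for m in range(n_pulses):
        tx = v_s * m / prf  # transmit position of pulse m
        for i in range(1, n_channels + 1):
            dx = ((n_channels + 1) / 2 - i) * d_az
            positions.append(tx - dx / 2.0)  # midpoint of transmit and receive
    return sorted(positions)


def gap_spread(positions):
    """Difference between the largest and smallest gap (0 for uniform sampling)."""
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return max(gaps) - min(gaps)


# Hypothetical platform velocity (m/s), channel count, and spacing (m).
V_S, N, D_AZ = 7600.0, 7, 1.76
prf_opt = optimum_prf(V_S, N, D_AZ)
uniform = gap_spread(equivalent_sample_positions(V_S, prf_opt, N, D_AZ, 5))
skewed = gap_spread(equivalent_sample_positions(V_S, 1.1 * prf_opt, N, D_AZ, 5))
```

At the optimum PRF the spread is zero up to floating-point rounding, whereas a 10% higher PRF leaves a shorter gap between the last sample of one pulse and the first sample of the next.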

The multichannel reconstruction algorithm introduced in [12] can effectively compensate for the non-uniform sampling in azimuth. However, for strong deviations from the optimum PRF in Equation (2), the inverse character of such an algorithm might result in a degraded system performance.

**Figure 1.** Reasons of uniform and non-uniform sampling.

The performance of a multichannel SAR system with azimuth non-uniform sampling depends on the multichannel reconstruction algorithm [22]. The ratio of the input SNR to the output SNR, normalized to the ratio obtained for uniform sampling, is expressed as Equation (3) [12]:

$$\Phi\_{\rm bf}(\text{PRF}) = \frac{\left(\frac{\text{SNR}\_{\text{in}}}{\text{SNR}\_{\text{out}}}\right)}{\left(\frac{\text{SNR}\_{\text{in}}}{\text{SNR}\_{\text{out}}}\right)\Big|\_{\text{PRF}\_{\text{opt}}}} = N \cdot \sum\_{j=1}^{N} E\left[ \left| P\_j(f, \text{PRF}) \right|^2 \right],\tag{3}$$

where *E*[·] represents the calculation of the mean value, and *Pj*(*f*, PRF) is the filter function of channel *j*.

Assume that the reconstruction filter matrix is *P*(*f*), which is obtained by inverting matrix *H*(*f*) according to [14]. The azimuth ambiguity-to-signal ratio (AASR) of the multichannel system can be written as Equation (4) [12]:

$$\text{AASR}\_N = \frac{\int\_{-\frac{B\_d}{2}}^{\frac{B\_d}{2}} \left| 2 \cdot \sum\_{k=1}^{\infty} \sum\_{m=m\_0}^{N} \sum\_{j=1}^{N} U\_{jk}(f) \cdot P\_{jm}(f) \right|^2 df}{\int\_{-\frac{B\_d}{2}}^{\frac{B\_d}{2}} \left| U(f) \right|^2 df} \tag{4}$$

with

$$U\_{jk}(f) = U(f) \cdot H\_{jk}(f) \tag{5}$$

$$m\_0 = \max\{N - k + 1, 1\},\tag{6}$$

where *Bd* is the Doppler bandwidth, *U*(*f*) is the spectrum of the equivalent monostatic SAR signal, *Pj*(*f*) is the reconstruction filter of channel *j*, *Pjm*(*f*) is *m*-th filter of *Pj*(*f*), *Hj*(*f*) is the pre-filter of channel *j*, and *Hjk*(*f*) is the *k*-th filter in *Hj*(*f*).

According to (3) and (4), both the SNR scaling factor and the AASR depend on the reconstruction matrix *P*(*f*). A PRF that deviates strongly from the optimum value leads to a non-invertible *H*(*f*) or to an *H*(*f*) whose eigenvalues differ widely. As a result, the multichannel SAR system performance degrades significantly.

Using the simulation parameters listed in Table 1, Figure 2 shows that the reconstruction filter degrades the imaging performance. Figure 2a,b show the spectrum and the pulse compression result, respectively, of the equivalent single-channel signal when the PRF takes the optimum value. Since the azimuth SAR echo signal is non-band-limited, several false targets caused by azimuth ambiguity appear in the compression result. When the actual PRF of the system takes a non-optimum value, the reconstruction result of the non-band-limited signal is shown in Figure 2c,d. It can be seen that the tiny false targets in Figure 2b are significantly enlarged in Figure 2d by the non-optimum PRF.


**Table 1.** System simulation parameters.

**Figure 2.** Influence of the pulse repetition frequency (PRF) deviating from the optimum value on imaging performance. (**a**) Reconstructed spectrum with the optimum PRF; (**b**) reconstructed pulse compression result with the optimum PRF; (**c**) reconstructed spectrum with the non-optimum PRF; and (**d**) reconstructed pulse compression result with the non-optimum PRF.

Assuming that the desired PRF range of the system is 1100–1500 Hz and the optimum PRF is 1234.3 Hz according to Table 1, the resulting azimuth ambiguity-to-signal ratio (AASR<sub>*N*</sub>) and SNR scaling factor (Φ<sub>bf</sub>) are shown in Figure 3. As shown in Figure 3a, the AASR declines with increasing PRF as long as the PRF is lower than the optimum value of 1234.3 Hz, but rises once the PRF exceeds the optimum value. From Figure 3b, it can be seen that Φ<sub>bf</sub> remains sufficiently low from 1100 up to 1370 Hz, but takes unacceptably high values for PRFs above 1400 Hz. The phase center adjustment method proposed in this article can noticeably decrease the AASR and significantly suppress the unacceptably high values of Φ<sub>bf</sub>.

**Figure 3.** Simulated azimuth ambiguity-to-signal ratio (AASR) and signal-to-noise ratio (SNR) scaling factor of the conventional reconstruction approach. (**a**) Simulated AASR of the conventional reconstruction approach and (**b**) simulated SNR scaling factor Φ<sub>bf</sub> of the conventional reconstruction approach.

#### *2.2. Influence of Channel Imbalance*

In actual multi-channel SAR systems, there are always channel errors among channels. These errors cause false targets that seriously reduce the quality of imaging. Assuming that the phase error of the *n*-th channel is φ*<sup>n</sup>* and the amplitude error is *an*, the echo signal model of channel *n* can be expressed as follows:

$$s\_n(t) = a\_n \exp(j\phi\_n) \exp\left[-j\frac{2\pi}{\lambda} \left(\sqrt{R\_0^2 + \left(v\_s t\right)^2} + \sqrt{R\_0^2 + \left(v\_s t - \Delta x\_n\right)^2}\right)\right].\tag{7}$$

In the process of multi-channel signal combination, *N* − 1 zeros need to be added between each sampling point of each channel, and the equivalent single channel signal can be written as:

$$s(m) = \sum\_{n=0}^{N-1} s\_n^0(m) \, , \tag{8}$$

where *m* represents integers associated with sampling time and *s*<sub>*n*</sub><sup>0</sup>(*m*) represents the signal of channel *n* after adding zeros.

According to (8), the discrete time Fourier transform (DTFT) [23] of *s*<sub>*n*</sub><sup>0</sup>(*m*) is:

$$S\_n^0(e^{j\omega}) = \frac{1}{N} \sum\_{k=0}^{N-1} e^{-j2\pi \frac{nk}{N}} S\left(e^{j(\omega - \frac{2\pi k}{N})}\right), \tag{9}$$

where *k* represents integers associated with sampling frequency. Expressing the digital frequency in (9) as an analog frequency, the spectrum of the channel-*n* signal after adding zeros can be derived as:

$$\begin{aligned} S\_n^0(f) = {} & a\_n \exp(j\phi\_n) \operatorname{rect}\left(\frac{f}{B\_d}\right) \exp\left(-j\pi \frac{f^2}{K\_a}\right) \\ & + a\_n \exp(j\phi\_n) \sum\_{l=0}^{1} \sum\_{k=1}^{N-1} W(f) \exp\left(\frac{-j2\pi nk}{N}\right) \exp\left(-j\pi \frac{\left(f + l \cdot \text{PRF} - k\frac{\text{PRF}}{N}\right)^2}{K\_a}\right), \end{aligned} \tag{10}$$

where *Ka* is the azimuth frequency modulation (FM) rate, and *f* is the frequency variable in Hz. *W*(*f*) is given by:

$$W(f) = \text{rect}\left[\frac{f - \frac{1}{2}\left(\frac{k \cdot \text{PRF}}{N} - l \cdot \text{PRF}\right) - (-1)^l \frac{\text{PRF} - B\_d}{4}}{(-1)^{l+1}\left(\frac{k \cdot \text{PRF}}{N} - l \cdot \text{PRF}\right) + \frac{\text{PRF} + B\_d}{2}}\right].\tag{11}$$

Because the spectrum components corresponding to other values of *l* are shifted out of the band, *l* only takes the values 0 and 1.
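The zero-insertion step behind Equations (8) and (9) rests on a standard identity: a signal that keeps every *N*-th sample and zeros the rest has a spectrum made of *N* weighted, frequency-shifted copies of the full spectrum. The sketch below checks this identity with a small discrete Fourier transform. The test signal, its length, and the helper names are assumptions chosen for illustration, and the finite-length DFT stands in for the DTFT used in the text.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * q * m / n) for m in range(n))
            for q in range(n)]


def keep_channel(s, n_channels, n):
    """Keep the samples taken by channel n and set all other samples to zero."""
    return [v if m % n_channels == n else 0j for m, v in enumerate(s)]


def replica_sum(spectrum, n_channels, n):
    """N weighted copies of the full spectrum, each shifted by 1/N of the band."""
    length = len(spectrum)
    shift = length // n_channels
    return [sum(cmath.exp(-2j * math.pi * n * k / n_channels)
                * spectrum[(q - k * shift) % length]
                for k in range(n_channels)) / n_channels
            for q in range(length)]


# Hypothetical chirp-like test signal; LENGTH must be a multiple of N_CH.
N_CH, LENGTH = 3, 24
signal = [cmath.exp(1j * math.pi * 0.02 * m * m) for m in range(LENGTH)]
full_spectrum = dft(signal)

errors = []
for ch in range(N_CH):
    lhs = dft(keep_channel(signal, N_CH, ch))
    rhs = replica_sum(full_spectrum, N_CH, ch)
    errors.append(max(abs(a - b) for a, b in zip(lhs, rhs)))
```

Every entry of `errors` is at the level of floating-point rounding. The replicas with *k* ≥ 1 cancel when error-free channels are summed, and channel errors break this cancellation, which is the origin of the false targets discussed next.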

According to (8) and (10), the pulse compression result of the combined signal can be derived as:

$$\begin{array}{l} s\_{\text{vol}}(t) = B\_d \,\text{sinc}(B\_d t) \sum\_{n=0}^{N-1} a\_n \exp(j\phi\_n) + \sum\_{l=0}^1 \sum\_{k=1}^{N-1} f\_{l,k}(t) \\ \qquad \qquad \qquad \sum\_{n=0}^{N-1} \exp\left(\frac{-j2\pi nk}{N}\right) a\_n \exp(j\phi\_n), \end{array} \tag{12}$$

where *f<sub>l,k</sub>*(*t*) is a function introduced to simplify the notation; it does not depend on the amplitude and phase errors. According to (12), there are 2(*N* − 1) false targets, and the position of each false target is as follows:

$$\text{POS}\_{l,k} = \frac{l\mu - \frac{k\mu}{N}}{T\_a}, \tag{13}$$

where *l* = 0, 1, *k* = 1, 2, ... , *N* − 1, μ is the oversampling rate, and *Ta* is the synthetic aperture interval. Assuming that the channel characteristics are independent of frequency, the false target-to-peak ratio of the false target at POS*l*,*<sup>k</sup>* is as follows:

$$\text{PGR}\_{l,k} = 20\log\_{10}\left(\frac{\left|\sum\_{n=0}^{N-1} a\_n \exp(j\phi\_n) \exp\left(\frac{-j2\pi nk}{N}\right)\right|}{\left|\sum\_{n=0}^{N-1} a\_n \exp(j\phi\_n)\right|}\right) + 20\log\_{10}\left(\frac{g\_{l,k}}{B\_d}\right),\tag{14}$$

in which *g<sub>l,k</sub>* is given by

$$g\_{l,k} = (-1)^{l+1}\left(\frac{k \cdot \text{PRF}}{N} - l \cdot \text{PRF}\right) + \frac{B\_d + \text{PRF}}{2}$$

and is independent of the amplitude and phase errors.
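To make Equation (14) concrete, the following sketch evaluates its two terms for a given set of channel errors. The error values, the PRF, and the Doppler bandwidth are hypothetical and serve only as an illustration; the function names are likewise assumptions.

```python
import cmath
import math


def error_term_db(amplitudes, phases_rad, k):
    """First term of Equation (14): contribution of the channel errors, in dB."""
    n_ch = len(amplitudes)
    gains = [a * cmath.exp(1j * p) for a, p in zip(amplitudes, phases_rad)]
    numerator = abs(sum(g * cmath.exp(-2j * math.pi * n * k / n_ch)
                        for n, g in enumerate(gains)))
    denominator = abs(sum(gains))
    if numerator == 0.0:
        return float("-inf")  # perfectly balanced channels: no false target
    return 20.0 * math.log10(numerator / denominator)


def g_lk(l, k, prf, n_ch, b_d):
    """Factor g_{l,k} of Equation (14); it does not depend on the channel errors."""
    return (-1) ** (l + 1) * (k * prf / n_ch - l * prf) + (b_d + prf) / 2.0


def false_target_to_peak_db(amplitudes, phases_rad, l, k, prf, b_d):
    """Equation (14) for the false target with indices (l, k)."""
    n_ch = len(amplitudes)
    return (error_term_db(amplitudes, phases_rad, k)
            + 20.0 * math.log10(g_lk(l, k, prf, n_ch, b_d) / b_d))


# Hypothetical residual errors of a seven-channel system.
AMPLITUDES = [1.00, 1.03, 0.97, 1.00, 1.02, 0.98, 1.01]
PHASES = [math.radians(d) for d in (1.0, -2.0, 3.0, 0.0, -1.0, 2.0, -3.0)]
balanced = error_term_db([1.0] * 7, [0.0] * 7, 1)
with_errors = false_target_to_peak_db(AMPLITUDES, PHASES, 0, 1, 8640.0, 7000.0)
```

For balanced channels the first term collapses to numerical noise, so the false targets vanish; any residual amplitude or phase error yields a finite ratio.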

With parameters listed in Table 2, a simulation experiment to reconstruct the echo signal with channel errors is shown in Figure 4. A reconstructed spectrum and compression result of the system with no channel errors is shown in Figure 4a,b. Figure 4c,d show the reconstruction results in the presence of amplitude errors within 0.5 dB and phase errors of 1–5 degrees. It can be seen that these tiny channel errors that may be residual after channel error compensation can cause false targets of around −35 dB. False targets of such intensity still cause a reduction in image quality and should be further suppressed.

**Figure 4.** Influence of channel imbalance. (**a**) Reconstructed spectrum with no channel errors; (**b**) reconstructed pulse compression result with no channel errors; (**c**) reconstructed spectrum with channel errors; and (**d**) reconstructed pulse compression result with channel errors.

#### **3. Phase Center Adjustment upon Reception**

This section introduces an innovative receive antenna architecture in azimuth which allows for the compensation of the non-optimum PRF values and the suppression of false targets by phase center adjustment upon reception. The working process of the innovative system is shown in Figure 5.

**Figure 5.** The working process of the innovative system.

#### *3.1. System Architecture and Basic Principle*

Compared with traditional multi-channel SAR systems, such as those on board the satellites RADARSAT-2 (Canada, 2007), TerraSAR-X (Germany, 2007), and Sentinel-1A (launched by the European Space Agency in 2014), each receiving sub-antenna of the innovative system consists of a large number of individually controllable elements. Such an aperture permits changing the position and the length of the effective receiving sub-apertures on the antenna by activating the respective elements. The phase center position of the receiving sub-apertures can be adaptively adjusted by turning off some of the elements on the antenna. As an example, a system with three receive apertures is shown in Figure 6.

**Figure 6.** System architecture and basic principle. (**a**) Adjustment to increase phase center spacing and (**b**) adjustment to reduce phase center spacing.

Closing some of the elements on the inner side of a sub-aperture moves its phase center outward and increases the phase center spacing, as shown in Figure 6a, while closing elements on the outer side of the sub-aperture moves the phase center inward, as shown in Figure 6b. To keep the effective lengths of all antennas equal, the same number of elements should be turned off on each aperture. The phase center position is adjusted by controlling the number of closed elements on each side of the aperture. After adjustment, the phase center spacing can be uniform or non-uniform, depending on practical needs.

#### *3.2. Effect of Receiving Sub-Aperture Phase Center Adjustment on Azimuth Non-Uniform Sampling*

Take the case where the phase center spacing is reduced as an example. If a length Δ*d* of the antenna on the left side of channel 1 and on the right side of channel *N* is deactivated, the phase centers of these two sub-apertures move inward by Δ*d*/2, and the distance between the phase centers of channel 1 and channel *N* is reduced by Δ*d*. If the phase centers remain evenly distributed, the new phase center spacing is:

$$d\_{\text{az\\_new}} = \frac{\left(N - 1\right) \cdot d\_{\text{az}} - \Delta d}{N - 1} = d\_{\text{az}} - \frac{\Delta d}{N - 1}.\tag{15}$$

To ensure that the phase centers are uniformly distributed, the phase center of channel *n* should move inward by:

$$x\_n = \left| \frac{N+1}{2} - n \right| \cdot \frac{\Delta d}{N-1}, \quad n = 1, 2, \dots, N. \tag{16}$$

The length of the invalid antenna on the outer side of channel *n* should be 2*xn* longer than the invalid antenna on the inner side:

$$
\Delta d\_{n, \text{outer}} - \Delta d\_{n, \text{inner}} = 2x\_n = \frac{2\Delta d}{N - 1} \cdot \left| \frac{N + 1}{2} - n \right|. \tag{17}
$$

Since the same number of elements must be closed in each channel (corresponding to a total deactivated length of Δ*d*), the lengths of the invalid parts on the two sides of channel *n* should satisfy:

$$
\Delta d\_{n, \text{outer}} + \Delta d\_{n, \text{inner}} = \Delta d.\tag{18}
$$

Combining Formulas (17) and (18), Δ*d*<sub>*n*,outer</sub> and Δ*d*<sub>*n*,inner</sub> can be written as:

$$
\Delta d\_{n, \text{outer}} = \frac{\Delta d}{2} + \frac{\Delta d}{N - 1} \cdot \left| \frac{N + 1}{2} - n \right| \tag{19}
$$

$$
\Delta d\_{n, \text{inner}} = \frac{\Delta d}{2} - \frac{\Delta d}{N - 1} \cdot \left| \frac{N + 1}{2} - n \right|. \tag{20}
$$

Because the individually controllable antenna elements have a finite size, the receive phase center in a practical system cannot be adjusted arbitrarily but can only take a series of discrete positions. Assuming that each receive antenna of length *l<sub>a</sub>* consists of *K* elements, the number of closed elements on each antenna can be written as:

$$p = \text{round}\left\{\Delta d \cdot \frac{K}{l\_a}\right\}, \tag{21}$$

in which the operator round{·} denotes rounding to the nearest integer. The numbers of invalid elements on the outer side and the inner side of the antenna of channel *n* can be written as:

$$p\_{n, \text{outer}} = \text{round}\left\{ \left| \frac{\Delta d}{2} + \frac{\Delta d}{N - 1} \cdot \left| \frac{N + 1}{2} - n \right| \right| \cdot \frac{K}{l\_a} \right\}, \tag{22}$$

$$p\_{n, \text{inner}} = \text{round}\left\{ \left| \frac{\Delta d}{2} - \frac{\Delta d}{N - 1} \cdot \left| \frac{N + 1}{2} - n \right| \right| \cdot \frac{K}{l\_a} \right\}. \tag{23}$$
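Equations (19), (20), (22), and (23) translate directly into a per-channel element layout. The sketch below is one way to compute it; the antenna dimensions are hypothetical, and ties are rounded upward explicitly because Python's built-in `round` rounds ties to the nearest even integer.

```python
import math


def round_half_up(x):
    """Round a non-negative value to the nearest integer (ties go up)."""
    return int(math.floor(x + 0.5))


def closed_lengths(n, n_channels, delta_d):
    """Deactivated lengths (outer, inner) of channel n, Equations (19) and (20)."""
    shift = delta_d / (n_channels - 1) * abs((n_channels + 1) / 2 - n)
    return delta_d / 2 + shift, delta_d / 2 - shift


def closed_elements(n, n_channels, delta_d, k_elements, l_a):
    """Closed element counts (outer, inner) of channel n, Equations (22) and (23)."""
    outer, inner = closed_lengths(n, n_channels, delta_d)
    scale = k_elements / l_a  # elements per unit length
    return round_half_up(abs(outer) * scale), round_half_up(abs(inner) * scale)


# Hypothetical antenna: 7 channels, 100 elements per 1.0 m sub-antenna,
# and a total deactivated length of 0.3 m per sub-antenna.
layout = {n: closed_elements(n, 7, 0.3, 100, 1.0) for n in range(1, 8)}
```

In this example the middle channel closes the same number of elements on both sides, while the two outermost channels close all of their deactivated elements on the outer side, so every phase center moves toward the array center by the amount required in Equation (16).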

If the actual value of the PRF is PRFact, which is higher than the optimum value, the phase center spacing should be reduced by Δ*daz*, which is given by:

$$
\Delta d\_{az} = d\_{az} - \frac{2v\_s}{N \cdot \text{PRF}\_{\text{act}}}.\tag{24}
$$

According to (15), the length of closed elements on each receive antenna should be:

$$
\Delta d = \left( d\_{az} - \frac{2v\_s}{N \cdot \text{PRF}\_{\text{act}}} \right) \cdot \left( N - 1 \right). \tag{25}
$$

The number of closed elements on each antenna can be derived as:

$$p = \text{round}\left\{ \frac{K}{l\_a} \cdot (N - 1) \cdot \left( d\_{az} - \frac{2v\_s}{N \cdot \text{PRF}\_{\text{act}}} \right) \right\}. \tag{26}$$

The respective number of closed elements on the outer side and inner side on channel *n* is given by:

$$p\_{n, \text{outer}} = \text{round}\left\{ \left| \left( d\_{az} - \frac{2v\_s}{N \cdot \text{PRF}\_{\text{act}}} \right) \cdot \frac{(N-1)}{2} + \left( d\_{az} - \frac{2v\_s}{N \cdot \text{PRF}\_{\text{act}}} \right) \cdot \left| \frac{N+1}{2} - n \right| \right| \cdot \frac{K}{l\_a} \right\}, \tag{27}$$

$$p\_{n, \text{inner}} = \text{round}\left\{ \left| \left( d\_{az} - \frac{2v\_s}{N \cdot \text{PRF}\_{\text{act}}} \right) \cdot \frac{(N-1)}{2} - \left( d\_{az} - \frac{2v\_s}{N \cdot \text{PRF}\_{\text{act}}} \right) \cdot \left| \frac{N+1}{2} - n \right| \right| \cdot \frac{K}{l\_a} \right\}. \tag{28}$$

Using the method discussed above, if the number of closed elements on each channel is given by *p*, the optimum PRF of the system can be written as:

$$\text{PRF}\_{\text{opt}} = \frac{2v\_s}{N \cdot \left(d\_{az} \pm \frac{p \cdot l\_a}{K \cdot (N-1)}\right)}.\tag{29}$$

The "±" indicates the increase or decrease of the phase center spacing. In practice, in order to ensure the receiving performance of the antenna, a limited number of elements can be turned off. If the maximum ratio of the closed elements number to the total number on each antenna is η, the number of closed elements can be written as:

$$p\_m = \text{round}\{\eta \cdot K\} \tag{30}$$

The range of the adjusted optimum PRF is:

$$\frac{2v\_s}{N \cdot \left(d\_{az} + \frac{p\_m \cdot l\_a}{K \cdot (N-1)}\right)} \le \text{PRF}\_{\text{opt}} \le \frac{2v\_s}{N \cdot \left(d\_{az} - \frac{p\_m \cdot l\_a}{K \cdot (N-1)}\right)}.\tag{31}$$

The minimum value of PRFopt is:

$$\text{PRF}\_{\text{opt\\_min}} = \frac{2v\_s}{N \cdot \left(d\_{az} + \frac{p\_m \cdot l\_a}{K \cdot (N-1)}\right)};\tag{32}$$

and the maximum value is:

$$\text{PRF}\_{\text{opt\\_max}} = \frac{2v\_s}{N \cdot \left(d\_{az} - \frac{p\_m \cdot l\_a}{K \cdot (N-1)}\right)}.\tag{33}$$

That is to say, if the practical PRF of the system lies between PRFopt\_min and PRFopt\_max, a uniformly (or at least approximately uniformly) sampled signal can be obtained by adjusting the phase center spacing. Consequently, the multichannel reconstruction algorithm can be omitted, and the computational complexity is reduced compared with conventional systems. If the PRF is outside the optimum PRF range, the phase center spacing should be adjusted to the minimum or maximum value, whichever brings the optimum PRF closest to the actual PRF of the system. In this case, although the non-uniformity of the samples is reduced, the sampling is still non-uniform, and the reconstruction algorithm is still needed. Therefore, the complexity of processing is approximately equal to that of traditional systems.

The effect of the largest proportion of closed elements on the range of the optimum PRF is shown in Figure 7, where Γ is as follows:

$$
\Gamma = \frac{\text{PRF}\_{\text{opt\\_max}} - \text{PRF}\_{\text{opt\\_min}}}{\text{PRF}\_{\text{opt}}}.\tag{34}
$$

**Figure 7.** Relationship between the optimum PRF and the invalid receiving antenna ratio η. (**a**) Maximum (solid blue line) and minimum (red dotted line) value of the optimum PRF and (**b**) range expansion ratio of the optimum PRF.
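Equations (26) and (29)–(34) can be combined into a short calculation of the adjustable PRF range. The sketch below uses a hypothetical platform velocity and element count, picks the phase center spacing so that the nominal optimum PRF equals the 1234.3 Hz used in this paper, and assumes that the sub-antenna length equals the phase center spacing. These values are assumptions, not the entries of Table 1.

```python
import math


def round_half_up(x):
    """Round a non-negative value to the nearest integer (ties go up)."""
    return int(math.floor(x + 0.5))


def adjusted_optimum_prf(v_s, n_channels, d_az, l_a, k_elements, p, widen):
    """Equation (29): optimum PRF after closing p elements per sub-antenna.

    widen=True increases the phase center spacing, widen=False reduces it.
    """
    change = p * l_a / (k_elements * (n_channels - 1))
    spacing = d_az + change if widen else d_az - change
    return 2.0 * v_s / (n_channels * spacing)


def optimum_prf_range(v_s, n_channels, d_az, l_a, k_elements, eta):
    """Equations (30)-(33): adjustable range of the optimum PRF."""
    p_m = round_half_up(eta * k_elements)
    low = adjusted_optimum_prf(v_s, n_channels, d_az, l_a, k_elements, p_m, True)
    high = adjusted_optimum_prf(v_s, n_channels, d_az, l_a, k_elements, p_m, False)
    return low, high


def closed_elements_for_prf(prf_act, v_s, n_channels, d_az, l_a, k_elements):
    """Equation (26): elements to close when prf_act exceeds the nominal optimum."""
    delta_d = (d_az - 2.0 * v_s / (n_channels * prf_act)) * (n_channels - 1)
    return round_half_up(k_elements / l_a * delta_d)


# Hypothetical geometry tuned so that the nominal optimum PRF is 1234.3 Hz.
V_S, N, K = 7600.0, 7, 100
D_AZ = 2.0 * V_S / (N * 1234.3)
L_A = D_AZ  # assumption: sub-antenna length equals the phase center spacing
prf_min, prf_max = optimum_prf_range(V_S, N, D_AZ, L_A, K, 0.30)
gamma = (prf_max - prf_min) / 1234.3  # Equation (34)
p_for_1250 = closed_elements_for_prf(1250.0, V_S, N, D_AZ, L_A, K)
```

Under these assumptions the range evaluates to roughly 1175.5–1299.3 Hz for η = 30%, which agrees to within rounding with the interval quoted for the seven-channel simulation in Section 4.1.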

#### *3.3. Suppression of False Target by Adjusting Receiving Sub-Aperture Phase Center*

In Section 2.2, the expressions of the position and intensity of false targets caused by channel imbalance are analyzed. For the convenience of analysis, the false target-to-peak ratio of the false target at POS*l*,*<sup>k</sup>* given by (14) can be rewritten as:

$$\text{PGR}\_{l,k} = 20 \log\_{10} \left( \frac{\left| \sum\_{n=0}^{N-1} a\_n \exp(j\phi\_n) \exp\left(\frac{-j4\pi k |X\_n|}{L}\right) \right|}{\left| \sum\_{n=0}^{N-1} a\_n \exp(j\phi\_n) \right|} \right) + 20 \log\_{10} \left( \frac{g\_{l,k}}{B\_d} \right), \tag{35}$$

where *L* represents the length of the antenna and *Xn* represents the phase center position of channel *n*. If the phase centers of sub-apertures are adjusted, the antenna length *L* can be written as:

$$L = L\_a - \frac{l\_a \cdot q\_{1,l}}{K} - \frac{l\_a \cdot q\_{N,r}}{K},\tag{36}$$

where *La* is the length of receive antenna before adjusting, *qn*,*<sup>l</sup>* is the number of inactive elements on the left side of the antenna *n*, and *qn*,*r* is the number of inactive elements on the right side of the antenna *n*. The value of *Xn* can be written as:

$$X\_n = \left(\frac{N+1}{2} - n\right) \cdot d\_{az} + \frac{l\_a \cdot \left(q\_{n,r} + q\_{\frac{(N+1)}{2},l} - q\_{\frac{(N+1)}{2},r} - q\_{n,l}\right)}{2K}.\tag{37}$$

The value of *X<sub>n</sub>* can be adaptively adjusted to generate an additional phase that cancels part of the phase error. On this basis, the numerator of the first term in Formula (35) can be minimized by adjusting the phase center position of each channel, which suppresses the false-target-to-peak ratio. The false-target-to-peak ratio can be reduced to:

$$\text{PGR}\_{l,k} = 20\log\_{10}\left(\frac{\min\left\{\left|\sum\_{n=0}^{N-1} a\_n \exp\left(j\phi\_n\right) \exp\left(\frac{-j4\pi k |X\_n|}{L}\right)\right|\right\}}{\left|\sum\_{n=0}^{N-1} a\_n \exp(j\phi\_n)\right|}\right) + 20\log\_{10}\left(\frac{g\_{l,k}}{B\_d}\right). \tag{38}$$

Although false targets can be suppressed by simply perturbing the uniformly distributed phase center positions on a small scale, several different results often have to be compared to find the best suppression. The computational complexity and processing time therefore increase with the number of comparisons. Assuming that *n* different suppression results are compared, the computational complexity is *n* times that of a traditional multichannel system.
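The comparison of candidate layouts described above can be sketched as a small exhaustive search. The code below evaluates the numerator of the first term of Equation (35) exactly as printed, with *X<sub>n</sub>* and *L* taken from Equations (37) and (36), and keeps the candidate with the smallest value. The exhaustive enumeration, the function names, the channel errors, and the antenna dimensions are all assumptions made for illustration; the text does not specify how the candidates are generated.

```python
import cmath
import itertools
import math


def phase_centers(q_left, q_right, d_az, l_a, k_elements):
    """Equation (37): phase center positions X_n for channels n = 1..N.

    q_left[n-1] and q_right[n-1] are the inactive element counts of channel n.
    """
    n_ch = len(q_left)
    c = (n_ch + 1) // 2 - 1  # zero-based index of the middle (reference) channel
    return [((n_ch + 1) / 2 - n) * d_az
            + l_a * (q_right[n - 1] + q_left[c] - q_right[c] - q_left[n - 1])
            / (2 * k_elements)
            for n in range(1, n_ch + 1)]


def effective_length(total_length, q_left, q_right, l_a, k_elements):
    """Equation (36): antenna length after closing the outermost elements."""
    return total_length - l_a * (q_left[0] + q_right[-1]) / k_elements


def numerator(gains, positions, length, k):
    """Numerator of the first term of Equation (35), as printed."""
    return abs(sum(g * cmath.exp(-4j * math.pi * k * abs(x) / length)
                   for g, x in zip(gains, positions)))


def best_layout(gains, k, d_az, l_a, k_elements, total_length, max_closed):
    """Compare all candidate layouts and keep the one with the smallest numerator."""
    best = None
    for q_right in itertools.product(range(max_closed + 1), repeat=len(gains)):
        q_left = [max_closed - q for q in q_right]  # same closed count per channel
        x = phase_centers(q_left, q_right, d_az, l_a, k_elements)
        length = effective_length(total_length, q_left, q_right, l_a, k_elements)
        cost = numerator(gains, x, length, k)
        if best is None or cost < best[0]:
            best = (cost, list(q_right))
    return best


# Hypothetical three-channel example with small residual channel errors.
GAINS = [1.02 * cmath.exp(1j * math.radians(3.0)), 1.0 + 0j,
         0.97 * cmath.exp(-1j * math.radians(2.0))]
best_cost, best_q_right = best_layout(GAINS, 1, 2.0, 2.0, 20, 6.0, 2)
```

With three channels and up to two closed elements per sub-antenna the search compares 27 layouts, which illustrates why the processing cost grows with the number of candidates.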

The reconstruction result of error-free signals after phase center adaptive adjustment is shown in Figure 8. It can be seen that adjusting phase center non-uniformly does not affect the reconstruction results for multichannel signals.

**Figure 8.** Reconstruction of non-uniform phase center echo signal without channel error. (**a**) Reconstructed pulse compression result with no channel errors and (**b**) magnified reconstructed pulse compression result with no channel errors.

#### **4. Simulation and Performance Analysis**

To validate the proposed receiving phase center adaptive adjustment approach, simulations were carried out. Simulation parameters are listed in Table 1.

#### *4.1. Effect on Azimuth Non-Uniform Sampling*

A seven-channel system was used in the simulation experiment. The optimum PRF of the system without adjusting the receiving phase centers was 1234.3 Hz. Assuming that up to 30% of the elements on each antenna were allowed to be turned off, the optimum PRF could be adaptively adjusted within 1175.5–1299.2 Hz by deactivating elements at both ends of the antenna. Figure 9 gives the reconstruction result of a non-band-limited signal with a PRF of 1300 Hz for conventional phase center spacing and for adaptively adjusted phase center spacing. With respect to Figure 9a, false targets in Figure 9b were significantly reduced by adaptive phase center adjustment on reception, leaving only the false targets caused by the non-band-limitation of the signal.

The corresponding AASR and SNR scaling factor Φbf are shown in Figure 10. When the PRF value fell into the range 1175.5−1299.2 Hz, the AASR was consistent with the value of equivalent single channel SAR. For PRF values below 1175.5 Hz or higher than 1299.2 Hz, because of the non-uniform sampling, the AASR was higher than that of the equivalent monostatic signal but was obviously decreased with respect to the conventional multichannel reconstruction approach.

**Figure 9.** Reconstruction results of non-uniform sampling and adaptive phase center adjustment. (**a**) Reconstructed spectrum in the non-optimum PRF; (**b**) reconstructed pulse compression result in the non-optimum PRF; (**c**) reconstructed spectrum after adaptive phase center adjustment; and (**d**) reconstructed pulse compression result after adaptive phase center adjustment.

Regarding the SNR, when the PRF fell within 1175.5–1299.2 Hz, the value of Φbf remained constant independently of the PRF since uniform sampling was ensured. For other values of the PRF, because of the non-uniform sampling, Φbf was higher than that in uniform sampling but was clearly lower than the conventional reference, as shown in Figure 10b.

**Figure 10.** Simulated AASR and SNR scaling factor. (**a**) Simulated AASR and (**b**) simulated SNR scaling factor Φbf.

#### *4.2. Effect of False Target Suppression*

An example of phase center adjustment for false-target suppression is given in this section. The phase center positions before and after adjustment are listed in Table 2. Figure 11 verifies the suppression effect of adaptive phase center adjustment on the false targets. Figure 11a shows the reconstructed pulse compression results in the presence of channel errors; Figure 11b is an enlargement of Figure 11a at the top of a false target. It can be seen that the false targets were suppressed by about 2 dB after adaptive phase center adjustment. Although the suppression effect was not large, the proposed method could further suppress the false targets after channel error compensation. This can also be regarded as an advantage of the presented receiving phase center adjustment.


**Table 2.** Phase center position before and after adjusting.

**Figure 11.** The effect of phase center adjustment on the suppression of false targets. (**a**) Reconstructed pulse compression results for uniform and non-uniform phase centers and (**b**) magnified view of the top of a false target.

#### **5. Conclusions**

An advanced azimuth antenna architecture that allows the phase center position of sub-apertures to be adjusted was proposed in this paper. Benefiting from receiving phase center adjustment, the performance of the HRWS SAR system can be improved in the following three aspects. Firstly, any PRF within a certain range can be regarded as the optimum value, so complex signal reconstruction algorithms can be omitted in novel systems. Secondly, non-uniform sampling, which severely degrades imaging performance, is avoided by adjusting the phase center if the PRF falls within the optimum PRF range. Thirdly, by adaptively adjusting the phase center position of each channel, false targets caused by residual channel error after compensation can be further suppressed to some degree, and the quality of the resulting SAR image can be further improved.

In conclusion, receiving phase center adjustment is an effective method for compensating for non-uniform sampling and can suppress the false targets caused by channel error to a certain extent. However, since the number of elements that can be turned off cannot exceed a certain ratio, the optimum PRF can only be adjusted within a range, which restricts the usable PRF range of the system. In further research, the joint adjustment of transmit and receive phase centers will be considered to compensate for a wider range of the non-uniform PRF. Furthermore, the presented method can be extended to multiple-input multiple-output (MIMO) SAR systems to further improve the performance.

**Author Contributions:** All the authors made contributions to this work. W.X. and J.H. proposed the idea and wrote the paper; P.H. conceived and designed the experiments; W.T. performed the experiments; and Y.D. revised the manuscript.

**Funding:** This research was funded by National Equipment Pre-Research Foundation of China, grant number JZX7Y20190253041401 and JZX7Y20190253040501, Inner Mongolia Science and Technology Innovation Guidance Project, grant number KCBJ2018014, and National Natural Science Foundation of China, grant number 61631011, 61701264 and 61661043.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## **Focusing Bistatic Forward-Looking Synthetic Aperture Radar Based on an Improved Hyperbolic Range Model and a Modified Omega-K Algorithm**

#### **Chenchen Wang, Weimin Su \*, Hong Gu and Jianchao Yang**

School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

**\*** Correspondence: suweimin@njust.edu.cn

Received: 15 July 2019; Accepted: 30 August 2019; Published: 1 September 2019

**Abstract:** For parallel bistatic forward-looking synthetic aperture radar (SAR) imaging, the instantaneous slant range is a double-square-root expression due to the separate transmitter-receiver system form. The hyperbolic approximation provides a feasible solution to convert the dual square-root expression into a single-square-root expression. However, some high-order terms of the range Taylor expansion have not been considered during the slant range approximation procedure in existing methods, and therefore, inaccurate phase compensation occurs. To obtain a more accurate compensation result, an improved hyperbolic approximation range form with high-order terms is proposed. Then, a modified omega-K algorithm based on the new slant range form is adopted for parallel bistatic forward-looking SAR imaging. Several simulation results validate the effectiveness of the proposed imaging algorithm.

**Keywords:** bistatic synthetic aperture radar (SAR); hyperbolic approximation; phase compensation; modified omega-K

#### **1. Introduction**

Synthetic aperture radar (SAR) has attracted considerable research interest owing to its ability to detect targets regardless of external environmental conditions [1]. The penetration ability of SAR makes it irreplaceable compared with optical imaging; however, traditional monostatic SAR struggles to achieve good imaging performance in forward-looking mode, which limits the application of SAR technology. To solve this problem, bistatic SAR has been widely used for forward-looking imaging owing to its particular system configuration. The separate transmitter and receiver configuration provides extra advantages such as reliable concealment and system flexibility [2].

One-stationary bistatic SAR, as a special form of general bistatic SAR, was first studied for forward-looking imaging. Several methods have been proposed, such as the squint minimization [3,4], the keystone transform [5], and the ellipse model [2,4]. The Doppler frequency is decided by the moving transmitter or the moving receiver, which is similar to monostatic SAR. Then, the bistatic SAR was proposed where both the transmitter and the receiver are moving. The azimuth resolution is determined by both platforms. For bistatic forward-looking SAR, the difficulty of imaging algorithms lies in the solution of the two-dimensional spectrum because of its unique double-square-root form of echo signal expression [6,7]. Some basic studies of bistatic SAR were proposed to illustrate the advantages [6,8]. Compared with the monostatic situation, the principle of stationary phase (POSP) cannot be applied to solve the derivative zero point when performing the azimuth Fourier transform. Several methods have been proposed to solve the problem. Loffeld's bistatic formula (LBF) was proposed to solve the double-square-root expression [6]. Respective stationary points of the transmitter and receiver are obtained first to transform the double-square-root expression into Taylor expansion form. Then, the ultimate spectrum is solved based on the joint stationary point of the Taylor expression. The contributions of the transmitter and receiver are assumed to be the same, which leads to approximation errors. The extended Loffeld's bistatic formula (ELBF) [9] and the modified Loffeld's bistatic formula (MLBF) [10] were proposed later to improve the solution process of stationary points. These two methods assign different weights to the transmitter and receiver. However, all three LBF methods need to solve the stationary points three times, which complicates the derivation. The method of series reversion (MSR) [11] is a widely used method for precisely solving equations with series terms. In SAR imaging algorithms, Taylor expansion is regarded as a common operation, and MSR can be applied to solve the Fourier transform composed of Taylor expansion. However, it is still challenging to derive the imaging algorithm because of the series form.

To simplify the solution of the spectrum, the hyperbolic approximation was utilized to transform the echo expression with the double-square-root form into the expression with the single-square-root form. In the first version, a parameter named the equivalent speed was defined [12]. In the improved version, two more parameters (the equivalent slant range and the equivalent squint angle) [13] were added in the hyperbolic function to approximate the range more accurately. Moreover, an improved hyperbolic approximation model with additional parameters was proposed for residual compensation [14]. However, considering the solution process in the methods mentioned above, the defined parameters are solved by setting the constant term, the linear term, and the quadratic term of the Taylor expansion of echo equal, which means the influence of the cubic term, the quartic term, and the remaining terms is ignored. In this article, we propose a new model to finish the hyperbolic approximation.

As for imaging algorithms, range Doppler (RD) imaging algorithms, chirp scaling (CS) imaging algorithms, back-projection (BP) imaging algorithms, and omega-K imaging algorithms based on the LBF spectrum, the MSR spectrum, and the hyperbolic approximation spectrum have been proposed in the past few years [9,14–16]. RD imaging algorithms are too basic to handle the complex situation of a bistatic SAR system. The computation time of BP imaging algorithms is a severe problem for real-time processing. For CS imaging algorithms, the formula derivation is difficult. Thus, the omega-K imaging algorithm is selected in this article to perform the imaging.

To approximate the slant range more accurately, the cubic term and the quartic term are taken into account in this article. An equivalent hyperbolic range model is introduced first to lay the foundation of the imaging algorithm. The range error analysis is provided to demonstrate the approximation ability of the proposed range model immediately. Then, the modified omega-K imaging algorithm including the signal model and detailed processing steps are presented. Finally, some experimental simulations are given to prove the efficiency of the proposed algorithm.

This article is organized as follows. Section 2 gives the geometry of the bistatic forward-looking SAR and the equivalent hyperbolic range model corresponding to the bistatic system. Section 3 gives the detailed modified omega-K imaging algorithm. Simulation results are given in Section 4 to validate the proposed algorithm. Section 5 provides the conclusion.

#### **2. Geometry and Equivalent Slant Range Model**

The parallel bistatic forward-looking SAR system diagram in the Cartesian coordinate system and the derived equivalent slant range model are established first. Then, the analysis of range error based on the equivalent slant range model is provided.

#### *2.1. Equivalent Slant Range Model*

Figure 1 shows the geometry of parallel bistatic forward-looking SAR. The transmitter *T* and the receiver *R* move along the red lines parallel to the *x*-axis. *ηpc* is the synthetic aperture center time of the imaging scene. (*xc*, *yc*, 0) is the location coordinate of the imaging center, and (*xp*, *yp*, 0) is the location of an arbitrary target *P* in the imaging scene. *RTc* is the slant range between the transmitter and the target *P* at the phase center crossing time *ηpc*, and *RRc* is the range between the receiver and the target *P* at *ηpc*. The approximate forward-looking angle of the receiver is *θR*, and the approximate squint angle of the transmitter is *θT*. *VT* and *VR* represent the speed of the transmitter and the receiver, respectively. It is assumed that both the transmitter and the receiver can cover the imaging scene during the aperture synthesis.

**Figure 1.** Geometry of forward-looking bistatic SAR.

The instantaneous slant ranges from the transmitter and the receiver to the target *P* are:

$$\begin{cases} R\_T \left( \eta \right) = \sqrt{R\_{Tc}^2 + V\_T^2 \left( \eta - \eta\_{pc} \right)^2 - 2R\_{Tc} V\_T \left( \eta - \eta\_{pc} \right) \sin \theta\_T}\\ R\_R \left( \eta \right) = \sqrt{R\_{Rc}^2 + V\_R^2 \left( \eta - \eta\_{pc} \right)^2 - 2R\_{Rc} V\_R \left( \eta - \eta\_{pc} \right) \sin \theta\_R} \end{cases} \tag{1}$$

where *η* is the slow time.

Thus, the total range is:

$$R\left(\eta\right) = R\_T\left(\eta\right) + R\_R\left(\eta\right). \tag{2}$$

It is challenging to solve the two-dimensional spectrum due to the double-square-root expression form of *R* (*η*). The hyperbolic approximation [12] can be used to convert the double-square-root form to a single-square-root form by defining the equivalent speed and equivalent angle. Traditional hyperbolic approximation [12–14] ignored the high-order terms of the Taylor expansion of *R* (*η*). To realize a more accurate compensation, an improved equivalent slant range with high-order terms is proposed. The range model is expressed as:

$$R\_e \left( \eta \right) = \sqrt{R\_e^{2} + V\_e^{2} \left( \eta - \eta\_{pc} \right)^{2} - 2R\_e V\_e \left( \eta - \eta\_{pc} \right) \sin \theta\_e} + E \left( \eta - \eta\_{pc} \right)^{3} + F \left( \eta - \eta\_{pc} \right)^{4}, \tag{3}$$

$$R\left(\eta\right) = 2R\_e\left(\eta\right),\tag{4}$$

where *Re*, *Ve*, and *θe* are the new equivalent slant range at the phase center crossing time, the new equivalent speed, and the new equivalent squint angle. Compared with existing hyperbolic approximation algorithms, the proposed range model adds two additional high-order terms for range error compensation. To solve the unknown variables, we first expand Equations (1) and (3) into a fourth-order Taylor series at *η* = *ηpc*. Then, we get:

$$\begin{split} R\_T \left( \eta \right) &= R\_{Tc} - V\_T \sin \theta\_T \left( \eta - \eta\_{pc} \right) + \frac{V\_T^2 \cos^2 \theta\_T}{2R\_{Tc}} \left( \eta - \eta\_{pc} \right)^2 \\ &+ \frac{V\_T^3 \sin \theta\_T \cos^2 \theta\_T}{2R\_{Tc}^2} \left( \eta - \eta\_{pc} \right)^3 + \frac{V\_T^4 \cos^2 \theta\_T \left( 5 \sin^2 \theta\_T - 1 \right)}{8R\_{Tc}^3} \left( \eta - \eta\_{pc} \right)^4, \end{split} \tag{5}$$

$$\begin{split} R\_R \left( \eta \right) &= R\_{Rc} - V\_R \sin \theta\_R \left( \eta - \eta\_{pc} \right) + \frac{V\_R^2 \cos^2 \theta\_R}{2R\_{Rc}} \left( \eta - \eta\_{pc} \right)^2 \\ &+ \frac{V\_R^3 \sin \theta\_R \cos^2 \theta\_R}{2R\_{Rc}^2} \left( \eta - \eta\_{pc} \right)^3 + \frac{V\_R^4 \cos^2 \theta\_R \left( 5 \sin^2 \theta\_R - 1 \right)}{8R\_{Rc}^3} \left( \eta - \eta\_{pc} \right)^4, \end{split} \tag{6}$$

$$\begin{split} R\_e \left( \eta \right) &= R\_e - V\_e \sin \theta\_e \left( \eta - \eta\_{pc} \right) + \frac{V\_e^{2} \cos^2 \theta\_e}{2R\_e} \left( \eta - \eta\_{pc} \right)^{2} \\ &+ \frac{V\_e^{3} \sin \theta\_e \cos^{2} \theta\_e}{2R\_e^{2}} \left( \eta - \eta\_{pc} \right)^{3} + \frac{V\_e^{4} \cos^{2} \theta\_e \left( 5 \sin^{2} \theta\_e - 1 \right)}{8R\_e^{3}} \left( \eta - \eta\_{pc} \right)^{4} \\ &+ E \left( \eta - \eta\_{pc} \right)^{3} + F \left( \eta - \eta\_{pc} \right)^{4}. \end{split} \tag{7}$$

Substituting Equations (5)–(7) into Equations (2) and (4) and letting the first five terms of Taylor expansion be equal, then we get:

$$\begin{cases} \begin{aligned} &R\_{Tc} + R\_{Rc} = 2R\_e\\ &V\_T \sin \theta\_T + V\_R \sin \theta\_R = 2V\_e \sin \theta\_e\\ &\frac{V\_T^2 \cos^2 \theta\_T}{2R\_{Tc}} + \frac{V\_R^2 \cos^2 \theta\_R}{2R\_{Rc}} = 2\frac{V\_e^2 \cos^2 \theta\_e}{2R\_e}\\ &\frac{V\_T^3 \sin \theta\_T \cos^2 \theta\_T}{2R\_{Tc}^2} + \frac{V\_R^3 \sin \theta\_R \cos^2 \theta\_R}{2R\_{Rc}^2} = 2\left(\frac{V\_e^3 \sin \theta\_e \cos^2 \theta\_e}{2R\_e^2} + E\right) \\ &\frac{V\_T^4 \cos^2 \theta\_T (5 \sin^2 \theta\_T - 1)}{8R\_{Tc}^3} + \frac{V\_R^4 \cos^2 \theta\_R (5 \sin^2 \theta\_R - 1)}{8R\_{Rc}^3} = 2\left[\frac{V\_e^4 \cos^2 \theta\_e (5 \sin^2 \theta\_e - 1)}{8R\_e^3} + F\right]. \end{aligned} \end{cases} \tag{8}$$

Solving the five equations in Equation (8), then we get:

$$\begin{cases} \begin{aligned} R\_e &= \frac{1}{2} \left( R\_{Tc} + R\_{Rc} \right) \\ V\_e &= \sqrt{A^2 + B} \\ \theta\_e &= \arcsin \left( A / V\_e \right) \\ E &= C - \frac{V\_e^3 \sin \theta\_e \cos^2 \theta\_e}{2R\_e^2} \\ F &= D - \frac{V\_e^4 \cos^2 \theta\_e \left( 5 \sin^2 \theta\_e - 1 \right)}{8R\_e^3} \end{aligned} \end{cases} \tag{9}$$

where:

$$\begin{cases} \begin{aligned} A &= \left(V\_T \sin \theta\_T + V\_R \sin \theta\_R\right)/2 \\ B &= \left(\frac{V\_T^2 \cos^2 \theta\_T}{R\_{Tc}} + \frac{V\_R^2 \cos^2 \theta\_R}{R\_{Rc}}\right) R\_e/2 \\ C &= \frac{V\_T^3 \sin \theta\_T \cos^2 \theta\_T}{4R\_{Tc}^2} + \frac{V\_R^3 \sin \theta\_R \cos^2 \theta\_R}{4R\_{Rc}^2} \\ D &= \frac{V\_T^4 \cos^2 \theta\_T (5 \sin^2 \theta\_T - 1)}{16R\_{Tc}^3} + \frac{V\_R^4 \cos^2 \theta\_R (5 \sin^2 \theta\_R - 1)}{16R\_{Rc}^3}. \end{aligned} \end{cases} \tag{10}$$

At this point, all defined variables are solved. The range error analysis based on the new equivalent range model is presented next.
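The solution in Equations (9) and (10) can be sketched in a few lines of Python. The functions below compute *Re*, *Ve*, *θe*, *E*, and *F* from the bistatic geometry and compare the total range of Equation (2) with twice the equivalent range, with and without the cubic and quartic terms. The factor of two follows the right-hand sides of Equation (8). The function names and the numerical geometry are illustrative assumptions and are not the Table 1 parameters.

```python
import math


def equivalent_parameters(R_Tc, R_Rc, V_T, V_R, th_T, th_R):
    """Solve Eqs. (9)-(10) for R_e, V_e, theta_e, E, F (angles in radians)."""
    sT, cT2 = math.sin(th_T), math.cos(th_T) ** 2
    sR, cR2 = math.sin(th_R), math.cos(th_R) ** 2
    R_e = 0.5 * (R_Tc + R_Rc)
    A = 0.5 * (V_T * sT + V_R * sR)
    B = 0.5 * R_e * (V_T**2 * cT2 / R_Tc + V_R**2 * cR2 / R_Rc)
    C = V_T**3 * sT * cT2 / (4 * R_Tc**2) + V_R**3 * sR * cR2 / (4 * R_Rc**2)
    D = (V_T**4 * cT2 * (5 * sT**2 - 1) / (16 * R_Tc**3)
         + V_R**4 * cR2 * (5 * sR**2 - 1) / (16 * R_Rc**3))
    V_e = math.sqrt(A * A + B)
    th_e = math.asin(A / V_e)
    se, ce2 = math.sin(th_e), math.cos(th_e) ** 2
    E = C - V_e**3 * se * ce2 / (2 * R_e**2)
    F = D - V_e**4 * ce2 * (5 * se**2 - 1) / (8 * R_e**3)
    return R_e, V_e, th_e, E, F


def hyperbola(R0, V, th, t):
    """Single-square-root range at slow-time offset t = eta - eta_pc."""
    return math.sqrt(R0**2 + (V * t) ** 2 - 2 * R0 * V * t * math.sin(th))


def range_errors(R_Tc, R_Rc, V_T, V_R, th_T, th_R, t):
    """Absolute total-range error of the three-parameter hyperbola and of the
    model with the extra cubic and quartic terms, at offset t."""
    R_e, V_e, th_e, E, F = equivalent_parameters(R_Tc, R_Rc, V_T, V_R, th_T, th_R)
    true_range = hyperbola(R_Tc, V_T, th_T, t) + hyperbola(R_Rc, V_R, th_R, t)
    hyp = hyperbola(R_e, V_e, th_e, t)
    traditional = abs(true_range - 2 * hyp)
    proposed = abs(true_range - 2 * (hyp + E * t**3 + F * t**4))
    return traditional, proposed


# Hypothetical geometry: ranges in m, speeds in m/s, angles in rad, offset in s.
trad_err, prop_err = range_errors(10000.0, 8000.0, 100.0, 120.0, 0.5, 1.0, 1.0)
```

For this assumed geometry the residual of the model with *E* and *F* is dominated by fifth-order terms and is smaller than that of the plain hyperbola, which is the behaviour the range error analysis examines.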

#### *2.2. Range Error Analysis*

To evaluate the proposed equivalent range model, an analysis of the range error based on an X-band bistatic SAR system is given. The simulated parameters are listed in Table 1. The results of the equivalent hyperbolic slant range error are shown in Figure 2.


**Table 1.** Simulation parameters.

**Figure 2.** Approximation error of the bistatic slant range. (**a**) Approximation error of the traditional range model. (**b**) Approximation error of the proposed range model.

Figure 2a is the approximation error of the traditional hyperbolic approximation range model [13], where the high-order terms are ignored. Figure 2b is the approximation error of the proposed hyperbolic range model. The constant term, the linear term, and the quadratic term in Equations (5)–(7) are used to solve the defined variables. Thus, the residual terms lead to the approximation slant range error. To prove that the proposed model can reduce the range error compared with the traditional model, we first give the expression of the traditional model and its corresponding Taylor expansion, which are:

$$R\_t\left(\eta\right) = \sqrt{R\_t^2 + V\_t^2 \left(\eta - \eta\_{\text{pc}}\right)^2 - 2R\_t V\_t \left(\eta - \eta\_{\text{pc}}\right) \sin \theta\_t} \tag{11}$$

$$\begin{split} R\_t\left(\eta\right) &= R\_t - V\_t\sin\theta\_t\left(\eta - \eta\_{\text{pc}}\right) + \frac{V\_t^{2}\cos^{2}\theta\_t}{2R\_t}\left(\eta - \eta\_{\text{pc}}\right)^{2} \\ &+ \frac{V\_t^{3}\sin\theta\_t\cos^{2}\theta\_t}{2R\_t^{2}}\left(\eta - \eta\_{\text{pc}}\right)^{3} + \frac{V\_t^{4}\cos^{2}\theta\_t\left(5\sin^{2}\theta\_t - 1\right)}{8R\_t^{3}}\left(\eta - \eta\_{\text{pc}}\right)^{4}, \end{split} \tag{12}$$

where *Rt* (*η*), *Rt*, *Vt*, and *θt* are the variables in the traditional range model. The error in Figure 2a is the difference between the sum of the cubic terms, the quartic terms, and the residual terms in Equations (5) and (6) and the sum of the cubic term, the quartic term, and the residual term in Equation (12), while the error in Figure 2b is the difference between the sum of the residual terms in Equations (5) and (6) and the residual term in Equation (7). The error caused by the cubic and quartic terms is eliminated. From Figure 2, it can be found that the error in Figure 2a is up to 1.9 m, while the error in Figure 2b is less than 0.15 m. According to the parameters listed in Table 1, the approximation slant range error of the proposed model is much less than a range resolution cell. Therefore, the proposed equivalent slant range model is more accurate than the traditional range model. The following imaging algorithm is derived based on the proposed range model.

#### **3. Imaging Algorithm**

According to the previous analysis, the improved hyperbolic approximation model approximates the true slant range more closely than traditional hyperbolic approximation models. In this section, a modified omega-K algorithm based on the improved equivalent range model is proposed for parallel bistatic forward-looking SAR imaging.

#### *3.1. Signal Model*

Assume that a linear frequency-modulated signal is transmitted from the transmitter to the receiver. Then, the base-band echo signal of an arbitrary target *P* is given as:

$$S\_1\left(t\_r,\eta\right) = \exp\left\{j\pi\gamma\left[t\_r - \frac{2R\_e\left(\eta\right)}{c}\right]^2\right\} \exp\left[-j\frac{4\pi R\_e\left(\eta\right)}{\lambda}\right] \tag{13}$$

where *γ* is the range chirp rate, *c* is the speed of light, *λ* is the wavelength, *tr* is the fast time, and *η* is the slow time. To simplify the expression and further derivation, the envelopes of the range and azimuth are ignored.

Transforming Equation (13) into the range-frequency azimuth-time domain yields:

$$S\_2(f\_r, \eta) = \exp\left(-j\frac{\pi f\_r^2}{\gamma}\right) \exp\left[-j\frac{4\pi \left(f\_r + f\_c\right)}{c} R\_e \left(\eta\right)\right] \tag{14}$$

where *fr* is the frequency domain variable corresponding to *tr* and *fc* is the carrier frequency. From Equation (14), it can be easily found that the first exponential term is the range frequency modulation term. This term can be compensated by multiplying its complex conjugate in the range frequency domain. Thus, the first frequency modulation compensation function is:

$$H\_{1FM} \left( f\_r, \eta \right) = \exp \left( j \frac{\pi f\_r^2}{\gamma} \right). \tag{15}$$

Multiplying Equation (14) by Equation (15) yields:

$$S\_3(f\_r, \eta) = \exp\left[-j\frac{4\pi\left(f\_r + f\_c\right)}{c} R\_e\left(\eta\right)\right].\tag{16}$$

The exponential term in Equation (16) indicates the severe coupling between range and azimuth. To finish the phase focusing, a modified omega-K algorithm based on the signal model is presented.

#### *3.2. Modified Omega-K Imaging Algorithm*

To analyze the exponential term in Equation (16), Equation (3) is first substituted into Equation (16). Then, we get:

$$\begin{split} S\_4\left(f\_r, \eta\right) = \exp\Big\{-j\frac{4\pi\left(f\_r + f\_c\right)}{c}\Big[&\sqrt{R\_e^2 + V\_e^2\left(\eta - \eta\_{pc}\right)^2 - 2R\_e V\_e\left(\eta - \eta\_{pc}\right)\sin\theta\_e} \\ &+ E\left(\eta - \eta\_{pc}\right)^3 + F\left(\eta - \eta\_{pc}\right)^4\Big]\Big\}. \end{split} \tag{17}$$

Equation (17) shows that the signal consists of the traditional hyperbolic term and high-order terms. The traditional omega-K can handle the hyperbolic term well, but cannot handle the high-order terms. The first step of the omega-K algorithm is the compensation of the cubic term and the quartic term. Variable substitution is performed on Equation (17), and then, we get:

$$\begin{split} S\_5\left(k\_r, X\right) = \exp\Big\{-jk\_r\Big[&\sqrt{R\_e^{2} + \left(X - X\_{pc}\right)^{2} - 2R\_e\left(X - X\_{pc}\right)\sin\theta\_e} \\ &+ \frac{E}{V\_e^{3}}\left(X - X\_{pc}\right)^{3} + \frac{F}{V\_e^{4}}\left(X - X\_{pc}\right)^{4}\Big]\Big\} \end{split} \tag{18}$$

where *kr* = 4*π*(*fr* + *fc*)/*c* is the wavenumber, *X* = *Veη*, and *Xpc* = *Veηpc*. Then, we get *Re* (*η*) = *Re* (*X*). Transforming Equation (18) into the two-dimensional wavenumber domain yields:

$$\begin{split} S\_6 \left( k\_r, k\_x \right) &= \int S\_5 \left( k\_r, X \right) \exp \left( -jk\_x X \right) dX \\ &= \int \exp \left\{ -jk\_r R\_e \left( X \right) \right\} \exp \left( -jk\_x X \right) dX \\ &= \int \exp \left\{ -j\phi \left( k\_r, k\_x, X \right) \right\} dX \end{split} \tag{19}$$

where *kx* = 2*πfa*/*Ve*, *fa* is the azimuth frequency, and:

$$\begin{split} \phi\left(k\_r,k\_x,X\right) = k\_r\Big[&\sqrt{R\_e^{2}+\left(X-X\_{pc}\right)^{2}-2R\_e\left(X-X\_{pc}\right)\sin\theta\_e} \\ &+\frac{E}{V\_e^{3}}\left(X-X\_{pc}\right)^{3}+\frac{F}{V\_e^{4}}\left(X-X\_{pc}\right)^{4}\Big]+k\_x X. \end{split} \tag{20}$$

To solve Equation (19), the stationary phase point of *φ* (*kr*, *kx*, *X*) should be obtained firstly. However, the existence of high-order terms complicates the solution process. For further analysis, the phase is first rewritten as:

$$\phi\left(k\_r, k\_x, X\right) = \phi\_t\left(k\_r, k\_x, X\right) + k\_r\left[\frac{E}{V\_e^3}\left(X - X\_{pc}\right)^3 + \frac{F}{V\_e^4}\left(X - X\_{pc}\right)^4\right] \tag{21}$$

where *φt* (*kr*, *kx*, *X*) is the traditional phase term. It is widely accepted that if the phase error is smaller than *π*/4 [1], the imaging performance will not be affected much by the approximation. The phase error simulation is given in Figure 3.

**Figure 3.** Phase error simulation.

From Figure 3, it can be seen that all absolute phase errors are less than *π*/4. Thus, the stationary phase point of *φt* (*kr*, *kx*, *X*) is regarded as the approximate stationary phase point of *φ* (*kr*, *kx*, *X*). The approximate stationary phase point of *φ* (*kr*, *kx*, *X*) is:

$$X^\* = -\frac{k\_x R\_e \cos \theta\_e}{\sqrt{k\_r^2 - k\_x^2}} + R\_e \sin \theta\_e + X\_{pc} \tag{22}$$

where *X*<sup>∗</sup> is only a designation of the solution and (∗) is not an operator.

Substituting Equation (22) in Equation (19) and applying POSP yield the two-dimensional wavenumber domain signal as:

$$\begin{split} S\_7\left(k\_r,k\_x\right) = \exp\Big\{&-j\sqrt{k\_r^{2}-k\_x^{2}}R\_e\cos\theta\_e - jk\_x\left(R\_e\sin\theta\_e + X\_{pc}\right) \\ &- jk\_r\left[\frac{E}{V\_e^{3}}\left(X^{\*}-X\_{pc}\right)^{3} + \frac{F}{V\_e^{4}}\left(X^{\*}-X\_{pc}\right)^{4}\right]\Big\}. \end{split} \tag{23}$$

The cubic term and quartic term in Equation (23) can be easily compensated by multiplying its conjugate form. Therefore, the high-order filter is:

$$H\_2\left(k\_r, k\_x\right) = \exp\left\{jk\_r \left[\frac{E}{V\_e^3} \left(X^{\*\*} - X\_{pc}\right)^3 + \frac{F}{V\_e^4} \left(X^{\*\*} - X\_{pc}\right)^4\right] \right\} \tag{24}$$

where *X*∗∗ is the value of *X*<sup>∗</sup> at the reference range and (∗∗) is not an operator.

Multiplying Equations (23) and (24), we get the compensated signal for the further omega-K imaging algorithm. The signal is:

$$S\_8\left(k\_r,k\_x\right) = \exp\left\{-j\sqrt{k\_r^2 - k\_x^2}R\_e\cos\theta\_e - jk\_x\left(R\_e\sin\theta\_e + X\_{pc}\right)\right\}.\tag{25}$$
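The hyperbolic part of the stationary-phase result can be checked numerically. The sketch below evaluates the traditional phase term *φt*, uses the closed-form stationary point obtained by setting its derivative with respect to *X* to zero, and confirms that the phase at that point equals the exponent of Equation (25). The function names and the numerical values are illustrative assumptions.

```python
import math


def phi_t(X, k_r, k_x, R_e, th_e, X_pc):
    """Traditional phase term: k_r times the hyperbolic range, plus k_x * X."""
    d = X - X_pc
    return k_r * math.sqrt(R_e**2 + d**2 - 2 * R_e * d * math.sin(th_e)) + k_x * X


def stationary_point(k_r, k_x, R_e, th_e, X_pc):
    """Root of d(phi_t)/dX = 0, valid for |k_x| < k_r."""
    return (-k_x * R_e * math.cos(th_e) / math.sqrt(k_r**2 - k_x**2)
            + R_e * math.sin(th_e) + X_pc)


def hyperbolic_phase(k_r, k_x, R_e, th_e, X_pc):
    """Magnitude of the exponent in Eq. (25)."""
    return (math.sqrt(k_r**2 - k_x**2) * R_e * math.cos(th_e)
            + k_x * (R_e * math.sin(th_e) + X_pc))


# Hypothetical values: wavenumbers in rad/m, ranges in m, angle in rad.
k_r, k_x, R_e, th_e, X_pc = 420.0, 50.0, 9000.0, 0.7, 100.0
X_star = stationary_point(k_r, k_x, R_e, th_e, X_pc)

# Central-difference slope of phi_t at X_star; it should be close to zero.
h = 1e-2
slope = (phi_t(X_star + h, k_r, k_x, R_e, th_e, X_pc)
         - phi_t(X_star - h, k_r, k_x, R_e, th_e, X_pc)) / (2 * h)
phase_at_star = phi_t(X_star, k_r, k_x, R_e, th_e, X_pc)
```

The check relies only on the hyperbolic term, so it says nothing about the accuracy of neglecting the cubic and quartic terms when locating the stationary point; that is what the phase error simulation in Figure 3 addresses.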

A two-step omega-K is performed on Equation (25) to finish the imaging focusing.

The first step is the bulk focusing. A reference function is designed based on the reference range to finish coarse focusing. This filter can compensate the phase of signals of those points at the reference range. The reference function is:

$$H\_{ref}\left(k\_r, k\_x\right) = \exp\left\{j\sqrt{k\_r^2 - k\_x^2}R\_{ref}\cos\theta\_e + jk\_x\left(R\_{ref}\sin\theta\_e + X\_{pc}\right)\right\}.\tag{26}$$

Multiplying Equations (25) and (26) yields:

$$S\_9 \left( k\_r, k\_x \right) = \exp \left\{ -j \sqrt{k\_r^{2} - k\_x^{2}} \cos \theta\_e \left( R\_e - R\_{ref} \right) - j k\_x \sin \theta\_e \left( R\_e - R\_{ref} \right) \right\}. \tag{27}$$

After bulk focusing, the residual phase at the reference range is removed. However, the residual phase of points not at the reference range remains. Moreover, the phase contains coupling terms between range and azimuth. For precise focusing of all points, the Stolt interpolation function is given as:

$$k\_y = \sqrt{k\_r^2 - k\_x^2} \cos \theta\_\varepsilon + k\_x \sin \theta\_\varepsilon. \tag{28}$$

After Stolt interpolation, the resampled signal becomes:

$$S\_{10} \left( k\_{y}, k\_{x} \right) = \exp \left[ -jk\_{y} \left( R\_{\varepsilon} - R\_{ref} \right) \right]. \tag{29}$$

From Equation (29), it is evident that the coupling between range and azimuth has been removed. The phase is a linear function of *ky*. Then, the inverse fast Fourier transform is implemented on Equation (29) to complete imaging.
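As a quick numerical check of Equations (27)–(29), the following Python sketch evaluates the residual phase of Equation (27) on a wavenumber grid and confirms that it equals −*k<sub>y</sub>*(*R* − *R<sub>ref</sub>*) once *k<sub>y</sub>* is defined by Equation (28). The grid limits, the angle, and the two ranges are illustrative assumptions, not the parameters of Table 1, and the function names are hypothetical. A practical implementation would additionally resample the data onto a uniform *k<sub>y</sub>* grid, which is omitted here.

```python
import numpy as np

def stolt_ky(k_r, k_x, theta_e):
    """Stolt mapping of Equation (28)."""
    return np.sqrt(k_r**2 - k_x**2) * np.cos(theta_e) + k_x * np.sin(theta_e)

def residual_phase(k_r, k_x, theta_e, R_e, R_ref):
    """Phase of Equation (27), i.e. what remains after bulk focusing."""
    dR = R_e - R_ref
    return (-np.sqrt(k_r**2 - k_x**2) * np.cos(theta_e) * dR
            - k_x * np.sin(theta_e) * dR)

# Illustrative values only (assumptions)
theta_e = np.deg2rad(20.0)
R_e, R_ref = 10_200.0, 10_000.0
k_r = np.linspace(400.0, 420.0, 64)[:, None]    # range wavenumber (rad/m)
k_x = np.linspace(-50.0, 50.0, 128)[None, :]    # azimuth wavenumber (rad/m)

phase = residual_phase(k_r, k_x, theta_e, R_e, R_ref)
k_y = stolt_ky(k_r, k_x, theta_e)
linear_phase = -k_y * (R_e - R_ref)             # phase of Equation (29)
max_dev = float(np.max(np.abs(phase - linear_phase)))
```

In this sketch `max_dev` stays at the level of floating-point rounding, which reflects the statement that the phase becomes a linear function of *k<sub>y</sub>* after the change of variable.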

According to the analysis mentioned above, the whole imaging process is shown in Figure 4.

**Figure 4.** Flowchart of modified omega-K.

The specific steps are as follows:


#### **4. Simulation Results**

In this section, experimental simulations of parallel bistatic forward-looking SAR are carried out to demonstrate the effectiveness of the proposed imaging algorithm. The system parameters are listed in Table 1. Four points at different locations were chosen to compare the imaging performance: *P*<sub>0</sub>(0, 0), *P*<sub>1</sub>(0, 500), *P*<sub>2</sub>(200, 0), and *P*<sub>3</sub>(200, 500), with coordinates in meters. The omega-K imaging algorithm based on the traditional three-parameter hyperbolic range model [13] was selected as the reference.

Figure 5 compares the overall imaging performance before geometric correction. Figure 5a is the result of the traditional imaging algorithm, and Figure 5b is the result of the proposed imaging algorithm. In Figure 5a, all four points are focused, but the two points on the right are distorted. In contrast, Figure 5b shows that the proposed algorithm focuses the two points on the right better than the traditional algorithm.

**Figure 5.** Imaging results. (**a**) Imaging results of the traditional hyperbolic omega-K algorithm. (**b**) Imaging results of the proposed hyperbolic omega-K algorithm.

To observe the imaging performance more intuitively, the sub-images of the points extracted from Figure 5 are given in Figure 6. Figure 6a–c presents the imaging results of *P*<sub>0</sub>, *P*<sub>2</sub>, and *P*<sub>3</sub>, respectively, achieved with the traditional hyperbolic range model given in [13]. Figure 6d–f shows the imaging results of the same three targets obtained by the proposed modified omega-K imaging algorithm. As Figure 6a,d shows, both algorithms focus the scene center *P*<sub>0</sub> very well, because the omega-K algorithm always takes the scene center as the reference point for bulk focusing. For the points away from the center (*P*<sub>2</sub> and *P*<sub>3</sub>), the proposed algorithm clearly performs much better than the traditional algorithm. For further analysis, the azimuth impulse response of the farthest point *P*<sub>3</sub> is given in Figure 7. Table 2 lists the peak sidelobe ratio (PSLR) and the integrated sidelobe ratio (ISLR) of target *P*<sub>3</sub>.

**Figure 6.** Imaging results. (**a**) Imaging result of *P*<sub>0</sub> by the traditional algorithm. (**b**) Imaging result of *P*<sub>2</sub> by the traditional algorithm. (**c**) Imaging result of *P*<sub>3</sub> by the traditional algorithm. (**d**) Imaging result of *P*<sub>0</sub> by the proposed algorithm. (**e**) Imaging result of *P*<sub>2</sub> by the proposed algorithm. (**f**) Imaging result of *P*<sub>3</sub> by the proposed algorithm.

**Figure 7.** Azimuth impulse response of *P*<sub>3</sub>. (**a**) Traditional hyperbolic omega-K algorithm. (**b**) Proposed hyperbolic omega-K algorithm.


**Table 2.** Image quality parameters of *P*<sub>3</sub>. PSLR, peak sidelobe ratio; ISLR, integrated sidelobe ratio.

Figure 7a is obtained with the traditional hyperbolic omega-K algorithm, and Figure 7b with the proposed hyperbolic omega-K algorithm. Compared with the traditional omega-K algorithm, the proposed omega-K algorithm improves the azimuth impulse response. The objective image quality values demonstrate the effectiveness of the proposed omega-K algorithm.

#### **5. Conclusions**

In this article, an improved hyperbolic range model was proposed to deal with the particular form of the echo of bistatic forward-looking SAR. A modified omega-K imaging algorithm based on this range model was used to complete the focusing. The high-order terms were taken into account to reduce the range approximation error, and the extra phase compensation benefited the focusing of the omega-K algorithm. Simulation results showed that the proposed method improves imaging quality compared with the range model without high-order compensation terms.

**Author Contributions:** Conceptualization, W.S. and H.G.; methodology, C.W.; software, C.W.; validation, all authors; formal analysis, C.W.; investigation, J.Y.; writing–original draft preparation, C.W.; writing—review and editing, W.S. and J.Y.; supervision, H.G.; project administration, W.S.; funding acquisition, W.S., H.G. and J.Y.

**Funding:** This research was funded by the National Natural Science Foundation of China under Grant Numbers 61471198 and 61671246 and the Natural Science Foundation of Jiangsu Province under Grant Numbers BK20160847 and BK20170855.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Microwave Staring Correlated Imaging Based on Unsteady Aerostat Platform**

#### **Zheng Jiang, Yuanyue Guo \*, Jie Deng, Weidong Chen and Dongjin Wang**

Key Laboratory of Electromagnetic Space Information, Chinese Academy of Sciences, University of Science and Technology of China, Hefei 230026, China; jiangz10@mail.ustc.edu.cn (Z.J.); dengjie@mail.ustc.edu.cn (J.D.); wdchen@ustc.edu.cn (W.C.); wangdj@ustc.edu.cn (D.W.)

**\*** Correspondence: yuanyueg@ustc.edu.cn; Tel.: +86-138-6613-5598

Received: 3 May 2019; Accepted: 21 June 2019; Published: 24 June 2019

**Abstract:** Microwave staring correlated imaging (MSCI), which can achieve high-resolution imaging of relatively stationary targets, is a promising approach for remote sensing. For continuous observation of a fixed key area, a tethered floating aerostat is often used as the carrying platform for the MSCI radar system; however, the non-cooperative random motion of the platform caused by wind and by its own imbalance results in blurred imaging, and even in imaging failure. This paper presents a method that takes the instabilities of the platform into account. The radar is combined with an adaptive variable suspension (AVS) and a position and orientation system (POS), which respectively steer the antenna beam automatically toward the target area and dynamically measure the position and attitude of the stochastic radiation radar array. By analyzing the motion features of the aerostat platform, a motion model of the radar array is established, from which the real-time position vector and attitude angles of each antenna can be obtained; meanwhile, a selection matrix of beam coverage is introduced to indicate the dynamic illumination of the radar antenna beam within the overall imaging area. Because the POS data are discrete and low-rate, a curve-fitting algorithm is used to estimate the position vector and attitude of each antenna at each high-rate sampling time during the imaging period. Finally, the MSCI model based on the unsteady aerostat platform is set up. Simulations validate the proposed scheme: under different unstable platform movements, it achieves better imaging performance than the conventional MSCI method.

**Keywords:** microwave staring correlated imaging; unsteady aerostat platform; motion parameter fitting; position error

#### **1. Introduction**

Microwave remote sensing can work around the clock and in all weather conditions [1], so it has been used in many civilian and military fields, such as disaster monitoring and military reconnaissance [2]. Conventional high-resolution microwave remote sensing commonly uses Synthetic Aperture Radar (SAR), which is based on the Range-Doppler (RD) principle [3]. However, SAR requires relative motion between radar and target, and its revisit period is long. In forward-looking or staring imaging geometries, SAR cannot work effectively and faces great challenges in obtaining high-resolution images.

Microwave staring correlated imaging is a novel high-resolution staring imaging technique that does not require relative motion of the target [4–6]. The essence of MSCI is to construct a temporal-spatial stochastic radiation field (TSSRF) in the imaging region, which is typically realized by a multi-transmitter configuration emitting independent stochastic waveforms [7,8]. Through a correlation process (CP) between the target scattering echo and the TSSRF, targets within the antenna beam can be resolved. Due to its superior imaging performance without target relative motion, MSCI has attracted increasing attention and made progress in many aspects such as random radiation source optimization [9–11], imaging algorithms [12–14] and outfield imaging experiments [15].

At present, research on MSCI depends on the premise of an ideal stable imaging platform, i.e., the system platform of the MSCI radar is assumed to be stationary. However, it is not guaranteed in practical applications. To observe a fixed area, the MSCI radar needs to be raised to a certain height. A tethered aerostat is suitable to serve as the platform of MSCI radar with advantages of long-stay time in the air, wide coverage area and low cost [16,17], but it cannot keep absolutely stationary in the air because of the non-cooperative motion caused by wind and unbalance. The platform instability will result in imaging system errors and the imaging performance will be seriously degraded when the random motion of platform becomes intense.

Imaging system errors in MSCI have been investigated in many studies, since they are generally present in practice. For example, to compensate for the gain–phase error in MSCI, Zhou et al. propose a sparse auto-calibration method, a cyclic iterative process that combines target reconstruction with gain–phase error estimation [18]. In reference [19], MSCI with phase error is formulated as a Bayesian hierarchical prior model, and a self-calibration variational message passing (SC-VMP) algorithm is proposed, which estimates the scattering coefficient and the phase error iteratively by VMP and Newton's method to improve the performance of MSCI with phase error. To estimate the gain–phase error and the synchronization error under high SNR, Tian et al. add a reference receiver to the MSCI system to receive the direct wave signal, from which both errors are estimated [20]. In reference [21], a method of strip-mode MSCI with self-calibration of gain–phase errors is proposed to handle gain–phase errors in a large scene. Reference [22] considers the off-grid problem in MSCI and develops an algorithm based on variational sparse Bayesian learning (VSBL) to solve it. Reference [23] focuses on sparsity-driven MSCI with array position error (APE) and proposes two sparse auto-calibration imaging algorithms in the sparse Bayesian learning framework to compensate for the APE. Li et al. analyze the target-motion-induced error and provide an applicable approach for MSCI in its presence [24]. Hitherto, research on MSCI system errors has generally concentrated on gain–phase error, off-grid error, APE, and target-motion-induced error. There has been no study of the imaging system error caused by platform instability, which is an important issue in practical applications.

To address this problem, this paper proposes an MSCI method based on an unsteady aerostat platform. In the proposed method, the antenna array with multiple transmitters and one receiver is mounted on the aerostat platform through an adaptive variable suspension (AVS), which steers the antenna beam toward the target area, and a position and orientation system (POS) located at the center of the array dynamically measures the position and attitude of the array during the imaging process. The effects of antenna motion and dynamic beam coverage caused by platform instability are included in the imaging model to reduce the model error. For antenna motion, the real-time position vectors of the antennas are used in the imaging model in place of static position vectors. Calculating the real-time position vector of an antenna requires the translational speed and the rotational angular velocity of the array in each signal pulse; these are estimated at every sampling time from the low-speed discrete POS data by a least-squares curve-fitting method. For dynamic beam coverage, the selection matrix of beam coverage, calculated from the position and attitude of the array, is introduced to indicate the illuminated area at each pulse.

The rest of this paper is organized as follows. Section 2 presents the MSCI method based on the unsteady aerostat platform. Section 3 describes the estimation of the translational speed and rotational angular velocity of the antenna array. In Section 4, several simulations demonstrate the effectiveness of the proposed method. Section 5 concludes this paper.

#### **2. MSCI Method Based on Unsteady Aerostat Platform**

#### *2.1. Imaging Scene*

MSCI can be realized by using a multi-transmitter configuration to transmit time-independent and group-orthogonal waveforms. To realize observation of targets on ground, MSCI radar can be raised to the air by a tethered aerostat. As shown in Figure 1, the antenna array with *N* transmitters and one receiver at its array center is carried by AVS which is able to control the antenna beam orientation, and POS is placed at the center of the array to dynamically measure its position and the attitude during the imaging process.

**Figure 1.** Imaging geometry of MSCI based on unsteady aerostat platform.

To describe the geometry of the imaging scene, the coordinate system *O<sub>t</sub>X<sub>t</sub>Y<sub>t</sub>Z<sub>t</sub>* is established as the earth-surface inertial reference frame. Its origin *O<sub>t</sub>* is the projection of the array center onto the ground (the *O<sub>t</sub>X<sub>t</sub>Y<sub>t</sub>* plane) at the start of imaging; its *X<sub>t</sub>* axis points east along the local latitude line, its *Y<sub>t</sub>* axis points north along the local meridian, and its *Z<sub>t</sub>* axis points upward along the local geographic vertical.

All transmitters synchronously transmit independent random frequency-hopping signals, and the signal transmitted by the *n*-th transmitter is denoted as

$$s\_n(t) = \sum\_{l=1}^{L} \operatorname{rect}[\frac{t - (l-1)T\_p}{T}] \exp\{j2\pi f\_{nl}[t - (l-1)T\_p] \},\tag{1}$$

where *f<sub>nl</sub>* is the frequency of the *l*-th pulse emitted by the *n*-th transmitter, randomly selected within the system bandwidth; rect(·) is the rectangular function; *L* is the total number of pulses; *T<sub>p</sub>* denotes the pulse repetition interval; and *T* is the pulse width.
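The waveform of Equation (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rectangular gate is taken to be 1 on [0, *T*) relative to each pulse start, the hop frequencies are drawn uniformly as baseband offsets within an assumed bandwidth, and the sampling rate and all numerical values are hypothetical rather than taken from Table 1.

```python
import numpy as np

def hopping_waveform(t, f_nl, T_p, T):
    """Evaluate s_n(t) of Equation (1) for one transmitter.

    t    : array of time samples (s)
    f_nl : length-L array, hop frequency of each pulse (Hz)
    T_p  : pulse repetition interval (s)
    T    : pulse width (s), with T <= T_p
    """
    t = np.asarray(t, dtype=float)
    s = np.zeros(t.shape, dtype=complex)
    for l, f in enumerate(f_nl):            # l = 0 is the first pulse
        tau = t - l * T_p                   # time relative to the pulse start
        gate = (tau >= 0.0) & (tau < T)     # assumed support of rect: [0, T)
        s[gate] = np.exp(1j * 2 * np.pi * f * tau[gate])
    return s

# Illustrative parameters (assumptions)
rng = np.random.default_rng(0)
L, T_p, T = 4, 10e-6, 5e-6
bandwidth, fs = 20e6, 100e6
f_nl = rng.uniform(-bandwidth / 2, bandwidth / 2, size=L)  # random hop per pulse
t = np.arange(L * 1000) / fs                               # 1000 samples per T_p
s_n = hopping_waveform(t, f_nl, T_p, T)
```

Each transmitter would use its own independent draw of `f_nl`, which is what makes the superposed field stochastic in both time and space.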

During the imaging process, the POS dynamically records the position and the attitude of the antenna array. The attitude of the array is described by three Euler angles: the yaw angle, the pitch angle, and the roll angle. To define these angles, the aerostat coordinate system *O<sub>b</sub>X<sub>b</sub>Y<sub>b</sub>Z<sub>b</sub>* is established on the array, with its origin *O<sub>b</sub>* located at the array center, its *X<sub>b</sub>* axis pointing to the right along the horizontal axis of the array, its *Y<sub>b</sub>* axis pointing forward along the longitudinal axis of the array and its *Z<sub>b</sub>* axis perpendicular to the radar array plane. The yaw angle *θ* is defined as the angle between the projection of *Y<sub>b</sub>* on the *O<sub>t</sub>X<sub>t</sub>Y<sub>t</sub>* plane and the *Y<sub>t</sub>* axis, with the *Y<sub>b</sub>* axis right side being positive. The pitch angle *ϕ* is defined as the angle between the *Y<sub>b</sub>* axis and the *O<sub>t</sub>X<sub>t</sub>Y<sub>t</sub>* plane, with the *Y<sub>b</sub>* axis up side being positive. The roll angle *φ* is defined as the angle between the *Z<sub>b</sub>* axis and the vertical plane containing the *Y<sub>b</sub>* axis, with the *Z<sub>b</sub>* axis right side being positive. The graphical diagram for the attitude angles is shown in Figure 2, where *Y*′<sub>*b*</sub> is the projection of *Y<sub>b</sub>* on the *O<sub>t</sub>X<sub>t</sub>Y<sub>t</sub>* plane and *Z*′<sub>*b*</sub> is the projection of *Z<sub>b</sub>* on the *O<sub>t</sub>Z<sub>t</sub>Y<sub>b</sub>* plane.

**Figure 2.** Graphical diagram for the attitude angles.

#### *2.2. Real-Time Position Vector of Antenna*

To eliminate the influence of antenna motion, the real-time position vectors **r**<sup>*n*</sup>(*t<sub>l</sub>*) and **r**<sup>*s*</sup>(*t<sub>l</sub>*) are introduced into the MSCI model based on the unsteady aerostat platform, where **r**<sup>*n*</sup>(*t<sub>l</sub>*) and **r**<sup>*s*</sup>(*t<sub>l</sub>*) denote the real-time position vectors of the *n*-th transmitter and the receiver, respectively, at *t<sub>l</sub>* in the *l*-th pulse in *O<sub>t</sub>X<sub>t</sub>Y<sub>t</sub>Z<sub>t</sub>*.

The complicated motion of the antenna array is decomposed into three translational and three rotational components. The translations are along *X<sub>t</sub>*, *Y<sub>t</sub>* and *Z<sub>t</sub>*, and the rotations are in yaw, pitch, and roll. Because the pulse repetition interval *T<sub>p</sub>* is short, the translational speed and the rotational angular velocity do not change drastically within it, so the array motion is assumed to be a uniform translation and a uniform rotation during each pulse repetition interval *T<sub>p</sub>*. Hence the translational speed of the antenna array during the *l*-th pulse is denoted as **v**<sub>*l*</sub> = [*v*<sub>*l*,*x*</sub>, *v*<sub>*l*,*y*</sub>, *v*<sub>*l*,*z*</sub>], where *v*<sub>*l*,*x*</sub>, *v*<sub>*l*,*y*</sub>, *v*<sub>*l*,*z*</sub> are the speeds of the translations along the *X<sub>t</sub>* axis, *Y<sub>t</sub>* axis and *Z<sub>t</sub>* axis, respectively. The rotational angular velocity of the antenna array during the *l*-th pulse is denoted as ***ω***<sub>*l*</sub> = [*ω*<sub>*l*,*θ*</sub>, *ω*<sub>*l*,*ϕ*</sub>, *ω*<sub>*l*,*φ*</sub>], where *ω*<sub>*l*,*θ*</sub>, *ω*<sub>*l*,*ϕ*</sub>, *ω*<sub>*l*,*φ*</sub> are the rotational angular velocities of the yaw angle, the pitch angle, and the roll angle, respectively.

If the motion of the antenna array during each pulse is known, the real-time position vector of each antenna can be determined. *T<sub>pos</sub>* denotes the repetition period of the POS data. As Figure 3 shows, the pulse repetition interval *T<sub>p</sub>* is far shorter than *T<sub>pos</sub>*, so there are many transmitted pulses between adjacent POS records. Assuming that the recording time *t*<sub>*i*,*pos*</sub> of the *i*-th POS record falls in the *l*′-th pulse and the recording time *t*<sub>*i*+1,*pos*</sub> of the next POS record falls in the *l*″-th pulse, the real-time position vector **r**<sup>*n*</sup>(*t<sub>l</sub>*) of the *n*-th transmitter at *t<sub>l</sub>* in the *l*-th pulse (*l*′ ≤ *l* ≤ *l*″) can be expressed as

$$\mathbf{r}^{\rm n}\left(t\_{l}\right) = \mathbf{r}^{\rm n}\left(t\_{i,\rm pos}\right) + \Delta\mathbf{r}\_{\rm v}\left(t\_{l} - t\_{i,\rm pos}\right) + \Delta\mathbf{r}\_{\omega}^{\rm n}\left(t\_{l} - t\_{i,\rm pos}\right),\tag{2}$$

where Δ**r**<sub>*v*</sub>(*t<sub>l</sub>* − *t*<sub>*i*,*pos*</sub>) and Δ**r**<sup>*n*</sup><sub>*ω*</sub>(*t<sub>l</sub>* − *t*<sub>*i*,*pos*</sub>) are the displacement vectors of the *n*-th transmitter caused by translation and rotation during *t<sub>l</sub>* − *t*<sub>*i*,*pos*</sub>, respectively. **r**<sup>*n*</sup>(*t*<sub>*i*,*pos*</sub>) is the position vector of the *n*-th transmitter at *t*<sub>*i*,*pos*</sub> and can be calculated by the following formula

$$\mathbf{r}^{n}\left(t\_{i,pos}\right) = \mathbf{r}^{s}\left(t\_{i,pos}\right) + \mathbf{C}\left(\theta\_{t\_{i,pos}}, \phi\_{t\_{i,pos}}, \varphi\_{t\_{i,pos}}\right)\mathbf{r}^{n}\_{b},\tag{3}$$

where **r**<sup>*n*</sup><sub>*b*</sub> is the position vector of the *n*-th transmitter in *O<sub>b</sub>X<sub>b</sub>Y<sub>b</sub>Z<sub>b</sub>*, **r**<sup>*s*</sup>(*t*<sub>*i*,*pos*</sub>) is the position vector of the receiver measured by the POS at *t*<sub>*i*,*pos*</sub> in *O<sub>t</sub>X<sub>t</sub>Y<sub>t</sub>Z<sub>t</sub>*, and **C**(*θ*, *ϕ*, *φ*), evaluated at the yaw, pitch, and roll angles measured at *t*<sub>*i*,*pos*</sub>, is the Direction Cosine Matrix (DCM) that transforms coordinates from *O<sub>b</sub>X<sub>b</sub>Y<sub>b</sub>Z<sub>b</sub>* to *O<sub>t</sub>X<sub>t</sub>Y<sub>t</sub>Z<sub>t</sub>*. Writing *θ*, *ϕ*, *φ* for the angles at *t*<sub>*i*,*pos*</sub>, the DCM can be expressed as

$$\begin{aligned} \mathbf{C}\left(\theta, \phi, \varphi\right) &= \begin{bmatrix} \cos\theta & \sin\theta & 0\\ -\sin\theta & \cos\theta & 0\\ 0 & 0 & 1 \end{bmatrix} \times \begin{bmatrix} 1 & 0 & 0\\ 0 & \cos\phi & -\sin\phi\\ 0 & \sin\phi & \cos\phi \end{bmatrix} \\ &\quad \times \begin{bmatrix} \cos\varphi & 0 & \sin\varphi\\ 0 & 1 & 0\\ -\sin\varphi & 0 & \cos\varphi \end{bmatrix}. \end{aligned} \tag{4}$$
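A minimal Python sketch of Equations (3) and (4) is given below. It assumes the factor order read from Equation (4): a yaw rotation about the vertical axis, followed by a pitch rotation about the *X* axis and a roll rotation about the *Y* axis, with all angles in radians. The function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def dcm(yaw, pitch, roll):
    """Direction cosine matrix of Equation (4); angles in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    yaw_mat = np.array([[cy, sy, 0.0], [-sy, cy, 0.0], [0.0, 0.0, 1.0]])
    pitch_mat = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    roll_mat = np.array([[cr, 0.0, sr], [0.0, 1.0, 0.0], [-sr, 0.0, cr]])
    return yaw_mat @ pitch_mat @ roll_mat

def transmitter_position(r_s, yaw, pitch, roll, r_b):
    """Equation (3): array-center position plus the rotated body-frame offset."""
    r_s = np.asarray(r_s, dtype=float)
    r_b = np.asarray(r_b, dtype=float)
    return r_s + dcm(yaw, pitch, roll) @ r_b

# Example: array center 350 m above the origin, transmitter 1 m along X_b
r_n = transmitter_position([0.0, 0.0, 350.0], 0.0, 0.0, 0.0, [1.0, 0.0, 0.0])
```

Because each factor is a pure rotation, the product is orthogonal with unit determinant, which gives a convenient sanity check on any implementation.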

**Figure 3.** Pulse and POS data timing diagram.

As the receiver is at the center of the antenna array, its position vector at *t<sub>l</sub>* is affected only by the translation of the antenna array during the period *t<sub>l</sub>* − *t*<sub>*i*,*pos*</sub> and can be expressed as

$$\mathbf{r}^s \left( t\_l \right) = \mathbf{r}^s \left( t\_{i,pos} \right) + \Delta \mathbf{r}\_v \left( t\_l - t\_{i,pos} \right). \tag{5}$$

Δ**r**<sub>*v*</sub>(*t<sub>l</sub>* − *t*<sub>*i*,*pos*</sub>) and Δ**r**<sup>*n*</sup><sub>*ω*</sub>(*t<sub>l</sub>* − *t*<sub>*i*,*pos*</sub>) can be calculated from the translational speed and the rotational angular velocity of the antenna array:

$$
\Delta \mathbf{r}\_v \left( t\_l - t\_{i,pos} \right) = \left[ \min \left\{ t\_l, l' T\_p \right\} - t\_{i,pos} \right] \mathbf{v}\_{l'} + \sum\_{k=l'+1}^{l} \left[ \min \left\{ t\_l, k T\_p \right\} - (k-1) T\_p \right] \mathbf{v}\_{k}, \tag{6}
$$

$$
\Delta \mathbf{r}\_{\omega}^{n} \left( t\_{l} - t\_{i,pos} \right) = \mathbf{C} \left( \Delta \theta\_{t\_{l}}, \Delta \phi\_{t\_{l}}, \Delta \varphi\_{t\_{l}} \right) \mathbf{r}\_{b}^{n} - \mathbf{r}\_{b}^{n}.\tag{7}
$$

The function min{*x*, *y*} returns the minimum of *x* and *y*. Δ*θ*<sub>*t<sub>l</sub>*</sub>, Δ*ϕ*<sub>*t<sub>l</sub>*</sub> and Δ*φ*<sub>*t<sub>l</sub>*</sub> are the changes of the attitude angles during *t<sub>l</sub>* − *t*<sub>*i*,*pos*</sub> and can be calculated by the following formula

$$\alpha\_{t\_l} = \left[ \min \left\{ t\_l, l' T\_p \right\} - t\_{i,pos} \right] \omega\_{l',\alpha} + \sum\_{k=l'+1}^{l} \left[ \min \left\{ t\_l, k T\_p \right\} - \left( k - 1 \right) T\_p \right] \omega\_{k,\alpha}, \tag{8}$$

where *α*<sub>*t<sub>l</sub>*</sub> ∈ {Δ*θ*<sub>*t<sub>l</sub>*</sub>, Δ*ϕ*<sub>*t<sub>l</sub>*</sub>, Δ*φ*<sub>*t<sub>l</sub>*</sub>} and *ω*<sub>*k*,*α*</sub> is the component of the angular velocity during the *k*-th pulse that corresponds to the same angle.
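Equations (6) and (8) share the same structure: a rate that is constant within each pulse is integrated from the POS recording time up to *t<sub>l</sub>*, with partial contributions from the first and last pulses. The Python sketch below implements that accumulation once for both cases. It assumes that the *k*-th pulse occupies [(*k* − 1)*T<sub>p</sub>*, *kT<sub>p</sub>*); the function name and the example numbers are hypothetical.

```python
import numpy as np

def accumulated_change(t_l, t_pos, rates, T_p):
    """Integrate a per-pulse constant rate from t_pos to t_l.

    rates[k - 1] is the rate (velocity or angular velocity) during the k-th
    pulse, i.e. during [(k - 1) T_p, k T_p). Shape (L,) or (L, 3).
    """
    rates = np.asarray(rates, dtype=float)
    k_first = int(np.floor(t_pos / T_p)) + 1              # pulse l' containing t_pos
    k_last = min(int(np.floor(t_l / T_p)) + 1, len(rates))  # pulse l containing t_l
    # Partial contribution of the pulse in which the POS record was taken
    total = (min(t_l, k_first * T_p) - t_pos) * rates[k_first - 1]
    # Full pulses, plus the (possibly partial) pulse that contains t_l
    for k in range(k_first + 1, k_last + 1):
        total = total + (min(t_l, k * T_p) - (k - 1) * T_p) * rates[k - 1]
    return total

# Example with T_p = 1 s for readability: speeds of 1, 2, 3 m/s along X
v = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
dr_v = accumulated_change(t_l=2.25, t_pos=0.5, rates=v, T_p=1.0)
```

Passing the per-pulse angular velocities instead of the speeds yields the angle changes of Equation (8), which can then be inserted into the DCM of Equation (7).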

#### *2.3. Influence of Platform Motion on Beam Coverage*

The aerostat platform instability not only causes the antenna motion, but also changes the beam coverage in the overall imaging region *S*. All echo data contains the information of all beam covered areas, therefore as the union of all beam coverages, the overall imaging region *S* is considered in imaging. The selection matrix of beam coverage is introduced to indicate the dynamically illuminated area of each pulse within the overall imaging region.

The beam coverage of the *l*-th pulse is denoted as *S<sub>l</sub>*, and (*x*<sup>*c*</sup><sub>*l*</sub>, *y*<sup>*c*</sup><sub>*l*</sub>) is the center of the beam coverage of the *l*-th pulse on the *O<sub>t</sub>X<sub>t</sub>Y<sub>t</sub>* plane:

$$x\_{l}^{c} = x\_{l}^{s} - \left(\tan\phi\_{l}\cos\varphi\_{l}/\cos\theta\_{l} - \sin\varphi\_{l}\tan\theta\_{l}\right)z\_{l}^{s}, \tag{9}$$

$$y\_l^c = y\_l^s + \left(\tan\phi\_l \sin\varphi\_l / \cos\theta\_l + \cos\varphi\_l \tan\theta\_l\right) z\_l^s,\tag{10}$$

where (*x*<sup>*s*</sup><sub>*l*</sub>, *y*<sup>*s*</sup><sub>*l*</sub>, *z*<sup>*s*</sup><sub>*l*</sub>) and (*θ<sub>l</sub>*, *ϕ<sub>l</sub>*, *φ<sub>l</sub>*) are the center position coordinates and the attitude angles of the antenna array at the start time of the *l*-th pulse.

The overall imaging region *S* is the union of all beam-covered areas during imaging, i.e., *S* = *S*<sub>1</sub> ∪ *S*<sub>2</sub> ∪ ... ∪ *S<sub>L</sub>*. The size of the beam-covered area of a single pulse is denoted as *w<sub>x</sub>* × *w<sub>y</sub>*, where *w<sub>x</sub>* and *w<sub>y</sub>* are the side lengths. The size of the overall imaging region is

$$W\_{x} \times W\_{y} = \left(x\_{\max}^{c} - x\_{\min}^{c} + w\_{x}\right) \times \left(y\_{\max}^{c} - y\_{\min}^{c} + w\_{y}\right), \tag{11}$$

where *W<sub>x</sub>* and *W<sub>y</sub>* are the side lengths of *S*, *x*<sup>*c*</sup><sub>max</sub> and *x*<sup>*c*</sup><sub>min</sub> are the maximum and minimum values of *x*<sup>*c*</sup><sub>*l*</sub>, *l* = 1, 2, ··· , *L*, and *y*<sup>*c*</sup><sub>max</sub> and *y*<sup>*c*</sup><sub>min</sub> are the maximum and minimum values of *y*<sup>*c*</sup><sub>*l*</sub>, *l* = 1, 2, ··· , *L*.

The overall imaging region is discretized into *M* = *P* × *Q* grids, where *P* is the number of rows of azimuth resolution cells and *Q* is the number of columns of range resolution cells. In *O<sub>t</sub>X<sub>t</sub>Y<sub>t</sub>Z<sub>t</sub>*, the position vector of the *m*-th grid is denoted as **r**<sub>*m*</sub>, *m* = 1, 2, ··· , *M*.

The selection matrix of beam coverage is

$$\mathbf{D} = \begin{bmatrix} D\_1\left(1\right) & D\_1\left(2\right) & \cdots & D\_1\left(M\right) \\ D\_2\left(1\right) & D\_2\left(2\right) & \cdots & D\_2\left(M\right) \\ \vdots & \vdots & \ddots & \vdots \\ D\_L\left(1\right) & D\_L\left(2\right) & \cdots & D\_L\left(M\right) \end{bmatrix}. \tag{12}$$

The element *Dl* (*m*) indicates whether the *m*-th grid is illuminated by the *l*-th pulse beam:

$$D\_{l}\left(m\right) = \begin{cases} 1 & \text{if } \left(\mathbf{r}\_{m} \in S\_{l}\right) \\ 0 & \text{if } \left(\mathbf{r}\_{m} \notin S\_{l}\right) \end{cases} \tag{13}$$
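The construction of the selection matrix in Equations (12) and (13), together with the region size of Equation (11), can be sketched as follows. The sketch assumes that each single-pulse footprint *S<sub>l</sub>* is an axis-aligned *w<sub>x</sub>* × *w<sub>y</sub>* rectangle centered on the beam coverage center; the actual footprint shape depends on the antenna pattern and geometry. Function names and numbers are hypothetical.

```python
import numpy as np

def selection_matrix(grid_xy, centers_xy, w_x, w_y):
    """Build D of Equations (12)-(13) as an (L, M) array of 0/1 entries.

    grid_xy    : (M, 2) grid-cell centers on the ground plane
    centers_xy : (L, 2) beam coverage centers, one per pulse
    w_x, w_y   : side lengths of the single-pulse footprint
    """
    grid_xy = np.asarray(grid_xy, dtype=float)
    centers_xy = np.asarray(centers_xy, dtype=float)
    dx = np.abs(grid_xy[None, :, 0] - centers_xy[:, None, 0])
    dy = np.abs(grid_xy[None, :, 1] - centers_xy[:, None, 1])
    # A cell is illuminated when it lies inside the assumed rectangular footprint
    return ((dx <= w_x / 2) & (dy <= w_y / 2)).astype(np.uint8)

def overall_region_size(centers_xy, w_x, w_y):
    """Side lengths (W_x, W_y) of the overall imaging region, Equation (11)."""
    c = np.asarray(centers_xy, dtype=float)
    return (c[:, 0].max() - c[:, 0].min() + w_x,
            c[:, 1].max() - c[:, 1].min() + w_y)

# Two pulses whose footprints are shifted by 4 m along X
grid = [[0.0, 0.0], [3.0, 0.0], [10.0, 10.0]]
centers = [[0.0, 0.0], [4.0, 0.0]]
D = selection_matrix(grid, centers, w_x=4.0, w_y=4.0)
W_x, W_y = overall_region_size(centers, w_x=4.0, w_y=4.0)
```

Row *l* of `D` then masks the grid cells that contribute to the radiation field of the *l*-th pulse.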

#### *2.4. Imaging Equation*

The whole imaging region *S* has been divided into *M* = *P* × *Q* discrete grids, and the scattering coefficient of the *m*-th grid is *σ*(**r**<sub>*m*</sub>). At the beginning of the *l*-th pulse, all transmitters simultaneously transmit independent stochastic signals, which superimpose in *S* to generate the TSSRF. The radiation field at **r**<sub>*m*</sub> can be expressed as

$$E^{inc}(t\_{l}, \mathbf{r}\_m) = \sum\_{n=1}^{N} D\_l\left(m\right) \frac{F\_{n}\left(\hat{\mathbf{R}}\_{n}\right) s\_n\left(t\_l - \left|\mathbf{r}\_m - \mathbf{r}^n\left(t\_{l,0}\right)\right|/c\right)}{4\pi\left|\mathbf{r}\_m - \mathbf{r}^n\left(t\_{l,0}\right)\right|}, \tag{14}$$

where **R̂**<sub>*n*</sub> = [**r**<sub>*m*</sub> − **r**<sup>*n*</sup>(*t*<sub>*l*,0</sub>)] / |**r**<sub>*m*</sub> − **r**<sup>*n*</sup>(*t*<sub>*l*,0</sub>)|, *F<sub>n</sub>*(**R̂**<sub>*n*</sub>) denotes the radiation pattern of the *n*-th transmitter antenna, *c* is the propagation speed of the electromagnetic wave, and *t*<sub>*l*,0</sub> = (*l* − 1)*T<sub>p</sub>* denotes the initial time of the *l*-th pulse.

The radiation field interacts with the targets and the received echo can be expressed as

$$E^{sca}(t\_l) = \sum\_{m=1}^{M} \sigma(\mathbf{r}\_m) \frac{E^{inc}\left(t\_l - \left|\mathbf{r}^s\left(t\_l\right) - \mathbf{r}\_m\right|/c, \mathbf{r}\_m\right)}{4\pi\left|\mathbf{r}^s\left(t\_l\right) - \mathbf{r}\_m\right|} F\_s\left(\hat{\mathbf{R}}\_s\right) + n\left(t\_l\right), \tag{15}$$

where **R̂**<sub>*s*</sub> = [**r**<sub>*m*</sub> − **r**<sup>*s*</sup>(*t<sub>l</sub>*)] / |**r**<sub>*m*</sub> − **r**<sup>*s*</sup>(*t<sub>l</sub>*)|, *F<sub>s</sub>*(**R̂**<sub>*s*</sub>) denotes the radiation pattern of the receiver antenna, and *n*(*t<sub>l</sub>*) denotes the additive noise.

Considering the round-trip propagation of the electromagnetic field in free space, the modified radiation field is defined as

$$E^{rad}(t\_l, \mathbf{r}\_m) = \sum\_{n=1}^{N} \left\{ \frac{F\_s\left(\hat{\mathbf{R}}\_s\right) F\_n\left(\hat{\mathbf{R}}\_n\right) s\_n\left[t\_l - \left(\left|\mathbf{r}\_m - \mathbf{r}^n\left(t\_{l,0}\right)\right| + \left|\mathbf{r}^s\left(t\_l\right) - \mathbf{r}\_m\right|\right)/c\right]}{\left(4\pi\right)^2 \left|\mathbf{r}\_m - \mathbf{r}^n\left(t\_{l,0}\right)\right| \left|\mathbf{r}^s\left(t\_l\right) - \mathbf{r}\_m\right|} D\_l\left(m\right) \right\}.\tag{16}$$

Let *t*<sub>*l*,*k*</sub>, *l* = 1, 2, ··· , *L* be the sampling time in the *l*-th pulse; the imaging equation can then be written in matrix-vector form as

$$\mathbf{E}^{sca} = \mathbf{E}^{rad} \cdot \boldsymbol{\sigma} + \mathbf{n}, \tag{17}$$

where **E**<sup>*sca*</sup> = [*E<sup>sca</sup>*(*t*<sub>1,*k*</sub>), *E<sup>sca</sup>*(*t*<sub>2,*k*</sub>), ··· , *E<sup>sca</sup>*(*t*<sub>*L*,*k*</sub>)]<sup>*T*</sup> is the echo vector, ***σ*** = [*σ*(**r**<sub>1</sub>), *σ*(**r**<sub>2</sub>), ··· , *σ*(**r**<sub>*M*</sub>)]<sup>*T*</sup> is the scattering coefficient vector, **n** = [*n*(*t*<sub>1,*k*</sub>), *n*(*t*<sub>2,*k*</sub>), ··· , *n*(*t*<sub>*L*,*k*</sub>)]<sup>*T*</sup> is the noise vector, and **E**<sup>*rad*</sup> is the modified radiation field matrix with entries [**E**<sup>*rad*</sup>]<sub>*lm*</sub> = *E<sup>rad</sup>*(*t*<sub>*l*,*k*</sub>, **r**<sub>*m*</sub>).

The scattering coefficient vector *σ* can be reconstructed by the correlated processing between **E***sca* and **E***rad*, which can be described as

$$
\hat{\boldsymbol{\sigma}} = \zeta \left[ \mathbf{E}^{rad}, \mathbf{E}^{sca} \right],
\tag{18}
$$

where *ζ* denotes the correlation operator and ***σ̂*** is the reconstructed scattering coefficient vector.

Common correlated imaging algorithms include the pseudo-inverse algorithm, Tikhonov regularization, TV regularization, and sparse reconstruction algorithms such as Orthogonal Matching Pursuit and sparse Bayesian learning. This paper adopts the Tikhonov regularization algorithm because it is robust to noise and does not require prior information about the target. Tikhonov regularization can be formulated as the following optimization problem

$$\hat{\boldsymbol{\sigma}} = \underset{\boldsymbol{\sigma}}{\arg\min} \left\{ \left\| \mathbf{E}^{sca} - \mathbf{E}^{rad} \cdot \boldsymbol{\sigma} \right\|\_{2}^{2} + \lambda \left\| \boldsymbol{\sigma} \right\|\_{2}^{2} \right\},\tag{19}$$

where *λ* is the regularization parameter.
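Assuming the standard squared-residual form of Tikhonov regularization, the minimizer has the closed form (**E**<sup>*rad*,H</sup>**E**<sup>*rad*</sup> + *λ***I**)<sup>−1</sup>**E**<sup>*rad*,H</sup>**E**<sup>*sca*</sup>, where H denotes the conjugate transpose. The Python sketch below solves this linear system directly. The random matrix stands in for the modified radiation field matrix and is purely illustrative; the function name is hypothetical.

```python
import numpy as np

def tikhonov_reconstruct(E_rad, E_sca, lam):
    """Closed-form minimizer of ||E_sca - E_rad @ sigma||_2^2 + lam * ||sigma||_2^2."""
    E_rad = np.asarray(E_rad)
    E_sca = np.asarray(E_sca)
    M = E_rad.shape[1]
    A = E_rad.conj().T @ E_rad + lam * np.eye(M)   # regularized normal matrix
    b = E_rad.conj().T @ E_sca
    return np.linalg.solve(A, b)

# Illustrative stand-in data (assumption): 60 echo samples, 20 grid cells
rng = np.random.default_rng(0)
E_rad = rng.standard_normal((60, 20)) + 1j * rng.standard_normal((60, 20))
sigma_true = rng.standard_normal(20)
E_sca = E_rad @ sigma_true                          # noise-free echo
sigma_hat = tikhonov_reconstruct(E_rad, E_sca, lam=1e-9)
sigma_damped = tikhonov_reconstruct(E_rad, E_sca, lam=1e3)
```

A larger *λ* shrinks the estimate toward zero and trades resolution for noise robustness, so in practice it has to be tuned to the noise level.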

#### **3. Estimation of Translational Speed and Rotational Angular Velocity**

The POS integrates inertial navigation and satellite navigation technology in a single unit, and uses real-time and post-processing information fusion to obtain high-precision position and orientation information. For MSCI, the imaging time is very short, so the accumulated error of the inertial navigation system (INS) is negligible and the INS is highly accurate over this short interval. Hence the measured position and angle data used for estimating the translational speed and rotational angular velocity are very accurate.

The calculation of the real-time position vector of each antenna at the high-speed sampling times requires the translational speed and the rotational angular velocity of the antenna array. The low-speed discrete POS data are used to estimate both. Since the data rate of the POS is usually lower than the pulse repetition frequency, as Figure 3 shows, there are many pulses between two adjacent POS records. To obtain the translational speed and the rotational angular velocity of the antenna array during each pulse, a third-order polynomial is fitted to the position and attitude data, and the least squares method is used to obtain the coefficients of the fitting polynomial. The fitting polynomial of the position or the attitude can be expressed as

$$\mu\left(t\right) = \sum\_{k=0}^{3} a\_{\mu,k} t^k, \tag{20}$$

where *a*<sub>*μ*,*k*</sub> is the *k*-th coefficient of the polynomial and *μ*(*t*) is the fitted curve.

The initial time of the *l*-th pulse is denoted as *t*<sub>*l*,0</sub>, and the end time of the *l*-th pulse is denoted as *t*<sub>*l*+1,0</sub> = *t*<sub>*l*,0</sub> + *T<sub>p</sub>*. By substituting *t*<sub>*l*,0</sub> and *t*<sub>*l*+1,0</sub> into the fitting curve *μ*(*t*), we obtain the position or attitude parameters at the beginning and the end of each pulse. Assuming that the antenna array translates and rotates uniformly during each pulse, the translational speed and rotational angular velocity in each pulse can be obtained from

$$
\omega\_{l} = \begin{bmatrix}
\frac{\theta \left(t\_{l+1,0}\right) - \theta \left(t\_{l,0}\right)}{T} \\
\frac{\theta \left(t\_{l+1,0}\right) - \theta \left(t\_{l,0}\right)}{T} \\
\frac{\phi \left(t\_{l+1,0}\right) - \phi \left(t\_{l,0}\right)}{T}
\end{bmatrix},
\tag{21}
$$

$$
\mathbf{v}\_{l} = \frac{\mathbf{r}^{\mathbf{c}} \left(t\_{l+1,0}\right) - \mathbf{r}^{\mathbf{c}} \left(t\_{l,0}\right)}{T}.
\tag{22}
$$
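The fitting and differencing steps can be sketched as follows for a single position or attitude component. This is a minimal illustration, assuming that the denominator *T* in Equations (21) and (22) is the pulse period *T<sub>p</sub>*; the function name, the POS and pulse rates, and the cubic test trajectory are hypothetical.

```python
import numpy as np

def per_pulse_rate(t_pos, mu_pos, pulse_starts, T_p, order=3):
    """Least-squares polynomial fit of mu(t), then the mean rate of change over each pulse."""
    coeffs = np.polyfit(t_pos, mu_pos, order)          # coefficients, highest power first
    mu_start = np.polyval(coeffs, pulse_starts)        # mu(t_{l,0})
    mu_end = np.polyval(coeffs, pulse_starts + T_p)    # mu(t_{l+1,0})
    return (mu_end - mu_start) / T_p

# Hypothetical rates: POS samples every 10 ms, pulses every 0.1 ms.
t_pos = np.arange(0.0, 0.1, 0.01)
x_pos = 0.5 * t_pos**3 - 0.2 * t_pos**2 + 1.5 * t_pos + 3.0   # test trajectory (m)
T_p = 1e-4
pulse_starts = np.arange(0.0, 0.09, T_p)
v_x = per_pulse_rate(t_pos, x_pos, pulse_starts, T_p)          # speed in each pulse (m/s)
```

The same function can be applied to each of the three position coordinates and each of the three attitude angles in turn.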

#### **4. Simulation**

In this section, simulations are presented to verify the proposed method based on an unsteady aerostat platform. The simulation scenario is shown in Figure 1. An X-band MSCI radar system with a carrier frequency of 10 GHz is considered. The randomly radiating radar array, with 25 transmitters and 1 receiver, is raised to a height of 350 m by a tethered aerostat. The main simulation parameters are given in Table 1, and the target model is shown in Figure 4. In the simulations, the measurement errors of the position and attitude angles are assumed to be independent and Gaussian distributed with zero mean, with a standard deviation of 1 mm for position and 0.05° for the attitude angles.

**Table 1.** Simulation parameter.


**Figure 4.** Target image.

To illustrate the effectiveness of the proposed method, the trajectories in Figure 5 are used as the three-dimensional translations and the rotational components of the antenna array caused by unsteady platform.

**Figure 5.** The motion trajectory of each component. (**a**) the translational component along the *Xt*; (**b**) the translational component along the *Yt*; (**c**) the translational component along the *Zt*; (**d**) the rotation of yaw; (**e**) the rotation of pitch; (**f**) the rotation of roll.

#### *4.1. Verification of the Proposed Model*

In this subsection, simulations are carried out to compare the imaging performance of different imaging models. The proposed imaging model based on an unsteady aerostat platform (UPIM) is compared with the imaging model for a stationary platform (SPIM) and the imaging model that only uses the discrete POS data (DPDIM). SPIM ignores the motion of the antenna array and assumes that the position vector of each antenna and the beam coverage do not change during the imaging process. When calculating the radiation field, SPIM uses the first recorded POS data as the position and the attitude angles of the array, i.e., **r**<sup>*n*</sup>(*t<sub>l</sub>*) = **r**<sup>*n*</sup>(*t*<sub>1,*pos*</sub>) (*l* = 1, 2, ···, *L*). DPDIM does not fit the discrete POS data and uses the closest POS data for each pulse. The normalized mean square error (NMSE) is used to quantify the reconstruction performance and is defined as *NMSE* = ‖**x̂** − **x**‖<sup>2</sup>/‖**x**‖<sup>2</sup>, where **x̂** and **x** denote the reconstructed and true values of the target, respectively.
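The NMSE metric can be computed with a few lines of code. The sketch below assumes the Euclidean norm and flattens two-dimensional images before comparison; the function name and the sample vectors are hypothetical.

```python
import numpy as np

def nmse(x_hat, x):
    """Normalized squared error ||x_hat - x||^2 / ||x||^2 (Euclidean norm assumed)."""
    x_hat = np.ravel(x_hat)
    x = np.ravel(x)
    return np.linalg.norm(x_hat - x) ** 2 / np.linalg.norm(x) ** 2

x_true = np.array([1.0, 0.0, 2.0, 0.0])
x_rec = np.array([0.9, 0.1, 2.0, 0.0])
err = nmse(x_rec, x_true)
err_empty = nmse(np.zeros_like(x_true), x_true)   # all-zero estimate
```

With this definition an all-zero estimate scores exactly 1, so an NMSE above 1 indicates a reconstruction that is further from the true target than an empty image.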

The imaging results are depicted in Figure 6. As shown in Figure 6a, the image reconstructed by UPIM is well focused with only a few spurious scatterers; this better imaging performance results from UPIM having the smallest motion estimation error. In Figure 6b, apart from the strong scatterers, the image reconstructed by DPDIM contains many spurious scatterers. In Figure 6c, the image reconstructed by SPIM is defocused and blurry, and the target is hard to recognize.

**Figure 6.** The imaging results of UPIM, DPDIM, and SPIM. (**a**) the reconstructed image by UPIM, the NMSE is 0.28; (**b**) the reconstructed image by DPDIM, the NMSE is 0.94; (**c**) the reconstructed image by SPIM, the NMSE is 1.21.

The point spread functions (PSF) of UPIM, DPDIM, and SPIM are illustrated in Figure 7, and the *X*-axis and *Y*-axis profiles of the PSF are shown in Figure 8. Figure 7 shows that UPIM has the fewest and lowest side lobes, while the other two methods both have more side lobes at higher levels. Figure 8 shows that the three methods have almost the same main-lobe width in both the *X*-axis and *Y*-axis profiles. These simulation results demonstrate that UPIM reduces the number and the level of side lobes caused by platform instability, but does not improve the imaging resolution of MSCI.

**Figure 7.** The point spread function of UPIM, DPDIM, and SPIM. (**a**) UPIM; (**b**) DPDIM; (**c**) SPIM.

**Figure 8.** The profile of the point spread function of UPIM, DPDIM, and SPIM. (**a**) the *X*-axis profile of the point spread function; (**b**) the *Y*-axis profile of the point spread function.

For the three imaging models mentioned above, Figure 9 shows the fit to the translational motion trajectory of the aerostat platform along the *Xt*. The proposed UPIM model gives the best fit: its estimated translational trajectory is almost identical to the real one and has the smallest motion estimation error, which accounts for the better imaging performance of UPIM.

**Figure 9.** The estimated and the real trajectory of the translation along the *Xt*. (**a**) UPIM; (**b**) DPDIM; (**c**) SPIM.

To verify the effectiveness of the proposed method as the translation amplitude increases, the relationship between the imaging quality and the translation amplitude for the three imaging models is presented in Figure 10. The amplitude of the three-dimensional translations is gradually increased in steps of 0.5 times the original amplitude shown in Figure 5, while the rotation amplitudes are kept constant. The horizontal axis in Figure 10 represents the multiple of the original translation amplitude. As seen from Figure 10, the imaging performance of UPIM remains better than that of the other imaging models as the translation amplitude increases.

**Figure 10.** NMSE of the imaging results by UPIM, DPDIM, and SPIM at different amplitudes of translation.

The imaging quality under different rotation amplitudes is depicted in Figure 11. The amplitude of all rotational components is gradually increased in steps of 0.5 times the original rotational amplitude, while the translation amplitudes are kept constant. Figure 11 shows that the proposed method performs better than the other two models at all rotation amplitudes.

**Figure 11.** NMSE of the imaging results by UPIM, DPDIM, and SPIM at different amplitudes of rotation.

#### *4.2. Effect of Different Translational Components on Imaging Performance*

This section studies the effect of each independent translational component on imaging performance. In the simulations, every translational component uses the same motion trajectory, shown in Figure 5a. Figure 12 shows the imaging results reconstructed by UPIM when only one translational component exists.

**Figure 12.** The imaging results of UPIM when only one translation component exists. (**a**) Only translational component along the *Xt* exists, the NMSE is 0.21; (**b**) Only translational component along the *Yt* exists, the NMSE is 0.58; (**c**) Only translational component along the *Zt* exists, the NMSE is 0.59.

The imaging quality for each translational component under different translation amplitudes is presented in Figure 13. As shown in Figures 12 and 13, the translation along the *Xt* has the least influence on imaging performance, while the translations along the *Yt* and the *Zt* have almost the same influence. Therefore, to improve the imaging performance, the position of the antenna array along the *Yt* and the *Zt* should be estimated more accurately.

**Figure 13.** NMSE of the imaging results when only one translational component exists at different amplitudes.

Next, we investigate why the imaging results differ among the three translational directions. Although the proposed method compensates for part of the non-cooperative motion, the translation estimation errors cannot be totally eliminated because the number of POS data is limited. Figure 14 shows the estimation error in the three translational directions. The accurate calculation of the radiation field depends directly on the round-trip propagation delay of the electromagnetic wave between the antenna and the target. The propagation delay error caused by the estimation error of the array position leads to an error in the calculated radiation field. Because the translational estimation errors in the three dimensions affect the propagation delay differently, the influence of the different translational components on imaging is not the same.

**Figure 14.** The estimation error of antenna position in three-dimensional translations. (**a**) Only along the *Xt*; (**b**) Only along the *Yt*; (**c**) Only along the *Zt*.

Assume that at time *t* the coordinate of the *i*-th antenna is (*x<sub>a</sub>*, *y<sub>a</sub>*, *z<sub>a</sub>*) and the coordinate of any point *m* in the imaging region is (*x*, *y*, *z*). After Δ*t*, the coordinate of the antenna becomes (*x<sub>a</sub>* + Δ*x<sub>a</sub>*, *y<sub>a</sub>* + Δ*y<sub>a</sub>*, *z<sub>a</sub>* + Δ*z<sub>a</sub>*).

The distance from the *i*-th antenna to the point *m* in imaging region is

$$S\_{im} = \sqrt{(x - x\_a)^2 + (y - y\_a)^2 + (z - z\_a)^2}. \tag{23}$$

The partial derivatives of the propagation path with respect to the three antenna coordinates are

$$\frac{\partial S\_{im}}{\partial x\_a} = \frac{x\_a - x}{\sqrt{\left(x - x\_a\right)^2 + \left(y - y\_a\right)^2 + \left(z - z\_a\right)^2}},\tag{24}$$

$$\frac{\partial S\_{im}}{\partial y\_a} = \frac{y\_a - y}{\sqrt{\left(x - x\_a\right)^2 + \left(y - y\_a\right)^2 + \left(z - z\_a\right)^2}},\tag{25}$$

$$\frac{\partial S\_{im}}{\partial z\_a} = \frac{z\_a - z}{\sqrt{(x - x\_a)^2 + (y - y\_a)^2 + (z - z\_a)^2}}.\tag{26}$$

In the simulation scenario, the height of the antenna array is 350 m and the antenna observes the scene obliquely with a slant angle of 45°. The coordinates of any point in the imaging area satisfy 291.5 ≤ *y* ≤ 408.5, −58.5 ≤ *x* ≤ 58.5 and *z* = 0. Because the antenna array is much smaller than the imaging region, most points in the imaging region satisfy |*x*| ≫ |*x<sub>a</sub>*| and |*y*| ≫ |*y<sub>a</sub>*|, and since |*y*| > |*x*| over the whole region, the partial derivatives satisfy |*∂S<sub>im</sub>*/*∂y<sub>a</sub>*| > |*∂S<sub>im</sub>*/*∂x<sub>a</sub>*|. As *z<sub>a</sub>* − *z* ≈ 350, the partial derivatives also satisfy |*∂S<sub>im</sub>*/*∂z<sub>a</sub>*| > |*∂S<sub>im</sub>*/*∂x<sub>a</sub>*|. Therefore, the same estimation error along the *Xt* axis causes a smaller propagation delay error than the same error along the other two axes, which explains why, under the same translational trajectory, the image reconstructed with translation only along the *Xt* has the best quality.
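The inequality between the partial derivatives can be checked numerically over the stated imaging region. The sketch below evaluates Equations (24)–(26) on a grid of scene points, assuming for illustration that the antenna sits at (0, 0, 350) m, i.e., that the array extent is neglected; the function name and grid density are hypothetical.

```python
import numpy as np

def path_sensitivity(antenna, points):
    """Partial derivatives of the antenna-to-point distance w.r.t. the antenna coordinates."""
    diff = np.asarray(antenna, dtype=float) - np.asarray(points, dtype=float)
    return diff / np.linalg.norm(diff, axis=-1, keepdims=True)

antenna = np.array([0.0, 0.0, 350.0])             # assumed antenna position (m)
x = np.linspace(-58.5, 58.5, 41)
y = np.linspace(291.5, 408.5, 41)
X, Y = np.meshgrid(x, y)
points = np.stack([X, Y, np.zeros_like(X)], axis=-1)   # scene points on z = 0

grad = np.abs(path_sensitivity(antenna, points))  # shape (41, 41, 3)
worst_x = grad[..., 0].max()                      # largest |dS/dx_a| in the scene
least_y = grad[..., 1].min()                      # smallest |dS/dy_a| in the scene
least_z = grad[..., 2].min()                      # smallest |dS/dz_a| in the scene
print(worst_x, least_y, least_z)
```

Under these assumptions, even the largest sensitivity to an *x<sub>a</sub>* error stays below the smallest sensitivity to a *y<sub>a</sub>* or *z<sub>a</sub>* error anywhere in the scene.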

#### *4.3. Effect of Different Rotation Components on Imaging Performance*

This section studies the effect of each independent rotational component on imaging performance. In the simulation, the three rotational components use the same rotational trajectory, shown in Figure 5d. Figure 15 shows the imaging results reconstructed by UPIM when only one rotational component exists. Figure 16 presents the imaging quality for each rotational component as its amplitude is gradually increased in steps of 0.5 times the original rotational amplitude shown in Figure 5.

**Figure 15.** The imaging results of UPIM when only one rotation component exists. (**a**) Only yaw angle, the NMSE is 0.27; (**b**) Only pitch angle, the NMSE is 0.28; (**c**) Only roll angle, the NMSE is 0.28.

**Figure 16.** NMSE of the imaging results when only one rotation component exists at different amplitudes.

From Figures 15 and 16, it can be seen that three rotational components have almost the same effect on imaging performance under different rotation amplitudes.

#### *4.4. Effect of the Position and Angular-Measuring Accuracy on Imaging Performance*

This section studies the effect of the measuring accuracy of the position and attitude parameters on imaging. The imaging performance under different position and angular accuracies is simulated. In the simulations, the measurement errors are assumed to be independent and Gaussian distributed with zero mean and different variances; the smaller the variance, the higher the measuring accuracy. Figure 17 shows the imaging quality under different position-measuring and angular-measuring accuracies. The results show that the imaging performance is very sensitive to the measuring accuracy of the position and attitude parameters, which means that the proposed method places high demands on the accuracy of these measurements.

**Figure 17.** The NMSE of imaging result under different measuring accuracy. (**a**) position-measuring accuracy; (**b**) angular-measuring accuracy.

#### **5. Conclusions**

In this paper, a novel MSCI method based on an unsteady aerostat platform is proposed, in which the MSCI radar array is carried by the AVS to keep its antenna beam pointed at the target despite the non-cooperative motion of the platform caused by wind and other disturbances, and the POS is used to dynamically measure the position and attitude of the antenna array. By decomposing the platform motion into translation and rotation, a motion model of the unsteady aerostat platform is built, and the real-time position vector of each antenna is calculated from its translational speed and rotational angular velocity in each pulse, replacing the static position vector of the traditional MSCI model. To handle the dynamic beam coverage over the whole observation region, a beam-coverage selection matrix is introduced to indicate the illuminated area at each pulse. By analyzing the modified stochastic radiation field and its scattered echo, the MSCI model based on the unsteady aerostat platform is established. Furthermore, based on the low-speed POS data, a polynomial curve-fitting algorithm is used to reduce the position error of the radar array. Simulation experiments demonstrate that, under different random translations and rotations of the unsteady aerostat platform, the position and attitude of the antenna array at different times can be estimated well and better imaging performance can be achieved by the proposed scheme, which provides a feasible technical approach for floating observation platforms to realize microwave staring remote sensing observation in near space.

**Author Contributions:** All authors contributed extensively to the work presented in this paper. Y.G. proposed the original idea. Z.J. and J.D. designed the study, performed the simulations and wrote the paper; Y.G., W.C. supervised the analysis, edited the manuscript and D.W. provided their valuable suggestions to improve this study.

**Funding:** This work has been supported by the National Natural Science Foundation of China under contract No. 61771446 and No. 61431016.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Strip-Mode Microwave Staring Correlated Imaging with Self-Calibration of Gain–Phase Errors**

#### **Rui Xia, Yuanyue Guo \*, Weidong Chen and Dongjin Wang**

Key Laboratory of Electromagnetic Space Information, Chinese Academy of Sciences, University of Science and Technology of China, Hefei 230026, China; xrke928@mail.ustc.edu.cn (R.X.); wdchen@ustc.edu.cn (W.C.); wangdj@ustc.edu.cn (D.W.)

**\*** Correspondence: yuanyueg@ustc.edu.cn

Received: 8 January 2019; Accepted: 25 February 2019; Published: 3 March 2019

**Abstract:** Microwave staring correlated imaging (MSCI) can realize super resolution imaging without the limit of relative motion with the target. However, gain–phase errors generally exist in the multi-transmitter array, which results in imaging model mismatch and degrades the imaging performance considerably. In order to solve the problem of MSCI with gain–phase error in a large scene, a method of MSCI with strip-mode self-calibration of gain–phase errors is proposed. The method divides the whole imaging scene into multiple imaging strips, then the strip target scattering coefficient and the gain–phase errors are combined into a multi-parameter optimization problem that can be solved by alternate iteration, and the error estimation results of the previous strip can be carried into the next strip as the initial value. All strips are processed in multiple rounds, and the gain–phase error estimation results of the last strip can be taken as the initial value and substituted into the first strip for the correlated processing of the next round. Finally, the whole imaging in a large scene can be achieved by multi-strip image splicing. Numerical simulations validate its potential advantages to shorten the imaging time dramatically and improve the imaging and gain–phase error estimation performance.

**Keywords:** microwave staring correlated imaging (MSCI); gain–phase errors; strip; self-calibration

#### **1. Introduction**

Radar imaging technology [1,2] has enabled radars to have the ability to obtain a panoramic image of an observation scene, which has been widely used in military warning, disaster detection and other fields. In these application scenarios, long-term continuous monitoring of large areas is an important application requirement.

Synthetic aperture radar (SAR) achieves high azimuth resolution by forming a large virtual synthetic aperture through relative motion between the target and the radar, but its long revisit period means that it cannot be applied to staring imaging [3,4].

Traditional real-aperture microwave staring imaging offers high real-time performance, but its azimuth resolution is limited by the actual aperture of the antenna, so high-resolution imaging is difficult to achieve. As a new staring imaging method, microwave staring correlated imaging (MSCI) [5–7] can realize super-resolution imaging without relying on relative motion with the target. The essence of MSCI is to construct a temporal–spatial stochastic radiation field in the imaging region, which is typically realized by a multi-transmitter array transmitting independent stochastic waveforms [5,6], such as signals whose amplitude and frequency vary randomly from pulse to pulse. The radiation field interacts with the target so that target scattering points at different locations scatter independent time-varying echoes. Finally, the target information can be obtained by the correlated imaging process between the echoes and the preset radiation field. In [7], two point targets in a small scene are imaged in outfield experiments based on MSCI. Accurate imaging relies on an accurate preset radiation field. However, gain–phase errors generally exist in the multi-transmitter array, so the actual radiation field deviates from the preset radiation field calculated from the transmitted waveforms, which results in imaging model mismatch and degrades the imaging performance considerably. In [8,9], methods are proposed for model mismatch in radar coincidence imaging (RCI), but the gain–phase error model was not analysed.

Studies on the calibration of gain–phase errors mainly focus on radar systems with multiple transmitters or multiple receivers, including the fields of angle estimation of array signals [10,11] and radar imaging [12–17]. In [10], an eigenstructure-based method is proposed for simultaneously estimating the direction of arrival (DOA) and the unknown (or imprecisely known) gain and phase parameters, which applies to arrays with arbitrary sensor geometries. The method is based on the eigendecomposition of the sample covariance matrix of the vector of received signals. In [11], a method based on estimation of signal parameters via rotational invariance techniques (ESPRIT) is proposed to estimate the gain–phase errors of both the transmission and reception arrays and the signal angles in bistatic MIMO radars, in which both transmitter and receiver are equipped with uniform linear arrays, and the first two sensors of the transmit array and the receive array are well calibrated to obtain a reference channel. In the field of angle estimation of array signals, gain–phase error calibration generally aims to ensure the consistency of the gain–phase characteristics of each channel; in contrast, an MSCI system does not require uniform gain–phase characteristics across the channels of the multi-transmitter array.

In the SAR imaging field, a subspace algorithm of calibrating channel gain–phase errors for high-resolution and wide-swath (HRWS) SAR imaging is presented [12]. The proposed method is based on the fact that the signal subspace obtained from the eigendecomposition of covariance matrix equals the space spanned by the practical steering vectors. Channel gain–phase errors can be obtained through eigendecomposition of a special matrix which is the calculation result of the nominal steering vectors and the signal eigenvectors of the covariance matrix.

All the above methods for gain–phase error calibration exploit the characteristics of the eigen-subspace and estimate the gain and phase errors by matrix eigendecomposition. The basic feature of these methods is that the signals of the multiple transmitting–receiving channels are separated during processing. In MSCI, however, the received echoes are not separated by channel, so the channel gain–phase error calibration methods based on subspace decomposition cannot be directly adopted.

Without subspace decomposition, a method is proposed for joint SAR imaging and phase error correction in [13]. The problem is set up as an optimization problem in a non-quadratic regularization-based framework, and phase error correction is performed during the image formation process. The method involves an iterative algorithm, where each iteration includes consecutive steps of image formation and model error correction. A method for RCI with phase errors is proposed in [14], which adopts the sparse Bayesian learning (SBL) framework and jointly estimates the target scattering coefficients and the phase errors during the iterative steps. Soon after, in [15], a method is proposed for sparse auto-calibration for RCI with gain–phase errors (SACRCI), which transforms imaging into a parameter estimation problem and then estimates the target scattering coefficients and the gain–phase errors jointly. In [16], an auto-calibration expansion–compression variance-component (AC-ExCoV)-based auto-focusing method in a sparse Bayesian learning framework is proposed. These methods all take the gain–phase errors as unknown parameters and adopt an iterative procedure to jointly estimate the target scattering coefficients and the gain–phase errors. The targets in [15,16] are all sparse in small scenes. In addition, the calibration of gain–phase and synchronization errors for MSCI is addressed in [17], but a reference receiver is required to receive the direct signals of the transmitters to estimate the errors.

In MSCI, a large imaging scene means a large number of grid cells and therefore very high computational complexity, which limits the application of the above methods to large scenes. In [18,19], the problem of MSCI in a large scene is solved by dividing the large scene into strips. In [18], the echoes of the discrete clustered targets are detected to locate the strips with targets, and only the regions of interest are discretized to a fine grid.

In order to solve the problem of MSCI with gain–phase error in a large scene, a method of MSCI based on strip-mode self-calibration of gain–phase errors is proposed in this paper. By dividing the target scene into strips, for each strip, the scattering coefficient and the gain–phase errors are combined into a multi-parameter optimization problem, which can be estimated by alternate iteration. Simultaneously, the gain–phase error estimation results of the previous strip can be carried into the next strip as the initial value. All strip imaging results, which can be obtained by correlated processing in turn, are spliced to obtain the image inversion results of the whole scene. To further improve gain–phase error estimation and imaging performance, after all the strips are processed in one round, the gain–phase error estimation results of the last strip can be taken as the initial value and substituted into the first strip for the correlated processing of the next round. In this way, all strips are processed in multiple rounds to obtain the final results.

The rest of the paper is organized as follows. In Section 2, the strip-mode MSCI model with gain–phase errors is presented. Section 3 presents the strip-mode MSCI algorithm with self-calibration of gain–phase errors. The computational complexity of the algorithm is analyzed in Section 4. In Section 5, the performance of the proposed method is verified by numerical examples. Finally, Section 6 concludes this paper.

#### **2. Strip-Mode MSCI with Gain–Phase Errors**

As shown in Figure 1, a rectangular coordinate system is established with the center of the transmitting array as the origin. The MSCI system, located on a stationary platform, is composed of *N* transmitters and one receiver, whose position vectors are denoted as *r<sub>n</sub>* and *r<sub>s</sub>*, respectively. The height of the transmitting array is *H* and *θ* is the squint angle. The independent narrow-pulse random frequency hopping (RFH) signals, transmitted synchronously by each antenna in the multi-transmitter array, can be expressed as:

$$f\_n(t) = \sum\_{l=1}^{L} \text{rect}\left[\frac{t - (l-1)\,T\_p}{\tau}\right] a\_n A\_n e^{j\varphi\_n} \exp\left\{j2\pi f\_{nl} \left[t - (l-1)\,T\_p\right] \right\}\tag{1}$$

where *f<sub>nl</sub>* is the frequency of the *l*-th pulse (*l* = 1, 2, ···, *L*) emitted by the *n*-th transmitter (*n* = 1, 2, ···, *N*), selected randomly within the bandwidth *B*, and *τ* and *T<sub>p</sub>* are the pulse width and the pulse period, respectively. *a<sub>n</sub>A<sub>n</sub>* is the gain of the *n*-th transmitter, where *a<sub>n</sub>* denotes the gain error coefficient, which equals 1 when there is no gain error, and *ϕ<sub>n</sub>* denotes the phase error of the *n*-th transmitter, which equals 0 when there is no phase error. For simplicity, when the bandwidth is narrow compared with the central frequency, the gain–phase errors are considered fixed during the imaging process.
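The transmitted waveform of Equation (1) can be generated as in the following minimal sketch. It assumes that the rect window equals 1 for the first *τ* seconds after the start of each pulse and 0 elsewhere, and that the hop frequencies are expressed as baseband offsets drawn uniformly within the bandwidth; the function name and all numerical values are hypothetical.

```python
import numpy as np

def rfh_waveform(t, f_nl, T_p, tau, a=1.0, A=1.0, phi=0.0):
    """Evaluate Equation (1) for one transmitter at the sample times t."""
    t = np.asarray(t, dtype=float)
    f_nl = np.asarray(f_nl, dtype=float)
    l_idx = np.floor(t / T_p).astype(int)             # zero-based pulse index, i.e. l - 1
    valid = (l_idx >= 0) & (l_idx < len(f_nl))
    t_local = t - l_idx * T_p                         # time since the start of the pulse
    inside = valid & (t_local < tau)                  # assumed rect window of width tau
    f = f_nl[np.clip(l_idx, 0, len(f_nl) - 1)]        # hop frequency of the current pulse
    s = a * A * np.exp(1j * phi) * np.exp(1j * 2 * np.pi * f * t_local)
    return np.where(inside, s, 0.0)

rng = np.random.default_rng(1)
N, L = 4, 8                                  # transmitters and pulses (illustrative)
B, T_p, tau = 500e6, 1e-6, 100e-9            # bandwidth, pulse period, pulse width
f = rng.uniform(-B / 2, B / 2, size=(N, L))  # random hop frequencies (baseband offsets)
t = np.arange(0, L * T_p, 1e-9)
signals = np.stack([rfh_waveform(t, f[n], T_p, tau) for n in range(N)])
# A gain error of 1.2 and a phase error of 0.3 rad scale the whole waveform.
s_err = rfh_waveform(t, f[0], T_p, tau, a=1.2, phi=0.3)
```

Because the factor *a<sub>n</sub>* exp(*jϕ<sub>n</sub>*) multiplies the whole waveform, a gain–phase error changes the signal of a transmitter only by a constant complex scale.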

According to the range-gate property of radar, the random narrow-pulse signals transmitted simultaneously by the multi-transmitter array divide the two-dimensional imaging area *S* into multiple strips *S<sub>k</sub>*, *k* = 1, ···, *K*, in the range direction [19]. The imaging strip *S<sub>k</sub>* is discretized into *J* = *P* × *Q* grid cells, where *P* is the number of rows of azimuth resolution cells and *Q* is the number of columns of range resolution cells. The position vector of the center of the *j*-th grid cell is denoted as *r*<sub>*k*,*j*</sub>, and its scattering coefficient is *σ*(*r*<sub>*k*,*j*</sub>), *j* = 1, 2, ···, *J*. According to electromagnetic field propagation in free space, the stochastic radiation field at *r*<sub>*k*,*j*</sub> in the *k*-th strip can be expressed as:

$$E\_k^{\text{in}}(t, \vec{r}\_{k,j}) = \sum\_{n=1}^{N} \frac{f\_n\left(t - |\vec{r}\_{k,j} - \vec{r}\_n| / c\right)}{4\pi |\vec{r}\_{k,j} - \vec{r}\_n|}\tag{2}$$

**Figure 1.** Geometry of MSCI.

The radiation field interacts with the *k*-th strip targets and the received signal of the *k*-th strip is:

$$\begin{split} E\_k^{\text{sca}}(t) &= \sum\_{j=1}^{J} \sum\_{n=1}^{N} \frac{f\_n \left( t - \left( |\vec{r}\_{k,j} - \vec{r}\_n| + |\vec{r}\_s - \vec{r}\_{k,j}| \right) \Big/ c \right)}{(4\pi)^2 |\vec{r}\_{k,j} - \vec{r}\_n||\vec{r}\_s - \vec{r}\_{k,j}|} \sigma(\vec{r}\_{k,j}) \\ &= \sum\_{j=1}^{J} E\_k^{\text{rad}}(t, \vec{r}\_{k,j}) \sigma(\vec{r}\_{k,j}) \end{split} \tag{3}$$

Define the modified radiation field of *S<sub>k</sub>* by accounting for the round-trip propagation time after target reflection, which can be written as:

$$E\_k^{\text{rad}} \left( t, \vec{r}\_{k,j} \right) = \sum\_{n=1}^N \frac{f\_n \left( t - \left( \left| \vec{r}\_{k,j} - \vec{r}\_n \right| + \left| \vec{r}\_s - \vec{r}\_{k,j} \right| \right) / c \right)}{(4\pi)^2 \left| \vec{r}\_{k,j} - \vec{r}\_n \right| \left| \vec{r}\_s - \vec{r}\_{k,j} \right|} \tag{4}$$

The scattered echoes in strip mode can be written in matrix–vector form:

$$\mathbf{E}\_k^{\text{sca}} = \mathbf{E}\_k^{\text{rad}} \cdot \boldsymbol{\sigma}\_k \tag{5}$$

Because of the unknown gain–phase errors, the equation can be rewritten as:

$$\mathbf{E}\_k^{\text{sca}} = \mathbf{E}\_k^{\text{rad}}(\boldsymbol{a}, \boldsymbol{\varphi}) \cdot \boldsymbol{\sigma}\_k \tag{6}$$

where ***a*** = [*a*<sub>1</sub>, *a*<sub>2</sub>, ···, *a<sub>N</sub>*]<sup>*T*</sup> is the vector of the gain error coefficients of the multi-transmitter array, and ***ϕ*** = [*ϕ*<sub>1</sub>, *ϕ*<sub>2</sub>, ···, *ϕ<sub>N</sub>*]<sup>*T*</sup> is the vector of the phase errors.
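Since the factor *a<sub>n</sub>* exp(*jϕ<sub>n</sub>*) multiplies the whole waveform of the *n*-th transmitter in Equation (1), the modified radiation field matrix of Equation (6) is a weighted sum of error-free per-transmitter matrices. The sketch below illustrates this structure; the per-transmitter matrices (called `E_per_tx` here) are a notation introduced only for this example, and random matrices stand in for actual radiation fields.

```python
import numpy as np

def assemble_radiation_field(E_per_tx, a, phi):
    """Combine per-transmitter matrices of shape (N, M, J) into E_rad(a, phi) of shape (M, J)."""
    g = np.asarray(a) * np.exp(1j * np.asarray(phi))   # complex gain of each transmitter
    return np.tensordot(g, E_per_tx, axes=(0, 0))

rng = np.random.default_rng(2)
N, M, J = 5, 40, 12     # transmitters, echo samples, grid cells (illustrative sizes)
E_per_tx = rng.standard_normal((N, M, J)) + 1j * rng.standard_normal((N, M, J))
sigma = rng.standard_normal(J)

E_ideal = assemble_radiation_field(E_per_tx, np.ones(N), np.zeros(N))   # preset field
a = 1 + 0.2 * rng.standard_normal(N)                # hypothetical gain errors
phi = np.deg2rad(20) * rng.standard_normal(N)       # hypothetical phase errors
E_true = assemble_radiation_field(E_per_tx, a, phi) # actual field
# Relative echo mismatch caused by using the preset field instead of the actual one.
mismatch = np.linalg.norm((E_true - E_ideal) @ sigma) / np.linalg.norm(E_true @ sigma)
```

This linear dependence on the complex gains is what makes it possible to estimate the errors from the echo once an estimate of the scattering coefficients is available.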

Strip-mode MSCI obtains the target information ***σ̂***<sub>*k*</sub> in *S<sub>k</sub>* by correlated processing between **E**<sub>*k*</sub><sup>*sca*</sup> and **E**<sub>*k*</sub><sup>*rad*</sup>, which can be described as:

$$\hat{\boldsymbol{\sigma}}\_{k} = \wp \left[ \mathbf{E}\_{k}^{\text{rad}}, \mathbf{E}\_{k}^{\text{sca}} \right] \tag{7}$$

where ℘ indicates the first-order correlated operator. Common correlated imaging algorithms include the LS algorithm, TSVD regularization, Tikhonov regularization, TV regularization, sparse Bayesian learning, etc.

Each strip is processed in turn to obtain all the MSCI results, and then all the strip images are spliced to obtain the imaging result of the whole scene. Since each reconstruction result is a one-dimensional vector obtained by vectorizing the two-dimensional grid of the strip, every reconstruction result ***σ̂***<sub>*k*</sub> needs to be converted into the corresponding two-dimensional form ***σ̂***′<sub>*k*</sub>. The imaging result of the whole scene can be expressed as:

$$
\hat{\boldsymbol{\sigma}}' = \left[ \hat{\boldsymbol{\sigma}}'\_{1}, \hat{\boldsymbol{\sigma}}'\_{2}, \dots, \hat{\boldsymbol{\sigma}}'\_{K} \right] \tag{8}
$$
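The splicing step can be sketched as a reshape followed by a concatenation. The example assumes that each strip vector is stored row by row (azimuth index varying slowest) and that the strips are adjacent along the range direction, so they are joined along the column axis; both the storage order and the function name are assumptions made for illustration.

```python
import numpy as np

def splice_strips(sigma_hats, P, Q):
    """Reshape each length-(P*Q) strip vector to P x Q and join the strips along range."""
    images = [np.reshape(s, (P, Q)) for s in sigma_hats]   # assumed row-major ordering
    return np.concatenate(images, axis=1)                  # columns are range cells

P, Q, K = 2, 3, 3   # azimuth cells, range cells per strip, number of strips
strip_vectors = [np.arange(P * Q) + 10 * k for k in range(K)]
scene = splice_strips(strip_vectors, P, Q)                 # shape (P, K * Q)
```

If the grid was vectorized column by column instead, `np.reshape(s, (P, Q), order="F")` would be the matching inverse operation.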

The whole imaging process mainly includes: transmitting the signals, interaction between the radiation field and the target to form the scattered echo, receiving the echo, dividing the scene into strips, MSCI with self-calibration of gain–phase errors for each strip, and obtaining the image of the whole scene by splicing the imaging results of all strips. The flow chart of the whole imaging process is shown in Figure 2.

**Figure 2.** Imaging process flow chart.

#### **3. Strip-Mode MSCI Algorithm with Self-Calibration of Gain–Phase Errors**

According to the strip-mode MSCI method, the echo corresponding to each strip can be extracted from the received echo according to the range gate. Therefore, correlated imaging with gain–phase errors can be carried out separately for each strip. The modified radiation field is unknown because of the gain–phase errors. The gain–phase error estimation and the target reconstruction can be combined into one optimization problem, whose cost function can be expressed as:

$$F(\boldsymbol{\sigma}\_k, \boldsymbol{a}, \boldsymbol{\varphi}) = ||\mathbf{E}\_k^{\text{sca}} - \mathbf{E}\_k^{\text{rad}}(\boldsymbol{a}, \boldsymbol{\varphi}) \cdot \boldsymbol{\sigma}\_k||\_2^2 + \lambda ||\boldsymbol{\sigma}\_k||\_1 \tag{9}$$

where *λ* is the regularization parameter.

Then the gain–phase error calibration and target reconstruction for the *k*-th strip can be converted into the following optimization problem:

$$\left[\hat{\boldsymbol{\sigma}}\_k, \hat{\boldsymbol{a}}, \hat{\boldsymbol{\varphi}}\right] = \underset{\boldsymbol{\sigma}\_k, \boldsymbol{a}, \boldsymbol{\varphi}}{\operatorname{arg\,min}}\, F\left(\boldsymbol{\sigma}\_k, \boldsymbol{a}, \boldsymbol{\varphi}\right) \tag{10}$$

To solve the above problem, a strip-mode MSCI algorithm based on self-calibration of gain–phase errors is proposed for the whole target scene. The algorithm divides the whole scene into strips, and then carries out a joint iterative solution of target reconstruction and gain–phase error estimation for each strip. Within one iteration, the target reconstruction result is obtained by minimizing the cost function with the given gain–phase errors. The gain–phase errors are then estimated from the target reconstruction result, and the modified radiation field matrix is updated with the gain–phase error estimates for the next iteration. We terminate the iteration if $||\boldsymbol{\sigma}\_k^{i+1} - \boldsymbol{\sigma}\_k^{i}||\_2^2 / ||\boldsymbol{\sigma}\_k^{i}||\_2^2 < \eta$ or the maximum number of iterations *I*max is reached, where *η* is a predetermined threshold and the superscript *i* refers to the iteration. The key steps of the algorithm are target reconstruction and gain–phase error estimation, which are realized as follows.

#### *3.1. Target Reconstruction*

For a single strip, the target is reconstructed with the gain–phase errors given. The initial gain–phase errors of the first strip are *a* = **1**, *ϕ* = **0**. Target reconstruction can be expressed as:

$$\boldsymbol{\sigma}\_{k}^{i+1} = \underset{\boldsymbol{\sigma}\_{k}}{\arg\min}\, ||\mathbf{E}\_{k}^{\text{sca}} - \mathbf{E}\_{k}^{\text{rad}}(\boldsymbol{a}^{i}, \boldsymbol{\varphi}^{i}) \cdot \boldsymbol{\sigma}\_{k}||\_{2}^{2} + \lambda ||\boldsymbol{\sigma}\_{k}||\_{1} \tag{11}$$

The above formula is a standard compressed sensing reconstruction model. Many existing methods address this problem, such as the basis pursuit (BP) algorithm [20], the orthogonal matching pursuit (OMP) algorithm [21], and sparse Bayesian learning (SBL) [22,23]. In this paper, the OMP algorithm is adopted because it is simple in structure and easy to implement and analyze.
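For orientation, a textbook OMP iteration can be sketched as follows. This is a generic greedy implementation under stated assumptions (known sparsity, unit-norm columns in the toy data, a random matrix standing in for the modified radiation field matrix); it is not the authors' code, and the names `omp`, `A`, and `x_true` are illustrative.

```python
import numpy as np

def omp(A, y, sparsity):
    """Greedy OMP: pick the column most correlated with the residual, then re-fit by least squares."""
    x = np.zeros(A.shape[1], dtype=complex)
    support = []
    coef = np.zeros(0, dtype=complex)
    residual = y.astype(complex)
    for _ in range(sparsity):
        correlations = np.abs(A.conj().T @ residual)
        if support:
            correlations[support] = 0.0            # never reselect an atom already chosen
        support.append(int(np.argmax(correlations)))
        # Least-squares fit restricted to the current support.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x[support] = coef
    return x

# Toy noise-free problem: 120 measurements, 40 grid cells, 3 scatterers.
rng = np.random.default_rng(0)
A = rng.standard_normal((120, 40)) + 1j * rng.standard_normal((120, 40))
A /= np.linalg.norm(A, axis=0)                     # unit-norm columns
x_true = np.zeros(40, dtype=complex)
x_true[[3, 17, 29]] = [1.0, -1.0, 1.0j]
x_rec = omp(A, A @ x_true, sparsity=3)
```

In the setting of Equation (11), `A` would play the role of the modified radiation field matrix for the current gain–phase estimates and `y` the echo of the strip.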

#### *3.2. Gain–Phase Error Estimation*

The gain and phase errors are estimated in an alternate iteration manner. The gain error is estimated as:

$$\boldsymbol{a}^{i+1} = \underset{\boldsymbol{a}}{\arg\min}\, ||\mathbf{E}\_k^{\text{sca}} - \mathbf{E}\_k^{\text{rad}}(\boldsymbol{a}, \boldsymbol{\varphi}^i) \cdot \boldsymbol{\sigma}\_k^{i+1}||\_2^2 + \lambda ||\boldsymbol{\sigma}\_k^{i+1}||\_1 \tag{12}$$

Since $||\boldsymbol{\sigma}\_k^{i+1}||\_1$ is a constant in the iteration, Equation (12) can be rewritten as:

$$\boldsymbol{a}^{i+1} = \underset{\boldsymbol{a}}{\arg\min}\, ||\mathbf{E}\_{k}^{\text{sca}} - \mathbf{E}\_{k}^{\text{rad}}(\boldsymbol{a}, \boldsymbol{\varphi}^{i}) \cdot \boldsymbol{\sigma}\_{k}^{i+1}||\_{2}^{2} \tag{13}$$

The above formula is a nonlinear least-squares problem; thus, we use Newton's method [24] to solve it.

Defining $g\_k(\boldsymbol{a}, \boldsymbol{\varphi}) = ||\mathbf{E}\_k^{\text{sca}} - \mathbf{E}\_k^{\text{rad}}(\boldsymbol{a}, \boldsymbol{\varphi}) \cdot \boldsymbol{\sigma}\_k^{i+1}||\_2^2$, the updated estimate $\boldsymbol{a}^{i+1}$ is computed from $\boldsymbol{a}^{i}$ as:

$$\boldsymbol{a}^{i+1} = \boldsymbol{a}^i - \left[\nabla\_{\boldsymbol{a}}^2 g\_k\left(\boldsymbol{a}^i, \boldsymbol{\varphi}^i\right)\right]^{-1} \left[\nabla\_{\boldsymbol{a}} g\_k\left(\boldsymbol{a}^i, \boldsymbol{\varphi}^i\right)\right] \tag{14}$$

where $\nabla\_{\boldsymbol{a}} g\_k(\boldsymbol{a}^i, \boldsymbol{\varphi}^i)$ and $\nabla\_{\boldsymbol{a}}^2 g\_k(\boldsymbol{a}^i, \boldsymbol{\varphi}^i)$ represent the gradient and Hessian with respect to the gain error, respectively. After derivation and simplification, we have:

$$\nabla\_{\boldsymbol{a}} g\_k(\boldsymbol{a}^i, \boldsymbol{\varphi}^i) = -2 \text{Re}((\boldsymbol{B}^k(\boldsymbol{a}^i, \boldsymbol{\varphi}^i))^H \widehat{\boldsymbol{w}}) \tag{15}$$

$$\nabla^2\_{\boldsymbol{a}} g\_k(\boldsymbol{a}^i, \boldsymbol{\varphi}^i) = 2 \text{Re}((\boldsymbol{B}^k(\boldsymbol{a}^i, \boldsymbol{\varphi}^i))^H \boldsymbol{B}^k(\boldsymbol{a}^i, \boldsymbol{\varphi}^i)) \tag{16}$$

$$
\widehat{\boldsymbol{w}} = \mathbf{E}\_k^{\rm sca} - \mathbf{E}\_k^{\rm rad}(\boldsymbol{a}^i, \boldsymbol{\varphi}^i) \cdot \boldsymbol{\sigma}\_k^{i+1} \tag{17}
$$

$$\boldsymbol{B}^k(\boldsymbol{a}^i, \boldsymbol{\varphi}^i) = [\boldsymbol{b}\_1^k(\boldsymbol{a}^i, \boldsymbol{\varphi}^i), \dots, \boldsymbol{b}\_N^k(\boldsymbol{a}^i, \boldsymbol{\varphi}^i)] \tag{18}$$

where Re() denotes the real part,

$$\boldsymbol{b}\_{n}^{k}(\boldsymbol{a}^{i},\boldsymbol{\varphi}^{i}) = \mathrm{e}^{j\varphi\_{n}^{i}} \begin{bmatrix} S\_{n}(t\_{1},\vec{r}\_{k,1}) & \cdots & S\_{n}(t\_{1},\vec{r}\_{k,J}) \\ \vdots & \ddots & \vdots \\ S\_{n}(t\_{L},\vec{r}\_{k,1}) & \cdots & S\_{n}(t\_{L},\vec{r}\_{k,J}) \end{bmatrix} \cdot \boldsymbol{\sigma}\_{k}^{i+1} \tag{19}$$

$$S\_{n}(t,\vec{r}\_{k,j}) = \frac{f\_{n}\left(t - \left(|\vec{r}\_{k,j} - \vec{r}\_{n}| + |\vec{r}\_s - \vec{r}\_{k,j}|\right) / c\right)}{\left(4\pi\right)^2 |\vec{r}\_{k,j} - \vec{r}\_n| |\vec{r}\_s - \vec{r}\_{k,j}|} \tag{20}$$

$$f\_{n}(t) = \sum\_{l=1}^{L} \text{rect}\left[\frac{t - \left(l - 1\right)T\_p}{\tau}\right] A\_{n} \exp\{j2\pi f\_{nl} \left[t - \left(l - 1\right)T\_p\right]\}\tag{21}$$
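The gain update of Equations (14)–(19) can be sketched in NumPy as below. The array layout is an assumption made for illustration: `S` has shape (N, L, J) and stacks the per-transmitter matrices of Equation (19), `sigma` is the current strip reconstruction, and `e_sca` is the strip echo. Because the cost is quadratic in the real gain vector when the phases are held fixed, a single Newton step reaches the least-squares solution in this noise-free toy case.

```python
import numpy as np

def newton_gain_step(S, sigma, e_sca, a, phi):
    """One Newton update of the gain errors, following Eqs. (14)-(19)."""
    # Columns b_n = exp(j*phi_n) * S_n @ sigma (Eq. (19)); B has shape (L, N).
    B = (np.exp(1j * phi)[:, None] * (S @ sigma)).T
    w = e_sca - B @ a                        # residual of Eq. (17), since E_rad @ sigma = B @ a
    grad = -2.0 * np.real(B.conj().T @ w)    # Eq. (15)
    hess = 2.0 * np.real(B.conj().T @ B)     # Eq. (16)
    return a - np.linalg.solve(hess, grad)   # Eq. (14)

# Toy data (hypothetical sizes): N transmitters, L time samples, J grid cells in the strip.
rng = np.random.default_rng(1)
N, L, J = 5, 40, 8
S = rng.standard_normal((N, L, J)) + 1j * rng.standard_normal((N, L, J))
sigma = rng.standard_normal(J) + 1j * rng.standard_normal(J)
a_true = 1.0 + 0.2 * rng.standard_normal(N)
phi = 0.3 * rng.standard_normal(N)
B_true = (np.exp(1j * phi)[:, None] * (S @ sigma)).T
e_sca = B_true @ a_true                      # noise-free echo generated with the true gains
a_new = newton_gain_step(S, sigma, e_sca, np.ones(N), phi)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse Hessian explicitly, which is numerically preferable to a literal matrix inversion.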

In the same way, the phase error is estimated as:

$$\boldsymbol{\varphi}^{i+1} = \underset{\boldsymbol{\varphi}}{\arg\min}\, ||\mathbf{E}\_{k}^{\rm sca} - \mathbf{E}\_{k}^{\rm rad}(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}) \cdot \boldsymbol{\sigma}\_{k}^{i+1}||\_{2}^{2} + \lambda ||\boldsymbol{\sigma}\_{k}^{i+1}||\_{1} \tag{22}$$

The updated estimate $\boldsymbol{\varphi}^{i+1}$ is computed from $\boldsymbol{\varphi}^{i}$ as:

$$\boldsymbol{\varphi}^{i+1} = \boldsymbol{\varphi}^i - \left[\nabla\_{\boldsymbol{\varphi}}^2 g\_k(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^i)\right]^{-1} \left[\nabla\_{\boldsymbol{\varphi}} g\_k(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^i)\right] \tag{23}$$

The gradient and Hessian with respect to the phase error can be computed as:

$$\nabla\_{\boldsymbol{\varphi}} g\_{k}(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^{i}) = -2 \mathrm{Im}((\boldsymbol{D}^{k}(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^{i}))^{H} \widehat{\boldsymbol{w}}) \tag{24}$$

$$\nabla^2\_{\boldsymbol{\varphi}} g\_k(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^i) = 2 \text{diag}(\text{Re}((\boldsymbol{D}^k(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^i))^H \widehat{\boldsymbol{w}})) + 2 \text{Re}((\boldsymbol{D}^k(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^i))^H \boldsymbol{D}^k(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^i)) \tag{25}$$

$$\boldsymbol{D}^k(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^i) = [\boldsymbol{d}\_1^k(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^i), \dots, \boldsymbol{d}\_N^k(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^i)] \tag{26}$$

where Im() denotes the imaginary part and diag() is the diagonalization operation.

$$\boldsymbol{d}\_n^k(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^i) = \mathrm{e}^{j\varphi\_n^i} a\_n^{i+1} \begin{bmatrix} S\_n(t\_1, \vec{r}\_{k,1}) & \cdots & S\_n(t\_1, \vec{r}\_{k,J}) \\ \vdots & \ddots & \vdots \\ S\_n(t\_L, \vec{r}\_{k,1}) & \cdots & S\_n(t\_L, \vec{r}\_{k,J}) \end{bmatrix} \cdot \boldsymbol{\sigma}\_k^{i+1} \tag{27}$$
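As a sanity check on Equations (24)–(27), the sketch below evaluates the phase gradient and Hessian and compares the gradient with a central finite difference of the cost. The array layout (`S` of shape (N, L, J)) and all function names are assumptions for illustration, and the residual is evaluated at the same gains and phases as the matrix of Equation (26).

```python
import numpy as np

def residual(S, sigma, e_sca, a, phi):
    """Residual w and the matrix D whose columns are d_n = a_n * exp(j*phi_n) * S_n @ sigma."""
    D = (a[:, None] * np.exp(1j * phi)[:, None] * (S @ sigma)).T   # shape (L, N), Eq. (27)
    return e_sca - D.sum(axis=1), D        # E_rad @ sigma equals the sum of the columns d_n

def cost(S, sigma, e_sca, a, phi):
    w, _ = residual(S, sigma, e_sca, a, phi)
    return np.vdot(w, w).real              # squared 2-norm of the residual

def phase_grad_hess(S, sigma, e_sca, a, phi):
    w, D = residual(S, sigma, e_sca, a, phi)
    DHw = D.conj().T @ w
    grad = -2.0 * np.imag(DHw)                                             # Eq. (24)
    hess = 2.0 * np.diag(np.real(DHw)) + 2.0 * np.real(D.conj().T @ D)     # Eq. (25)
    return grad, hess

# Toy data (hypothetical sizes) and a finite-difference check of the gradient.
rng = np.random.default_rng(2)
N, L, J = 4, 30, 6
S = rng.standard_normal((N, L, J)) + 1j * rng.standard_normal((N, L, J))
sigma = rng.standard_normal(J) + 1j * rng.standard_normal(J)
a = 1.0 + 0.2 * rng.standard_normal(N)
phi = 0.3 * rng.standard_normal(N)
e_sca = rng.standard_normal(L) + 1j * rng.standard_normal(L)
grad, hess = phase_grad_hess(S, sigma, e_sca, a, phi)
eps = 1e-6
grad_fd = np.array([
    (cost(S, sigma, e_sca, a, phi + eps * e) - cost(S, sigma, e_sca, a, phi - eps * e)) / (2 * eps)
    for e in np.eye(N)
])
```

A Newton step as in Equation (23) would then be `phi - np.linalg.solve(hess, grad)`; unlike the gain problem, the cost is not quadratic in the phases, so several steps are generally needed.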

The above describes a single iteration of gain–phase error estimation by Newton's method. In the *i*-th iteration, $(\boldsymbol{a}^{i}, \boldsymbol{\varphi}^{i})$ is updated to $(\boldsymbol{a}^{i+1}, \boldsymbol{\varphi}^{i+1})$. The initial gain–phase errors of the first strip are *a* = **1**, *ϕ* = **0**. The gain–phase error estimates of the former strip are taken as the initial values for the latter strip, which makes the estimation results of the latter strip more accurate. After the first round of correlated processing with gain–phase error calibration has been completed for all strips, the gain–phase error estimates of the last strip are carried into the first strip for the next round, and the whole imaging area divided into strips is processed in multiple rounds to obtain the final results.
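The warm-start scheme described above (strip to strip, then round to round) can be sketched as a simple driver loop. Here `solve_strip` is a hypothetical callback standing in for the joint iteration of Sections 3.1 and 3.2 on one strip; its name and signature are assumptions for illustration only.

```python
import numpy as np

def process_scene(solve_strip, n_strips, n_elements, n_rounds):
    """Run several rounds over all strips, carrying the gain-phase estimates forward."""
    a = np.ones(n_elements)        # initial gain errors of the first strip, a = 1
    phi = np.zeros(n_elements)     # initial phase errors of the first strip, phi = 0
    images = [None] * n_strips
    for _ in range(n_rounds):
        for k in range(n_strips):
            # The estimates returned for strip k initialize strip k + 1; after the last
            # strip they initialize the first strip of the next round.
            images[k], a, phi = solve_strip(k, a, phi)
    return images, a, phi

# Mock solver that only records how often the estimates were passed along.
def mock_solver(k, a, phi):
    return f"strip-{k}", a + 1.0, phi + 1.0

images, a_final, phi_final = process_scene(mock_solver, n_strips=3, n_elements=2, n_rounds=2)
```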

The whole process of the algorithm is as follows:


#### **4. Analysis**

The proposed strip-mode MSCI method based on self-calibration of gain–phase errors can greatly reduce the computational cost of the imaging process. Let the total number of grids in the target scene be *M*, divided into *K* strips, with *J* grids in each strip. The main operations of one iteration of the imaging process are updating the modified radiation field matrix, target reconstruction, and gain–phase error estimation by Newton's method. According to the characteristics of MSCI, the number of narrow pulses *L* should generally satisfy *L* > *M*. Compared with no strip division, after the target scene is divided into *K* strips, the number of grids within each strip is reduced to *M*/*K* and the number of narrow pulses is reduced to *L*/*K*, so the scale of the modified radiation field matrix is reduced to *ML*/*K*<sup>2</sup>. Therefore, the computation required for updating the modified radiation field matrix and for Newton's method is reduced significantly. For the OMP algorithm in the target reconstruction process, with sparsity *d*, the computation is O(*d* · *L* · *M*) [21] when there are no strips; in contrast, when the scene is divided into *K* strips, the computation is *K* · O(*d*/*K* · *L*/*K* · *M*/*K*). The above discussion concerns the change in computation within one iteration. In practice, because strip division simplifies the target scene and the operations, the average number of iterations required in the imaging process also decreases, and the operation time is further reduced.
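As a worked illustration of the OMP operation counts above (ignoring constant factors and assuming, as in the text, that the sparsity is spread evenly so that each strip has sparsity *d*/*K*), the ratio of the per-iteration cost with and without strip division is:

$$
\frac{K \cdot \frac{d}{K} \cdot \frac{L}{K} \cdot \frac{M}{K}}{d \cdot L \cdot M} = \frac{1}{K^{2}}, \qquad \text{e.g.,} \quad K = 8 \;\Rightarrow\; \frac{1}{K^{2}} = \frac{1}{64}
$$

This ratio covers only the OMP step of a single iteration; the overall imaging time also depends on the Newton updates, the matrix updates, and the number of iterations.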

#### **5. Simulations**

The effectiveness of the proposed method is verified by several simulations in this section. An X-band MSCI radar system with a center frequency of 10 GHz is considered. The scenario for simulation is shown in Figure 1. The transmitter array is at a height of 300 m and consists of 25 elements forming a uniform array of 3 × 3 m in size. The distance of the target scene is 450 m, and the target scene is discretized into 40 × 40 grids with a grid size of 2 × 2 m. We initialize *a* = **1**, *ϕ* = **0**, *I*max = 100, and *η* = 10<sup>−4</sup>. Some system parameters are given in Table 1, and the parameters of the gain–phase errors are given in Table 2.


**Table 1.** System parameters.


**Table 2.** Gain–phase error parameters.

#### *5.1. Performances Under Different Number of Strips*

In this subsection, simulations are carried out to compare the performance with different numbers of strips. The normalized mean square error (NMSE) is used to quantify the quality of the reconstruction and of the gain–phase error estimation, defined as $NMSE\_{dB} = 20 \lg\left(||\hat{\boldsymbol{x}} - \boldsymbol{x}||\_2 / ||\boldsymbol{x}||\_2\right)$, where $\boldsymbol{x}$ denotes the target image or the gain–phase errors and, accordingly, $\hat{\boldsymbol{x}}$ denotes the target reconstruction or the gain–phase error estimation result.
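The NMSE defined above can be computed as in the following short sketch; the function name `nmse_db` is illustrative.

```python
import numpy as np

def nmse_db(x_hat, x):
    """NMSE in dB: 20 * log10(||x_hat - x||_2 / ||x||_2), computed on the flattened arrays."""
    x_hat = np.asarray(x_hat).ravel()
    x = np.asarray(x).ravel()
    return 20.0 * np.log10(np.linalg.norm(x_hat - x) / np.linalg.norm(x))

# Example: an error norm of 0.5 against a reference norm of 5 gives a ratio of 0.1.
value = nmse_db([3.0, 4.5], [3.0, 4.0])
```

Lower (more negative) values indicate a closer match between the estimate and the reference.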

It can be seen in Figure 3b that the image is defocused and many spurious scatterers exist for the OMP algorithm. In Figure 3c–f, it can be seen that the image becomes clearer as the number of strips increases. The NMSEs of the reconstructed images for different numbers of strips are given in Figure 4, which shows that the NMSE decreases as the number of strips increases, meaning that the imaging quality improves. Compared with no strip division, the proposed method with eight strips improves the imaging performance by about 20 dB in terms of NMSE.

**Figure 3.** Imaging results (**a**) objective model; (**b**) Imaging results of OMP; (**c**–**f**) Imaging results under different number of strips (**c**) no strip; (**d**) two strips; (**e**) four strips; (**f**) eight strips.

**Figure 4.** NMSE of target reconstructions under different number of strips.

In Figures 5 and 6, it can be seen that the estimates of the gain and phase errors get closer to the actual values as the number of strips increases. As shown in Figure 7, the NMSEs of the gain–phase error estimation decrease as the number of strips increases, which means that the estimation errors decrease and shows that the proposed method can effectively improve the accuracy of gain–phase error estimation.

**Figure 5.** Gain error estimation under different number of strips (**a**) no strip; (**b**) two strips; (**c**) four strips; (**d**) eight strips.

**Figure 6.** Phase error estimation under different number of strips (**a**) no strip; (**b**) two strips; (**c**) four strips; (**d**) eight strips.

**Figure 7.** Gain–phase error estimation performance under different strips (**a**) NMSE of gain error estimation; (**b**) NMSE of phase error estimation.

In Figure 8, the imaging time decreases significantly as the number of strips increases, which is consistent with the analysis in this paper. Dividing the scene into eight strips takes less than 1/15 of the time required without strip division. This shows that strip division can greatly reduce the time required for the correlated imaging process.

**Figure 8.** Imaging time under different number of strips.

#### *5.2. Performance under Different SNRs*

In this subsection, we compare the performance of the proposed method and SACRCI [15] under different SNRs. As shown in Figure 9, the imaging quality improves significantly as the SNR increases, which means that both methods are sensitive to noise. The proposed method improves the imaging performance by more than 10 dB compared with SACRCI in terms of NMSE. In Figure 10, it can be seen that the gain–phase error estimation is also sensitive to noise.

**Figure 9.** NMSE of target reconstructions under different SNRs.

#### *5.3. Performance under Different Transmitting Array Configurations*

In MSCI, the transmitting array configuration can influence the imaging effect; therefore, we perform simulations in this subsection to compare the performance under different transmitting array configurations. In Figure 11a, the transmitting array is a uniform planar array with an aperture of 3 m. In Figure 11b, the array elements are randomly distributed on the plane. In Figure 11c, the aperture of the uniform planar array is reduced to 1.5 m. From the imaging results, it can be seen that the size of the array aperture influences the target reconstruction significantly, which is consistent with the relationship between the array aperture size and the imaging resolution.

**Figure 10.** Gain–phase error estimation performance under different SNRs (**a**) NMSE of gain error estimation; (**b**) NMSE of phase error estimation.

**Figure 11.** Imaging results under different transmitting array configurations (**a**–**c**) Different transmitting array configurations; (**d**–**f**) Imaging results.

#### *5.4. Performance under Different Center Frequencies*

In this subsection, the performance under different center frequencies is compared by simulations. It can be seen in Figure 12 that the target reconstruction result is not clear when the center frequency is 1 GHz; in contrast, when the center frequency is 40 GHz, the target reconstruction is much better. This is because the resolution of MSCI is related to the center frequency: the higher the center frequency, the better the resolution, and the better the imaging effect under the same grid division.

**Figure 12.** Imaging results under different center frequencies (**a**) 1 GHz; (**b**) 40 GHz.

#### *5.5. Performance under Different Target Scenes*

Since the target reconstruction results are obtained by OMP, the reconstruction performance may be affected by the target or, more precisely, by the sparsity of the target. In this subsection, we design simulations to compare the performance under different target scenes.

Figure 13a–c shows three different target scenes. It can be seen that the images become blurred as the complexity of the targets increases, which means that a less sparse target makes the target reconstruction more difficult, and the gain–phase error estimation performance is also affected. Compared with the results obtained by SACRCI, the bottom three images, which are obtained by the proposed method, contain far fewer spurious scatterers, and the three targets are identified clearly. This shows that the proposed method can improve the imaging performance by reducing the complexity of the correlated imaging processing.

**Figure 13.** Imaging results for different target scenes (**a**–**c**) Three different target scenes; (**d**–**f**) Imaging results of SACRCI; (**g**–**i**) Imaging results of the proposed method.

#### *5.6. Discussion*

Extensive numerical simulations validate the potential advantages of the proposed method in dramatically shortening the imaging time and improving the imaging and gain–phase error estimation performance, and show its performance under different SNRs, targets, array configurations, and center frequencies. In a practical system, since the proposed method uses the range-gate characteristic of the narrow pulse to divide the imaging area into strips, the transmitting system must have a high rectangular coefficient, and each transmitting element needs a high-precision time–frequency reference. The system must also have high-precision time–frequency synchronization to ensure the accurate separation of the part of the echo corresponding to each strip. These are great challenges for a practical MSCI system.

#### **6. Conclusions**

This paper proposes an MSCI method based on strip-mode self-calibration of gain–phase errors. After the target scene is divided into strips, target reconstruction and gain–phase error estimation are solved jointly by alternating iteration. The simulations show that the gain–phase error calibration and the imaging effect are greatly improved and that the time required for the entire imaging process is greatly shortened. Moreover, to further improve the imaging and gain–phase error estimation performance, the gain–phase error estimates of the previous strip are carried into the next strip as the initial values, and the gain–phase error estimates of the last strip are used as the initial values in the next round. In conclusion, the proposed method can greatly reduce the time required by the imaging process and improve the imaging quality, so it can rapidly achieve gain–phase error calibration and target imaging in a large scene.

**Author Contributions:** All authors contributed extensively to the work presented in this paper. R.X. proposed the original idea, designed the study, performed the simulations and wrote the paper; Y.G. supervised the analysis and edited the manuscript; W.C. and D.W. provided valuable suggestions to improve this study.

**Funding:** This research received no external funding.

**Acknowledgments:** This work has been supported by the National Natural Science Foundation of China under contract Nos. 61771446, 61431016.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**



© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Geometrical Matching of SAR and Optical Images Utilizing ASIFT Features for SAR-based Navigation Aided Systems**

#### **Jakub Markiewicz <sup>1</sup>, Karol Abratkiewicz <sup>2,</sup>\*, Artur Gromek <sup>2</sup>, Wojciech Ostrowski <sup>1</sup>, Piotr Samczyński <sup>2</sup> and Damian Gromek <sup>2</sup>**


Received: 21 October 2019; Accepted: 6 December 2019; Published: 12 December 2019

**Abstract:** This article presents a new approach to the estimation of shift and rotation between two images from different kinds of imaging sensors. The first image is an orthophotomap that is created using optical sensors with georeference information. The second one is created utilizing a Synthetic Aperture Radar (SAR) sensor. The proposed solution can be mounted on a flying platform, and, during the flight, the obtained SAR images are compared with the reference optical images, and thus it is possible to calculate the shift and rotation between these two images and then the direct georeferencing error. Since both images have georeference information, it is possible to calculate the navigation correction in cases when the drift of the calculated trajectory is expected. The method can be used in platforms where there is no satellite navigation signal and the trajectory is calculated on the basis of an inertial navigation system, which is characterized by a significant error. The proposed method of estimating the navigation error utilizing Affine Scale-Invariant Feature Transform (ASIFT) and Structure from Motion (SfM) is described, and techniques for improving the quality of SAR imaging using despeckling filters are presented. The methodology was tested and verified using real-life SAR images. Differences between the results obtained for a few selected despeckling methods were compared and commented on. Deep investigation of the nature of the SAR imaging technique and noise creation character allows new algorithms to be developed, which can be implemented on flying platforms to support existing navigation systems in which trajectory error occurs.

**Keywords:** SAR; Synthetic Aperture Radar; ASIFT; Despeckling Filter; Navigation; Structure from Motion; Iterative Closest Point

#### **1. Introduction**

Over recent years, supporting navigation systems has become particularly important for several reasons. Firstly, the aim is to increase the precision of ammunition and flying objects. Secondly, it is necessary to create systems that can work without the support of GNSS (Global Navigation Satellite System). The lack of a GNSS (GPS, GLONASS, Galileo, or others) signal is a significant limitation whose probability of occurring increases due to potential international conflicts, as well as the possibility of the GNSS signal being jammed or interrupted. For these reasons, it is necessary to create independent systems that allow navigation in the absence of a satellite signal. One of the basic sensors that allows the navigation of objects is the inertial navigation system (INS). However, due to the significant drift and error increasing with time, inertial navigation requires additional systems. Currently, drift is reduced using GNSS systems; however, due to the limitations mentioned above, it is necessary to use additional sensors to support inertial navigation.

In the literature, several solutions have been proposed that allow the navigation of objects in the absence of a GNSS signal. The least effective solutions are purely vision methods, which are ineffective in the case of night missions, cloud cover, fog, smog, or smoke. Other remote sensing methods that do not use sensors working in the visible spectrum are used more often. Such solutions use sensors such as light detection and ranging (LIDAR) and an altimeter, which allows a terrain contour (DEM) to be obtained, and then compares the acquired contour with the data in the database. Such systems, also known as Terrain Contour Matching (TERCOM), are utilized to navigate unmanned aerial vehicles or cruise missiles in cases when a GNSS signal could be unavailable [1–3].

A recently developed technique is the use of Synthetic Aperture Radar (SAR) and Interferometric SAR (InSAR) for navigation correction [4–7]. The radio waves used in this technique can be used during cloud cover, night, and rain, which makes this method universal and independent of the weather conditions. In this case, a radar sensor and a database with georeferenced images are on board the flying platform. During the flight, the terrain is scanned by the radar and the obtained image is compared to the corresponding one in the database. Thanks to this, the inertial navigation error can be reduced. Such a concept of utilizing SAR and InSAR sensors to support on-board air platform navigational devices was proposed in the SARINA (SAR-based Augmented Integrity Navigation Architecture) project carried out during 2010–2012. The authors of this paper were also involved in this work, and as a result have developed a concept of the system and proved that SAR/InSAR sensors can be successfully used to support navigational devices. The results of this work were published by the authors in [4–6]. Initially, the concept was proved only using simulations at technology readiness level (TRL) 3–4 on the nine-level scale, where 9 denotes the system prototype after all the required certifications. In the previous work within the SARINA project, the merging of SAR/InSAR images with an optical image database was based on simple automatic shape recognition of the terrain targets, using target contour extraction. Image processing was then applied using different techniques, such as the Hough transform, to find specific targets and recognize their shape. These techniques were used for SAR image matching. For InSAR, algorithms based on matching 3D SAR interferograms with LIDAR elevation models were developed that were equipped with a simulated on-board database.
The very promising results of the previous SARINA project motivated the authors to continue work on the topic, and to start developing a system at a higher TRL. In the meantime, new algorithms for SAR and optical images have been developed and are widely used for other applications, such as in geodesy. The authors intend to test the efficiency of these techniques to determine whether they could be used in support of air platform navigational devices. One such approach is based on the SIFT technique, and was presented by the authors in [7]. The novelty and usability of this approach is presented in [8]. The SIFT algorithm was used to find and match corresponding keypoints appearing in the SAR and optical images. This approach is extended in this paper by utilizing the more robust ASIFT (Affine Scale-Invariant Feature Transform) method and investigating its limitations. The novelty of the solution proposed and described in this paper is in the utilization of an innovative approach to the shift and rotation estimation between SAR and optical images. By finding characteristic points in both images and applying the modified version of the Structure from Motion (SfM) technique, an error can be estimated, which in turn provides the navigation drift correction in the flying platform. In the authors' opinion, the ASIFT technique is of interest for checking the efficiency and precision with which the mismatch between SAR and optical images can be calculated, which might be further used to correct platform navigation devices according to the algorithms developed by the authors in their previous work [4–6]. To the authors' knowledge, an innovative approach to the shift and rotation estimation between SAR and optical images that also applies SfM techniques has not previously been used for navigation drift correction on a flying platform, which the authors of this paper are currently working on. In this paper, the authors present an overview of the SIFT and ASIFT techniques and their modifications, as well as the required pre-processing and its limits, which are taken into account when developing SAR-based systems for navigation drift correction. The paper is structured as follows. Section 2 describes the proposed method, presenting an overall problem characterization and solution. Section 3 presents the basics of the SfM approach to outline the key elements of this technique. Section 4 presents different types of SAR filters providing speckle noise reduction in images created using SAR radars. The results are presented in Section 5, and the conclusion closes the article. Additionally, the Appendix presents the extensive results in a single part of the article. The corresponding parts of the paper contain references to the Appendix and the images it contains for the reader's convenience.

#### **2. Methodology**

#### *2.1. Overview of the Approach*

The process of image registration refers to the alignment of two or more images of the same scene, which might be obtained with the same sensor, time, and imaging conditions, or by different sensors and from different viewpoints. The process of orientating optical and SAR images is still an open issue, which creates many challenges [9–19]. Many issues arise when combining optical and SAR data, due to the great differences between passive and active remote sensing techniques. One of them is the speckle noise, which influences the detection of robust corresponding/tie points [9,11–13,15,16,18,19]. Another issue is related to the differences in the geometry of the images acquired by these two devices [9,11–13,15,16,18,19]. Thus, many co-registration approaches have been proposed (e.g., [9–19]); in general, these approaches can be divided into two main categories: area-based and feature-based methods. Nowadays, the Structure from Motion (SfM) methods, which are mostly based on the feature-based approach, are used for the co-registration of spaceborne SAR and optical images [9,11–13,15,16,18,19]. An extended description of the incremental SfM process is presented in Section 3. The classical SfM approach contains four main steps: (1) feature detection; (2) feature description; (3) descriptor matching; and (4) bundle adjustment. Due to the problems mentioned above, which influence the quality of detected and matched pairs of points, the SfM/co-registration approach has been modified by many authors. The first improvement is in the feature detection: downsampling SAR images or using despeckling filters in the pre-processing step. On images pre-processed in this way, different types of features are detected, such as blobs, corners, or lines and segments. In the case of feature description and matching, either the descriptor is modified, or this step is eliminated and the matching is based only on the geometrical relationship. It should be stressed that the methods presented above were validated on spaceborne SAR and optical images.

The proposed methodology for automating the registration of SAR data with orthophotomaps, as well as improving the SAR trajectory, is a multi-stage process. This process is based on the original software and consists of: (1) SAR data conversion to the raster form with a georeference file; (2) the aligning of orthophotomaps with a SAR raster based on the extended version of the ASIFT algorithm; (3) the relative orientation based on the classical SfM and modified Iterative Closest Point (ICP) approach; (4) the analysis of the quality of the relative orientation of processed data; and (5) the final bundle adjustment.

In this investigation, the registration of optical and SAR images was tested and validated with high-resolution SAR images and orthophotomaps from altitude. It should be stressed that the entire investigation was performed on the full resolution of the images, and additionally tested on other pyramid levels. The authors decided to use a well-known SfM approach, but with the following modifications: (1) the pre-processing of SAR data with different speckle noise reduction filters in order to increase the possibility of detecting and matching robust corresponding points; (2) reducing the values of the affine angle in the ASIFT algorithm, the values being related to the angle of the antenna; and (3) using a two-stage ICP procedure to eliminate the outliers and compute the correction parameters. Thanks to the use of the ASIFT algorithm, it was possible to detect well-distributed keypoints in the whole area under investigation. The authors decided to eliminate the description and matching step and replace it with the ICP alignment. This allowed keypoints to be treated as a point cloud, and the distance between these two point clouds to be minimized in an iterative process with the simultaneous correction of the translation and rotation of the SAR data.

Figure 1 shows a diagram of the performed research work and experiments. To perform a complete analysis of the applicability of the modified registration algorithms to the detection and matching of corresponding points, the combination of settings that gives the best results should be determined. Verification of the following parameters is required:


**Figure 1.** Diagram of the performed research: Processing and orientation of SAR images and orthophotomaps.

The idea of the automatic SAR data registration and trajectory improvement presented in this paper (according to the diagram presented in Figure 1) consists of the following steps:

1. *Generation of the SAR raster with georeferences based on the trajectory information.* This step is one of the most important parts in the whole automatic registration process. It influences both the computation time and convergence of the ICP process.

	- RAW data
	- Multilook-2D filter
	- Averaging filter
	- Minimum Mean Square Error filter
	- Enhanced Lee filter
	- Gamma MAP filter
	- SAR Block Matching 3D filter
	- 4.1. Detection of keypoints in RAW SAR images on three pyramid levels (full resolution, as well as 1/2, 1/4, and 1/16 of the full resolution of the raster), using the ASIFT algorithm.
	- 4.2. Detection of keypoints in SAR images with the speckle noise reduction parameters: 1, 4, 8, and 16 levels (full resolution, and 1/2, 1/4, and 1/16 of the full resolution of the raster), using the ASIFT algorithm.
	- 4.3. Detection of keypoints on orthophotomap levels (full resolution, as well as 1/2, 1/4, and 1/16 of the full resolution of the raster), using the ASIFT algorithm.
	- 4.4. Matching keypoints by the incremental ICP method:
		- Approximate registration with 20 iterations and linear threshold deviation 20 m.
		- Removal of orthophotomap keypoints that are outside the SAR area.
		- Registration with 10 iterations and linear threshold deviation 5 m.
		- Removal of the SAR point outliers based on the RMSE<sub>xy</sub>.
		- Final bundle adjustment.

5. *Analysis of the quality of data registration on the marked check-points.*

The presented SAR data registration procedure is an original approach; for this reason, custom applications were developed (based on the OpenCV library and MATLAB software).

The presented technique was intensively tested, validated, and compared with methods from the literature. Two approaches in particular were investigated [9,17]. However, the method described in [17] is inefficient and fails because of the speckle noise in SAR images. The approach presented in [9], in turn, is accurate only in urban areas, and such an assumption cannot be applied in the considered case. The authors' main goal was to provide a universal approach able to work in both urban and rural areas.

#### **3. The Principles of the Structure from Motion**

Modern software packages, applications, and function libraries dedicated to raster data orientation and 3D shape reconstruction utilize algorithms based on a combination of methods commonly applied in Computer Vision (CV) and conventional photogrammetric approaches. These types of algorithms and methods allow the geometry and appearance of an object or an entire scene to be captured, and have been used for video game assets [20], virtual tours [21], virtual and augmented reality [22], and cultural heritage [23–27], among others. One of the most important approaches is the Structure from Motion (SfM) method [20,27–29]. The SfM pipeline allows for the reconstruction of three-dimensional structures based on a series of images (rasters) acquired from different positions (observation points) [20].

Figure 2 shows an overview of the incremental SfM workflow, which contains the following steps: (1) feature extraction; (2) feature matching; (3) geometric verification; (4) reconstruction initialization; (5) image registration; (6) triangulation; and (7) bundle adjustment. To generalize, the SfM approach can be divided into two main parts: the correspondence search phase (1–3) and the iterative reconstruction phase (4–7). Based on these two phases, the camera position for each image as well as the 3D reconstructed tie points, called a sparse point cloud [20], can be estimated. In this article, only the correspondence search with the computation of registration parameters is described.

**Figure 2.** Incremental SfM pipeline [20].

#### *3.1. The Feature Extraction*

The feature extraction process is the first step of the SfM pipeline and is based on detectors. For each image (raster data) given as an input, a group of characteristic points (called keypoints) is detected (extracted) based on the local characteristics of the image intensity. Different methods and algorithms can be used for feature extraction, and the choice affects the robustness of the detected features as well as the efficiency of the matching method. Nowadays, two commonly used types of algorithms are corner detectors (such as FAST, Harris, etc.) and blob detectors (e.g., SIFT and its modifications) [30–38]. Brief summaries of studies into data fusion and finding correspondences between SAR and optical images have been recently made by [39] and [14]. Because many feature detectors exist, this section focuses only on the SIFT [40] and ASIFT (a modified version of SIFT) [38] algorithms, which were used in this investigation.

The SIFT (Scale Invariant Feature Transform) algorithm, which was originally proposed by Lowe [40] for the registration of optical images, has already been adapted for the matching of SAR images. In their studies, [41] focused on the denoising of SAR data with curvelet transformation and the evaluation of SIFT performance on SAR images for various terrain types. The speckle noise, which is characteristic for SAR images, has a vast influence on the SIFT algorithm's performance, and beside denoising (which is further described in Section 4) deeper modification into SIFT feature extraction has also been proposed in the SAR-SIFT approach [19]. However, the noise is not the only difference between SAR and optical images; geometrical differences also have an influence on image registration [14]. In this research, the authors are searching for corresponding points between aerial SAR images where image points are recorded according to the object-to-antenna distance and optical orthophotomaps where bare ground is in ortho projection, but all other objects that are elevated from the ground (e.g., buildings or vegetation) are distorted (shifted) by central projection from the optical image. To overcome these geometric distortions, the authors propose to use the ASIFT (Affine Scale Invariant Feature Transform) [38], which is a modification of the SIFT algorithm. The main idea of the ASIFT algorithm is to simulate a set of sample views of the initial images, obtainable by varying the two camera axis orientation parameters, namely the latitude and the longitude angles, which are not detected by the classical SIFT method [38]. Then, the SIFT method is applied to all the virtual images generated. Thus, ASIFT covers all six parameters of the affine transform and guarantees full affine invariant independence. In the ASIFT algorithm, each image is transformed by simulating all possible affine distortions caused by the change of the initial camera positions. 
To perform this, the camera model (Equation (1)) as well as the affine model (Equations (2) and (3)) are utilized:

$$
u = S_1 G_1 A T u_0, \tag{1}
$$

where *u* is a digital image; *u*<sub>0</sub> is an (ideal) infinite-resolution frontal view of the flat object; *T* and *A* are, respectively, a plane translation and a planar projective map due to the camera motion; *G*<sub>1</sub> is a Gaussian convolution modeling the optical blur; and *S*<sub>1</sub> is a standard sampling operator on a regular grid with mesh 1.

$$u(x,y) \to u(ax+by+e, cx+dy+f) \tag{2}$$

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = H_{\lambda} R_{1}(\psi) T_{t} R_{2}(\phi) = \lambda \begin{bmatrix} \cos \psi & -\sin \psi \\ \sin \psi & \cos \psi \end{bmatrix} \begin{bmatrix} t & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \cos \phi & -\sin \phi \\ \sin \phi & \cos \phi \end{bmatrix} \tag{3}$$

where *λ* > 0 is the zoom factor (the determinant of *A* equals *λ*<sup>2</sup>*t*), *R*<sub>*i*</sub> are rotations, *φ* ∈ [0, *π*), and *T*<sub>*t*</sub> is a tilt, namely a diagonal matrix with first eigenvalue *t* > 1 and the second one equal to 1.
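The decomposition in Equation (3) can be checked numerically. The following is a minimal sketch, not the authors' implementation; the function names `rotation`, `compose_affine`, and `zoom_and_tilt` are hypothetical and only NumPy is assumed. It composes *A* from (*λ*, *ψ*, *t*, *φ*) and recovers the zoom and tilt from the singular values of *A*, which are *λt* and *λ* because the two rotations are orthogonal.

```python
import numpy as np

def rotation(angle):
    """2x2 counter-clockwise rotation matrix for an angle in radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def compose_affine(lam, psi, t, phi):
    """Build A = lam * R(psi) * diag(t, 1) * R(phi), following Equation (3)."""
    return lam * rotation(psi) @ np.diag([t, 1.0]) @ rotation(phi)

def zoom_and_tilt(A):
    """Recover (lam, t) from the singular values of A (lam * t and lam)."""
    s = np.linalg.svd(A, compute_uv=False)  # returned in descending order
    return s[1], s[0] / s[1]

# Illustrative values only (not taken from the paper).
A = compose_affine(lam=1.5, psi=np.deg2rad(20.0), t=2.0, phi=np.deg2rad(35.0))
lam, t = zoom_and_tilt(A)
latitude_deg = np.degrees(np.arccos(1.0 / t))  # theta = arccos(1/t)
```

The recovered tilt can then be converted to the latitude angle, as in the last line of the sketch.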

The camera motion parameters can be decomposed into the viewpoint angles (longitude *φ* and latitude *θ* = arccos(1/*t*)), the spin of the camera (*ψ*), and the zoom factor (*λ*). In the ASIFT algorithm, images undergo rotations by the angle *φ* followed by tilts with the parameter *t* = 1/cos *θ*. The tilt (latitude change) is performed by *t*-subsampling and a Gaussian convolution with the standard deviation *c*√(*t*<sup>2</sup> − 1) (*c* = 0.8). The latitudes *θ* are sampled so that the associated tilts follow the geometric series 1, *a*, *a*<sup>2</sup>, ..., *a*<sup>*n*</sup> with *a* > 1; choosing *a* = √2 is a good compromise between the accuracy and the number of steps. The authors of the ASIFT algorithm proposed *n* = 5, which corresponds to a maximum simulated tilt of *a*<sup>5</sup> = 4√2 ≈ 5.7, so that transition tilts (the relative tilt between two views) of up to 32 can be explored [38]. For the longitudes *φ*, the arithmetic series 0, *b*/*t*, ..., *kb*/*t* is used, with *b* ≅ 2*π*/5 (72°) and *k* the largest integer such that *kb*/*t* < *π*. After the process of virtual image generation (which includes the skew, tilt, and rotation), any detector, such as SIFT or SURF, might be used. In this study, the SIFT detector was used.
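The sampling of tilts and longitudes described above can be enumerated in a few lines. This is a sketch under the stated parameters (*a* = √2, *n* = 5, *b* = 72°); the function name `asift_view_samples` is hypothetical and the code only lists the (tilt, longitude) pairs, it does not warp any images.

```python
import math

def asift_view_samples(n=5, a=math.sqrt(2.0), b_deg=72.0):
    """Return (tilt, longitude in degrees) pairs for the simulated views."""
    samples = [(1.0, 0.0)]  # t = 1 is the frontal view; no rotation is needed
    for j in range(1, n + 1):
        t = a ** j                      # geometric series of tilts
        step = b_deg / t                # longitude step shrinks as the tilt grows
        k = 0
        # Small tolerance so that rounding errors do not admit k*b/t == 180 deg.
        while k * step < 180.0 - 1e-9:
            samples.append((t, k * step))
            k += 1
    return samples

views = asift_view_samples()
max_tilt = max(t for t, _ in views)
max_latitude_deg = math.degrees(math.acos(1.0 / max_tilt))  # theta = arccos(1/t)
```

Each pair would correspond to one simulated view: a rotation by the longitude followed by a tilt *t*, after which a detector such as SIFT is applied.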

#### *3.2. The Feature Description*

After the characteristic points have been detected, the next step is to describe each of them by analyzing its neighborhood. Several descriptors are presented in the literature, such as SIFT, SURF, Daisy, etc. [42], but the authors decided to describe only the SIFT descriptor, which was used in this study. The main idea of the SIFT descriptor is to compute the local image gradients at the selected scale in the region around the tested keypoint. The full description of the descriptor can be found in Lowe's publication [34], and a further SAR-specific modification of this descriptor can be found in [19]. In the original ASIFT algorithm proposal, the SIFT descriptor is used. This computation yields, for each detected keypoint and its surroundings, a representation that is highly distinctive yet as invariant as possible to the remaining variations, such as changes in illumination or 3D viewpoint. The descriptor calculation is similar to the detection stage, as the image gradient magnitudes and orientations are sampled around the keypoint location for each octave and each Gaussian blur level.

#### *3.3. The Feature Matching and Images Registration*

The detection and description of features for each characteristic point are important components in the process of the detection of conjugate points in digital images. To determine whether a keypoint (obtained through the point detection and description process) can be treated as a tie point, the feature matching process is used. It allows two points from different images that are characterized by the same description to be considered homologous points. Different strategies can be used to compute matches between images efficiently, but the two most commonly used are approximate nearest-neighbor point matching [43] and brute-force matching [43]. In the nearest-neighbor approach, the points are stored in a k-dimensional tree (k-d tree) structure, which allows the nearest neighbors to be found approximately on the basis of the minimal distances between the descriptor values [43]. The brute-force matcher is much simpler: it takes the descriptor of one feature in the first set and compares it with all features in the second set using a distance measure, and the closest feature is returned. Feature matching based only on descriptors is justified for sensors with a similar wavelength and for images obtained from the same optical system. When matching heterogeneous images (e.g., optical and SAR), where a large number of outliers is expected during feature (descriptor) matching, adding geometrical constraints can be beneficial. The authors of [18] used spatially consistent matching, which assumes that the geometrical relationship between matching features should not change too much across images. However, this solution is still largely based on SIFT-like descriptors, which are constructed using gradients of image values that can differ significantly between SAR and optical images because of radiometric differences [39]. To overcome this problem, the authors of this paper propose another way of keypoint matching, in which the feature matching is based solely on the geometrical relations between keypoints and utilizes the Iterative Closest Point (ICP) algorithm [44]. The ICP algorithm is well known; it is implemented in many commercial and open-source software packages, as well as in programming libraries (such as VTK, Open3D, and PCL) [45–47], and is used to align point clouds by minimizing the distance between them [44]. There are many variants of the ICP algorithm [44,48–52]; however, in this section, only the basic one, proposed by Besl and McKay, is described in more detail.

When considering two datasets, the relation between them can be expressed as:

$$\mathbf{y}_i = R\mathbf{x}_i + \mathbf{y}_0 \tag{4}$$

where *R* is a rotation matrix, *x*<sub>*i*</sub> are the point coordinates in the input point cloud reference system, *y*<sub>*i*</sub> are the coordinates of the corresponding points in the reference system, and *y*<sub>0</sub> is the translation vector. In the proposed approach, only 2D coordinates are used because orthophotos do not contain height information, and SAR images are projected onto a plane (with mean height) during pre-processing. Real-world coordinates can be easily obtained for orthophotomaps because they are georeferenced; for SAR images, the coordinates are estimated by direct georeferencing using the trajectory of the plane from GNSS and/or an inertial navigation system. The main objective of the ICP method is to align two point clouds, based on shapes or models, by using the Euclidean distance between each point of the input set and its nearest point in the reference set. For this purpose, the transformation parameters are calculated by the least squares method, minimizing the sum of squared distances (Equation (5)) for points in the areas of common coverage.

$$\sigma^2 = \sum_{i} \left\| R\mathbf{x}_i + \mathbf{y}_0 - \mathbf{y}_i \right\|^2 \rightarrow \min. \tag{5}$$

In each iteration of the ICP algorithm, the transformation can be determined using one of four main methods: singular value decomposition (SVD) [53], Horn's quaternions [54], Horn's orthogonal matrix [55], and Walker's dual quaternions [56]. These algorithms are characterized by similar effectiveness and stability of operation in the case of noisy point clouds [57]. Based on the calculated translation and rotation parameters, the initial point cloud is transformed, and the whole process repeats until the minimal distance threshold is reached. The presented ICP method is commonly used for 3D point cloud registration. However, in the case of keypoint matching, only a 2D computation is performed. When using the ICP method, it is important to have a good first approximation of the relative orientation parameters, because without it the final registration might fail. In the proposed method of SAR image and orthophotomap co-registration, this condition is met thanks to the approximate georeference of both data sources. This way of keypoint matching reduces the influence of the descriptor, because it is based only on the geometrical relationship between the keypoints detected in the SAR and optical images, and it can be easily used as long as the approximate georeference of both images is known. One of the most important parts of the image orientation is the geometric verification of the matched keypoints. Correct keypoint matching determines the final correctness of the alignment and the quality of the data registration. Depending on the methodology of keypoint matching (descriptor matching or keypoint ICP matching), different methods are used, but overall it should be stressed that the feature matching phase only verifies pairs of points on matched images. With descriptor matching, it is not guaranteed that the matches found actually correspond to 3D points in the scene, and outliers could be included. 
It is important to find the geometric transformation that correctly maps the corresponding points. With the descriptor method, it is necessary to choose the correct mathematical model for the transformation, e.g., a homography or a similarity transformation. A list of commonly used methods is presented in [20]. To remove outliers from the data, it is necessary to implement robust estimation techniques such as RANSAC (Random Sample Consensus [56]) or MLESAC (Maximum Likelihood Estimation SAmple Consensus), which is a generalization of the RANSAC algorithm [58,59]. In the case of the ICP matching method, the transformation model is predefined. However, to eliminate outliers, a transformation such as a similarity transformation can be applied. In this investigation, the registration parameters from the keypoint ICP method are treated as final and applied to the correction of the SAR georeference.
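The keypoint ICP matching described in this section can be illustrated with a small 2D example. The sketch below is not the authors' software: it assumes NumPy only, uses a brute-force nearest-neighbor search, solves Equation (5) in each iteration with the SVD method, and ignores pairs that are farther apart than a distance threshold. The function names (`fit_rigid_2d`, `icp_2d`) and the synthetic keypoints are hypothetical; the two calls mimic a coarse stage (20 iterations, 20 m) followed by a fine stage (10 iterations, 5 m).

```python
import numpy as np

def fit_rigid_2d(src, dst):
    """Least-squares rotation R and translation y0 with dst ~ src @ R.T + y0 (SVD)."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_mean).T @ (dst - dst_mean)            # 2x2 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # exclude reflections
    R = Vt.T @ D @ U.T
    return R, dst_mean - R @ src_mean

def icp_2d(src, ref, iterations=20, max_dist=20.0):
    """Align src (N, 2) to ref (M, 2); pairs farther apart than max_dist are ignored."""
    R_total, t_total = np.eye(2), np.zeros(2)
    moved = src.copy()
    for _ in range(iterations):
        d = np.linalg.norm(moved[:, None, :] - ref[None, :, :], axis=2)
        nearest = d.argmin(axis=1)                       # closest reference point
        keep = d[np.arange(len(moved)), nearest] < max_dist
        if keep.sum() < 2:
            break
        R, t = fit_rigid_2d(moved[keep], ref[nearest[keep]])
        moved = moved @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t  # accumulate the transform
    return R_total, t_total, moved

# Synthetic keypoints in metres: a jittered grid and a slightly misaligned copy.
rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.arange(0.0, 500.0, 50.0), np.arange(0.0, 500.0, 50.0))
ref = np.column_stack([gx.ravel(), gy.ravel()]) + rng.uniform(-10.0, 10.0, (100, 2))
angle = np.deg2rad(0.3)
R_true = np.array([[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]])
t_true = np.array([4.0, -3.0])
src = (ref - t_true) @ R_true                            # so that ref = src @ R_true.T + t_true

R1, t1, coarse = icp_2d(src, ref, iterations=20, max_dist=20.0)   # coarse stage
R2, t2, fine = icp_2d(coarse, ref, iterations=10, max_dist=5.0)   # fine stage
rmse_xy = np.sqrt(np.mean(np.sum((fine - ref) ** 2, axis=1)))
```

In this toy case the initial misalignment is small compared with the point spacing, which corresponds to the requirement of a good first approximation; with a poor initial georeference the nearest-neighbor pairs would be wrong and the iteration could converge to an incorrect solution.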

#### **4. SAR Image Preprocessing**

Because all keypoint detection algorithms, such as ASIFT/SIFT/SURF, work on intensity images, the presence of speckle noise in SAR images is an obstacle. Speckle noise affects the behavior of keypoint detectors, which is why it has to be reduced. *Speckle noise* is a common phenomenon that accompanies all coherent imaging systems, such as SAR sensors [60]. The source of this "noise" is attributed to random interference between the coherent returns from the numerous scatterers present on the surface of a scene, at the scale of the wavelength of the incident radar wave. The resulting *speckle* has a multiplicative nature, thus SAR imagery is characterized by strong intrinsic noise (hereafter referred to as *speckle* only). Typically, for a single-look SAR image, **ISNR** (Intrinsic Signal to Noise Ratio) = 0 dB, i.e., the signal and noise power levels are equal. Speckle appears in the image as strong fluctuations in brightness, which makes image interpretation more difficult.

As described in the literature, utilizing SAR images for different purposes often requires additional operations that increase the SNR. If these operations are not performed, the results can be strongly disturbed [7,61]. Despeckling methods should be taken into account in a processing pipeline, even though they might be computationally complex (for example, as shown in Figure 3), because reducing the influence of noise yields better-quality results, which might be critical in many applications. In the presented approach, speckle noise has an important, even decisive, influence on the final outcome. Despite the resolution reduction caused by filtering, the local dynamics of the image are significantly improved, which allows further processing to be carried out. Further processing in this case means keypoint localization, i.e., finding characteristic points in the image. If speckle noise is present, disturbed pixels may be taken for keypoints even though they are merely distorted.

For a fully developed *speckle* (see Figure 4a), the brightness fluctuations in SAR images can be modeled using a gamma distribution (Equation (6)). The gamma distribution is one of the basic types of distributions used in SAR imaging (although not the only one [60]):

$$\begin{aligned} p_z(z, n|\sigma, L) &= \frac{n}{\Gamma(L)} \left(\frac{L}{\sigma}\right)^L z^{nL-1} \exp\left(-\frac{Lz^n}{\sigma}\right), \\ n &= 1 \quad \text{for intensity data, } I = A^2, \\ n &= 2 \quad \text{for amplitude data, } A, \end{aligned} \tag{6}$$

with the following statistical moments:

$$E\left[z^m\right] = \frac{\Gamma(L + m/n)}{\Gamma(L)} \left(\frac{\sigma}{L}\right)^{m/n}, \tag{7}$$

where *z* is the given pixel value; *n* selects the data format (*n* = 1 for intensity, *A*<sup>2</sup>; *n* = 2 for amplitude, *A*); *σ* is the expected (true) pixel value; *L* is the number of (multi)looks/averages; and *m* is the statistical moment order (*m* = 1 gives the mean value; *m* = 2 gives the mean square value; etc.).

From Equations (6) and (7), providing *n* = 1, the basic maximum likelihood estimators (MLE) can be determined:

$$E[z] = \langle z \rangle,\tag{8}$$

$$ENL\left(\equiv L\right) = \frac{E^2[z]}{E[z^2] - E^2[z]} = \frac{\langle z \rangle^2}{\left\langle \left(z - \langle z \rangle\right)^2 \right\rangle},\tag{9}$$

where *E*[•] is the expected value (operator); ⟨•⟩ is the arithmetic averaging; and *ENL* (or *ENIL*) is the equivalent number of (independent) looks.
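As a quick numerical illustration of these estimators, the sketch below simulates *L*-look intensity speckle over a homogeneous area with a gamma distribution (*n* = 1 in Equation (6)) and estimates the mean and the equivalent number of looks as the squared mean divided by the variance. The function name `estimate_enl` and the simulated values are assumptions made for the example; only NumPy is used.

```python
import numpy as np

def estimate_enl(intensity):
    """Equivalent number of looks of a homogeneous intensity patch: mean^2 / variance."""
    mean = intensity.mean()
    return mean ** 2 / intensity.var()

rng = np.random.default_rng(1)
L, sigma = 4, 10.0                      # number of looks and true pixel value
# Gamma-distributed intensity with shape L and mean sigma (scale = sigma / L).
patch = rng.gamma(shape=L, scale=sigma / L, size=(256, 256))

mean_estimate = patch.mean()            # estimate of sigma, Equation (8)
enl_estimate = estimate_enl(patch)      # should be close to L, Equation (9)
```

On real data, the patch would have to be selected over a visually homogeneous region, since texture inflates the variance and lowers the estimated number of looks.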

One of the basic ways to deal with a high *speckle* noise level is to pre-process SAR imagery with despeckling filters. Commonly, a classical multilooking technique is applied. Nevertheless, there are plenty of filtration/despeckling algorithms. The authors decided to describe the types of filters that were used in the experimental processing, since the results differ for each filter type.

#### *4.1. ML2D: Multilook–2D Filter*

In contrast to the classical multilooking procedure [62], Multilook–2D (which might also be called a non-coherent version of the multilook procedure) works in both image dimensions, i.e. range and cross-range (X and Y) [61]. Thanks to this, a more effective *speckle* reduction can be made with less degradation of the image spatial resolution at the same time.

The idea for the 2D multilooking procedure was taken from optical image processing. The algorithm operates on the entire, already prefocused SAR image, based on Fourier domain processing.

$$\begin{aligned} Z\left\{\omega_x, \omega_y\right\} &= \mathbb{F}_{2D}\left[z(x, y)\right], \\ Z\left\{\omega_x, \omega_y\right\} &\approx \sum_{i=1}^{L} Z_i\left\{\omega'_x, \omega'_y\right\}, \\ \tilde{z}(x, y) &= \sum_{i=1}^{L} \left| \mathbb{F}_{2D}^{-1}\left[Z_i\left\{\omega'_x, \omega'_y\right\} \cdot W\left\{\omega'_x, \omega'_y\right\}\right] \right|, \end{aligned} \tag{10}$$

where *z*(*x*, *y*) is the original noisy SAR image; *z̃*(*x*, *y*) is the *despeckled* (reconstructed) SAR image; *Z*{*ω*<sub>*x*</sub>, *ω*<sub>*y*</sub>} is the two-dimensional Fourier spatial frequency spectrum over the frequencies {*ω*<sub>*x*</sub>, *ω*<sub>*y*</sub>}; *Z*<sub>*i*</sub>{*ω*′<sub>*x*</sub>, *ω*′<sub>*y*</sub>} is the partial two-dimensional Fourier spatial frequency spectrum over the sub-band frequencies {*ω*′<sub>*x*</sub>, *ω*′<sub>*y*</sub>}; *W*{*ω*′<sub>*x*</sub>, *ω*′<sub>*y*</sub>} is the window (spectrum weighting) function; and *L* is the number of (multi)looks.

The algorithm transforms the entire radar image into the two-dimensional Fourier domain. Then, it divides the spectrum into *L* sub-bands (with possible overlap, typically ≤ 50%) and filters each sub-band by appropriate weighting. Finally, it reconstructs the image by returning to the spatial domain for each sub-band, which generates sub-images (looks), and incoherently summing all of them. The resulting (reconstructed) SAR image is characterized by a speckle level reduced *L* times. Undesirable side effects are a √*L*-fold degradation of spatial resolution and visible side lobes resulting from spectrum windowing. The result of filtering a real-life SAR image is shown in Figure 4b.
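The three steps of Equation (10) can be sketched as follows. This is an illustrative simplification rather than the filter used in the experiments: it assumes a complex-valued focused image as input, splits the spectrum into non-overlapping rectangular sub-bands (no overlap), and uses a separable Hamming window as the weighting function. The function name `multilook_2d` and its parameters are hypothetical; only NumPy is used.

```python
import numpy as np

def multilook_2d(slc, looks_y=2, looks_x=2):
    """Non-coherent 2D multilook of a complex image using looks_y * looks_x sub-bands."""
    Z = np.fft.fftshift(np.fft.fft2(slc))                 # centred 2D spectrum
    rows = np.linspace(0, Z.shape[0], looks_y + 1).astype(int)
    cols = np.linspace(0, Z.shape[1], looks_x + 1).astype(int)
    out = np.zeros(Z.shape)
    for r0, r1 in zip(rows[:-1], rows[1:]):
        for c0, c1 in zip(cols[:-1], cols[1:]):
            # Weighting function W for this sub-band (assumed Hamming window).
            W = np.outer(np.hamming(r1 - r0), np.hamming(c1 - c0))
            Zi = np.zeros_like(Z)                         # zero outside the sub-band
            Zi[r0:r1, c0:c1] = Z[r0:r1, c0:c1] * W
            # Back to the spatial domain and incoherent (magnitude) summation.
            out += np.abs(np.fft.ifft2(np.fft.ifftshift(Zi)))
    return out

# Fully developed speckle over a homogeneous area: complex circular Gaussian noise.
rng = np.random.default_rng(0)
slc = rng.normal(size=(128, 128)) + 1j * rng.normal(size=(128, 128))
single_look = np.abs(slc)
multi_look = multilook_2d(slc, looks_y=2, looks_x=2)

# Coefficient of variation (std / mean) as a simple speckle level indicator.
cv_single = single_look.std() / single_look.mean()
cv_multi = multi_look.std() / multi_look.mean()
```

Keeping each sub-band zero-padded to the full spectrum size preserves the image dimensions; the loss of resolution then shows up as smoothing rather than as a smaller raster.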

#### *4.2. MEAN: Averaging Filter*

One of the simplest noise filtration techniques is averaging image samples over the area around a pixel, combined with a sliding window technique (see Equation (11)). For a gamma distribution [60], the averaging operation also corresponds to the maximum likelihood estimator (MLE) [63].

$$\tilde{z}(k) = \overline{z}(k) = \frac{1}{N} \sum_{i=1}^{N} z(k+i),\tag{11}$$

where *z*(*k*) is the original noisy SAR image; *k* is the linear image pel index (*k* = *x* + *Width* ∗ *y*); *Width* is the image width in pixels/pels; *Height* is the image height in pixels/pels; *x*, *y* are the image pixel indexes (horizontal and vertical); *z̃*(*k*) is the *despeckled* (reconstructed) SAR image; *N* is the filter window size in pixels/pels, e.g., 3 × 3, 5 × 5, etc.; and *i* is the filter window linear index, *i* ∈ [1, *N*].

As a result of the averaging filtration, the speckle standard deviation is reduced by a factor of √*N*, where *N* is the number of pixels in the filter window. Note that in the extreme case of a window size *N* = 1 (only the central pixel without its surroundings is taken into account), no filtration effect will be noticed. In turn, for a large window size *N*, the speckle will be reduced at the cost of image detail degradation. An averaging filter that does not take local image statistics into account results in severe degradation of details (such as blurring of lines, edges, and point targets). To limit this issue, only suitably sized windows should be chosen (small windows are usually used, e.g., 3 × 3 or 5 × 5 points in two dimensions). The result of filtering a real-life SAR image with an averaging filter is shown in Figure 4c.
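A sliding-window average of this kind takes only a few lines. The sketch below assumes NumPy 1.20 or newer (for `sliding_window_view`), an odd window size, and reflected image borders; the function name `mean_filter` is hypothetical. The simulated single-look speckle is used to check that a 3 × 3 window (*N* = 9) reduces the standard deviation roughly threefold.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def mean_filter(image, size=3):
    """Sliding-window average over a size x size neighbourhood (borders reflected)."""
    pad = size // 2
    padded = np.pad(image, pad, mode="reflect")
    # One size x size window per pixel, averaged over the two window axes.
    return sliding_window_view(padded, (size, size)).mean(axis=(2, 3))

rng = np.random.default_rng(0)
speckle = rng.gamma(shape=1.0, scale=1.0, size=(128, 128))  # single-look intensity
smoothed = mean_filter(speckle, size=3)

# Expected to be close to 1 / sqrt(9) for independent pixels.
std_ratio = smoothed.std() / speckle.std()
```

The ratio is only approximate near the borders, where reflected pixels are counted twice within a window.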

#### *4.3. MMSE: Minimum Mean Square Error Filter*

The above-mentioned despeckling filters fail when the assumption of a constant pixel value within the filter window breaks down. The filter should then adapt to take account of fluctuations in excess of *speckle* within the analysis window. One approach to such an adaptive filter is a model-free minimum mean-square error (MMSE) filter based on measured local statistics. The mean square error of the estimate *z̃*(*k*) is minimized by a first-order expansion about the local mean value *z̄*(*k*) (Equation (11)), so that:

$$\begin{aligned} \tilde{z}(k) &= \overline{z}(k) + \alpha \cdot (z(k) - \overline{z}(k)),\\ \alpha &= \frac{\overline{V}_{\sigma}}{\overline{V}_{z}} = \frac{\overline{V}_{z} - 1/L}{\overline{V}_{z} \left(1 + 1/L\right)}, \end{aligned} \tag{12}$$

where *z*(*k*) is the original noisy SAR image; *k* is the linear image pel index (*k* = *x* + *Width* ∗ *y*); *z̄*(*k*) is the expected/mean pixel value (see Equation (11)); *z̃*(*k*) is the *despeckled* (reconstructed) SAR image; *α* is the linear interpolation weight; and *V*<sub>*z*</sub> is the normalized variance.

As can be seen, it is a weighted sum (or linear interpolation) between the mean and the current pixel value. When there is no fluctuation coming from the image texture, the weighting factor *α* → 0 and the pixel is assigned the average value of its surroundings. On the other hand, when the image texture fluctuation becomes significant, the weighting factor *α* → *L*/(*L* + 1), and the deviation of the pixel value from the local mean is scaled by *α*; this can happen in places where there are lines, edges, or other texture/spatial features. The result of filtering a real-life SAR image with an MMSE filter is shown in Figure 4d. A similar approach was presented by Lee [64,65].

#### *4.4. ELEE: Enhanced Lee Filter*

An approach presented by Lee [64] considers an optimal linear filter that is equivalent to a first-order Taylor expansion of the multiplicative noise model *z*(*k*) = *z̄*(*k*) · *η* about the expected value *z̄*(*k*) and the *speckle* component *η*. The multiplicative noise can be rewritten as an additive one, *z*(*k*) = *z̄*(*k*) + (*η* − 1) · *z̄*(*k*), which results in a form similar to MMSE (Equation (12)), but with a slightly different weighting factor *α*:

$$\begin{aligned} \tilde{z}(k) &= \overline{z}(k) + \alpha \cdot \left( z(k) - \overline{z}(k) \right), \\ \alpha &= \frac{\overline{V}_z - 1/L}{\overline{V}_z}, \end{aligned} \tag{13}$$

where *z*(*k*) is the original noisy SAR image; *k* is the linear image pel index (*k* = *x* + *Width* ∗ *y*); *z̄*(*k*) is the expected/mean pixel value (see Equation (11)); *z̃*(*k*) is the *despeckled* (reconstructed) SAR image; *α* is the linear interpolation weight; and *V*<sub>*z*</sub> is the normalized variance.

When there is no image texture variation, it would be expected that the estimate *V*<sub>*z*</sub> is close to that of pure *speckle*, i.e., 1/*L*, so that *α* → 0 and *z̃*(*k*) = *z̄*(*k*) (the same as for the MMSE algorithm). However, texture variability causes *V*<sub>*z*</sub> to differ from the *speckle* value. If the pixel value *z*(*k*) is sufficiently large compared with its surroundings, it yields a large value of *V*<sub>*z*</sub>, so that *α* → 1 and *z̃*(*k*) = *z*(*k*), and the pixel value remains unchanged (no filtration effect). Thus, the response of Lee's filter to strong targets differs from MMSE in that it ignores the *speckle* contribution to the target when filtering, which corresponds to treating the bright pixel as a point target that would not give rise to *speckle* fluctuations. The result of filtering a real-life SAR image with an enhanced Lee filter is shown in Figure 4e.
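The MMSE and Lee weights differ only in the denominator, so both can be sketched with one function. The code below is a simplified illustration, not the enhanced Lee implementation used in the experiments: local statistics are taken from a square sliding window (NumPy 1.20 or newer is assumed), the excess variance is clipped at zero so that *α* stays non-negative (an assumption of this sketch), and the names `local_stats` and `adaptive_filter` are hypothetical. A single bright pixel is inserted into simulated 4-look speckle to show the different responses to a strong target.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_stats(image, size):
    """Local mean and normalized variance V_z = var / mean^2 in a size x size window."""
    pad = size // 2
    win = sliding_window_view(np.pad(image, pad, mode="reflect"), (size, size))
    mean = win.mean(axis=(2, 3))
    return mean, win.var(axis=(2, 3)) / np.maximum(mean ** 2, 1e-12)

def adaptive_filter(image, looks, size=7, kind="lee"):
    """Compute z_mean + alpha * (z - z_mean) with the MMSE or the Lee weight."""
    mean, v_z = local_stats(image, size)
    v_z = np.maximum(v_z, 1e-12)                       # avoid division by zero
    excess = np.maximum(v_z - 1.0 / looks, 0.0)        # texture part, clipped at 0
    if kind == "mmse":
        alpha = excess / (v_z * (1.0 + 1.0 / looks))   # weight of the MMSE filter
    else:
        alpha = excess / v_z                           # weight of the Lee filter
    return mean + alpha * (image - mean)

rng = np.random.default_rng(0)
looks = 4
scene = rng.gamma(shape=looks, scale=1.0 / looks, size=(64, 64))  # mean value 1
scene[32, 32] = 1000.0                                 # isolated strong point target

lee = adaptive_filter(scene, looks, kind="lee")
mmse = adaptive_filter(scene, looks, kind="mmse")
target_lee, target_mmse = lee[32, 32], mmse[32, 32]
```

In this example the Lee weight leaves the bright pixel almost unchanged, whereas the MMSE weight scales it by roughly *L*/(*L* + 1); away from the target both outputs are close to the local mean.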

#### *4.5. GMAP: Gamma MAP Filter*

The Gamma filter is a Maximum A Posteriori (MAP) filter based on a Bayesian analysis of the image statistics. It assumes that both the SAR image texture and the *speckle* follow a Gamma distribution (Equation (6)). The combination of these two distributions yields a K-distribution [66], which is recognized to match a large variety of radar clutter types, such as land and ocean cover. The formula of the GMAP filtration is given by:

$$\begin{aligned} \tilde{z}(k) &= \frac{\overline{z}(k) \cdot (\nu - L - 1) + \sqrt{\overline{z}^2(k) \cdot (\nu - L - 1)^2 + 4 \cdot \nu \cdot L \cdot z(k) \cdot \overline{z}(k)}}{2 \cdot \nu}, \\ \nu &= \frac{1}{\overline{V}_\sigma} = \frac{1 + 1/L}{\overline{V}_z - 1/L}, \end{aligned} \tag{14}$$

where *z*(*k*) is the original noisy SAR image; *k* is the linear image pel index (*k* = *x* + *Width* ∗ *y*); *z̄*(*k*) is the expected/mean pixel value (see Equation (11)); *z̃*(*k*) is the *despeckled* (reconstructed) SAR image; *ν* is the texture model order parameter; and *V*<sub>*z*</sub> is the normalized variance.

In the case of pure *speckle*, it would be expected that *V*<sub>*z*</sub> = 1/*L*, so that *ν* → ∞ and *z̃*(*k*) = *z̄*(*k*) (as with the MMSE and Lee filters). However, as mentioned above, texture variability causes the estimate *V*<sub>*z*</sub> to differ from that of pure *speckle*. In this case, the excess pixel value *z*(*k*) is interpreted as an image texture contribution and the parameter *ν* takes small values, so that the Gamma MAP estimator scales the current pixel value, *z̃*(*k*) = *z*(*k*)/(1 + 1/*L*). The result of filtering a real-life SAR image with the Gamma MAP filter is shown in Figure 4f.
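The Gamma MAP estimate can be sketched in the same style as the previous filters. The code below is an illustration under simplifying assumptions, not the implementation used in the experiments: the local mean and the normalized variance come from a square sliding window (NumPy 1.20 or newer), windows whose normalized variance does not exceed 1/*L* are treated as pure speckle and replaced by the local mean, and no upper threshold for point targets is applied. The function name `gamma_map_filter` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def gamma_map_filter(image, looks, size=7):
    """Gamma MAP estimate from local statistics in a size x size window."""
    pad = size // 2
    win = sliding_window_view(np.pad(image, pad, mode="reflect"), (size, size))
    mean = win.mean(axis=(2, 3))
    v_z = win.var(axis=(2, 3)) / np.maximum(mean ** 2, 1e-12)  # normalized variance
    excess = v_z - 1.0 / looks
    homogeneous = excess <= 1e-12          # pure speckle: nu -> infinity
    # Texture order parameter nu; a dummy denominator is used where homogeneous.
    nu = (1.0 + 1.0 / looks) / np.where(homogeneous, 1.0, excess)
    b = mean * (nu - looks - 1.0)
    root = np.sqrt(b ** 2 + 4.0 * nu * looks * image * mean)
    return np.where(homogeneous, mean, (b + root) / (2.0 * nu))

rng = np.random.default_rng(0)
speckle = rng.gamma(shape=4, scale=0.25, size=(64, 64))   # 4-look speckle, mean 1
filtered = gamma_map_filter(speckle, looks=4)

flat = np.full((16, 16), 5.0)                             # noise-free constant area
flat_filtered = gamma_map_filter(flat, looks=4)
```

On the homogeneous test patch the output stays close to the local mean, and on the constant image it returns the input unchanged, which matches the limiting behavior described above.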

#### *4.6. SAR-BM3D: SAR Block Matching 3D Filter*

Among the different types of despeckling filters [64,67–69], one in particular, SAR-BM3D [70], demonstrates great improvements in speckle filtration performance. The filter has a very complicated structure; its processing flow is outlined in Figure 3. The filter is divided into two stages: the first makes a coarse estimate of the image content/texture by Non-Local Means (NLM), and the second makes a fine estimate of the image texture by Wiener filtration. The algorithm operates in the wavelet transform domain, and the general formula of SAR-BM3D filtration is given by:

$$\begin{aligned} \tilde{Z}(k) &= \bar{Z}(k) + \frac{V_{\Sigma}^2(k)}{V_{\Sigma}^2(k) + V_{\mathcal{U}}^2(k)} \cdot \left[ Z(k) - \bar{Z}(k) \right], \\ Z(k) &= \Sigma(k) \cdot H = \Sigma(k) + (H - 1) \cdot \Sigma(k) = \Sigma(k) + \mathcal{U}(k), \\ Z(k) &= \mathbb{W} \mathbb{T}_{3D} \left[ z(k) \right], \quad \tilde{z}(k) = \mathbb{W} \mathbb{T}_{3D}^{-1} \left[ \tilde{Z}(k) \right], \end{aligned} \tag{15}$$

where *Z*(*k*) is the original noisy SAR image in the wavelet transform (WT) domain, with a capital letter denoting a transformed value (e.g., *X* = *WT*[*x*]); *Z*¯(*k*) is the expected (true/original) image texture in the WT domain; *Z*˜(*k*) is the *despeckled* (reconstructed) SAR image in the WT domain; *u*(*k*) is the zero-mean, additive, signal-dependent *speckle* noise (*u*(*k*) = (*η* − 1) · *z*¯(*k*)); *k* is the linear image pel index (*k* = *x* + *Width* ∗ *y*); and *η* is the fully developed *speckle* noise.

**Figure 3.** SAR-BM3D despeckling filter block diagram.

The first stage comprises three steps:


The second stage comprises the same three steps, but with the following differences:


• aggregation—as in Stage 1, giving a final estimate of the image.

The presented filtration algorithm comprises several state-of-the-art techniques, thus giving better performance in terms of signal-to-noise ratio and perceived image quality than any of the other aforementioned filters. The result of filtering a real-life SAR radar image with the SAR-BM3D filter is shown in Figure 4g.

#### *4.7. Filtration Performance*

In this subsection, SAR image despeckling filtration results are presented and compared. The image used is shown in Figure 4a; the imaged area is located near Plock city/refinery, Poland. The image was made by WUT's (Warsaw University of Technology) proprietary radar imaging system. The images shown in Figure 4 present results for different filters applied. In Table 1, a summary of the despeckling filtration performance is presented.

It should be emphasized that the filters were implemented as MATLAB® *m*-scripts, and the entire processing chain was run in this computing environment. Thus, the absolute processing times are not meaningful in themselves; they do, however, reflect the relative computational complexity of each filter.

**Table 1.** SAR†‡ images filtration performance comparison.


† *Original SAR image size* 2540 × 2250 [*px*]*;* ‡ *Original SAR image ENL* = 1*.*

From the results presented in Table 1 and Figure 4, it can be seen that SAR-BM3D is the most effective filter. Its equivalent number of looks (*ENL*) is many times higher than for the other filters, and the visual quality is outstanding (see Figure 4g). At the same time, it is the slowest filter; its execution time is several times longer than that of the other filters, which is a consequence of its complex signal processing pipeline. There is, however, potential to significantly reduce its execution time (e.g., by using CUDA).

**Figure 4.** SAR imagery filtration results: (**a**) original/noisy SAR image; (**b**) ML2D despeckling filter; (**d**) MMSE despeckling filter; (**f**) GMAP despeckling filter; (**g**) SAR-BM3D despeckling filter. *L* is the number of (multi)looks and *N* is the filter window size (equiv. to *L*). \* Even window size forced by UWT transform implementation.

On the other hand, the ML2D despeckling filter provides good performance for both *speckle* reduction and execution time (see Table 1 and Figure 4b). In comparison with the other filters, it is the fastest while giving very good image quality (*ENL* ≈ *L*, where *L* is the desired number of looks). Nevertheless, this does not disqualify the other filters.

The *ENL* indicator is not fully unambiguous in the context of radar to optical image comparison. It is important to preserve details of the unique structural features of the image resulting from the texture of the observed scene. Thus, an assessment based solely on *ENL* is not conclusive. Therefore, in the following sections, further analysis is carried out for all the filters described above.
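For reference, the *ENL* of an intensity image is commonly estimated over a homogeneous region as the squared mean divided by the variance. The sketch below assumes this standard estimator (the exact estimator used for Table 1 is not stated in the text) and checks it on synthetic data in which *L* independent single-look intensities are averaged; the helper name and the synthetic scene are illustrative only.

```python
import numpy as np


def equivalent_number_of_looks(region):
    """ENL = mean^2 / variance of an intensity patch assumed to be homogeneous."""
    region = np.asarray(region, dtype=float)
    return region.mean() ** 2 / region.var()


# Synthetic check: fully developed single-look speckle has exponentially
# distributed intensity; averaging L independent looks gives an L-look image.
rng = np.random.default_rng(0)
looks = 4
single_look = rng.exponential(scale=1.0, size=(looks, 256, 256))
multilooked = single_look.mean(axis=0)

enl_single = equivalent_number_of_looks(single_look[0])
enl_multi = equivalent_number_of_looks(multilooked)
print(f"single-look ENL ~ {enl_single:.2f}, {looks}-look ENL ~ {enl_multi:.2f}")
```

Because the estimator only measures smoothness of a flat region, a filter can score a high *ENL* while blurring structure, which is why it is not used as the sole criterion here.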

#### **5. Results**

Following the SAR image despeckling methods described above, the ASIFT algorithm was examined to verify its usability in the considered problem. The scene presented in Figure 5 is considered. The SAR image was gathered during one of the measurement campaigns in which the authors took part. During the raw data acquisition, the GPS data were stored together with the IQ signal samples. The optical image is the reference orthophotomap obtained during the geodetic measurements, and is defined by precise latitude and longitude information. As can be seen, a significant shift is visible in the SAR image, which has to be eliminated using the proposed approach. In the image, corresponding regions are marked in colors, revealing direct georeference errors between the images. Next, the six described filters were employed to obtain speckle noise reduction, and the results were compared to the original SAR image.

**Figure 5.** Initially oriented images. The corresponding characteristic areas were marked in colors.

In the point cloud matching process, two images are considered. The first is the orthophotomap, which is the same for all cases. The second image was varied to evaluate the described filters. The optical reference image is presented in Figure 6. The total number of keypoints found on the orthophotomap is 126,164.

**Figure 6.** Orthophotomap with marked keypoints processed using the ASIFT algorithm.

#### *5.1. Original SAR Image*

In the first step, the original SAR image was considered. As mentioned, speckle noise may have significant influence when SAR data are processed in a typical way, ignoring this phenomenon. However, the problem under consideration requires additional processing, because such noise provides unwanted keypoints related to the nature of SAR image creation, not actual artifacts in the image. Figure 7 presents the results of the ASIFT algorithm processing for the initial SAR image.

**Figure 7.** The results of the ASIFT algorithm processing for the initial SAR image.

As can be seen, speckle noise has a significant impact on the number of keypoints as well as their quality. The ASIFT algorithm found 367,584 keypoints, which is almost three times as many as in the optical image. Since speckle noise is present throughout the dataset, distinguishing characteristic components or objects is impossible. Nevertheless, the point clouds assigned to the images were processed, and the results before and after the operation are presented in Figure 8.

**Figure 8.** Point clouds before and after correction for the orthophotomap and original SAR image. (**a**) Point clouds of the SAR and optical images before orientation, (**b**) Point clouds of the SAR and optical images after orientation.

Because of the distorted data, it is impossible to find orientation correction. Because of the speckle noise, keypoints were found in the entire image, whereas the optical image is covered by keypoints only in the regions with the highest dynamic. As such, point clouds corresponding to keypoints in both images are correlated incorrectly.

For the analyzed point clouds, the estimated latitude and longitude corrections were calculated as the location of the histogram maximum for each coordinate. The histograms were computed from the shift of each point in the SAR image required to match the optical reference orthophotomap. The results obtained are depicted in Figure 9. The estimated longitude correction is 6.625 · 10<sup>−4</sup>° and the latitude correction is 6.45 · 10<sup>−4</sup>°. The results are compared with those obtained for despeckled images below.
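The histogram-based estimate can be sketched as follows. The sketch assumes that point correspondences are already available as two equally sized arrays of (longitude, latitude) pairs, which is a simplification of the point cloud processing described in the text; the function names and the number of bins are hypothetical.

```python
import numpy as np


def histogram_peak(shifts, bins=50):
    """Return the centre of the most populated histogram bin."""
    counts, edges = np.histogram(np.asarray(shifts, dtype=float), bins=bins)
    peak = int(np.argmax(counts))
    return 0.5 * (edges[peak] + edges[peak + 1])


def estimate_correction(sar_points, matched_optical_points, bins=50):
    """Per-axis correction (longitude, latitude) from matched point shifts.

    Both inputs are (N, 2) arrays of (longitude, latitude) in degrees; row i of
    one array is assumed to correspond to row i of the other.
    """
    shifts = np.asarray(matched_optical_points, dtype=float) - np.asarray(
        sar_points, dtype=float
    )
    return histogram_peak(shifts[:, 0], bins), histogram_peak(shifts[:, 1], bins)
```

The resolution of such an estimate is bounded by the bin width, so the number of bins trades resolution against the statistical stability of the peak.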

**Figure 9.** Longitude and latitude correction calculated based on the original SAR image.

#### *5.2. SAR Image Filtered Using ML2D*

The ML2D method is presented in Section 4.1. Here, the results of the modified ASIFT algorithm are delivered. First, the keypoints were found in the filtered SAR image. The results obtained are depicted in Figure 10.

**Figure 10.** The results of the ASIFT algorithm processing for the SAR image filtered using ML2D method.

As can be noticed, the number of keypoints is significantly reduced by using the despeckling method. After this operation, keypoints were found in characteristic places such as trees, buildings, or roads. The ASIFT algorithm found 49,153 keypoints. The point clouds before and after orientation are presented in Figure 11.

**Figure 11.** Point clouds before and after correction for the orthophotomap and SAR image filtered using ML2D method. (**a**) Point clouds of the SAR and optical images before orientation, (**b**) Point clouds of the SAR and optical images after orientation.

As expected, the filtered image allowed precise results to be obtained. Contours provided by the keypoints in the SAR scene were matched to the same scene, but illustrated using an optical sensor. Thanks to the ML2D method, the correction was estimated. Each coordinate is presented in Figure 12.

**Figure 12.** Longitude and latitude correction calculated based on the original SAR image filtered using ML2D method.

The estimated longitude and latitude corrections are, respectively, 9.775 · 10<sup>−4</sup>° and 2.69 · 10<sup>−4</sup>°. It is worth noting that the character of the histograms differs from that of the unfiltered data. In the initial case, the error had a Gaussian curve shape. After filtration, the latitude error can be clearly indicated. In the longitude case, the error is ambiguous. This is caused by the way SAR and optical images are created. Orthophotomaps are usually made perpendicular (NADIR) to the Earth, whereas SAR radars working in the StripMap mode illuminate scenes from a certain angle (off-NADIR). This dependency introduces an additional affine transformation, "stretching" an image in the direction of the antenna main lobe. However, by approximating the histogram using a Gaussian curve and interpolating data, the value corresponding to the correction can be estimated more precisely than in the original SAR image. Additionally, the correction sets are significantly narrower in comparison to the original data.
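One possible way to approximate the histogram with a Gaussian curve is sketched below. It relies on the fact that a Gaussian is a parabola in the logarithm of the counts, so a weighted quadratic fit to the log-counts yields a peak position that is not restricted to bin centres. The fitting strategy, the function name, and the thresholds are assumptions of this sketch, not a description of the authors' implementation.

```python
import numpy as np


def gaussian_refined_peak(shifts, bins=40, min_fraction=0.1):
    """Estimate the histogram mode by fitting a Gaussian to the bin counts.

    A weighted quadratic fit of log(count) against the bin centre gives the
    peak position as -b / (2a) for the fitted parabola a*x^2 + b*x + c.
    """
    counts, edges = np.histogram(np.asarray(shifts, dtype=float), bins=bins)
    centres = 0.5 * (edges[:-1] + edges[1:])
    peak_centre = centres[np.argmax(counts)]

    # Use only well-populated bins; sparse tails are dominated by noise.
    keep = counts >= max(1.0, min_fraction * counts.max())
    if keep.sum() < 3:
        return peak_centre

    # Centre and scale the abscissa so the fit is well conditioned.
    scale = edges[-1] - edges[0]
    x = (centres[keep] - peak_centre) / scale
    a, b, _ = np.polyfit(x, np.log(counts[keep]), 2, w=np.sqrt(counts[keep]))
    if a >= 0:  # not a peak-shaped fit; fall back to the coarse estimate
        return peak_centre
    return peak_centre + scale * (-b / (2.0 * a))
```

The fallback to the bin centre keeps the sketch usable for ambiguous, multi-modal histograms such as the longitude case described above, where a single Gaussian may not fit.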

The results obtained for the other filters are similar and depicted in the same way. To make the text perspicuous, graphical representations such as keypoint clouds, histograms, and SAR images with marked points are attached in Appendix A.

#### *5.3. SAR Image Filtered Using MEAN Filter*

The next considered filter is presented in detail in Section 4.2. This approach is the simplest and probably the most intuitive, and can be quickly implemented to obtain a SAR image with reduced speckle noise. The filtered image with marked keypoints is depicted in Figure A1.

The number of detected keypoints significantly decreased. Despite the simple nature of the filter, the navigation error was estimated. The ASIFT algorithm detected 65,489 points, concentrated on characteristic places such as trees, buildings, etc. This made the keypoint clouds obtained for the optical and SAR images comparable, as presented in Figure A2.

The MEAN filter improves the results. According to the histograms presented in Figure A3, the longitude and latitude corrections are, respectively, 9.465 · 10<sup>−4</sup>° and 2.595 · 10<sup>−4</sup>°. The presented histograms show the correction distribution, from which the navigation drift can be obtained.

Figures provided for this subsection are attached in Appendix A.1.

#### *5.4. SAR Image Filtered Using MMSE Filter*

The MMSE filter, described in Section 4.3, was examined as the next method for speckle noise reduction. As in the previous methods, in the first step, keypoints were detected using the ASIFT method, as presented in Figure A4.

As can be seen, a larger number of points was detected. The algorithm extracted 66,917 keypoints, which is clearly higher than the values achieved with the previous speckle noise reduction algorithms. By analyzing the result, it can be seen that points were found in places where they were not detected for other noise reduction methods. Keypoints are visible in fields and uniform areas, resulting in poorer point cloud quality. However, the overall point cloud structure has been preserved and correctly covered, as illustrated in Figure A5.

Although the quality of the keypoints is worse, because more points were detected, it was possible to correctly cover the characteristic areas distinguished from the images. A characteristic triangle formed by tree lines and buildings located in the left part of the picture were covered, determining the error being sought.

The calculated longitude and latitude corrections presented in Figure A6 are 9.355 · 10<sup>−4</sup>° and 2.57 · 10<sup>−4</sup>°, respectively. This is a similar result to the previously used methods. Accuracy is mainly limited by the histogram resolution, which can be improved by interpolation. Despite the detection of a larger number of points, which resulted from a smaller reduction of the speckle noise, it successfully allowed the detected keypoint clouds to be covered, and thus the navigation correction to be recalculated. The results are consistent with the methods presented above.

Figures provided for this subsection are attached in Appendix A.2.

#### *5.5. SAR Image Filtered Using ELEE Filter*

The ELEE filter described in detail in Section 4.4 was employed as the next method for the improvement of point clouds covering. As in the case of the MMSE filter, more points were found both in places with increased dynamics and on uniform surfaces. This indicates less noise reduction; however, the overall character of the keypoint cloud was set as in the case of the MMSE filter. The results obtained are presented in Figure A7.

For the discussed method, 58,165 characteristic points were detected, which is also a significant number compared to the analyzed methods. The keypoint clouds before and after the correction are shown in Figure A8.

Again, the correction was calculated correctly, as evidenced by the coverage of characteristic areas in the images under investigation. It turns out that the method of point cloud shift correction is resistant to such small discrepancies and additional detection of characteristic points resulting from limited filtration of speckle noise. Histograms presenting navigation correction are presented in Figure A9. The estimated longitude and latitude corrections are, respectively, 9.4175 · 10<sup>−4</sup>° and 2.19 · 10<sup>−4</sup>°.

Figures provided for this subsection are attached in Appendix A.3.

#### *5.6. SAR Image Filtered Using GMAP Filter*

The next investigated filter is GMAP, whose extended characteristic is presented in Section 4.5. The result showing the point cloud is illustrated in Figure A10.

The total number of points found in the image is 22,200, which is the smallest value for all of the considered cases. The algorithm detected characteristic points corresponding to different objects in the scene, which indicates a significant reduction in noise in the image.

As shown in Figure A11, this filter also provides correct results. The point clouds are similar to those previously presented, making the GMAP filter an effective tool in the proposed method. It is worth noting that, for the considered image, there are a few regions where the resolution is low, such as in the lower part where trees are present, or in the upper part where buildings are depicted. Histograms presenting navigation correction are shown in Figure A12. Comparing the result in Figure A10 to the ones previously obtained, new details are available. In this case, the GMAP method reveals a distinguishing object initially "hidden" in noise, which can be used in the point cloud analysis. The same effect is visible in Figure A13 for the SAR-BM3D filter. For this case, the estimated longitude correction is 9.635 · 10<sup>−4</sup>°, whereas the estimated latitude correction is 2.445 · 10<sup>−4</sup>°, which corresponds to the previously obtained values.

Figures provided for this subsection are attached in Appendix A.4.

#### *5.7. SAR Image Filtered Using SAR-BM3D Filter*

The last examined filter is SAR-BM3D, described in detail in Section 4.6. The same high resolution SAR image was processed to obtain a keypoint cloud, allowing the navigation correction to be estimated. The input image with marked keypoints is presented in Figure A13.

The utilized despeckling method significantly improved the image quality (by reducing the noise level), which concentrated the detected keypoints in the most dynamic regions. The keypoint distribution corresponds to the results obtained for the different despeckling methods previously presented (including the number of detected keypoints). However, fewer keypoints were found in comparison to the previously presented outcomes (apart from the GMAP filter). The algorithm detected 24,954 points. In Figure A14, the keypoint clouds before and after orientation are presented.

Thanks to the reduction of speckle noise, a non-negligible improvement was obtained, which supports an unequivocal shift estimation. The correction distribution, presented in Figure A15, is similar to the previous methods. In addition, in this case, the considered set is narrower than for the original SAR image without despeckling. The estimated longitude and latitude corrections are, respectively, 9.865 · 10<sup>−4</sup>° and 2.475 · 10<sup>−4</sup>°. This result proves the usability of such a filter both for despeckling purposes and for the proposed modified ASIFT algorithm.

Figures provided for this subsection are attached in Appendix A.5.

#### *5.8. Discussion*

Each of the applied filters allowed the correction estimates to be calculated in accordance with the assumptions. Due to the different nature of the creation of optical and SAR images, some imperfections are visible, but they are negligible in the analyzed problem. The processing performance of the filters is presented in Section 4.7. Table 2 shows the processing time of the ASIFT algorithm (excluding filtering time) and of the point cloud correlation. Additionally, the results of the correction estimation, as well as the number of points found in each of the images, are summarized.


**Table 2.** Estimated coordinates correction for different filters.

As can be seen, similar results were obtained for all filters tested. The outcomes of the rotation estimation are characterized by the following relationship: the greater is the number of points found, the greater is the rotation correction determined. Interestingly, the calculation time is not strongly dependent on the number of keypoints detected. However, storing points requires more memory, and, even if the calculation time is similar, the memory required is larger for the method detecting a greater number of points. It should be noted that all computation was performed on the CPU without any parallelization. To decrease the computation time and boost the computation speed, a GPU should be used.

The experiments proved that the proposed method for the geometrical matching of SAR and optical images utilizing ASIFT features for SAR-based Navigation Aided Systems is suitable for computing the corrections to the SAR images' direct georeferencing. In the literature, there are many methods and algorithms for this type of data co-registration [9,11–13,15,16,18,19]. All of these algorithms were tested on SAR images and optical images acquired from space. The proposed methodology of data integration is based on high resolution (in full resolution) SAR images and orthophotomaps obtained from altitude. Thus, it is hard to compare the presented method with the methods described in [9,11–13,15,16,18,19], because of the spatial resolution of satellite optical images and the size of overlapping areas. For this reason, it was necessary to guarantee robust corresponding points that are well distributed over the whole investigation area. To achieve this, the SfM approach was modified: the ASIFT detector is used, the keypoint description and description matching steps are eliminated, and the two-step ICP method is applied. To compute the transformation parameters, methods based on finding pairs of points are used. This way of determining corresponding points is useful when the points are well distributed throughout the overlapping areas. Unfortunately, when data from altitude are processed, this condition may not be met and the matching pairs of points will not be evenly distributed throughout the study area. This might cause wrong or inaccurate determination of correction parameters. Therefore, the use of the ASIFT algorithm allows the detection of more points evenly distributed throughout the entire area of work. In connection with the ICP method, the keypoints are not aligned in pairs, but form a rigid body that reduces the influence of outliers on the final transformation parameters.
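To make the idea of aligning whole keypoint clouds rather than described pairs more concrete, a deliberately simplified, translation-only ICP loop is sketched below. It is not the two-step ICP used by the authors: rotation and scale are ignored, nearest neighbours are found by brute force, and the function name `icp_translation` is hypothetical.

```python
import numpy as np


def icp_translation(source, reference, iterations=20):
    """Estimate the 2-D translation that moves `source` onto `reference`.

    Each iteration pairs every source point with its nearest reference point
    and updates the translation by the mean residual of those pairs.
    """
    source = np.asarray(source, dtype=float)
    reference = np.asarray(reference, dtype=float)
    translation = np.zeros(2)
    for _ in range(iterations):
        moved = source + translation
        # Brute-force nearest neighbours; suitable only for small clouds.
        dist2 = ((moved[:, None, :] - reference[None, :, :]) ** 2).sum(axis=2)
        nearest = reference[np.argmin(dist2, axis=1)]
        step = (nearest - moved).mean(axis=0)
        translation += step
        if np.linalg.norm(step) < 1e-12:
            break
    return translation
```

Because every point contributes to the mean residual, a few wrong nearest-neighbour assignments shift the estimate only slightly. For clouds of the size reported in this section, a spatial index such as a k-d tree would be needed instead of the full distance matrix.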

In Figure 13, the oriented SAR and optical images before and after correction are presented. In this case, the correction calculated by the ELEE method was utilized; however, the results are comparable for all cases when despeckling filters are used. As can be seen, characteristic objects such as trees, roads, and buildings are covered.

On the basis of the results obtained, the calculated correction can also be expressed in meters. Assuming that 1° ≈ 111.1 km, the longitude correction is about 105 m, while the latitude correction is about 25 m. This is a significant value, especially in the case of military systems for which the required precision should be as high as possible. For the case where the despeckling filter was not applied, the navigation correction is incorrect and amounts to 73.6 m for longitude and 71.6 m for latitude, which are the mean values of the differences between the original coordinates of the SAR and optical images taken into consideration. This shows how important it is to combine the presented methods and implement the processing pipeline. The juxtaposition also shows that, from the point of view of the implementation of navigation correction on a flying platform, it is sufficient to use simple despeckling filters, which can significantly reduce computational effort and thus accelerate processing, which is particularly important for fast flying objects. It should be noted that the final resolution of the correction is limited by the radar bandwidth, for which the range resolution is expressed by the relationship *δR* = *c*/(2*B*), where *c* is the speed of light and *B* is the radar signal bandwidth. For the proposed method to make sense, high resolution radars should be used, which, unfortunately, are associated with a greater financial outlay and a requirement for high processing power of the computing unit. In addition, it should be taken into account that the obtained high resolution radar images must be filtered to reduce speckle noise, which limits the resolution of the entire imaging. Moreover, phenomena such as heterogeneous movement of the radar carrier platform or changing its speed during the flight in an unpredictable way also degrade the quality of the image.
Nevertheless, taking into account and minimizing such phenomena, it is possible to use the method proposed by the authors, which was proved experimentally.
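The arithmetic behind these figures can be reproduced with a short script. The conversion factor and the correction values are taken from the text; the 1 GHz bandwidth is a hypothetical value chosen only to illustrate the range resolution formula, and the helper names are illustrative.

```python
C = 299_792_458.0            # speed of light [m/s]
METERS_PER_DEGREE = 111.1e3  # approximation used in the text: 1 deg ~ 111.1 km


def degrees_to_meters(delta_deg):
    """Convert an angular correction in degrees to an approximate distance."""
    return delta_deg * METERS_PER_DEGREE


def range_resolution(bandwidth_hz):
    """Range resolution delta_R = c / (2 B)."""
    return C / (2.0 * bandwidth_hz)


# Corrections (longitude, latitude) in degrees, as reported in Section 5.
unfiltered = (6.625e-4, 6.45e-4)
elee = (9.4175e-4, 2.19e-4)
for name, (d_lon, d_lat) in {"unfiltered": unfiltered, "ELEE": elee}.items():
    print(f"{name}: lon {degrees_to_meters(d_lon):.1f} m, "
          f"lat {degrees_to_meters(d_lat):.1f} m")

# Hypothetical 1 GHz bandwidth, chosen only to illustrate the formula.
print(f"range resolution: {range_resolution(1e9):.3f} m")
```

Following the text, the same factor is applied to both coordinates; strictly, the length of one degree of longitude decreases with the cosine of the latitude, so the longitude figure is an upper-bound approximation.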

(**a**) Initially oriented images

(**b**) Images oriented after correction **Figure 13.** Oriented images before and after correction.

#### **6. Conclusions**

This paper presents a novel algorithm for navigation correction estimation dedicated to flying platforms (e.g., drones, airplanes, cruise missiles, etc.) in cases when satellite navigation systems are unavailable. This method utilizes various techniques to:


Combining such methods in a single solution is a novel approach proposed by the authors. Taking advantage of several techniques, an innovative algorithm was proposed, tested and validated to confirm its usability for SAR images obtained during a measurement campaign. This solution may be useful in military and civilian applications, when the lack of a GNSS signal is a critical problem which makes flying missions impossible. Merged techniques such as ASIFT-based keypoint extraction and SfM-based keypoints matching make this method robust and resistant to noise and interference. Thus, the presented methodology can be successfully integrated with existing systems to enhance their precision and dependability. Additionally, the authors provide a comparison of several filters, including their computational complexity and performances. This presents a wide variety of uses for this technique depending on the solution. In the future, the authors intend to implement the method on a real-time platform and test and verify the proposed methodology in real conditions, which should confirm its usability. The target platform is the GPU (Graphics Processing Unit), allowing fast computing to be obtained (both SAR processing and correction estimation). Another crucial issue is the implementation of the functionality that would be able to cope with the situation where the imaged scene has been significantly changed compared to the one that was saved in the database (as an optical image). Such a situation may arise when, for example, the imaged area has been damaged as a result of a disaster or war, and the database on board the flying object does not have current pictures. The estimation may then be ineffective, which is undesirable. This is currently the main problem the authors are working on.

**Author Contributions:** project administration, D.G.; formal analysis, J.M., W.O., A.G., and K.A.; funding acquisition, D.G. and P.S.; investigation, J.M., W.O., A.G., and K.A.; methodology, J.M., W.O., and A.G.; resources, D.G.; software, J.M. and K.A.; supervision, P.S.; validation, D.G.; visualization, A.G. and K.A., writing—original draft, K.A., A.G., J.M., and W.O.; and writing—review and editing, P.S.

**Funding:** This work was done within the frame of project No. DOB-2P/03/05/2018 funded by the Polish National Center for Research and Development (NCBiR).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

In this appendix, the results obtained in Section 5 are presented.

*Appendix A.1. Figures for SAR Image Filtered Using MEAN filter*

**Figure A1.** The results of the ASIFT algorithm processing for the SAR image filtered using MEAN method.

**Figure A2.** Point clouds before and after correction for the orthophotomap and SAR image filtered using MEAN filter. (**a**) Point clouds of the SAR and optical images before orientation, (**b**) Point clouds of the SAR and optical images after orientation.

**Figure A3.** Longitude and latitude correction calculated based on the original SAR image filtered using MEAN filter.

*Appendix A.2. Figures for SAR Image Filtered Using MMSE Filter*

**Figure A4.** The results of the ASIFT algorithm processing for the SAR image filtered using MMSE filter.

**Figure A5.** Point clouds before and after correction for the orthophotomap and SAR image filtered using MMSE filter. (**a**) Point clouds of the SAR and optical images before orientation, (**b**) Point clouds of the SAR and optical images after orientation.

**Figure A6.** Longitude and latitude correction calculated based on the original SAR image filtered using MMSE filter.

*Appendix A.3. Figures for SAR Image Filtered Using ELEE Filter*

**Figure A7.** The results of the ASIFT algorithm processing for the SAR image filtered using ELEE filter.

**Figure A8.** Point clouds before and after correction for the orthophotomap and SAR image filtered using ELEE filter. (**a**) Point clouds of the SAR and optical images before orientation, (**b**) Point clouds of the SAR and optical images after orientation.

**Figure A9.** Longitude and latitude error calculated based on the original SAR image filtered using ELEE filter.

*Appendix A.4. Figures for SAR Image Filtered Using GMAP Filter*

**Figure A10.** The results of the ASIFT algorithm processing for the SAR image filtered using GMAP filter.

**Figure A11.** Point clouds before and after correction for the orthophotomap and SAR image filtered using GMAP filter. (**a**) Point clouds of the SAR and optical images before orientation, (**b**) Point clouds of the SAR and optical images after orientation.

**Figure A12.** Longitude and latitude error calculated based on the original SAR image filtered using GMAP filter.

*Appendix A.5. Figures for SAR Image Filtered Using SAR-BM3D Filter*

**Figure A13.** The results of the ASIFT algorithm processing for the SAR image filtered using SAR-BM3D filter.

**Figure A14.** Point clouds before and after correction for the orthophotomap and SAR image filtered using SAR-BM3D filter. (**a**) Point clouds of the SAR and optical images before orientation, (**b**) Point clouds of the SAR and optical images after orientation.

**Figure A15.** Longitude and latitude correction calculated based on the original SAR image filtered using SAR-BM3D filter.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Wavelength-Resolution SAR Ground Scene Prediction Based on Image Stack**

#### **Bruna G. Palm 1,\*, Dimas I. Alves 2,3, Mats I. Pettersson 4, Viet T. Vu 4, Renato Machado 5, Renato J. Cintra 6,7,8, Fábio M. Bayer 9, Patrik Dammert <sup>10</sup> and Hans Hellsten <sup>10</sup>**


Received: 13 February 2020; Accepted: 31 March 2020; Published: 3 April 2020

**Abstract:** This paper presents five different statistical methods for ground scene prediction (GSP) in wavelength-resolution synthetic aperture radar (SAR) images. The GSP image can be used as a reference image in a change detection algorithm yielding a high probability of detection and low false alarm rate. The predictions are based on image stacks, which are composed of images from the same scene acquired at different instants with the same flight geometry. The considered methods for obtaining the ground scene prediction include (i) autoregressive models; (ii) trimmed mean; (iii) median; (iv) intensity mean; and (v) mean. It is expected that the predicted image presents the true ground scene without change and preserves the ground backscattering pattern. The study indicates that the median method provided the most accurate representation of the true ground. To show the applicability of the GSP, a change detection algorithm was considered using the median ground scene as a reference image. As a result, the median method displayed a probability of detection of 97% and a false alarm rate of 0.11/km<sup>2</sup> when considering military vehicles concealed in a forest.

**Keywords:** CARABAS II; ground scene prediction; image stack; multi-pass; SAR images

#### **1. Introduction**

Common tasks in synthetic aperture radar (SAR) statistical image processing include the identification and classification of distinct ground types [1–5], modeling [6–9], and change detection [10–13]. In particular, wavelength-resolution low-frequency SAR systems are useful for natural disaster monitoring, foliage-penetrating applications, and detection of concealed targets [14].

The wavelength-resolution SAR system is usually associated with an ultrawideband (UWB) radar signal and an ultrawidebeam antenna [15]. With this combination, the maximum resolution is achieved, and it is on the order of the radar signal wavelength. Additionally, available UWB SAR systems only operate at low frequencies. One essential feature of wavelength-resolution SAR systems is that speckle noise does not influence the acquired images, since it is likely that only a single scatterer is present in the resolution cell. Additionally, small scatterers present in the ground area of interest do not contribute to the backscattering for low-frequency radar systems. Thus, small structures, such as tree branches and leaves, are not shown in SAR images [16]. Because large scatterers are associated with low-frequency components, they tend to be less influenced by environmental effects and are stable in time. Hence, by using multiple passes with identical heading and incidence angle of the illuminating platform over a given ground area, an image package with similar statistics can be obtained [17]. In [18], clutter statistical models for stacks of very-high-frequency (VHF) wavelength-resolution SAR images are discussed. SAR image stacks are a frequent topic of study for SAR systems with high resolution [19–21]. However, the literature lacks the use of large image stacks for wavelength-resolution SAR in change detection applications.

Change detection algorithms (CDA) have been widely considered over the years in the detection of distinct targets in SAR images [22–24]. In particular, wavelength-resolution SAR change detection is an important topic of research and has been studied for more than a decade [17]. Wavelength-resolution systems have also shown unique results, with a high detection rate at a low false alarm rate per square kilometer, as presented, for example, in [17,24]. The nature of the wavelength-resolution SAR imagery can be exploited to facilitate the design of CDAs, since (i) the contribution of small scatterers to radar echoes is not significant for wavelengths of several meters; (ii) scattering from large objects is the main contribution; (iii) large scatterers are usually stable in time and less sensitive to environmental effects; and (iv) the wavelength resolution almost totally cancels the speckle noise [16] in the SAR image, given a very stable backscattering between measurements.

A CDA is used to detect changes in a ground scene between distinct measurements in time, such as natural disasters like floods and wildfires or human-made interferences [14]. Generally, in wavelength-resolution systems, a CDA can be obtained simply by subtracting two single-look images (reference and surveillance), followed by a thresholding operation. However, an image stack can be considered instead of just two images in a CDA; such a collection of images leads to improved detection performance, as discussed in [17]. This information is used to eliminate clutter and noise in the surveillance image [17], consequently enhancing CDA results. Recently, a study using a small stack of multi-pass wavelength-resolution SAR images for change detection was introduced in [17].

In [25], the autoregressive (AR) model was employed as a preliminary study considering a ground scene prediction (GSP) based on a single wavelength-resolution SAR image stack. The resulting predicted image was submitted as input data to a change detection algorithm, based only on subtraction, thresholding, and morphological operations. The CDA in [25] corresponds to the detection analysis step of the CDA used in [26]. Despite its simplicity, the change detection results in [25] were competitive when compared with the ones recently presented in [17,27].

Multi-pass SAR images cannot be observed at exactly equidistant times, and the noise across the image stack is not related to the time order. As a consequence, the use of a time series model, commonly employed in statistical signal processing [28–31], may not be the most suitable approach to obtain a GSP and may consequently result in lower performance in a CDA. Additionally, the backscattering of the images in the stack is stable in time, i.e., the sequence of pixels at each position follows a similar pattern, and changes in such behavior are understood as outliers. Thus, image filtering based on robust statistical methods, such as the trimmed mean and median [32,33], might be a better candidate to obtain a ground scene prediction. These approaches can provide an accurate prediction of the ground scene, avoid the time order problem, and exclude the pixels that do not follow the sequence pattern. Indeed, the median and the trimmed mean filters are traditionally used to remove impulse noise from an image [34–41].

To the best of our knowledge, the study in [25] is the only work related to ground scene prediction for wavelength-resolution SAR image stacks. Our paper extends the results presented in [25] with four other statistical methods to predict a ground scene for three SAR image stacks, since statistical methods are commonly employed in SAR image processing [1,2,5–11,13]. The selected statistical methods to obtain the prediction image are (i) autoregressive models; (ii) trimmed mean; (iii) median; (iv) intensity mean; and (v) mean. The ground scene prediction methods are intended to preserve the statistical characteristics of the ground backscattering of the images in the stack and to yield predicted pixel values close to those of the original images. It is expected that the predicted images represent the true ground scenes, allowing applications such as monitoring of forested areas and natural disasters. In this paper, our goal is twofold. First, we propose the use of statistical methods to obtain a ground scene prediction image based on wavelength-resolution SAR image stacks. Second, we consider this new image as a reference image in a change detection algorithm. In particular, we employed the median GSP image obtained from stack statistics as a reference image in a CDA based on the detection analysis step of the CDA presented in [26], which was evaluated in terms of target detection probability and false alarm rate. The results reported in [12,17,24] were adopted as the reference for comparison.

The paper is organized as follows. In Section 2, we describe the considered change detection method and a suite of selected statistical methods for ground scene prediction. Section 3 presents experimental results, including a description of the considered data set, the ground scene prediction results, and the change detection results. Then, a change detection method based on the discussed GSP approaches is introduced. Finally, Section 4 concludes the paper.

#### **2. Change Detection Method**

The change detection method used in this paper applied the processing scheme given in Figure 1. An image stack is processed by a desirable GSP method furnishing the GSP image. The changes are simply obtained with the subtraction of the image of interest (surveillance image) from the GSP image (reference image). For change detection, we applied thresholding to the difference image and then used morphological operations for false alarm minimization. The methods employed to obtain the GSP images are described in the next section.

**Figure 1.** Processing scheme for change detection. The ground scene prediction (GSP) image is the reference image and the interest image is the surveillance image. The change detection algorithm (CDA) is performed applying thresholding and morphological operations in the difference image. Note that the difference image is based on the subtraction between single-look image pixels as a consequence of the stability in backscattering using a wavelength-resolution synthetic aperture radar (SAR) system.

The employed CDA consists of two mathematical morphology steps. First, an opening operation [42] is applied to remove small pixel values, which are regarded as noise. The second step is a dilation that prevents the splitting of the targets of interest into multiple substructures. The first step uses a 3 × 3 pixel square structuring element, whose size is determined by the system resolution; the second step uses a 7 × 7 pixel structuring element, which is linked to the approximate size of the targets (about 10 × 10 pixels).
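The processing chain described above (subtraction, thresholding, opening, dilation) can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the difference is taken as surveillance minus reference so that deployed targets appear as positive values, and the morphology is assumed to be binary and applied after thresholding, with square structuring elements of 3 × 3 and 7 × 7 pixels.

```python
import numpy as np


def _shifted_views(mask, size):
    """Yield every shift of `mask` covered by a size x size square window."""
    r = size // 2
    padded = np.pad(mask, r, mode="constant", constant_values=False)
    rows, cols = mask.shape
    for dy in range(size):
        for dx in range(size):
            yield padded[dy:dy + rows, dx:dx + cols]


def binary_erode(mask, size):
    """Keep a pixel only if the whole window around it is set."""
    out = np.ones(mask.shape, dtype=bool)
    for view in _shifted_views(mask, size):
        out &= view
    return out


def binary_dilate(mask, size):
    """Set a pixel if any pixel in the window around it is set."""
    out = np.zeros(mask.shape, dtype=bool)
    for view in _shifted_views(mask, size):
        out |= view
    return out


def detect_changes(surveillance, reference, threshold,
                   open_size=3, dilate_size=7):
    """Return a boolean change mask (hypothetical helper).

    Assumption: difference = surveillance - reference, thresholded
    from above; opening = erosion followed by dilation.
    """
    diff = np.asarray(surveillance, dtype=float) - np.asarray(reference, dtype=float)
    mask = diff > threshold
    opened = binary_dilate(binary_erode(mask, open_size), open_size)
    return binary_dilate(opened, dilate_size)
```

In this sketch an isolated bright pixel is removed by the opening, while a compact group of bright pixels survives and is then grown by the dilation so that fragments of one target merge into a single detection.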

#### *2.1. Ground Scene Prediction*

As discussed in [18], an image stack is composed of images with similar heading and incidence angle of the same illuminating platform. As a consequence of this similarity, the SAR images in the stack are very similar and stable in time. Thus, a sequence of each pixel position can be extracted from the stack, as illustrated in Figure 2.

The data set considered in this paper is composed of wavelength-resolution SAR images, i.e., the resolution of the SAR image is in the order of the radar signal wavelength [16]. Therefore, there may only be a single scatter in the resolution cell. As a consequence, the considered images are not affected by speckle noise, which is typically a strong source of noise in SAR images in higher frequency bands. Thus, the backscattering from the image stack is stable in time, allowing an accurate GSP.

**Figure 2.** Stack of images to be considered in GSP. The methods should be applied for each pixel position, as evidenced by the vertical line.

We consider five statistical methods to obtain ground scene predictions. The techniques are applied in a sequence of pixels, as described in the following.

#### *2.2. AR Model*

The AR model was adopted to compute the GSP, which can be defined as [43]

$$y[n] = -\sum_{k=1}^{p} a[k]\,y[n-k] + u[n], \quad n = 1, 2, \dots, N,\tag{1}$$

where *y*[*n*] is the value of each pixel in one image, *N* is the number of images in the stack, *a*[*k*] are the autoregressive terms, *u*[*n*] is white noise, and *p* is the order of the model [43]. The autoregressive terms *a*[*k*] in Equation (1) can be estimated by the Yule–Walker method [43,44].

Hence, the estimated autoregressive terms *a*[*k*] are the solutions of the equation system, given by [43]

$$
\begin{bmatrix} r_{yy}[0] & r_{yy}[1] & \dots & r_{yy}[p-1] \\ r_{yy}[1] & r_{yy}[0] & \dots & r_{yy}[p-2] \\ \vdots & \vdots & \ddots & \vdots \\ r_{yy}[p-1] & r_{yy}[p-2] & \dots & r_{yy}[0] \end{bmatrix} \begin{bmatrix} a[1] \\ a[2] \\ \vdots \\ a[p] \end{bmatrix} = -\begin{bmatrix} r_{yy}[1] \\ r_{yy}[2] \\ \vdots \\ r_{yy}[p] \end{bmatrix}, \tag{2}
$$

where *ryy*[·] is the sample autocorrelation function. Information about large sample distributions of the Yule–Walker estimator, order selection, and confidence regions for the coefficients can be found in [45]. Considering the estimated autoregressive terms *a*[*k*], it is possible to forecast *h* steps ahead with the AR model as [44]

$$
\hat{y}[N+h] = -\sum_{k=1}^{p} \hat{a}[k]\,y[N+h-k].\tag{3}
$$

The ground scene prediction image is obtained by forecasting the one-step ahead (*h* = 1) pixel value for each pixel in the image.
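A minimal NumPy sketch of Equations (1) to (3) follows. It keeps the sign convention of Equation (1), solves the Yule–Walker system of Equation (2), and forecasts one step ahead as in Equation (3). The function names are hypothetical, and the biased sample autocorrelation without mean removal is an assumption, since the text does not state which estimator was used.

```python
import numpy as np


def yule_walker(y, p):
    """Estimate a[1..p] of y[n] = -sum_k a[k] y[n-k] + u[n].

    Assumption: biased sample autocorrelation, no mean removal.
    A sequence of all zeros gives a singular system and raises an error.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    r = np.array([np.dot(y[:n - k], y[k:]) / n for k in range(p + 1)])
    # Toeplitz autocorrelation matrix of Equation (2).
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, -r[1:])


def forecast_one_step(y, a):
    """One-step-ahead forecast (h = 1) following Equation (3)."""
    y = np.asarray(y, dtype=float)
    p = len(a)
    # y[N], y[N-1], ..., y[N-p+1]: the last p samples, most recent first.
    recent = y[::-1][:p]
    return float(-np.dot(a, recent))


def ar_gsp(stack, p=1):
    """Apply the forecast to every pixel sequence of a (N, rows, cols) stack."""
    stack = np.asarray(stack, dtype=float)
    return np.apply_along_axis(
        lambda seq: forecast_one_step(seq, yule_walker(seq, p)), 0, stack
    )
```

The per-pixel loop hidden in `np.apply_along_axis` is slow for full-size images; it is used here only to keep the correspondence with the equations visible.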

#### *2.3. Trimmed Mean, Median, and Mean*

For SAR images whose backscattering is stable in time, robust methods can be applied to obtain a GSP. We consider the trimmed mean to obtain a GSP, which is given by

$$\hat{y}_{\rm tm} = \frac{1}{N - 2m} \sum_{n=m+1}^{N-m} y^{*}[n],\tag{4}$$

where *y*<sup>∗</sup>[*n*] is the ordered sequence of *y*[*n*], *m* = (*N* − 1)*α*, and *α* ∈ [0, 1/2) [32,33]. If *α* = 0 or *α* → 0.5, then the trimmed mean corresponds to the sample mean and median, respectively [32], which are considered as methods for GSP derivation.
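The three estimators of this subsection can be sketched as per-pixel operations along the stack axis. The function names and the (N, rows, cols) array layout are assumptions made for illustration; the trimmed mean sorts each pixel sequence and averages the values that remain after discarding the *m* smallest and *m* largest.

```python
import numpy as np


def trimmed_mean_gsp(stack, m):
    """Trimmed mean along the stack axis of a (N, rows, cols) array."""
    stack = np.asarray(stack, dtype=float)
    n = stack.shape[0]
    if not 0 <= 2 * m < n:
        raise ValueError("m must satisfy 0 <= 2m < N")
    ordered = np.sort(stack, axis=0)          # ordered sequence per pixel
    return ordered[m:n - m].mean(axis=0)      # average of the N - 2m central values


def median_gsp(stack):
    """Per-pixel median of the stack."""
    return np.median(np.asarray(stack, dtype=float), axis=0)


def mean_gsp(stack):
    """Per-pixel sample mean, i.e., the trimmed mean with m = 0."""
    return trimmed_mean_gsp(stack, 0)
```

For a pixel sequence with six background values of 0.1 and two target values of 0.8 and 0.9, the trimmed mean with *m* = 2 and the median both return 0.1, whereas the sample mean is pulled up to 0.2875, which illustrates why the robust estimators suppress targets present in only a few images.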

#### *2.4. Intensity Mean*

We also use the intensity mean for obtaining ground scene predictions, given by

$$\hat{y}_{\rm im} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} y[n]^2}.\tag{5}$$

Compared to other statistical methods, the intensity mean has the advantage of providing a physical interpretation of the image reflection. However, all intensity values contribute equally to the prediction, so the result can be strongly affected by changes in the ground scene [32].
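Equation (5) is the root of the mean squared magnitude along the stack axis. A minimal sketch, with a hypothetical function name and an assumed (N, rows, cols) layout:

```python
import numpy as np


def intensity_mean_gsp(stack):
    """Square root of the mean intensity (squared magnitude) per pixel."""
    stack = np.asarray(stack, dtype=float)
    return np.sqrt(np.mean(stack ** 2, axis=0))
```

Because the magnitudes are squared before averaging, a single large value in a pixel sequence raises the result more than it would raise the sample mean, which is consistent with the sensitivity to scene changes noted above.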

#### **3. Experimental Results**

In this section, we present the results obtained from the discussed ground scene prediction methods and describe an approach for change detection based on such methods.

#### *3.1. Data Description*

In this study, we considered a data set obtained from CARABAS II, a Swedish UWB VHF SAR system whose images are available in [46]. The system is a low-frequency wavelength-resolution system, which means that the images have almost no speckle noise. The data set was divided into three stacks with eight images each, i.e., two out of six passes have identical flight headings. Two passes have a flight heading of 255°, two of 135°, and two of 230°, and the heading is defined as 0° pointing towards the north with clockwise increasing heading. The images in the stacks have the same flight geometry but are associated with four different targets' deployments (missions 1 to 4) in the ground scene. Hence, with four missions and six passes for each mission, there are 24 magnitude single-look SAR images. The images cover a scene of size 2 km × 3 km and are georeferenced to the Swedish reference system RR92, which can easily be transformed to WGS84 [12,26].

The first stack is composed of images corresponding to flight passes 1 and 3; the second stack, with passes 2 and 4; and the last stack is composed of images associated with passes 5 and 6. In all images, the backscattering was stable in time, and only target changes are expected within the image stacks.
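The grouping of the 24 images into stacks can be written down explicitly. The sketch below only encodes the pass-to-stack assignment stated above; the names and the (mission, pass) tuple representation are hypothetical and say nothing about how the image files themselves are organized.

```python
# Passes that share a flight heading form one stack (as described in the text).
STACK_PASSES = {1: (1, 3), 2: (2, 4), 3: (5, 6)}
MISSIONS = (1, 2, 3, 4)


def build_stacks():
    """Map each stack number to its list of (mission, pass) image identifiers."""
    return {
        stack: [(mission, p) for mission in MISSIONS for p in passes]
        for stack, passes in STACK_PASSES.items()
    }


stacks = build_stacks()
```

Each stack then holds four missions times two passes, i.e., eight images, and the three stacks together cover all 24 images.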

Each image is represented as a matrix of 3000 × 2000 pixels, corresponding to an area of 6 km<sup>2</sup>. As reported in [12], the spatial resolution of CARABAS II is 2.5 m in azimuth and 2.5 m in range. The ground scene is dominated by boreal forest with pine trees. Fences, power lines, and roads were also present in the scene. Military vehicles were deployed in the SAR scene and placed uniformly, in a manner that facilitates their identification in the tests [26]. Each image has 25 targets with three different sizes, and the spacing between the vehicles was about 50 m. For illustration, one image of Stack 1 is shown in Figure 3. In this image, the vehicles were (i) obscured by foliage; (ii) deployed in the top left of the scene; and (iii) oriented in a southwestern heading. This deployment corresponds to mission 1. In missions 2, 3 and 4, these vehicles were deployed in other locations and were oriented in a northwestern, southwestern, and western heading, respectively [12,26].

**Figure 3.** Sample image from CARABAS II data set—Stack 1: mission 1 and pass 1.

#### *3.2. Ground Scene Prediction Evaluation*

The AR model parameter estimation requires (i) fitting 6,000,000 models (one fit for each pixel) in each stack and (ii) evaluating the best model for each pixel sequence. Such demands lead to a significant computational burden. For simplicity, we considered *p* = 1 in the AR model. Within the image stack, the two images related to the targets have the highest pixel values in the areas where the targets were deployed. Thus, to compute the trimmed mean, we considered *m* = 2 (*α* ≈ 0.3), expecting to remove the pixels related to the targets, since it is desired that the predicted image presents the true ground scene without change.
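As a quick check of the trimming level quoted above, the relation *m* = (*N* − 1)*α* from Section 2.3 with *N* = 8 images per stack gives

$$\alpha = \frac{m}{N-1} = \frac{2}{7} \approx 0.29, \qquad N - 2m = 8 - 4 = 4,$$

so the trimmed mean averages the four central values of each ordered pixel sequence and discards the two smallest and the two largest.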

Figures 4 and 5 show the ground scene prediction for Stack 1, considering the discussed methods and a zoomed image in the region where the targets were deployed. In Figure 4, the deployed targets are visually present. However, the targets are absent in the images predicted with the trimmed mean and median, as shown in Figure 5. The areas highlighted by rectangles and circles in the images in Figure 4 indicate the regions where the targets were deployed during the measurement campaign. The circles show selected military vehicles that can be viewed. With such visual analysis, the trimmed mean and median show better performance, i.e., better prediction of the ground scene. For brevity, we limited our presentation to the GSP images from Stack 1, which is representative of all considered stacks.

Table 1 displays descriptive statistics of the employed images, such as average, standard deviation, skewness, and kurtosis. It is desirable that a GSP presents not only a good visual representation of the true ground, but also preserves the statistical characteristics of the image of interest. In Table 1, we highlighted the two best methods according to each considered measure. In the majority of the scenarios, the AR model and median methods outperformed the remaining methods.

To evaluate the difference between the ground scene prediction methods, we computed some standard quality adjustment measures. The criteria are the mean square error (MSE), mean absolute percentage error (MAPE), and median absolute error (MdAE), which can be defined as follows [47].

**Figure 4.** Ground scene prediction images for Stack 1 based on the autoregressive (AR) model, mean, and intensity mean methods. The areas highlighted by rectangles in the images represent the regions where the targets are deployed. The circles show selected military vehicles that can be viewed.

**Figure 5.** Ground scene prediction images for Stack 1 based on trimmed mean and median methods.

$$\text{MSE} = \frac{1}{Q} \sum_{q=1}^{Q} (x[q] - \hat{x}[q])^2,\tag{6}$$

$$\text{MAPE} = \frac{1}{Q} \sum_{q=1}^{Q} \frac{|x[q] - \hat{x}[q]|}{|x[q]|},\tag{7}$$

$$\text{MdAE} = \text{Median}\left(|x[q] - \hat{x}[q]|\right), \quad q = 1, 2, \dots, Q,\tag{8}$$

where *x*[*q*] and *x̂*[*q*] are the pixel values of the interest and predicted images, respectively, *Q* is the number of pixels, and Median(·) is the median value of |*x*[*q*] − *x̂*[*q*]|, for *q* = 1, 2, ..., *Q*. These goodness-of-fit measures are usually considered to compare different methods applied to the same data set [47]. They are expected to be as close to zero as possible.
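The three measures in Equations (6) to (8) can be computed together. The sketch below is illustrative: the function name is hypothetical, and the optional `valid` mask is an assumed way of leaving out pixels (for example, known target regions) from the evaluation. The MAPE term is undefined for pixels where the interest image is exactly zero, which this sketch does not guard against.

```python
import numpy as np


def gsp_quality(x, x_hat, valid=None):
    """Return MSE, MAPE, and MdAE between interest image x and prediction x_hat."""
    x = np.asarray(x, dtype=float).ravel()
    x_hat = np.asarray(x_hat, dtype=float).ravel()
    if valid is not None:
        keep = np.asarray(valid, dtype=bool).ravel()
        x, x_hat = x[keep], x_hat[keep]
    abs_err = np.abs(x - x_hat)
    return {
        "MSE": float(np.mean(abs_err ** 2)),
        "MAPE": float(np.mean(abs_err / np.abs(x))),
        "MdAE": float(np.median(abs_err)),
    }
```

For example, with interest values (1, 2, 4) and predictions (1, 1, 2), the absolute errors are (0, 1, 2), so MSE = 5/3, MAPE = 1/3, and MdAE = 1.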

**Table 1.** Average, standard deviation, skewness, and kurtosis of one interest image and the ground scene prediction. The interest image in Stacks 1, 2 and 3, is the image of mission 1 and passes 1, 2 and 5, respectively. The two values of each measure that yielded the closest values with the interest image are **highlighted**.


For the quality adjustment measures, the target regions in the image were excluded since we expect to obtain an accurate ground scene prediction, and no target deployment should influence the measurements. Table 2 summarizes the results of the quality adjustment measures for the five considered statistical methods, and the best measurements are highlighted. The mean method presents the best performance according to MSE measurements, while the median method excels in terms of MAPE and MdAE measures in all the stacks. However, the MSE values obtained with the mean and median methods are similar. The results provided in Tables 1 and 2 consider the same reference image of each stack. Regardless of the selected image, the median method presented good performance according to MAPE, MdAE, and statistics measures.

Based on visual inspection, statistical characteristics, and quality adjustment measures, the median method yields the most reliable prediction among the considered methods. Therefore, we selected the images predicted with the median method as reference images in the change detection algorithm detailed in the next section.


**Table 2.** Measures of quality of the ground scene prediction image. The interest image in Stacks 1, 2 and 3 is the image of mission 1 and passes 1, 2 and 5, respectively. We **highlighted** the values of each quality adjustment measure that yielded the smallest values.

#### *3.3. Change Detection Results*

As indicated in Figure 1, we use the obtained GSP image and the interest image for change detection based on image subtraction. Two examples of subtraction images are shown in Figure 6. Figure 6a highlights the deployed targets, while Figure 6b focuses on the targets and the back-lobe structures. A comparison of the difference image shown in Figure 6b with the related GSP image suggests that the back-lobe structures are related to issues in the SAR system and the image formation algorithm.

Figure 7 shows the pixel values of the image given in Figure 6a in vectorized form. In general, the pixel values of the subtracted image are randomly distributed in (−0.4, 0.4). As discussed in [16], the distribution of the values of the CARABAS II subtracted image approximately follows a Gaussian distribution, and the regions where no change occurs are stable. Thus, the threshold (*λ*) can be simply chosen according to

$$C = \frac{\lambda - \hat{\mu}}{\hat{\sigma}},$$

where *C* is a constant, *μ̂* is the estimated mean, and *σ̂* is the estimated standard deviation of the considered amplitude pixels in the image. For evaluation, we set *C* ∈ {2, 3, 4, 5, 6}, resulting in different false alarm rates (FAR), which range from full detection to almost null false alarm rate.
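Solving the relation above for the threshold gives *λ* = *μ̂* + *C* *σ̂*. A minimal sketch, assuming that the mean and standard deviation are estimated over all pixels of the difference image and that the population form of the standard deviation is used (the text does not specify either choice); the function name is hypothetical.

```python
import numpy as np


def cda_threshold(diff, c):
    """Threshold lambda = mean + c * std of the difference-image pixels."""
    diff = np.asarray(diff, dtype=float)
    return float(diff.mean() + c * diff.std())  # np.std uses ddof=0 by default


# Thresholds for the constants evaluated in the text, on a toy difference image.
toy_diff = np.array([[-1.0, 1.0], [-1.0, 1.0]])
thresholds = {c: cda_threshold(toy_diff, c) for c in (2, 3, 4, 5, 6)}
```

Larger values of *C* move the threshold further into the tail of the approximately Gaussian background, trading detections for fewer false alarms.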

Table 3 summarizes the change detection results corresponding to a single constant *C* = 5. Among the 600 vehicles deployed in the missions, 579 were correctly detected. There are 22 detected objects that cannot be related to any vehicle and are considered to be false alarms. Thus, the detection probability is about 97%, while the false alarm rate is 0.15/km<sup>2</sup> (over a total area of 144 km<sup>2</sup>). Ten of the 22 false alarms are related to the back-lobe structures, i.e., they are not actually false alarms and may stem from system and image formation issues. Additionally, in general, the undetected targets are related to missions 2 and 4. These undetected military vehicles are more difficult to detect since they have smaller sizes and magnitude values and, consequently, pixel values closer to those of the forest.

**Figure 6.** Subtraction of an interest image from the median ground scene prediction image. The areas highlighted by rectangles in the images represent the region with higher pixel values.

**Figure 7.** Result of the subtraction of the ground scene prediction image from the image obtained from mission 1 and pass 1.


**Table 3.** Change detection results obtained with *C* = 5.

#### *3.4. Evaluation*

The performance of change detection was evaluated by the probability of detection (P<sub>d</sub>) and FAR. The quantity P<sub>d</sub> was obtained from the ratio between the number of detected targets and the total number of known targets, while FAR is defined as the number of false alarms detected per square kilometer [26]. Figure 8 presents the receiver operating characteristic (ROC) curves [48] of the change detection results, showing the probability of detection versus the false alarm rates for the different evaluated values of *C*. We compared the change detection results obtained from the proposed method with the results described in [12,17,24]. The proposed method excels in terms of probability of detection and false alarm rate in comparison to [12,17,24].
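The two definitions above can be checked against the counts reported in Section 3.3 for *C* = 5: 579 of 600 vehicles detected and 22 false alarms over 24 images of 6 km<sup>2</sup> each. The function name in this sketch is hypothetical.

```python
import math


def detection_metrics(detected, deployed, false_alarms, area_km2):
    """Return (probability of detection, false alarms per square kilometer)."""
    return detected / deployed, false_alarms / area_km2


total_area_km2 = 24 * 6.0  # 24 images, 6 km^2 each
pd, far = detection_metrics(579, 600, 22, total_area_km2)
log_far = math.log10(far)  # horizontal axis used for the ROC comparison
```

These counts give P<sub>d</sub> = 0.965 and FAR ≈ 0.153/km<sup>2</sup>, i.e., log<sub>10</sub>(FAR) ≈ −0.82, in line with the rounded values of 97% and 0.15/km<sup>2</sup> quoted for *C* = 5.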

For example, for a detection probability of 98%, our proposed change detection method presents log<sub>10</sub>(FAR) of about −0.5, while [12,17,24] have log<sub>10</sub>(FAR) of about 1.4, −0.3 and 0.14, respectively. For log<sub>10</sub>(FAR) = −0.9, i.e., a very low FAR, the probability of detection given by [12] drops to 60%, while our proposal still maintains a probability of detection of more than 90%. The detection probability of our proposed method and [17] reach 100% with log<sub>10</sub>(FAR) ≈ 1, while [12,24] have full detection for log<sub>10</sub>(FAR) ≈ 1.5 and log<sub>10</sub>(FAR) ≈ 2, respectively. Additionally, detection probability improvements of our method compared to [17] are found in the range of (0.93, 0.98). For example, for a probability of detection of 97%, our proposed change detection method presents log<sub>10</sub>(FAR) of about −0.8, while [17] has log<sub>10</sub>(FAR) ≈ −0.2.

**Figure 8.** The receiver operating characteristic (ROC) curves obtained with the CDA with the background predicted scene as the reference image compared with the best ROC curves extracted from [12,17,24].

#### **4. Conclusions**

In this paper, we presented five methods to obtain ground scene predictions of SAR images based on image stacks. The experimental results revealed that, among the considered techniques, the median method yielded the most accurate ground prediction. The statistical characteristics of the obtained GSP image were similar to those of the image of interest. Moreover, the median method excels in terms of quality adjustment measures, and the changes in the image stack were not visible in the predicted image. The GSP image based on the median method was used as a reference image in a CDA, presenting competitive performance when compared with recently published results.

**Author Contributions:** Writing—original draft, B.G.P. and D.I.A.; Writing—review and editing, M.I.P., V.T.V., R.M., R.J.C., and F.M.B.; Resources and supervision, P.D. and H.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** This study was financed in part by the Conselho Nacional de Desenvolvimento Científico e Tecnológico, CNPq, Brazil, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, CAPES, Brazil, Swedish-Brazilian Research and Innovation Centre (CISB), and Saab AB.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **A Multi-Scale U-Shaped Convolution Auto-Encoder Based on Pyramid Pooling Module for Object Recognition in Synthetic Aperture Radar Images**

#### **Sirui Tian 1,\*, Yiyu Lin 2, Wenyun Gao 3, Hong Zhang <sup>4</sup> and Chao Wang 4,5**


Received: 5 February 2020; Accepted: 7 March 2020; Published: 10 March 2020

**Abstract:** Although unsupervised representation learning (RL) can tackle the performance deterioration caused by limited labeled data in synthetic aperture radar (SAR) object classification, the neglected discriminative detailed information and the ignored distinctive characteristics of SAR images can lead to performance degradation. In this paper, an unsupervised multi-scale convolution auto-encoder (MSCAE) was proposed which can simultaneously obtain the global features and local characteristics of targets with its U-shaped architecture and pyramid pooling modules (PPMs). The compact depth-wise separable convolution and the deconvolution counterpart were devised to decrease the trainable parameters. The PPM and the multi-scale feature learning scheme were designed to learn multi-scale features. Prior knowledge of SAR speckle was also embedded in the model. The reconstruction loss of the MSCAE was measured by the structural similarity index metric (SSIM) of the reconstructed data and the images filtered by the improved Lee sigma filter. A speckle suppression restriction was also added in the objective function to guarantee that the speckle suppression procedure would take place in the feature learning stage. Experimental results with the MSTAR dataset under the standard operating condition and several extended operating conditions demonstrated the effectiveness of the proposed model in SAR object classification tasks.

**Keywords:** multi-scale representation learning (MSRL); pyramid pooling module (PPM); compact depth-wise separable convolution (CSeConv); convolution auto-encoder (CAE); object classification; synthetic aperture radar (SAR)

#### **1. Introduction**

As the vital task of object classification with synthetic aperture radar (SAR) images, feature engineering intends to obtain robust representations of intrinsic properties to distinguish various targets in high-resolution radar images. Although numerous hand-designed features have been proposed to represent both the spatial and electromagnetic characteristics of targets over the past decades, feature learning is still a challenging task for SAR-based automatic target recognition (SAR ATR) applications.

In general, the traditional hand-designed features include two categories: the generalized features [1–3] and the SAR-specialized features [4–7]. The former ones involve features from other domains considering little of the characteristics of SAR imagery, while the latter ones refer to those designed for specific SAR ATR tasks. Despite their high accuracy while dealing with the benchmark or specific SAR dataset, all these handcrafted features have certain obstacles. A major drawback is the requirement of detailed prior knowledge about the potential applications that are sometimes unavailable. Another obstacle is that many features, especially those based on scattering models, hold a series of assumptions for operation conditions (OCs), leading to performance degradation when the assumptions are inconsistent with the OCs. Accordingly, it is necessary to devise new feature learning algorithms which can adaptively learn representations from various data, considering complicated situations.

With the theoretical progress of machine learning, the deep learning (DL) model, which has turned out to be adept at automatically discovering intricate information in high-dimensional raw data [8], has been employed to tackle SAR ATR tasks and has achieved performance superior to that of hand-designed features. Although the supervised DL models have obtained state-of-the-art results, their requirement for a large amount of labelled data is the major obstacle in SAR ATR. The labelled benchmarks are too small to train a supervised deep network effectively, and overfitting caused by limited labelled samples is often one of the main causes of performance degradation of the supervised model. To handle this problem, various unsupervised DL models have been employed and developed, including the autoencoder (AE) [9,10], the generative adversarial network (GAN) [11,12], and the restricted Boltzmann machine (RBM) [13]. Owing to its simple implementation and attractive computational cost, the AE has been widely used in SAR ATR; it minimizes the distortion between the inputs and the reconstructions to guarantee that the mapping process preserves the information of the inputs.

In earlier works, the autoencoder was utilized to derive refined representations from the predefined features or preprocessed images before feeding them into a traditional classifier such as the softmax or the support vector machine (SVM) [14–20]. In Reference [14], Geng et al. presented a deep convolution AE (CAE) for SAR image classification. Two kinds of handcrafted features, a gray-level co-occurrence matrix (GLCM) and the Gabor filter banks, were jointly fed into the CAE. The learned representation was subsequently fed to a softmax classifier for land-cover classification. In Reference [15], the geometric parameters and the local texture features were combined to train a stacked AE (SAE) for vehicle classification in SAR images. Gleich and Planinšic [16] estimated the log-cumulants of SAR data patches via the dual-tree oriented wavelet transform and input them into an SAE to derive representations for scene patch categorization. Zhang et al. [17] devised a framework to learn the robust representation of polarimetric SAR (PolSAR) data based on the spatial information, which was characterized by the spatial distance to the central pixel. Their framework was subsequently improved in [18] with a multi-scale strategy in which the spatial information was obtained by taking neighborhood windows of different scales before the stacked sparse AE (SSAE) was applied to extract features at different scales for land cover classification in PolSAR images. Hou et al. [19] devised a PolSAR image classification method based on both the multilayer AE (MLAE) and the superpixel trick. The superpixels, produced by the Pauli decomposition to integrate contextual information of the neighborhood, were refined by an MLAE to generate a robust representation. Chen and Jiao [20] fed the discriminative feature extracted by the multilayer projective dictionary into an SAE to capture the nonlinear relationship between the elements of feature vectors in an adaptive way.

To further promote the performance in representation learning, various models based on the AE framework have also been utilized [21–24]. In Reference [21], the stacked contractive AE (SCAE), which is regularized by a contractive penalty based on the Frobenius norm of the Jacobian, was utilized to extract temporal characteristics from superpixels for change detection in SAR images. Xu et al. [22] developed an improved variational AE (VAE) based on the residual network to draw latent representations for vehicle classification in SAR images. Song et al. [23] devised an adversarial autoencoder neural network (AAN) to learn intrinsic characteristics and generate new samples at different azimuth angles by adversarial training. Kim and Hirose [24] proposed a quaternion autoencoder and a quaternion self-organizing map (SOM) for PolSAR image classification. The quaternion AE was introduced to extract representations based on the natural distribution of PolSAR features. The extracted features were classified by the quaternion SOM in an unsupervised manner, by which new and more detailed land categories could be discovered. In Reference [25], a deep bimodal AE was proposed for land cover classification by fusing the SAR and the multispectral images. The bimodal AE provided independent encoding modalities in the front part to learn the features of SAR data and fused the features of each modality with shared representation layers to obtain the representations for classification.

Despite the explosive growth of unlabeled SAR images with the development of high-resolution SAR systems, the training datasets (even the unlabeled benchmarks) available for specific tasks or targets are limited and incomplete. To handle this problem, model transferring is recommended as another solution to improve the representation learning capability under small sample sizes and limited training resources. Huang et al. [26] devised an assembled CNN model that combines a CAE with a CNN, sharing the encoder part of the CAE. The CAE was pre-trained with a large number of unlabeled SAR images, and its encoder part, connected to a fully connected layer, was fine-tuned with the limited target patches. Mohammad et al. [27] proposed a domain adaptation algorithm to transfer knowledge from the earth observation (EO) domain to the SAR domain. They trained two deep encoders coupled through their last layer to map data points from the EO and the SAR domains to the shared embedding space, such that the distance between the distributions of the two domains was minimized in the latent embedding space. In Reference [28], a DL-based workflow was proposed to map forest above-ground biomass by integrating Landsat 8 and Sentinel-1A images with airborne light detection and ranging (LiDAR) data. The authors demonstrated the advantage of a stacked sparse autoencoder network in comparison to other prediction techniques. De et al. [29] proposed an AE-based technique for urban area classification in PolSAR images which leveraged a synthetic target database for data augmentation (DA). The synthetic dataset obtained by rotation and collation was fed to an SAE to generate a compact representation of the information in the augmented dataset.
Although these model transferring methods have alleviated the overfitting of DL models caused by small datasets and achieved state-of-the-art performance, it is quite difficult to design the transferring schemes for specific SAR ATR tasks in various extended operation conditions (EOCs). Besides, the selection of the pre-trained model and the natural image dataset for information transferring also greatly affects the performance of these approaches. If there is a great difference between the natural images and the objective SAR dataset, the representation learning capability of the transferred models will suffer serious degradation.

Another way to promote the performance of the AE models with a limited training dataset is to incorporate prior knowledge in the model with certain regularization terms and task-specific cost functions. The training process of the AE refers to estimating the trainable parameters of the model and can be achieved by optimizing the objective function consisting of a reconstruction loss and certain regularization terms [9]. In References [30,31], the supervised information was embedded in the cost function by designing label-related regularization terms. Deng et al. [30] devised a Euclidean distance restriction in the cost function which encouraged the intra-class distance of features to be a small value near zero and the inter-class distance to be close to a constant. A similar idea was applied in Reference [31], where the objective function was tuned according to the SAR ATR task. The authors devised a regularization term based on the modified triplet loss that combines the semi-hard triplet loss with the intra-class distance penalty to learn discriminative features with a small intra-class divergence and a large inter-class divergence. In References [32–34], the task-specialized prior knowledge was embedded in the objective function of the AE-based model. Xie et al. [32] proposed a new type of AE and CAE with a modified objective function according to the task of PolSAR image classification, where the distortion of the reconstructed data to the inputs was measured by the Wishart distance instead of the ordinary mean square error (MSE) or cross-entropy. Similarly, Wang et al. [33] devised a hybrid AE for land cover classification, where the Wishart distance and Euclidean distance were jointly applied to evaluate the reconstruction error between the input and the output according to the distribution of PolSAR data matrix. In Reference [34], Li et al. 
proposed a stacked Fisher AE for change detection, where the ratio difference image (RDI) of multi-temporal SAR images was used as the input and the distribution of the RDI was introduced to construct the objective function with sparsity regularization.

Although these AE-based models have developed an effective way to learn the robust representation via an unlabeled SAR dataset and achieved competitive results, the performance of most of these models is still slightly inferior to their supervised counterparts [35–39] and some handcrafted features [4,5,7] that are based on the electromagnetic scattering models. The major reasons include the following:


In this paper, a novel unsupervised multi-scale CAE (MSCAE) is proposed which can extract features at different scales and discard useless information of speckle and background clutter. The proposed model provides a framework to learn multi-scale features at two levels: the modality level feature learning achieved by the U-shaped structure and the branch level feature extracted by the pyramid pooling module (PPM). A modified objective function was devised to tackle the performance degradation caused by speckle. The reconstruction loss of the MSCAE was measured between the output and the input filtered by the improved Lee sigma filter (ILSF) [40] to alleviate the influence of serious speckle in SAR images. The structural similarity index metric (SSIM) was employed as the measurement of the reconstruction deviation, taking full advantage of the targets' characteristics such as the structure and the variation of backscattering intensity. An additional filter regularization term was also incorporated in the objective function that measures the dissimilarity of the encoded features of the raw data and the ILSF filtered inputs, guaranteeing that the speckle suppression procedure occurred during the encoding stage. Moreover, to handle the performance degradation caused by the limited training dataset, a new convolution layer, named compact depth-wise separable convolution (CSeConv) layer, and its deconvolution counterpart (CSeDeConv layer) were also developed to reduce the number of the trainable parameters in the model, alleviating overfitting caused by limited training samples.

The rest of this paper is organized as follows: Section 2 illustrates the key technologies used to build our MSCAE model, including the CSeConv and CSeDeConv, the PPM processing, and the specially designed objective function. Furthermore, the technical details of the network topology are also given. Section 3 conducts a series of comparative experiments based on the moving and stationary target acquisition and recognition (MSTAR) dataset [41,42]. The experimental results of the proposed network under the standard operating condition (SOC) and various EOCs are presented. Section 4 concludes our work.

#### **2. The Multi-Scale Convolution Auto-Encoder**

#### *2.1. Overall Structure of the MSCAE*

In this part, we discuss the characteristics and general layout of the proposed MSCAE. As shown in Figure 1, the MSCAE consists of a series of uniform modalities to learn representations and generate the reconstructed feature map at different modality levels. Each modality includes an encoder part and the corresponding decoder part.

In the encoder part of a modality, the CSeConv and the PPM module are applied to learn multi-scale features at branch level. The input feature map is convolved with a 5 × 5 CSeConv layer with a stride of 2, which means that both the width and the height of the output feature map are half those of the input. Subsequently, batch normalization (BN) and the rectified linear unit (ReLU) activation function are applied to the convolved feature map. The output feature map is then processed in two branches: one for multi-scale representation learning with the PPM module and the other, after being downsampled by a 2 × 2 max-pooling layer, for processing in the next modality. It should be noted that at the coarsest modality level, the PPM module is omitted and the convolution layer is optional: if the input feature map is larger than 5 × 5, the convolution layer is applied; otherwise, it is also omitted. The processed feature map is directly vectorized to form the feature vector of the coarsest level.
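As a rough illustration of the size bookkeeping described above, the following Python sketch traces the side length of a square feature map through the encoder. It assumes "same"-style padding so that the stride-2 convolution halves the size (rounding up), a stride of 2 for the 2 × 2 max-pooling, and a hypothetical 128 × 128 input with three modality levels; none of these values is stated in this section, and the function name `encoder_sizes` is purely illustrative.

```python
import math


def encoder_sizes(input_size, num_levels, kernel=5):
    """Return the feature-map side length produced at each modality level.

    Assumptions (not taken from the paper): square inputs, 'same' padding
    for the stride-2 convolution, and stride 2 for the 2x2 max-pooling.
    """
    sizes = []
    size = input_size
    for level in range(1, num_levels + 1):
        coarsest = level == num_levels
        # At the coarsest level the convolution is applied only when the
        # incoming map is larger than the kernel; otherwise it is skipped.
        if not coarsest or size > kernel:
            size = math.ceil(size / 2)  # stride-2 convolution halves the size
        sizes.append(size)
        if not coarsest:
            size = size // 2  # 2x2 max-pooling feeds the next modality
    return sizes


# Hypothetical example: a 128 x 128 image chip and three modality levels.
level_sizes = encoder_sizes(128, 3)
```

Under these assumptions each non-coarsest modality reduces the side length by a factor of four in total (a factor of two from the convolution and another from the pooling), which is why only a few modality levels fit on a small SAR chip.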

**Figure 1.** The overall architecture of the U-shaped multi-scale convolution auto-encoder. The feature maps learnt at each modality level are converted into vectors and concatenated to form the feature vector for SAR ATR. CSeConv stands for the compact depth-wise separable convolution layer; BN stands for batch normalization; ReLU stands for the rectified linear unit activation function; CSeDeConv denotes the compact separable deconvolution layer; PPM stands for the pyramid pooling module; FAM stands for the feature aggregation module; ILSF refers to the improved Lee sigma filter; SSIM refers to the structural similarity index metric.

In the decoder part of each modality, the feature aggregation module (FAM) is adopted to combine the feature vector learned by the PPM with the feature map reconstructed from the coarser modality level. The combined feature map is convolved by a 5 × 5 CSeDeConv layer with a stride 2 to upsample the feature map as well as reduce the channel number. At the first modality level, the reconstructed feature map is convolved with a 3 × 3 CSeConv layer followed by a sigmoid activation function, and the reconstructed image is generated.

The SSIM loss is measured between the reconstructed image and the image filtered by the 9 × 9 ILSF to diminish the influence of the speckle. Besides, a speckle suppression restriction is also computed to force the speckle suppression to take place during the feature learning procedure. To this end, an additional auxiliary data flow is fed to the encoder part of the proposed model, where the image filtered by the 9 × 9 ILSF is encoded with the same modules and parameters at each modality level. The feature vectors learned from the unfiltered image are compared with those learned from the filtered one at each level, and the resulting dissimilarities are summed to construct the speckle suppression restriction. The weighted sum of the SSIM loss and the speckle suppression restriction forms the objective function of the proposed model. Once the model is trained with the dataset, the encoder part is utilized to learn representations, and the feature vectors learned at each modality level are concatenated to generate the final feature vector for SAR ATR.

#### *2.2. Compact Depth-Wise Separable Convolution and the Corresponding Deconvolution*

The convolution layer, which is the basic structure in CNNs and CAEs, has the capability of capturing local patterns of input data and generating new representations of jointly encoding space and channel information. As presented in Figure 2, the standard convolution layer [43] creates *Cout* trainable convolution kernels that are convolved with the *Win* × *Hin* × *Cin* input *Fin* to produce a *Wout* × *Hout* × *Cout* feature map *Fout*. Here, *Win* × *Hin* and *Wout* × *Hout* are the spatial size of the input and the output feature maps, respectively; *Cin* and *Cout* are the channels of *Fin* and *Fout*, respectively. The size of each trainable convolution kernel is *Nk* × *Nk* × *Cin* with *Nk* being the size of the sliding filter.

**Figure 2.** The convolution procedure of a standard convolution layer. Given a *Win* × *Hin* × *Cin* feature map *Fin* with *Win* × *Hin* being the spatial size and *Cin* being the number of channels of the input feature map, respectively, the standard convolution layer utilizes *Cout* trainable convolution kernels whose size is *Nk* × *Nk* × *Cin* to produce the *Wout* × *Hout* × *Cout* feature map *Fout* with *Wout* × *Hout* being the spatial size and *Cout* being the number of channels of *Fout*, respectively.

In comparison with the fully connected layer, the number of trainable parameters in a convolution layer is much smaller due to the shared convolution kernels, which substantially decreases the computational cost and improves the performance with a small dataset. However, in large-scale deep networks, where the size of the convolution kernels is quite large and the channel number rises rapidly as the depth of the network increases, the high computational cost and overfitting caused by massive trainable parameters are still major causes of performance deterioration. To diminish these problems, various factorized convolution operators are utilized.

The depth-wise (DW) separable convolution (SeConv) [44], presented in Figure 3, is a typical factorized convolution operator in channel level which factorizes the standard convolution into two steps via the DW convolution and the pointwise (PW) convolution. In the DW convolution step, *Cin* filters with the size of *Nk* × *Nk* × 1 are applied to every input channel of the *Win* × *Hin* × *Cin* feature map *Fin* and produce the intermediate feature map that has the same number of channels as that of the inputs. Subsequently, the PW convolution utilizes *Cout* filters with the size of 1 × 1 × *Cin* to combine the output of the depth-wise layer and produce the final output feature map *Fout*.
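The two-step factorization can be made concrete with a small, dependency-free sketch. The code below is only an illustration under simplifying assumptions (stride 1, no padding, no bias, cross-correlation as in most DL frameworks, nested Python lists instead of tensors); the function names and the toy data are hypothetical. It also builds the standard convolution kernels that a given DW/PW pair is equivalent to, which shows that the DW SeConv is a constrained form of the standard convolution.

```python
def depthwise_conv(x, dw):
    """Filter each channel x[c] with its own k x k kernel dw[c] (valid, stride 1)."""
    k = len(dw[0])
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(xc[i + u][j + v] * kc[u][v]
                  for u in range(k) for v in range(k))
              for j in range(w - k + 1)]
             for i in range(h - k + 1)]
            for xc, kc in zip(x, dw)]


def pointwise_conv(x, pw):
    """Mix channels with C_out filters of size 1 x 1 x C_in, given as pw[o][c]."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(row[c] * x[c][i][j] for c in range(len(x)))
              for j in range(w)]
             for i in range(h)]
            for row in pw]


def standard_conv(x, kernels):
    """Standard convolution with kernels[o][c][u][v] (valid, stride 1)."""
    k = len(kernels[0][0])
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(x[c][i + u][j + v] * ker[c][u][v]
                  for c in range(len(x))
                  for u in range(k) for v in range(k))
              for j in range(w - k + 1)]
             for i in range(h - k + 1)]
            for ker in kernels]


# Toy data: C_in = 2, 4 x 4 input, N_k = 3, C_out = 3 (all values hypothetical).
x = [[[(c + 1) * (4 * i + j) % 7 for j in range(4)] for i in range(4)]
     for c in range(2)]
dw = [[[1, 0, -1], [2, 0, -2], [1, 0, -1]],
      [[0, 1, 0], [1, -4, 1], [0, 1, 0]]]
pw = [[1, 2], [0, 1], [3, -1]]

y = pointwise_conv(depthwise_conv(x, dw), pw)

# Equivalent standard kernels: the product of each PW weight and DW kernel.
fused = [[[[pw[o][c] * dw[c][u][v] for v in range(3)] for u in range(3)]
          for c in range(2)] for o in range(3)]
```

In this toy case the separable layer stores 2 × 9 + 3 × 2 = 24 weights, whereas the equivalent standard layer stores 3 × 2 × 9 = 54.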

Another factorized convolution layer is the kernel decomposition convolution (DeCConv) layer proposed by Simonyan and Zisserman [45] which decomposes the large convolution kernel into a series of 3 × 3 small kernels as depicted in Figure 4. Specifically, a large convolution kernel with the size *Nk* × *Nk* is approximated by *M* cascaded 3 × 3 filters, where the number of the 3 × 3 filters is determined by:

$$M = (N\_k - 1) / 2 \tag{1}$$
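Equation (1) follows from the fact that every additional 3 × 3 layer (stride 1) enlarges the receptive field by two pixels, so that *M* stacked layers cover 2*M* + 1 = *Nk* pixels. A minimal Python check of this relationship is sketched below; the helper names are illustrative, and odd kernel sizes of at least 3 are assumed.

```python
def num_stacked_3x3(n_k):
    """Number M of cascaded 3x3 filters approximating an n_k x n_k kernel, Eq. (1)."""
    if n_k < 3 or n_k % 2 == 0:
        raise ValueError("the kernel size is assumed to be odd and at least 3")
    return (n_k - 1) // 2


def receptive_field(m):
    """Receptive field (in pixels) of m stacked 3x3 convolutions with stride 1."""
    return 2 * m + 1
```

For example, a 5 × 5 kernel corresponds to two stacked 3 × 3 filters and a 7 × 7 kernel to three.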

**Figure 3.** The convolution procedure of the depth-wise (DW) separable convolution (SeConv) layer that decomposes the standard convolution into two steps: the DW convolution and the point-wise (PW) convolution. During the DW convolution process, each channel of the *Win* × *Hin* × *Cin* feature map, *Fin*, is convolved with a filter the size of *Nk* × *Nk* × 1 to generate an intermediate feature map that has *Cin* channels. Subsequently, *Cout* PW filters with the size 1 × 1 × *Cin* are adopted to combine the output of the depth-wise layer and produce the final output feature map, *Fout*, with the size of *Wout* × *Hout* × *Cout*

**Figure 4.** The procedure of the kernel decomposition convolution layer that decomposes the large convolution kernels into *M* stacked 3 × 3 filters. Given the *Win* × *Hin* × *Cin* feature map, *Fin*, *Cout* small convolution kernels with the size of 3 × 3 × *Cin* are applied to generate the first intermediate feature map with the size of *W*<sup>1</sup> × *H*<sup>1</sup> × *Cout*. Subsequently, the first intermediate feature map is convolved successively with *Cout* small convolution kernels with the size of 3 × 3 × *Cout* for *M* − 1 times, and the output feature map *Fout* with the size of *Wout* × *Hout* × *Cout* is produced.

During the convolution procedure with stacked 3 × 3 filters, activation functions can be employed after each convolution operator. Given the *Win* × *Hin* × *Cin* feature map, *Fin*, *Cout* small convolution kernels with the size of 3 × 3 × *Cin* are applied to generate the first intermediate feature map with the size of *W*<sup>1</sup> × *H*<sup>1</sup> × *Cout*. Subsequently, the first intermediate feature map is convolved successively with *Cout* small convolution kernels with the size of 3 × 3 × *Cout* for *M* − 1 times and the output feature map *Fout* with the size of *Wout* × *Hout* × *Cout* is produced. It is reported that this scheme could not only significantly decrease the trainable parameters and computational cost but also improve the representation learning capability of the convolution layer due to the increasing nonlinearity induced by the activation function of the cascaded 3 × 3 convolution layers.

Although these schemes significantly decrease the trainable parameters and computational cost, the small benchmarks in SAR ATR still limit the application of deeper and more complicated models. In this paper, a more compact convolution layer and its deconvolution counterpart are proposed. The proposed layers combine the kernel decomposition scheme and the DW SeConv scheme, thereby requiring a smaller number of trainable parameters as well as introducing more nonlinearity for better representation learning. The proposed compact DW separable convolution (CSeConv) process is depicted in Figure 5a. Similar to the DW SeConv layer, the standard convolution is split into two steps: the DW convolution for separable convolution at the channel level and the PW convolution to combine the filtered features of all channels. Besides, the kernel decomposition scheme is also adopted in the DW convolution step, as each DW convolution can be considered as a single-channel input convolved with only one kernel. Each large DW kernel is decomposed into a series of 3 × 3 filters, each of which is followed by a nonlinear activation function to provide additional nonlinearity [45]. Accordingly, the trainable parameters can be further decreased by the combined scheme.

**Figure 5.** The proposed compact depth-wise separable convolution (CSeConv) layer and the corresponding compact separable deconvolution (CSeDeConv) layer, which combine the depth-wise (DW) separable convolution/deconvolution scheme and the kernel decomposition scheme to reduce the trainable parameters. (**a**) The procedure of the CSeConv layer; (**b**) the details of the CSeDeConv layer. The sizes of the input feature map and output feature map in (**a**) and (**b**) are *Win* × *Hin* × *Cin* and *Wout* × *Hout* × *Cout*, respectively.

Its deconvolution counterpart, i.e., the compact separable deconvolution (CSeDeConv) layer presented in Figure 5b, is devised in the same manner and is composed of two steps: the DW separable deconvolution and the channel-level combination DeConv. In the first step, the deconvolution operator is applied to each channel of the *Win* × *Hin* × *Cin* input feature map, *Fin*. The kernel decomposition scheme is also employed to split the *Nk* × *Nk* deconvolution kernel into *M* − 1 concatenated 3 × 3 filters at the channel level, where *M* is determined according to (1). Subsequently, *Cout* deconvolution filters with the size of 3 × 3 × *Cin* are utilized to combine the output of the DW separable deconvolution step and generate the output feature map *Fout* with the size of *Wout* × *Hout* × *Cout*. It should be noted that if the stride of either the CSeConv or the CSeDeConv is larger than 1, the convolution/deconvolution operator with the given stride is implemented in the last channel-level combination step.

To validate the CSeConv and the CSeDeConv, the Modified National Institute of Standards and Technology (MNIST) database of handwritten digits was utilized for evaluation. A three-layer CAE model with one standard convolution layer and one deconvolution layer was employed as the baseline model for comparison. Both the convolution layer and the deconvolution layer had four 5 × 5 filters, and the strides of both layers were 4. The trainable parameters were initialized with the He initialization [46], while the activation functions of both the convolution and deconvolution layers were ReLU. In the experiment, the convolution and deconvolution layers were replaced by the proposed CSeConv and CSeDeConv, respectively. Accordingly, four CAEs could be generated for comparison: the baseline CAE, the CAE with the CSeConv layer (CCAE), the CAE with the CSeDeConv layer (CDCAE), and the compact CAE with both the CSeConv layer and the CSeDeConv layer (CompactCAE). The original images and the reconstructed results of the four CAE models are shown in Figure 6a, and the training losses of the four models are compared in Figure 6b. As shown in Figure 6, both the reconstruction results and the training losses of the four CAEs were approximately the same, which demonstrates the validity of the proposed CSeConv layer and CSeDeConv layer.

**Figure 6.** Validation experiments with the Modified National Institute of Standards and Technology (MNIST) dataset. (**a**) The reconstruction results of the baseline convolution auto-encoder (CAE), the CAE with the compact depth-wise separable convolution (CSeConv) layer (CCAE), the CAE with the compact separable deconvolution (CSeDeConv) layer (CDCAE), and the compact CAE with the CSeConv layer and CSeDeConv layer (CompactCAE) from the top row to the bottom row. (**b**) The training losses of the four models.

A brief analysis of the trainable parameters and computational consumption of various convolution layers is given in Table 1. In our comparison, the size of the input feature map is supposed to be *Win* × *Hin* with *Cin* channels, and the size of the convolution kernel is *Nk* × *Nk* × *Cin*. In order to simplify the analysis, the stride of the convolution is assumed to be 1, and the padding mode of the convolution is set so that the input and output feature maps have the same spatial size. Consequently, the size of the output feature map is *Win* × *Hin* with *Cout* channels. Besides, the addition of feature aggregation is also ignored, as in Reference [36], when the computational consumption of different convolution layers is compared. The number of trainable parameters *Kparam*, the computational consumption *Lcomp*, and the ratio of computational consumption between the improved convolution layer and the standard convolution, *Ropt* = *L*<sup>Other</sup><sub>comp</sub>/*L*<sup>Standard</sup><sub>comp</sub>, are all listed in Table 1. It can easily be found that the number of trainable parameters and the computational consumption of the proposed layer are effectively reduced compared with the standard convolution and other mainstream convolution layers. Moreover, the reduction of the trainable parameters and the ratio of computational consumption are only related to the number and size of the convolution kernels.
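To give a feeling for the orders of magnitude involved, the following Python sketch counts the trainable weights of the four layer types as they are described in the text. It is not a reproduction of Table 1: the formulas are derived here under the assumptions that biases and BN parameters are ignored, that the PW step of the CSeConv uses 1 × 1 × *Cin* filters, and that the kernel decomposition layer follows the structure of Figure 4 (a first 3 × 3 × *Cin* stage followed by *M* − 1 stages of size 3 × 3 × *Cout*). The function names and the example sizes are hypothetical.

```python
def num_stacked_3x3(n_k):
    """Number M of cascaded 3x3 filters that replace an n_k x n_k kernel, Eq. (1)."""
    return (n_k - 1) // 2


def params_standard(n_k, c_in, c_out):
    """C_out kernels of size N_k x N_k x C_in."""
    return n_k * n_k * c_in * c_out


def params_dw_separable(n_k, c_in, c_out):
    """C_in depth-wise N_k x N_k filters plus C_out point-wise 1 x 1 x C_in filters."""
    return n_k * n_k * c_in + c_in * c_out


def params_kernel_decomposition(n_k, c_in, c_out):
    """One 3x3xC_in stage followed by (M - 1) stages of size 3x3xC_out."""
    m = num_stacked_3x3(n_k)
    return 9 * c_in * c_out + (m - 1) * 9 * c_out * c_out


def params_cseconv(n_k, c_in, c_out):
    """M depth-wise 3x3 filters per channel plus the point-wise combination (assumed 1x1)."""
    m = num_stacked_3x3(n_k)
    return 9 * m * c_in + c_in * c_out


# Hypothetical layer: 5 x 5 kernel, 32 input channels, 32 output channels.
counts = {
    "standard": params_standard(5, 32, 32),
    "dw_separable": params_dw_separable(5, 32, 32),
    "kernel_decomposition": params_kernel_decomposition(5, 32, 32),
    "cseconv": params_cseconv(5, 32, 32),
}
```

Under these assumptions the combined scheme needs the fewest weights of the four, and the counts depend only on the kernel size and the channel numbers, not on the spatial size of the feature map.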


**Table 1.** Analysis of the trainable parameters and computational consumption of the proposed convolution layer and some mainstream convolution layers.

#### *2.3. Multi-Scale Representation Learning with Pyramid Pooling Module and Feature Aggregation Module*

#### 2.3.1. Pyramid Pooling Module for Multi-Scale Feature Extraction

In most convolution-based deep networks, the spatial pooling operator is utilized as a crucial element to fuse characteristics of nearby feature bins into a compact representation. The objective of the spatial pooling process is to transform the joint feature representation into a new, compressed, and more effective one that preserves discriminative information while discarding irrelevant detail, the crux of which is to determine what can benefit the classification performance. Various pooling operators have been devised based on the sum, the average, the maximum, or some other combination rules and have achieved significant success in computer vision and SAR ATR tasks [47]. However, most spatial pooling operators obtain the compact representation with a fixed-size receptive field, which may not match the structure of the intrinsic characteristics and leads to either information loss if the field is too large or feature dilution if it is too small. Besides, for targets with a complicated characteristic structure, fixed-size pooling operators that can only learn features at a fixed scale are also a major cause of incomplete representation learning and the consequent performance degradation.

To tackle the problem caused by the fixed-size pooling operators, the pyramid pooling module (PPM) was devised, which was first adopted to generate fixed-length representations from inputs with varying sizes for deep visual recognition [48]. The PPM provides an effective way to obtain intrinsic characteristics of complicated targets from the view of multiple scales. In this paper, a modified version of the PPM is devised and adopted in the proposed model to obtain a multi-scale representation at each modality level. As depicted in Figure 7, a typical PPM in the proposed model consists of four sub-branches with varying local receptive fields for pooling to capture the context information of the input feature maps. The first and the last sub-branches are global max-pooling layers at the channel level and the feature map level, respectively. Each of the two middle sub-branches consists of an adaptive max-pooling layer and a 3 × 3 CSeConv layer followed by a BN operator and a ReLU activation function. To be more specific, let us suppose the size of the input feature map is *Win* × *Hin* × *Cin*. Accordingly, the sizes of the output feature maps of the first and the last global max-pooling layers are 1 × 1 × *Cin* and *Win* × *Hin* × 1, respectively. The sizes of the output feature maps of the adaptive max-pooling layers in the two middle branches are *Win*/2 × *Hin*/2 × *Cin* and *Win*/4 × *Hin*/4 × *Cin*. The subsequent CSeConv layer in each branch is employed to compress the pooled multi-channel feature map into a single-channel feature map, i.e., the sizes of the output feature maps of the CSeConv layers in the two middle sub-branches are *Win*/2 × *Hin*/2 × 1 and *Win*/4 × *Hin*/4 × 1. Finally, the output feature maps of the four sub-branches are converted to a single column vector and concatenated to construct the representation at the current modality level. It should be noted that if the size of the input feature map is too small to obtain the feature maps in all four sub-branches, part of the branches can be removed from the typical PPM and the corresponding feature maps can be neglected according to the image size.
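The shape bookkeeping of the four sub-branches can be sketched in a few lines of plain Python. The code below is only an illustration: the adaptive max-pooling is replaced by non-overlapping 2 × 2 and 4 × 4 max-pooling (equivalent when the spatial size is divisible by 4), and the trained 3 × 3 CSeConv + BN + ReLU that compresses the channels is replaced by a simple channel mean, which is a placeholder and not the operation used in the proposed model. All function names and the toy input are hypothetical.

```python
def max_pool(channel, f):
    """Non-overlapping f x f max-pooling of one 2-D channel (size divisible by f)."""
    h, w = len(channel), len(channel[0])
    return [[max(channel[i + u][j + v] for u in range(f) for v in range(f))
             for j in range(0, w, f)]
            for i in range(0, h, f)]


def channel_mean(maps):
    """Placeholder for the trained CSeConv + BN + ReLU: collapse C channels into one."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(m[i][j] for m in maps) / len(maps) for j in range(w)]
            for i in range(h)]


def flatten(feature_map):
    return [value for row in feature_map for value in row]


def ppm_vector(x):
    """Concatenate the outputs of the four sub-branches for an input x[c][i][j]."""
    h, w = len(x[0]), len(x[0][0])
    branch1 = [max(flatten(ch)) for ch in x]                    # 1 x 1 x C
    branch2 = channel_mean([max_pool(ch, 2) for ch in x])       # W/2 x H/2 x 1
    branch3 = channel_mean([max_pool(ch, 4) for ch in x])       # W/4 x H/4 x 1
    branch4 = [[max(ch[i][j] for ch in x) for j in range(w)]    # W x H x 1
               for i in range(h)]
    return branch1 + flatten(branch2) + flatten(branch3) + flatten(branch4)


# Toy input with C = 3 channels and an 8 x 8 spatial size.
x = [[[c * 64 + i * 8 + j for j in range(8)] for i in range(8)] for c in range(3)]
vec = ppm_vector(x)
```

For a *W* × *H* × *C* input, the resulting vector has *C* + *WH*/4 + *WH*/16 + *WH* = (21/16)*WH* + *C* entries, which is 87 for the toy input above.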

**Figure 7.** The architecture of the proposed pyramid pooling module (PPM) for multi-scale representation learning at each modality level. Given an input feature map with the size of *Win* × *Hin* × *Cin*, the output feature maps of the four sub-branches of the PPM have the sizes of 1 × 1 × *Cin*, *Win*/4 × *Hin*/4 × *Cin*, *Win*/2 × *Hin*/2 × *Cin*, and *Win* × *Hin* × 1.

#### 2.3.2. Feature Aggregation Module (FAM) for Feature Map Reconstruction

The utilization of our PPMs in the encoder stage allows the model to learn multi-scale representations from the input SAR image at different modality levels. However, a new problem that deserves to be solved is how to seamlessly merge the feature maps from PPMs at different modality levels and obtain the reconstructed image that is the essential objective of an AE model. To this end, a series of FAMs are developed each of which contains two parts as illustrated in Figure 8.

In the first part, the feature maps at different scales of the PPM are combined to produce a new feature map at the current modality level. The multi-scale feature vector at one modality level obtained by a PPM is first decomposed and reshaped into feature maps of different scales according to the PPM at the same modality level. Subsequently, the feature maps of different scales are processed in separate sub-branches. For two middle sub-branches, an upsampling operator and a smooth operator consisting of a 3 × 3 CSeDeConv layer, a BN operator and a ReLU activation function are executed. For the first and the last sub-branches, only the upsampling operator and the smooth operator are applied, respectively. Finally, the processed feature maps of different scales are weighted and summed to generate the feature map of the current modality level. In the second part, the feature map from the coarser level is merged with the combined multi-scale feature map at the current level. The upsampling process followed by a 3 × 3 CSeDeConv layer, a BN operator and a ReLU activation function is applied to obtain the feature map of a coarser level that is the same size as the feature map at the current level. Subsequently, feature maps from different levels are concatenated together to generate the merged feature map.

**Figure 8.** The architecture of the proposed feature aggregation module (FAM) for merging multi-scale representations of different modality levels, which consists of two parts. The first part of the proposed module combines multi-scale feature maps and generates a new feature map at the current modality level. The input feature vector with the size of (21/16)*Wcur* × *Hcur* + *Ccur* is decomposed and reshaped to produce the feature maps in the four sub-branches with the sizes of 1 × 1 × *Ccur*, *Wcur*/2 × *Hcur*/2 × 1, *Wcur*/4 × *Hcur*/4 × 1, and *Wcur* × *Hcur* × 1. Subsequently, the upsampling operator and the following smoothing operator are applied to each sub-branch. The output feature maps of the four sub-branches are weighted and added to generate the feature map of the first part with the size of *Wcur* × *Hcur* × *Ccur*. The second part of the FAM merges the feature map with the size of *Wpre* × *Hpre* × *Cpre* from the coarser modality level with the combined feature map with the size of *Wcur* × *Hcur* × *Ccur* at the current modality level. The input feature map at the coarser level is upsampled and smoothed to generate the upsampled feature map of the coarser level with the size of *Wcur* × *Hcur* × *Cpre*. The upsampled feature map is concatenated with the feature map at the current modality level, generating the new feature map with the size of *Wcur* × *Hcur* × (*Cpre* + *Ccur*).

To be more specific, in the first step, suppose the input feature vector at the current modality level has the size of $\frac{21}{16}W_{cur} \times H_{cur} + C_{cur}$. The input multi-scale feature vector is decomposed and reshaped into feature maps with the sizes of $1 \times 1 \times C_{cur}$, $\frac{W_{cur}}{2} \times \frac{H_{cur}}{2} \times 1$, $\frac{W_{cur}}{4} \times \frac{H_{cur}}{4} \times 1$, and $W_{cur} \times H_{cur} \times 1$, respectively. Subsequently, the upsampling operator and the following smoothing operator are applied. Accordingly, the output feature maps of the four sub-branches all have the same size of $W_{cur} \times H_{cur} \times C_{cur}$. Finally, the four feature maps are weighted and added to generate the feature map of the current modality level with the size of $W_{cur} \times H_{cur} \times C_{cur}$. In the second step, let the input feature map at the coarser level have the size of $W_{pre} \times H_{pre} \times C_{pre}$, with $W_{pre} \times H_{pre}$ being the spatial size of the input feature map and $C_{pre}$ being the number of channels at the coarser level. The feature map from the coarser level is upsampled and smoothed by a 3 × 3 CSeDeConv layer, a BN operator, and a ReLU activation function. Finally, the upsampled feature map of the coarser level with the size of $W_{cur} \times H_{cur} \times C_{pre}$ is concatenated with the feature map at the current modality level, generating the new feature map with the size of $W_{cur} \times H_{cur} \times (C_{pre} + C_{cur})$.
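The size bookkeeping of the first FAM step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the order of the four pieces inside the vector, the nearest-neighbour upsampling (a stand-in for the learned upsampling and smoothing operators), and the equal weights are all assumptions made for illustration.

```python
import numpy as np

def split_multiscale_vector(vec, W, H, C):
    """Split a vector of length (21/16)*W*H + C into the four sub-branch maps.

    The ordering of the pieces inside the vector is an assumption.
    """
    sizes = [C, (W // 2) * (H // 2), (W // 4) * (H // 4), W * H]
    assert vec.size == sum(sizes), "unexpected vector length"
    pieces = np.split(vec, np.cumsum(sizes)[:-1])
    return [
        pieces[0].reshape(1, 1, C),
        pieces[1].reshape(W // 2, H // 2, 1),
        pieces[2].reshape(W // 4, H // 4, 1),
        pieces[3].reshape(W, H, 1),
    ]

def upsample_to(fmap, W, H, C):
    """Nearest-neighbour upsampling followed by broadcasting to W x H x C.

    Stand-in (assumption) for the learned upsampling and smoothing operators.
    """
    w, h, _ = fmap.shape
    up = np.repeat(np.repeat(fmap, W // w, axis=0), H // h, axis=1)
    return np.broadcast_to(up, (W, H, C)).copy()

def aggregate(vec, W, H, C, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four upsampled sub-branch maps (weights are hypothetical)."""
    maps = split_multiscale_vector(vec, W, H, C)
    return sum(wt * upsample_to(m, W, H, C) for wt, m in zip(weights, maps))
```

For example, with $W_{cur} = H_{cur} = 16$ and $C_{cur} = 16$ the vector has $\frac{21}{16} \cdot 256 + 16 = 352$ entries, and `aggregate` returns a 16 × 16 × 16 feature map.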

#### *2.4. Loss Function Based on the Modified Reconstruction Loss and Speckle Filtering Restriction*

Typically, an AE-based model provides a symmetrical framework for learning latent representations of candidate targets by mapping the inputs into a low-dimensional feature space at the encoder stage and approximately reconstructing the inputs from the learned features at the decoder stage. The objective of the AE-based model is to minimize a loss function that measures the distortion between the inputs and the outputs, guaranteeing that the mapping process preserves the information of the inputs. The loss functions commonly used in deep learning, such as the MSE, the cross-entropy, and the Minkowski distance, concern the total bias of pixel values or distributions and neglect the structural information of the candidate targets, leading to performance degradation in SAR ATR. Therefore, the proposed model employs the SSIM loss function [49], which simultaneously compares the similarity of two images in terms of structure, luminance, and contrast and has gained significant success in the computer vision domain. Suppose the input image of an AE-based model is $\mathbf{x}$ and the output of the model is $\hat{\mathbf{x}}$; the SSIM loss function can then be written as:

$$L_{SSIM}(\mathbf{x}, \hat{\mathbf{x}}) = E \left\{ \frac{(2\mu_{\mathbf{x}}\mu_{\hat{\mathbf{x}}} + c_1)(2\sigma_{\mathbf{x}\hat{\mathbf{x}}} + c_2)}{(\mu_{\mathbf{x}}^2 + \mu_{\hat{\mathbf{x}}}^2 + c_1)(\sigma_{\mathbf{x}}^2 + \sigma_{\hat{\mathbf{x}}}^2 + c_2)} \right\}, \tag{2}$$

where $\mu_{\mathbf{x}}$ and $\mu_{\hat{\mathbf{x}}}$ are the local averages in an 11 × 11 sliding window of $\mathbf{x}$ and $\hat{\mathbf{x}}$, respectively; $\sigma_{\mathbf{x}}^2$ and $\sigma_{\hat{\mathbf{x}}}^2$ are the local variances in an 11 × 11 sliding window of $\mathbf{x}$ and $\hat{\mathbf{x}}$, respectively; $\sigma_{\mathbf{x}\hat{\mathbf{x}}}$ is the local covariance of $\mathbf{x}$ and $\hat{\mathbf{x}}$ in an 11 × 11 sliding window; $c_1 = (K_1 L)^2$ and $c_2 = (K_2 L)^2$ are two constants with $K_1 = 0.01$ and $K_2 = 0.03$; $L$ is the dynamic range of the pixel values (1.0 for normalized SAR images); and $E(\cdot)$ is the expectation operator. When calculating the SSIM loss of two images, the sliding window is moved pixel by pixel over the entire image. At each step, the local statistics and the local SSIM loss are computed in the window. Finally, the SSIM of the entire image is computed by averaging the local SSIM values of all steps.
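The sliding-window computation described above can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' code: it assumes an unweighted (uniform) 11 × 11 window, population statistics inside each window, and no padding at the image border.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def ssim(x, x_hat, win=11, K1=0.01, K2=0.03, L=1.0):
    """Mean of the local SSIM values of Equation (2) over all window positions."""
    c1, c2 = (K1 * L) ** 2, (K2 * L) ** 2
    # All win x win windows, moved pixel by pixel over the image.
    wx = sliding_window_view(x, (win, win))
    wy = sliding_window_view(x_hat, (win, win))
    mu_x = wx.mean(axis=(-2, -1))
    mu_y = wy.mean(axis=(-2, -1))
    var_x = wx.var(axis=(-2, -1))
    var_y = wy.var(axis=(-2, -1))
    cov = (wx * wy).mean(axis=(-2, -1)) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    # Average the local values over the entire image.
    return float((num / den).mean())
```

Two identical images give a value of 1, and the value decreases as the local structure of the two images diverges.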

Another problem is that, in most conditions, the target patches contain serious speckle, which carries little information about the target and degrades the ATR capability of the learned features. To alleviate its influence, the reconstruction loss is modified to measure the distortion between the outputs and the speckle-filtered images instead of the original input data. Consequently, the model is forced to learn the characteristics of the targets rather than the background clutter, and the prior knowledge of speckle suppression is embedded in the MSCAE during the model training procedure. In the proposed model, the ILSF is employed to generate the speckle-suppressed images because of its excellent capability in maintaining detailed structures, preserving strongly reflecting and scattering targets, and smoothing undesired background clutter [50]. Moreover, an additional restriction is devised to guarantee that the speckle suppression takes place in the encoder stage and that little information on the speckle is learned. The restriction is implemented by comparing the features learned from the original inputs with those learned from the speckle-suppressed images. Accordingly, the loss function of the proposed model is

$$L_{MSCAE} = \frac{1}{N} \sum_{i=1}^{N} L_{SSIM}\left(\hat{\mathbf{x}}_i, \mathbf{x}_i^{ILSF}\right) + \alpha \sum_{j=1}^{C} \left\| h_{ij} - h_{ij}^{ILSF} \right\|_2 \tag{3}$$

where $D_{train} = \{\mathbf{x}_i\}_{i=1}^{N}$ is the training dataset, with $\mathbf{x}_i$ being the $i$th target patch and $N$ being the number of samples; $\hat{\mathbf{x}}_i$ and $\mathbf{x}_i^{ILSF}$ are the output image of the proposed model and the speckle-suppressed version of $\mathbf{x}_i$, respectively; $\|\cdot\|_2$ is the $\ell_2$ norm; $h_{ij}$ and $h_{ij}^{ILSF}$ are the encoded feature vectors of $\mathbf{x}_i$ and $\mathbf{x}_i^{ILSF}$ at scale $j$, respectively; $C$ is the number of modality levels; and $\alpha$ is the coefficient of the speckle suppression restriction, which can be set to $0.01/C$ in most SAR ATR tasks.
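As a rough illustration of how the two terms of Equation (3) combine, the sketch below evaluates the objective from precomputed quantities. The function name and argument layout are hypothetical, and it assumes that the restriction term, whose index $i$ is free in Equation (3), is averaged over the samples together with the SSIM term.

```python
import numpy as np

def mscae_objective(ssim_terms, feats, feats_ilsf, alpha=None):
    """Evaluate Equation (3) from precomputed terms.

    ssim_terms: length-N sequence of L_SSIM(x_hat_i, x_i^ILSF) values.
    feats, feats_ilsf: lists of C arrays of shape (N, d_j), the encoded
        features of the original and speckle-filtered patches at each level.
    alpha: restriction coefficient; defaults to 0.01 / C as suggested in the text.
    """
    C = len(feats)
    if alpha is None:
        alpha = 0.01 / C
    ssim_terms = np.asarray(ssim_terms, dtype=float)
    penalty = np.zeros_like(ssim_terms)
    for h, h_ilsf in zip(feats, feats_ilsf):
        # l2 distance between the two feature vectors of every sample
        penalty += np.linalg.norm(np.asarray(h) - np.asarray(h_ilsf), axis=1)
    # Assumption: both terms are averaged over the N samples.
    return float(np.mean(ssim_terms + alpha * penalty))
```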

#### **3. Experiments and Discussion**

#### *3.1. Experimental Data Sets*

In this study, the representation learning capability of the proposed model was evaluated on the MSTAR dataset [42], which is jointly sponsored by the US Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL). There are a total of ten distinct types of vehicles in the dataset, as shown in Figure 9, including the armored personnel carriers BMP-2, BRDM-2, BTR-60, and BTR-70; the tanks T-62 and T-72; the rocket launcher 2S1; the air defense unit ZSU-234; the truck ZIL-131; and the bulldozer D7. The images were collected by an X-band SAR in spotlight mode with a resolution of 0.3 m × 0.3 m and split into tens of thousands of small patches centered on the candidate targets and surrounded by varying background clutter. These small patches provide full-aspect coverage from 0° to 360° and different views at various depression angles for each of the ten vehicle types. Detailed information, including the type, the serial number (Serial No.), the depression angle, and the number of samples, is listed in Table 2.

**Figure 9.** Photographs (the first row) and SAR imagery examples (the second row) of the moving and stationary target acquisition and recognition (MSTAR) dataset for model evaluation. From left to right, the types of vehicles are 2S1, BMP-2, BRDM-2, BTR-60, BTR-70, D7, T-62, T-72, ZIL-131, and ZSU-234.


**Table 2.** Detailed information of the MSTAR dataset.

To ensure a comprehensive assessment of its performance, the proposed MSCAE was tested under the standard operating condition (SOC) and various extended operating conditions (EOCs), including substantial variations in the signal-to-noise ratio (SNR), resolution, and version. In our experiments, the proposed model was first evaluated on three similar targets, namely, BMP-2, BTR-70, and T-72, to validate its performance under SOC and version variants. Subsequently, the contributions of the SSIM loss, the PPM, the CSeConv and CSeDeConv, and the speckle suppression scheme were examined on the three-target dataset. Experiments on the ten-class MSTAR data were also conducted to evaluate the performance when the number of target types is extended. Finally, the robustness of the proposed model under various conditions, including noise corruption and resolution variance, was evaluated with the ten-target dataset. For both the three-target dataset and the ten-target dataset, the patches acquired at the 17° depression angle were utilized as training samples, while those obtained at the 15° depression angle constituted the test set. Similar to the experimental setting in References [30,31], only the data from BMP2-9563 and T72-132 were used as the BMP-2 and T-72 samples in the training dataset. In the test dataset, however, images of all serial numbers (i.e., version variants) were used to test the performance of the proposed method.

#### *3.2. Experiment Configuration*

#### 3.2.1. Data Preprocessing

In most deep networks, including the proposed model, the size of the input images is required to be the same. Meanwhile, the size of target patches in the MSTAR dataset can vary from 128 × 128 to 158 × 158. Consequently, the input target patches should be resized to the same shape, which is 128 × 128 in this study, before being used for model training and performance validation. In this study, the image crop processing based on the centroid of the target region is adopted. The ILSF is firstly applied to suppress the speckle and background clutter in the small patches. Subsequently, a two-parameter constant false alarm rate (CFAR) detector is executed to obtain the target region of each patch. The centroid is calculated by averaging the coordinates of the target pixels weighted by their pixel values. Finally, only the 128 × 128 region surrounding the centroid will remain, while other regions will be removed.
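The centroid-based cropping step can be sketched as follows. The sketch assumes that the speckle-filtered patch and the binary target mask produced by the CFAR detector are already available; the function name, the clamping of the window to the image border, and the fallback to the geometric center for an empty mask are illustrative assumptions.

```python
import numpy as np

def crop_around_centroid(patch, target_mask, size=128):
    """Crop a size x size region centered on the intensity-weighted target centroid."""
    weights = patch * target_mask            # pixel values of the target pixels only
    total = weights.sum()
    rows, cols = np.indices(patch.shape)
    if total > 0:
        r_c = int(round((rows * weights).sum() / total))
        c_c = int(round((cols * weights).sum() / total))
    else:
        # Assumption: fall back to the geometric center if no target pixel is detected.
        r_c, c_c = patch.shape[0] // 2, patch.shape[1] // 2
    half = size // 2
    # Assumption: shift the window so that it stays inside the image.
    r0 = min(max(r_c - half, 0), patch.shape[0] - size)
    c0 = min(max(c_c - half, 0), patch.shape[1] - size)
    return patch[r0:r0 + size, c0:c0 + size]
```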

Another preprocessing step is normalization. In many target patches, the intensity of the targets varies considerably, which can conceal the differences among targets and, thus, affect the performance of the learned features. Accordingly, intensity normalization was adopted to alleviate the amplitude variation in target patches, mapping the pixel intensities onto the range [0, 1]. Apart from image resizing and normalization, no other preprocessing, such as data augmentation (DA) or target segmentation, was applied.
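The text does not specify which mapping onto [0, 1] was used. A minimal sketch, assuming per-patch min-max scaling, is:

```python
import numpy as np

def normalize_intensity(patch, eps=1e-12):
    """Map pixel intensities onto [0, 1] by per-patch min-max scaling (an assumption)."""
    patch = np.asarray(patch, dtype=float)
    lo, hi = patch.min(), patch.max()
    # eps guards against division by zero for a constant patch
    return (patch - lo) / max(hi - lo, eps)
```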

#### 3.2.2. Model Configuration and Experiment Design

In our experiments, an MSCAE with four modality levels was utilized to obtain multi-scale representations of the MSTAR data. The model parameters and the fan-ins and fan-outs of each level are listed in Table 3. As depicted in the table, there were some changes in the third and fourth levels. In the third level, the feature map was downsampled from 16 × 16 × 16 to 4 × 4 × 32 after the 2 × 2 max-pooling and the 5 × 5 CSeConv with a stride of 2. When feeding the feature map to the PPM block, the output feature maps of the four sub-branches of the typical PPM should have the sizes of 4 × 4 × 1, 2 × 2 × 32, 1 × 1 × 32, and 1 × 1 × 32, respectively. The outputs of the third and the fourth sub-branches were the same, which brought in redundant information and contributed little to target discrimination. Accordingly, the fourth sub-branch, which provides a feature map with global max-pooling at the channel level, was removed, and the corresponding feature map was neglected. The FAM in the same modality level was also changed by removing the corresponding sub-branch when combining the feature maps obtained by the PPM. In the fourth modality level, since the size of the input feature map was smaller than 5 × 5, the optional CSeConv layer was removed, and only the 2 × 2 max-pooling layer was applied before drawing the feature vector of the coarsest level.


**Table 3.** The main structure of the MSCAE with four modality levels utilized in the experiments.

All the convolution kernels of the MSCAE were initialized with the He initialization [46]. After initialization, the model was trained with the preprocessed training dataset, and the Adam optimizer [51] was utilized to optimize the model with an initial learning rate of 0.001. The exponential decay rates β<sup>1</sup> and β<sup>2</sup> for the moment estimates were 0.9 and 0.999, respectively. The batch size was 32. The maximum number of iterations was 500, and the early-stopping scheme was enabled to terminate the training if the improvement of the training loss was less than the threshold. In our experiments, the nonlinear SVM (NSVM) was employed for classification after extracting features from the target patches. Moreover, to avoid fluctuations in the results caused by random steps in the model initialization and optimization, each experiment was repeated ten times, and the average of the results was utilized for performance evaluation.
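For reference, the role of the learning rate and the two decay rates can be seen in a single Adam update, sketched below in NumPy. This is the generic Adam rule rather than the TensorFlow optimizer used in the experiments, and the value of `eps` is an assumption because the text does not report it.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

At the first step the bias-corrected moments equal the gradient and its square, so each parameter moves by approximately the learning rate in the direction opposite to the sign of its gradient.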

Experiments were carried out on a 64-bit Windows 10 system. The proposed model was mainly built on the Google TensorFlow v1.5.0 deep learning library in the Python development environment PyCharm. The hardware platform was a specially adapted DELL T5810 workstation with an Intel Xeon E5-1607 v3 @ 3.10 GHz CPU, 32 GB DDR4 RAM, and an NVIDIA K40c GPU (12 GB memory) with CUDA 8.0 for accelerated computation.

#### *3.3. Evaluation on Three-target Classification*

The average results of the ten experiments with the three-target dataset are depicted in Table 4. The performance was measured by the probability of correct classification (*Pcc*) which is calculated through the number of targets recognized correctly divided by the number of all the targets. The results with and without version variants are listed in the sixth and fifth columns of the table, respectively. A comparison experiment was also conducted to demonstrate the performance improvement induced by the multi-scale feature learning architecture. Classification rates with various feature combinations are evaluated and compared.
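The metric amounts to the fraction of correctly labeled test samples. A minimal sketch, with hypothetical label arrays, is:

```python
import numpy as np

def pcc(y_true, y_pred):
    """Probability of correct classification: correct predictions / all targets."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))
```

The value reported for an experiment is then the mean of `pcc` over its ten repeated runs.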

As shown in Table 4, for the experiment without version variants (i.e., under SOC), the proposed model had the highest accuracy (i.e., 99.73%) when features from all modality levels were utilized. Meanwhile, in the case with variants only, the accuracy of the proposed model with various feature combinations suffered a slight degradation due to the differences in local structure and small equipment among the serial numbers. However, the accuracy obtained by the model with features from all levels was still higher than 98%, indicating good generalization performance. The average accuracies of these methods with all the test data are listed in the seventh column of Table 4. When features from multiple scales are combined, the average *Pcc* of 99.14% is competitive with the state-of-the-art results provided by supervised neural networks. The major reason is that the proposed two-level multi-scale feature extraction structure guarantees that the MSCAE can learn the high-level abstract properties while preserving the detailed information that is neglected by most DL networks. Besides, the specifically designed objective function with speckle suppression and SSIM can diminish the influence of serious speckle in SAR images and take full advantage of the target structure caused by the backscattering. In addition, the proposed compact convolution and deconvolution processes greatly decrease the number of trainable parameters and introduce more nonlinearity, which slightly benefits the model capability.


**Table 4.** Classification results on the three-target dataset.

Other feature extraction methods were also compared with the proposed method for further evaluation, including baseline handcrafted methods and DL networks with state-of-the-art results. The baseline handcrafted methods include the PCA-kernel SVM (PCA-KSVM) [52], the joint sparse representation based method (JSRC) [53], the particle swarm optimization with Hausdorff distance (PSO-HD) [54], the non-negative matrix factorization (NMF) method [55], the attributed scattering center matching method (ASCM) [7], and the 3D scattering center model reconstruction method (3D-SCM) [5]. Among these methods, the PCA-KSVM method employs the nonlinear PCA to extract discriminative features and, subsequently, feeds the features into the SVM classifier. The JSRC method exploits the inter-correlations among the multiple views using joint sparse representation over a training dictionary. The PSO-HD is a pattern matching method that minimizes the Hausdorff distance over rigid transformations. The NMF method utilizes the NMF with an $L_{1/2}$ norm constraint to extract features in SAR images. The ASCM and the 3D-SCM were devised based on the backscattering model of the SAR image and achieved state-of-the-art results in SAR ATR. In the ASCM, the ASCs are utilized for target reconstruction and similarity measurement. In the 3D-SCM method, the 3D scattering center model, established offline from the CAD model of the target, is employed to predict the 2D scattering centers for template matching.
The DL networks for comparison were the restricted RBM (RRBM) [56]; the CNN with DA (DA-CNN) [57], which uses additional data generated by image processing methods; the CNN with SVM (CNN+SVM) [37]; the A-Convnet [38], which replaces the fully connected layers of a CNN with a convolution layer; the sparse AE pre-trained CNN (AE-CNN) [58], in which the convolution kernel is trained on randomly sampled image patches using an unsupervised sparse auto-encoder; the ED-AE [30]; and the Triplet-DAE [31]. Among these methods, the CNN with SVM and the A-Convnet were implemented in our own Python code. In our implementation, preprocessing, including image cropping, speckle filtering with ILSF, and normalization, was applied to these two methods. Besides, an additional DA scheme was executed to generate sufficient training samples for the CNN with SVM and the A-Convnet according to References [37,38], and the configurations of the two models were also determined according to these references. The accuracies of all the methods are shown in Figure 10. Owing to the multi-scale feature learning scheme and the specifically designed objective function, the features learned by the proposed method had a better classification capability than most handcrafted methods, even in comparison with the ASCM and the 3D-SCM, which achieved state-of-the-art results for handcrafted features. Comparison with deep networks, including the DA-CNN, the RRBM, the AE-CNN, and the ED-AE, also indicates that the proposed model outperformed most of the DL models that have specialized restrictions for finding discriminative features. Even compared with the CNN+SVM and the A-Convnet, which achieved state-of-the-art results, the proposed method obtained a comparable result.

**Figure 10.** Performance comparison with baseline handcrafted feature extraction methods and deep representation models via the three-target dataset.

#### *3.4. Validation of the Model Component*

In order to investigate the contribution of each proposed component in the MSCAE, including the PPM, the CSeConv, the SSIM measurement, the ILSF, and the speckle suppression restriction, validation experiments were conducted. Each component was removed from the MSCAE to reveal the performance improvement induced by it. Accordingly, we obtained five models for performance validation, marked as MSCAE no. 1 to no. 5 in Table 5. In model no. 1, the PPMs and the corresponding FAMs at each modality were removed from the MSCAE, and the multi-scale features were only generated by the U-shaped architecture. In model no. 2, the CSeConv and the corresponding CSeDeConv layers were all replaced by the standard convolution layers. In model no. 3, the SSIM measurement was replaced by the MSE loss. In model no. 4, all the data flows and restrictions that related to the ILSF were removed from the MSCAE model, while in model no. 5 only the restriction term in the objective function was removed. The three-target MSTAR dataset was utilized to evaluate their performance and each experiment was conducted ten times. The average accuracy of the five models and the proposed MSCAE models are listed in Table 5. As shown in the table, the PPM and the SSIM measurement contributed the most to improving the accuracy, 2.85% and 1.83% respectively, while CSeConv and the CSeDeConv only improved the accuracy approximately 0.41%. However, the proposed CSeConv and CSeDeConv can remarkably reduce the number of trainable parameters in the proposed model that can greatly benefit its performance with a small training dataset. To further illustrate their contribution, an experiment which evaluated the model performance with a limited training sample was conducted by randomly removing a part of the sample in the training set. In this experiment, only 1/*n* images were randomly selected from the dataset as training samples with *n* varying from one to ten. 
The average accuracies and their standard deviations with the proposed MSCAE and model no. 2 are presented in Figure 11. As shown in the figure, the proposed model achieved an accuracy higher than 90% when only 20% of the samples were utilized to train the model, while the *Pcc* of the MSCAE no. 2, which had many more trainable parameters than the proposed model, fell below 85%. Moreover, when the size of the training dataset was only 1/10 of the original one, the *Pcc* of the MSCAE no. 2 fell below 60%, whereas the proposed MSCAE still had an accuracy higher than 70%.
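The reduced training sets can be drawn as in the sketch below. The text states only that 1/*n* of the images were randomly selected; uniform sampling without replacement over the whole training set (rather than per class), the function name, and the seed handling are assumptions.

```python
import numpy as np

def subsample_training_set(n_samples, n, seed=0):
    """Return the sorted indices of a random 1/n subset of the training samples."""
    rng = np.random.default_rng(seed)
    k = max(1, n_samples // n)                  # keep at least one sample
    return np.sort(rng.choice(n_samples, size=k, replace=False))
```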


**Table 5.** Validation of the components in the proposed MSCAE model.

**Figure 11.** Validation of the proposed model with a small dataset when 1/*n* samples were randomly selected to train the model.

#### *3.5. Evaluation on Ten-Target Classification*

The average results of the ten experiments with the ten-target MSTAR dataset are depicted in Table 6. By combining the features of all four levels, the classification accuracy increased from 98.5% to 98.9% in comparison with the highest *Pcc* achieved by the feature vector that combined the learned representations of the first three modality levels. Many other feature extraction methods and representation learning models were also compared with the MSCAE for further evaluation, including baseline handcrafted features, unsupervised DL models, and supervised models. The baseline handcrafted features for evaluation include the NMF method [55], the sparse representation of monogenic signal via Riemannian manifolds (SRRM) [59], the weighted multi-task kernel sparse representation (WMTKSR) [60], and the ASCM [7]. Among these methods, the SRRM utilizes the covariance descriptor of the monogenic signal as the features and classifies the targets with the Riemannian manifold embedded in an implicit reproducing kernel Hilbert space (RKHS). The WMTKSR maps the multi-scale monogenic features into a high-dimensional kernel feature space using the nonlinear mapping associated with a kernel function, and the classification process is formulated as a joint covariate selection problem across a group of related tasks. The unsupervised DL models comprise the multi-discriminator generative adversarial network (MGAN-CNN) [61], which generates unlabeled images with a GAN and sets them as the input of a CNN together with the original labeled images; the feature fusion SAE (FFAE) [15], which extracts 23 baseline features and three-patch local binary pattern (TPLBP) features and, subsequently, feeds them into an SAE for feature fusion; and the variational AE based on a residual network (ResVAE) [22]. The supervised models for performance evaluation are the ED-AE [30], the Triplet-DAE [31], the CNN with SVM [37], the A-Convnet [38], the ESENet [35], which is based on a new enhanced squeeze and excitation (enhanced-SE) module, and the hierarchical fusion of CNN and ASC (ASC-CNN) [39], which provides a complicated scheme to fuse the decisions of the ASC model and the CNN. Among these methods, the CNN with SVM and the A-Convnet were implemented in our own Python code. In our implementation, preprocessing, including image cropping, speckle filtering with ILSF, and normalization, was applied to the two methods. Besides, an additional DA scheme was executed to generate sufficient training samples for the CNN with SVM and the A-Convnet according to References [37,38]. Therefore, the results of the two models both with and without DA were compared in our experiment to allow a comprehensive and fair analysis. The configurations of the CNN with SVM and the A-Convnet were determined according to References [37,38]. In the experiments, the CNN with SVM and the A-Convnet were each executed and tested ten times, and the average classification accuracy was utilized for performance evaluation.

**Table 6.** Classification results of the ten-target MSTAR dataset.


The classification results of all the methods are depicted in Figure 12. The classification accuracy of the proposed method was much higher than that of most of the traditional handcrafted features. Although the accuracy of the proposed model was slightly lower than that of the ASCM feature, which achieved state-of-the-art results with the scattering center model, the result obtained by the proposed model was still comparable, and the model can adaptively extract features without manual intervention. Compared with most of the deep representation learning methods (e.g., the ED-AE, the Triplet-DAE, the MGAN-CNN, and the ESENet), the proposed model also yielded much better performance. In comparison with the CNN with SVM, the A-Convnet, and the ASC-CNN, which achieved state-of-the-art results, the proposed MSCAE was also competitive. The results achieved by the CNN with SVM and the A-Convnet without DA preprocessing demonstrated that their high classification rates mainly relied on the DA operations. Although their DA processes did improve the performance, they induced certain problems, including man-made uncertainty and unstable performance, amplification of the sampling biases in the original dataset, and high computational complexity. The ASC-CNN devised a complex decision fusion strategy to improve on the accuracy obtained by the ASC and the CNN separately. Although the performance of the ASC-CNN was better than that of the proposed model, it requires a complicated process to extract the ASC features and has a higher computational complexity.

**Figure 12.** Performance comparison on the ten-target dataset with different handcrafted feature extraction methods and deep representation models.

#### *3.6. Classification Experiment with Noise Corruption*

An important characteristic of SAR data is that serious noise can often be observed in the images, which is a major factor causing performance deterioration in SAR ATR. Accordingly, to demonstrate the robustness of the proposed model to noise, SAR images corrupted by different levels of noise were simulated. The original MSTAR images, which had an SNR over 30 dB, were considered as noise-free sources. To obtain the noise-contaminated images, the original MSTAR patches were first transformed into the frequency-aspect domain with the 2D inverse discrete Fourier transform (IDFT), and different levels of additive complex Gaussian noise were added to the transformed images, with the SNR defined in Equation (4) in accordance with Reference [5].

$$SNR(dB) = 10\log_{10}\frac{\sum_{h=0}^{H-1} \sum_{w=0}^{W-1} \left| f(h,w) \right|^2}{HW\sigma^2} \tag{4}$$

where *f*(*h*, *w*) denotes the complex RCS computed by the EM code; σ<sup>2</sup> is the variance of the complex noise. By transforming the noisy RCS into the image domain using the same imaging process, the noise-contaminated images can be generated for experimental evaluation. Figure 13 presents some contaminated images with different SNRs.
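The noise simulation can be sketched as follows. This is an illustrative NumPy version, assuming that the forward 2D DFT is the imaging process that inverts the 2D IDFT and that the real and imaginary parts of the noise each carry half of the variance σ<sup>2</sup>; the function name and the seed handling are hypothetical.

```python
import numpy as np

def add_noise_at_snr(image, snr_db, seed=0):
    """Add complex Gaussian noise in the frequency-aspect domain at the SNR of Equation (4)."""
    rng = np.random.default_rng(seed)
    f = np.fft.ifft2(image)                       # image domain -> frequency-aspect domain
    H, W = f.shape
    signal_power = np.sum(np.abs(f) ** 2) / (H * W)
    sigma2 = signal_power / 10 ** (snr_db / 10)   # noise variance solving Equation (4)
    noise = np.sqrt(sigma2 / 2) * (
        rng.standard_normal(f.shape) + 1j * rng.standard_normal(f.shape)
    )
    # Back to the image domain; the result is complex, so take np.abs() for display.
    return np.fft.fft2(f + noise)
```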

Some input images and the corresponding reconstruction results at 10 dB and −10 dB SNR are presented in Figure 14 for comparison. Although many inputs at −10 dB SNR were so seriously contaminated that the targets in the patches can barely be observed, the output images of the trained model successfully reconstructed the major parts of the targets, demonstrating the excellent noise suppression capability of the proposed model. The average classification results and the corresponding standard deviations with noise-contaminated data under different SNRs are shown in Figure 15. The average experimental results of other methods, including the Triplet-DAE, the CNN with SVM, and the A-Convnet, are also presented in the figure. With the decreasing SNR of the input images, the classification accuracy of all the models suffered different degrees of deterioration, and the highest accuracy was obtained by the proposed model at nearly every SNR level. When the SNR was higher than 0 dB, such that the geometric and scattering characteristics were not seriously disturbed by the noise, each model reported a classification rate higher than 85%, and the proposed model achieved the highest accuracy. Even when the SNR was −10 dB, such that most of the targets were concealed in the noise, as presented in Figures 13 and 14, the MSCAE still yielded a better classification rate than the other reference models in Figure 15, demonstrating the robustness of the proposed model under serious noise corruption.

**Figure 13.** The noise interrupted images with different signal-to-noise ratios (SNRs). (**a**) The original image, (**b**) the SNR at 10 dB, (**c**) the SNR at 5 dB, (**d**) the SNR at 0 dB, (**e**) the SNR at −5 dB, and (**f**) the SNR at −10 dB.

**Figure 14.** The input images and reconstruction results of the trained model at the noise levels of 10 dB SNR and −10 dB SNR. (**a**) The input images at a noise level of 10 dB SNR, (**b**) the reconstructed results of (**a**), (**c**) the input images at a noise level of −10 dB SNR, (**d**) the reconstructed results of (**c**).

**Figure 15.** Classification results at different SNR levels.

#### *3.7. Classification Experiment with Resolution Variance*

The proposed model was subsequently evaluated with respect to resolution variance. Theoretically, the range resolution and azimuth resolution of SAR imagery are determined by the bandwidth of the transmitted wave and the synthetic aperture angle. However, due to the instability of the radars, the actual resolution of the measured SAR images fluctuates around the theoretical values. Meanwhile, it is infeasible to train and maintain models at every possible resolution. Consequently, the robustness to resolution variation is also an important factor for model performance evaluation. Because the resolution of all target patches in the MSTAR dataset was 0.3 m × 0.3 m, the target patches with varied resolution had to be simulated from the original images in the dataset. The spatial SAR images were converted into the frequency-aspect domain by the 2D IDFT, and the sub-band was extracted. The sub-band data were subsequently resampled by zero-padding in the frequency domain and transformed back to the spatial domain.
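The resolution degradation can be sketched in NumPy as below. The sketch assumes that the signal band is centered at zero frequency after `fftshift` and that the same fraction of the band is kept in range and azimuth; the function name and the `factor` parameter are hypothetical.

```python
import numpy as np

def reduce_resolution(image, factor):
    """Simulate a coarser resolution by keeping only a centered sub-band.

    factor = original resolution / target resolution, e.g. 0.3 / 0.6 = 0.5.
    """
    H, W = image.shape
    spec = np.fft.fftshift(np.fft.ifft2(image))      # frequency-aspect domain, band centered
    kh = max(1, int(round(H * factor)))
    kw = max(1, int(round(W * factor)))
    r0, c0 = (H - kh) // 2, (W - kw) // 2
    padded = np.zeros_like(spec)
    # Keep the sub-band and zero-pad the rest so that the image size is unchanged.
    padded[r0:r0 + kh, c0:c0 + kw] = spec[r0:r0 + kh, c0:c0 + kw]
    # Back to the spatial domain; the result is complex, so take np.abs() for display.
    return np.fft.fft2(np.fft.ifftshift(padded))
```

With `factor = 1.0` the full band is kept and the original image is recovered.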

In the evaluation experiment, the resolution of the simulated data varied from 0.3 m × 0.3 m to 0.7 m × 0.7 m, and some images at different resolutions are presented in Figure 16. Similar to the configuration of the noise corruption experiment, the classification results of the proposed model were compared with three reference models, namely the Triplet-DAE, the CNN with SVM, and the A-Convnet. At each resolution level, the experiment of each model was executed ten times to alleviate the influence of randomness caused by the model initialization and optimization. The average experimental results of each model are plotted in Figure 17. As shown in the figure, limited resolution deterioration did not seriously affect the performance of any of the models. Even when the resolution was 0.6 m × 0.6 m, their average accuracy was still higher than 90%. However, the proposed model still gained the highest classification rate in comparison with the reference models at almost all the resolutions, illustrating its robustness under the extended operating condition of resolution variance.

**Figure 16.** MSTAR data at different resolution. (**a**) 0.3 m × 0.3 m, (**b**) 0.4 m × 0.4 m, (**c**) 0.5 m × 0.5 m, (**d**) 0.6 m × 0.6 m, (**e**) 0.7 m × 0.7 m.

**Figure 17.** Classification results at different resolutions.

#### **4. Conclusions and Future Work**

In this paper, an unsupervised representation learning model was proposed, providing an effective way to learn the multi-scale representation of targets in SAR images via its U-shaped architecture, the CSeConv and the PPM blocks, and the modified loss function based on the SSIM and the restriction of speckle suppression. The major contributions of our work include:


speckle suppression capability, while the restriction guarantees that the speckle filtering procedure was implemented in the feature learning step;

(3) The CSeConv and the CSeDeConv decreased the trainable parameters and calculation consumption, avoiding overfitting caused by insufficient samples. Moreover, they introduced more nonlinearity and slightly improved the performance of the MSCAE.

The MSTAR dataset was utilized to evaluate the performance of the proposed model. The proposed method was tested under both standard operating conditions and several extended operating conditions with both the three-target dataset and the ten-target dataset including the version variants, the noise corruption, and the resolution variance. Evaluation experiments demonstrated that the proposed method outperformed most of the conventional and deep learning algorithms and achieved comparable accuracy to the state-of-the-art results without any supervised information.

**Author Contributions:** S.T., W.G. and Y.L. conceived the methodology and conducted the entire experiments. S.T. and Y.L. wrote the manuscript. C.W. supervised the experiments and helped discuss the proposed method. H.Z. contributed to the organization of the paper and also the experimental analysis. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded in part by the Key Program of National Natural Science Foundations of China (Grant No. 41930110), the National Natural Science Foundations of China (Grant No. 41501356) and the Natural Science Foundation of Jiangsu Province under Grants No. BK20150774.

**Acknowledgments:** The authors thank the US Air Force Research Lab for providing the public MSTAR data. In addition, we are grateful to anonymous referees for their instructive comments.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **A MIMO-SAR Tomography Algorithm Based on Fully-Polarimetric Data**

#### **Lingyu Kong and Xiaojian Xu \***

School of Electronics and Information Engineering, Beihang University, Beijing 100191, China; konglingyu@buaa.edu.cn

**\*** Correspondence: xiaojianxu@buaa.edu.cn; Tel.: +86-010-8231-6065

Received: 4 September 2019; Accepted: 4 November 2019; Published: 6 November 2019

**Abstract:** A fully-polarimetric unitary multiple signal classification (UMUSIC) tomography algorithm is proposed, which can be used for acquiring high-resolution three-dimensional (3D) imagery, in a polarimetric multiple-input multiple-output synthetic aperture radar (MIMO-SAR) with a small number of baselines. In terms of the elevation resolution, UMUSIC provides an improvement over standard MUSIC by utilizing the conjugate of the complex sample data and converting the complex covariance matrix into a real matrix. The combination of UMUSIC and fully-polarimetric data permits a further reduction of the noise of the sample covariance matrix, which is obtained through pixel averaging of multiple two-dimensional (2D) images. Considering the consistency of four polarizations, this algorithm not only makes scattering centers have the same estimated height in four polarizations, but it also improves the estimation accuracy. Simulation results show that this algorithm outperforms the popular distributed compressed sensing (DCS). Image processing of measured data of an aircraft model using a multiple-input multiple-output synthetic aperture radar (MIMO-SAR) with six baselines is presented to validate the proposed algorithm.

**Keywords:** polarimetric; SAR tomography; MIMO radar

#### **1. Introduction**

Multiple-input multiple-output synthetic aperture radar (MIMO-SAR) is an enabling technique capable of imaging a target [1–7], which is different from a rail synthetic aperture radar (SAR) and a turntable inverse synthetic aperture radar (ISAR). Two-dimensional (2D) virtual apertures can be synthesized through different combinations of transceiver antenna elements. A large number of virtual apertures in the cross-range and elevation directions are beneficial to obtain high-resolution three-dimensional (3D) radar images [4–6], but they also result in a high cost and large size of radar systems due to the increase of the number of antenna elements. When the measured target has several scattering centers in an elevation direction, such as airplanes, an affordable array strategy can be adopted: the priority is given to ensuring adequate cross-range virtual apertures for high-resolution two-dimensional (2D) radar images [7], and then a small number of elevation virtual apertures ensure high-resolution 3D radar images by SAR tomography [8–18].

The reconstruction quality of SAR tomography depends on the product of the number of baselines and the signal-to-noise ratio (SNR) [8]. When the number of baselines is limited, the SNR of radar images can be equivalently improved by filtering [8,9], auxiliary information [10–12], and polarization [13–17]. The SNR can be improved by integrating nonlocal filtering into the compressed sensing (CS) algorithm and a reasonable reconstruction of buildings from only seven baselines is feasible [8]. In addition, [9] investigates the possibility related to the use of a multi-looking approach for fine resolution analysis of ground structures that combines SAR tomography. Filtering consisting of averaging pixels is bound to reduce the range-azimuth resolution and therefore is not suitable for high-resolution radar images of artificial targets. The auxiliary information added to the standard Capon and multiple signal classification (MUSIC) algorithms can be exploited to reduce the ambiguity and resolve the superimposition of the scatterers in the case of a limited number of radar images [10]. Atmospheric phases for SAR tomography in mountainous regions are regressed against the spatial coordinates in map geometry at persistent scatterers locations [11]. The high-resolution 3D positions of a large amount of natural scatterers are obtained by a geodetic SAR tomography framework that fuses SAR tomography and SAR image geodesy compensating SAR measurement error [12]. The auxiliary information can effectively improve the image quality, but it needs to be obtained by other technologies which increases the complexity of the algorithm and the cost of the imaging. A distributed compressed sensing (DCS) algorithm based on fully-polarimetric data is proposed in [13–16] to improve the accuracy of the estimation. However, the CS algorithm suffers from a high computational expense and is hard to extend to fast practice [18]. In [17], a comparison among tomograms obtained in different polarizations is made to analyze how polarimetry can enhance target signatures.

To address these problems, the combination of spectral analysis and full polarization is an attractive way to improve resolution and processing speed for a small number of baselines. This paper explores a fully-polarimetric unitary multiple signal classification (UMUSIC) technique for polarimetric MIMO-SAR tomography [7]. The remainder of the paper is organized as follows. Section 2 describes a signal model based on fully-polarimetric data. A fast and high-resolution UMUSIC algorithm is developed in Section 3. In Section 4, two algorithms are compared through simulation of different point scatterers. Finally, Section 5 contains measured tomography results of an aircraft model to validate the proposed algorithm.

#### **2. Polarization Signal Model**

SAR tomography allows us to obtain 3D imagery to describe the electromagnetic property of illuminated objects. The geometry of MIMO-SAR tomography is shown in Figure 1, where *x,y,z* denote coordinates originating from the center of the imagery scene, *Rn* represents the distance from the target to the *n*th baseline, and *R*<sup>0</sup> is the projection distance from the radar to the center of the imagery scene on the *y*-axis. The orange triangles and blue circles denote receivers and transmitters, respectively. Each baseline represents a linear array, where transmitters are at both ends of the array and receivers are in the middle of the array. The 2D image for the *n*th baseline is represented by the following form [19].

$$g\_n(x, y, z\_n) = \int\_{-h/2}^{h/2} \sqrt{\widehat{\sigma}(x, y, z)} e^{-j\frac{4\pi}{\lambda}R\_n} dz \tag{1}$$

where σ(*x*, *y*, *z*) denotes the target scattering function that needs to be solved, *h* is the height of the imagery scene, λ represents the wavelength and *zn* refers to the *z*-coordinates of the *n*th baseline. Under the Born weak-scattering approximation, *Rn* representing the distance from the point target at (*x, y, z*) to the *n*th baseline, is approximated as:

$$R\_n = \sqrt{\left(y + R\_0\right)^2 + \left(z - z\_n\right)^2} \approx y + R\_0 + \frac{z\_n^2 + z^2 - 2z\_n z}{2R\_0} \tag{2}$$

The first three terms in (2) are independent of *z*, the fourth term is the residual phase term that can be merged into σ(*x*, *y*, *z*), and the fifth term is the phase term for imaging in the elevation direction. One of the *N* baselines can be chosen as the reference baseline, with the phase factor:

$$g\_{\rm ref}(x, y, z\_{\rm ref}) = \mathbf{e}^{-j\frac{4\pi}{\lambda} \left( y + R\_0 + \frac{z\_{\rm ref}^2}{2R\_0} \right)}\tag{3}$$

After phase compensation, the *N* 2D images *g′n*(*x*, *y*, *zn*) and σ′(*x*, *y*, *z*) are in a Fourier transform relation.

$$g\_n'(x, y, z\_n) = g\_n(x, y, z\_n) g\_{\rm ref}^{\*}(x, y, z\_{\rm ref}) = \int\_{-h/2}^{h/2} \widehat{\sigma}'(x, y, z) e^{-j2\pi w\_n z} dz \tag{4}$$

with

$$w\_n = \frac{2z\_n}{\lambda R\_0} \tag{5}$$

$$
\widehat{\sigma'}(x, y, z) = \sqrt{\widehat{\sigma}(x, y, z)} e^{-j\frac{2\pi z^2}{\lambda R\_0}} \tag{6}
$$

where σ′(*x*, *y*, *z*), which absorbs the residual phase term, still has the same amplitude as σ(*x*, *y*, *z*), so the residual phase has no effect on the 3D imagery. *g′n*(*x*, *y*, *zn*) can be discretized as the product of a row vector and a column vector.

$$g'\_{n}(x, y, z\_{n}) = \sum\_{l=1}^{L} \widehat{\sigma}'(x, y, z\_{l}) \mathbf{e}^{-j2\pi w\_{n} z\_{l}} = \mathbf{a}\_{n} \mathbf{s} \tag{7}$$

with

$$\mathbf{a}\_{n} = \left[ \mathbf{e}^{-j2\pi w\_n z\_1}, \mathbf{e}^{-j2\pi w\_n z\_2}, \dots, \mathbf{e}^{-j2\pi w\_n z\_L} \right] \tag{8}$$

$$\mathbf{s} = \left[\widehat{\sigma}'(\mathbf{x}, \mathbf{y}, z\_1), \widehat{\sigma}'(\mathbf{x}, \mathbf{y}, z\_2), \dots, \widehat{\sigma}'(\mathbf{x}, \mathbf{y}, z\_L)\right]^T \tag{9}$$

where *L* represents the number of scattering centers in the elevation direction. Before imaging, the imagery scene in the elevation direction needs to be divided into many discrete points to represent the range of *L*. As a consequence, the matrix **a***<sup>n</sup>* is very large, which is the reason why the imaging algorithm takes a long time to locate scattering centers and determine *L* in the simulation and measurement. In combination with *N* 2D images, the polarimetric tomography model is given by:

$$\mathbf{g} = \begin{bmatrix} g\_1'(\mathbf{x}, y, z\_1) \\ g\_2'(\mathbf{x}, y, z\_2) \\ \vdots \\ g\_N'(\mathbf{x}, y, z\_N) \end{bmatrix} = \begin{bmatrix} \mathbf{e}^{-j2\pi w\_1 z\_1} & \mathbf{e}^{-j2\pi w\_1 z\_2} & \cdots & \mathbf{e}^{-j2\pi w\_1 z\_L} \\ \mathbf{e}^{-j2\pi w\_2 z\_1} & \mathbf{e}^{-j2\pi w\_2 z\_2} & \cdots & \mathbf{e}^{-j2\pi w\_2 z\_L} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{e}^{-j2\pi w\_N z\_1} & \mathbf{e}^{-j2\pi w\_N z\_2} & \cdots & \mathbf{e}^{-j2\pi w\_N z\_L} \end{bmatrix} \mathbf{s} = \mathbf{A} \mathbf{s} \tag{10}$$

Equation (10) can be extended to fully-polarimetric data by stacking the four polarization channels:

$$\mathbf{G} = \begin{bmatrix} \mathbf{g}\_{HH} & \mathbf{g}\_{HV} & \mathbf{g}\_{VH} & \mathbf{g}\_{VV} \end{bmatrix} = \begin{bmatrix} \mathbf{a}\_1, \mathbf{a}\_2, \dots, \mathbf{a}\_N \end{bmatrix}^T \begin{bmatrix} \mathbf{s}\_{HH} & \mathbf{s}\_{HV} & \mathbf{s}\_{VH} & \mathbf{s}\_{VV} \end{bmatrix} = \mathbf{A}\mathbf{S} \tag{11}$$

where **G** denotes the *N*×*4* 2D imagery matrix, **S** refers to the *L*×*4* 3D imagery matrix, and **A** is the *N* × *L* transformation matrix.
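To make the dimensions in (5), (10) and (11) concrete, the following sketch builds the transformation matrix **A** and the data matrix **G** for a toy scene. The wavelength, range, baseline heights and scattering vectors are illustrative assumptions, not the parameters used in the paper.

```python
import numpy as np

def steering_matrix(w, z):
    """N x L matrix with entries exp(-j 2 pi w_n z_l), as in Eq. (10)."""
    return np.exp(-2j * np.pi * np.outer(w, z))

# Assumed geometry (illustrative values only).
wavelength = 0.03                       # metres, roughly X-band
R0 = 5.0                                # metres, radar to scene centre
z_base = np.linspace(-0.2, 0.2, 6)      # heights of N = 6 baselines
w = 2.0 * z_base / (wavelength * R0)    # spatial frequencies, Eq. (5)

# Two scattering centres in elevation (L = 2) with assumed polarimetric
# responses; columns are ordered HH, HV, VH, VV as in Eq. (11).
z_scat = np.array([-0.09, 0.09])
S = np.array([[1, 0, 0, 1],
              [1, 0, 0, -1]], dtype=complex)

A = steering_matrix(w, z_scat)          # N x L
G = A @ S                               # N x 4, one column per polarization
```

Each column of `G` is the noise-free tomographic measurement vector for one polarization channel; all four columns share the same `A`, which is what allows the heights to be estimated jointly.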

**Figure 1.** Geometry of multiple-input multiple-output synthetic aperture radar (MIMO-SAR) tomography.

#### **3. Tomography Algorithm**

Multiple signal classification (MUSIC) is a spectral analysis algorithm based on the eigen decomposition of the sample covariance matrix. It is necessary to reduce the matrix noise by some techniques, including snapshot in direction-of-arrival and multi-looking in SAR tomography that also leads to a decrease in range-azimuth resolution [13]. In order to dispel the influence, we employ fully-polarimetric data and their conjugation to obtain the matrix.

The polarization tomography model (11) including Gaussian white noise matrix **W** is rewritten as:

$$\mathbf{G} = \mathbf{A}\mathbf{S} + \mathbf{W} \tag{12}$$

The covariance matrix **R** is given by:

$$\begin{aligned} \mathbf{R} &= E\left\{ \mathbf{G} \mathbf{G}^{H} \right\} = E\left\{ \mathbf{[AS+W]} \left[ \mathbf{S}^{H} \mathbf{A}^{H} + \mathbf{W}^{H} \right] \right\} \\ &= \mathbf{A}E\left\{ \mathbf{SS}^{H} \right\} \mathbf{A}^{H} + E\left\{ \mathbf{W} \mathbf{W}^{H} \right\} = \mathbf{A} \mathbf{P} \mathbf{A}^{H} + \sigma^{2} \mathbf{I} \end{aligned} \tag{13}$$

where σ<sup>2</sup>**I** = *diag*{|σ1|<sup>2</sup>, ···, |σ*N*|<sup>2</sup>}. The Hermitian matrix **APA***<sup>H</sup>*, composed of the positive definite diagonal matrix **P** and the full-column-rank matrix **A**, can be eigendecomposed to obtain the noise subspace.

Multi-sample data are critical for the MUSIC algorithm to ensure a high-quality covariance matrix. In SAR tomography, the averaging of pixels with the same scattering characteristics is called multi-looking. However, the number of these pixels is limited, and the averaging also reduces the range-azimuth resolution. Combining the observed data with their conjugates is equivalent to doubling the number of these pixels, which not only improves the estimation accuracy but also solves the problem of coherent signal estimation. To enforce the Hermitian property of **APA**<sup>*H*</sup>, the average of the covariance matrices is computed from forward and backward data samples [20].

$$\mathbf{R}\_M = \frac{1}{2} (\mathbf{R} + \mathbf{J}\_N \mathbf{R}^\* \mathbf{J}\_N) = \mathbf{A} \mathbf{\tilde{P}} \mathbf{A}^H + \sigma^2 \mathbf{I} \tag{14}$$

with

$$\tilde{\mathbf{P}} = \frac{1}{2} (\mathbf{P} + \mathbf{D}\mathbf{P}^{\*} \mathbf{D}^{H}) \tag{15}$$

$$\mathbf{D} = \operatorname{diag} \left\{ \mathbf{e}^{j2\pi(w\_N + w\_1)z\_1} \dots \mathbf{e}^{j2\pi(w\_N + w\_1)z\_L} \right\} \tag{16}$$

where the superscript \* represents conjugation and **J***N* denotes the *N*×*N* exchange matrix with ones on its antidiagonal and zeros elsewhere. In order to reduce the computational complexity, **R***M* can be transformed into a real covariance matrix by a unitary transformation.

$$\mathbf{R}\_{U} = \mathbf{Q}\_N^H \mathbf{R}\_M \mathbf{Q}\_N = \mathbf{Q}\_N^H \mathbf{A} \tilde{\mathbf{P}} \mathbf{A}^H \mathbf{Q}\_N + \sigma^2 \mathbf{I} \tag{17}$$

where **Q***N* is any *N*×*N* unitary matrix that is column conjugate symmetric. A simple form can be chosen as [21]

$$\mathbf{Q}\_{N\,\text{even}} = \frac{1}{\sqrt{2}} \begin{bmatrix} \mathbf{I}\_{N/2} & \mathbf{j} \mathbf{I}\_{N/2} \\ \mathbf{J}\_{N/2} & -\mathbf{j} \mathbf{J}\_{N/2} \end{bmatrix} \tag{18}$$

$$\mathbf{Q}\_{N\,\text{odd}} = \frac{1}{\sqrt{2}} \begin{bmatrix} \mathbf{I}\_{(N-1)/2} & 0 & \mathbf{j}\mathbf{I}\_{(N-1)/2} \\ 0 & \sqrt{2} & 0 \\ \mathbf{J}\_{(N-1)/2} & 0 & -\mathbf{j}\mathbf{J}\_{(N-1)/2} \end{bmatrix} \tag{19}$$

From (14) and (17), and because column conjugate symmetry implies **J***N***Q***N*<sup>\*</sup> = **Q***N*, the real covariance matrix **R***U* can be rewritten as:

$$\begin{split} \mathbf{R}\_{U} &= \frac{1}{2} (\mathbf{Q}\_{N}^{H} \mathbf{R} \mathbf{Q}\_{N} + \mathbf{Q}\_{N}^{H} \mathbf{J}\_{N} \mathbf{R}^{\*} \mathbf{J}\_{N} \mathbf{Q}\_{N}) \\ &= \frac{1}{2} (\mathbf{Q}\_{N}^{H} \mathbf{R} \mathbf{Q}\_{N} + (\mathbf{Q}\_{N}^{H})^{\*} \mathbf{R}^{\*} \mathbf{Q}\_{N}^{\*}) = \text{Re} \left\{ \mathbf{Q}\_{N}^{H} \mathbf{R} \mathbf{Q}\_{N} \right\} \end{split} \tag{20}$$
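The identity in (20) can be checked numerically. The sketch below builds **Q***N* from (18) and (19), forms the forward-backward average of (14) from a random sample covariance, and confirms that the transformed matrix is real and equals Re{**Q**<sup>H</sup>**RQ**}. The function names and the random test data are assumptions made for illustration.

```python
import numpy as np

def unitary_q(n):
    """Column conjugate symmetric unitary matrix of Eqs. (18)-(19)."""
    half = n // 2
    eye = np.eye(half)
    flip = np.fliplr(eye)                  # exchange matrix of size n // 2
    if n % 2 == 0:
        q = np.block([[eye, 1j * eye], [flip, -1j * flip]])
    else:
        col = np.zeros((half, 1))
        q = np.block([
            [eye, col, 1j * eye],
            [col.T, np.sqrt(2) * np.ones((1, 1)), col.T],
            [flip, col, -1j * flip],
        ])
    return q / np.sqrt(2)

def forward_backward(R):
    """Forward-backward averaged covariance, Eq. (14)."""
    J = np.fliplr(np.eye(R.shape[0]))
    return 0.5 * (R + J @ R.conj() @ J)

# Random N x 4 data matrix standing in for G (illustrative only).
rng = np.random.default_rng(0)
G = rng.standard_normal((6, 4)) + 1j * rng.standard_normal((6, 4))
R = G @ G.conj().T / G.shape[1]            # sample covariance

Q = unitary_q(6)
R_U = Q.conj().T @ forward_backward(R) @ Q  # Eq. (17)
R_U_direct = np.real(Q.conj().T @ R @ Q)    # right-hand side of Eq. (20)
```

In practice only `R_U_direct` needs to be computed, which is why the forward-backward average never has to be formed explicitly.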

The real matrix is eigen decomposed as:

$$\operatorname{Re} [\mathbf{Q}\_N^H \mathbf{R} \mathbf{Q}\_N] = \sum\_{i=1}^N \lambda\_i u\_i u\_i^H + \sigma^2 \sum\_{i=1}^N u\_i u\_i^H = \sum\_{i=1}^L \lambda\_i u\_i u\_i^H + \sigma^2 \sum\_{i=1}^N u\_i u\_i^H \tag{21}$$

where λ1, ... , λ*N* are eigenvalues and *u*1, ... , *uN* represent the corresponding orthonormal eigenvectors. Among the *N* eigenvalues of **R***U*, *L* are related to the signal and *N* − *L* are related to the noise. By using the noise subspace **E***N* = *span*{*uL*+1, ... , *uN*}, the fully-polarimetric pseudo-spectrum is expressed as:

$$\mathbf{P}\_{\rm MUSIC}^{FP}(w) = \frac{1}{\mathbf{A}^H(w)\mathbf{E}\_N \mathbf{E}\_N^H \mathbf{A}(w)}\tag{22}$$

The peaks of the pseudo-spectrum locate the scattering centers in the elevation dimension, and the scattering intensity in the four polarizations is then estimated by the least squares method (LSM).

$$\mathbf{S} = \left(\mathbf{A}^H \mathbf{A}\right)^{-1} \mathbf{A}^H \mathbf{G} \tag{23}$$

where **A** is updated according to positions of the scattering centers.
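The steps of this section (real covariance, noise subspace, pseudo-spectrum peaks, least squares amplitudes) can be put together in a short sketch. It is an illustration under stated assumptions rather than the authors' implementation: the geometry and scattering vectors are made up, the baselines are taken to be uniformly spaced and symmetric about the scene centre, and the steering vector is multiplied by **Q**<sup>H</sup> before projection, which is the usual practice in unitary MUSIC although (22) is written in terms of **A**(*w*) alone.

```python
import numpy as np

def unitary_q(n):
    """Unitary matrix of Eqs. (18)-(19)."""
    half = n // 2
    eye, col = np.eye(half), np.zeros((half, 1))
    flip = np.fliplr(eye)
    if n % 2 == 0:
        q = np.block([[eye, 1j * eye], [flip, -1j * flip]])
    else:
        q = np.block([[eye, col, 1j * eye],
                      [col.T, np.sqrt(2) * np.ones((1, 1)), col.T],
                      [flip, col, -1j * flip]])
    return q / np.sqrt(2)

def steering(w, z):
    """Steering matrix with entries exp(-j 2 pi w_n z_l)."""
    return np.exp(-2j * np.pi * np.outer(w, np.atleast_1d(z)))

def umusic_spectrum(G, w, z_grid, n_src):
    """Pseudo-spectrum on z_grid from the N x 4 polarimetric data matrix G."""
    n = G.shape[0]
    Q = unitary_q(n)
    R = G @ G.conj().T / G.shape[1]          # sample covariance over polarizations
    R_u = np.real(Q.conj().T @ R @ Q)        # Eq. (20)
    _, vecs = np.linalg.eigh(R_u)            # eigenvalues in ascending order
    E_n = vecs[:, : n - n_src]               # noise subspace: smallest N - L
    proj = E_n.T @ (Q.conj().T @ steering(w, z_grid))
    return 1.0 / np.maximum(np.sum(np.abs(proj) ** 2, axis=0), 1e-12)

def pick_peaks(spec, z_grid, n_src):
    """Heights of the n_src largest local maxima of the pseudo-spectrum."""
    mid = spec[1:-1]
    idx = np.flatnonzero((mid > spec[:-2]) & (mid >= spec[2:])) + 1
    idx = idx[np.argsort(spec[idx])[-n_src:]]
    return np.sort(z_grid[idx])

def estimate(G, w, z_grid, n_src):
    """Heights from the spectrum peaks, intensities by least squares, Eq. (23)."""
    z_hat = pick_peaks(umusic_spectrum(G, w, z_grid, n_src), z_grid, n_src)
    S_hat = np.linalg.lstsq(steering(w, z_hat), G, rcond=None)[0]
    return z_hat, S_hat

# Noise-free toy scene with assumed parameters (not those of the paper).
wavelength, R0 = 0.03, 5.0
w = 2.0 * np.linspace(-0.2, 0.2, 6) / (wavelength * R0)
z_true = np.array([-0.09, 0.09])
S_true = np.array([[1, 0, 0, 1], [1, 0, 0, -1]], dtype=complex)
G = steering(w, z_true) @ S_true
z_grid = np.linspace(-0.4, 0.4, 801)
z_hat, S_hat = estimate(G, w, z_grid, n_src=2)
```

The number of scatterers `n_src` is supplied by hand here; in a real pipeline it would have to be estimated, for example from the eigenvalue spread of the real covariance matrix.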

#### **4. Simulation**

We adopt the fully-polarimetric DCS as the baseline for comparison. Considering crosstalk, noise, and dispersion, point scatterers with typical polarimetric scattering matrices (PSMs) are simulated to generate return signals. The simulation scene is shown in Figure 2, where scatterers with different heights are located at the coordinate origin. The working frequency is 8 GHz–12 GHz. The down-range, cross-range and elevation Rayleigh limits of the MIMO radar are 0.037 m, 0.047 m, and 0.188 m, respectively. As shown in Figure 3, we simulate three cases to compare the two algorithms.
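As a quick consistency check, and assuming the down-range Rayleigh limit follows the usual bandwidth relation (with *c* the speed of light and *B* = 4 GHz the bandwidth of the 8 GHz–12 GHz band), the quoted down-range value is recovered:

$$\frac{c}{2B} = \frac{3 \times 10^{8}\ \text{m/s}}{2 \times 4 \times 10^{9}\ \text{Hz}} \approx 0.0375\ \text{m}$$

The cross-range and elevation limits depend on the array geometry and the measurement range, which are not restated in this section, so they are not reproduced here.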

**Figure 2.** Simulation scene of the MIMO-SAR tomography.

**Figure 3.** Positions of point scatterers in three cases.

#### *4.1. Case 1: Two Point Scatterers with a Spacing of 0.18 m*

A cylinder and a 90° rotated dihedral reflector are located at −0.09 m and 0.09 m in the elevation dimension, respectively. When the spacing between the two scatterers is 0.18 m (close to the Rayleigh limit), the simulation results of the two algorithms are shown in Figure 4, where lines represent pseudo-spectrums and points denote estimated results including height and scattering intensity of scatterers.

**Figure 4.** Estimated results of two point scatterers with a spacing of 0.18 m by (**a**) fully-polarimetric distributed compressed sensing (DCS) and (**b**) fully-polarimetric unitary multiple signal classification (UMUSIC).

Two fully-polarimetric algorithms make scatterers have the same estimated height in four polarizations. The specific estimated results are listed in Table 1. The PSMs of the two scatterers are estimated, where the scattering intensity of the cylinder return signal at −0.09 m is inconsistent with the truth because of polarimetric distortion. In Figure 4a, it is noteworthy that the pseudo-spectrum of CS is leaked to form false scattering points. There are two main reasons for signal leakage [13]: on the one hand, if the regularization parameter is too small in the optimization model, it can lead to over-fitting of data; on the other hand, the observed data does not satisfy the sparsity in the unit orthogonal basis. Therefore, it is necessary to use a sliding window to suppress signal leakage.


**Table 1.** Estimated results of two point scatterers with a spacing of 0.18 m.

#### *4.2. Case 2: Two Point Scatterers with a Spacing of 0.06 m*

When the spacing is reduced to 0.06 m (one-third of elevation Rayleigh limit), the estimated results of the two algorithms are displayed in Figure 5 and Table 2. Two fully-polarimetric algorithms still have high-resolution. Furthermore, polarimetric distortion of the cylinder return signals becomes more severe as the spacing decreases. Consequently, polarimetric calibration is necessary for a fully-polarimetric radar system.

**Figure 5.** Estimated results of two point scatterers with a spacing of 0.06 m by (**a**) fully-polarimetric DCS and (**b**) fully-polarimetric UMUSIC.


**Table 2.** Estimated results of two point scatterers with a spacing of 0.06 m.

#### *4.3. Case 3: Four Point Scatterers with a Spacing of 0.09 m*

The four point scatterers are a cylinder, a 67.5° rotated dihedral reflector, a 90° rotated dihedral reflector, and a plate, and their PSMs are listed in Table 3. According to the MIMO configuration, the bistatic angles of all transceiver channels are less than 10°. To simplify the simulation, we assume that the PSMs listed in Table 3 are applicable to all transceiver channels. It can be seen from Figure 6 that the pseudo-spectrums of the two fully-polarimetric algorithms are not affected when the number of scatterers increases. We summarize the estimation results in Table 4, which demonstrates that the estimation accuracy of the fully-polarimetric UMUSIC is higher than that of the fully-polarimetric DCS. The CS, which is essentially an optimization problem, needs to be solved iteratively; therefore, its processing speed is bound to be limited by the number of iterations. The simulation results show that, for a pixel, the fully-polarimetric UMUSIC is more than five times faster than the fully-polarimetric DCS under the same computing conditions.

**Table 3.** Polarimetric scattering matrix (PSM) of four point scatterers


**Figure 6.** Estimated heights of four point scatterers with a spacing of 0.09 m by (**a**) fully-polarimetric DCS and (**b**) fully-polarimetric UMUSIC.


**Table 4.** Estimated results of four point scatterers with a spacing of 0.09 m.

#### **5. Experiment**

An experimental polarimetric MIMO array has been upgraded based on the radar system in [7], and baselines with different heights are controlled by an elevator. It can be seen from Figure 7 that the polarimetric MIMO array consists of 20 receive elements and 6 transmit elements, where the combinations among them synthesize 80 transceiver channels. The measured target is an aircraft model with an elevation angle of 16 degrees on a foam support, as shown in Figure 8. M1, M2, and M3 represent three missile models mounted on the wing, respectively. To avoid complex scattering properties of cavity structures, the inlet of the aircraft model is sealed with copper foils. The measurement parameters are the same as the simulation parameters in Section 4.

**Figure 7.** An experimental polarimetric MIMO array.

**Figure 8.** Aircraft model.

It can be seen from Figure 9 that the scattering mechanisms of the aircraft model are different in four polarizations. In the HH image, there are three strong scattering centers including two parts that are not distinguished (see Figure 9a). Compared with the HH image, more components can be distinguished from the VV image. The scattering intensity of the two cross-polarization images is low. Figures 10–13 illustrate the 3D point cloud maps obtained from 24 2D images. The top views are similar to the 2D image, which proves that the 3D scattering intensity can be estimated by LSM. It can be seen from the bottom and side views that scattering centers with different heights are basically consistent with the aircraft model. In addition, the scattering intensity in front of the fuselage is higher than that of the fuselage tail due to the shielding of the supporting foam. By comparing the 3D point cloud maps in different polarizations, we can analyze its scattering mechanism.

**Figure 10.** Tomography results in HH polarization. Three views of the airplane model are shown: (**a**) top view; (**b**) bottom view; (**c**) side view.

**Figure 11.** Tomography result in HV polarization. Three views of the airplane model are shown: (**a**) top view; (**b**) bottom view; (**c**) side view.

**Figure 12.** Tomography result in VH polarization. Three views of the airplane model are shown: (**a**) top view; (**b**) bottom view; (**c**) side view.

**Figure 13.** Tomography result in VV polarization. Three views of the airplane model are shown: (**a**) top view; (**b**) bottom view; (**c**) side view.

Figure 14 illustrates tomographic image slices along the down range for HH (Figure 10), HV (Figure 11), VH (Figure 12), and VV (Figure 13). It can be seen from the figures that the scattering of the model varies markedly with height. We summarize the components of the model in Table 5, where the M1 tail and the M2 head cannot be distinguished because they have the same height, as is the case for the M2 tail and the rear wheel.

**Figure 14.** Tomographic image slices along the downrange: (**a**) HH, (**b**) HV, (**c**) VH, and (**d**) VV.


**Table 5.** Components of the aircraft in tomographic image slices.

#### **6. Conclusions**

This paper proposes a fully-polarimetric UMUSIC tomography algorithm to acquire high-resolution 3D radar imagery for a MIMO-SAR with a small number of baselines. In order to mitigate the effect of multi-looking on the range-azimuth resolution, we employ fully-polarimetric data and their conjugation to obtain the sample covariance matrix. Two algorithms, the fully-polarimetric DCS and the fully-polarimetric UMUSIC, are compared through numerical simulation of different point scatterers. Simulation results demonstrate that the fully-polarimetric UMUSIC outperforms the popular fully-polarimetric DCS in processing speed and estimation accuracy. Measurements of an aircraft model are conducted using an X-band experimental polarimetric MIMO-SAR which was upgraded from a previous system [7]. The resulting 3D images using six baselines demonstrate the usefulness of the algorithm for 3D imagery of complex radar targets.

**Author Contributions:** Methodology, L.K.; validation, L.K. and X.X.; writing—original draft preparation, L.K.; writing—review and editing, L.K. and X.X.

**Funding:** This research was funded by National Natural Science Foundation of China: 61371005.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Target Localization Using Double-Sided Bistatic Range Measurements in Distributed MIMO Radar Systems**

#### **Hyuksoo Shin and Wonzoo Chung \***

Division of Computer and Communications Engineering, Korea University, Seoul 02841, Korea; shs727@korea.ac.kr

**\*** Correspondence: wchung@korea.ac.kr

Received: 22 April 2019; Accepted: 30 May 2019; Published: 2 June 2019

**Abstract:** We develop a novel approach improving existing target localization algorithms for distributed multiple-input multiple-output (MIMO) radars based on bistatic range measurements (BRMs). In the proposed algorithms, we estimate the target position with auxiliary parameters consisting of both the target–transmitter distances and the target–receiver distances (hence, "double-sided") in contrast to the existing BRM methods. Furthermore, we apply the double-sided approach to multistage BRM methods. Performance improvements were demonstrated via simulations and a limited theoretical analysis was attempted for the ideal two-dimensional case.

**Keywords:** distributed MIMO radar; target localization; double-sided bistatic range (BR)

#### **1. Introduction**

In distributed multiple-input multiple-output (MIMO) radar systems, target localization based on the time delays between transmitters and receivers is an attractive research topic due to its high accuracy and simplicity [1–3]. As target-mediated time delays are nonlinear, estimation of target location via direct analysis of these delays is difficult. Hence, several approaches seeking to linearize the relationship between the target and the time delays have been proposed [4–15]. Of these, algorithms based on bistatic range measurements (BRMs), which are the sum of target–transmitter and target–receiver distances, are introduced in [6–15].

A single stage algorithm based on BRM, introduced first in [6,7], estimates the target position with the help of auxiliary parameters (distances between the target and transmitters or distances between the target and receivers). Multistage algorithms, such as those in [8–15], further refine the target position by re-using the estimates of the first-stage BRM method and exploiting their relationships, and asymptotically attain the Cramer–Rao lower bound (CRLB) [12] assuming accurate estimates of the first stage. A recent study [15] shows that the choice of auxiliary parameters (target–transmitter side or target–receiver side) in BRM methods affects the target estimation accuracy. Therefore, a systematic approach that utilizes all available auxiliary parameters optimally is desirable.

In this paper, we propose a novel approach that utilizes both target–transmitter distances and target–receiver distances as the auxiliary parameters, to improve the mean square error (MSE) performance. Furthermore, the proposed approach can be applied to the second stage of the multistage BRM algorithms, such as those of [8–15]. The existing multistage algorithms can be divided into two types depending on how they linearize the nonlinear relations between the target position and the auxiliary parameters estimated in the first stage: the algorithms in [8–12] linearize the nonlinear relationships by squaring them, and the algorithms in [13–15] use a first-order Taylor expansion to this end. We present two types of double-sided two-stage BRM algorithms by applying our approach to the most recent multistage BRM algorithms, i.e., two-stage methods using squared and Taylor-approximated relationships. The improved MSE performance of the proposed algorithms was demonstrated by simulations, and a limited theoretical analysis was attempted for an ideal two-dimensional case.

The remainder of this paper is organized as follows. We briefly review the BRM method with a distributed MIMO radar system model in Section 2. In Section 3, we develop double-sided, single- and two-stage BRM algorithms. A theoretical analysis for ideal two-dimensional target/antenna positions presented in Section 4 shows the improved MSE performance afforded by the double-sided BRM algorithm. The simulations of practical three-dimensional target/antenna positions presented in Section 5 confirm that our algorithms improve MSE performance. Our conclusions are presented in Section 6.

Table 1 lists the notations used in this paper.



#### **2. System Model for BRM Based Target Localization and Problem Formulation**

We consider a three-dimensional, widely separated MIMO radar system consisting of a single target located at an unknown position **x***o* = [*xo*, *yo*, *zo*]<sup>*T*</sup>, with *M* transmitting antennae (Tx) and *N* receiving antennae (Rx) located at known positions **x***t*(*m*) = [*xt*(*m*), *yt*(*m*), *zt*(*m*)]<sup>*T*</sup>, *m* = 1, ···, *M*, and **x***r*(*n*) = [*xr*(*n*), *yr*(*n*), *zr*(*n*)]<sup>*T*</sup>, *n* = 1, ···, *N*, respectively. We collect the antenna positions in the matrices **X***t* = [**x***t*(1), ···, **x***t*(*M*)] and **X***r* = [**x***r*(1), ···, **x***r*(*N*)].

The bistatic range (BR) between the *m*th Tx and the *n*th Rx, denoted by *rmn*, is defined as the sum of the distance from the *m*th Tx to the target, denoted by *dt*(*m*) = **x***<sup>o</sup>* − **x***t*(*m*) , and the distance from the target to the *n*th Rx, denoted by *dr*(*n*) = **x***<sup>o</sup>* − **x***r*(*n*) ([16]):

$$r\_{mn} = d\_l(m) + d\_l(n). \tag{1}$$

Each BR is measured by converting the estimated time delay between a Tx and an Rx to a distance. Any BR measurement (BRM) between the *m*th Tx and the *n*th Rx, denoted by *r*ˆ*mn*, is often corrupted by measurement error, denoted by *ωmn* and modeled as an i.i.d., zero-mean white Gaussian noise with variance *σ*<sup>2</sup> *<sup>ω</sup>* ([4]):

$$
\theta\_{mn} = r\_{mn} + \omega\_{mn}.\tag{2}
$$

The goal of BRM based target localization is to estimate the target location **x***<sup>o</sup>* from the BRMs {*r*ˆ*mn*}*m*=1,··· ,*M*, *<sup>n</sup>*=1,··· ,*N*.
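As an illustration of (1) and (2), the following minimal Python sketch generates noise-free BRs and noisy BRMs for a small geometry. The function names, the antenna coordinates and the noise level are hypothetical choices made for this example only; they are not taken from the simulations of this paper.

```python
import math
import random

def bistatic_ranges(target, tx, rx):
    """Noise-free BRs of (1): r[m][n] = d_t(m) + d_r(n)."""
    d_t = [math.dist(target, p) for p in tx]
    d_r = [math.dist(target, p) for p in rx]
    return [[dt + dr for dr in d_r] for dt in d_t]

def measure(r, sigma, rng):
    """BRMs of (2): add i.i.d. zero-mean Gaussian noise with standard deviation sigma."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in r]

# Hypothetical geometry (metres), chosen so that every distance is an integer.
target = (0.0, 0.0, 0.0)
tx = [(3.0, 4.0, 0.0), (0.0, 6.0, 8.0)]                      # d_t = 5, 10
rx = [(12.0, 0.0, 5.0), (0.0, 0.0, 2.0), (8.0, 15.0, 0.0)]   # d_r = 13, 2, 17

r = bistatic_ranges(target, tx, rx)                   # M x N table of r_mn
r_hat = measure(r, sigma=1.0, rng=random.Random(0))   # seeded for repeatability
```

The table `r` has one row per Tx and one column per Rx, which matches the index order of $r\_{mn}$ used in the remainder of this section.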

The BRM method in [6,7] jointly estimates the target location, $\mathbf{x}\_o$, and the distances from the Txs to the target, denoted by $\mathbf{d}\_t = [d\_t(1), \cdots, d\_t(M)]^T$, from the BRMs, using the following linear model in the presence of noise:

$$\mathbf{b}\_{t} = [\mathbf{1}\_{M \times 1} \otimes \mathbf{X}\_{r}^{T} - \mathbf{X}\_{t}^{T} \otimes \mathbf{1}\_{N \times 1}, -\mathbf{R}\_{t}][\mathbf{x}\_{o}^{T}, \mathbf{d}\_{t}^{T}]^{T} + \boldsymbol{\varepsilon}\_{t}, \tag{3}$$

where

$$\mathbf{b}\_{t} = \frac{1}{2} \begin{bmatrix} \|\mathbf{x}\_{r}(1)\|^{2} - \hat{r}\_{11}^{2} - \|\mathbf{x}\_{t}(1)\|^{2} \\ \vdots \\ \|\mathbf{x}\_{r}(N)\|^{2} - \hat{r}\_{MN}^{2} - \|\mathbf{x}\_{t}(M)\|^{2} \end{bmatrix} \tag{4}$$

$$\mathbf{R}\_t = \text{blkdiag}(\mathbf{r}\_1, \dots, \mathbf{r}\_M), \tag{5}$$

where $\mathbf{r}\_m = [\hat{r}\_{m1}, \cdots, \hat{r}\_{mN}]^T$ and $\boldsymbol{\varepsilon}\_t$ is a vector reflecting the BR measurement error ([7]).

Alternatively, the BRM equation can be constructed using the distances from the target to the Rxs, denoted by $\mathbf{d}\_r = [d\_r(1), \cdots, d\_r(N)]^T$, instead of the $\mathbf{d}\_t$ values:

$$\mathbf{b}\_{r} = [\mathbf{X}\_{t}^{T} \otimes \mathbf{1}\_{N \times 1} - \mathbf{1}\_{M \times 1} \otimes \mathbf{X}\_{r}^{T}, -\mathbf{R}\_{r}][\mathbf{x}\_{o}^{T}, \mathbf{d}\_{r}^{T}]^{T} + \boldsymbol{\varepsilon}\_{r}, \tag{6}$$

where

$$\mathbf{b}\_{r} = \frac{1}{2} \begin{bmatrix} \|\mathbf{x}\_{t}(1)\|^{2} - \hat{r}\_{11}^{2} - \|\mathbf{x}\_{r}(1)\|^{2} \\ \vdots \\ \|\mathbf{x}\_{t}(M)\|^{2} - \hat{r}\_{MN}^{2} - \|\mathbf{x}\_{r}(N)\|^{2} \end{bmatrix} \tag{7}$$

$$\mathbf{R}\_{r} = [\text{diag}(\mathbf{r}\_1), \dots, \text{diag}(\mathbf{r}\_M)]^T,\tag{8}$$

where $\boldsymbol{\varepsilon}\_r$ is a vector reflecting the BR measurement error [7]. Note that the estimated auxiliary parameters $\hat{\mathbf{d}}\_t$ or $\hat{\mathbf{d}}\_r$ contain information on the target position $\mathbf{x}\_o$. Multistage algorithms further refine the target position by exploiting this information.
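A short worked derivation, added here as a reading aid, shows where the rows of (3) and (4) come from. For one noise-free pair $(m, n)$, write (1) as $d\_r(n) = r\_{mn} - d\_t(m)$ and square both sides:

$$\|\mathbf{x}\_o - \mathbf{x}\_r(n)\|^2 = r\_{mn}^2 - 2 r\_{mn} d\_t(m) + \|\mathbf{x}\_o - \mathbf{x}\_t(m)\|^2.$$

Expanding the two norms cancels $\|\mathbf{x}\_o\|^2$ and leaves a relation that is linear in $\mathbf{x}\_o$ and $d\_t(m)$:

$$\frac{1}{2}\left(\|\mathbf{x}\_r(n)\|^2 - r\_{mn}^2 - \|\mathbf{x}\_t(m)\|^2\right) = (\mathbf{x}\_r(n) - \mathbf{x}\_t(m))^T \mathbf{x}\_o - r\_{mn} d\_t(m).$$

Replacing $r\_{mn}$ by $\hat{r}\_{mn}$ and stacking the $MN$ pairs gives (3); to first order in the noise, the residual of row $(m, n)$ is then $-d\_r(n)\,\omega\_{mn}$. Exchanging the roles of Tx and Rx gives (6) in the same way, with a first-order residual of $-d\_t(m)\,\omega\_{mn}$.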

The two-stage BRM method using the squared relationships ([12]) estimates the squared target position, $\mathbf{x}\_o \odot \mathbf{x}\_o$, from the $\hat{\mathbf{x}}\_o$ and $\hat{\mathbf{d}}\_t$ yielded by the first-stage BRM method, based on the following linear model (which reflects the relationship between $[\mathbf{x}\_o^T, \mathbf{d}\_t^T]^T$ and $\mathbf{x}\_o \odot \mathbf{x}\_o$):

$$\begin{bmatrix} \hat{\mathbf{x}}\_{o} \odot \hat{\mathbf{x}}\_{o}\\ \hat{\mathbf{d}}\_{t} \odot \hat{\mathbf{d}}\_{t} + 2\mathbf{X}\_{t}^{T}\hat{\mathbf{x}}\_{o} - (\mathbf{X}\_{t}^{T} \odot \mathbf{X}\_{t}^{T})\mathbf{1}\_{3 \times 1} \end{bmatrix} = \begin{bmatrix} \mathbf{I}\_{3} \\ \mathbf{1}\_{M \times 3} \end{bmatrix} (\mathbf{x}\_{o} \odot \mathbf{x}\_{o}) + \boldsymbol{\varepsilon}\_{S,t}, \tag{9}$$

where $\boldsymbol{\varepsilon}\_{S,t}$ is the error vector due to the first-stage estimation error ([12]).

Alternatively, we obtain the following linear model reflecting the relationship between $[\mathbf{x}\_o^T, \mathbf{d}\_r^T]^T$ and $\mathbf{x}\_o \odot \mathbf{x}\_o$:

$$
\begin{bmatrix}
\hat{\mathbf{x}}\_{o} \odot \hat{\mathbf{x}}\_{o} \\
\hat{\mathbf{d}}\_{r} \odot \hat{\mathbf{d}}\_{r} + 2\mathbf{X}\_{r}^{T}\hat{\mathbf{x}}\_{o} - (\mathbf{X}\_{r}^{T} \odot \mathbf{X}\_{r}^{T})\mathbf{1}\_{3 \times 1}
\end{bmatrix} = \begin{bmatrix}
\mathbf{I}\_{3} \\
\mathbf{1}\_{N \times 3}
\end{bmatrix} \left(\mathbf{x}\_{o} \odot \mathbf{x}\_{o}\right) + \boldsymbol{\varepsilon}\_{S,r}, \tag{10}
$$

where $\boldsymbol{\varepsilon}\_{S,r}$ is the error vector due to the first-stage estimation error [12].

Let $\widehat{\mathbf{x}\_o \odot \mathbf{x}\_o}$ denote the estimate of $\mathbf{x}\_o \odot \mathbf{x}\_o$ obtained from the linear model of (9) (or (10)); then, the refined target location, denoted by $\hat{\mathbf{x}}\_{o,S}$, is:

$$
\hat{\mathbf{x}}\_{o,S} = \text{sgn}(\hat{\mathbf{x}}\_{o}) \odot \sqrt{\widehat{\mathbf{x}\_{o} \odot \mathbf{x}\_{o}}}.\tag{11}
$$
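The squared relationship behind (9) and the sign recovery of (11) can be checked numerically with a few lines of Python. This is only a sketch under stated assumptions: the function names and the coordinates are hypothetical, the first-stage output is taken to be error-free, and no least squares step is performed (the example only evaluates the left-hand side of (9) and applies (11)).

```python
import math

def squared_model_lhs(x_hat, d_hat, antennas):
    """Left-hand side of (9)/(10): [x⊙x ; d⊙d + 2 X^T x - (X^T⊙X^T) 1]."""
    top = [v * v for v in x_hat]
    bottom = [
        d * d + 2.0 * sum(a * v for a, v in zip(p, x_hat)) - sum(a * a for a in p)
        for d, p in zip(d_hat, antennas)
    ]
    return top + bottom

def refine(x_hat, sq_est):
    """Equation (11): sign from the first stage, magnitude from the square root.

    Assumptions of this sketch: negative squared estimates are clamped to zero,
    and math.copysign treats a first-stage coordinate of exactly 0.0 as positive.
    """
    return [math.copysign(math.sqrt(max(s, 0.0)), v) for v, s in zip(x_hat, sq_est)]

# Hypothetical error-free first-stage output: target and one Tx (metres).
x_o = [3.0, -4.0, 12.0]
tx = [(1.0, 2.0, 3.0)]
d_t = [math.dist(x_o, tx[0])]            # distance 11

lhs = squared_model_lhs(x_o, d_t, tx)    # first three entries: 9, 16, 144
x_ref = refine(x_o, lhs[:3])             # recovers 3, -4, 12
```

In the error-free case the last entry of `lhs` equals the sum of the first three, which is exactly what the row of ones in $\mathbf{1}\_{M \times 3}$ expresses.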

The two-stage BRM method using Taylor approximated relationships [15] considers the first-order Taylor expansion of $d\_t(m)$ at $\hat{\mathbf{x}}\_o$ to be

$$\begin{split} d\_{t}(m) &= \hat{d}\_{t}(m) - \triangle d\_{t}(m) = \|\hat{\mathbf{x}}\_{o} - \triangle \mathbf{x}\_{o} - \mathbf{x}\_{t}(m)\| \\ &\simeq \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(m)\| - \frac{\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{t}^{T}(m)}{\|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(m)\|} \triangle \mathbf{x}\_{o} \end{split} \quad \text{for} \quad m = 1, \cdots, M, \tag{12}$$

where $\triangle\mathbf{x}\_o, \triangle d\_t(1), \cdots, \triangle d\_t(M)$ are the estimation errors at $\hat{\mathbf{x}}\_o$. The linear model reflecting the relationships of (12) is

$$
\begin{bmatrix}
\mathbf{0}\_{3\times 1} \\
\hat{d}\_{t}(1) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(1)\| \\
\vdots \\
\hat{d}\_{t}(M) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(M)\|
\end{bmatrix} = \begin{bmatrix}
-\mathbf{I}\_{3} \\
(\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{t}^{T}(1))/\|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(1)\| \\
\vdots \\
(\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{t}^{T}(M))/\|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(M)\|
\end{bmatrix} \triangle \mathbf{x}\_{o} + \begin{bmatrix}
\triangle \mathbf{x}\_{o} \\
\triangle d\_{t}(1) \\
\vdots \\
\triangle d\_{t}(M)
\end{bmatrix}.
\tag{13}
$$

Alternatively, we obtain the following linear model using $\mathbf{d}\_r$ instead of $\mathbf{d}\_t$:

$$
\begin{bmatrix}
\mathbf{0}\_{3\times 1} \\
\hat{d}\_{r}(1) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{r}(1)\| \\
\vdots \\
\hat{d}\_{r}(N) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{r}(N)\|
\end{bmatrix} = \begin{bmatrix}
-\mathbf{I}\_{3} \\
(\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{r}^{T}(1))/\|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{r}(1)\| \\
\vdots \\
(\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{r}^{T}(N))/\|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{r}(N)\|
\end{bmatrix} \triangle \mathbf{x}\_{o} + \begin{bmatrix}
\triangle \mathbf{x}\_{o} \\
\triangle d\_{r}(1) \\
\vdots \\
\triangle d\_{r}(N)
\end{bmatrix}.
\tag{14}
$$

We make an intermediate estimate of the first-stage error $\triangle\mathbf{x}\_o$ to refine the target position. Let $\widehat{\triangle\mathbf{x}}\_o$ denote the estimate of $\triangle\mathbf{x}\_o$ obtained from the linear model of (13) (or (14)); then, the refined target position, denoted by $\hat{\mathbf{x}}\_{o,A}$, is:

$$
\hat{\mathbf{x}}\_{o,A} = \hat{\mathbf{x}}\_{o} - \widehat{\triangle\mathbf{x}}\_{o}.\tag{15}
$$
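The quality of the first-order approximation in (12) can be examined numerically. The Python sketch below compares the exact distance with its linearization for one antenna; the function names and all coordinates are hypothetical values chosen for illustration, and the example checks only the approximation in (12), not the full second-stage estimator.

```python
import math

def exact_distance(x_hat, dx, antenna):
    """|| x_hat - dx - x_t ||, the middle expression of (12)."""
    return math.sqrt(sum((a - e - b) ** 2 for a, e, b in zip(x_hat, dx, antenna)))

def taylor_distance(x_hat, dx, antenna):
    """First-order approximation on the right-hand side of (12)."""
    diff = [a - b for a, b in zip(x_hat, antenna)]
    norm = math.sqrt(sum(v * v for v in diff))
    return norm - sum(u * e for u, e in zip(diff, dx)) / norm

x_hat = (100.0, 50.0, 20.0)      # hypothetical first-stage estimate (metres)
antenna = (0.0, 0.0, 0.0)        # hypothetical Tx position
dx = (1.0, -1.0, 0.5)            # hypothetical first-stage error

err = abs(exact_distance(x_hat, dx, antenna) - taylor_distance(x_hat, dx, antenna))

# Shrinking the error tenfold should shrink the approximation error about a hundredfold.
dx_small = tuple(0.1 * e for e in dx)
err_small = abs(exact_distance(x_hat, dx_small, antenna)
                - taylor_distance(x_hat, dx_small, antenna))
```

The quadratic decay of the approximation error is what makes the linearization useful when the first-stage error is small compared with the target–antenna distance.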

In the existing single-sided BRM methods, either $\mathbf{x}\_o$ and $\mathbf{d}\_t$ or $\mathbf{x}\_o$ and $\mathbf{d}\_r$ are used, but never both pairs together. As each BRM is the sum of a $\mathbf{d}\_t$ entry and a $\mathbf{d}\_r$ entry, target estimation accuracy can be improved by estimating $\mathbf{x}\_o$, $\mathbf{d}\_t$ and $\mathbf{d}\_r$ simultaneously in the first stage and by fully utilizing these estimates in the second stage. Thus, the goal of our paper is to develop target estimation schemes that use the Tx- and Rx-sided linear models simultaneously.

#### **3. The Double-Sided BRM Approach**

#### *3.1. The Double-Sided Single-Stage BRM Algorithm*

The target estimation performance of the BRM algorithm depends on the choice of auxiliary parameters (the transmitter-side parameters $\mathbf{d}\_t$ or the receiver-side parameters $\mathbf{d}\_r$), as shown in [15]. Such dependency implies that the linear models in (3) and (6) cannot fully exploit the target information in the BRM observations. Thus, by merging the two linear models in (3) and (6) into a single linear model and, consequently, estimating the target position, $\mathbf{d}\_t$ and $\mathbf{d}\_r$ simultaneously, we fully utilize all the BR information for target estimation.

To simultaneously estimate $\mathbf{x}\_o$, $\mathbf{d}\_t$ and $\mathbf{d}\_r$, we rewrite the two linear models of (3) and (6) as equivalent linear models with respect to $[\mathbf{x}\_o^T, \mathbf{d}\_t^T, \mathbf{d}\_r^T]^T$, by inserting the zero terms $\mathbf{0}\_{MN \times N}\mathbf{d}\_r$ and $\mathbf{0}\_{MN \times M}\mathbf{d}\_t$:

$$\mathbf{b}\_{t} = [\mathbf{1}\_{M \times 1} \otimes \mathbf{X}\_{r}^{T} - \mathbf{X}\_{t}^{T} \otimes \mathbf{1}\_{N \times 1}, -\mathbf{R}\_{t}, \mathbf{0}\_{MN \times N}][\mathbf{x}\_{o}^{T}, \mathbf{d}\_{t}^{T}, \mathbf{d}\_{r}^{T}]^{T} + \boldsymbol{\varepsilon}\_{t},\tag{16}$$

$$\mathbf{b}\_{r} = [\mathbf{X}\_{t}^{T} \otimes \mathbf{1}\_{N \times 1} - \mathbf{1}\_{M \times 1} \otimes \mathbf{X}\_{r}^{T}, \mathbf{0}\_{MN \times M}, -\mathbf{R}\_{r}][\mathbf{x}\_{o}^{T}, \mathbf{d}\_{t}^{T}, \mathbf{d}\_{r}^{T}]^{T} + \boldsymbol{\varepsilon}\_{r}. \tag{17}$$

Using the above linear equations, we construct a single linear model with respect to $[\mathbf{x}\_o^T, \mathbf{d}\_t^T, \mathbf{d}\_r^T]^T$ as follows:

$$\mathbf{b} = \mathbf{H}[\mathbf{x}\_o^T, \mathbf{d}\_t^T, \mathbf{d}\_r^T]^T + \boldsymbol{\varepsilon},\tag{18}$$

where $\mathbf{b} = [\mathbf{b}\_t^T, \mathbf{b}\_r^T]^T$, $\boldsymbol{\varepsilon} = [\boldsymbol{\varepsilon}\_t^T, \boldsymbol{\varepsilon}\_r^T]^T$, and

$$\mathbf{H} = \begin{bmatrix} \mathbf{1}\_{M \times 1} \otimes \mathbf{X}\_r^T - \mathbf{X}\_t^T \otimes \mathbf{1}\_{N \times 1} & -\mathbf{R}\_t & \mathbf{0}\_{MN \times N} \\ \mathbf{X}\_t^T \otimes \mathbf{1}\_{N \times 1} - \mathbf{1}\_{M \times 1} \otimes \mathbf{X}\_r^T & \mathbf{0}\_{MN \times M} & -\mathbf{R}\_r \end{bmatrix}. \tag{19}$$

The weighted least squares (WLS) solution of (18), denoted by $[\hat{\mathbf{x}}\_o^T, \hat{\mathbf{d}}\_t^T, \hat{\mathbf{d}}\_r^T]^T$, is:

$$[\hat{\mathbf{x}}\_{o}^{T}, \hat{\mathbf{d}}\_{t}^{T}, \hat{\mathbf{d}}\_{r}^{T}]^{T} = (\mathbf{H}^{T} \mathbf{W} \mathbf{H})^{-1} \mathbf{H}^{T} \mathbf{W} \mathbf{b},\tag{20}$$

where the diagonal weighting matrix **W** is:

$$\mathbf{W} = \text{diag}\left(\sigma\_{\omega}^{2} \begin{bmatrix} (\mathbf{d}\_{r} \odot \mathbf{d}\_{r}) \otimes \mathbf{1}\_{M \times 1} \\ \mathbf{1}\_{N \times 1} \otimes (\mathbf{d}\_{t} \odot \mathbf{d}\_{t}) \end{bmatrix} \right)^{-1}. \tag{21}$$

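In practice, the true $\mathbf{d}\_t$ and $\mathbf{d}\_r$ in (21) are unknown, so we approximate $\mathbf{W}$ using the $\mathbf{d}\_t$ and $\mathbf{d}\_r$ estimated by a least squares (LS) pass (i.e., substituting an identity matrix for $\mathbf{W}$ in (20)), as in previous methods [6–15]. Note that $\mathbf{W}$ is built from only the diagonal terms of the error covariance matrix $\text{Cov}[\boldsymbol{\varepsilon}]$ rather than from the full matrix, since $\text{Cov}[\boldsymbol{\varepsilon}]$ is not invertible here.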

The analysis of Section 4 shows that, for ideal two-dimensional target/antenna positions, our double-sided BRM method halves the MSE of the target location estimate relative to the existing BRM method. The numerical simulations presented in Section 5 show that our method also affords better MSE performance than the existing BRM method for practical target/antenna positions.
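The first-stage estimator of (16)–(21) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`wls`, `double_sided_model`, `double_sided_brm`) and the antenna coordinates are hypothetical, the normal equations are solved by simple Gaussian elimination, and the common factor $\sigma\_\omega^2$ is dropped from the weights because it cancels in (20). The weights follow the first-order error model in which row $(m, n)$ of the Tx-side block has variance proportional to $d\_r^2(n)$ and row $(m, n)$ of the Rx-side block has variance proportional to $d\_t^2(m)$.

```python
import math

def wls(H, b, w):
    """Weighted LS solution (20) for a diagonal W with entries w, via the normal equations."""
    k = len(H[0])
    A = [[sum(wi * row[i] * row[j] for wi, row in zip(w, H)) for j in range(k)]
         for i in range(k)]
    y = [sum(wi * row[i] * bi for wi, row, bi in zip(w, H, b)) for i in range(k)]
    aug = [A[i] + [y[i]] for i in range(k)]      # augmented system [A | y]
    for c in range(k):                           # forward elimination, partial pivoting
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, k):
            f = aug[r][c] / aug[c][c]
            aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    theta = [0.0] * k
    for c in reversed(range(k)):                 # back substitution
        tail = sum(aug[c][j] * theta[j] for j in range(c + 1, k))
        theta[c] = (aug[c][k] - tail) / aug[c][c]
    return theta

def double_sided_model(tx, rx, r_hat):
    """Stack the Tx-side rows (16) on the Rx-side rows (17); unknowns are [x_o, d_t, d_r]."""
    M, N, dim = len(tx), len(rx), len(tx[0])
    sq = lambda p: sum(v * v for v in p)
    H, b = [], []
    for m in range(M):                           # Tx-side block, Tx index outermost
        for n in range(N):
            row = [rx[n][i] - tx[m][i] for i in range(dim)] + [0.0] * (M + N)
            row[dim + m] = -r_hat[m][n]
            H.append(row)
            b.append(0.5 * (sq(rx[n]) - r_hat[m][n] ** 2 - sq(tx[m])))
    for m in range(M):                           # Rx-side block, same row order
        for n in range(N):
            row = [tx[m][i] - rx[n][i] for i in range(dim)] + [0.0] * (M + N)
            row[dim + M + n] = -r_hat[m][n]
            H.append(row)
            b.append(0.5 * (sq(tx[m]) - r_hat[m][n] ** 2 - sq(rx[n])))
    return H, b

def double_sided_brm(tx, rx, r_hat):
    """LS pass with W = I, then one WLS pass with weights built from the LS distances."""
    M, N, dim = len(tx), len(rx), len(tx[0])
    H, b = double_sided_model(tx, rx, r_hat)
    theta = wls(H, b, [1.0] * len(b))
    d_t, d_r = theta[dim:dim + M], theta[dim + M:]
    w = [1.0 / d_r[n] ** 2 for m in range(M) for n in range(N)]    # Tx-side rows
    w += [1.0 / d_t[m] ** 2 for m in range(M) for n in range(N)]   # Rx-side rows
    return wls(H, b, w)

# Hypothetical geometry (kilometres) with the antennas spread around the target.
tx = [(1.0, 0.0, 0.5), (-1.0, 0.2, -0.4), (0.1, 1.0, -0.6), (-0.2, -1.0, 0.7)]
rx = [(0.8, 0.7, -0.5), (-0.7, 0.8, 0.6), (-0.8, -0.6, -0.7), (0.6, -0.8, 0.4)]
target = (0.2, -0.1, 0.15)
r = [[math.dist(target, t) + math.dist(target, q) for q in rx] for t in tx]

theta = double_sided_brm(tx, rx, r)   # [x_o (3 values), d_t (4 values), d_r (4 values)]
```

With noise-free BRs the sketch returns the true target position and distances up to rounding error, which is a convenient sanity check before noisy BRMs are substituted for `r`.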

#### *3.2. The Double-Sided Two-Stage BRM Algorithms*

In this subsection, we develop two double-sided two-stage BRM algorithms by modifying the single-sided two-stage BRM algorithms described above, which use the squared relationships [12] and the Taylor approximation [15], so that they fully utilize the parameters ($\hat{\mathbf{x}}\_o$, $\hat{\mathbf{d}}\_t$, and $\hat{\mathbf{d}}\_r$) estimated by the first-stage double-sided BRM algorithm.

#### 3.2.1. Proposed Double-Sided Two-Stage BRM Algorithm Using the Squared Relationships

As for the single-stage algorithm, we construct an extended linear model reflecting the relationships between $\mathbf{d}\_t$, $\mathbf{d}\_r$ and $\mathbf{x}\_o \odot \mathbf{x}\_o$ by merging the two single-sided linear models of (9) and (10) as follows:

$$
\begin{bmatrix}
\hat{\mathbf{x}}\_{o} \odot \hat{\mathbf{x}}\_{o} \\
\hat{\mathbf{d}}\_{t} \odot \hat{\mathbf{d}}\_{t} + 2\mathbf{X}\_{t}^{T}\hat{\mathbf{x}}\_{o} - (\mathbf{X}\_{t}^{T} \odot \mathbf{X}\_{t}^{T})\mathbf{1}\_{3\times 1} \\
\hat{\mathbf{d}}\_{r} \odot \hat{\mathbf{d}}\_{r} + 2\mathbf{X}\_{r}^{T}\hat{\mathbf{x}}\_{o} - (\mathbf{X}\_{r}^{T} \odot \mathbf{X}\_{r}^{T})\mathbf{1}\_{3\times 1}
\end{bmatrix} = \begin{bmatrix}
\mathbf{I}\_{3} \\
\mathbf{1}\_{M\times 3} \\
\mathbf{1}\_{N\times 3}
\end{bmatrix} (\mathbf{x}\_{o} \odot \mathbf{x}\_{o}) + \boldsymbol{\varepsilon}\_{p}, \tag{22}
$$

where $\boldsymbol{\varepsilon}\_p$ is the error vector due to the first-stage estimation error. The model of (22) provides an estimate of the squared target location, $\mathbf{x}\_o \odot \mathbf{x}\_o$, using all of $[\hat{\mathbf{x}}\_o^T, \hat{\mathbf{d}}\_t^T, \hat{\mathbf{d}}\_r^T]^T$ given by the first-stage double-sided BRM algorithm. Denote $[\mathbf{I}\_3, \mathbf{1}\_{M \times 3}^T, \mathbf{1}\_{N \times 3}^T]^T$ as $\mathbf{H}\_p$; then, the WLS solution of (22), denoted by $\widehat{\mathbf{x}\_o \odot \mathbf{x}\_o}$, is:

$$\widehat{\mathbf{x}\_{o} \odot \mathbf{x}\_{o}} = (\mathbf{H}\_{p}^{T}\mathbf{W}\_{p}\mathbf{H}\_{p})^{-1}\mathbf{H}\_{p}^{T}\mathbf{W}\_{p}\begin{bmatrix}\hat{\mathbf{x}}\_{o}\odot\hat{\mathbf{x}}\_{o}\\ \hat{\mathbf{d}}\_{t}\odot\hat{\mathbf{d}}\_{t} + 2\mathbf{X}\_{t}^{T}\hat{\mathbf{x}}\_{o} - (\mathbf{X}\_{t}^{T}\odot\mathbf{X}\_{t}^{T})\mathbf{1}\_{3\times 1}\\ \hat{\mathbf{d}}\_{r}\odot\hat{\mathbf{d}}\_{r} + 2\mathbf{X}\_{r}^{T}\hat{\mathbf{x}}\_{o} - (\mathbf{X}\_{r}^{T}\odot\mathbf{X}\_{r}^{T})\mathbf{1}\_{3\times 1}\end{bmatrix}.\tag{23}$$

The weighting matrix $\mathbf{W}\_p$ is:

$$\mathbf{W}\_p = (\mathbf{T}(\mathbf{H}^T \mathbf{W} \mathbf{H})^{-1} \mathbf{T}^T)^{-1},\tag{24}$$

where

$$\mathbf{T} = 2 \begin{bmatrix} \text{diag}(\mathbf{x}\_{o}) & \mathbf{0}\_{3 \times (M+N)} \\ \mathbf{A}^{T} & \text{diag}([\mathbf{d}\_{t}^{T}, \mathbf{d}\_{r}^{T}]^{T}) \end{bmatrix} \tag{25}$$

$$\mathbf{A} = [\mathbf{X}\_{t}, \mathbf{X}\_{r}].\tag{26}$$

The final target position estimate, denoted by $\hat{\mathbf{x}}\_{o,DS}$, is:

$$\hat{\mathbf{x}}\_{o,DS} = \text{sgn}(\hat{\mathbf{x}}\_{o}) \odot \sqrt{\widehat{\mathbf{x}\_{o} \odot \mathbf{x}\_{o}}}.\tag{27}$$
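To make the structure of (22) and (23) concrete, the sketch below solves the merged model by ordinary least squares. It is a simplified illustration and not the estimator of (23) itself: the weighting matrix $\mathbf{W}\_p$ of (24) is replaced by the identity (an assumption made here for brevity), and the function names and coordinates are hypothetical. Because $\mathbf{H}\_p$ consists of an identity block followed by rows of ones, the normal equations can be inverted in closed form.

```python
import math

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def squared_stage_ls(x_hat, d_t_hat, d_r_hat, tx, rx):
    """Unweighted LS estimate of x_o ⊙ x_o from the merged model (22).

    H_p = [I; 1; 1], so H_p^T H_p = I + K * ones, where K = M + N is the number
    of all-ones rows; its inverse is I - K / (1 + dim * K) * ones (Sherman-Morrison).
    """
    dim = len(x_hat)
    top = [v * v for v in x_hat]

    def side(d_hat, antennas):
        return [d * d + 2.0 * dot(p, x_hat) - dot(p, p) for d, p in zip(d_hat, antennas)]

    bottom = side(d_t_hat, tx) + side(d_r_hat, rx)
    K = len(bottom)
    g = [t + sum(bottom) for t in top]           # H_p^T times the left-hand side of (22)
    shift = K * sum(g) / (1.0 + dim * K)
    return [gi - shift for gi in g]              # (H_p^T H_p)^{-1} H_p^T h

# Hypothetical error-free first-stage output (metres): one Tx and one Rx.
x_o = [3.0, -4.0, 12.0]
tx, rx = [(1.0, 2.0, 3.0)], [(0.0, 0.0, 0.0)]
d_t = [math.dist(x_o, tx[0])]                    # distance 11
d_r = [math.dist(x_o, rx[0])]                    # distance 13

sq_est = squared_stage_ls(x_o, d_t, d_r, tx, rx)   # close to [9, 16, 144]
```

Applying (27) to `sq_est` with the signs of the first-stage estimate then gives the refined position; with noisy first-stage values, the weighting of (24) would replace the identity used here.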

3.2.2. Proposed Double-Sided Two-Stage BRM Algorithm Using the Taylor Approximated Relationships

To utilize all the $[\hat{\mathbf{x}}\_o^T, \hat{\mathbf{d}}\_t^T, \hat{\mathbf{d}}\_r^T]^T$ values given by the first-stage double-sided BRM algorithm, we construct the following extended linear model, which reflects the Taylor approximated relationships between $\hat{\mathbf{d}}\_t$, $\hat{\mathbf{d}}\_r$ and $\hat{\mathbf{x}}\_o$, by merging the linear models in (13) and (14):

$$\begin{bmatrix} \mathbf{0}\_{3 \times 1} \\ \hat{d}\_{t}(1) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(1)\| \\ \vdots \\ \hat{d}\_{t}(M) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(M)\| \\ \hat{d}\_{r}(1) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{r}(1)\| \\ \vdots \\ \hat{d}\_{r}(N) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{r}(N)\| \end{bmatrix} = \begin{bmatrix} -\mathbf{I}\_{3} \\ (\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{t}^{T}(1))/\|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(1)\| \\ \vdots \\ (\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{t}^{T}(M))/\|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(M)\| \\ (\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{r}^{T}(1))/\|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{r}(1)\| \\ \vdots \\ (\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{r}^{T}(N))/\|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{r}(N)\| \end{bmatrix} \triangle \mathbf{x}\_{o} + \begin{bmatrix} \triangle \mathbf{x}\_{o} \\ \triangle d\_{t}(1) \\ \vdots \\ \triangle d\_{t}(M) \\ \triangle d\_{r}(1) \\ \vdots \\ \triangle d\_{r}(N) \end{bmatrix},\tag{28}$$

where $\triangle\mathbf{x}\_o, \triangle d\_t(1), \cdots, \triangle d\_t(M), \triangle d\_r(1), \cdots, \triangle d\_r(N)$ are the estimation errors at $\hat{\mathbf{x}}\_o$. The model of (28) provides an estimate of $\triangle\mathbf{x}\_o$. Let us denote

$$\mathbf{H}\_{p} = \begin{bmatrix} -\mathbf{I}\_{3} \\ (\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{t}^{T}(1)) / \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(1)\| \\ \vdots \\ (\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{t}^{T}(M)) / \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{t}(M)\| \\ (\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{r}^{T}(1)) / \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{r}(1)\| \\ \vdots \\ (\hat{\mathbf{x}}\_{o}^{T} - \mathbf{x}\_{r}^{T}(N)) / \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_{r}(N)\| \end{bmatrix}; \tag{29}$$

then, the WLS solution of (28), denoted by $\widehat{\triangle\mathbf{x}}\_o$, is:

$$
\widehat{\triangle\mathbf{x}}\_{o} = (\mathbf{H}\_p^T \mathbf{W}\_p \mathbf{H}\_p)^{-1} \mathbf{H}\_p^T \mathbf{W}\_p \begin{bmatrix} \mathbf{0}\_{3 \times 1} \\ \hat{d}\_t(1) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_t(1)\| \\ \vdots \\ \hat{d}\_t(M) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_t(M)\| \\ \hat{d}\_r(1) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_r(1)\| \\ \vdots \\ \hat{d}\_r(N) - \|\hat{\mathbf{x}}\_{o} - \mathbf{x}\_r(N)\| \end{bmatrix}, \tag{30}
$$

where the weighting matrix $\mathbf{W}\_p$ is

$$\mathbf{W}\_{p} = \left(\mathbf{H}^T \mathbf{W} \mathbf{H}\right)^{-1}.\tag{31}$$

The final target position estimate, denoted by $\hat{\mathbf{x}}\_{o,DA}$, is:

$$
\hat{\mathbf{x}}\_{o,DA} = \hat{\mathbf{x}}\_{o} - \widehat{\triangle\mathbf{x}}\_{o}.\tag{32}
$$

Unfortunately, a theoretical performance analysis of (27) and (32) is virtually impossible given their complexity. However, the simulation results presented in Section 5 indicate that our double-sided approach improves on the existing algorithms.

Table 2 compares the overall complexity of the double-sided algorithms to that of single-sided algorithms in terms of the number of multiplications.


**Table 2.** Complexity table of the target localization algorithms.

The extra complexity of the double-sided algorithms is attributable principally to the larger matrix used for WLS computation. The increased computation cost scales polynomially, but is acceptable given the performance gain demonstrated by the simulations presented in Section 5.

#### **4. Performance Analysis of Double-Sided BRM Method for Ideal Target/Antennae Positions**

Here, we derive the target estimation MSEs of our double-sided BRM method and of the BRM method of Noroozi [7] when the two-dimensional target/antenna positions are ideal. Deriving general theoretical MSEs of the target estimates is extremely complicated; the existing study in [7] assumes that the target/antenna distributions in the $x$-$y$ plane are ideal. Accepting this, let the target be at (without loss of generality) $\mathbf{x}\_o = [0, 0]^T$, and let the antennae be located uniformly around the target:

$$\begin{aligned} \mathbf{x}\_{t}(m) &= d \left[ \cos \left( \theta\_{0} + \frac{2 \pi m}{M} \right), \sin \left( \theta\_{0} + \frac{2 \pi m}{M} \right) \right]^{T}, \\ \mathbf{x}\_{r}(n) &= d \left[ \cos \left( \phi\_{0} + \frac{2 \pi n}{N} \right), \sin \left( \phi\_{0} + \frac{2 \pi n}{N} \right) \right]^{T}, \end{aligned} \tag{33}$$

where $d$ is the common distance between the target and each antenna, and $\theta\_0$ and $\phi\_0$ are distinct angles.

Assuming small BR errors, the error covariance matrix of the WLS estimator can be derived from [17,18]:

$$\text{Cov}[[\hat{\mathbf{x}}\_o^T, \hat{\mathbf{d}}\_t^T, \hat{\mathbf{d}}\_r^T]^T - [\mathbf{x}\_o^T, \mathbf{d}\_t^T, \mathbf{d}\_r^T]^T] = \left(\mathbf{H}\_o^T \mathbf{W} \mathbf{H}\_o\right)^{-1} \mathbf{H}\_o^T \mathbf{W}\, \text{Cov}[\boldsymbol{\varepsilon}]\, \mathbf{W} \mathbf{H}\_o \left(\mathbf{H}\_o^T \mathbf{W} \mathbf{H}\_o\right)^{-1},\tag{34}$$

where $\mathbf{H}\_o$ is the noise-free version of $\mathbf{H}$ (derived by substituting $r\_{mn}$ for $\hat{r}\_{mn}$ in (19)). Under the above assumption, $\mathbf{d}\_t$ and $\mathbf{d}\_r$ simplify to $d\mathbf{1}\_{M \times 1}$ and $d\mathbf{1}\_{N \times 1}$, respectively; hence, the weighting matrix $\mathbf{W}$ of (21) and the covariance matrix of $\boldsymbol{\varepsilon} = [\boldsymbol{\varepsilon}\_t^T, \boldsymbol{\varepsilon}\_r^T]^T$, $\text{Cov}[\boldsymbol{\varepsilon}]$, simplify to:

$$\mathbf{W} = 1/(d^2 \sigma\_{\omega}^2) \mathbf{I}\_{2MN} \tag{35}$$

$$\text{Cov}[\boldsymbol{\varepsilon}] = d^2 \sigma\_{\omega}^2 \begin{bmatrix} \mathbf{I}\_{MN} & \mathbf{I}\_{MN} \\ \mathbf{I}\_{MN} & \mathbf{I}\_{MN} \end{bmatrix} . \tag{36}$$

As the antennas are uniformly located on a circle of radius *d*, the assumption further yields the following properties (the results for Rxs are the same):

$$\sum\_{m=1}^{M} x\_{t}(m) = \sum\_{m=1}^{M} y\_{t}(m) = 0\tag{37}$$

$$\sum\_{m=1}^{M} x\_{t}^{2}(m) = \sum\_{m=1}^{M} y\_{t}^{2}(m) = Md^{2}/2. \tag{38}$$
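The properties (37) and (38) are easy to confirm numerically for the placement of (33). The following Python sketch does so for one hypothetical choice of $M$, $d$ and $\theta\_0$ (the function name and the values are illustrative assumptions). One point worth noting is that (38) relies on having at least three antennas: for $M = 2$ the two sums of squares depend on $\theta\_0$ and are not equal in general.

```python
import math

def circle_sums(M, d, theta0):
    """Sums appearing in (37) and (38) for M antennas placed as in (33)."""
    xs = [d * math.cos(theta0 + 2.0 * math.pi * m / M) for m in range(1, M + 1)]
    ys = [d * math.sin(theta0 + 2.0 * math.pi * m / M) for m in range(1, M + 1)]
    return sum(xs), sum(ys), sum(x * x for x in xs), sum(y * y for y in ys)

# Five antennas on a circle of radius 2: the squared sums should be M d^2 / 2 = 10.
sx, sy, sxx, syy = circle_sums(M=5, d=2.0, theta0=0.3)

# Two antennas on a circle of radius 1: (38) would require both squared sums to be 1.
_, _, sxx2, syy2 = circle_sums(M=2, d=1.0, theta0=0.3)
```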

Using (37) and (38), each term of (34), $(\mathbf{H}\_o^T \mathbf{W} \mathbf{H}\_o)^{-1}$ and $\mathbf{H}\_o^T \mathbf{W}\, \text{Cov}[\boldsymbol{\varepsilon}]\, \mathbf{W} \mathbf{H}\_o$, can be simplified as follows:

$$(\mathbf{H}\_o^T \mathbf{W} \mathbf{H}\_o)^{-1} = \frac{\sigma\_\omega^2}{MN} \begin{bmatrix} \mathbf{I}\_2 & \frac{1}{2d} \mathbf{A} \\ \frac{1}{2d} \mathbf{A}^T & \mathbf{B} \end{bmatrix} \tag{39}$$

$$\mathbf{H}\_{o}^{T}\mathbf{W}\,\text{Cov}[\boldsymbol{\varepsilon}]\,\mathbf{W}\mathbf{H}\_{o} = \frac{1}{\sigma\_{\omega}^{2}} \begin{bmatrix} \mathbf{0}\_{2 \times 2} & \mathbf{0}\_{2 \times (M+N)} \\ \mathbf{0}\_{(M+N)\times 2} & \mathbf{D} \end{bmatrix} \tag{40}$$

where $\mathbf{A}$ is that of (26), and

$$\mathbf{D} = \begin{bmatrix} 4N\mathbf{I}\_M & 4\,\mathbf{1}\_{M \times N} \\ 4\,\mathbf{1}\_{N \times M} & 4M\mathbf{I}\_N \end{bmatrix} \tag{41}$$

hence,

$$\begin{split} \text{Cov}[[\hat{\mathbf{x}}\_o^T, \hat{\mathbf{d}}\_t^T, \hat{\mathbf{d}}\_r^T]^T - [\mathbf{x}\_o^T, \mathbf{d}\_t^T, \mathbf{d}\_r^T]^T] \\ = \sigma\_\omega^2 \begin{bmatrix} \frac{1}{4d^2} \mathbf{A} \mathbf{D} \mathbf{A}^T & \frac{1}{2d} \mathbf{A} \mathbf{D} \mathbf{B} \\ \frac{1}{2d} \mathbf{B} \mathbf{D} \mathbf{A}^T & \mathbf{B} \mathbf{D} \mathbf{B} \end{bmatrix}. \end{split} \tag{42}$$

As the MSEs of the $x$ and $y$ components are the (1, 1) and (2, 2) elements of $\text{Cov}[[\hat{\mathbf{x}}\_o^T, \hat{\mathbf{d}}\_t^T, \hat{\mathbf{d}}\_r^T]^T - [\mathbf{x}\_o^T, \mathbf{d}\_t^T, \mathbf{d}\_r^T]^T]$, we are interested only in $(1/4d^2)\mathbf{A}\mathbf{D}\mathbf{A}^T$. Using (38) once more, $(1/4d^2)\mathbf{A}\mathbf{D}\mathbf{A}^T$ is:

$$\frac{1}{4d^2} \mathbf{A} \mathbf{D} \mathbf{A}^T = \frac{1}{MN} \mathbf{I}\_2. \tag{43}$$

Thus, we finally obtain

$$E\left\{ (\hat{x}\_{o} - x\_{o})^2 \right\} = E\left\{ (\hat{y}\_{o} - y\_{o})^2 \right\} = \frac{\sigma\_{\omega}^2}{MN}.\tag{44}$$

Meanwhile, under the same assumption, the MSEs of the existing BRM method in [7] are:

$$E\left\{\left(\hat{x}\_{BRM} - x\_{o}\right)^{2}\right\} = E\left\{\left(\hat{y}\_{BRM} - y\_{o}\right)^{2}\right\} = \frac{2\sigma\_{\omega}^{2}}{MN}.\tag{45}$$

A comparison of (44) and (45) shows that our method improves the MSE performance of the BRM method by a factor of two, given the assumed two-dimensional target/antenna positioning. As presented in the following section, simulations highlighted the improvements afforded by our algorithms when practical target/antenna settings were evaluated.
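The results (44) and (45) can be reproduced numerically by evaluating the covariance expression (34) directly for the ideal geometry of (33). The Python sketch below is one way to do this under stated assumptions: the helper functions (`transpose`, `matmul`, `inverse`, `ideal_mse`), the angles $\theta\_0 = 0.1$ and $\phi\_0 = 0.7$, and the values of $M$, $N$, $d$ and $\sigma\_\omega^2$ are hypothetical choices for illustration, and the matrices are formed with the noise-free BR $r\_{mn} = 2d$.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for these small matrices)."""
    n = len(A)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def ideal_mse(M, N, d=1.0, sigma2=1.0, theta0=0.1, phi0=0.7):
    """Return the x-coordinate MSE of (double-sided, single-sided) BRM for geometry (33)."""
    tx = [(d * math.cos(theta0 + 2 * math.pi * m / M),
           d * math.sin(theta0 + 2 * math.pi * m / M)) for m in range(1, M + 1)]
    rx = [(d * math.cos(phi0 + 2 * math.pi * n / N),
           d * math.sin(phi0 + 2 * math.pi * n / N)) for n in range(1, N + 1)]
    r = 2.0 * d                       # noise-free BR of every Tx-Rx pair
    Ht, Hr = [], []                   # noise-free Tx-side and Rx-side rows of (19)
    for m in range(M):
        for n in range(N):
            diff = [rx[n][i] - tx[m][i] for i in range(2)]
            row_t = diff + [0.0] * (M + N)
            row_t[2 + m] = -r
            row_r = [-v for v in diff] + [0.0] * (M + N)
            row_r[2 + M + n] = -r
            Ht.append(row_t)
            Hr.append(row_r)
    w = 1.0 / (d * d * sigma2)        # W = w I, equation (35)
    scale = lambda A: [[w * v for v in row] for row in A]
    H = Ht + Hr
    P = inverse(scale(matmul(transpose(H), H)))          # (H^T W H)^{-1}
    # With Cov[eps] from (36), H^T W Cov[eps] W H = w G^T G, where G = Ht + Hr.
    G = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(Ht, Hr)]
    Q = scale(matmul(transpose(G), G))
    cov_double = matmul(matmul(P, Q), P)                 # equation (34)
    # Single-sided model (3): unknowns [x_o, d_t]; there Cov[eps_t] = W_t^{-1},
    # so (34) collapses to (Ht^T W_t Ht)^{-1}.
    Ht_s = [row[:2 + M] for row in Ht]
    cov_single = inverse(scale(matmul(transpose(Ht_s), Ht_s)))
    return cov_double[0][0], cov_single[0][0]

mse_double, mse_single = ideal_mse(M=4, N=4, d=2.0, sigma2=5.0)
```

For these values the two results agree with $\sigma\_\omega^2/(MN) = 5/16$ and $2\sigma\_\omega^2/(MN) = 10/16$ up to rounding error, and they do not depend on the radius $d$.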

#### **5. Numerical Simulation for Practical Target/Antennae Positions**

Figure 1 presents the MSE performances of the proposed algorithms for the antenna positions specified in Table 3 and a target located at $\mathbf{x}\_o = [0, 0, 0]^T$ m. The results in Figure 1a show that our double-sided BRM method consistently affords better MSE performance than the single-sided BRM method of Noroozi [7], and the results in Figure 1b,c show that the double-sided two-stage BRM algorithms afford better MSE performance than the single-sided two-stage BRM methods of Amiri [12] and Wang [15].

Figure 2 presents the MSEs of the target estimates when the target moves along the $x$-axis with the $y$ and $z$ target positions fixed at $y\_o = 400$ m and $z\_o = 100$ m, and with the antennas positioned as specified in Table 4. Here, the noise variance, $\sigma\_\omega^2$, was set to 5 m². The simulations shown in Figure 2 revealed that our algorithms afforded better MSE performance than the existing algorithms for all target positions tested.


**Table 3.** Transmitter and receiver positions (m).

**Figure 1.** Target estimation MSE of the double-sided and single-sided algorithms with respect to noise variance: (**a**) single-stage; (**b**) two-stage using squared relations; and (**c**) two-stage using approximated relations.

**Table 4.** Transmitter and receiver positions (m).

**Figure 2.** Target estimation MSE of the double-sided and single-sided algorithms with respect to the target position: (**a**) single-stage; (**b**) two-stage using squared relations; and (**c**) two-stage using approximated relations.

#### **6. Conclusions**

Here, we developed a novel target localization approach that improves the target estimation accuracy of existing BRM based algorithms for distributed MIMO radars. The proposed double-sided BRM method estimates the target position together with the target–transmitter and target–receiver distances. We also applied the double-sided approach to two-stage BRM methods. The improvements afforded by the proposed algorithms were confirmed theoretically for an ideal scenario, and via numerical simulations for practical scenarios.

**Author Contributions:** All authors contributed equally to this work. The final manuscript has been read and approved by all authors for submission.

**Funding:** This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (grant No. 2018R1A2B6001456).

**Conflicts of Interest:** The authors declare that they have no competing interests.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **Research of a Radar Imaging Algorithm Based on High Pulse Repetition Random Frequency Hopping Synthetic Wideband Waveform**

#### **Songhua He \* and Xiaotian Wu \***

School of Information Science and Engineering, Hunan University, Changsha 410082, China **\*** Correspondence: hesonghua@hnu.edu.cn (S.H.); xiaotian\_w@hnu.edu.cn (X.W.)

Received: 17 November 2019; Accepted: 6 December 2019; Published: 9 December 2019

**Abstract:** To address imaging with high-pulse-repetition random-frequency-hopping synthetic wideband radar on a supersonic/hypersonic aircraft platform, this study established an echo simulation model of target and clutter, analyzed the special range-Doppler coupling effect and its influence on imaging, and proposed a pipeline-parallel imaging method based on generalized 2D matched filtering and Doppler pre-processing. In this method, Doppler beam sharpening is moved forward and performed together with pulse compression in each frame, so that the special range-Doppler coupling effect caused by the highly dynamic motion of the platform and by random frequency hopping in bandwidth synthesis is well suppressed. Several random frequency hopping modes were designed, and the pipeline-parallel image processing algorithm was optimized for each mode. Theoretical analysis and simulation results show that the proposed imaging method effectively avoids the divergence of 2D range-Doppler images in the range direction and can meet the requirements of real-time imaging.

**Keywords:** high pulse repetition frequency (HPRF); random frequency hopping (RFH); radar imaging; hypersonic aircraft

#### **1. Introduction**

To improve target detection, low probability of intercept, and anti-jamming performance, the high pulse repetition frequency (HPRF) synthetic wideband waveform has been used for radar imaging in supersonic/hypersonic aircraft guidance [1–5]. HPRF decreases velocity ambiguity and reduces the folding of clutter in the Doppler direction, and thereby improves the signal-to-clutter ratio (SCR) and the target detection ability. In addition, combined with random frequency hopping (RFH), HPRF increases the number of pulses accumulated per unit time and improves the ratio of the signal accumulation gain of the detector (coherent detection) to that of the interceptor (non-coherent detection), which allows the peak power of the transmitted signal to be decreased and improves the low-probability-of-intercept performance of the radar. Furthermore, wideband HPRF RFH improves the anti-jamming ability of guidance radar. RFH prevents a reconnaissance jammer from predicting the frequency-hopping pattern adopted in each frame, and makes answering deception jamming and narrowband blocking jamming difficult to implement at each frequency point. HPRF and RFH also enable the radar receiver to select target echoes from a relatively narrow range region according to the number of periods of return delay, and to suppress deceptive repeater jamming that is not in the same range region as the target, especially jamming from the side-lobe direction.

In the case of RFH, the frequency-domain sampling of the radar signal in each frame is random and non-uniform. Because the basic frequency changes randomly between frames, the equivalent time-domain sampling of the inter-frame Doppler processing is also non-uniform and random. It is therefore difficult to adopt fast imaging algorithms based on the Inverse Discrete Fourier Transform (IDFT), whether in pulse-compression processing in the range direction or in beam-sharpening processing in the Doppler direction. In addition, in supersonic/hypersonic applications, the large broadening of the clutter Doppler spectrum, the special range-Doppler coupling effect, and the large motion-compensation residue in the echoes make it difficult to use the conventional approach of processing the range and Doppler dimensions separately. Therefore, it is important to develop new imaging methods and fast algorithms for RFH radar. The existing non-uniform DFT (NU-DFT) fast algorithms, such as the Vandermonde determinant method [6,7], the regular Fourier matrix method [8,9], and the min-max interpolation method [10,11], place specific constraints on the structure of the non-uniformly sampled signals, and their versatility is relatively poor. Compressed sensing technology [12] has also been widely used in the field of RFH synthetic wideband imaging [13–19]. According to the theory of compressed sensing, the target and clutter background must meet the basic conditions of sparsity. In the cases of wideband high-range-resolution (HRR) imaging with a high signal-to-noise ratio (SNR) or high SCR, the number of observed targets, or the number of scattering centers of the targets, is small compared with the number of range resolution units, which provides a good guarantee of sparsity in the high-resolution range profiles. The advantage of the compressed sensing method is that it can reconstruct the range profile from a small number of observations, which means that, in RFH radar, the number of transmitted pulses can be reduced effectively without decreasing the resolution of the range profile.
However, the compressed sensing algorithm needs a large number of matrix inversion operations, which makes it difficult to meet the real-time requirement, especially in hypersonic-platform-borne (HPB) radar imaging applications with limited computing resources and an extremely short platform-target intersection time. In addition, in the case of strong clutter and low SCR, the sparsity required by compressed sensing is also difficult to meet.

In view of the above problems, we established the scattering echo model of target and clutter for an HPRF RFH radar system and supersonic/hypersonic aircraft platform, and analyzed the special range-Doppler coupling effect and its influence on imaging. We also proposed the 2D range-Doppler imaging method and the 1D HRR imaging method based on Doppler pre-processing and 2D generalized matched-filtering (GMF) processing. Additionally, we designed several RFH modes, and proposed the corresponding pipeline-parallel processing, fast, real-time imaging algorithm for a different RFH mode. Theoretical analysis and simulation experiments showed that the proposed imaging method could effectively suppress the special range Doppler-coupling effect, achieve good imaging performances, and easily meet the real-time imaging requirements of supersonic/hypersonic aircraft-borne application.

#### **2. Echo Modeling of RFH Synthetic Wideband Radar**

Set the imaging processing time period of RFH radar as *M*\**N*\**T*. *M* is the number of sub-frames, which is the number of accumulated frames required for Doppler processing, the size of which defines the speed resolution; *N* is the number of frequency hopping points per frame, the synthetic bandwidth of which defines the range resolution; *T* is the pulse repetition period. In the case of HPRF, the echo delay τ*<sup>H</sup>* of targets and sea/land clutters from the main-lobe direction is much greater than *T*, *nsT* < τ*<sup>H</sup>* < (*ns* + 1)*T*, where *ns* is integer. Here we assume that each receiving complex frame lags behind the transmitting complex frame for *ns* periods and the period in the receiving complex frame is numbered (*n*|*m*), where *m* is the number of the frame (*m* = 0, 1, ... , *M* − 1) and *n* is the number of pulse period in each frame (*n* = 0, 1, ... , *N* − 1). In each pulse period, the receiving signal is sampled and the sampling interval is equal to the pulse-width τ; then, the total number of sampling points in each period is *K* = *INT*[*T*/τ] (*INT*[.] represents rounding down). Taking the starting time of each period as the reference, the corresponding sampling time is τ, 2τ, ... , *K*τ, which is numbered *k* = 0, 1, ... , *K* − 1 respectively, and *k* is the number of sampling unit. By using the *ns*-period-delayed frequency hopping pattern to construct the local reference signal, the echo signal is coherently received, and the echo signal with a range of *cnsT*-*c*(*ns* + 1)*T* (*c* is the speed of light) is selected by the IF filter of receiver. For any scattering point in the selected range, according to the radar principle, the received/sampled signal can be expressed as follows:

$$\begin{aligned} x(n|m,k) = A\exp\{j2\pi[&(2R/c)\Delta f\_d i\_{mn} - (2f\_m VNT/c)m - (2f\_m VT/c)n - (2\Delta f\_d VT/c)n i\_{mn} \\ &- (2\Delta f\_d VNT/c)m i\_{mn} - (2\Delta f\_d V\tau/c)k i\_{mn} - (2\Delta f\_d V/c)\tau i\_{mn} \\ &- (2f\_m V\tau/c)k + 2f\_m (R - V\tau)/c]\}. \end{aligned} \tag{1}$$

Here, the pure RFH mode (fast hopping intra frame/wide hopping inter frame; other modes are special cases of this mode) is investigated, where *f<sub>m</sub>* is the basic frequency of the transmitting signal at the *m*-th frame; Δ*f<sub>d</sub>* is the minimum frequency jump interval determined by the minimum quantization level of the direct digital synthesizer (DDS); *i<sub>mn</sub>* is an integer which is randomly selected according to a certain frequency hopping pattern from the integer set [0, 1, 2, ... , *I* − 1], where *I* = Δ*F*/Δ*f<sub>d</sub>* (the same value cannot be repeated in the same frame); Δ*F* is the synthetic bandwidth; *f<sub>m</sub>* + Δ*f<sub>d</sub>i<sub>mn</sub>* is the carrier frequency of the transmitting signal in the *n*-th period of the *m*-th frame; *R*′, *R*, and *V* are, respectively, the actual range, ambiguous range, and radial velocity of the point target at the starting time of the first pulse period of each receiving complex frame, with *R*′ = *R* + *cn<sub>s</sub>T*/2 and 0 ≤ *R* < *cT*/2. The velocity is defined as positive for movement toward the radar and is assumed to remain unchanged within *MNT* (the time of a complex frame, usually several milliseconds); the slight change of *V* within such a short time is ignored.

As shown in Figure 1, in the *n*-th pulse period of the *m*-th frame in each complex frame, the time delay of the received echo is 2(*R* − *VmNT* − *VnT* − *Vn<sub>s</sub>T*)/*c* relative to the starting time *mNT* + *nT* of this period. Because the *k*-th sampling unit in each period can only acquire echoes whose delay lies between *k*τ and (*k* + 1)τ, the echo is sampled by the *k*-th sampling unit when *k*τ < 2(*R* − *VmNT* − *VnT* − *Vn<sub>s</sub>T*)/*c* < (*k* + 1)τ.

**Figure 1.** Sequence chart of each frame.

It is apparent that if the scatter has a range-walk of δ*R* toward the radar in a complex frame and 0 < *R* − *ck*τ/2 < δ*R*, then the echo is sampled first by the *k*-th unit in some frames, and then by the (*k* − 1)-th unit in the remaining frames, which is called cross-sampling-unit movement. For supersonic/hypersonic applications, considering the large-scale cross-sampling-unit range-walk in a complex frame, the following requirement must be met for each scattering point:

$$kc\tau/2 < R - VmNT - VnT - Vn\_sT < (k+1)c\tau/2.$$

For (*n*|*m*, *k*) combinations that do not satisfy the above equation, *x*(*n*|*m*, *k*) = 0.
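The echo model and the sampling-unit condition above can be sketched numerically as follows. This is a minimal illustration, not the authors' simulator: the function name `point_echo` and all parameter values are hypothetical, and the phase is evaluated in the compact form 2π(*f<sub>m</sub>* + Δ*f<sub>d</sub>i<sub>mn</sub>*)·2[*R* − *V*(*mNT* + *nT* + *k*τ + τ)]/*c*, which expands into the terms of Equation (1).

```python
import numpy as np

C = 3.0e8  # speed of light (m/s)

def point_echo(A, R, V, f_m, i_mn, df_d, T, tau, n_s, K):
    """Samples x(n|m,k) of one scatterer, returned with shape (M, N, K).

    R is the ambiguous range at the start of the complex frame, V the radial
    velocity (positive toward the radar); f_m has shape (M,), i_mn (M, N).
    """
    M, N = i_mn.shape
    m = np.arange(M)[:, None, None]
    n = np.arange(N)[None, :, None]
    k = np.arange(K)[None, None, :]
    f = (f_m[:, None] + df_d * i_mn)[:, :, None]       # carrier f_m + df_d * i_mn
    t = m * N * T + n * T + k * tau + tau               # time offsets appearing in Eq. (1)
    phase = 2.0 * np.pi * f * 2.0 * (R - V * t) / C     # compact form of the Eq. (1) phase
    # Sampling-unit condition: k*c*tau/2 < R - V*m*N*T - V*n*T - V*n_s*T < (k+1)*c*tau/2
    R_now = R - V * (m * N * T + n * T + n_s * T)
    gate = (k * C * tau / 2 < R_now) & (R_now < (k + 1) * C * tau / 2)
    return np.where(gate, A * np.exp(1j * phase), 0.0)

# Hypothetical parameters chosen so that the scatterer crosses a unit boundary.
M, N, K = 8, 16, 10
T, tau, n_s = 1.0e-5, 1.0e-6, 5
rng = np.random.default_rng(0)
f_m = 1.0e10 + 5.0e8 * rng.integers(0, 4, size=M)       # base frequency per frame
i_mn = np.stack([rng.choice(256, size=N, replace=False) for _ in range(M)])
x = point_echo(1.0, 450.5, 1500.0, f_m, i_mn, 1.0e6, T, tau, n_s, K)
units = np.argmax(np.abs(x) > 0, axis=2)                # occupied sampling unit per pulse
```

With these assumed values the scatterer starts just above the boundary between units 2 and 3 (450 m) and approaches at 1500 m/s, so `units` changes from 3 to 2 part-way through the complex frame, which is the cross-sampling-unit movement described above.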

The advantage of the simulation model shown in Equation (1) is that it can fully describe the actual cross-sampling-unit movement in supersonic/hypersonic applications. In addition, it is suitable for panoramic simulation of echoes from the area that is illuminated by the main-lobe of radar beam. The panoramic clutter area can be divided into many grids, and echo from each grid can be simulated as point scattering by using Equation (1). The target can be simulated as multiple scattering centers, and echo from each scattering center can be simulated by using Equation (1). The target may be close to the junction of two sampling unit, and echoes from the target may appear successively at two adjacent sampling units (which is the so-called cross-sampling-unit range-walk). The clutter echoes appear at most sampling units (in the case of HPRF, *cT*/2 is larger than but very close to the radial length of the illuminated area of main-lobe). According to Equation (1), the return data containing both clutter-background and target can be simulated as follows:

$$
x\_S(n|m,k) = x\_T(n|m,k) + x\_C(n|m,k),
$$

where *xT*(*n*|*m*, *k*) is the target echo, which can be expressed as the sum of the point scattering echoes of multiple scattering centers; *xC*(*n*|*m*, *k*) is the clutter echo, which can be expressed as the sum of the point scattering echoes of each clutter grid in the main-lobe illuminated area. The amplitude of each clutter scattering point is randomly selected according to the Rayleigh distribution, and the parameter σ<sup>2</sup> of the Rayleigh distribution is controlled according to the required SCR.
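As a hedged illustration of how the Rayleigh parameter might be tied to a required SCR, the sketch below assumes that the SCR is defined as total target power over total expected clutter power across all clutter grids. That definition, the function name `rayleigh_sigma_for_scr`, and the numbers are assumptions of this example; the text does not specify them.

```python
import numpy as np

def rayleigh_sigma_for_scr(target_power, n_grids, scr_db):
    """Rayleigh scale sigma such that the expected total clutter power
    (n_grids * 2 * sigma**2) equals target_power / SCR."""
    clutter_power = target_power / 10.0 ** (scr_db / 10.0)
    return np.sqrt(clutter_power / (2.0 * n_grids))

rng = np.random.default_rng(0)
n_grids = 2000
sigma = rayleigh_sigma_for_scr(target_power=1.0, n_grids=n_grids, scr_db=-10.0)
amps = rng.rayleigh(scale=sigma, size=n_grids)   # one amplitude per clutter grid
total_clutter_power = np.sum(amps ** 2)          # close to 10 for SCR = -10 dB
```

The mean square of a Rayleigh variable with scale σ is 2σ², which is why the factor 2 appears in the denominator.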

From the moving speed of the platform and the angle between the moving direction of the platform and the illumination direction of the beam, the estimated value *V<sub>C</sub>* of the radial speed of the center of the main-lobe clutter can be obtained. Clutter-center velocity compensation is applied to the data acquired in each complex frame as follows:

$$\begin{aligned} y\_S(n|m,k) = x\_S(n|m,k) \times \exp\{j2\pi[&(2f\_m V\_C NT/c)m + (2f\_m V\_C T/c)n + (2\Delta f\_d V\_C T/c)n i\_{mn} \\ &+ (2\Delta f\_d V\_C NT/c)m i\_{mn} + (2\Delta f\_d V\_C \tau/c)k i\_{mn} + (2\Delta f\_d V\_C/c)\tau i\_{mn} \\ &+ (2f\_m V\_C \tau/c)k + 2f\_m V\_C \tau/c]\}. \end{aligned} \tag{2}$$

Considering the Doppler broadening effect and the velocity estimation error of the moving platform, let *v* = *V* − *V<sub>C</sub>* be the velocity surplus of the scatter relative to the clutter center. After clutter-center velocity compensation, the sampled signal of each scatter can be expressed as follows:

$$\begin{aligned} y(n|m,k) = A \exp\{j2\pi[&(2R/c)\Delta f\_d i\_{mn} - (2f\_m vNT/c)m - (2f\_m vT/c)n \\ &- (2\Delta f\_d vT/c)n i\_{mn} - (2\Delta f\_d vNT/c)m i\_{mn} + 2f\_m R/c]\}, \end{aligned} \tag{3}$$

where ϕ(*f*0,*R*, *v*) is a constant term independent of (*n*|*m*, *k*).

Because the velocity surplus of clutter or target is far less than the platform velocity, some phase terms in the compensated signal can be ignored, which can simplify the subsequent imaging process. The ignored phase terms are *j*2π[−(2Δ*fdv*τ/*c*)*kimn*], *j*2π[−(2Δ*fdv*/*c*)τ*imn*], *j*2π(−2 *fmv*τ/*c*), and *j*2π(−2 *fmv*τ/*c*)*k*, the variation range of which is not more than π/4 in a complex frame.
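A quick numerical check of the size of the ignored terms can be sketched as below. The values of the carrier frequency, bandwidth, velocity surplus, pulse width, and number of sampling units are hypothetical, and the function `ignored_term_bounds` is only an illustration that bounds each term by its largest magnitude (taking Δ*f<sub>d</sub>i<sub>mn</sub>* ≤ Δ*F* and *k* ≤ *K* − 1).

```python
import numpy as np

C = 3.0e8  # speed of light (m/s)

def ignored_term_bounds(f_max, dF, v, tau, K):
    """Upper bounds (rad) on the four phase terms dropped after compensation."""
    return {
        "k*i_mn term": 2 * np.pi * 2 * dF * v * tau * (K - 1) / C,
        "tau*i_mn term": 2 * np.pi * 2 * dF * v * tau / C,
        "constant tau term": 2 * np.pi * 2 * f_max * v * tau / C,
        "k term": 2 * np.pi * 2 * f_max * v * tau * (K - 1) / C,
    }

# Assumed example: X-band carrier, 500 MHz synthetic bandwidth, 30 m/s surplus.
bounds = ignored_term_bounds(f_max=1.0e10, dF=5.0e8, v=30.0, tau=1.0e-6, K=10)
largest = max(bounds, key=bounds.get)
```

For these assumed values the largest bound is the *k* term at roughly 0.11 rad, well below π/4; a larger velocity surplus or a longer pulse period would bring it closer to that limit.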

#### **3. High Quality Real-Time Imaging of HPB HPRF RFH Radar**

#### *3.1. Special Range-Doppler Coupling Effect and Its Suppression*

In the case of conventional stepped-frequency (SF) synthetic wideband radar system, *i<sub>mn</sub>*Δ*f<sub>d</sub>* = *n*Δ*f* and *f<sub>m</sub>* = *f*<sub>0</sub> in Equation (3), where Δ*f* is the frequency interval between adjacent pulses and *N*Δ*f* is the synthetic bandwidth. Each frame has the same basic frequency *f*<sub>0</sub> and the same stepped-frequency hopping. According to Equation (3), in any frame *m*, the change of signal phase between pulses mainly depends on the phase term 2π(2*R*/*c*)Δ*fn*, which is only related to the range *R*. Therefore, FFT processing or pulse compression processing in each frame can be used to obtain the distribution of scatters in the range direction; i.e., the target range profile [20]. The range resolution determined by DFT is *c*/(2*N*Δ*f*). The second-order phase term 2π(2Δ*fvT*/*c*)*n*<sup>2</sup> in Equation (3) may cause energy diffusion of the scattering center in the range profile, but the diffusion can be ignored because the synthetic bandwidth *N*Δ*f* is far smaller than the carrier frequency *f*<sub>0</sub> and the phase variation of the second-order phase term is very small. The other velocity-related phase terms 2π(2*f*<sub>0</sub>*vT*/*c*)*n* and 2π(2Δ*fvNT*/*c*)*mn* are linear with *n*, and their influence on pulse compression is that the position of the scatter on the FFT spectrum is shifted by an offset of *f*<sub>0</sub>*v*(1 + *mN*)*T*/Δ*f*, which is called the range-Doppler coupling effect. The range-Doppler coupling effect in an SF radar system can cause error in range measurement, but cannot cause significant diffusion of energy or degradation of imaging quality. After pulse compression and envelope alignment of the range profile in each frame, the inter-frame phase change of each range unit mainly depends on the phase term 2π(2*f*<sub>0</sub>*vNT*/*c*)*m*. Therefore, FFT processing or Doppler processing in each range unit can be used to obtain the distribution of scatters in the velocity or Doppler direction. The distribution of the scatters on the 2D range-Doppler plane can be obtained by synthesizing the distributions of all the range units. As described above, in the conventional SF system, the imaging processing method of pulse compression in each frame first, followed by Doppler processing in each range resolution unit, is generally adopted.

It can be seen from Equation (3) that the phase of the signal is related in a complex way to the range and speed of the scatter due to HPRF and RFH. In each frame *m*, the range-related phase term 2π(2*R*/*c*)Δ*f<sub>d</sub>i<sub>mn</sub>* changes randomly and nonlinearly between pulses because *i<sub>mn</sub>* changes randomly. In order to use the traditional FFT for pulse compression in each frame, it is necessary to rearrange the data in ascending order of frequency, and then interpolate the non-uniform frequency-sampled data into uniform frequency-sampled data. The randomly changing phase term 2π(2*R*/*c*)Δ*f<sub>d</sub>i<sub>mn</sub>* is transformed into the linearly changing phase term 2π(2*R*/*c*)Δ*fn* after rearrangement and interpolation. However, data rearrangement randomizes the original linear range-Doppler coupling phase term 2π(2*f<sub>m</sub>vT*/*c*)*n*. For supersonic/hypersonic applications, even if the clutter-center velocity compensation is made by using Equation (2), the phase change of the coupling phase term caused by the velocity residual *v* is still large for the scatters that are not in the direction of the beam center, and it can be close to or even more than 2π in one frame. The random change of phase is equivalent to adding multiplicative noise to the signal, and it seriously reduces the coherence of the rearranged data, which leads to serious energy divergence of scatters and degradation of imaging quality. For inter-frame Doppler processing, because *i<sub>mn</sub>* changes randomly, the velocity-related phase term 2π(2*f<sub>m</sub>vNT*/*c*)*m* changes randomly and nonlinearly between frames, which makes the Doppler processing complicated. The random and nonlinear range-Doppler coupling effect exists not only within the frame but also between frames, so it is difficult to process the range and Doppler directions separately.

In this paper, the above phenomenon is called the special range-Doppler coupling effect of RFH synthetic wideband radar in a highly dynamic application. Because of the above special effect, the conventional imaging processing method of pulse compression in each frame at first and then Doppler processing in each range resolution unit cannot be adopted in HPRF RFH radar. In order to suppress the special range-Doppler coupling effect, Doppler processing must be advanced to each frame and be synchronous with the pulse compression processing, which is called Doppler pre-processing in this paper.

The imaging algorithm of Doppler pre-processing is based on the 2D GMF algorithm, which can be executed by means of pipeline-parallel processing, and the computation can be dispersed to each frame in combination with the data acquisition process. The algorithm can be optimized in real-time ability according to different RFH modes.

As an example, the basic principle of suppressing the above special coupling effect through Doppler pre-processing is illustrated by the following intra-frame pseudo RFH mode where *imn*Δ*fd* = *in*Δ*f* and *fm* = *f*<sup>0</sup> (the basic frequency of pulse signal remains unchanged between frames; for different *n*, *in* randomly takes different values in [0, 1, 2, ... , *N* − 1] without repetition); then,

$$\begin{aligned} y(n|m,k) = A \exp\{j2\pi[&(2R/c)\Delta f i\_n - (2f\_0 vNT/c)m - (2f\_0 vT/c)n \\ &- (2\Delta f vT/c)n i\_n - (2\Delta f vNT/c)m i\_n + 2f\_0 R/c]\}. \end{aligned}$$

Obviously, for any fixed period number *n*, the phase terms 2π(2*f*<sub>0</sub>*vNT*/*c*)*m* and 2π(2Δ*fvNT*/*c*)*mi<sub>n</sub>* vary non-randomly and linearly with the frame number *m*. Therefore, the *M* sampled data {*y<sub>S</sub>*(*n*|*m*, *k*) : *m* = 0, 1, ... , *M* − 1} of the same period number *n* and the same sampling unit number *k* can be processed first by using FFT (Doppler pre-processing). The velocity resolution converted from the DFT spectral resolution is Δ*v* = *c*/(2*f*<sub>0</sub>*NMT*), and the distribution of the scatters in the velocity direction or the Doppler direction can be obtained. In the Doppler pre-processed data {*Y<sub>S</sub>*(*n*|*l<sub>v</sub>*, *k*) : *l<sub>v</sub>* = 0, 1, ... , *M* − 1}, the phase terms of the signal on the *l<sub>v</sub>*-th speed channel that change with *n* are mainly 2π(2*R*/*c*)Δ*fi<sub>n</sub>* and 2π(2*f*<sub>0</sub>*vT*/*c*)*n*. Due to the accumulation or filtering effect of the DFT, the range of *v* in this channel is [*l<sub>v</sub>*Δ*v* − Δ*v*/2, *l<sub>v</sub>*Δ*v* + Δ*v*/2]. If the signal on the *l<sub>v</sub>*-th speed channel is phase-compensated by the phase factor exp{*j*2π(2*f*<sub>0</sub>*l<sub>v</sub>*Δ*vT*/*c*)*n*} during or after Doppler processing, the phase term 2π(2*f*<sub>0</sub>*vT*/*c*)*n* becomes 2π(2*f*<sub>0</sub>*v*′*T*/*c*)*n*, where *v*′ = *v* − *l<sub>v</sub>*Δ*v* and −Δ*v*/2 < *v*′ < Δ*v*/2. As long as the accumulation time *MNT* is long enough or the resolution of velocity is high enough, Δ*v* is small enough, and the variation of the term 2π(2*f*<sub>0</sub>*v*′*T*/*c*)*n* can be far less than π/4. Rearrange the data {*Y<sub>S</sub>*(*n*|*l<sub>v</sub>*, *k*) : *n* = 0, 1, ... , *N* − 1} on each speed channel *l<sub>v</sub>* in ascending order of *i<sub>n</sub>*; then, the phase term 2π(2*R*/*c*)Δ*fi<sub>n</sub>* becomes 2π(2*R*/*c*)Δ*fn*, while 2π(2*f*<sub>0</sub>*v*′*T*/*c*)*n* becomes a random phase term of small value, which can be ignored.

By FFT processing of the rearranged data on each speed channel, the distribution of the scatters along range direction of each speed channel can be obtained. By synthesizing all the speed channels, the distribution of the scatters on the 2D range-Doppler plane can be obtained. It has almost the same imaging effect as the conventional imaging algorithm used in SF Radar.
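The following sketch illustrates the Doppler pre-processing chain for the intra-frame pseudo-RFH example, and compares it with the conventional order (rearrangement and pulse compression in each frame first, then Doppler processing). It simulates one scatterer in one sampling unit with unit amplitude; all parameter values are hypothetical, no window is applied, and velocity channels are indexed from 0 without the *M*/2 offset. It is an illustration under these assumptions, not the authors' implementation.

```python
import numpy as np

C = 3.0e8
M, N = 64, 64
f0, df, T = 1.0e10, 1.0e6, 1.0e-5          # assumed carrier, step, and pulse period
dv = C / (2 * f0 * M * N * T)               # velocity resolution
lu_true, lv_true = 20, 48
R = lu_true * C / (2 * N * df)              # range placed on range cell lu_true
v = lv_true * dv                            # velocity surplus on channel lv_true

rng = np.random.default_rng(1)
i_n = rng.permutation(N)                    # hopping order, identical in every frame
m = np.arange(M)[:, None]
n = np.arange(N)[None, :]
phase = ((2 * R / C) * df * i_n[None, :] - (2 * f0 * v * N * T / C) * m
         - (2 * f0 * v * T / C) * n - (2 * df * v * T / C) * n * i_n[None, :]
         - (2 * df * v * N * T / C) * m * i_n[None, :] + 2 * f0 * R / C)
y = np.exp(2j * np.pi * phase)              # y(n|m), shape (M, N)
order = np.argsort(i_n)                     # ascending-frequency rearrangement

# Doppler pre-processing: Doppler DFT over m for each period n ...
Y = np.fft.ifft(y, axis=0) * M              # rows are velocity channels l_v
lv = np.arange(M)[:, None]
# ... per-channel compensation of the coupling term, then rearrange and FFT over n.
Y = Y * np.exp(2j * np.pi * (2 * f0 * lv * dv * T / C) * n)
img_pre = np.fft.fft(Y[:, order], axis=1)   # image indexed as [l_v, l_u]

# Conventional order: rearrange and compress each frame, then Doppler processing.
img_conv = np.fft.ifft(np.fft.fft(y[:, order], axis=1), axis=0) * M

peak_pre = np.unravel_index(np.argmax(np.abs(img_pre)), img_pre.shape)
```

In this example the pre-processed image peaks at the simulated cell (channel 48, range cell 20) with a magnitude close to *MN*, whereas in the conventional order the rearrangement randomizes the uncompensated coupling phase and the peak is markedly lower.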

For other RFH modes, Doppler pre-processing can also suppress the above special range-Doppler coupling effect, and different fast 2D range-Doppler imaging algorithms can be obtained.

#### *3.2. Image Processing Based on 2D GMF*

According to the theory of matched filtering, for any transmitting signal waveform, as long as it has a certain bandwidth and a certain time width, the 2D range-Doppler image of the detected area can be obtained from the return signal through 2D matched filtering processing at the receiving end. The range resolution of the image depends on the effective bandwidth of the transmitting signal, and the speed resolution depends on the effective time-width of the signal. If the random frequencies are uniformly-distributed, the effective bandwidth is proportional to the synthetic bandwidth Δ*F*.

Supposing the required non-ambiguous range depth of imaging at each sampling unit is *R<sub>p</sub>*, the parameters Δ*F* and *N* are designed to satisfy *R<sub>p</sub>* = *cN*/(2Δ*F*). If the range depth *R<sub>p</sub>* is divided into *N* range cells, the range width of each cell is *c*/(2Δ*F*), which is exactly the nominal range resolution corresponding to the synthetic bandwidth Δ*F* of the RFH signal. The non-ambiguous velocity measurement range [0, *c*/(2*f*<sub>0</sub>*NT*)] is divided into *M* velocity cells, and the velocity width of each cell is *c*/(2*f*<sub>0</sub>*MNT*), which is exactly the velocity resolution corresponding to the accumulation time *MNT* of a complex frame. According to Equation (3) and the principle of 2D GMF, the processing of 2D range-Doppler segment imaging at each sampling unit *k* can be described as follows:

$$\begin{aligned} P(l\_u, l\_v, k) = \sum\_{m=0}^{M-1} \sum\_{n=0}^{N-1} & W\_{\Omega}(m, n) y\_S(n|m, k) \exp\{-j2\pi[(f\_m + \Delta f\_d i\_{mn}) k\tau]\} \times \exp\left\{-j2\pi\left[\frac{f\_m + \Delta f\_d i\_{mn}}{\Delta F} l\_u\right]\right\} \\ & \times \exp\left\{j2\pi\left[\frac{f\_m + \Delta f\_d i\_{mn}}{f\_0}\left(l\_v - \frac{M}{2}\right)\frac{m}{M}\right]\right\} \times \exp\left\{j2\pi\left[\frac{f\_m + \Delta f\_d i\_{mn}}{f\_0}\left(l\_v - \frac{M}{2}\right)\frac{n}{MN}\right]\right\}, \end{aligned} \tag{4}$$

where {*P*(*l<sub>u</sub>*, *l<sub>v</sub>*, *k*) : *l<sub>u</sub>* = 0, 1, ... , *N* − 1; *l<sub>v</sub>* = 0, 1, ... , *M* − 1} is called the 2D segment image of the target area obtained by the sampling unit *k*. *P*(*l<sub>u</sub>*, *l<sub>v</sub>*, *k*) is the (complex) value at the pixel (*l<sub>u</sub>*, *l<sub>v</sub>*), where *l<sub>u</sub>* is the pixel index in the range direction and *l<sub>v</sub>* is the pixel index in the speed direction. The total number of pixels in the segment image is *MN*.
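The grid underlying the segment image can be computed directly from the design parameters. The short sketch below is only a worked example with hypothetical values; the helper `grid_parameters` is not part of the paper, and the velocity span is taken here as *M* cells of width *c*/(2*f*<sub>0</sub>*MNT*).

```python
C = 3.0e8  # speed of light (m/s)

def grid_parameters(dF, N, M, T, f0):
    """Range/velocity extent and cell size of one 2D segment image."""
    return {
        "range_depth": C * N / (2 * dF),           # R_p = cN / (2 dF)
        "range_cell": C / (2 * dF),                # nominal range resolution
        "velocity_span": C / (2 * f0 * N * T),     # M cells of the width below
        "velocity_cell": C / (2 * f0 * M * N * T)  # velocity resolution
    }

# Assumed example: 500 MHz synthetic bandwidth, 64 pulses x 32 frames, 100 kHz PRF.
g = grid_parameters(dF=5.0e8, N=64, M=32, T=1.0e-5, f0=1.0e10)
```

For these assumed values the range cell is 0.3 m over a depth of 19.2 m, and the velocity cell is about 0.73 m/s over a span of about 23.4 m/s.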

The function of the phase term exp{−*j*2π[(*f<sub>m</sub>* + Δ*f<sub>d</sub>i<sub>mn</sub>*)*k*τ]} is to calibrate the segment image, so that the starting position *l<sub>u</sub>* = 0 and the ending position *l<sub>u</sub>* = *N* − 1 of the segment image in the range direction correspond to the starting range *ck*τ/2 and the ending range *ck*τ/2 + (*N* − 1)*R<sub>p</sub>*/*N*. The purpose of calibration is to ensure that the segment images acquired at different sampling units are not ambiguous. exp{−*j*2π[(*f<sub>m</sub>* + Δ*f<sub>d</sub>i<sub>mn</sub>*)*l<sub>u</sub>*/Δ*F*]} is the frequency-domain, non-uniform-sampling Fourier transform factor in the range direction. The range-domain sampling after transformation is uniform, with a sampling interval of *c*/(2Δ*F*), whereas the frequency-domain sampling defined by *i<sub>mn</sub>*Δ*f<sub>d</sub>* before transformation is non-uniform and random. exp{*j*2π(*f<sub>m</sub>* + Δ*f<sub>d</sub>i<sub>mn</sub>*)(*l<sub>v</sub>* − *M*/2)*m*/(*Mf*<sub>0</sub>)} is the time-domain, non-uniform-sampling Fourier transform factor in the velocity direction. The equivalent time-domain sampling before transformation is non-uniform because of the variation of *f<sub>m</sub>* between frames, and the Doppler-frequency-domain sampling or velocity sampling after the transformation is uniform, with an interval of *c*/(2*f*<sub>0</sub>*MNT*). exp{*j*2π[(*f<sub>m</sub>* + Δ*f<sub>d</sub>i<sub>mn</sub>*)(*l<sub>v</sub>* − *M*/2)*n*/(*MNf*<sub>0</sub>)]} is the range-Doppler coupling compensation factor, which compensates the phase of the range movement within the sampling unit and across sampling units. Obviously, the motion compensation and the 2D Fourier transform are carried out synchronously, and different phase compensation is used in different velocity channels to improve the compensation accuracy.

Obviously, when *imn*Δ*fd* = *n*Δ*f*, *fm* = *f*0, the RFH synthesis wideband system degenerates to the conventional SF synthesis wideband system, and the 2D GMF of Equation (4) degenerates to the conventional 2D windowed DFT operation, which can be implemented by 2D fast Fourier transform.
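To make the structure of Equation (4) concrete, the sketch below evaluates it by brute force for one sampling unit and checks that a simulated point scatterer following Equation (3) is focused at the expected pixel. It assumes a rectangular window (*W*<sub>Ω</sub> = 1), uniformly drawn hopping indices, and hypothetical parameter values; `gmf_segment` is an illustrative name, and no attempt is made at the fast algorithms discussed later.

```python
import numpy as np

C = 3.0e8

def gmf_segment(y, f, f0, dF, k, tau):
    """Direct evaluation of Eq. (4) with W = 1.

    y, f: arrays of shape (M, N) holding y_S(n|m,k) and f_m + df_d*i_mn.
    Returns P[l_u, l_v] with shape (N, M).
    """
    M, N = y.shape
    m = np.arange(M)[:, None]
    n = np.arange(N)[None, :]
    cal = np.exp(-2j * np.pi * f * k * tau)               # range calibration factor
    P = np.empty((N, M), dtype=complex)
    for lu in range(N):
        rng_fac = np.exp(-2j * np.pi * f * lu / dF)       # non-uniform range transform
        for lv in range(M):
            s = (f / f0) * (lv - M / 2)
            dop = np.exp(2j * np.pi * s * (m / M + n / (M * N)))  # Doppler + coupling
            P[lu, lv] = np.sum(y * cal * rng_fac * dop)
    return P

# Hypothetical pure-RFH example: random base frequency per frame, random i_mn.
M, N, I = 16, 16, 256
f0, dF, T, tau, k = 1.0e10, 2.56e8, 1.0e-5, 1.0e-6, 3
rng = np.random.default_rng(2)
f_m = f0 + dF * rng.integers(0, 8, size=M)
i_mn = np.stack([rng.choice(I, size=N, replace=False) for _ in range(M)])
f = f_m[:, None] + (dF / I) * i_mn                        # df_d = dF / I
lu_true, lv_true = 5, 11
R = k * C * tau / 2 + lu_true * C / (2 * dF)
v = (lv_true - M / 2) * C / (2 * f0 * M * N * T)
m = np.arange(M)[:, None]
n = np.arange(N)[None, :]
y = np.exp(2j * np.pi * f * (2 * R / C - 2 * v * (m * N * T + n * T) / C))  # Eq. (3)
P = gmf_segment(y, f, f0, dF, k, tau)
peak = np.unravel_index(np.argmax(np.abs(P)), P.shape)
```

Because every factor of Equation (4) cancels the corresponding phase of the simulated echo at the true pixel, the magnitude there equals *MN*; the cost of this direct evaluation grows as (*MN*)², which is the motivation for the mode-specific fast algorithms.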

However, the computation complexity of 2D GMF is much higher than that of 2D FFT, so it is necessary to combine different RFH modes and use fast algorithms to realize 2D GMF to meet the real-time needs of high-speed platform-borne application.

Since the range width of echoes in each sampling unit is *c*τ/2, if the non-ambiguous range depth of the segment image is *R<sub>p</sub>*, it is required that *R<sub>p</sub>* ≥ *c*τ/2 + *R<sub>I</sub>*, so that the echoes of scatters that move across sampling units can be accumulated in-phase at the same point of the panoramic image; this is important in supersonic/hypersonic applications. *R<sub>I</sub>* is the maximum moving range of the scatters in an imaging period of *MNT*. The overlapping width of the segment images of adjacent sampling units in the range direction is *R<sub>p</sub>* − *c*τ/2, and the number of non-overlapping range resolution cells is *K<sub>d</sub>* = *Nc*τ/(2*R<sub>p</sub>*). Then, the panoramic image of the beam-illuminated area can be obtained from the segment images of all the sampling units as follows:

$$Z(i,j) = \sum\_{k=0}^{K-1} P(i - kK\_d, j, k) U(i - kK\_d), \quad i = 0, 1, \dots, K\_d + (K - 1)(N - K\_d) - 1; \quad j = 0, 1, \dots, M - 1,$$

where *U*(*i*) is a rectangular function with length *N*, defined as:

$$U(i) = \begin{cases} 1, & 0 \le i \le N - 1 \\ 0, & \text{otherwise} \end{cases}$$
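The overlap-add structure of the panoramic synthesis can be sketched as follows. The function `stitch_segments` and the array layout are assumptions of this example; the output length used here, (*K* − 1)*K<sub>d</sub>* + *N*, is simply the extent covered by *K* segments of *N* cells shifted by *K<sub>d</sub>* cells each.

```python
import numpy as np

def stitch_segments(P, K_d):
    """Overlap-add K segment images into one panoramic image.

    P has shape (K, N, M) with P[k, l_u, l_v]; segment k is placed at range
    offset k*K_d, and the rectangular function U limits it to N range cells.
    """
    K, N, M = P.shape
    Z = np.zeros(((K - 1) * K_d + N, M), dtype=P.dtype)
    for k in range(K):
        Z[k * K_d:k * K_d + N, :] += P[k]
    return Z

# Toy example: three segments of four range cells, shifted by three cells each.
Z = stitch_segments(np.ones((3, 4, 2)), K_d=3)
```

In the toy example the rows where two adjacent segments overlap receive contributions from both, which is how echoes of a scatter that moves across sampling units are accumulated at the same point of the panoramic image.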

In summary, the procedure of imaging process is shown in Figure 2.

**Figure 2.** The procedure of imaging processing.

#### *3.3. The Tradeoff between Randomness and Real-Time Performance*

For the conventional SF radar, a fast imaging algorithm based on the 2D FFT exists because the sampling is uniform in both the frequency domain and the slow-time domain. In theory, for the RFH radar system, the fast imaging algorithm depends on the structural characteristics of the RFH pattern. Because there are too many RFH patterns, it is impossible to design an optimal imaging algorithm with the least amount of computation for every RFH pattern. However, it is possible to design pipeline-parallel, real-time imaging algorithms for different RFH modes that are combined with the data acquisition process.

In this paper, several RFH modes are defined as follows.

#### 3.3.1. Intra-Complex-Frame Pure-RFH

In the *m*-th frame and the *n*-th pulse period of a complex frame, the frequency *f<sub>mn</sub>* of the transmitting signal is randomly selected according to a certain algorithm in the frequency band (*f<sub>m</sub>*, *f<sub>m</sub>* + Δ*F*). The basic frequency *f<sub>m</sub>* can hop randomly in a wide frequency range between frames. In this mode of RFH, *f<sub>mn</sub>* = *f<sub>m</sub>* + Δ*f<sub>d</sub>i<sub>mn</sub>*, where *i<sub>mn</sub>* is the sequence number corresponding to the carrier frequency of the *m*-th frame and the *n*-th period. *i<sub>mn</sub>* is selected randomly and without repetition, according to a certain probability density distribution, from the integer set [0, 1, 2, ... , *I* − 1], where *I* = Δ*F*/Δ*f<sub>d</sub>*. Different frames *m* adopt different baseband frequency point sets {Δ*f<sub>d</sub>i<sub>mn</sub>* : *n* = 0, 1, ... , *N* − 1}. This mode has the best performance in terms of randomness, low interception, and anti-interference.

#### 3.3.2. Intra-Frame Pure-RFH

In each frame of a complex frame, the same frequency point set and the same hopping order are adopted. For any frames *m*<sub>1</sub> and *m*<sub>2</sub>, *f*<sub>*m*<sub>1</sub>*n*</sub> = *f*<sub>*m*<sub>2</sub>*n*</sub> = *f<sub>n</sub>*, where *f<sub>n</sub>* is randomly selected according to a certain algorithm in the frequency band (*f*<sub>0</sub>, *f*<sub>0</sub> + Δ*F*). Because the frequency is the same between frames, the initial phase ϕ<sub>*mn*</sub> of the transmitting signal must be randomly selected according to a certain algorithm between 0 and π, so as to reduce the cyclic autocorrelation of the transmitting signal and maintain the low-interception performance. For different complex frames, the basic frequency *f*<sub>0</sub> changes randomly over as large a range as possible. Using the same frequency between frames simplifies the Doppler processing and improves the real-time performance. However, compared with the intra-complex-frame pure-RFH mode, this mode loses some performance in randomness, low interception, and anti-interference because the same frequency point set and the same hopping order are adopted in each frame.

In this mode of RFH, *fm* = *f*0, *imn* = *in* and *fmn* = *f*<sup>0</sup> + Δ*fdin*, where *in* is randomly and un-repeatedly selected according to a certain probability density distribution in the range of integer set [0, 1, 2, ... , *I* − 1], where *I* = Δ*F*/Δ*fd*.

#### 3.3.3. Intra-Complex-Frame Pseudo-RFH

The baseband frequency points are obtained by uniform sampling in (0, Δ*F*): *f<sub>mn</sub>* = *f<sub>m</sub>* + *i*′<sub>*mn*</sub>Δ*F*/*N*, where *n* = 0, 1, 2, ... , *N* − 1. The basic frequency *f<sub>m</sub>* can hop randomly in a wide frequency range between frames, and *i*′<sub>*mn*</sub> can be randomly selected according to a certain algorithm from {0, 1, 2, ... , *N* − 1}. In this mode, different frames can adopt different basic frequencies and different hopping orders but the same uniformly sampled baseband frequency point set, and the pulse compression processing in the range direction can be done by fast Fourier transform after higher-order motion compensation and data rearrangement, which improves the real-time performance. Compared with the above two modes, the shortcoming of this mode is that the non-ambiguous range depth of the segment image at each sampling unit decreases because of the uniform frequency-domain sampling. It is necessary to increase *N* and reduce the frequency hopping interval Δ*f* = Δ*F*/*N* to meet the design requirements of non-ambiguous range depth. In addition, this mode loses more performance in randomness, low interception, and anti-interference.

In this mode, Δ*f<sub>d</sub>i<sub>mn</sub>* = (Δ*F*/*N*)*i*′<sub>*mn*</sub> and *f<sub>mn</sub>* = *f<sub>m</sub>* + (Δ*F*/*N*)*i*′<sub>*mn*</sub>, where *i*′<sub>*mn*</sub> is selected randomly and without repetition, according to a certain probability density distribution, from the integer set [0, 1, 2, ... , *N* − 1].

#### 3.3.4. Intra-Frame Pseudo-RFH

The baseband frequency points are obtained by uniform sampling in (0, Δ*F*): *f<sub>mn</sub>* = *f*<sub>0</sub> + *i*′<sub>*n*</sub>Δ*F*/*N*, where *n* = 0, 1, 2, ... , *N* − 1. The basic frequency does not hop between frames. *i*′<sub>*n*</sub> is selected randomly and without repetition according to a certain algorithm from {0, 1, 2, ... , *N* − 1}. The initial phase ϕ<sub>*mn*</sub> is randomly selected according to a certain algorithm between 0 and π. For different complex frames, the basic frequency *f*<sub>0</sub> changes randomly over as large a range as possible. In this mode, both the pulse compression processing in the range direction and the Doppler processing in the speed direction can be done by fast Fourier transform, which further improves the real-time performance. The shortcomings of this mode are, again, the decrease in non-ambiguous range depth and a greater loss of performance in randomness, low interception, and anti-interference.

See Appendix A for the specific generation of 2D RFH patterns for these four RFH modes.
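The structural difference between the four modes can be illustrated with a small generator of hopping-index patterns. This is a hypothetical sketch, not the generation procedure of Appendix A: it draws indices uniformly (the text only requires "a certain probability density distribution"), it returns the index array only, and it leaves out the hopping of the base frequency and the random initial phases. The names `rfh_pattern` and `MODES` are assumptions of this example.

```python
import numpy as np

MODES = ("intra_complex_frame_pure", "intra_frame_pure",
         "intra_complex_frame_pseudo", "intra_frame_pseudo")

def rfh_pattern(mode, M, N, I, rng):
    """Integer hopping indices of shape (M, N).

    Pure modes draw N distinct indices out of I candidates (step df_d);
    pseudo modes permute the N uniformly spaced points (step dF/N).
    """
    if mode == "intra_complex_frame_pure":      # new set and order in every frame
        return np.stack([rng.choice(I, size=N, replace=False) for _ in range(M)])
    if mode == "intra_frame_pure":              # one set and order, reused per frame
        return np.tile(rng.choice(I, size=N, replace=False), (M, 1))
    if mode == "intra_complex_frame_pseudo":    # same uniform set, new order per frame
        return np.stack([rng.permutation(N) for _ in range(M)])
    if mode == "intra_frame_pseudo":            # same uniform set and same order
        return np.tile(rng.permutation(N), (M, 1))
    raise ValueError(f"unknown mode: {mode}")

patterns = {mode: rfh_pattern(mode, 4, 8, 64, np.random.default_rng(3))
            for mode in MODES}
```

The two intra-frame modes return identical rows, which is what allows the inter-frame Doppler processing to use an FFT, while the two pseudo modes use every point of the uniform grid exactly once per frame, which is what allows FFT-based pulse compression after rearrangement.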

#### *3.4. Online, Fast 2D Imaging Algorithms for Different RFH Modes*

As mentioned before, in order to avoid image defocusing caused by the special range-Doppler coupling in the case of RFH, Doppler pre-processing must be carried out synchronized with the pulse compression process. However, Doppler processing is a kind of inter-frame processing. It will cause a serious delay in signal processing if Doppler processing is not done until the data of the last frame is collected. Considering that the data of the RFH synthetic wideband radar is obtained in the order of frames and periods, pipeline-parallel processing can be used to divide the Doppler pre-processing and pulse compression into each frame and each period, which can reduce the delay in signal processing.

#### 3.4.1. Intra-Complex-Frame Pseudo-RFH Mode

In the 2D matched filtering (range-Doppler imaging) of Equation (4), in each frame of each sampling unit, the data are rearranged in ascending order of frequency point. Let *n*′ be the frequency point number after rearrangement and *n<sub>m</sub>* the corresponding number before rearrangement. Denote the rearranged data by *y*′<sub>*S*</sub>(*n*′|*m*, *k*). According to the definition of this RFH mode, Δ*f<sub>d</sub>i<sub>mn</sub>* = *n*′Δ*F*/*N*, so the 2D matched filtering of Equation (4) can be rewritten as follows:

$$\begin{split} P(l\_u, l\_v, k) &= \sum\_{m=0}^{M-1} \sum\_{n'=0}^{N-1} W\_{\Omega}(m, n\_m) y\_S'(n'|m, k) \times \exp\{-j2\pi \left[ (f\_m + n' \Delta F/N) k \tau \right] \} \\ &\times \exp\{-j2\pi n' l\_u/N \} \times \exp\left\{ j2\pi \left[ \frac{f\_m + n' \Delta F/N}{f\_0} \left( l\_v - \frac{M}{2} \right) \frac{m}{M} \right] \right\} \\ &\times \exp\left\{ -j2\pi \frac{f\_m}{\Delta F} l\_u \right\} \times \exp\left\{ j2\pi \left[ \frac{f\_m + n' \Delta F/N}{f\_0} \left( l\_v - \frac{M}{2} \right) \frac{n\_m}{M N} \right] \right\} \end{split} \tag{5}$$

Imaging Algorithm 1: FFT-based pulse-compression on multiple velocity channels

	- (A) Obtain the data of the $k$-th sampling unit of the $m$-th frame: $\{y_S(n|m,k) : n = 0, 1, \ldots, N-1\}$.
	- (B) Data rearrangement: $\{y_S(n|m,k) : n = 0, 1, \ldots, N-1\} \rightarrow \{y'_S(n'|m,k) : n' = 0, 1, \ldots, N-1\}$.
	- (C) Windowing, range calibration, multi-velocity-channel motion compensation, and Doppler pre-processing:

$$y''_S(n'|m,k,l_v) = W_\Omega(m,n_m)\, y'_S(n'|m,k)\, \psi(m,n',k,l_v), \tag{6}$$

where

$$\begin{split} \psi(m,n',k,l_v) = \exp\Big\{ & -j2\pi\left[(f_m + n'\Delta F/N)k\tau\right] + j2\pi(f_m + n'\Delta F/N)\left(l_v - \frac{M}{2}\right) m/(f_0 M) \\ & + j2\pi(f_m + n'\Delta F/N)\left(l_v - \frac{M}{2}\right) n_m/(f_0 M N) \Big\}. \end{split} \tag{7}$$

(D) Multi-velocity-channel fast pulse-compression processing.

It can be obtained according to Equations (5) and (6) that

$$P(l_u, l_v, k) = \sum_{m=0}^{M-1} \exp\left\{-j2\pi \frac{f_m}{\Delta F} l_u\right\} \sum_{n'=0}^{N-1} y''_S(n'|m,k,l_v) \times \exp\{-j2\pi n' l_u/N\}.$$

The operation $\sum_{n'=0}^{N-1} y''_S(n'|m,k,l_v) \times \exp\{-j2\pi n' l_u/N\}$ is a uniformly sampled DFT, which can be realized by FFT:

$$\left\{ Y''_S(l_u, l_v|m,k) : l_u = 0, 1, \ldots, N-1 \right\} = \text{FFT}\left\{ y''_S(n'|m,k,l_v) : n' = 0, 1, \ldots, N-1 \right\}. \tag{8}$$

(E) Current-frame Doppler-accumulation processing:

$$\begin{split} P^{(m)}(l_u, l_v, k) &= P^{(m-1)}(l_u, l_v, k) + \exp\left\{-j2\pi \frac{f_m}{\Delta F} l_u\right\} \times Y''_S(l_u, l_v|m,k) \\ (l_u &= 0, 1, \ldots, N-1;\ l_v = 0, 1, \ldots, M-1). \end{split} \tag{9}$$

Obviously, $P^{(M-1)}(l_u, l_v, k) = P(l_u, l_v, k)$.

In this algorithm, the image sharpens progressively through iteration: every additional frame of data increases its sharpness. Because the rearranged frequency points are uniformly sampled, the pulse compression on each velocity channel can be realized by FFT, which improves the real-time performance of the imaging algorithm.
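The per-frame pipeline of Imaging Algorithm 1 can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the window $W_\Omega$ is taken as all ones, the function names (`psi`, `accumulate_frame`, `image_direct`) and the array layout `P[l_v, l_u]` are hypothetical, and `order[n']` plays the role of $n_m$. The brute-force `image_direct` evaluates Equation (5) term by term and is included only as a cross-check.

```python
import numpy as np

def psi(m, n_m, k, f_m, f0, dF, tau, M, N):
    """Phase factor of Eq. (7) for one frame; shape (M, N), indexed [l_v, n']."""
    n_p = np.arange(N)                    # rearranged frequency index n'
    l_v = np.arange(M)[:, None]           # velocity-channel index
    f = f_m + n_p * dF / N                # carrier of rearranged pulse n'
    cycles = (-f * k * tau
              + f * (l_v - M / 2) * m / (f0 * M)
              + f * (l_v - M / 2) * n_m / (f0 * M * N))
    return np.exp(2j * np.pi * cycles)

def accumulate_frame(P, y_frame, order, m, k, f_m, f0, dF, tau):
    """Steps (B)-(E): fold frame m into the running image P[l_v, l_u]."""
    M, N = P.shape
    order = np.asarray(order)
    y_sorted = y_frame[order]             # (B) order[n'] = n_m
    y2 = y_sorted * psi(m, order, k, f_m, f0, dF, tau, M, N)  # (C), window = 1
    Y = np.fft.fft(y2, axis=1)            # (D) Eq. (8): n' -> l_u
    l_u = np.arange(N)
    return P + np.exp(-2j * np.pi * f_m / dF * l_u) * Y       # (E) Eq. (9)

def image_direct(y, orders, f_frames, k, f0, dF, tau):
    """Brute-force evaluation of Eq. (5) with unit window; reference only."""
    M, N = y.shape
    P = np.zeros((M, N), dtype=complex)
    for l_v in range(M):
        for l_u in range(N):
            for m in range(M):
                for n_p in range(N):
                    n_m = orders[m][n_p]
                    f = f_frames[m] + n_p * dF / N
                    cycles = (-f * k * tau - n_p * l_u / N
                              + f / f0 * (l_v - M / 2) * m / M
                              - f_frames[m] / dF * l_u
                              + f / f0 * (l_v - M / 2) * n_m / (M * N))
                    P[l_v, l_u] += y[m, n_m] * np.exp(2j * np.pi * cycles)
    return P
```

Starting from a zero image and calling `accumulate_frame` once per frame reproduces the result of `image_direct` up to rounding, while each frame costs only one length-$N$ FFT per velocity channel.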

#### 3.4.2. Intra-Frame Pseudo-RFH Mode

This RFH mode is equivalent to setting all $f_m = f_0$ in the intra-complex-frame pseudo-RFH mode. After data rearrangement, the $n_m$ corresponding to $n'$ is the same in every frame; it is labeled $n$ and is independent of $m$. In this mode, there is neither a coupling phase term between $l_u$ and $m$ nor a coupling phase term between $l_v$ and $n$. Compared with Imaging Algorithm 1, the computational complexity can be further reduced by using fractal-dimension processing.

Imaging Algorithm 2. Fractal-dimension processing with Doppler pre-processing


$$y\_S''(n'|m,k) = \mathcal{W}\_\Omega(m,n\_m)y\_S'(n'|m,k) \times \exp\left\{-j2\pi\left[\frac{f\_0 + n'\Delta F/N}{f\_0}\frac{m}{2}\right]\right\}(m=0,1,\ldots,M-1). \tag{10}$$

• Step 3. For the data with the same sampling unit number $k$ and pulse period number $n'$ but different frame numbers $m$, carry out the non-integer sampling IDFT ($l_v$ is an integer, but $(f_0 + n'\Delta F/N)\,l_v/f_0$ is not an integer):

$$Y''_S(l_v|n',k) = \sum_{m=0}^{M-1} y''_S(n'|m,k) \times \exp\left\{j2\pi\left[\frac{f_0 + n'\Delta F/N}{f_0}\, l_v \frac{m}{M}\right]\right\} \quad (l_v = 0, 1, \ldots, M-1). \tag{11}$$

• Step 4. Range calibration and multi-velocity-channel motion compensation:

$$\begin{aligned} Y'_S(l_v|n',k) &= Y''_S(l_v|n',k) \\ &\times \exp\left\{-j2\pi\left[(f_0 + n'\Delta F/N)k\tau\right] + j2\pi\left[\frac{f_0 + n'\Delta F/N}{f_0}\left(l_v - \frac{M}{2}\right)\frac{n}{MN}\right]\right\}. \end{aligned} \tag{12}$$

• Step 5. Range-dimension pulse-compression processing on each velocity channel:

$$P(l\_{\rm u}, l\_{\rm v}, k) = \sum\_{n'=0}^{N-1} Y'\_{\mathcal{S}}(l\_{\rm v}|n', k) \times \exp\{-j2\pi n'l\_{\rm u}/N\}(l\_{\rm u} = 0, 1, \dots, N-1). \tag{13}$$

Obviously, the DFT processing of Equation (13) can be realized by FFT.
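The separable structure of Imaging Algorithm 2 can be illustrated with a short NumPy sketch. It is a hedged example rather than the authors' code: the window is assumed to be all ones, the name `image_algorithm2` and the layout `P[l_v, l_u]` are hypothetical, and `order[n']` gives the original pulse index $n$ of the rearranged index $n'$. As in Equation (13), the common factor $\exp\{-j2\pi f_0 l_u/\Delta F\}$ is omitted.

```python
import numpy as np

def image_algorithm2(y, order, k, f0, dF, tau):
    """Doppler IDFT per frequency point, then range FFT; returns P[l_v, l_u].

    y[m, n] : sample of pulse n in frame m for one sampling unit k
    order   : order[n'] = original pulse index n of rearranged index n'
    """
    M, N = y.shape
    order = np.asarray(order)
    ratio = (f0 + np.arange(N) * dF / N) / f0          # (f0 + n' dF/N) / f0
    frames = np.arange(M)
    l_v = np.arange(M)

    y1 = y[:, order]                                    # data rearrangement
    y2 = y1 * np.exp(-2j * np.pi * ratio * frames[:, None] / 2)   # Eq. (10)

    # Eq. (11): non-integer-sampled IDFT over frames, kernel[l_v, m, n']
    kernel = np.exp(2j * np.pi * l_v[:, None, None] * frames[None, :, None]
                    * ratio[None, None, :] / M)
    Y2 = np.einsum('vmn,mn->vn', kernel, y2)

    # Eq. (12): range calibration and motion compensation
    cycles = (-(ratio * f0) * k * tau
              + ratio * (l_v[:, None] - M / 2) * order / (M * N))
    Y1 = Y2 * np.exp(2j * np.pi * cycles)

    return np.fft.fft(Y1, axis=1)                       # Eq. (13): n' -> l_u
```

The Doppler step here is written as an explicit kernel product for clarity; it needs all $M$ frames at once, which is consistent with the remark that this algorithm is less convenient for pipelining than Imaging Algorithm 1.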

#### 3.4.3. Intra-Frame Pure-RFH Mode

In this mode, $f_m = f_0$, $i_{mn} = i_n$, and the common phase factor $\exp\{-j2\pi f_m l_u/\Delta F\}$ can be ignored, so Equation (4) can be written as follows:

$$\begin{aligned} P(l_u, l_v, k) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} & W_\Omega(m,n)\, y_S(n|m,k) \times \exp\{-j2\pi[(f_0 + \Delta f_d i_n)k\tau]\} \times \exp\left\{-j2\pi \frac{\Delta f_d i_n}{\Delta F} l_u\right\} \\ & \times \exp\left\{j2\pi\left[\frac{f_0 + \Delta f_d i_n}{f_0}\left(l_v - \frac{M}{2}\right)\frac{m}{M}\right]\right\} \times \exp\left\{j2\pi\left[\frac{f_0 + \Delta f_d i_n}{f_0}\left(l_v - \frac{M}{2}\right)\frac{n}{MN}\right]\right\}. \end{aligned} \tag{14}$$

Imaging Algorithm 3. Pipeline-parallel processing 2D matched filtering algorithm

	- (A) Obtain the data of the $k$-th sampling unit of the $m$-th frame: $\{y_S(n|m,k) : n = 0, 1, \ldots, N-1\}$;
	- (B) Windowing, range calibration, multi-velocity-channel motion compensation, and Doppler pre-processing:

$$y''_S(n|m,k,l_v) = W_\Omega(m,n)\, y_S(n|m,k)\, \psi(m,n,k,l_v), \tag{15}$$

where

$$\begin{split} \psi(m,n,k,l_v) = \exp\Big\{ & -j2\pi\left[(f_0 + \Delta f_d i_n)k\tau\right] + j2\pi(f_0 + \Delta f_d i_n)\left(l_v - \frac{M}{2}\right) m/(f_0 M) \\ & + j2\pi(f_0 + \Delta f_d i_n)\left(l_v - \frac{M}{2}\right) n/(f_0 M N) \Big\}. \end{split} \tag{16}$$

(C) Multi-velocity-channel pulse-compression processing

It can be obtained according to Equations (14) and (15) that

$$P(l_u, l_v, k) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} y''_S(n|m,k,l_v) \times \exp\left\{-j2\pi \frac{\Delta f_d i_n}{\Delta F} l_u\right\}.$$

The operation $Y''_S(l_u, l_v|m,k) = \sum_{n=0}^{N-1} y''_S(n|m,k,l_v) \times \exp\{-j2\pi \Delta f_d i_n l_u/\Delta F\}$ is a non-uniformly sampled DFT (NUDFT). Even if the data are rearranged in ascending order of frequency point, the rearranged data are still non-uniform samples, which cannot readily be processed by FFT unless they are first interpolated onto uniform samples.

$$\left\{ Y''_S(l_u, l_v|m,k) : l_u = 0, 1, \ldots, N-1 \right\} = \text{NUDFT}\left\{ y''_S(n|m,k,l_v) : n = 0, 1, \ldots, N-1 \right\}. \tag{17}$$

(D) Current-frame Doppler accumulation processing:

$$\begin{aligned} P^{(m)}(l_u, l_v, k) &= P^{(m-1)}(l_u, l_v, k) + Y''_S(l_u, l_v|m,k) \\ (l_u &= 0, 1, \ldots, N-1;\ l_v = 0, 1, \ldots, M-1). \end{aligned} \tag{18}$$
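A minimal NumPy sketch of Imaging Algorithm 3 is given below, assuming a unit window and evaluating the non-uniform DFT of Equation (17) directly as a matrix product. The names `nudft_kernel` and `accumulate_frame_pure` and the layout `P[l_v, l_u]` are hypothetical choices made for illustration, not part of the original method description.

```python
import numpy as np

def nudft_kernel(i_n, dfd, dF, N):
    """Kernel E[l_u, n] = exp(-j 2 pi dfd i_n l_u / dF) used in Eq. (17)."""
    l_u = np.arange(N)[:, None]
    return np.exp(-2j * np.pi * dfd * np.asarray(i_n)[None, :] * l_u / dF)

def accumulate_frame_pure(P, y_frame, i_n, m, k, f0, dfd, dF, tau):
    """Steps (B)-(D): fold frame m into the running image P[l_v, l_u]."""
    M, N = P.shape
    i_n = np.asarray(i_n)
    n = np.arange(N)                       # pulse index within the frame
    l_v = np.arange(M)[:, None]            # velocity-channel index
    f = f0 + dfd * i_n                     # hopped carrier of each pulse
    cycles = (-f * k * tau
              + f * (l_v - M / 2) * m / (f0 * M)
              + f * (l_v - M / 2) * n / (f0 * M * N))
    y2 = y_frame * np.exp(2j * np.pi * cycles)        # Eq. (15), window = 1
    Y = y2 @ nudft_kernel(i_n, dfd, dF, N).T          # Eq. (17): sum over n
    return P + Y                                      # Eq. (18)
```

When the hop indices happen to be uniformly spaced, `nudft_kernel` reduces to the ordinary DFT matrix, which gives a convenient sanity check against `np.fft.fft`; for random indices the matrix product costs $N^2$ multiplications per frame and velocity channel.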

Imaging Algorithm 4. Multi-velocity-channel pulse-compression based on data rearrangement/interpolation/range dimension FFT

To ensure the accuracy of interpolation, the interpolation must be done in each velocity channel. Because the phase $(2f_0 vT/c)\,n$ is strongly random after data rearrangement, the rearranged data must first undergo Doppler pre-processing, so that the variation of the signal phase $(2f_0 \Delta v T/c)\,n$ on each velocity channel is far less than one, where $\Delta v$ is the width of the velocity resolution cell. On each velocity channel, the effect of the rearranged random phase term $(2f_0 \Delta v T/c)\,n$ is then negligible. The data accumulated by Doppler pre-processing are rearranged and interpolated on each velocity channel, which makes it easy to ensure the interpolation accuracy.

• Step 1. Windowing and velocity calibration.

$$\begin{split} y''_S(n|m,k) &= W_\Omega(m,n)\, y_S(n|m,k) \times \exp\left\{-j2\pi\left[\frac{f_0 + \Delta f_d i_n}{f_0}\frac{m}{2}\right]\right\} \\ &(m = 0, 1, \ldots, M-1). \end{split} \tag{19}$$

• Step 2. For the data with the same sampling unit number $k$ and pulse period number $n$ but different frame numbers $m$, carry out the non-integer sampling IDFT ($l_v$ is an integer, but $(f_0 + \Delta f_d i_n)\,l_v/f_0$ is not an integer):

$$Y''_S(l_v|n,k) = \sum_{m=0}^{M-1} y''_S(n|m,k) \times \exp\left\{j2\pi\left[\frac{f_0 + \Delta f_d i_n}{f_0}\, l_v \frac{m}{M}\right]\right\} \quad (l_v = 0, 1, \ldots, M-1). \tag{20}$$

• Step 3. Range calibration and multi-velocity-channel motion compensation:

$$Y'_S(l_v|n,k) = Y''_S(l_v|n,k) \times \exp\left\{-j2\pi\left[(f_0 + \Delta f_d i_n)k\tau\right] + j2\pi\left[\frac{f_0 + \Delta f_d i_n}{f_0}\left(l_v - \frac{M}{2}\right)\frac{n}{MN}\right]\right\}. \tag{21}$$

$$P(l_u, l_v, k) = \sum_{n'=0}^{N-1} Y_S(l_v|n',k) \times \exp\{-j2\pi n' l_u/N\} \quad (l_u = 0, 1, \ldots, N-1). \tag{22}$$

Obviously, the DFT processing of Equation (22) can be realized by *FFT*.

#### 3.4.4. Intra-Complex-Frame Pure-RFH Mode

Imaging Algorithm 5. Pipeline-parallel processing 2D matched filtering algorithm

	- (A) Obtain the data of the $k$-th sampling unit of the $m$-th frame: $\{y_S(n|m,k) : n = 0, 1, \ldots, N-1\}$;
	- (B) Windowing, range calibration, multi-velocity-channel motion compensation, and Doppler pre-processing:

$$y''_S(n|m,k,l_v) = W_\Omega(m,n)\, y_S(n|m,k)\, \psi(m,n,k,l_v), \tag{23}$$

where

$$\begin{split} \psi(m,n,k,l_v) = \exp\Big\{ & -j2\pi\left[(f_0 + \Delta f_d i_{mn})k\tau\right] + j2\pi(f_0 + \Delta f_d i_{mn})\left(l_v - \frac{M}{2}\right) m/(f_0 M) \\ & + j2\pi(f_0 + \Delta f_d i_{mn})\left(l_v - \frac{M}{2}\right) n/(f_0 M N) \Big\}. \end{split} \tag{24}$$

(C) Multi-velocity-channel pulse-compression processing.

It can be obtained according to Equations (4) and (23) that

$$P(l_u, l_v, k) = \sum_{m=0}^{M-1} \exp\left\{-j2\pi \frac{f_m}{\Delta F} l_u\right\} \sum_{n=0}^{N-1} y''_S(n|m,k,l_v) \times \exp\left\{-j2\pi \frac{\Delta f_d i_{mn}}{\Delta F} l_u\right\}.$$

The operation $\sum_{n=0}^{N-1} y''_S(n|m,k,l_v) \times \exp\{-j2\pi \Delta f_d i_{mn} l_u/\Delta F\}$ is a non-uniformly sampled DFT (NUDFT):

$$\left\{ Y''_S(l_u, l_v|m,k) : l_u = 0, 1, \ldots, N-1 \right\} = \text{NUDFT}\left\{ y''_S(n|m,k,l_v) : n = 0, 1, \ldots, N-1 \right\}. \tag{25}$$

(D) Current-frame Doppler accumulation processing:

$$\begin{split} P^{(m)}(l_u, l_v, k) &= P^{(m-1)}(l_u, l_v, k) + \exp\left\{-j2\pi \frac{f_m}{\Delta F} l_u\right\} \times Y''_S(l_u, l_v|m,k) \\ &(l_u = 0, 1, \ldots, N-1;\ l_v = 0, 1, \ldots, M-1). \end{split} \tag{26}$$

Obviously, $P^{(M-1)}(l_u, l_v, k) = P(l_u, l_v, k)$.

#### *3.5. Real-Time 1D High-Range-Resolution (HRR) Imaging Algorithm of HPRF RFH Radar*

The 1D HRR imaging algorithm is mainly used in the target-tracking stage. As mentioned before, because of the special range-Doppler coupling effect of RFH synthetic wideband imaging radar in supersonic/hypersonic applications, the HRR imaging algorithm is essentially different from that of conventional stepped-frequency synthetic wideband imaging radar, which can obtain an HRR range profile from the data of a single frame. In RFH radar, obtaining the range profile of the current frame requires Doppler pre-processing to be executed before the current frame in order to suppress the special range-Doppler coupling effect.

The 1D HRR imaging algorithm is the same as the online, fast 2D imaging algorithms above. The difference is that, in the target-tracking stage, the target has already been detected in the range-Doppler 2D image acquired in the searching stage, and the sampling unit and velocity channel where the target is located have been measured; the range-Doppler imaging algorithm then only needs to be carried out on the velocity channel of the target and its adjacent channels. The multi-velocity-channel range-Doppler 2D imaging and splicing processing only need to be carried out in the corresponding sampling unit and the adjacent sampling units. This can significantly reduce the computational complexity. To meet the high data-rate requirement of the tracking phase, the multi-velocity-channel range profile must be updated every frame, not every complex-frame as in 2D imaging. It is easy to design an iterative algorithm that obtains the range profile of the next frame from that of the current frame with a few additional computations.

#### **4. Experimental Results and Evaluation of Fast Imaging Algorithm**

The diving angle of the aircraft was $\theta_M = -30^\circ$ (the angle between the moving direction of the radar platform and the horizontal plane), the flight speed was $V_M = 1750$ m/s, and the height was $H = 30$ km. The pulse repetition period was $T = 14$ µs, the basic carrier frequency was $f_0 = 35$ GHz, the frequency hopping interval was $\Delta f = 6.25$ MHz, and the pulse width was $\tau = 0.08$ µs. The number of periods in one frame was $N = 16$, the corresponding synthetic bandwidth was $N\Delta f = 100$ MHz, the range resolution was $\Delta R = c/(2N\Delta f) = 1.5$ m, the number of frames in one complex-frame was $M = 32$, the corresponding period of a complex-frame was $MNT = 7.168$ ms, and the velocity resolution was $\Delta v = c/(2f_0 MNT) = 0.6$ m/s.
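The derived quantities quoted above follow directly from the basic waveform parameters. The short Python sketch below recomputes them; the function name `derived_parameters` and the rounded speed of light $c = 3 \times 10^8$ m/s are assumptions made for this illustration.

```python
C = 3.0e8  # speed of light in m/s (rounded value, assumed here)

def derived_parameters(T, f0, delta_f, N, M, c=C):
    """Synthetic bandwidth, resolutions and complex-frame period."""
    bandwidth = N * delta_f            # N * delta_f
    complex_frame = M * N * T          # M * N * T
    return {
        "synthetic_bandwidth_hz": bandwidth,
        "range_resolution_m": c / (2 * bandwidth),
        "complex_frame_period_s": complex_frame,
        "velocity_resolution_mps": c / (2 * f0 * complex_frame),
    }

params = derived_parameters(T=14e-6, f0=35e9, delta_f=6.25e6, N=16, M=32)
for name, value in params.items():
    print(f"{name}: {value:.6g}")
```

With these inputs the velocity resolution evaluates to slightly under 0.6 m/s, which the text rounds to 0.6 m/s.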

We assumed that the target was a stationary ship on the sea, modeled as seven strong scattering centers. The parameters of each scattering center are shown in Table 1.


**Table 1.** Parameters of scattering centers of target.

According to the definition of resolution-cell in the above 2D GMF method, the theoretical coordinates of those seven scattering centers in spliced 2D range image are shown in Table 2.


**Table 2.** Theoretical positions of each scattering center of the target in image.

Sea clutter was simulated with an average signal-to-clutter ratio (SCR) of 8 dB. The intra-complex-frame pseudo-RFH mode with $i_{mn}\Delta f_d = i'_{mn}(\Delta F/N)$ and $f_m = f_0$ was investigated first, and the normalized 2D range-Doppler image obtained by 2D GMF is shown in Figure 3.

The other RFH modes were also simulated, and the imaging results were almost the same except for the side-lobe level: the side-lobe level of the pure-RFH mode was higher than that of the pseudo-RFH mode. The simulation results show that the 2D GMF method can obtain high-quality images in all RFH modes. However, its computational complexity is much higher than that of the 2D FFT. To improve the real-time performance, the 2D GMF was therefore implemented by a different pipeline-parallel algorithm for each RFH mode.

For Imaging Algorithm 1, because the rearranged data are uniformly sampled in the frequency domain, the pulse-compression processing on each velocity channel can be realized by FFT, which improves the real-time performance of the imaging algorithm. The 2D GMF needs $2(MN)^2$ complex multiplications, but the FFT-based pulse-compression algorithm on multiple velocity channels needs only $M^2[3N + N\log(N)]$ complex multiplications. The total operation count is thus reduced to a fraction on the order of $[\log(N) + 3]/N$ of the original.

Imaging Algorithm 2 needs $MN[M + \log(N) + 3]$ complex multiplications. This is far fewer than for Algorithm 1, but the algorithm is not convenient for pipeline processing, and the signal-processing delay is not necessarily short.

Imaging Algorithm 3 needs $M^2N^2 + 2M^2N$ complex multiplications, which is more than for Imaging Algorithm 1 and Imaging Algorithm 2. However, the operations can be distributed over the individual frames, so the signal-processing delay is short. The algorithm can meet the requirements of real-time imaging through multiprocessor parallel processing, with each processor responsible for the windowing, motion compensation, pulse compression, and Doppler accumulation of several velocity channels.

Neglecting the operations of low-order spline interpolation, Imaging Algorithm 4 needs $MN[M + \log(N) + 3]$ complex multiplications, the same as Algorithm 2. It is also inconvenient for pipeline processing, and its signal-processing delay is not necessarily shorter than that of Algorithm 3.

Imaging Algorithm 5 needs $M^2N^2 + 3M^2N$ complex multiplications. As with Imaging Algorithm 3, the operations can be distributed over the individual frames, and the signal-processing delay is short. It can meet the requirements of real-time imaging through multiprocessor parallel processing.
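To make the comparison concrete, the stated multiplication counts can be evaluated for the simulation parameters $M = 32$ and $N = 16$. The sketch below simply codes the formulas quoted above; the function name `multiplication_counts` is hypothetical, and taking the logarithm to base 2 is an assumption, since the base is not stated in the text.

```python
import math

def multiplication_counts(M, N, log=math.log2):
    """Complex-multiplication counts as stated in the text (log base assumed 2)."""
    return {
        "2D GMF": 2 * (M * N) ** 2,
        "Algorithm 1": M ** 2 * (3 * N + N * log(N)),
        "Algorithm 2": M * N * (M + log(N) + 3),
        "Algorithm 3": M ** 2 * N ** 2 + 2 * M ** 2 * N,
        "Algorithm 4": M * N * (M + log(N) + 3),   # interpolation cost neglected
        "Algorithm 5": M ** 2 * N ** 2 + 3 * M ** 2 * N,
    }

counts = multiplication_counts(M=32, N=16)
for name, value in sorted(counts.items(), key=lambda item: item[1]):
    print(f"{name:12s} {int(value):>8d}")
```

Under these assumptions the ordering matches the discussion: Algorithms 2 and 4 are the cheapest, Algorithm 1 follows, and the pipeline-friendly Algorithms 3 and 5 are the most expensive of the five while remaining below the direct 2D GMF.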

The imaging results of the five imaging algorithms are almost the same as that of the 2D GMF. Figure 4 shows the 2D normalized image obtained by Imaging Algorithm 1.

**Figure 4.** Image obtained by Imaging Algorithm 1.

For the intra-frame pseudo-RFH mode, as a contrast, Figure 5 shows the imaging results obtained by the traditional fractal-dimension imaging algorithm without Doppler pre-processing. In this method, the data $y_S(n|m,k)$ are rearranged in ascending order of frequency point in each frame, giving a new data sequence $y'_S(n'|m,k)$. Then, the range calibration is performed by multiplying by the phase factor $\exp\{-j2\pi[(f_0 + n'\Delta F/N)k\tau]\}$, and the range profile $y''_S(l_u|m,k)$ of each frame is obtained by processing the calibrated data with an $N$-point DFT (pulse compression). Finally, an $M$-point DFT (Doppler beam sharpening) is carried out over the data of the $M$ frames in each range resolution cell $l_u$, and the 2D normalized image $P(l_u, l_v, k)$ is obtained.

**Figure 5.** Image obtained by traditional algorithm without Doppler pre-processing.

Obviously, due to the special range-Doppler coupling effect caused by RFH and the lack of Doppler pre-processing in the pulse-compression process, the image is seriously divergent.

For the conventional SF mode, as a contrast, Figure 6 shows the normalized imaging results obtained by using the traditional fractal-dimension imaging algorithm.

**Figure 6.** Image obtained by conventional stepped frequency (SF) mode.

A comparison with Figures 3 and 4 shows that the RFH synthetic wideband system can achieve almost the same imaging quality as the conventional SF system, but the side-lobe level is higher in the image of the RFH system.

#### **5. Discussion**

In the case of RFH mode and a supersonic/hypersonic application, it is difficult for the conventional fractal-dimension 2D range-Doppler imaging algorithm to obtain high-quality images because of the special range-Doppler coupling. Theoretical analysis and simulation results show that the proposed pipeline-parallel fast imaging algorithms based on Doppler pre-processing and 2D GMF can suppress this coupling effect well, avoid the divergence of the image in the range direction, and meet the requirements of real-time imaging. However, the side-lobe level of the pure-RFH mode is higher than that of the pseudo-RFH mode. Further work is needed to suppress the side-lobe level by optimizing the RFH pattern and the 2D window function $W_\Omega(m, n)$.

**Author Contributions:** Conceptualization, S.H. and X.W.; data curation, X.W.; formal analysis, S.H.; investigation, X.W.; methodology, S.H.; project administration, S.H.; software, X.W.; supervision, S.H.; validation, X.W.; visualization, X.W.; writing—original draft, X.W.; writing—review and editing, S.H.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

• Generation of 2D RFH patterns for different RFH modes

A chaotic sequence is used to control the generation of the RFH pattern. Such a sequence has an infinite period (it never repeats), so the pattern is difficult for an interceptor to decipher.

#### 1. Intra-frame pseudo-RFH

Suppose we need to generate $N$ stepped frequency points in $[0, \Delta F]$, where $N = \Delta F/\Delta f$ and $\Delta I = \Delta f/\Delta f_d$. Then the following Bernoulli chaotic sequence can be used to generate the RFH pattern.

$$\begin{aligned} x_n &= 2.01\,x_{n-1} \bmod 1 \\ i'_n &= \text{INT}[N \times x_n] \\ \Omega &= \{ i_{mn} = i'_n \times \Delta I \mid n = 0, 1, \ldots, N-1;\ m = 0, 1, \ldots, M-1 \}. \end{aligned} \tag{A1}$$

In the equation, the initial value $x_{-1}$ can be randomly selected in the range $(0, 1)$.

For different complex frames, a new RFH pattern can be adopted by changing the initial value $x_{-1}$.

The Bernoulli chaotic pattern can also be used for the hopping of the basic frequency $f_0$ between complex frames. If the bandwidth of the antenna is $\Delta F_t$ and $I_{MAX} = \Delta F_t/\Delta f_d$, the basic frequency $f_i$ of the $i$-th complex frame can be obtained as follows:

$$\begin{aligned} z_i &= 2.01\,z_{i-1} \bmod 1 \\ j'_i &= \text{INT}[I_{MAX} \times z_i] \\ f_i &= j'_i \Delta f_d \quad (i = 0, 1, \ldots). \end{aligned} \tag{A2}$$

The initial value $z_{-1}$ can be randomly selected in the range $(0, 1)$.

Of course, constraints can be inserted in Equations (A1) and (A2). If $i'_n - i'_{n-1}$ or $j'_i - j'_{i-1}$ is less than a certain value, the iteration value is discarded and the next iteration value is selected.
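The pattern generation of Equation (A1), including the optional discard constraint, can be sketched in plain Python. The function names, the `min_jump` parameter, the iteration cap `max_iter`, and the choice to measure the jump by its magnitude are assumptions introduced for this illustration only.

```python
def bernoulli_indices(x_init, count, scale, min_jump=0, max_iter=10_000):
    """Iterate x <- 2.01 x mod 1 and return `count` integers INT[scale * x].

    A candidate is discarded when the magnitude of its jump from the
    previously accepted value is smaller than `min_jump` (assumed rule).
    """
    x, accepted = x_init, []
    for _ in range(max_iter):
        x = (2.01 * x) % 1.0
        candidate = int(scale * x)
        if accepted and abs(candidate - accepted[-1]) < min_jump:
            continue                      # discard and take the next iterate
        accepted.append(candidate)
        if len(accepted) == count:
            return accepted
    raise RuntimeError("chaotic sequence did not yield enough values")

def intra_frame_pseudo_rfh(x_init, N, delta_I, min_jump=0):
    """Eq. (A1): hop indices i_n = i'_n * delta_I, shared by every frame."""
    return [i * delta_I for i in bernoulli_indices(x_init, N, N, min_jump)]

pattern = intra_frame_pseudo_rfh(x_init=0.3, N=16, delta_I=4)
print(pattern)
```

Calling `bernoulli_indices` with `scale` set to $I_{MAX}$ instead of $N$ gives the integers $j'_i$ of Equation (A2) in the same way; a different `x_init` yields a different pattern.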

#### 2. Intra-complex-frame pseudo-RFH

In each complex frame, the frequency of the $n$-th pulse repetition period of the $m$-th frame is $f_m + i_{mn}\Delta f_d$. The basic frequency $f_m$ of each frame changes randomly within the allowable bandwidth of the radar antenna, and the $N$ frequency points of each frame are randomly selected from the $N$ uniformly stepped frequency points in $[0, \Delta F]$. Different frames use different frequency point orders. $f_m$ and $i_{mn}$ are generated as follows.

$$\begin{aligned} z_m &= 2.01\,z_{m-1} \bmod 1 \\ j'_m &= \text{INT}[I_{MAX} \times z_m] \\ f_m &= j'_m \Delta f_d \quad (m = 0, 1, \ldots, M-1) \end{aligned} \tag{A3}$$

$$\begin{aligned} x_{m,n} &= 2.01\,x_{m,n-1} \bmod 1 \\ i'_{mn} &= \text{INT}[N \times x_{m,n}] \\ \Omega &= \{ f_m,\ i_{mn}\Delta f_d = i'_{mn} \times \Delta I\, \Delta f_d \mid n = 0, 1, \ldots, N-1;\ m = 0, 1, \ldots, M-1 \} \end{aligned} \tag{A4}$$

where $x_{m,-1} = x_{m-1,N-1}$; the initial values $z_{-1}$ and $x_{0,-1}$ can be randomly selected in the range $(0, 1)$. By changing the initial values, multiple groups of RFH patterns can be generated.

#### 3. Intra-frame pure-RFH

In each complex frame, the frequency of the $n$-th pulse repetition period of the $m$-th frame is $f_0 + i_n\Delta f_d$. The basic frequency $f_0$ of each frame is constant, but $f_0$ can jump randomly within the allowable bandwidth of the radar antenna between complex frames according to Equation (A2). Although the $N$ frequency points of each frame are randomly selected in $[0, \Delta F]$, different frames adopt the same frequency points and order. If $I'_{MAX} = \Delta F/\Delta f_d$, the RFH pattern is generated as follows.

$$\begin{aligned} x_n &= 2.01\,x_{n-1} \bmod 1 \\ i'_n &= \text{INT}[I'_{MAX} \times x_n] \\ \Omega &= \{ i_{mn}\Delta f_d = i'_n \Delta f_d \mid n = 0, 1, \ldots, N-1;\ m = 0, 1, \ldots, M-1 \}. \end{aligned} \tag{A5}$$

By changing the initial value $x_{-1}$, multiple groups of RFH patterns can be generated.

#### 4. Intra-complex-frame pure-RFH

The basic frequency $f_m$ of each frame jumps within the frequency band allowed by the radar antenna. The $N$ frequency points of each frame are randomly selected in the bandwidth $[0, \Delta F]$, and different frames use different frequency points.

Algorithm: Generating two independent Bernoulli chaotic sequences $\{f_m\}$ and $\{i_{mn}\}$

$$\begin{aligned} z_m &= 2.01\,z_{m-1} \bmod 1 \\ j'_m &= \text{INT}[I_{MAX} \times z_m] \\ f_m &= j'_m \Delta f_d \quad (m = 0, 1, \ldots, M-1) \end{aligned} \tag{A6}$$

$$\begin{aligned} x_{m,n} &= 2.01\,x_{m,n-1} \bmod 1 \\ i'_{mn} &= \text{INT}[I_{MAX} \times x_{m,n}] \\ \Omega &= \{ f_m,\ i_{mn}\Delta f_d = i'_{mn}\Delta f_d \mid n = 0, 1, \ldots, N-1;\ m = 0, 1, \ldots, M-1 \}, \end{aligned} \tag{A7}$$

where the initial values $z_{-1}$ and $x_{0,-1}$ can be randomly selected in the range $(0, 1)$ and $x_{m,-1} = x_{m-1,N-1}$.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Review* **Compressed Sensing Radar Imaging: Fundamentals, Challenges, and Advances**

#### **Jungang Yang, Tian Jin, Chao Xiao \* and Xiaotao Huang**

College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China

**\*** Correspondence: xiaochao12@nudt.edu.cn

Received: 3 June 2019; Accepted: 11 July 2019; Published: 13 July 2019

**Abstract:** In recent years, sparsity-driven regularization and compressed sensing (CS)-based radar imaging methods have attracted significant attention. This paper provides an introduction to the fundamental concepts of this area. In addition, we will describe both sparsity-driven regularization and CS-based radar imaging methods, along with other approaches in a unified mathematical framework. This will provide readers with a systematic overview of radar imaging theories and methods from a clear mathematical viewpoint. The methods presented in this paper include the minimum variance unbiased estimation, least squares (LS) estimation, Bayesian maximum a posteriori (MAP) estimation, matched filtering, regularization, and CS reconstruction. The characteristics of these methods and their connections are also analyzed. Sparsity-driven regularization and CS based radar imaging methods represent an active research area; there are still many unsolved or open problems, such as the sampling scheme, computational complexity, sparse representation, influence of clutter, and model error compensation. We will summarize the challenges as well as recent advances related to these issues.

**Keywords:** radar imaging; synthetic aperture radar; compressed sensing; sparse reconstruction; regularization

#### **1. Introduction**

Radar imaging techniques date back to at least the 1950s. Over the past 60 years, their development has been stimulated by advances in hardware performance, imaging theories, and signal processing technologies. Figure 1 shows the developmental history of radar imaging methods.

**Figure 1.** Developmental history of radar imaging methods.

Since the development of radar imaging techniques, the main theory that has been used has always been matched filtering [1–3]. Matched filtering is a linear process; it has the advantages of simplicity and stability. However, the drawbacks of the matched filtering method are also obvious. Since it does not exploit any prior information concerning the expected targets, its performance is limited by the signal bandwidth. It also requires a dense sampling to record the signals, according to the Shannon–Nyquist sampling theorem. Thus, the matched filtering method places significant requirements on the measured data, but only produces results with limited performance. As higher and higher imaging performance is demanded, the matched filtering method will struggle to meet the requirements.

Apart from the matched filtering framework, from a more generic mathematical viewpoint, radar imaging can be viewed as an inverse problem [4–7], whereby a spatial map of the scene is recovered using the measurements of the scattered electric field. The radar observation process is a Fredholm integral (F-I) equation of the first kind [8]. Due to observation limitations, such as limited bandwidth and limited observation angles, this inverse problem is usually ill-posed [9,10]. The classic least squares (LS) estimation method cannot solve such ill-posed inverse problems efficiently. The matched filtering method can be viewed as using an approximation to eliminate the irreversible or unstable term in the LS solution. This approximation leads to limited resolution and side-lobes in the results. Thus, matched filtering methods typically provide an image that blurs the details of the scene. Using proper models for the targets, super-resolution methods can improve the resolution of the imaging result [11,12].

Besides using approximation, the ill-posed inverse problem can be solved by another approach, i.e., adding an extra constraint to the LS formula and yielding a stable solution. This approach is called regularization [8]. In order to make the solution after regularization closer to the true value, the additional constraint should represent appropriately some prior knowledge. The regularization approach can also be explained by the Bayesian maximum a posteriori (MAP) estimation theory [6,13,14], which uses prior knowledge in a probabilistic way.

In the radar imaging scenario, imposing sparsity is one possible form of prior knowledge [15]. The advantages of the sparsity-driven regularization methods include increased image quality and robustness to limitations in data quantity. Compressed sensing (CS) refers to the use of under-sampled measurements to obtain the coefficients of a sparse expansion [16–20].

This paper summarizes the fundamentals, challenges and recent advances of sparse regularization and CS-based radar imaging methods. Using a unified mathematical model, we derive the best estimator (i.e., the minimum variance unbiased estimator), the LS estimator, the Bayesian MAP estimator, matched filtering, regularization, and CS reconstructions of the scene. The characteristics of these methods and their connections are also analyzed. Finally, we present some key challenges and recent advances in this area. These include the sampling scheme, the computational complexity, the sparse representation, the influence of clutter, and the model error compensation.

#### **2. Mathematical Fundamentals of Radar Imaging**

#### *2.1. Radar Observation Model*

In the continuous signal domain, under the Born approximation, the radar observation process can be denoted as [4]

$$s(\mathbf{r}) = \int A(\mathbf{r}, \mathbf{r'}) g(\mathbf{r'}) d\mathbf{r'} + n \tag{1}$$

where *s*(**r**) denotes the observed data at the observation position **r**, *g*(**r**′) denotes the reflectivity coefficient at position **r**′ in the scene, *A*(**r**, **r**′) denotes the system response from **r**′ to **r**, and *n* denotes noise.

Assuming the system is shift invariant, Equation (1) can be rewritten as

$$s(\mathbf{r}) = \int A(\mathbf{r} - \mathbf{r'}) g(\mathbf{r'}) d\mathbf{r'} + n \tag{2}$$

Under this assumption, the radar observation model is a convolution process. Equation (1) is a Fredholm integral (F-I) equation of the first kind [8]. From a mathematical viewpoint, radar imaging can be viewed as solving the F-I equation; that is, we want to recover *g*(**r**′) from the observed data *s*(**r**) using the observation equation. Unfortunately, according to the theory of integral equations, solving the F-I equation is usually an ill-posed problem [8].

In practice, since digitization is commonly used, the observed data are discrete. Based on Equation (1), the discrete observation model can be written as

$$\mathbf{s} = \mathbf{A}\mathbf{g} + \mathbf{n} \tag{3}$$

where **s** is stacked from the samples of *s*(**r**), **g** is stacked from the samples of *g*(**r**′), **A** is formed from samples of *A*(**r**, **r**′), and **n** is the observation noise vector.
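As an illustration of Equation (3), the following minimal sketch builds a small measurement matrix and simulates noisy data. The one-dimensional stepped-frequency phase model, the function names `build_measurement_matrix` and `simulate_observation`, and all numerical values are assumptions chosen for illustration; they are not taken from the systems discussed in this paper.

```python
import numpy as np

def build_measurement_matrix(freqs_hz, ranges_m, c=3.0e8):
    """Return A with A[m, n] = exp(-j*4*pi*f_m*r_n/c) (assumed two-way phase delay)."""
    f = np.asarray(freqs_hz, dtype=float)[:, None]   # shape (M, 1)
    r = np.asarray(ranges_m, dtype=float)[None, :]   # shape (1, N)
    return np.exp(-1j * 4.0 * np.pi * f * r / c)

def simulate_observation(A, g, noise_std, rng):
    """Return s = A g + n with circular complex white Gaussian noise n."""
    num_meas = A.shape[0]
    n = noise_std * (rng.standard_normal(num_meas)
                     + 1j * rng.standard_normal(num_meas)) / np.sqrt(2.0)
    return A @ g + n

rng = np.random.default_rng(0)
freqs = np.linspace(1.0e9, 1.5e9, 64)     # M = 64 frequency samples (hypothetical)
ranges = np.linspace(0.0, 10.0, 128)      # N = 128 scene grid points (hypothetical)
A = build_measurement_matrix(freqs, ranges)
g = np.zeros(128, dtype=complex)
g[[20, 60, 63]] = [1.0, 0.8, 0.6]         # three point scatterers
s = simulate_observation(A, g, noise_std=0.05, rng=rng)
```

Here each row of `A` corresponds to one measurement and each column to one grid point, matching the stacking described above.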

#### *2.2. Best Linear Unbiased Estimate and Least Squares Estimate of the Scene*

From the observation model shown in (3), radar imaging can be viewed as an estimation problem, in which the scene **g** is estimated based on the observed data **s** in a noisy environment. According to estimation theory, the minimum variance unbiased estimate is the "best" estimate in terms of squared estimation error. From Equation (3), it can be seen that when the radar observation model is linear, the minimum variance unbiased estimate is the best linear unbiased estimate [13]; that is, the expression of the best estimate of the scene is

$$\stackrel{\star}{\mathbf{g}} = \left(\mathbf{A}^{H}\mathbf{C}^{-1}\mathbf{A}\right)^{-1}\mathbf{A}^{H}\mathbf{C}^{-1}\mathbf{s} \tag{4}$$

where **C** is the covariance matrix of the noise term, **C** = *E*{**nn**<sup>*H*</sup>}.

In practice, a more tractable approach is LS estimation, which can be denoted as

$$\hat{\mathbf{g}} = \operatorname*{argmin}_{\mathbf{g}} \|\mathbf{s} - \mathbf{A} \mathbf{g}\|_2^2 \tag{5}$$

Therefore, the LS estimate of the scene is

$$\hat{\mathbf{g}} = \left(\mathbf{A}^{H}\mathbf{A}\right)^{-1}\mathbf{A}^{H}\mathbf{s} \tag{6}$$

If **n** is white Gaussian noise, we have **C** = σ<sup>2</sup>**I**, where **I** is the identity matrix. Under this condition, Equations (4) and (6) coincide. Therefore, the LS estimate equals the best estimate in white Gaussian noise [13].
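As a brief check of this statement (a worked substitution added for illustration), inserting **C** = σ<sup>2</sup>**I** into the right-hand side of Equation (4) gives

$$\left(\mathbf{A}^{H}(\sigma^{2}\mathbf{I})^{-1}\mathbf{A}\right)^{-1}\mathbf{A}^{H}(\sigma^{2}\mathbf{I})^{-1}\mathbf{s} = \sigma^{2}\left(\mathbf{A}^{H}\mathbf{A}\right)^{-1}\sigma^{-2}\mathbf{A}^{H}\mathbf{s} = \left(\mathbf{A}^{H}\mathbf{A}\right)^{-1}\mathbf{A}^{H}\mathbf{s}$$

so the noise variance cancels and the right-hand side of Equation (6) is recovered, provided that the inverse exists.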

If we want to use Equation (6) to calculate the best estimate of the scene, a prerequisite is that (**A**<sup>*H*</sup>**A**) is invertible. However, in practice, this prerequisite is usually not satisfied, as discussed below. We assume that the size of **A** is *M* × *N*, where *M* denotes the number of measurements and *N* denotes the number of unknown grid points. Then, the size of (**A**<sup>*H*</sup>**A**) is *N* × *N*.

One case is *M* < *N*, i.e., there are fewer measurements than unknown variables. CS is a typical example of this case. In such a case, rank(**A**<sup>*H*</sup>**A**) = rank(**A**) ≤ *M* < *N*, i.e., (**A**<sup>*H*</sup>**A**) is singular and cannot be inverted.

In the above case, (**A**<sup>*H*</sup>**A**) is not invertible because the number of measurements is limited. Is it possible to make (**A**<sup>*H*</sup>**A**) invertible by increasing the number of measurements (i.e., making *M* > *N*)? As mentioned previously, due to physical limitations, such as limited bandwidth and limited observation angles, if we take more measurements, the interval between adjacent measurements becomes smaller. Thus, the coherence between the adjacent columns in **A** will increase. Consequently, (**A**<sup>*H*</sup>**A**) will probably be ill-conditioned, and its inverse (**A**<sup>*H*</sup>**A**)<sup>−1</sup> will be numerically unstable.

In summary, the LS solution usually contains non-invertible or ill-conditioned terms. This problem is inherent, and derives from the properties of the F-I equation of the first kind [8].
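Both cases can be examined numerically with a small sketch. The stepped-frequency phase model, the helper name `gram_matrix`, and the chosen band, scene extent, and grid sizes are illustrative assumptions; the point is only to show a singular Gram matrix for *M* < *N* and a very large condition number when more measurements are packed into the same fixed band.

```python
import numpy as np

def gram_matrix(num_meas, num_grid, f0_hz=1.0e9, bandwidth_hz=0.5e9,
                extent_m=3.0, c=3.0e8):
    """Return A^H A for num_meas frequencies inside a fixed band (assumed model)."""
    f = f0_hz + np.linspace(0.0, bandwidth_hz, num_meas)[:, None]
    r = np.linspace(0.0, extent_m, num_grid)[None, :]
    A = np.exp(-1j * 4.0 * np.pi * f * r / c)
    return A.conj().T @ A

# Case M < N: the Gram matrix is rank deficient, so its smallest
# eigenvalue is zero up to rounding error.
eig_under = np.linalg.eigvalsh(gram_matrix(16, 32))
ratio_under = abs(eig_under[0]) / eig_under[-1]

# Case M > N: denser sampling of the same band does not add
# independent information, so the condition number stays very large.
conds_over = [np.linalg.cond(gram_matrix(M, 64)) for M in (128, 256, 512)]
```

In this sketch the grid spacing is much finer than the nominal resolution *c*/(2*B*) of the assumed band, which is the situation in which the Gram matrix becomes ill-conditioned regardless of *M*.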

#### *2.3. Matched Filtering Method*

Examining Equation (6), it can be seen that the non-invertible or ill-conditioned term is (**A**<sup>*H*</sup>**A**)<sup>−1</sup>. We can left-multiply both sides of Equation (6) by (**A**<sup>*H*</sup>**A**) to eliminate (**A**<sup>*H*</sup>**A**)<sup>−1</sup>. In this way, we avoid explicitly calculating the nonexistent or unstable term (**A**<sup>*H*</sup>**A**)<sup>−1</sup>. This leads to the matched filtering method, which can be denoted as

$$
\stackrel{\star}{\mathbf{g}}_{\text{MF}} = (\mathbf{A}^H \mathbf{A}) \stackrel{\star}{\mathbf{g}} = \mathbf{A}^H \mathbf{s} \tag{7}
$$

Equation (7) can be viewed as multiplying the best estimate of the scene by (**A**<sup>*H*</sup>**A**). The matrix (**A**<sup>*H*</sup>**A**) is the autocorrelation of the system response, which usually has a sinc pulse shape [1,21]. The matched filtering result can be viewed as the convolution of the best estimate of the scene and the sinc function. A point target will be spread, and side-lobes will also appear in the matched filtering result [21]. This implies that the matched filtering method can only provide an image that blurs the details of the scene. The matched filtering method has a limited resolution, which depends on the autocorrelation of the system response [1].

Figure 2 shows an example of the matched filtering method. Six point targets are set in the scene. It can be seen that the matched filtering result is the convolution of the targets and the autocorrelation of the system response. As a result, an ideal point target is spread into a sinc waveform. Consequently, targets will interfere with each other, and two closely spaced targets may not be resolved in the matched filtering result.

Equation (7) is the original form of the matched filtering equation. In practice, in order to reduce the computational cost and make it more convenient for implementation, some transformations and approximations are usually adopted for Equation (7). Equation (7) can represent many widely used imaging algorithms, such as backprojection algorithms, range Doppler algorithms, chirp scaling algorithms, and ωK algorithms [1].

**Figure 2.** Matched filtering example. Two closely spaced targets cannot be resolved.
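The spreading of a point target under Equation (7) can be reproduced with a minimal sketch. The stepped-frequency phase model and all numerical values below are assumptions for illustration and do not correspond to the simulation in Figure 2.

```python
import numpy as np

c = 3.0e8
freqs = np.linspace(1.0e9, 1.5e9, 64)      # assumed band of 0.5 GHz, c/(2B) = 0.3 m
ranges = np.linspace(0.0, 10.0, 256)       # assumed scene grid, about 0.04 m spacing
A = np.exp(-1j * 4.0 * np.pi * freqs[:, None] * ranges[None, :] / c)

def matched_filter(A, s):
    """Equation (7): apply the conjugate transpose of A to the data."""
    return A.conj().T @ s

g = np.zeros(256, dtype=complex)
g[100] = 1.0                               # a single ideal point target
image = np.abs(matched_filter(A, A @ g))   # noise-free data

peak_index = int(np.argmax(image))
# Number of grid cells above half of the peak magnitude: a point target
# occupies several cells because it is spread into a sinc-like main lobe.
mainlobe_cells = int(np.sum(image > 0.5 * image.max()))
```

The peak stays at the true target position, but the response covers several neighbouring grid cells, which is why targets closer than the main-lobe width interfere with each other.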

#### *2.4. Regularization Method*

Examining the LS formula (Equation (5)), it can be seen that it only relies on the observed data. In order to make the ill-posed inverse problem become well-posed, we can add an extra constraint to the LS formula [8–10]. This leads to the regularization method, which can be denoted as

$$\hat{\mathbf{g}} = \operatorname*{argmin}_{\mathbf{g}} \left\{ \|\mathbf{s} - \mathbf{A} \mathbf{g}\|_{2}^{2} + \lambda L(\mathbf{g}) \right\} \tag{8}$$

where λ is the regularization parameter and *L*(**g**) is the added penalty function. In order to make the solution of Equation (8) closer to the true value, *L*(**g**) should represent appropriate prior knowledge for the problem.

A typical choice of *L*(**g**) is

$$L(\mathbf{g}) = \|\mathbf{g}\|_p^p \tag{9}$$

where ‖·‖<sub>*p*</sub> denotes the *p*-norm, i.e.,

$$\|\mathbf{g}\|_{p} = \begin{cases} \left(\sum_{i=1}^{N} \left|g_{i}\right|^{p}\right)^{1/p} & p > 0\\ \text{Number of nonzero elements in } \mathbf{g} & p = 0 \end{cases} \tag{10}$$

Then, Equation (8) can be rewritten as

$$\hat{\mathbf{g}} = \operatorname*{argmin}_{\mathbf{g}} \left\{ \|\mathbf{s} - \mathbf{A} \mathbf{g}\|_{2}^{2} + \lambda \|\mathbf{g}\|_{p}^{p} \right\} \tag{11}$$

The choice of *p* can control the result of the regularization method. If we want to enforce sparsity in the result, we should choose *p* in the range 0 ≤ *p* ≤ 1 [16,17]. For *p* = 1, Equation (11) can be compared to the Lasso solution of the CS type methods [16]. Equation (11) can be solved by gradient search algorithms, such as the Newton iteration [22].
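For the case *p* = 1, one simple way to minimize the cost in Equation (11) is the iterative soft-thresholding algorithm (ISTA), a proximal gradient method. This is a swapped-in technique for illustration and is not the Newton-type iteration of [22]. The random Gaussian matrix, the problem sizes, and the value of λ below are assumptions, and the names `soft_threshold`, `ista`, and `objective` are hypothetical helpers.

```python
import numpy as np

def soft_threshold(z, tau):
    """Complex soft-thresholding: shrink each magnitude by tau, keep the phase."""
    mag = np.abs(z)
    scale = np.maximum(1.0 - tau / np.maximum(mag, 1e-30), 0.0)
    return scale * z

def ista(A, s, lam, num_iters=500):
    """Minimise ||s - A g||_2^2 + lam * ||g||_1 by proximal gradient steps."""
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)   # 1 / Lipschitz constant
    g = np.zeros(A.shape[1], dtype=complex)
    for _ in range(num_iters):
        gradient = 2.0 * A.conj().T @ (A @ g - s)    # gradient of the data term
        g = soft_threshold(g - step * gradient, step * lam)
    return g

def objective(A, s, g, lam):
    """Cost of Equation (11) with p = 1."""
    return np.linalg.norm(s - A @ g) ** 2 + lam * np.sum(np.abs(g))

rng = np.random.default_rng(1)
M, N = 64, 128                                        # fewer measurements than unknowns
A = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2 * M)
g_true = np.zeros(N, dtype=complex)
g_true[[10, 50, 90]] = [1.0, 0.8, 0.6]                # sparse scene
s = A @ g_true                                        # noise-free data
g_hat = ista(A, s, lam=0.05)
```

The 1-norm penalty biases the recovered amplitudes slightly towards zero, so in practice λ trades data fidelity against sparsity.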

#### *2.5. Bayesian Maximum a Posteriori Estimation*

It should be noted that in Equation (11), the added constraint term λ‖**g**‖<sub>*p*</sub><sup>*p*</sup> represents prior knowledge [17,23]. Another estimation approach based on prior knowledge is Bayesian estimation. The main idea behind the Bayesian estimation framework is to account explicitly for the errors, and also for incomplete prior knowledge. Assuming that the noise **n** in Equation (3) is white and Gaussian, we have

$$p(\mathbf{n}) \propto \exp\left\{-\frac{1}{2\sigma^2} \|\mathbf{n}\|_2^2\right\} \tag{12}$$

where σ<sup>2</sup> is the noise variance. Then we obtain the expression of likelihood

$$p(\mathbf{s}|\mathbf{g}) \propto \exp\left\{-\frac{1}{2\sigma^2} \|\mathbf{s} - \mathbf{A}\mathbf{g}\|_2^2\right\} \tag{13}$$

We assume that the scene has a prior probability density function, as

$$p(\mathbf{g}) \propto \exp\{-\alpha \|\mathbf{g}\|_{p}^{p}\} \tag{14}$$

If 0 ≤ *p* ≤ 1, the magnitude of the scene is more likely to concentrate around zero, which implies that the scene is sparse. For a review on sparsity enforcing priors for the Bayesian estimation approach, the reader can refer to [6].

Using the prior probability density of **g** shown in (14), and according to the Bayes rule, we obtain

$$p(\mathbf{g}|\mathbf{s}) = \frac{p(\mathbf{s}|\mathbf{g})p(\mathbf{g})}{p(\mathbf{s})} \propto \frac{1}{p(\mathbf{s})} \exp\left\{-\frac{1}{2\sigma^2} \left\|\mathbf{s} - \mathbf{A}\mathbf{g}\right\|_2^2 - \alpha \left\|\mathbf{g}\right\|_p^p\right\} \tag{15}$$

Then the MAP estimate can be obtained easily as

$$\hat{\mathbf{g}} = \arg\max_{\mathbf{g}} p(\mathbf{g}|\mathbf{s}) = \arg\min_{\mathbf{g}} \left\{ \|\mathbf{s} - \mathbf{A}\mathbf{g}\|_2^2 + 2\sigma^2 \alpha \|\mathbf{g}\|_p^p \right\} \tag{16}$$

Comparing Equations (11) and (16), it can be seen that when λ = 2σ<sup>2</sup>α, these two equations are equivalent, i.e., the regularization method is equivalent to Bayesian MAP estimation.
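The step from the posterior to the MAP cost can be spelled out as follows (a short worked derivation added for illustration, using the observation model of Equation (3) for the likelihood). Because the logarithm is monotonic and *p*(**s**) does not depend on **g**, maximizing the posterior is the same as minimizing its negative logarithm:

$$-\ln p(\mathbf{g}|\mathbf{s}) = \frac{1}{2\sigma^{2}}\|\mathbf{s}-\mathbf{A}\mathbf{g}\|_2^2 + \alpha\|\mathbf{g}\|_p^p + \text{const}$$

Multiplying by the positive constant 2σ<sup>2</sup> leaves the minimizer unchanged, which is why the weight of the penalty term becomes 2σ<sup>2</sup>α. A noisier measurement (larger σ<sup>2</sup>) or a more strongly peaked prior (larger α) therefore corresponds to a larger regularization parameter.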

#### *2.6. Compressed Sensing Method*

For the observation model shown in Equation (3), if the scene (i.e., **g**) is sparse, according to CS theory, it can be stably reconstructed using reduced data samples. The reconstruction method can be written as [16,17]

$$\hat{\mathbf{g}} = \operatorname*{argmin}_{\mathbf{g}} \|\mathbf{g}\|_{0} \text{ s.t. } \|\mathbf{s} - \mathbf{A} \mathbf{g}\|_{2}^{2} < \varepsilon \tag{17}$$

where s.t. means subject to and ε denotes the allowed data error in the reconstruction process.

The problem in Equation (17) is NP-hard and computationally difficult to solve [17]. Matching pursuit is an approximate method for obtaining an ℓ<sub>0</sub> sparse solution. In CS theory, a more tractable approach is to use the 1-norm instead of the 0-norm, which is called the ℓ<sub>1</sub> relaxation:

$$\hat{\mathbf{g}} = \operatorname*{argmin}_{\mathbf{g}} \|\mathbf{g}\|_1 \text{ s.t. } \|\mathbf{s} - \mathbf{A} \mathbf{g}\|_2^2 < \varepsilon \tag{18}$$

If **g** is sparse and **A** satisfies some specific conditions, Equations (18) and (17) will have the same solution, and this solution is the exact or approximate recovery of **g** [16,17]. Equation (18) can be solved using convex programming, which is more tractable than the original 0-norm minimization problem. Unlike the matched filtering method, the CS method does not have an exact or pre-defined resolution, since it is a non-linear method. Generally, the resolution capability of the CS method is much better than that of the matched filtering method if the targets are sparse.
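The greedy route to an approximate 0-norm solution can be sketched with orthogonal matching pursuit (OMP), a common variant of matching pursuit. The function name `omp`, the random Gaussian measurement matrix, and the problem sizes are assumptions for illustration only; the stopping rule here is a fixed sparsity level rather than the data error ε.

```python
import numpy as np

def omp(A, s, sparsity):
    """Greedy orthogonal matching pursuit: select `sparsity` columns of A."""
    residual = s.copy()
    support = []
    coeffs = np.zeros(0, dtype=complex)
    for _ in range(sparsity):
        correlations = np.abs(A.conj().T @ residual)
        if support:
            correlations[support] = 0.0            # never reselect a column
        support.append(int(np.argmax(correlations)))
        # Least-squares fit of the data on the columns selected so far.
        coeffs, *_ = np.linalg.lstsq(A[:, support], s, rcond=None)
        residual = s - A[:, support] @ coeffs
    g_hat = np.zeros(A.shape[1], dtype=complex)
    g_hat[support] = coeffs
    return g_hat

rng = np.random.default_rng(3)
M, N = 64, 128                                      # under-sampled: M < N
A = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2 * M)
g_true = np.zeros(N, dtype=complex)
g_true[[5, 40, 100]] = [1.0, -0.8, 0.6j]
s = A @ g_true                                      # noise-free data
g_hat = omp(A, s, sparsity=3)
```

With noise-free data and a correctly identified support, the final least-squares fit reproduces the nonzero coefficients, which illustrates how a sparse scene can be recovered from fewer measurements than unknowns.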

Figure 3 shows an example of compressed sensing. The simulated scene is the same as the matched filtering example shown in Figure 2. Only 1/20 of the signal samples are used for the CS reconstruction. It can be seen that the two closely spaced targets are well resolved. This implies that the CS method can obtain better results using less data than the matched filtering method. The reason is that prior information concerning signal sparsity is utilized in the CS model.

Equation (18) is a constrained optimization problem. According to the Lagrange theory, it can be transformed into an unconstrained optimization problem, which will have the same form as Equation (11). For appropriate choices of λ and *p* = 1, Equations (11) and (18) will be equivalent [16,17]. This implies that CS is a special case of the regularization method.

**Figure 3.** Compressed sensing example; closely spaced targets are well resolved.

#### *2.7. Summary of Radar Imaging Methods*

The above subsections introduced the LS estimator, matched filtering, regularization methods, Bayesian MAP estimation, and the CS method. In this subsection, we will summarize these methods and analyze their connections.

Table 1 lists the main characteristics and describes some connections between these imaging methods. The LS estimation only relies on the observed data, and cannot solve the ill-posed radar imaging problem efficiently. The matched filtering method can be viewed as using an approximation to avoid the ill-posed term in the LS solution. The regularization method, Bayesian MAP estimation, and the CS method exploit prior knowledge concerning the targets in addition to the observed data, and they are equivalent in some cases.

Table 1 also shows the equivalent geometric illustration for each method in ℝ<sup>2</sup>. The observation equation can only confine the solution to a hyperplane (which becomes a line in ℝ<sup>2</sup>), but cannot reliably produce a certain solution [17,23]. The other methods aim at obtaining a stable solution close to the true value, using some modifications that represent prior knowledge concerning the targets.

Figure 4 shows the block diagram and the relationship of the radar imaging methods. All of the radar imaging methods can be divided into two branches. The first branch does not use the prior information of the targets or scene, and it leads to the linear imaging methods; the most typical and widely used one in this branch is matched filtering. Another branch uses the prior information of the targets or scene. This leads to the non-linear methods. The most recently developed methods, including regularization methods, Bayesian methods, and CS methods belong to this branch.


**Table 1.** Characteristics and connections of radar imaging methods.

**Figure 4.** Block diagram and relationship of the radar imaging methods.

#### **3. Challenges and Advances in Compressed Sensing-Based Radar Imaging**

The use of regularization methods in radar imaging goes back at least to the year 2000 [21,24]. Since the CS theory was proposed in 2006, it has been explored for a wide range of radar [25–33] and radar imaging applications [4,34–38], including synthetic aperture radar (SAR) [39–42], inverse SAR (ISAR) [43–45], tomographic SAR [46–51], three-dimensional (3D) SAR [52–54], SAR ground moving target indication (SAR/GMTI) [55–61], ground penetrating radar (GPR) [62–64], and through-the-wall radar (TWR) [65–67]. In this paper, we will focus on two-dimensional (2D) imaging radar systems, i.e., SAR, GPR, and TWR.

After several years of development, although many interesting ideas have been presented in this area, there still exist a number of challenges, both in theory and practice [68]. The state of the art in this area has not yet reached the stage of practical application. We will present some challenges as well as recent advances in this part of the paper.

#### *3.1. Sampling Scheme*

CS usually involves random under-sampling [16,17]. A widely used waveform in traditional radar imaging is the linear frequency modulated (LFM) waveform. If we adopt the LFM waveform in CS-based radar imaging, a random sampling analog to digital (A/D) converter is needed, which is not easily realized in practice. This will require extra hardware components, which means that LFM waveforms are not ideally suited for CS.

Recently, many researchers have found that the stepped frequency waveform is much more suitable for CS than the LFM waveform [35,62,63,66,69]. Sparse and discrete frequencies are more convenient for hardware implementation. For a CS-based radar imaging system, a stepped frequency waveform may be the preferred choice. In practical application, a set of adjustable pseudorandom numbers can be generated to select the frequency points in the stepped frequencies. In this way, randomly generated frequencies, i.e., random and sparse measurement, can be realized, and the CS-based imaging model can be implemented.
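The pseudorandom selection of frequency points described above can be sketched as follows. The function name `select_random_frequencies`, the start frequency, and the frequency step are hypothetical; only the counts (128 selected out of 2001 steps) are borrowed from the experiment reported below.

```python
import numpy as np

def select_random_frequencies(f_start_hz, f_step_hz, num_total, num_selected, seed):
    """Pick a sorted pseudorandom subset of the stepped-frequency grid."""
    rng = np.random.default_rng(seed)
    indices = np.sort(rng.choice(num_total, size=num_selected, replace=False))
    return indices, f_start_hz + f_step_hz * indices

# The seed plays the role of the adjustable pseudorandom setting:
# the same seed reproduces the same frequency plan.
indices, freqs = select_random_frequencies(2.0e9, 1.0e6, 2001, 128, seed=7)
repeat_indices, _ = select_random_frequencies(2.0e9, 1.0e6, 2001, 128, seed=7)
```

Because the selection is reproducible from the seed, the rows of the measurement matrix used in reconstruction can be regenerated without storing the full frequency list.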

Figures 5 and 6 show an example for CS-based stepped frequency radar imaging. The main equipment in the experimental system is a vector network analyzer (VNA). The experiment is carried out in a non-reflective microwave chamber. Five targets in the scene are shown in Figure 5. Figure 6a shows the backprojection result, using the fully sampled data (81 azimuth measurements × 2001 frequencies). Figure 6b shows the CS reconstruction result using under-sampled data (27 azimuth measurements × 128 frequencies). Considering the aspects of resolution and sidelobe levels, the CS reconstruction result is even better than the backprojection result, although it uses less sampled data. The reason is that prior information concerning signal sparsity is used in the CS model, while the backprojection method uses no prior information.

**Figure 5.** Experimental scene for CS-based stepped frequency radar imaging. (**a**) Five reflectors in the microwave chamber. (**b**) Transmitter and receiver antennas.

**Figure 6.** (**a**) Backprojection result of full data (81 azimuth measurements × 2001 frequencies). (**b**) CS result of under-sampled data (27 azimuth measurements × 128 frequencies).

#### *3.2. Computational Complexity*

In the regularization or CS model for a 2D radar imaging system, the 2D observed data and the 2D scene grid are both stacked into column vectors. This leads to a very large measurement matrix. For example, suppose that the original fully sampled data are 2048 × 2540 points (azimuth × range), and that a 512 × 512 pixel image is reconstructed from reduced sampling data consisting of 256 × 256 points. Then the size of the matrix **A** is 65,536 × 262,144. Since regularization or CS reconstruction is a non-linear process, such a large measurement matrix results in a huge computational burden for image reconstruction. In addition, storing the measurement matrix requires 128 gigabytes of memory (assuming single-precision floating-point complex numbers, i.e., 8 bytes per entry). This is too much memory for normal desktop computers. Considering that the data size in practice is usually larger than in the above example, it is difficult for conventional methods to reconstruct a moderate-size scene on normal computers.
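The memory figure in this example can be checked with a few lines of arithmetic. The helper name `measurement_matrix_bytes` is hypothetical, and the 8 bytes per entry assume single-precision complex storage.

```python
def measurement_matrix_bytes(num_meas, num_pixels, bytes_per_entry=8):
    """Storage for a dense num_meas x num_pixels complex matrix."""
    return num_meas * num_pixels * bytes_per_entry

rows = 256 * 256          # stacked reduced data: 65,536 measurements
cols = 512 * 512          # stacked image grid: 262,144 pixels
size_gib = measurement_matrix_bytes(rows, cols) / 2**30   # 128.0
```

The storage grows with the product of the data size and the image size, so doubling both image dimensions alone multiplies the requirement by four.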

A common idea for reducing computational complexity and memory occupancy is to split big data into sets of small data [70]. Based on this idea, a segmented reconstruction method for CS-based SAR imaging has been proposed [71]. In this method, the whole scene is split into a set of small subscenes. Since the computational complexity grows non-linearly with the data size, the reconstruction time can be reduced significantly. The sensing matrices for the method proposed in [71] are much smaller than those for the conventional method. Therefore, the method also needs much less memory. Due to the short reconstruction time and lower memory requirement of the method proposed in [71], reconstructing a moderate-size scene in a short time is no longer a difficult task. The processing steps of the segmented reconstruction method are shown in Figure 7.

**Figure 7.** Processing steps of the segmented reconstruction method for CS-based synthetic aperture radar (SAR) imaging (taken from [71]).

Figures 8 and 9 show an example of the segmented reconstruction method [71]. Figure 8 shows the experimental scene of an airborne SAR system, which contains six trihedral reflectors. Figure 9a shows the conventional CS reconstruction result, where the reconstruction time is 44,032 s (12 h 14 min). The whole scene is split into five segments, and Figure 9b shows the segmented reconstruction result, where the reconstruction time is now reduced to 1498 s (25 min). It can be seen that, using the segmented reconstruction method, the reconstruction time is significantly reduced, while the reconstruction precision is nearly the same.

**Figure 8.** Trihedral reflectors in the scene: (**a**) sight A of the scene; (**b**) sight B of the scene. Trihedral reflectors 1–4 are large, and trihedral reflectors 5 and 6 are small (taken from [71]).

**Figure 9.** (**a**) Conventional CS reconstruction result (reconstruction time = 44,032 s). (**b**) Segmented reconstruction result (reconstruction time = 1498 s) (taken from [71]).

#### *3.3. Sparsity and Sparse Representation*

Sparsity of the scene is an essential requirement for sparse regularization or CS methods. For an SAR scene, an extended scene is usually not sparse in itself (not sparse in the canonical basis), except for the case of a few dominant scatterers in a low reflective background [35]. Therefore, a sparse representation is needed to use a sparsity-driven method.

CS-based optical imaging has successfully used sparse representations [72]. However, radar imaging involves complex-valued quantities; the raw data and the imaging result are both complex-valued. Since the phase of the scene is potentially random, it is very difficult to find a transform basis to sparsify a complex-valued and extended scene [73,74].

Structured dictionaries and dictionary learning ideas are proposed in [75] and [76], respectively. An alternative approach is to handle the magnitude and phase separately [41]. Although the phase of the scene is potentially random, the magnitude of the scene usually has better sparse characteristics. However, this approach has a much higher computational complexity than standard CS reconstruction. Another method investigates physical scattering behavior [4,77]. For example, a car can be represented as the superposition of responses from plate and dihedral shapes.

Figure 10 shows a simulation example for an extended and complex-valued scene. There are two extended objects in the scene, one of which has a round shape while the other has a rectangular shape. Both objects have random phases associated with them. It can be seen that the DCT (Discrete Cosine Transform) results of the magnitude are sparse.
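The effect of the random phase on sparsity can be illustrated in one dimension. The sketch below is not the simulation of Figure 10: the object shapes, the 95% energy criterion, and the helper names `dct_matrix` and `coeffs_for_energy` are assumptions chosen to keep the example small.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix of size N x N."""
    k = np.arange(N)[:, None]
    n = np.arange(N)[None, :]
    D = np.sqrt(2.0 / N) * np.cos(np.pi * (n + 0.5) * k / N)
    D[0, :] /= np.sqrt(2.0)
    return D

def coeffs_for_energy(x, fraction=0.95):
    """Smallest number of largest-magnitude entries holding `fraction` of the energy."""
    energy = np.sort(np.abs(x) ** 2)[::-1]
    cumulative = np.cumsum(energy) / energy.sum()
    return int(np.searchsorted(cumulative, fraction) + 1)

rng = np.random.default_rng(2)
N = 256
magnitude = np.zeros(N)
magnitude[40:100] = np.hanning(60)     # smooth "round" object
magnitude[150:190] = 1.0               # rectangular object
phase = rng.uniform(0.0, 2.0 * np.pi, N)
scene = magnitude * np.exp(1j * phase)  # complex-valued scene with random phase

D = dct_matrix(N)
k_magnitude = coeffs_for_energy(D @ magnitude)   # DCT of the magnitude only
k_complex = coeffs_for_energy(D @ scene)         # DCT of the complex scene
```

Comparing `k_magnitude` with `k_complex` shows that far fewer DCT coefficients are needed for the magnitude than for the complex scene, which is the motivation for treating magnitude and phase separately.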

Figure 11a shows the result of matched filtering. Since the random phase leads to speckle, it can be seen that although the scene has a smooth shape, the matched filtering result has obvious fluctuation. Figure 11b shows the result of conventional CS reconstruction without sparse representation. The reconstruction algorithm is SPGL1 [78]. Since the scene is not sparse in the canonical basis, the reconstruction is not accurate. Figure 11c shows the result of the method using a magnitude sparse representation [41]; it can be seen that the reconstruction result is much better than Figure 11a,b. Figure 11d shows the result of the method using the improved magnitude sparse representation method proposed in [79]. In the proposed method, besides the sparsity, the real-valued information of the magnitude and the coefficient distribution of the sparse representation are also utilized. It can be seen that both the shape and speckle are further improved.

**Figure 10.** (**a**) Magnitude of the scene, (**b**) phase of the scene, (**c**) DCT result of the magnitude (taken from [79]).

**Figure 11.** Simulation results: (**a**) matched filtering result, (**b**) conventional CS reconstruction result without sparse representation, (**c**) result of the method with magnitude sparse representation, and (**d**) result of the method with improved magnitude sparse representation (taken from [79]).

Figure 12 shows the real data results. The raw data were acquired by an airborne SAR system. The scene in Figure 12 is farmland with trellises. The reflectivity from the trellises is very strong. From the real data result, it can be seen that CS with the improved magnitude sparse representation method can produce an image with less speckle and clearer edges between different regions than the previous methods.

**Figure 12.** Real data reconstruction results (scene of farmland with trellises): (**a**) matched filtering result (full data), (**b**) conventional CS reconstruction result without sparse representation, (**c**) result of CS with magnitude sparse representation, and (**d**) result of CS with improved magnitude sparse representation (taken from [79]).

#### *3.4. Influence of Clutter*

Another practical case is when the targets of interest are sparse, but there also exists clutter in the scene. Clutter arises from reflections within the scene, so the image may no longer be sparse if significant clutter returns are present. Typical examples include GPR and TWR imaging. The interesting targets, such as landmines and humans, are usually sparse, but they are often buried in the ground surface clutter and wall clutter.

Some methods have been proposed to remove the ground surface clutter and wall clutter for downward-looking GPR and TWR [64,65]. These methods are effective in cases when the clutter is concentrated in a fixed range cell or limited to several range cells.

Another scenario is TWR/SAR imaging of moving targets. A sparsity-driven change detection method is proposed in [67]. The stationary targets and clutter are removed via change detection, and then CS reconstruction is applied to the resulting sparse scene. In [55], a SAR/GMTI method using distributed CS is proposed, which can cope with the non-sparse stationary clutter.

A more difficult case is when both the targets and clutter are stationary, and the clutter is distributed over the whole scene. Forward-looking GPR may fall into this category. Figure 13 shows a real data example for this case. In such a scenario, shrubs and rocks above the ground surface may cause strong azimuth clutter. Short range clutter is usually also strong, due to the large grazing angle and short range. Besides the strong clutter far away from the target (landmine), there is also ground surface clutter around the target. In [68], an idea is proposed to build a model in which the clutter is also taken into account as a norm in the objective function. In [80], the forward-looking clutter is suppressed in two steps. In the first step, the strong clutter outside of the reconstruction region is suppressed. In the second step, the clutter in the reconstruction region is suppressed by selecting a proper β, which represents the ratio of the nonzero area in the reconstructed scene. The reconstruction results are shown in Figure 14.

**Figure 13.** Real data example for the clutter problem in forward-looking GPR (backprojection result using full-sampled data). Taken from [80].

**Figure 14.** Reconstruction results in clutter environment with different parameters (taken from [80]).

#### *3.5. Model Error Compensation*

In the regularization or CS methods, we usually assume that the model is exact. However, in practice, the model may also contain errors. For example, imperfect knowledge of the observation position will lead to errors in the measurement matrix. This effect resembles motion errors that arise in traditional airborne SAR imaging. Figure 15 shows the geometry of the observation position errors or motion errors in SAR.

Several methods have been proposed to deal with model errors in CS-based or sparsity-driven radar imaging. A phase error correction method for sparsity-driven SAR imaging is proposed in [81]. An autofocus method for compressively sampled SAR is proposed in [82]. This method can correct phase errors in the reconstruction process. Both the methods proposed in [81,82] deal with phase errors in the observed data, or approximately treat the observation position-induced model errors as phase errors in the observed data. In [83], the platform position errors are investigated and compensated. That method considers the azimuth offset errors and also uses some approximations.

**Figure 15.** Geometry of the observation position errors in SAR. (Taken from [84]).

In [84], a model error compensation method is proposed. It uses an iterative algorithm that cycles through steps of target reconstruction and of observation position error estimation and compensation. This method can estimate the observation position error exactly, while only relying on the observed data.

Figure 16 shows a real data result using the method proposed in [84]. The data set used in this figure is the same as that used for Figure 9. In the data acquisition process, the airplane is expected to fly along a straight line. However, due to the air current's influence, the trajectory of the airplane may slightly deviate from the expected one. As a result, the observation position data inevitably contain some errors.

**Figure 16.** Observation position error compensation for airborne SAR data. (**a**) Result without observation position error compensation. (**b**) Result with observation position error compensation (taken from [84]).

Figure 16a shows the original CS reconstruction result. Since the observation position errors are not compensated, it can be seen that the targets are somewhat defocused. Figure 16b shows the corresponding CS reconstruction result with compensation for observation position error. It can be seen that the focusing quality is improved using the method proposed in [84]. The peak of the targets has an increase of about 20%, and the sidelobes are also significantly reduced.

#### **4. Conclusions**

In the radar imaging area, there are many relevant techniques and methods, such as matched filtering, the range Doppler algorithm, the chirp scaling algorithm, the ωK algorithm, regularized methods, and CS methods. These techniques and methods are quite different in their forms. This paper tries to understand these techniques and methods in a unified mathematical framework.

Based on theoretical analysis, it can be seen that sparsity-driven regularization or CS-based radar imaging methods have potentially significant advantages. However, although many interesting ideas have been presented, very few of them have been verified with real data. There are still many unsolved or open problems in this area. Among the issues discussed in this paper, the sampling scheme, fast reconstruction strategy, and model error problems are basically solved. However, issues concerning the sparsity or sparse representation of a complex and extended scene are still not completely solved. Strong clutter may break the sparsity of a scene, while sparse representation methods for an extended scene are currently not perfect. The state of the art in these areas has not yet reached the stage of practical application, and further investigations are needed in the future.

**Funding:** This work was supported in part by the National Natural Science Foundation of China under Grant 61401474, and in part by the Hunan Provincial Natural Science Foundation under Grant 2016JJ3025.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Compressive Sensing-Based Bandwidth Stitching for Multichannel Microwave Radars**

#### **Paul Berry 1,\*, Ngoc Hung Nguyen <sup>2</sup> and Hai-Tan Tran <sup>1</sup>**


Received: 18 November 2019; Accepted: 19 January 2020; Published: 24 January 2020

**Abstract:** The problem of obtaining high range resolution (HRR) profiles for non-cooperative target recognition by coherently combining data from narrowband radars was investigated using sparse reconstruction techniques. If the radars concerned operate within different frequency bands, then this process increases the overall effective bandwidth and consequently enhances resolution. The case of unknown range offsets occurring between the radars' range profiles due to incorrect temporal and spatial synchronisation between the radars was considered, and the use of both pruned orthogonal matching pursuit and refined *l*1-norm regularisation solvers was explored to estimate the offsets between the radars' channels so as to attain the necessary coherence for combining their data. The proposed techniques were demonstrated and compared using simulated radar data.

**Keywords:** radar signal processing techniques; radar imaging; multiband processing; compressive sensing; sparse reconstruction; bandwidth stitching

#### **1. Introduction**

The construction of high range resolution profiles (HRRP) of targets is a precursor to feature extraction for automatic target recognition (ATR), and normally requires the employment of a high-bandwidth waveform following detection by a lower resolution radar mode. Examples of recent papers in the non-cooperative target recognition (NCTR) literature focusing on feature extraction for ATR following HRRP construction are [1–3]. This paper considers the problem of HRRP construction, but using low resolution radars operating in different frequency bands for the purpose of combining their signals to achieve a higher resolution, and examines the problem of their data not being mutually coherent.

The ability to acquire high resolution range profiles of targets has improved over time as hardware capability has developed, with higher resolution being achieved by increasing the time-bandwidth product. In the early approaches, for narrowband radars with very limited instantaneous bandwidth, stepped-frequency waveforms were used with a single I,Q (that is, baseband quadrature) signal sample received after each frequency step. The set of samples effectively constitutes the Fourier transform of the slant range profile, so that the range response can be obtained simply by applying an inverse Fourier transform (see, e.g., [4]). The greater the frequency range, the higher the range resolution, but the downside is that the burst of pulses can be so long that a scatterer may migrate between range cells, causing smearing of the range profile, and therefore requiring range compensation. An example of a recent paper involving the use of stepped-frequency waveforms is [5], and recent papers which have investigated the effects of target motion and aspect sensitivity are [6,7].
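The stepped-frequency principle described above can be illustrated with a minimal sketch: one complex sample is simulated per frequency step for a set of ideal point scatterers, and an inverse FFT recovers the range profile. The function name, the parameter values, and the rounded speed of light are assumptions made for illustration only; target motion and noise are ignored.

```python
import numpy as np

C = 3.0e8  # speed of light in m/s (rounded; an assumption of this sketch)

def stepped_frequency_profile(f0, df, n_steps, ranges, amps):
    """Simulate one I/Q sample per frequency step and invert with an IFFT."""
    freqs = f0 + df * np.arange(n_steps)
    # Each point scatterer contributes a linear phase ramp across the steps.
    samples = sum(a * np.exp(-4j * np.pi * freqs * r / C)
                  for r, a in zip(ranges, amps))
    # With this phase sign, the inverse FFT maps step index to range bin.
    profile = np.fft.ifft(samples)
    # Range bin k corresponds to k * c / (2 * n_steps * df).
    range_axis = np.arange(n_steps) * C / (2.0 * n_steps * df)
    return range_axis, profile

# Hypothetical example: 64 steps of 5 MHz starting at 6 GHz.
range_axis, profile = stepped_frequency_profile(
    6.0e9, 5.0e6, 64, ranges=[4.6875], amps=[1.0])
peak_bin = int(np.argmax(np.abs(profile)))
```

The unambiguous range window in this sketch is `C / (2 * df)`, and the bin spacing `C / (2 * n_steps * df)` shrinks as the total swept bandwidth grows, which is the resolution gain referred to in the text.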

An advance on the stepped-frequency approach is to increase the time-bandwidth product using stretch processing, whereby a wideband LFM waveform is transmitted, and pulse-compression is achieved in hardware by mixing the received signal with an extended replica of the transmitted waveform. A point target at a particular range will manifest itself as a single frequency which is proportional to its range. The result, which is digitally sampled in time in order to facilitate the identification of the frequency components, is therefore a superposition of discrete frequencies, each corresponding to a point target at a different range. An application of the inverse Fourier transform again recovers the range profile (see, e.g., [4,8,9]).

A spectral analysis technique to improve range resolution was proposed in [10,11] based on autoregressive linear prediction. The main idea is to combine the mutually coherent signals received from multiple waveforms transmitted sequentially or concurrently, which have widely separate carrier frequencies or may even occupy entirely different frequency bands. Viewed in the spectral domain, the received signals from individual waveforms may be seen to occupy discrete wavebands which are separate or contiguous. If contiguous, then they can potentially be coherently combined to synthesise the signals that would have been received from a single wider bandwidth waveform in the manner presented in [12]. If separate, then presumably this coherent combination of signals would still be feasible, as would be the interpolation of the frequency response in the gaps between the bands under the a priori assumption that the signals are returned from discrete scatterers using spectral estimation techniques (see, e.g., [13]). Alternatively, the signals from different frequency bands can be jointly processed without explicitly filling the gaps between the bands. Since no new synthetic frequency band is actually constructed, this approach can be referred to as bandwidth stitching to distinguish it from bandwidth interpolation and extrapolation. The main challenge of this approach lies in the presence of phase errors in different frequency bands resulting from post-processed motion compensation which is often carried out separately for each frequency band. In this paper, we focus on the problem of bandwidth stitching for radar high-resolution range profiling and explore the use of sparse reconstruction to deal with the phase error problem.

It is convenient to formulate these ideas in the spectral domain, within which point scatterers appear as discrete sinusoids and which are amenable to analysis by spectral estimation techniques such as autoregression, as presented in [10,11]. Compressive sensing and sparse reconstruction, however, provide for the possibility of alternative signal representations, potentially allowing for greater flexibility and discriminating between signals of physical origin and receiver noise [13–20]. Instances of non-sinusoidal signals are waveforms in fast-time and signals returned from rotating objects when the angle of rotation is large. Compressive sensing and sparse reconstruction can also handle the situation corresponding to data being non-uniformly sampled in time or space, such as non-uniform PRF (pulse repetition frequency) waveforms and random sparse arrays.

Compressive sensing and sparse reconstruction were exploited in [17,19,21,22] to address the problem of gaps in the data both in slow-time and in frequency for inverse synthetic aperture radar (ISAR) imaging. However, these works assumed that the data were coherent across different sub-bands and that there were no model uncertainties. The work [23] took account of the possible lack of mutual coherence between the radars operating on the different sub-bands arising from incorrect timing synchronisation or, equivalently, errors in the relative locations of the antenna phase centres. This is achieved by fitting an ultra-wideband all-pole signal model to the mutually-coherent sub-bands, which is then used for bandwidth interpolation and extrapolation prior to recovering the range profile by means of an inverse Fourier transform. This paper, in contrast, proposes the use of compressive sensing and sparse reconstruction to deal with the non-coherence problem between different sub-bands.

To address the sub-band non-coherence problem, two different approaches were explored: (i) greedy pursuit and (ii) $l_1$-norm regularisation. In the first approach, pruned orthogonal matching pursuit (POMP) [24], which was originally developed for micro-Doppler parameter estimation, is adopted to deal with the dictionary mismatch which is due to the phase errors in each sub-band resulting from the motion-compensation post processing. The main idea is to parameterise the dictionary as a function of the phase errors and to construct multiple realisations of the dictionary. A selective learning process is then used to discard the dictionaries which correspond to incorrect values of phase errors. Since a straight application of the POMP algorithm to the problem under consideration would have been computationally expensive, we first applied the POMP algorithm pairwise to sub-bands in order to estimate the phase errors, and then utilised the conventional OMP algorithm to determine the range profile based on the estimated phase error values. In the second approach, an $l_1$-norm regularisation problem can be solved jointly both for the range profile vector and the phase errors [25]. The work [25] offers two solutions for joint synthetic aperture radar (SAR) imaging and phase error correction. The first solution is not applicable to the problem under consideration because a constant phase error in each sub-band is assumed. On the other hand, the second solution considers general arbitrary phase errors, and thus can be applied to our problem. Since the phase errors within each sub-band are a linear function of the range error coming from post-processing motion compensation, we also present refined variants of the second solution of [25] to take into account this underlying structure of the phase error.

The paper is organised as follows. Section 2 formulates the problem of bandwidth stitching for HRRP in the presence of phase errors. The POMP algorithm is applied in Section 3 to the bandwidth stitching problem under consideration. Section 4 presents *l*1-norm regularisation solvers. Numerical performance comparisons are provided in Section 5 and conclusions are drawn in Section 6.

#### **2. Problem Formulation**

Consider a multistatic radar system consisting of *M* radar channels on different and distinct frequency sub-bands, approximately co-located, and illuminating a common target, such that their radar lines of sight (LoS) coincide but their range profiles are out of alignment. Each channel can individually produce a one dimensional range profile of the target, but with relatively coarse resolution. The bandwidth stitching problem can be briefly stated as follows: for the *M* generally non-coherent channels, the aim is to coherently combine, or "stitch," the channels together so that they can effectively produce a single range profile with resolution corresponding to the combined overall signal bandwidth.

Let $f_{m,n}$ ($n = 1, \dots, N_m$) denote the $n$th frequency bin of the $m$th channel ($m = 1, \dots, M$). Here, $N_m$ is the number of frequency bins in the $m$th channel. We base the formulation on the point-scatterer model and assume that the target consists of $K$ scattering centres at local line-of-sight coordinates $x_k$ (or local "down ranges") with complex-valued reflectivity coefficients $\alpha_k$, which are also assumed frequency-independent. The down-converted, pulse-compressed, motion-compensated signals received in each channel, in the frequency domain, can be written as

$$\mathbf{S}_m = [\dots, S_{m,n}, \dots]^{T}_{n=1,\dots,N_m} \tag{1}$$

where superscript $T$ denotes the transpose operation, and

$$S_{m,n} = |A(f_{m,n})|^2 \exp\left\{-\frac{4\pi j f_{m,n}}{c} \Delta R_m \right\} \sum_{k=1}^{K} \alpha_k \exp\left\{-\frac{4\pi j f_{m,n}}{c} x_k \right\}. \tag{2}$$

Here, $A(f_{m,n})$ represents the transmitted radar waveform, its squared magnitude resulting from pulse compression processing; the constant $c$ denotes the speed of light; and $\Delta R_m$ accounts for the range error in the motion-compensation processing of channel $m$. Bandwidth stitching in this context amounts to estimating the phase errors induced by these range errors as accurately as possible.

The signals $\mathbf{S}_m$ in (1) can be rewritten in a more compact form as

$$\mathbf{S}_m = \mathbf{\Lambda}_m \mathbf{F}_m^{\dagger} \boldsymbol{\alpha}^{\dagger}, \tag{3}$$

where

$$\mathbf{\Lambda}_m = \text{diag}\left\{ \dots, \exp\left\{ -\frac{4\pi j f_{m,n}}{c} \Delta R_m \right\}, \dots \right\}_{n=1,\dots,N_m} \tag{4}$$

$$\mathbf{F}_m^{\dagger} = [\dots, \mathbf{F}_{m,k}^{\dagger}, \dots]_{k=1,\dots,K} \tag{5}$$

$$\mathbf{F}_{m,k}^{\dagger} = \left[ \dots, \exp\left\{ -\frac{4\pi j f_{m,n}}{c} x_k \right\}, \dots \right]^{T}_{n=1,\dots,N_m} \tag{6}$$

$$\boldsymbol{\alpha}^{\dagger} = [\dots, \alpha_k, \dots]^{T}_{k=1,\dots,K}. \tag{7}$$

Here, "diag" denotes a diagonal matrix; $\mathbf{\Lambda}_m$ is referred to as the phase error matrix, of dimension $N_m \times N_m$; $\mathbf{F}_m^{\dagger}$ and $\boldsymbol{\alpha}^{\dagger}$ have dimensions $N_m \times K$ and $K \times 1$, respectively; $\mathbf{S}_m$ is a column vector of dimension $N_m \times 1$; and the dagger symbol $\dagger$ refers to the $K$ actual scatterers on the target.

To apply the sparse representation techniques of compressive sensing, we discretise the target's local range coordinate $x$ using a regularly-spaced range grid $\{x_l\}$ for $l = 1, \dots, L_x$, with $L_x \gg K$, and construct the $N_m \times L_x$ dictionary matrices

$$\mathbf{F}_m = [\dots, \mathbf{F}_{m,l}, \dots]_{l=1,\dots,L_x} \tag{8}$$

where

$$\mathbf{F}_{m,l} = \left[ \dots, \exp\left\{ -\frac{4\pi j f_{m,n}}{c} x_l \right\}, \dots \right]^{T}_{n=1,\dots,N_m} \tag{9}$$

are the "atoms" of dictionary $\mathbf{F}_m$ in the frequency domain. The corresponding range profile vector

$$\boldsymbol{\alpha} = [\dots, \alpha_l, \dots]^{T}_{l=1,\dots,L_x} \tag{10}$$

spans the range grid $\{x_l\}$. The received signal $\mathbf{S}_m$ can also be written as

$$\mathbf{S}_m = \mathbf{\Lambda}_m \mathbf{F}_m \boldsymbol{\alpha}. \tag{11}$$

Since the target usually contains only a small number of dominant scattering centres relative to the total number of range resolution cells, the range profile *α* can be considered sparse (i.e., containing a small number of non-zero elements).
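The gridded signal model can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names, the example sub-band, the grid, and the scatterer values are hypothetical, the speed of light is rounded, and the waveform factor $|A(f_{m,n})|^2$ is taken as 1.

```python
import numpy as np

C = 3.0e8  # speed of light in m/s (rounded; an assumption of this sketch)

def dictionary(freqs, grid):
    """Dictionary of Eq. (8): column l is the atom of Eq. (9) for grid[l]."""
    return np.exp(-4j * np.pi * np.outer(freqs, grid) / C)

def phase_error_diag(freqs, delta_r):
    """Diagonal entries of the phase error matrix of Eq. (4)."""
    return np.exp(-4j * np.pi * freqs * delta_r / C)

def channel_signal(freqs, grid, alpha, delta_r):
    """Noise-free channel signal of Eq. (11), with |A(f)|^2 taken as 1."""
    return phase_error_diag(freqs, delta_r) * (dictionary(freqs, grid) @ alpha)

# Hypothetical example: one 300 MHz sub-band centred on 8 GHz, 0.1 m grid.
freqs = np.linspace(7.85e9, 8.15e9, 64)
grid = 0.1 * np.arange(50)
alpha = np.zeros(grid.size, dtype=complex)
alpha[[5, 20]] = [1.0, 0.5j]  # two scatterers on the grid, so alpha is sparse
s = channel_signal(freqs, grid, alpha, delta_r=0.278)
```

Because the phase error matrix is diagonal, it is stored here as a vector and applied by elementwise multiplication rather than as a full $N_m \times N_m$ matrix.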

In the presence of unknown noise, (11) becomes

$$\tilde{\mathbf{S}}_m = \mathbf{\Lambda}_m \mathbf{F}_m \boldsymbol{\alpha} + \mathbf{n}_m, \tag{12}$$

where $\mathbf{n}_m$ is the additive noise for channel $m$. Stacking up the individual channel signals $\tilde{\mathbf{S}}_m$, $m = 1, \dots, M$, gives

$$\tilde{\mathbf{S}} = \mathbf{\Lambda} \mathbf{F} \boldsymbol{\alpha} + \mathbf{n}, \tag{13}$$

where

$$\tilde{\mathbf{S}} = [\dots, \tilde{\mathbf{S}}_m^{T}, \dots]^{T}_{m=1,\dots,M} \tag{14}$$

$$\mathbf{\Lambda} = \text{diag}\{\dots, \mathbf{\Lambda}_m, \dots\}_{m=1,\dots,M} \tag{15}$$

$$\mathbf{F} = [\dots, \mathbf{F}_m^{T}, \dots]^{T}_{m=1,\dots,M} \tag{16}$$

$$\mathbf{n} = [\dots, \mathbf{n}_m^{T}, \dots]^{T}_{m=1,\dots,M}. \tag{17}$$

Note that $\tilde{\mathbf{S}}$ and $\mathbf{n}$ are column vectors of size $(\sum_m N_m) \times 1$; the diagonal phase error matrix $\mathbf{\Lambda}$ is of size $(\sum_m N_m) \times (\sum_m N_m)$; the dictionary matrix $\mathbf{F}$ is $(\sum_m N_m) \times L_x$; and the range profile $\boldsymbol{\alpha}$ is again a column vector of size $L_x \times 1$. Stacking the received signals amounts to a vertical stacking of the dictionary matrices from all channels and a diagonal concatenation of the corresponding phase error matrices. The stacking of multiple channels in this manner can improve the estimation accuracy for $\boldsymbol{\alpha}$, as will be demonstrated later in the paper.

A *problem statement* can thus be expressed as follows: given $\tilde{\mathbf{S}}$ as the measured signal,

$$\text{find } \boldsymbol{\alpha} \text{ and } \mathbf{\Lambda}, \text{ subject to } \left\{ \begin{array}{l} \tilde{\mathbf{S}} \approx \mathbf{\Lambda} \mathbf{F} \boldsymbol{\alpha}, \\ \boldsymbol{\alpha} \text{ is sparse.} \end{array} \right. \tag{18}$$

The estimation of $\boldsymbol{\alpha}$ over $\{x_l\}$ is the process of range profiling, giving the main desired output, whereas the estimation of the phase error matrix $\mathbf{\Lambda}$ is really only a necessary intermediate result; it is a function of $\Delta R_1, \Delta R_2, \dots, \Delta R_M$ (recall that $\Delta R_m$ is the range estimation error resulting from the motion-compensation process for channel $m$). Furthermore, since these errors arise from a lack of precise knowledge of the relative locations of the radar channels' phase centres and are small relative to a range resolution cell, we may assume, without loss of generality, that $\Delta R_1 = 0$.

#### **3. Greedy Pursuit Solutions**

In this section, we adopt the pruned OMP (POMP) technique, which was originally proposed for micro-Doppler parameter estimation [24], and apply it to the problem of bandwidth stitching for range profiling. We start with the simplest case of two channels and then generalise it to the multiple channel case.

#### *3.1. The Two-Channel Case*

For this case, the signal model in (13) can be expressed as

$$\tilde{\mathbf{S}} = \mathbf{\Lambda}(\Delta R_2) \mathbf{F} \boldsymbol{\alpha} + \mathbf{n}, \tag{19}$$

where $\mathbf{\Lambda}(\Delta R_2)$ is a function of the single unknown relative range error $\Delta R_2$,

$$\mathbf{\Lambda}(\Delta R_2) = \text{diag}\{\mathbf{I}_{N_1}, \mathbf{\Lambda}_2(\Delta R_2)\}, \tag{20}$$

with

$$\mathbf{\Lambda}_2(\Delta R_2) = \text{diag}\left\{ \dots, \exp\left\{ -\frac{4\pi j f_{2,n}}{c} \Delta R_2 \right\}, \dots \right\}_{n=1,\dots,N_2}. \tag{21}$$

In addition to the sparse range profile vector $\boldsymbol{\alpha}$, $\Delta R_2$ is the only additional unknown parameter to be estimated. Let us rewrite (19) as

$$\tilde{\mathbf{S}} = \mathbf{\Phi}(\Delta R_2)\, \boldsymbol{\alpha} + \mathbf{n} \tag{22}$$

where

$$\mathbf{\Phi}(\Delta R_2) = \mathbf{\Lambda}(\Delta R_2)\, \mathbf{F}.$$

In this form, the problem can be viewed as a joint sparse reconstruction and parameter estimation problem with the parametric dictionary $\mathbf{\Phi}(\Delta R_2)$ itself a function of the parameter $\Delta R_2$. This can be considered as a special dictionary learning problem where the objective is to solve simultaneously for both the sparse solution of $\boldsymbol{\alpha}$ and the range error $\Delta R_2$.

To solve this problem, we adopt the POMP technique [24], which embeds a pruning operation into the iterative process of OMP. The main idea of POMP is to construct multiple realisations of the dictionary $\mathbf{\Phi}$ based on a number $L_\Delta$ of candidate values of $\Delta R_2$; at each iteration, the OMP algorithm is applied to each dictionary realisation to find the atom which correlates most strongly with the current residual for that dictionary, and to recompute the residual with that atom's contribution removed. To avoid excessive computation on outlier candidate values of $\Delta R_2$, a pruning operation is then performed to exclude the half of the dictionaries which yield the largest residual errors, and this is repeated until a single dictionary realisation remains. The OMP iterations for the remaining dictionary are continued until a termination criterion is satisfied. The candidate value of $\Delta R_2$ corresponding to this dictionary gives the final estimate of $\Delta R_2$. The basic POMP algorithm is summarised in Table 1 and its computational cost is shown in Appendix C to be of the order of $L_\Delta N_m L_x$.

**Table 1.** The pruned orthogonal matching pursuit (POMP) algorithm (*M* = 2).

#### INPUT:


#### PROCEDURE:

	- set the initial indexes of active dictionaries to $\Theta_1 = \{1, \dots, L_\Delta\}$;
	- set the corresponding residual vectors to $\mathbf{r}_1 = \cdots = \mathbf{r}_{L_\Delta} = \tilde{\mathbf{S}}$;
	- set the initial supports $\Lambda_l$ to $\emptyset$, the null set;
	- **for** every $l \in \Theta_i$, perform OMP as follows

$$\begin{array}{ll} \text{- Identify:} & \mathbf{c}_l = \mathbf{\Phi}_l^{H} \mathbf{r}_l, \quad j_l = \arg\max_j |c_{l,j}| \\ \text{- Merge supports:} & \Lambda_l = \Lambda_l \cup \{j_l\} \\ \text{- Update}^{*}\text{:} & \hat{\boldsymbol{\alpha}}_{l,\Lambda_l} = \left(\mathbf{\Phi}_{l,\Lambda_l}^{H} \mathbf{\Phi}_{l,\Lambda_l}\right)^{-1} \mathbf{\Phi}_{l,\Lambda_l}^{H} \tilde{\mathbf{S}}, \quad \mathbf{r}_l = \tilde{\mathbf{S}} - \mathbf{\Phi}_{l,\Lambda_l} \hat{\boldsymbol{\alpha}}_{l,\Lambda_l} \end{array}$$

**if** $|\Theta_i| > 1$

Remove from $\Theta_i$ the indices of the $|\Theta_i|/2$ candidate dictionaries that correspond to the $|\Theta_i|/2$ largest residual errors

**end if**

**end for**

OUTPUT:


<sup>∗</sup> $\mathbf{\Phi}_{l,\Lambda_l}$ consists of the columns of $\mathbf{\Phi}_l$ with indices belonging to $\Lambda_l$, and $\hat{\boldsymbol{\alpha}}_{l,\Lambda_l}$ consists of the elements of $\hat{\boldsymbol{\alpha}}_l$ with indices belonging to $\Lambda_l$.
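The two-channel pruning idea summarised in Table 1 can be sketched as below. This is an illustrative reading of the procedure rather than the authors' code: the function name is hypothetical, a fixed number of iterations stands in for the unspecified termination criterion, the speed of light is rounded, and when the number of active dictionaries is odd the sketch keeps the larger half.

```python
import numpy as np

C = 3.0e8  # speed of light in m/s (rounded; an assumption of this sketch)

def pomp_two_channel(s, f1, f2, grid, candidates, n_iter):
    """Pruned OMP for two channels; returns (delta_r2_hat, alpha_hat)."""
    F = np.exp(-4j * np.pi * np.outer(np.concatenate([f1, f2]), grid) / C)
    dicts, supports, resid, coef = {}, {}, {}, {}
    for l, d in enumerate(candidates):
        # Channel 1 is the phase reference, so only channel 2 is rotated.
        lam = np.concatenate([np.ones(len(f1)),
                              np.exp(-4j * np.pi * f2 * d / C)])
        dicts[l] = lam[:, None] * F
        supports[l], resid[l], coef[l] = [], s.copy(), np.zeros(0, complex)
    active = list(dicts)
    for _ in range(n_iter):
        for l in active:
            # Identify the atom most correlated with the current residual.
            j = int(np.argmax(np.abs(dicts[l].conj().T @ resid[l])))
            if j not in supports[l]:
                supports[l].append(j)
            # Least-squares update on the merged support, then new residual.
            sub = dicts[l][:, supports[l]]
            coef[l] = np.linalg.lstsq(sub, s, rcond=None)[0]
            resid[l] = s - sub @ coef[l]
        if len(active) > 1:
            # Prune the dictionaries with the largest residual errors.
            active.sort(key=lambda l: np.linalg.norm(resid[l]))
            active = active[:len(active) - len(active) // 2]
    best = active[0]
    alpha = np.zeros(len(grid), dtype=complex)
    alpha[supports[best]] = coef[best]
    return candidates[best], alpha
```

A typical call would pass the stacked measurement of the two channels as `s`, the two frequency vectors, the range grid, and an array of candidate values for $\Delta R_2$; the number of iterations should be at least large enough for the pruning to reduce the candidate set to a single dictionary.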

#### *3.2. The Multi-Channel Case*

For this case, the noisy signal model (19) becomes

$$\tilde{\mathbf{S}} = \mathbf{\Phi}(\Delta R_2, \dots, \Delta R_M)\, \boldsymbol{\alpha} + \mathbf{n} \tag{23}$$

where

$$\mathbf{\Phi}(\Delta R_2, \dots, \Delta R_M) = \text{diag}\left\{ \mathbf{I}_{N_1}, \mathbf{\Lambda}_2(\Delta R_2), \dots, \mathbf{\Lambda}_M(\Delta R_M) \right\} \mathbf{F}. \tag{24}$$

Here, the dictionary matrix is a function of the $(M - 1)$ unknowns $\Delta R_2, \Delta R_3, \dots, \Delta R_M$.

The POMP algorithm could be extended to multiple channels by computing candidate dictionaries based on a multi-dimensional grid of candidate values for $\Delta R_2, \Delta R_3, \dots, \Delta R_M$. The grid would consist of a total of $(M - 1)$ dimensions, where the $m$th dimension corresponds to the unknown range error $\Delta R_{m+1}$ of the $(m + 1)$th channel. Note that only a one-dimensional grid for $\Delta R_2$ is required for the case of two channels. However, the cardinality of the dictionary set depends exponentially on the number of available channels; i.e., it equals $V^{M-1}$, where $V$ denotes the number of grid points in each parameter dimension. As a result, although this extension would be simple and straightforward, it is computationally expensive.

To alleviate this computational burden, we instead apply the POMP algorithm pairwise to channels in order to estimate the range errors $\Delta R_2, \dots, \Delta R_M$ relative to the first channel, and then utilise the conventional OMP algorithm to determine the range profile vector $\boldsymbol{\alpha}$ based on these estimated values of $\Delta R_2, \dots, \Delta R_M$. The procedure is summarised below:

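**STEP 1:** Estimation of range errors.

For $m = 2, \dots, M$:

	- Calculate input signal:

$$\tilde{\mathbf{S}}_{1m} = [\tilde{\mathbf{S}}_1^{T}, \tilde{\mathbf{S}}_m^{T}]^{T} \tag{25a}$$

$$\mathbf{F}_{1m} = [\mathbf{F}_1^{T}, \mathbf{F}_m^{T}]^{T}. \tag{25b}$$

	- Construct the candidate dictionaries for the candidate values $\Delta R_m^{(l)}$, $l = 1, \dots, L_\Delta$:

$$\mathbf{\Phi}_{1m,l} = \text{diag}\{\mathbf{I}_{N_1}, \mathbf{\Lambda}_m(\Delta R_m^{(l)})\}\, \mathbf{F}_{1m}. \tag{25c}$$

	- Estimate $\Delta R_m$ by applying the POMP algorithm of Table 1 to $\tilde{\mathbf{S}}_{1m}$ and the dictionaries $\mathbf{\Phi}_{1m,l}$.

End for.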

**STEP 2:** Estimation of range profile vector.

• Compute signal and dictionary.

$$\tilde{\mathbf{S}} = [\dots, \tilde{\mathbf{S}}_m^{T}, \dots]^{T}_{m=1,\dots,M} \tag{26}$$

$$\mathbf{F} = [\dots, \mathbf{F}_m^{T}, \dots]^{T}_{m=1,\dots,M} \tag{27}$$

$$\mathbf{\Lambda} = \text{diag}\{\dots, \mathbf{\Lambda}_m(\Delta R_m), \dots\}_{m=1,\dots,M} \tag{28}$$

$$\mathbf{\Phi} = \mathbf{\Lambda} \mathbf{F}. \tag{29}$$

• Estimate $\boldsymbol{\alpha}$ using OMP given $\tilde{\mathbf{S}}$ and $\mathbf{\Phi}$.

The computational cost of the general POMP algorithm is shown in Appendix C to be of the order of $(M - 1) L_\Delta N_m L_x$.
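To give a feel for why the pairwise procedure is preferred over a joint multi-dimensional grid, the following sketch simply counts candidate dictionaries under each strategy. The function names and the example values of $V$ and $M$ are hypothetical and chosen only for illustration.

```python
def joint_grid_size(v, m):
    """Candidate dictionaries for a joint (M - 1)-dimensional grid."""
    return v ** (m - 1)

def pairwise_grid_size(v, m):
    """Candidate dictionaries when POMP is run pairwise against channel 1."""
    return (m - 1) * v

# Hypothetical example: 100 candidate range errors per channel, 4 channels.
joint = joint_grid_size(100, 4)
pairwise = pairwise_grid_size(100, 4)
```

The joint count grows exponentially with the number of channels, whereas the pairwise count grows only linearly, in line with the $(M - 1)$ factor in the cost quoted above.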

#### **4. $l_1$-Norm Regularisation Approach**

The sparse reconstruction problem (18) can be solved via the following $l_1$ regularised optimisation:

$$\min_{\boldsymbol{\alpha}, \mathbf{\Lambda}} \left\{ \|\mathbf{S} - \mathbf{\Lambda} \mathbf{F} \boldsymbol{\alpha}\|_2^2 + \lambda \|\boldsymbol{\alpha}\|_1 \right\}, \tag{30}$$

where $\lambda$ is a regularisation parameter. It should be emphasized that this is not a conventional $l_1$ regularisation formulation because of the unknown phase error matrix $\mathbf{\Lambda}$ resulting from the estimation error of the motion-compensation phase. Therefore, $\mathbf{\Lambda}$ must be jointly estimated with $\boldsymbol{\alpha}$:

$$\{\hat{\boldsymbol{\alpha}}, \hat{\mathbf{\Lambda}}\} = \underset{\boldsymbol{\alpha}, \mathbf{\Lambda}}{\arg\min} \left\{ \|\mathbf{S} - \mathbf{\Lambda} \mathbf{F} \boldsymbol{\alpha}\|_2^2 + \lambda \|\boldsymbol{\alpha}\|_1 \right\}. \tag{31}$$

Two solutions for this joint estimation problem were presented in [25]. The first solution assumes that the phase error matrix for the $m$th sub-band is modelled as

$$\mathbf{\Lambda}_m = \exp\{j\phi_m\}\, \mathbf{I}_{N_m \times N_m}. \tag{32}$$

In other words, the phase errors for different frequency bins of a particular sub-band are identical. However, this assumption is invalid in the problem under consideration because the phase error is a function of frequency and thus has different values for different frequency bins. Therefore, that solution is not applicable in this case. The second solution considers a general phase error matrix

$$\mathbf{\Lambda}_m = \text{diag}\{\dots, \exp\{j\phi_{m,n}\}, \dots\}_{n=1,\dots,N_m} \tag{33}$$

where the phase errors $\phi_{m,n}$ can be arbitrary. Although this solution can be used, it does not exploit the underlying structure of the phase errors; i.e., $\phi_{m,n} = -\frac{4\pi f_{m,n}}{c} \Delta R_m$. In what follows, we will also present other refined versions, building on the second solution of [25], while exploiting prior knowledge of the structure of the phase error.

The $l_1$ norm can be approximated as [26–29]:

$$\|\boldsymbol{\alpha}\|_1 \approx \sum_{l=1}^{L_x} \left( |\alpha_l|^2 + \delta \right)^{1/2} \tag{34}$$

in order to overcome the non-differentiability of the $l_1$ norm at the origin. Here, $\delta$ is a small non-negative parameter. Using this approximation, the minimisation problem in (31) becomes

$$\{\hat{\boldsymbol{\alpha}}, \hat{\mathbf{\Lambda}}\} = \underset{\boldsymbol{\alpha}, \mathbf{\Lambda}}{\arg\min} \left\{ \left\| \mathbf{S} - \mathbf{\Lambda} \mathbf{F} \boldsymbol{\alpha} \right\|_2^2 + \lambda \sum_{l=1}^{L_x} \left( |\alpha_l|^2 + \delta \right)^{1/2} \right\}. \tag{35}$$

The solution of (35) tends to the solution of (31) as $\delta$ approaches zero. Therefore, a small value of $\delta$ should be used to ensure the validity of this approximation. The quasi-Newton approach can be adopted to solve the modified $l_1$ regularised optimisation (35), as below.

The gradient of the objective function of (35) is given by

$$\nabla(\boldsymbol{\alpha}) = \mathbf{H}(\boldsymbol{\alpha}) \boldsymbol{\alpha} - 2\, \mathbf{F}^{H} \mathbf{\Lambda}^{H} \mathbf{S}, \tag{36}$$

where the superscript $H$ denotes the Hermitian transpose operation. Here, $\mathbf{H}$ is the Hessian matrix given by

$$\mathbf{H}(\boldsymbol{\alpha}) = 2\, \mathbf{F}^{H} \mathbf{\Lambda}^{H} \mathbf{\Lambda} \mathbf{F} + \lambda \mathbf{W}(\boldsymbol{\alpha}) = 2\, \mathbf{F}^{H} \mathbf{F} + \lambda \mathbf{W}(\boldsymbol{\alpha}), \tag{37}$$

where

$$\mathbf{W}(\boldsymbol{\alpha}) = \text{diag}\left\{ \dots, \left( |\alpha_l|^2 + \delta \right)^{-1/2}, \dots \right\}. \tag{38}$$

Since the Hessian matrix is a function of the unknown $\boldsymbol{\alpha}$, the minimisation (35) is solved iteratively. Given the estimates $\hat{\boldsymbol{\alpha}}(i)$ and $\hat{\mathbf{\Lambda}}(i)$ from a previous iteration $i$, the new solutions at iteration $i + 1$ are obtained in the following two steps.

1. Calculate $\hat{\boldsymbol{\alpha}}(i + 1)$ by setting $\nabla(\boldsymbol{\alpha}) = \mathbf{0}$ given $\mathbf{H}(\hat{\boldsymbol{\alpha}}(i))$ and $\hat{\mathbf{\Lambda}}(i)$:

$$\begin{split} \hat{\boldsymbol{\alpha}}(i+1) &= 2\left(\mathbf{H}(\hat{\boldsymbol{\alpha}}(i))\right)^{-1} \mathbf{F}^{H} (\hat{\mathbf{\Lambda}}(i))^{H} \mathbf{S} \\ &= \left(\mathbf{F}^{H}\mathbf{F} + \frac{1}{2}\lambda \mathbf{W}(\hat{\boldsymbol{\alpha}}(i))\right)^{-1} \mathbf{F}^{H} (\hat{\mathbf{\Lambda}}(i))^{H} \mathbf{S}. \end{split} \tag{39}$$

2. Calculate $\hat{\mathbf{\Lambda}}(i + 1)$ given $\hat{\boldsymbol{\alpha}}(i + 1)$. The phase error matrix $\hat{\mathbf{\Lambda}}(i + 1)$ is obtained by solving:

$$\hat{\mathbf{\Lambda}}(i+1) = \underset{\mathbf{\Lambda}}{\arg\min}\, \|\mathbf{S} - \mathbf{\Lambda} \mathbf{F} \hat{\boldsymbol{\alpha}}(i+1)\|_2^2 \tag{40}$$

or equivalently

$$\hat{\mathbf{\Lambda}}_m(i+1) = \underset{\mathbf{\Lambda}_m}{\arg\min}\, \|\mathbf{S}_m - \mathbf{\Lambda}_m \mathbf{F}_m \hat{\boldsymbol{\alpha}}(i+1)\|_2^2 \tag{41}$$

for $m = 2, \dots, M$. Note that $\mathbf{\Lambda}_1 = \mathbf{I}_{N_1}$ (since $\Delta R_1 = 0$); thus, no estimation is required for $\mathbf{\Lambda}_1$.

The algorithm may be halted when the objective function falls below a threshold, or when a maximum number of iterations is reached, or when the relative change in the objective function falls below a threshold.
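Step 1 of the iteration, the update of the range profile in (39) with the phase errors held fixed, can be sketched as follows. The function name and argument names are hypothetical; the diagonal phase error matrix is passed as a vector of its diagonal entries, and no claim is made that this matches the authors' implementation.

```python
import numpy as np

def update_alpha(s, F, lam_diag, alpha_prev, reg, delta=1e-6):
    """One application of Eq. (39) with the phase error matrix held fixed."""
    # Diagonal weights of Eq. (38), evaluated at the previous estimate.
    w = (np.abs(alpha_prev) ** 2 + delta) ** -0.5
    lhs = F.conj().T @ F + 0.5 * reg * np.diag(w)
    # Lambda is diagonal, so Lambda^H S reduces to an elementwise product.
    rhs = F.conj().T @ (lam_diag.conj() * s)
    return np.linalg.solve(lhs, rhs)
```

In a full solver this update would alternate with one of the phase error updates of Step 2 until one of the stopping conditions above is met. Setting `reg` to zero reduces the update to an ordinary least-squares fit, which is a convenient sanity check.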

Various methods for calculating the phase error matrix $\hat{\mathbf{\Lambda}}(i + 1)$ in Step 2 are given in the following sections.

#### *4.1. Unstructured Approach*

Ignoring the underlying structure of the phase errors, i.e., $\phi_{m,n} = -\frac{4\pi f_{m,n}}{c} \Delta R_m$, $\hat{\mathbf{\Lambda}}_m$ can be considered as a diagonal matrix with arbitrary phases $\phi_{m,n}$:

$$\mathbf{\Lambda}_m = \text{diag}\{\dots, \exp\{j\phi_{m,n}\}, \dots\}_{n=1,\dots,N_m}. \tag{42}$$

Therefore, $\phi_{m,n}$ can be estimated as [25]

$$\hat{\phi}_{m,n}(i+1) = \tan^{-1} \frac{\Im \{ S_{m,n} \hat{Y}_{m,n}^{*}(i+1) \}}{\Re \{ S_{m,n} \hat{Y}_{m,n}^{*}(i+1) \}} \tag{43}$$

where $\Im\{\cdot\}$ and $\Re\{\cdot\}$ denote operations to extract the imaginary and real parts of a complex number, and $\tan^{-1}$ stands for a four-quadrant arctangent operation. Here, $\hat{Y}_{m,n}(i + 1)$ is the $n$th element of $\hat{\mathbf{Y}}_m(i + 1)$, which is defined as $\hat{\mathbf{Y}}_m(i + 1) = \mathbf{F}_m \hat{\boldsymbol{\alpha}}(i + 1)$. As a result, we obtain:

$$\hat{\mathbf{\Lambda}}_m(i+1) = \text{diag}\{\dots, \exp\{j\hat{\phi}_{m,n}(i+1)\}, \dots\}_{n=1,\dots,N_m}. \tag{44}$$
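The unstructured phase estimate of (43) and the resulting diagonal of (44) amount to taking the argument of $S_{m,n} \hat{Y}_{m,n}^{*}$ bin by bin. A minimal sketch, with hypothetical function and argument names, is:

```python
import numpy as np

def unstructured_phase(s_m, F_m, alpha_hat):
    """Per-bin phase error estimate of Eq. (43) for one channel."""
    y = F_m @ alpha_hat
    # np.angle is the four-quadrant arctangent of Im/Re.
    return np.angle(s_m * np.conj(y))

def phase_error_diag_from_phase(phi_hat):
    """Diagonal entries of the estimated phase error matrix of Eq. (44)."""
    return np.exp(1j * phi_hat)
```

The returned phases are wrapped to $(-\pi, \pi]$, which is harmless for (44) but matters for the structured estimators that follow.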

#### *4.2. Gauss–Newton Approach*

Taking into account the underlying structure of the phase errors, $\mathbf{\Lambda}_m(\Delta R_m)$ is a function of $\Delta R_m$, and the minimisation (41) can be re-expressed as

$$\widehat{\Delta R}_m(i+1) = \underset{\Delta R_m}{\arg\min} \left\| \mathbf{S}_m - \mathbf{\Lambda}_m(\Delta R_m) \mathbf{F}_m \hat{\boldsymbol{\alpha}}(i+1) \right\|_2^2. \tag{45}$$

By letting $\mathbf{e}_m = \mathbf{S}_m - \mathbf{\Lambda}_m(\Delta R_m) \mathbf{F}_m \hat{\boldsymbol{\alpha}}(i + 1)$, we have

$$\mathbf{e}_m = [\dots, e_{m,n}, \dots]^{T}_{n=1,\dots,N_m} \tag{46}$$

where

$$e_{m,n} = S_{m,n} - U_{m,n} \exp\left\{ -\frac{4\pi j f_{m,n}}{c} \Delta R_m \right\} \sum_{l=1}^{L_x} \hat{\alpha}_l(i+1) \exp\left\{ -\frac{4\pi j f_{m,n}}{c} x_l \right\} \tag{47}$$

and

$$U_{m,n} = \exp\{j\phi_{m,n}\}. \tag{48}$$

Here $\hat{\alpha}_l(i + 1)$ is the $l$th element of $\hat{\boldsymbol{\alpha}}(i + 1)$. As we are estimating real quantities, it is more convenient to reformulate the problem as the minimisation of a real function in order to apply the Gauss–Newton method. The details of the Gauss–Newton algorithm for updating $\widehat{\Delta R}_m(i + 1)$ are given in Appendix A.

Using $\widehat{\Delta R}_m(i + 1)$, we obtain the phase error matrix as

$$\hat{\mathbf{\Lambda}}_m(i+1) = \text{diag}\left\{ \dots, \exp\left\{ -\frac{4\pi j\, \widehat{\Delta R}_m(i+1)}{c} f_{m,n} \right\}, \dots \right\}_{n=1,\dots,N_m}. \tag{49}$$
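One possible real-valued Gauss–Newton iteration for (45) is sketched below. It is not taken from Appendix A and may differ from the authors' formulation: it assumes the residual $\mathbf{e}_m = \mathbf{S}_m - \mathbf{\Lambda}_m(\Delta R_m)\hat{\mathbf{Y}}_m$ with $\hat{\mathbf{Y}}_m = \mathbf{F}_m \hat{\boldsymbol{\alpha}}$, stacks its real and imaginary parts, and uses a fixed iteration count. The function name, the rounded speed of light, and the starting value are assumptions. Because the cost oscillates on the scale of the carrier wavelength, the iteration converges only from a starting value close to the true range error.

```python
import numpy as np

C = 3.0e8  # speed of light in m/s (rounded; an assumption of this sketch)

def gauss_newton_range_error(s_m, y_m, freqs, d0, n_iter=10):
    """Refine a range error estimate by Gauss-Newton on min ||s - Lambda(d) y||^2."""
    kappa = 4.0 * np.pi * freqs / C
    d = d0
    for _ in range(n_iter):
        model = np.exp(-1j * kappa * d) * y_m
        e = s_m - model
        # Derivative of the complex residual with respect to d.
        g = 1j * kappa * model
        # Stacking real and imaginary parts gives J^T r = sum Re(conj(g) e)
        # and J^T J = sum |g|^2, so the update is a scalar ratio.
        d = d - np.sum(np.real(np.conj(g) * e)) / np.sum(np.abs(g) ** 2)
    return d
```

In practice a coarse estimate, for example from one of the phase-slope estimators of the following subsections, could serve as the starting value `d0`.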

#### *4.3. Linear Regression-Based Approach*

By noting that $\phi\_{m,n} = -\frac{4\pi \Delta R\_m}{c} f\_{m,n}$, the gradient $-\frac{4\pi \Delta R\_m}{c}$ can be calculated via a linear least squares estimator using $\hat{\phi}\_{m,n}$ obtained from (43) [30]. Specifically, we have

$$\left[ -\frac{4\pi\widehat{\Delta R}\_m(i+1)}{c}, \hat{\phi}\_m^\dagger(i+1) \right]^T = (\mathbf{A}\_m^T \mathbf{A}\_m)^{-1} \mathbf{A}\_m^T \mathbf{b}\_m(i+1) \tag{50}$$

where

$$\mathbf{A}\_{m} = [\ldots, \mathbf{A}\_{m,n}^T, \ldots]\_{n=1,\ldots,N\_{m}}^{T} \text{ with } \mathbf{A}\_{m,n} = [f\_{m,n}, 1] \tag{51a}$$

$$\mathbf{b}\_{m}(i+1) = [\dots, \hat{\phi}\_{m,n}^{\text{unwrapped}}(i+1), \dots]\_{n=1,\dots,N\_{m}}^{T}. \tag{51b}$$

Note that $\hat{\phi}\_{m,n}^{\text{unwrapped}}(i+1)$ is the unwrapped version of $\hat{\phi}\_{m,n}(i+1)$ and $\hat{\phi}\_m^\dagger(i+1)$ is an estimate for the initial phase $\phi\_m^\dagger(i+1)$ which results from the unwrapping process. From (50), $\widehat{\Delta R}\_m(i+1)$ is obtained and can then be used for computing $\hat{\boldsymbol{\Lambda}}\_m(i+1)$ as in (49).
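The regression in (50) amounts to fitting a straight line to the unwrapped phase against frequency. The following is a minimal sketch in Python, assuming a propagation speed of 3 × 10^8 m/s, noise-free phase estimates, and a hypothetical sub-band of 64 steps across 300 MHz; the names `unwrap` and `estimate_range_error_lr` are illustrative and do not come from the paper.

```python
import cmath
import math

C = 3.0e8  # assumed propagation speed (m/s)


def unwrap(phases):
    """Remove 2*pi jumps between consecutive phase samples."""
    out = [phases[0]]
    for p in phases[1:]:
        step = p - out[-1]
        step -= 2.0 * math.pi * round(step / (2.0 * math.pi))
        out.append(out[-1] + step)
    return out


def estimate_range_error_lr(freqs, wrapped_phases):
    """Least-squares line fit phase = slope * f + intercept, cf. (50)."""
    phi = unwrap(wrapped_phases)
    n = len(freqs)
    f_mean = sum(freqs) / n
    p_mean = sum(phi) / n
    sxx = sum((f - f_mean) ** 2 for f in freqs)
    sxy = sum((f - f_mean) * (p - p_mean) for f, p in zip(freqs, phi))
    slope = sxy / sxx                    # estimate of -4*pi*dR/c
    intercept = p_mean - slope * f_mean  # initial phase left by the unwrapping
    return -C * slope / (4.0 * math.pi), intercept


# Hypothetical sub-band: 64 steps across 300 MHz starting at 7.85 GHz.
freqs = [7.85e9 + n * 300e6 / 63 for n in range(64)]
true_dr = 0.278
wrapped = [cmath.phase(cmath.exp(-4j * math.pi * f * true_dr / C)) for f in freqs]
dr_hat, phi0_hat = estimate_range_error_lr(freqs, wrapped)
```

The closed-form slope and intercept used here are the two-parameter special case of $(\mathbf{A}\_m^T \mathbf{A}\_m)^{-1} \mathbf{A}\_m^T \mathbf{b}\_m$; the intercept absorbs the unknown multiple of $2\pi$ introduced by starting the unwrapping from a wrapped sample.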

#### *4.4. Differenced-Phase-Based Approach*

Subtracting the estimated phase errors of two successive frequency bins, we obtain:

$$-\frac{4\pi (f\_{m,n+1} - f\_{m,n})}{c} \Delta R\_{m} = \phi\_{m,n+1} - \phi\_{m,n}.\tag{52}$$

Therefore, Δ*Rm* can be estimated as [30]

$$
\widehat{\Delta R}\_m(i+1) = -\frac{c}{4\pi\Delta f\_m(N\_m - 1)} \sum\_{n=1}^{N\_m - 1} \Delta \phi\_{m,n}(i+1) \tag{53}
$$

where $\Delta f\_m = f\_{m,n+1} - f\_{m,n}$ is the frequency step of the $m$th sub-band and

$$
\Delta\phi\_{m,n}(i+1) = \tan^{-1}\frac{\sin(\hat{\phi}\_{m,n+1}(i+1) - \hat{\phi}\_{m,n}(i+1))}{\cos(\hat{\phi}\_{m,n+1}(i+1) - \hat{\phi}\_{m,n}(i+1))}.\tag{54}$$

Note that the four-quadrant arctangent has been used here to handle the phase wrapping. An estimate of the phase error matrix $\hat{\boldsymbol{\Lambda}}\_m(i+1)$ is now obtained as in (49) using $\widehat{\Delta R}\_m(i+1)$ in (53).
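The differenced-phase estimator can be sketched in a few lines. The example below is an illustration under stated assumptions (a propagation speed of 3 × 10^8 m/s, a uniform frequency step, noise-free phases); the function name `estimate_range_error_dp` is hypothetical.

```python
import cmath
import math

C = 3.0e8  # assumed propagation speed (m/s)


def estimate_range_error_dp(freq_step, wrapped_phases):
    """Average the wrapped phase differences of adjacent bins, cf. (52)-(54)."""
    diffs = []
    for p0, p1 in zip(wrapped_phases, wrapped_phases[1:]):
        d = p1 - p0
        # atan2(sin, cos) maps the difference back into (-pi, pi].
        diffs.append(math.atan2(math.sin(d), math.cos(d)))
    mean_diff = sum(diffs) / len(diffs)
    return -C * mean_diff / (4.0 * math.pi * freq_step)


# Hypothetical sub-band: 64 steps across 300 MHz starting at 7.85 GHz.
freq_step = 300e6 / 63
freqs = [7.85e9 + n * freq_step for n in range(64)]
true_dr = 0.278
wrapped = [cmath.phase(cmath.exp(-4j * math.pi * f * true_dr / C)) for f in freqs]
dr_hat = estimate_range_error_dp(freq_step, wrapped)
```

Because each difference is wrapped into $(-\pi, \pi]$, the estimate is unambiguous only while $|4\pi \Delta f\_m \Delta R\_m / c| < \pi$, which is about 15.75 m for the frequency step assumed in this sketch.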

#### **5. Simulation and Discussion**

Numerical simulations are presented in this section to evaluate the performance of the methods described in previous sections.

#### *5.1. Scenario 1: Two Sub-Bands*

We consider a synthetic scenario with two sub-bands at carrier frequencies of $f\_1 = 6$ GHz and $f\_2 = 8$ GHz, each having a bandwidth of $B = 300$ MHz and 64 frequency steps (i.e., $N = N\_1 = N\_2 = 64$). The range profile is discretised over a grid with a length of $(N - 1)c/(2B) = 31.5$ m and a grid step of $\Delta\text{Grid} = c/(10B) = 0.1$ m. We consider a far-field target consisting of six point scatterers which are aligned with the grid. Figure 1 plots the true range profile of the target. We set $\Delta R\_2 = 2.78\,\Delta\text{Grid}$ when phase errors are present. The signal-to-noise ratio is set to 10 dB.
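The grid dimensions quoted above follow directly from the bandwidth and the number of frequency steps. As a quick check, assuming $c = 3 \times 10^8$ m/s and reading the factor of 10 in $c/(10B)$ as a five-fold oversampling of the nominal resolution $c/(2B)$ (the function name `range_grid` is illustrative):

```python
C = 3.0e8  # assumed propagation speed (m/s)


def range_grid(bandwidth_hz, n_steps, oversampling=5):
    """Return (grid length, grid step) in metres for one stepped-frequency sub-band."""
    resolution = C / (2.0 * bandwidth_hz)  # nominal range resolution c/(2B)
    length = (n_steps - 1) * resolution    # (N - 1) c / (2B)
    step = resolution / oversampling       # c/(10B) when oversampling = 5
    return length, step


length, step = range_grid(300e6, 64)
range_error = 2.78 * step  # Delta R_2 used when phase errors are present
```

With these values the range error of $2.78\,\Delta\text{Grid}$ corresponds to roughly 0.278 m, which is deliberately not a multiple of the grid step.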

Figure 2 compares the reconstructed range profiles obtained by the conventional OMP algorithm without and with the presence of phase errors. The OMP is terminated when the signal residual reaches the noise level or after 15 iterations have been carried out. We observe that OMP successfully reconstructs the range profile of the target by correctly identifying the scatterers of the target with accurate range and coefficient estimates when no phase errors exist. However, OMP provides unsatisfactory results in the presence of phase errors, where the reconstructed image exhibits many spurious scatterers. Similar observations are obtained for the results obtained by the conventional *l*1-norm regularised optimisation solver (without phase error correction), as shown in Figure 3. Here, we set $\delta = 10^{-5}$ and $\lambda = 0.001 \max |c\_l|$ where $c\_l$ is the $l$th element of $\mathbf{c} = \mathbf{F}^H\tilde{\mathbf{S}}$. The *l*1-norm regularised optimisation solver is stopped if the relative change in the *l*2-norm of the range profile vector $\boldsymbol{\alpha}$ falls below $10^{-5}$ or after it reaches 500 iterations. The performance degradation of these conventional sparse reconstruction techniques is not unexpected, since they were not originally developed to cope with dictionary mismatch arising from the presence of the phase errors.

**Figure 1.** True range profile of synthetic target under consideration.

(**a**) Without the presence of phase errors.

(**b**) With the presence of phase errors.

**Figure 2.** Performance of conventional OMP.

(**b**) With the presence of phase errors.

**Figure 3.** Performance of conventional *l*1-norm regularised optimisation solver.

Figure 4 shows the reconstructed range profile obtained by the POMP. POMP constructs candidate dictionaries based on a grid of $\Delta R\_2$ with a grid step size of $\Delta\text{Grid}/100$. The same stopping criteria as for OMP are used for POMP. We observe that POMP produces a range profile which is almost identical to the ground truth, thereby demonstrating the effectiveness of POMP in terms of dealing with the phase errors between different sub-bands.

**Figure 4.** Performance of POMP in the presence of phase errors.

Figure 5 shows the results obtained by different *l*1-norm regularised optimisation solvers with phase error correction, as presented in Section 4. The same parameters and stopping criteria as for the conventional *l*1-norm regularised optimisation solver described above are used in the simulations. Although these algorithms exhibit some improvements over the conventional *l*1-norm regularisation (i.e., without phase error correction), they perform worse than POMP. Specifically, the peaks of the reconstructed range profiles obtained by these algorithms only appear close to but not exactly at the true scatterer positions. In addition, the magnitudes of the peaks are much smaller than the ground truth values.

(**d**) With structured DP-based error correction

**Figure 5.** Performance of *l*1-norm regularised optimisation solver with phase error correction.

The inferior performance of these algorithms can be explained by noting that the *l*1-norm regularised optimisation in (31) is a nonconvex problem due to the phase error matrix $\boldsymbol{\Lambda}$. Figure 6 plots the objective function of (31) as a function of $\Delta R\_2$ assuming that $\boldsymbol{\alpha}$ is perfectly known. We observe that this objective function has many local maxima and minima, confirming the nonconvexity of the *l*1-norm regularised optimisation in (31). The reason for this nonconvexity is explained in Appendix B. Due to this nonconvexity, the iterative solvers presented in Section 4 are prone to converge to local minima, thus limiting the effectiveness of this approach.

**Figure 6.** Illustration of the nonconvexity of the *l*1-norm regularised optimisation problem (31). The objective function of (31) is plotted against $\Delta R\_2$ assuming that $\boldsymbol{\alpha}$ is perfectly known.

The performance of the OMP, POMP, and *l*1-norm regularised optimisation methods is now compared using the earth mover's distance (EMD) between the true and reconstructed range profiles. EMD [31] is a metric estimating the distance between two distributions or, equivalently, the minimal amount of work required to transform one distribution into the other. Figures 7 and 8 show the EMD performance of the OMP, POMP, and *l*1-norm regularised optimisation methods, averaged over 100 Monte Carlo runs, versus different levels of SNR (signal-to-noise ratio) and phase error, respectively. It is observed that the POMP method yields the smallest EMD values amongst all algorithms considered. Since a smaller value of EMD corresponds to a higher level of similarity between the true and reconstructed range profiles, this observation indicates that the reconstructed range profile obtained by POMP is closer to the ground-truth range profile than those obtained from the OMP and *l*1-norm regularised optimisation methods. This verifies the performance advantage of the POMP method from a statistical point of view.
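For two range profiles sampled on the same uniform grid, the EMD has a simple one-dimensional form based on the running difference of the cumulative masses. The sketch below assumes that the profile magnitudes are first normalised to unit total mass; this normalisation and the function name `emd_1d` are assumptions of the example and may differ from the implementation used for Figures 7 and 8.

```python
def emd_1d(profile_a, profile_b, grid_step=1.0):
    """Earth mover's distance between two profiles sampled on the same uniform grid."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must share the same grid")
    mass_a = [abs(v) for v in profile_a]
    mass_b = [abs(v) for v in profile_b]
    total_a, total_b = sum(mass_a), sum(mass_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("profiles must contain non-zero mass")
    carried = 0.0  # normalised mass that still has to be moved past the current cell
    work = 0.0
    for a, b in zip(mass_a, mass_b):
        carried += a / total_a - b / total_b
        work += abs(carried)
    return work * grid_step


# Hypothetical profiles: the strongest scatterer is misplaced by one grid cell.
true_profile = [0.0, 1.0, 0.0, 0.0, 0.5, 0.0]
shifted_profile = [0.0, 0.0, 1.0, 0.0, 0.5, 0.0]
distance = emd_1d(true_profile, shifted_profile, grid_step=0.1)
```

In this example two thirds of the normalised mass has to move by one grid cell of 0.1 m, so the distance grows with both the size of the misplaced peak and how far it is displaced.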

**Figure 7.** Earth mover's distance (EMD) performance of the OMP, POMP, and *l*1-norm regularised optimisation methods versus SNR ($\Delta R\_2 = 2.78\,\Delta\text{Grid}$).

**Figure 8.** EMD performance of the OMP, POMP, and *l*1-norm regularised optimisation methods versus various levels of phase error (SNR = 10 dB).

#### *5.2. Scenario 2: Four Sub-Bands*

We now consider another scenario with four sub-bands at carrier frequencies of $f\_1 = 6$ GHz, $f\_2 = 8$ GHz, $f\_3 = 10$ GHz, and $f\_4 = 12$ GHz, each having a bandwidth of $B = 300$ MHz and 64 frequency steps (i.e., $N = N\_1 = N\_2 = N\_3 = N\_4 = 64$). The range errors are set to $\Delta R\_2 = 2.78\,\Delta\text{Grid}$, $\Delta R\_3 = 1.33\,\Delta\text{Grid}$, and $\Delta R\_4 = 3.69\,\Delta\text{Grid}$. Other simulation parameters and the true range profile of the target remain unchanged from the previous simulation example.

Figure 9 compares the reconstructed range profiles obtained by the conventional OMP algorithm and the POMP algorithm presented in Section 3.2. OMP results in an unsatisfactorily reconstructed range profile with many spurious peaks, as expected, because it ignores the phase errors between different sub-bands. In contrast, the POMP is capable of reconstructing the true range profile with a high accuracy thanks to the use of dictionary learning with a pruning process. Note that, given the inferior performance of the *l*1-norm regularised optimisation approach compared to POMP, as demonstrated in the previous simulation scenario, this approach is excluded from the comparison here.

(**b**) Range profile reconstructed by POMP

**Figure 9.** Performance comparison between OMP and POMP for Simulation Scenario 2 (with four sub-bands).

#### **6. Conclusions**

This paper explores the use of the POMP algorithm and *l*1-norm regularisation solvers for the problem of sparsity-driven HRRP with bandwidth stitching in the presence of phase errors. We observe that the *l*1-norm regularisation solvers do not provide significant performance improvement over the conventional sparse reconstruction algorithms due to the nonconvexity of *l*1-norm regularised optimisation when phase errors exist. In contrast, POMP is observed to deal effectively with the phase errors and is thus able to reconstruct the range profile of the target with high accuracy. Simulation results show a significant performance improvement by POMP over OMP and the conventional and refined *l*1-norm regularisation. In future work, we propose using experimental data for a more general scenario where the true scatterers constituting the target are located in off-grid positions with respect to the dictionary grid, and the true range errors have off-grid values. We shall also consider the more general case of frequency-dependence of scatterer RCS (radar cross-section). A potential approach for this is to exploit the framework of spectral compressive sensing [32,33].

**Author Contributions:** Conceptualisation, P.B., N.H.N., and H.-T.T.; methodology, P.B., N.H.N., and H.-T.T.; simulation, N.H.N.; validation, P.B., N.H.N., and H.-T.T.; writing—original draft preparation, N.H.N.; writing—review and editing, P.B., N.H.N., and H.-T.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Defence Science and Technology Group, Australia.

**Conflicts of Interest:** The authors declare no conflict of interest.

**Appendix A. Derivation of Gauss–Newton Algorithm for Estimation of** $\widehat{\Delta R}\_m(i+1)$

We rewrite $e\_{m,n}$ as

$$e\_{m,n} = S\_{m,n} - \exp\left\{-\frac{4\pi j f\_{m,n}}{c} \Delta R\_m \right\} Z\_{m,n} \tag{A1}$$

with

$$Z\_{m,n} = \mathcal{U}\_{m,n} \sum\_{l=1}^{L} \hat{\alpha}\_{l}(i+1) \exp\left\{-\frac{4\pi j f\_{m,n}}{c} x\_{l}\right\}.\tag{A2}$$

By noting that

$$\begin{split} &\exp\left\{-\frac{4\pi j f\_{m,n}}{c}\Delta R\_{m}\right\}Z\_{m,n} \\ &= Z\_{m,n}^{R}\cos\left\{\frac{4\pi f\_{m,n}}{c}\Delta R\_{m}\right\} + Z\_{m,n}^{I}\sin\left\{\frac{4\pi f\_{m,n}}{c}\Delta R\_{m}\right\} \\ &\quad + j\left(Z\_{m,n}^{I}\cos\left\{\frac{4\pi f\_{m,n}}{c}\Delta R\_{m}\right\} - Z\_{m,n}^{R}\sin\left\{\frac{4\pi f\_{m,n}}{c}\Delta R\_{m}\right\}\right) \end{split} \tag{A3}$$

where $Z\_{m,n}^R$ and $Z\_{m,n}^I$ are the real and imaginary components of $Z\_{m,n}$, we can decouple and stack the real and imaginary components of $\mathbf{e}\_m$ to form a real-valued vector as

$$\boldsymbol{\epsilon}\_{m} = \left[ \left( \boldsymbol{\epsilon}\_{m}^{R} \right)^{T}, \left( \boldsymbol{\epsilon}\_{m}^{I} \right)^{T} \right]^{T} \tag{A4}$$

where

$$\boldsymbol{\epsilon}\_{m}^{R} = \left[ \dots , \epsilon\_{m,n}^{R}, \dots \right]\_{n=1,\dots,N\_{m}}^{T} \tag{A5}$$

$$\boldsymbol{\epsilon}\_{m}^{I} = \left[ \dots , \epsilon\_{m,n}^{I}, \dots \right]\_{n=1,\dots,N\_{m}}^{T} \tag{A6}$$

and

$$\epsilon\_{m,n}^R = S\_{m,n}^R - Z\_{m,n}^R \cos\left\{\frac{4\pi f\_{m,n}}{c} \Delta R\_m \right\} - Z\_{m,n}^I \sin\left\{\frac{4\pi f\_{m,n}}{c} \Delta R\_m \right\} \tag{A7}$$

$$\epsilon\_{m,n}^{I} = S\_{m,n}^{I} - Z\_{m,n}^{I} \cos\left\{\frac{4\pi f\_{m,n}}{c} \Delta R\_m \right\} + Z\_{m,n}^{R} \sin\left\{\frac{4\pi f\_{m,n}}{c} \Delta R\_m \right\}.\tag{A8}$$

Here $S\_{m,n}^R$ and $S\_{m,n}^I$ are the real and imaginary components of $S\_{m,n}$, so that (A7) and (A8) are the real and imaginary parts of $e\_{m,n}$ in (A1). Using (46)–(A8), the minimisation (45) becomes

$$
\widehat{\Delta R}\_{m}(i+1) = \underset{\Delta R\_{m}}{\text{arg min}} \|\boldsymbol{\epsilon}\_{m}(\Delta R\_{m})\|\_{2}^{2}.\tag{A9}
$$

By adopting the Gauss–Newton algorithm, $\widehat{\Delta R}\_m(i+1)$ can be computed from $\widehat{\Delta R}\_m(i)$ as

$$
\widehat{\Delta R}\_{m}(i+1) = \widehat{\Delta R}\_{m}(i) - \left(\mathbf{J}\_{m}^T(i)\mathbf{J}\_{m}(i)\right)^{-1} \mathbf{J}\_{m}^T(i) \boldsymbol{\epsilon}\_{m}(\widehat{\Delta R}\_{m}(i)) \tag{A10}
$$

where $\boldsymbol{\epsilon}\_m(\widehat{\Delta R}\_m(i))$ is an estimated version of $\boldsymbol{\epsilon}\_m$ computed from $\widehat{\Delta R}\_m(i)$ and $\mathbf{J}\_m(i)$ is the Jacobian of $\boldsymbol{\epsilon}\_m(\Delta R\_m)$ with respect to $\Delta R\_m$ evaluated at $\Delta R\_m = \widehat{\Delta R}\_m(i)$.

The expression for the Jacobian $\mathbf{J}\_m$ of $\boldsymbol{\epsilon}\_m(\Delta R\_m)$ with respect to $\Delta R\_m$ is given by

$$\mathbf{J}\_m = \left[ \left( \mathbf{J}\_m^R \right)^T, \left( \mathbf{J}\_m^I \right)^T \right]^T \tag{A11}$$

where

$$\mathbf{J}\_m^R = \left[ \dots, J\_{m,n}^R, \dots \right]\_{n=1,\dots,N\_m}^T \tag{A12}$$

$$\mathbf{J}\_m^I = \left[ \dots, J\_{m,n}^I, \dots \right]\_{n=1,\dots,N\_m}^T \tag{A13}$$

and

$$J\_{m,n}^R = \frac{4\pi f\_{m,n}}{c} \left( Z\_{m,n}^R \sin\left\{\frac{4\pi f\_{m,n}}{c} \Delta R\_m \right\} - Z\_{m,n}^I \cos\left\{\frac{4\pi f\_{m,n}}{c} \Delta R\_m \right\} \right) \tag{A14}$$

$$J\_{m,n}^{I} = \frac{4\pi f\_{m,n}}{c} \left(Z\_{m,n}^{I} \sin\left\{\frac{4\pi f\_{m,n}}{c} \Delta R\_m \right\} + Z\_{m,n}^{R} \cos\left\{\frac{4\pi f\_{m,n}}{c} \Delta R\_m \right\} \right). \tag{A15}$$
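Because $\Delta R\_m$ is a scalar, the update (A10) reduces to a ratio of two sums. The following sketch iterates it on hypothetical noise-free data, working directly from the complex residual $e\_{m,n}$ of (A1) and its derivative with respect to $\Delta R\_m$. The propagation speed, the two test scatterers, the choice $\mathcal{U}\_{m,n} = 1$ and the function name `gauss_newton_range_error` are all assumptions made for illustration.

```python
import cmath
import math

C = 3.0e8  # assumed propagation speed (m/s)


def gauss_newton_range_error(freqs, s, z, dr_init, iterations=10):
    """Iterate (A10) for a scalar range error, with e = s - exp(-j*k*dR) * z."""
    dr = dr_init
    for _ in range(iterations):
        jt_e = 0.0  # J^T * epsilon, summed over real and imaginary parts
        jt_j = 0.0  # J^T * J
        for f, s_n, z_n in zip(freqs, s, z):
            k = 4.0 * math.pi * f / C
            model = cmath.exp(-1j * k * dr) * z_n
            e = s_n - model
            de = 1j * k * model  # derivative of e with respect to dR
            jt_e += de.real * e.real + de.imag * e.imag
            jt_j += de.real ** 2 + de.imag ** 2
        if jt_j == 0.0:
            break
        dr -= jt_e / jt_j
    return dr


# Hypothetical noise-free data: two scatterers, with U_{m,n} = 1 folded into z.
freqs = [7.85e9 + n * 300e6 / 63 for n in range(64)]
scatterers = [(1.0, 3.0), (0.6, 7.5)]  # (amplitude, range in metres)
z = [sum(a * cmath.exp(-4j * math.pi * f * x / C) for a, x in scatterers) for f in freqs]
true_dr = 0.278
s = [cmath.exp(-4j * math.pi * f * true_dr / C) * z_n for f, z_n in zip(freqs, z)]
dr_hat = gauss_newton_range_error(freqs, s, z, dr_init=0.2775)
```

The initial guess in this example is only 0.5 mm away from the true value. At a single frequency the model is periodic in $\Delta R\_m$ with period $c/(2f)$, under 2 cm near 8 GHz, so a poorer starting point can lead the iteration to a different local minimum.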

#### **Appendix B. Analysis of the Nonconvexity of the** *l***1-Norm Regularised Optimisation Problem** (31)

From Equation (31)

$$\{\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\Lambda}}\} = \underset{\boldsymbol{\alpha}, \boldsymbol{\Lambda}}{\arg\min} \left\{ \|\mathbf{S} - \boldsymbol{\Lambda} \mathbf{F} \boldsymbol{\alpha}\|\_{2}^{2} + \mu \|\boldsymbol{\alpha}\|\_{1} \right\},\tag{A16}$$

or

$$\{\hat{\boldsymbol{\alpha}}, \hat{\boldsymbol{\Lambda}}\} = \underset{\boldsymbol{\alpha}}{\arg\min}\, \underset{\boldsymbol{\Lambda}}{\arg\min} \left\{ \|\mathbf{S} - \boldsymbol{\Lambda}\mathbf{F}\boldsymbol{\alpha}\|\_{2}^{2} + \mu \|\boldsymbol{\alpha}\|\_{1} \right\}.\tag{A17}$$

$$\|\mathbf{S} - \boldsymbol{\Lambda} \mathbf{F}\boldsymbol{\alpha}\|\_{2}^{2} = \left(\mathbf{S} - \boldsymbol{\Lambda} \mathbf{F}\boldsymbol{\alpha}\right)^{H} \left(\mathbf{S} - \boldsymbol{\Lambda} \mathbf{F}\boldsymbol{\alpha}\right) \tag{A18}$$

$$\|\mathbf{S} - \boldsymbol{\Lambda}\mathbf{F}\boldsymbol{\alpha}\|\_{2}^{2} = \mathbf{S}^{H}\mathbf{S} - 2\Re\left[\left(\mathbf{F}^{H}\boldsymbol{\Lambda}^{H}\mathbf{S}\right)^{H}\boldsymbol{\alpha}\right] + \boldsymbol{\alpha}^{H}\mathbf{F}^{H}\mathbf{F}\boldsymbol{\alpha}.\tag{A19}$$

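Only the term $(\mathbf{F}^H\boldsymbol{\Lambda}^H\mathbf{S})^H\boldsymbol{\alpha}$ is dependent on $\Delta R\_2$, so the objective function is minimised with respect to $\Delta R\_2$ when the correlation between $\mathbf{F}^H\boldsymbol{\Lambda}^H\mathbf{S}$ and $\boldsymbol{\alpha}$ is maximised. Now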

$$F^H \Lambda^H \mathbf{S} = F\_1^H \mathbf{S}\_1 + F\_2^H \Lambda\_2^H \mathbf{S}\_2 \tag{A20}$$

and the terms $\mathbf{F}\_1^H \mathbf{S}\_1$ and $\mathbf{F}\_2^H \boldsymbol{\Lambda}\_2^H \mathbf{S}\_2$ represent the conventional pulse compression (i.e., transformation from frequency domain to range domain) for each of the radars with a range offset $\Delta R\_2$. These are non-sparse and low resolution due to the oversampling in range. So $\mathbf{F}^H\boldsymbol{\Lambda}^H(\Delta R\_2)\mathbf{S}$ represents the linear superposition of the two conventional range profiles. When these range profiles from the two radars are correctly aligned in range, they will better correlate with the true range profile $\boldsymbol{\alpha}$. Also, if the range offset $\Delta R\_2$ is such that a scatterer for one radar is superimposed upon a different scatterer for the other radar, a local minimum will occur. This explains the behaviour seen in Figure 6 and the nonconvexity of the objective function.

#### **Appendix C. Analysis of Operation Count for POMP**

Consider first the case of $M = 2$. Let the size of the grid for the parameter $\Delta R\_2$ be $L\_\Delta = 2^{N\_\Delta}$, so that a dictionary is constructed for each of these $2^{N\_\Delta}$ values of $\Delta R\_2$. OMP is implemented with the number of dictionaries halved at each stage; hence the name "pruned" OMP. At stage $k$ of OMP, the number of complex multiplications and divisions for a single dictionary is denoted $Comp(k)$. Here $k = 1, \dots, N\_\Delta + 1$, with $2^{N\_\Delta - k + 1}$ dictionaries considered at stage $k$. The purpose of this section is to estimate the dependence of the computational cost of POMP on the size of the dictionary $L\_x$ and the grid size $L\_\Delta$ for the parameter $\Delta R\_2$.

With reference to Table 1, the significant costs for POMP are associated with the Identify and Update steps. At each stage of OMP for a given dictionary, the Identify step performs atom/residual correlations which require $\sim O(N\_m L\_x)$ complex multiplications. The Update step performs a linear least squares estimation requiring Gaussian elimination which, at stage $k$ of OMP, has an operation count $\sim O(k^3)$.

Due to the halving of the number of dictionaries at each stage, the total operational count required until only one dictionary is left (although more OMP steps may be required for that dictionary until the residual is sufficiently small) is therefore of order

$$\sum\_{k=1}^{N\_{\Delta}+1} \left( N\_{m} L\_{x} + k^3 \right) 2^{N\_{\Delta}-k+1} \tag{A21}$$

or

$$(2^{N\_{\Delta}+1}-1)N\_{m}L\_{x}+(N\_{\Delta}+1)^3+2^{N\_{\Delta}+1}\sum\_{k=1}^{N\_{\Delta}}k^3z^k\tag{A22}$$

with $z = \frac{1}{2}$.

The finite sum $\sum\_{k=1}^{n} k^3 z^k$ is referred to as a low-order polylogarithm, for which a formula may be derived [34]. This formula can be shown to have a leading term of order $n^3 z^{n+3}$ so that the overall operation count for POMP is $\sim O\left((2^{N\_\Delta+1} - 1)N\_m L\_x + N\_\Delta^3\right)$. As the size of the grid for $\Delta R\_2$ is $L\_\Delta = 2^{N\_\Delta}$, $N\_\Delta = \frac{\log L\_\Delta}{\log 2}$, and the operation count in terms of $L\_\Delta$ is $\sim O\left(L\_\Delta N\_m L\_x + \left(\frac{\log L\_\Delta}{\log 2}\right)^3\right)$. We see that to leading order the computational cost is proportional to the size of the grid for $\Delta R\_2$.

For the case of general *M* this cost is multiplied by (*M* − 1).
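The step from (A21) to (A22) can be checked numerically. The sketch below evaluates both expressions with exact integer arithmetic, writing $2^{N\_\Delta+1} z^k$ with $z = 1/2$ as $2^{N\_\Delta+1-k}$; the function names and the sample sizes are illustrative assumptions only.

```python
def pomp_cost_direct(n_delta, n_m, l_x):
    """Operation count (A21): per-stage cost times the number of surviving dictionaries."""
    return sum((n_m * l_x + k ** 3) * 2 ** (n_delta - k + 1)
               for k in range(1, n_delta + 2))


def pomp_cost_closed(n_delta, n_m, l_x):
    """Operation count (A22), with 2^(N+1) * k^3 * (1/2)^k written as k^3 * 2^(N+1-k)."""
    tail = sum(k ** 3 * 2 ** (n_delta + 1 - k) for k in range(1, n_delta + 1))
    return (2 ** (n_delta + 1) - 1) * n_m * l_x + (n_delta + 1) ** 3 + tail


# Hypothetical sizes: 2^8 candidate range errors, 64 frequency steps, 316 grid cells.
direct = pomp_cost_direct(8, 64, 316)
closed = pomp_cost_closed(8, 64, 316)
```

Both functions return the same integer, and for dictionary sizes such as these the term proportional to $N\_m L\_x$ accounts for almost all of the count.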

#### **References**


© 2020 Commonwealth of Australia. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Compressive Sensing for Tomographic Imaging of a Target with a Narrowband Bistatic Radar**

**Ngoc Hung Nguyen <sup>1,∗</sup>, Paul Berry <sup>2</sup> and Hai-Tan Tran <sup>2</sup>**


Received: 12 November 2019; Accepted: 9 December 2019; Published: 13 December 2019

**Abstract:** This paper introduces a new approach to bistatic radar tomographic imaging based on the concept of compressive sensing and sparse reconstruction. The field of compressive sensing has established a mathematical framework which guarantees sparse solutions for under-determined linear inverse problems. In this paper, we present a new formulation for the bistatic radar tomography problem based on sparse inversion, moving away from the conventional *k*-space tomography approach. The proposed sparse inversion approach allows high-quality images of the target to be obtained from limited narrowband radar data. In particular, we exploit the use of the parameter-refined orthogonal matching pursuit (PROMP) algorithm to obtain a sparse solution for the sparse-based tomography formulation. A key feature of the PROMP algorithm is that it is capable of tackling the dictionary mismatch problem arising from off-grid scatterers by perturbing the dictionary atoms and allowing them to go off the grid. Performance evaluation studies involving both simulated and real data are presented to demonstrate the performance advantage of the proposed sparsity-based tomography method over the conventional *k*-space tomography method.

**Keywords:** radar tomography; compressive sensing; sparse reconstruction; bistatic radar; radar imaging; parameter-refined orthogonal matching pursuit (PROMP); orthogonal matching pursuit (OMP); *k*-space tomography; narrowband radar; off-grid compressive sensing

#### **1. Introduction**

Radar imaging has received much attention for several decades, having a wide range of applications in both civilian and military domains [1–3]. In principle, to obtain high-resolution radar images, a wide bandwidth of radar waveform is required for a fine resolution in the range direction, while a large antenna aperture is required for a fine resolution in the cross-range direction. To overcome the physical constraints of the radar aperture size, a synthesized aperture with a much larger size can be formed by exploiting the relative motion between the radar and target. This is, in fact, the main idea behind the synthetic aperture radar (SAR) and inverse SAR (ISAR) [1]. In recent years, there has been an increasing demand on the radio-frequency (RF) electromagnetic spectrum due to rapid advances in radar and communications, and radar has to compete for spectrum with many different services, including radio and television broadcasting, communications, and radio-navigation [4]. As a result, the constraints on spectrum availability may present severe limits on signal bandwidth, prompting the need for high-resolution imaging techniques using narrowband radars.

As a consequence, Doppler tomography has been considered for narrowband radar imaging [5–12], which is also called "Doppler radar tomography" (DRT). The main idea of DRT is to utilize the information given by the Doppler frequencies induced from the relative radar-target rotational motion to construct an image of the target, which can be conveniently formulated in the slow-time *k*-space [8]. Imaging can also be formulated in the more traditional fast-time *k*-space, the support for which is created by sweeping out the complex samples of the received signal in the angular direction. For a particular transmit frequency, the complex samples for all available aspect angles form a circular arc in the spatial frequency space. The traditional range-Doppler ISAR imaging can be considered as a special case of this fast-time *k*-space technique when a wideband signal is available and the total rotation angle is small enough that the support region can be approximated as being rectangular. The inversion process for image formation has evolved from traditional tools, such as filtered back projection, to the more modern non-uniform fast Fourier transform (NUFFT). The *k*-space radar tomography was also considered in bistatic settings [13,14]. The bistatic radar offers several advantages over a monostatic radar, including higher performance against stealth targets, less vulnerability to jamming, and its covertness.

The main objective of this paper is to present a new tomographic imaging technique for a narrowband bistatic radar based on the framework of compressive sensing and sparse reconstruction. The field of compressive sensing has established a mathematical framework which guarantees sparse solutions for underdetermined linear inverse problems that occur across numerous engineering and mathematical science fields. In particular, this framework has found applications in various radar imaging problems, ranging from moving target indication, ISAR imaging, coherence imaging, multichannel imaging, micro-Doppler imaging, to through-the-wall radar imaging (see, e.g., [15–26]). The key contributions of this paper are summarized as follows.


The remainder of this paper is organized as follows. Section 2 describes the signal model for bistatic radar tomography. Section 3 formulates bistatic radar tomographic imaging as a sparse inversion problem. Section 4 derives the sparse solution based on the PROMP algorithm. Performance studies with simulated and real data are presented in Section 5. The paper ends in Section 6 with some concluding remarks.

#### **2. Signal Model**

Figure 1 shows the geometry for the problem of bistatic radar tomographic imaging under consideration. The transmitter Tx and receiver Rx are located in the far field of the target of interest. The bistatic angle between the transmitter and receiver with respect to the target is denoted as $\beta$. The transmitter is narrowband, transmitting a continuous waveform at a single frequency $f$ (i.e., the wavelength is $\lambda = c/f$, where $c$ is the speed of signal propagation). A local target coordinate frame $\mathcal{T}(x\_1, x\_2, x\_3)$, which is fixed to and rotates with the target, is chosen as the reference frame. Here, both the transmitter and receiver lie on the image plane $\mathcal{X}(x\_1, x\_2)$ of the target, and the origin of the frame $\mathcal{T}$ is placed at the target rotation centre. It is also assumed that the target rotational speed $\Omega$ is constant over the coherent processing interval (CPI) and is accurately estimated a priori.

**Figure 1.** The radar-target geometry of the considered bistatic radar tomographic imaging problem.

The receiver takes one complex-valued scattered signal sample for each rotation angle $\theta\_n = \Omega t\_n$ of the target (with respect to the axis $x\_2$ of the frame $\mathcal{T}$) at time $t\_n$. The expression of the scattered signal sample collected at time $t\_n$ is given by [13,14]

$$s(t\_{n}) = \int\_{x\_1} \int\_{x\_2} \sigma(\mathbf{x}) \exp\left\{-j\frac{2\pi}{c}f\left[R(t\_{n}) + 2\cos\left(\frac{1}{2}|\beta|\right)\mathbf{x} \cdot \mathbf{u}(t\_{n})\right]\right\} dx\_1 dx\_2, \tag{1}$$

where $\mathbf{x} = [x\_1, x\_2]^T$, $R$ is the total bistatic range between the target centre (or focus point) and the transmitter and receiver, and $\sigma(\mathbf{x})$ is the scatterer reflectivity distribution projected onto the image plane. Note that the total bistatic range $R$ is, in general, a function of time $t\_n$ because of the target translational motion. In (1), $\mathbf{u} = [u\_1, u\_2]^T$ denotes the unit vector along the bisector of the bistatic angle $\beta$. It is also noted that the radar tomographic imaging problem of interest is considered in the rotating local frame $\mathcal{T}$, and the signal model (1) has already taken into account the rotational motion of the target by the rotation of the unit vector $\mathbf{u}(t\_n)$ relative to $\mathcal{T}$.

Since $x\_1$ and $x\_2$ in the image domain are discrete variables in practical radar imaging applications, the reflectivity function $\sigma(\mathbf{x})$ is commonly discretized over the image plane $\mathcal{X}(x\_1, x\_2)$ onto a grid of points $\mathbf{x}\_m$ for $m \in \{1, \dots, M\}$, as

$$\sigma(\mathbf{x}) = \sum\_{m=1}^{M} \sigma\_{m} \delta(\mathbf{x} - \mathbf{x}\_{m}).\tag{2}$$

Substituting (2) into (1) and assuming that translational motion compensation is accurately accomplished by additional pre-processing, we obtain

$$s(t\_{n}) = \sum\_{m=1}^{M} \sigma\_{m} \exp\left\{-j\frac{4\pi}{c}f\cos\left(\frac{1}{2}|\beta|\right)\mathbf{x}\_{m} \cdot \mathbf{u}(t\_{n})\right\}.\tag{3}$$

A compact vector form of (3) is given by

$$\mathbf{s} = \Phi \boldsymbol{\sigma},\tag{4}$$

where

$$\mathbf{s} = \begin{bmatrix} s(t\_1), \dots, s(t\_N) \end{bmatrix}^T \tag{5}$$

$$\boldsymbol{\sigma} = [\sigma\_1, \dots, \sigma\_M]^T \tag{6}$$

and

$$\boldsymbol{\Phi} = [\boldsymbol{\phi}(\mathbf{x}\_1), \dots, \boldsymbol{\phi}(\mathbf{x}\_M)] \tag{7a}$$

$$\boldsymbol{\phi}(\mathbf{x}\_{m}) = \begin{bmatrix} \phi(\mathbf{x}\_{m}, t\_{1}), \dots, \phi(\mathbf{x}\_{m}, t\_{N}) \end{bmatrix}^{T} \tag{7b}$$

$$\phi(\mathbf{x}\_{m}, t\_{n}) = \exp\left\{-j\frac{4\pi}{c}f\cos\left(\frac{1}{2}|\beta|\right)\mathbf{x}\_{m} \cdot \mathbf{u}(t\_{n})\right\}.\tag{7c}$$

In practice, where noise is present, the radar received signal becomes

$$
\tilde{\mathbf{s}} = \mathbf{s} + \mathbf{n} = \boldsymbol{\Phi} \boldsymbol{\sigma} + \mathbf{n},\tag{8}
$$

where $\mathbf{n} = [n(t\_1), \dots, n(t\_N)]^T$, with $n(t\_n)$ denoting the complex-valued noise term at time $t\_n$.
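The discrete model (3)–(7) can be sketched as follows. The text above does not give $\mathbf{u}(t\_n)$ in closed form, so the example assumes $\mathbf{u}(t\_n) = [\sin\theta\_n, \cos\theta\_n]^T$ with $\theta\_n$ measured from the $x\_2$ axis. The propagation speed, the 10 GHz carrier, the 60° bistatic angle and the function names are likewise assumptions made for illustration.

```python
import cmath
import math

C = 3.0e8  # assumed propagation speed (m/s)


def bisector(theta):
    """Assumed parametrisation of u(t_n) for a rotation angle theta measured from x2."""
    return (math.sin(theta), math.cos(theta))


def atom(x, thetas, freq, beta):
    """Column phi(x) of the dictionary, cf. (7b)-(7c)."""
    k = 4.0 * math.pi * freq * math.cos(0.5 * abs(beta)) / C
    column = []
    for theta in thetas:
        u1, u2 = bisector(theta)
        column.append(cmath.exp(-1j * k * (x[0] * u1 + x[1] * u2)))
    return column


def received_signal(scatterers, thetas, freq, beta):
    """Noise-free s = Phi * sigma for a list of (sigma_m, x_m) pairs, cf. (3)-(4)."""
    s = [0j] * len(thetas)
    for sigma, x in scatterers:
        for n, value in enumerate(atom(x, thetas, freq, beta)):
            s[n] += sigma * value
    return s


thetas = [2.0 * math.pi * n / 64 for n in range(64)]  # one full rotation, 64 looks
beta = math.radians(60.0)
s = received_signal([(2.0, (0.0, 0.0)), (1.0 + 0.5j, (0.3, -0.1))], thetas, 10e9, beta)
```

A scatterer at the rotation centre contributes a constant term, while an off-centre scatterer contributes a unit-modulus phase history whose variation over the looks encodes its position.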

#### **3. Sparse Inversion Formulation of Bistatic Radar Tomography**

The objective of any target imaging problem is to construct a spatial reflectivity map of the target from the backscattered radar signal. Specifically, the ultimate objective is to estimate the unknown reflection vector $\boldsymbol{\sigma}$ from the noisy received signal vector $\tilde{\mathbf{s}}$ by solving (8). Since the number of signal samples received, as is often the case in practice, is much smaller than the number of grid points in the reflectivity map (i.e., $N \ll M$), solving (8) is essentially an underdetermined linear inverse problem which requires additional regularization constraints to obtain meaningful solutions.

Typical target images captured by microwave radar signals have been known in the literature to contain a few dominant scattering centers (see, e.g., [15–17,19–23]). As a result, the reflection vector $\boldsymbol{\sigma}$ only has a small number of non-zero elements, thus enjoying a sparse characteristic. Such a sparse characteristic of $\boldsymbol{\sigma}$ can be utilized as a regularizing constraint to solve the underdetermined inverse problem (8), that is,

$$\text{Find sparse } \boldsymbol{\sigma} \text{ such that } \tilde{\mathbf{s}} \approx \boldsymbol{\Phi} \boldsymbol{\sigma}. \tag{9}$$

This sparse inversion problem can be effectively solved under the compressive sensing framework using sparse reconstruction algorithms. Note that in the compressive sensing context, the matrix **Φ** is commonly referred to as the dictionary, and the columns of **Φ** are called the atoms, each representing the theoretical scattered signal component of a hypothetical scatterer residing on a grid point of the reflectivity map.

Compressive sensing and sparse reconstruction have been extensively studied in the last two decades, with various techniques proposed. Comprehensive surveys of the state-of-the-art on this topic can be found in [22,23,30,31]. The objective of this paper is to apply the sparse reconstruction approach to the bistatic radar tomographic imaging problem.

The main challenge for this work is that the true scatterers constituting the target do not coincide exactly with the grid which is used to construct the dictionary, leading to dictionary mismatch problems which in turn significantly degrade the performance of conventional sparse reconstruction techniques [27,28]. Several methods have been presented in the literature to address the off-grid dictionary mismatch problems based on the ideas of joint-sparse recovery [32], dictionary perturbation [33,34], sparse Bayesian learning [35,36], and parameter perturbation [29,37]. In this paper, we will exploit the use of the PROMP method [29,37], that is, a parameter perturbation method, to solve the sparse inversion problem (9). The main motivations of using PROMP are twofold. Firstly, PROMP is capable of tackling the dictionary mismatch problem by perturbing the dictionary atoms and allowing them to go off the grid. Secondly, PROMP is computationally efficient and thus suitable for real-time operation because it belongs to the greedy pursuit family which identifies the support of the solution in an iterative manner based on the level of correlation between the input data and the dictionary atoms.

#### **4. Parameter-Refined Orthogonal Matching Pursuit**

Table 1 summarizes the overall structure of the PROMP algorithm. As a variant of the greedy pursuit technique, PROMP solves the sparse inversion problem (9) by identifying the support of $\boldsymbol{\sigma}$ in an iterative greedy manner. In particular, it starts with an empty support set $\Lambda^{[0]} = \emptyset$ and sets the signal $\tilde{\mathbf{s}}$ as the initial signal residual $\mathbf{r}^{[0]}$. Like other greedy techniques, in each iteration the column of **Φ** (corresponding to one atom of the dictionary) that produces the largest correlation with the current signal residual $\mathbf{r}^{[i-1]}$ is chosen and added to the support set $\Lambda^{[i]}$. However, a unique feature of PROMP is that it allows the dictionary atoms to go off the grid by perturbing their parameters, and it can thus overcome the off-grid dictionary mismatch problem. Specifically, the update step of PROMP, unlike that of other greedy pursuit techniques, not only estimates the coefficients $\sigma_k^{[i]}$ but also determines the positions $\mathbf{x}_k^{[i]} = [x_{1,k}^{[i]}, x_{2,k}^{[i]}]^T$, $k = 1, \dots, i$, of the scatterers associated with the current support set $\Lambda^{[i]}$ in the least-squares sense as

$$\begin{aligned} \left\{\hat{\sigma}_k^{[i]}, \hat{\mathbf{x}}_k^{[i]}\right\}_{k=1,\dots,i} = {} & \arg\min \left\| \tilde{\mathbf{s}} - \sum_{k=1}^{i} \sigma_k^{[i]} \boldsymbol{\phi}\left(\mathbf{x}_k^{[i]}\right) \right\|_2 \\ & \text{subject to } \left| x_{1,k}^{[i]} - \bar{x}_{1,k}^{[i]} \right| \leq \zeta \text{ and } \left| x_{2,k}^{[i]} - \bar{x}_{2,k}^{[i]} \right| \leq \zeta, \end{aligned} \tag{10}$$

which is in fact a nonlinear least-squares (NLS) estimation problem. Here, $\bar{\mathbf{x}}_k^{[i]} = [\bar{x}_{1,k}^{[i]}, \bar{x}_{2,k}^{[i]}]^T$, $k = 1, \dots, i$, are the positions of the dictionary atoms in the current support set $\Lambda^{[i]}$. Note that the constraint in (10) ensures that the position estimate for each scatterer stays within the vicinity of the corresponding dictionary atom. A nominal resolution of $\lambda/2$ can be used to set the value of $\zeta$.

**Table 1.** PROMP algorithm.

INPUT: $\tilde{\mathbf{s}}$, $\boldsymbol{\Phi}$.

Initialization: $\mathbf{r}^{[0]} = \tilde{\mathbf{s}}$, $\Lambda^{[0]} = \emptyset$.

**for** $i = 1, 2, \dots$ (until a halting criterion is met) **do**

- Identify:

$$\begin{array}{l} \mathbf{c}^{[i]} = \boldsymbol{\Phi}^{H} \mathbf{r}^{[i-1]} \\ j^{[i]} = \arg\max_{j} \left| c_{j}^{[i]} \right| \end{array}$$

quit the iteration **if** $j^{[i]} \in \Lambda^{[i-1]}$

- Merge:

$$\Lambda^{[i]} = \Lambda^{[i-1]} \cup j^{[i]}$$

- Update:

$$\begin{aligned} \left\{\hat{\sigma}_k^{[i]}, \hat{\mathbf{x}}_k^{[i]}\right\}_{k=1,\dots,i} = {} & \arg\min \left\| \tilde{\mathbf{s}} - \sum_{k=1}^{i} \sigma_k^{[i]} \boldsymbol{\phi}\left(\mathbf{x}_k^{[i]}\right) \right\|_2 \\ & \text{subject to } \left| x_{1,k}^{[i]} - \bar{x}_{1,k}^{[i]} \right| \leq \zeta \text{ and } \left| x_{2,k}^{[i]} - \bar{x}_{2,k}^{[i]} \right| \leq \zeta \end{aligned}$$

- Compute the residual:

$$\mathbf{r}^{[i]} = \tilde{\mathbf{s}} - \sum_{k=1}^{i} \hat{\sigma}_k^{[i]} \boldsymbol{\phi}\left(\hat{\mathbf{x}}_k^{[i]}\right)$$

**end for**.

OUTPUT: $\left\{\hat{\sigma}_k^{[i]}, \hat{\mathbf{x}}_k^{[i]}\right\}_{k=1,\dots,i}$

The superscript $^H$ stands for the Hermitian transpose.
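The greedy loop of Table 1 can be sketched in Python as follows. This is a minimal sketch of the on-grid special case only (plain OMP): the update step solves a linear least-squares problem with the atoms held on the grid, whereas PROMP replaces that step with the constrained NLS problem (10). The function name `omp_on_grid`, the `max_iter` and `tol` arguments and the residual-norm halting rule are illustrative assumptions, not part of the original algorithm description.

```python
import numpy as np

def omp_on_grid(s, Phi, max_iter=10, tol=1e-6):
    """On-grid greedy pursuit (plain OMP), following the loop of Table 1.

    s   : complex data vector of length N
    Phi : N x M dictionary whose columns are the atoms
    Returns the support (list of atom indices) and the coefficients.
    """
    residual = s.copy()              # r^[0] = s
    support = []                     # Lambda^[0] = empty set
    coeffs = np.zeros(0, dtype=complex)
    for _ in range(max_iter):
        # Identify: atom with the largest correlation with the residual
        c = Phi.conj().T @ residual
        j = int(np.argmax(np.abs(c)))
        if j in support:             # quit if the atom is already selected
            break
        support.append(j)            # merge the new index into the support
        # Update (on-grid only): linear least squares over the selected atoms
        A = Phi[:, support]
        coeffs, *_ = np.linalg.lstsq(A, s, rcond=None)
        residual = s - A @ coeffs
        # Assumed halting rule: stop once the residual is negligible
        if np.linalg.norm(residual) <= tol * np.linalg.norm(s):
            break
    return support, coeffs
```

With a dictionary whose columns are orthonormal (for example a scaled DFT matrix) and a signal built from two of its columns, this loop selects exactly those two atoms; the off-grid case discussed in the text is where the plain least-squares update falls short.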

Since the NLS problem in (10) does not admit a closed-form solution, in what follows we derive an iterative solution based on the Gauss–Newton (GN) approach [38]. Note that the scatterer reflection coefficient is a complex-valued variable, while the scatterer position is a real-valued variable. For the sake of convenience, we transform (10) into an NLS problem purely in the real-valued domain. In particular, we re-express the cost function in (10) as

$$\left\|\tilde{\mathbf{s}} - \sum_{k=1}^{i} \sigma_{k}^{[i]} \boldsymbol{\phi}\left(\mathbf{x}_{k}^{[i]}\right)\right\|_{2} = \left\|\begin{bmatrix} \text{Real}\{\tilde{\mathbf{s}}\} \\ \text{Imag}\{\tilde{\mathbf{s}}\} \end{bmatrix} - \sum_{k=1}^{i} \begin{bmatrix} \text{Real}\left\{\sigma_{k}^{[i]} \boldsymbol{\phi}\left(\mathbf{x}_{k}^{[i]}\right)\right\} \\ \text{Imag}\left\{\sigma_{k}^{[i]} \boldsymbol{\phi}\left(\mathbf{x}_{k}^{[i]}\right)\right\} \end{bmatrix}\right\|_{2} \tag{11}$$

where explicit expressions of $\text{Real}\{\sigma_{k}^{[i]} \boldsymbol{\phi}(\mathbf{x}_{k}^{[i]})\}$ and $\text{Imag}\{\sigma_{k}^{[i]} \boldsymbol{\phi}(\mathbf{x}_{k}^{[i]})\}$ are given by

$$\text{Real}\left\{\sigma_{k}^{[i]}\boldsymbol{\phi}\left(\mathbf{x}_{k}^{[i]}\right)\right\} = \left[\dots, \left(\text{Real}\{\sigma_{k}^{[i]}\}\cos\theta_{k}^{[i]}(t_{n}) - \text{Imag}\{\sigma_{k}^{[i]}\}\sin\theta_{k}^{[i]}(t_{n})\right), \dots\right]_{n=1,\dots,N}^{T}, \tag{12a}$$

$$\text{Imag}\left\{\sigma_{k}^{[i]}\boldsymbol{\phi}\left(\mathbf{x}_{k}^{[i]}\right)\right\} = \left[\dots, \left(\text{Real}\{\sigma_{k}^{[i]}\}\sin\theta_{k}^{[i]}(t_{n}) + \text{Imag}\{\sigma_{k}^{[i]}\}\cos\theta_{k}^{[i]}(t_{n})\right), \dots\right]_{n=1,\dots,N}^{T} \tag{12b}$$

and

$$\theta_k^{[i]}(t_n) = -\frac{4\pi f}{c}\cos\left(\frac{|\beta|}{2}\right)\left(x_{1,k}^{[i]}u_1(t_n) + x_{2,k}^{[i]}u_2(t_n)\right). \tag{13}$$
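As a rough numerical illustration of (12a), (12b) and (13), the sketch below evaluates one scattered signal component and checks that its real and imaginary parts agree with the expanded cosine and sine expressions. It assumes that the $n$-th entry of an atom is $\exp(j\theta(t_n))$, which is what (12a) and (12b) imply when a complex coefficient multiplies the atom. The direction components $u_1(t_n)$ and $u_2(t_n)$ are defined earlier in the paper; here they are simply taken as the cosine and sine of a uniformly sampled rotation angle, and all numerical values are hypothetical.

```python
import numpy as np

C = 3.0e8  # propagation speed in m/s (assumed value)

def phase(x, u1, u2, f, beta):
    """Phase theta(t_n) of Eq. (13) for a scatterer at position x = (x1, x2)."""
    kappa = 4.0 * np.pi * f / C * np.cos(abs(beta) / 2.0)
    return -kappa * (x[0] * u1 + x[1] * u2)

def atom(x, u1, u2, f, beta):
    """Atom with entries exp(j*theta(t_n)), as implied by Eqs. (12a)-(12b)."""
    return np.exp(1j * phase(x, u1, u2, f, beta))

# Hypothetical geometry: unit direction vector sampled over one full cycle
N = 360
angles = 2.0 * np.pi * np.arange(N) / N
u1, u2 = np.cos(angles), np.sin(angles)

# Illustrative parameter values (not tied to any single experiment)
f, beta = 9.96e9, np.deg2rad(86.0)
x = np.array([0.05, -0.02])
sigma = 0.8 - 0.3j
component = sigma * atom(x, u1, u2, f, beta)

# Real and imaginary parts written out as in Eqs. (12a) and (12b)
theta = phase(x, u1, u2, f, beta)
real_part = sigma.real * np.cos(theta) - sigma.imag * np.sin(theta)
imag_part = sigma.real * np.sin(theta) + sigma.imag * np.cos(theta)
```

Under these assumptions, `component.real` and `component.imag` coincide with `real_part` and `imag_part` to machine precision.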

Now we define

$$\bar{\mathbf{z}} = \begin{bmatrix} \text{Real}\{\tilde{\mathbf{s}}\} \\ \text{Imag}\{\tilde{\mathbf{s}}\} \end{bmatrix}, \quad \mathbf{z} = \sum_{k=1}^{i} \begin{bmatrix} \text{Real}\left\{\sigma_{k}^{[i]} \boldsymbol{\phi}\left(\mathbf{x}_{k}^{[i]}\right)\right\} \\ \text{Imag}\left\{\sigma_{k}^{[i]} \boldsymbol{\phi}\left(\mathbf{x}_{k}^{[i]}\right)\right\} \end{bmatrix} \tag{14}$$

and

$$\boldsymbol{\xi}^{[i]} = \left[\boldsymbol{\xi}_1^{[i]T}, \dots, \boldsymbol{\xi}_i^{[i]T}\right]^T \text{ with } \boldsymbol{\xi}_k^{[i]} = \left[\sigma_{R,k}^{[i]}, \sigma_{I,k}^{[i]}, x_{1,k}^{[i]}, x_{2,k}^{[i]}\right]^T. \tag{15}$$

Here, $\sigma_{R,k}^{[i]} = \text{Real}\{\sigma_k^{[i]}\}$ and $\sigma_{I,k}^{[i]} = \text{Imag}\{\sigma_k^{[i]}\}$. Noting that $\mathbf{z}$ is a function of $\boldsymbol{\xi}^{[i]}$, (10) is equivalent to

$$\hat{\boldsymbol{\xi}}^{[i]} = \underset{\boldsymbol{\xi}^{[i]}}{\arg\min} \left\| \bar{\mathbf{z}} - \mathbf{z}\left(\boldsymbol{\xi}^{[i]}\right) \right\|_{2}. \tag{16}$$

This is an NLS problem solely in the real-valued domain, and its solution can be obtained via the following GN iteration [38]:

$$\hat{\boldsymbol{\xi}}^{[i]}(h+1) = \hat{\boldsymbol{\xi}}^{[i]}(h) + \left(\boldsymbol{\Gamma}^T(h)\boldsymbol{\Gamma}(h)\right)^{-1}\boldsymbol{\Gamma}^T(h)\left(\bar{\mathbf{z}} - \mathbf{z}\left(\hat{\boldsymbol{\xi}}^{[i]}(h)\right)\right) \tag{17}$$

for $h = 0, 1, \dots$, where $\boldsymbol{\Gamma}(h) = \boldsymbol{\Gamma}(\hat{\boldsymbol{\xi}}^{[i]}(h))$ is the Jacobian matrix of $\mathbf{z}$ with respect to $\boldsymbol{\xi}^{[i]}$ evaluated at $\boldsymbol{\xi}^{[i]} = \hat{\boldsymbol{\xi}}^{[i]}(h)$, and $\mathbf{z}(\hat{\boldsymbol{\xi}}^{[i]}(h))$ is an estimate of $\mathbf{z}$ calculated at $\boldsymbol{\xi}^{[i]} = \hat{\boldsymbol{\xi}}^{[i]}(h)$.
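The GN update (17) is generic and can be sketched independently of the radar model. The example below is a minimal illustration on a toy real-valued model, $z = a\cos(bt)$, which is hypothetical and chosen only because its Jacobian is easy to write down; the function name `gauss_newton` and its arguments are likewise assumptions. The step is computed with a least-squares solver, which gives the same result as $(\boldsymbol{\Gamma}^T\boldsymbol{\Gamma})^{-1}\boldsymbol{\Gamma}^T$ applied to the residual when the Jacobian has full column rank.

```python
import numpy as np

def gauss_newton(z_bar, model, jacobian, xi0, max_iter=20, tol=1e-10):
    """Iterate Eq. (17): xi <- xi + (G^T G)^{-1} G^T (z_bar - z(xi))."""
    xi = np.asarray(xi0, dtype=float).copy()
    for _ in range(max_iter):
        G = jacobian(xi)
        r = z_bar - model(xi)
        # Least-squares step; equals (G^T G)^{-1} G^T r for full-rank G
        step, *_ = np.linalg.lstsq(G, r, rcond=None)
        xi = xi + step
        if np.linalg.norm(step) < tol:   # halt on a small updating term
            break
    return xi

# Toy model (hypothetical, not the radar model): z(xi) = a * cos(b * t)
t = np.linspace(0.0, 1.0, 50)

def model(xi):
    return xi[0] * np.cos(xi[1] * t)

def jacobian(xi):
    # Columns: dz/da and dz/db
    return np.column_stack([np.cos(xi[1] * t),
                            -xi[0] * t * np.sin(xi[1] * t)])

xi_true = np.array([2.0, 3.0])
xi_hat = gauss_newton(model(xi_true), model, jacobian, xi0=[1.8, 2.8])
```

Starting close to the true parameters, as the PROMP initialization is intended to do, the iteration settles on `xi_true` within a few steps for this noise-free toy problem.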

The expression of the Jacobian matrix $\boldsymbol{\Gamma}(\boldsymbol{\xi}^{[i]})$ is given by

$$\boldsymbol{\Gamma} = \left[\dots, \boldsymbol{\Gamma}_k, \dots\right]_{k=1,\dots,i} \tag{18a}$$

$$\boldsymbol{\Gamma}_k = \begin{bmatrix} \boldsymbol{\Gamma}_k^{(1)} & \boldsymbol{\Gamma}_k^{(2)} & \boldsymbol{\Gamma}_k^{(3)} & \boldsymbol{\Gamma}_k^{(4)} \\ \boldsymbol{\Gamma}_k^{(5)} & \boldsymbol{\Gamma}_k^{(6)} & \boldsymbol{\Gamma}_k^{(7)} & \boldsymbol{\Gamma}_k^{(8)} \end{bmatrix} \tag{18b}$$

$$\boldsymbol{\Gamma}_k^{(1)} = \left[\dots, \cos\theta_k^{[i]}(t_n), \dots\right]_{n=1,\dots,N}^T \tag{18c}$$

$$\boldsymbol{\Gamma}_k^{(5)} = \left[\dots, \sin\theta_k^{[i]}(t_n), \dots\right]_{n=1,\dots,N}^T \tag{18d}$$

$$\boldsymbol{\Gamma}_k^{(2)} = \left[\dots, -\sin\theta_k^{[i]}(t_n), \dots\right]_{n=1,\dots,N}^T \tag{18e}$$

$$\boldsymbol{\Gamma}_k^{(6)} = \left[\dots, \cos\theta_k^{[i]}(t_n), \dots\right]_{n=1,\dots,N}^T \tag{18f}$$

$$\boldsymbol{\Gamma}_k^{(3)} = \left[\dots, \frac{4\pi f}{c}\cos\left(\frac{|\beta|}{2}\right) u_1(t_n)\left(\sigma_{R,k}^{[i]}\sin\theta_k^{[i]}(t_n) + \sigma_{I,k}^{[i]}\cos\theta_k^{[i]}(t_n)\right), \dots\right]_{n=1,\dots,N}^T \tag{18g}$$

$$\boldsymbol{\Gamma}_k^{(7)} = \left[\dots, \frac{4\pi f}{c}\cos\left(\frac{|\beta|}{2}\right) u_1(t_n)\left(-\sigma_{R,k}^{[i]}\cos\theta_k^{[i]}(t_n) + \sigma_{I,k}^{[i]}\sin\theta_k^{[i]}(t_n)\right), \dots\right]_{n=1,\dots,N}^T \tag{18h}$$

$$\boldsymbol{\Gamma}_k^{(4)} = \left[\dots, \frac{4\pi f}{c}\cos\left(\frac{|\beta|}{2}\right) u_2(t_n)\left(\sigma_{R,k}^{[i]}\sin\theta_k^{[i]}(t_n) + \sigma_{I,k}^{[i]}\cos\theta_k^{[i]}(t_n)\right), \dots\right]_{n=1,\dots,N}^T \tag{18i}$$

$$\boldsymbol{\Gamma}_k^{(8)} = \left[\dots, \frac{4\pi f}{c}\cos\left(\frac{|\beta|}{2}\right) u_2(t_n)\left(-\sigma_{R,k}^{[i]}\cos\theta_k^{[i]}(t_n) + \sigma_{I,k}^{[i]}\sin\theta_k^{[i]}(t_n)\right), \dots\right]_{n=1,\dots,N}^T. \tag{18j}$$
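The Jacobian block for a single scatterer can be assembled directly from these expressions and cross-checked numerically. The sketch below is only an illustration: the direction components $u_1(t_n)$ and $u_2(t_n)$ are assumed to be the cosine and sine of a uniformly sampled rotation angle, the parameter values are hypothetical, and the central-difference routine `jacobian_fd` is a verification aid introduced here rather than part of the algorithm.

```python
import numpy as np

C = 3.0e8  # propagation speed in m/s (assumed value)

def z_single(xi, u1, u2, f, beta):
    """Stacked real/imaginary signal of one scatterer, per Eqs. (12)-(14).

    xi = [sigma_R, sigma_I, x1, x2]
    """
    s_r, s_i, x1, x2 = xi
    kappa = 4.0 * np.pi * f / C * np.cos(abs(beta) / 2.0)
    th = -kappa * (x1 * u1 + x2 * u2)                    # Eq. (13)
    return np.concatenate([s_r * np.cos(th) - s_i * np.sin(th),
                           s_r * np.sin(th) + s_i * np.cos(th)])

def jacobian_single(xi, u1, u2, f, beta):
    """2N x 4 block Gamma_k assembled from Eqs. (18b)-(18j)."""
    s_r, s_i, x1, x2 = xi
    kappa = 4.0 * np.pi * f / C * np.cos(abs(beta) / 2.0)
    th = -kappa * (x1 * u1 + x2 * u2)
    cos_t, sin_t = np.cos(th), np.sin(th)
    top = np.column_stack([
        cos_t,                                        # Gamma^(1)
        -sin_t,                                       # Gamma^(2)
        kappa * u1 * (s_r * sin_t + s_i * cos_t),     # Gamma^(3)
        kappa * u2 * (s_r * sin_t + s_i * cos_t),     # Gamma^(4)
    ])
    bottom = np.column_stack([
        sin_t,                                        # Gamma^(5)
        cos_t,                                        # Gamma^(6)
        kappa * u1 * (-s_r * cos_t + s_i * sin_t),    # Gamma^(7)
        kappa * u2 * (-s_r * cos_t + s_i * sin_t),    # Gamma^(8)
    ])
    return np.vstack([top, bottom])

def jacobian_fd(xi, u1, u2, f, beta, h=1e-6):
    """Central-difference Jacobian, used only as a numerical cross-check."""
    xi = np.asarray(xi, dtype=float)
    cols = []
    for m in range(xi.size):
        d = np.zeros_like(xi)
        d[m] = h
        cols.append((z_single(xi + d, u1, u2, f, beta)
                     - z_single(xi - d, u1, u2, f, beta)) / (2.0 * h))
    return np.column_stack(cols)

# Hypothetical geometry and parameter values, for illustration only
angles = 2.0 * np.pi * np.arange(90) / 90
u1, u2 = np.cos(angles), np.sin(angles)
xi = np.array([0.8, -0.3, 0.05, -0.02])
G = jacobian_single(xi, u1, u2, 9.96e9, np.deg2rad(86.0))
G_fd = jacobian_fd(xi, u1, u2, 9.96e9, np.deg2rad(86.0))
max_err = np.max(np.abs(G - G_fd))
```

For $i$ scatterers, the full Jacobian in (18a) would be obtained by placing the blocks for $k = 1, \dots, i$ side by side.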

The following decision logic is then applied at the end of each GN iteration to ensure the constraint in (10) is met:

$$\begin{aligned} &\text{if } \left(\hat{x}_{1,k}^{[i]}(h+1) - \bar{x}_{1,k}^{[i]}\right) \gtrless \pm\zeta, \text{ set } \hat{x}_{1,k}^{[i]}(h+1) = \bar{x}_{1,k}^{[i]} \pm \zeta, \\ &\text{if } \left(\hat{x}_{2,k}^{[i]}(h+1) - \bar{x}_{2,k}^{[i]}\right) \gtrless \pm\zeta, \text{ set } \hat{x}_{2,k}^{[i]}(h+1) = \bar{x}_{2,k}^{[i]} \pm \zeta. \end{aligned} \tag{19}$$

When the constraint is in effect, a re-estimation of the reflection coefficients is performed as

$$\left[\hat{\sigma}_1^{[i]}(h+1), \dots, \hat{\sigma}_i^{[i]}(h+1)\right]^T = \left(\boldsymbol{\Upsilon}^H\boldsymbol{\Upsilon}\right)^{-1}\boldsymbol{\Upsilon}^H\tilde{\mathbf{s}} \tag{20}$$

with $\boldsymbol{\Upsilon} = \left[\boldsymbol{\phi}(\hat{\mathbf{x}}_1^{[i]}(h+1)), \dots, \boldsymbol{\phi}(\hat{\mathbf{x}}_i^{[i]}(h+1))\right]$. The GN iteration can be halted after a fixed number of iterations or if the $\ell_2$ norm of the updating term falls below a given threshold.
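The decision logic (19) amounts to clipping each position coordinate to a box of half-width $\zeta$ around its grid atom, and (20) is an ordinary linear least-squares fit for the coefficients once the positions are fixed. A minimal sketch is given below; the function names and the small test matrices are hypothetical, and the columns of a DFT matrix merely stand in for atoms evaluated at the clipped positions.

```python
import numpy as np

def clamp_positions(x_hat, x_bar, zeta):
    """Eq. (19): keep each coordinate within +/- zeta of its grid atom."""
    return np.clip(x_hat, x_bar - zeta, x_bar + zeta)

def reestimate_coefficients(Upsilon, s):
    """Eq. (20): least-squares coefficients for fixed atom positions."""
    sigma, *_ = np.linalg.lstsq(Upsilon, s, rcond=None)
    return sigma

# Two scatterers; rows are [x1, x2] in metres (hypothetical values)
x_bar = np.array([[0.00, 0.00], [0.03, 0.06]])
x_hat = np.array([[0.02, -0.001], [0.028, 0.04]])
x_clamped = clamp_positions(x_hat, x_bar, zeta=0.015)

# Stand-in atoms: two columns of a DFT matrix
Upsilon = np.fft.fft(np.eye(8))[:, [1, 3]]
sigma_true = np.array([1.0 + 2.0j, -0.5j])
sigma_hat = reestimate_coefficients(Upsilon, Upsilon @ sigma_true)
```

Only the coordinates that violate the constraint are moved; here the first coordinate of the first scatterer is pulled back to 0.015 and the second coordinate of the second scatterer is pushed up to 0.045.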

The GN iteration is initialized to the solution $\hat{\boldsymbol{\xi}}^{[i-1]}$ obtained from the previous PROMP iteration $i - 1$ and the newly selected atom:

$$\hat{\boldsymbol{\xi}}^{[i]}(0) = \left[\hat{\boldsymbol{\xi}}^{[i-1]T}, \text{Real}\{\bar{\sigma}_{j^{[i]}}\}, \text{Imag}\{\bar{\sigma}_{j^{[i]}}\}, \bar{\mathbf{x}}_{j^{[i]}}^T\right]^T \tag{21}$$

where $\bar{\mathbf{x}}_{j^{[i]}}$ is the position of the newly selected atom at index $j^{[i]}$ within the dictionary and $\bar{\sigma}_{j^{[i]}} = \left(\boldsymbol{\phi}^H(\bar{\mathbf{x}}_{j^{[i]}})\boldsymbol{\phi}(\bar{\mathbf{x}}_{j^{[i]}})\right)^{-1}\boldsymbol{\phi}^H(\bar{\mathbf{x}}_{j^{[i]}})\mathbf{r}^{[i-1]}$ is the corresponding initial coefficient estimate for this atom.
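The initialization (21) can be sketched as a single-atom least-squares fit followed by a concatenation. The helper names below are hypothetical, and the atom is an arbitrary unit-modulus vector used only to exercise the code.

```python
import numpy as np

def initial_coefficient(phi_new, residual):
    """Single-atom least-squares fit: (phi^H phi)^{-1} phi^H r."""
    # np.vdot conjugates its first argument, so it computes phi^H r
    return np.vdot(phi_new, residual) / np.vdot(phi_new, phi_new)

def init_gn_state(xi_prev, phi_new, x_bar_new, residual):
    """Eq. (21): append [Re(sigma), Im(sigma), x1, x2] of the new atom."""
    sigma0 = initial_coefficient(phi_new, residual)
    return np.concatenate([xi_prev, [sigma0.real, sigma0.imag], x_bar_new])

# Hypothetical previous solution (one scatterer) and newly selected atom
xi_prev = np.array([1.0, 0.0, 0.1, 0.2])
phi_new = np.exp(1j * np.linspace(0.0, 1.0, 8))
residual = (0.5 - 0.25j) * phi_new
xi0 = init_gn_state(xi_prev, phi_new, np.array([0.3, -0.1]), residual)
```

In this example the residual is an exact multiple of the new atom, so the initial coefficient recovers that multiple and `xi0` has eight entries, four per scatterer.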

#### **5. Results**

In this section, we demonstrate the performance superiority of the proposed sparsity-based tomography method based on the PROMP algorithm over the conventional *k*-space tomography method via results using both simulated and real data. The result comparison also includes the performance of the OMP algorithm to illustrate the off-grid dictionary mismatch problem of the sparsity-based tomography formulation and to verify the effectiveness of PROMP in dealing with this issue.

#### *5.1. Results with Simulated Data*

We consider two synthetic targets with two and eight scatterers, respectively, as depicted in Figure 2. In this simulation, the target rotational speed is set to Ω = 37.70 rad/s and the signal frequency is set to *f* = 9.96 GHz. The constraint value *ζ* is set to *ζ* = *λ*/2 for PROMP. The dictionary matrix is constructed using a regularly spaced grid in Cartesian coordinates with the grid step size of Δ = *λ*/5 on each *x*- and *y*-axis. The sampling frequency at the receiver is set to 2.16 kHz. PROMP and OMP iterations are halted if the signal residual reaches the noise level.
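The derived quantities implied by these settings can be checked with a few lines of arithmetic. The sketch below assumes a propagation speed of $3 \times 10^8$ m/s; the variable names are illustrative.

```python
import math

C = 3.0e8        # propagation speed in m/s (assumed value)
omega = 37.70    # target rotational speed in rad/s
f = 9.96e9       # signal frequency in Hz
fs = 2.16e3      # receiver sampling frequency in Hz

wavelength = C / f                          # about 3.01 cm
zeta = wavelength / 2.0                     # PROMP constraint, about 1.51 cm
grid_step = wavelength / 5.0                # dictionary grid step, about 6.0 mm
rotation_rate = omega / (2.0 * math.pi)     # about 6 rotations per second
samples_per_rotation = fs / rotation_rate   # about 360 samples
```

The last value shows that one full rotation of the target corresponds to roughly 360 samples at the stated sampling frequency.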

**Figure 2.** Ground truth images of two synthetic targets under consideration.

Figure 3 compares the reconstructed images for synthetic target 1 obtained by the *k*-space, OMP and PROMP algorithms for various noise levels. Here, the number of data samples is *N* = 360 (i.e., the CPI is approximately one full rotation cycle of the target). Note that the SNR is defined by $\mathrm{SNR} = 20\log_{10}(\|\mathbf{s}\|/\|\mathbf{n}\|)$. We observe that the *k*-space technique produces images with two main peaks corresponding to the true target scatterers. However, along with these two main peaks, the images obtained by the *k*-space technique also contain sidelobes with many spurious peaks. Specifically, the higher the noise, the poorer the performance of the *k*-space technique (i.e., the larger the number of spurious peaks). In contrast, such a problem associated with spurious peaks does not appear in the OMP and PROMP images, thus demonstrating the performance advantage of the sparsity-based tomography approach over the conventional *k*-space approach. It is observed that one of the scatterers is split into multiple peaks in the OMP images. This observation can be explained by the fact that the OMP solution relies on a fixed dictionary built on a grid of atoms, while the true scatterers of the target do not coincide with this dictionary grid, thereby demonstrating the dictionary mismatch problem. On the other hand, by perturbing the dictionary atoms and allowing them to go off the grid, the PROMP algorithm effectively overcomes the dictionary mismatch problem, producing a clean image with only two peaks corresponding to the true scatterers. More importantly, the locations of the peaks in the PROMP images almost exactly match the locations of the true scatterers.
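For readers who wish to reproduce a given noise level, the sketch below scales a complex Gaussian noise vector so that a norm-ratio SNR takes a prescribed value in decibels. It assumes that the SNR is the ratio of the norm of the noise-free signal to the norm of the noise, and it enforces that ratio exactly for each realization, which is a simplifying choice; the function name, the seed and the test signal are hypothetical.

```python
import numpy as np

def add_noise(s, snr_db, rng):
    """Add complex Gaussian noise with 20*log10(||s|| / ||n||) = snr_db."""
    n = rng.standard_normal(s.shape) + 1j * rng.standard_normal(s.shape)
    # Rescale the noise so that the norm ratio matches the requested SNR
    n *= np.linalg.norm(s) / np.linalg.norm(n) * 10.0 ** (-snr_db / 20.0)
    return s + n, n

rng = np.random.default_rng(0)
s = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 360))   # stand-in signal
noisy, n = add_noise(s, snr_db=10.0, rng=rng)
snr_check = 20.0 * np.log10(np.linalg.norm(s) / np.linalg.norm(n))
```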

**Figure 3.** Reconstructed images obtained by the *k*-space, OMP and PROMP algorithms for synthetic target 1.

Figure 4 shows the results for synthetic target 2. This is a more challenging scenario because target 2 contains many more scatterers than target 1. We observe that the conventional *k*-space method struggles to produce reliable images because of the interaction between the sidelobes of different main peaks, especially in large noise scenarios. Such an interaction leads to some strong spurious peaks which have similar magnitudes to the correct peaks that correspond to the true scatterers, thus making the resulting images severely distorted. On the other hand, compared to the *k*-space method, OMP results in much more satisfactory images. However, the OMP performance is significantly affected by the dictionary mismatch problem arising from off-grid scatterers. As a result, the true scatterers are split into multiple peaks in the OMP images, and some spurious peaks also appear. In contrast, PROMP produces clean and clear images which are almost identical to the ground truth target image, even at large noise levels.

**Figure 4.** Reconstructed images obtained by the *k*-space, OMP and PROMP algorithms for synthetic target 2.

To further demonstrate the superior performance of PROMP, the reconstructed images obtained from fewer data samples (i.e., with CPI = 2/3 and 1/3 target rotation cycle) are shown in Figure 5. With a limited number of data samples, the *k*-space method results in unsatisfactory images with incorrect peaks, while OMP and PROMP are observed to retain their good performance. In addition, similar to the observations in Figure 4, PROMP outperforms OMP and provides a better image of the target thanks to its ability to deal with off-grid scatterers.

**Figure 5.** Reconstructed images obtained by the *k*-space, OMP and PROMP algorithms for synthetic target 2, given fewer data samples.

Note that, to satisfy the sparsity condition, the number of dominant scatterers constituting the target must be sufficiently small compared to the number of grid points on the reflectivity map. The required sparsity level in general depends on several factors, including the number of data samples, the noise level, and the level of coherence between the atoms of the dictionary. In compressive sensing, the restricted isometry property and the mutual incoherence property establish theoretical connections between the factors required for the effectiveness of sparse reconstruction [23]. However, these analytical metrics are overly conservative and do not reflect the average performance, which is often of interest from a practical point of view [39].

We now compare the performance of the *k*-space, OMP, and PROMP methods using the earth mover's distance (EMD) between the true and reconstructed images. EMD [40] is a widely used metric for comparing the similarity between different images. In principle, EMD is an estimate of the distance between two distributions, equivalent to the minimal amount of work required to transform one distribution into the other [40]. Figure 6 shows the EMD performance of the *k*-space, OMP, and PROMP methods, averaged over 1000 Monte Carlo runs, against various levels of SNR for synthetic target 2 and CPI = 1 target rotation cycle. We observe that the PROMP method exhibits an EMD much smaller than those of the *k*-space and OMP methods. This indicates that, from a statistical point of view, the image obtained by PROMP is much closer to the ground-truth image than those obtained by the *k*-space and OMP methods, thus verifying the performance superiority of the PROMP method.
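To give some intuition for the "minimal amount of work" interpretation, the sketch below computes the EMD for the much simpler case of two one-dimensional histograms on a unit-spaced grid, where it reduces to the sum of absolute differences between the cumulative sums. This is a one-dimensional simplification for illustration only; the images compared in this section are two-dimensional, and the function `emd_1d` is not the procedure used to produce Figure 6.

```python
def emd_1d(p, q):
    """Earth mover's distance between two 1-D histograms on a unit grid.

    Both histograms are normalized to unit mass; the distance is then the
    sum of absolute differences between their running (cumulative) sums.
    """
    total_p, total_q = float(sum(p)), float(sum(q))
    work, carried = 0.0, 0.0
    for a, b in zip(p, q):
        carried += a / total_p - b / total_q   # mass still to be moved on
        work += abs(carried)
    return work

truth = [0, 1, 0, 0, 1, 0]
shifted = [0, 0, 1, 0, 1, 0]      # one of the two peaks displaced by one bin

d_same = emd_1d(truth, truth)
d_shift = emd_1d(truth, shifted)
```

Here half of the total mass has to be moved by one bin, so `d_shift` equals 0.5, while identical histograms give zero.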

**Figure 6.** EMD performance of the *k*-space, OMP and PROMP algorithms versus SNR for synthetic target 2 and CPI = 1 target rotation cycle.

Table 2 compares the runtimes of the *k*-space, OMP, and PROMP algorithms for the image results shown in Figure 4b. For a fair comparison, all methods were implemented in MATLAB on the same Intel Core i7 3.40 GHz CPU with 16 GB RAM. We observe that the *k*-space method is much slower than the OMP and PROMP methods. The reason for this is that the *k*-space method requires the non-uniform fast Fourier transform to be performed, and is thus computationally more demanding than OMP and PROMP. On the other hand, the OMP and PROMP methods are computationally fast, thanks to the fact that they belong to the greedy pursuit family. In addition, each iteration of PROMP is about 7.3 times slower than each iteration of OMP because PROMP incorporates an NLS solver, rather than a linear least-squares solver as in OMP. However, the overall runtime of PROMP is only 2.4 times longer than that of OMP because PROMP requires far fewer iterations than OMP to make the signal residual reach the noise level.


**Table 2.** Complexity comparison.

All methods are implemented in MATLAB on an Intel Core i7 3.40 GHz CPU with 16 GB RAM, for the image results shown in Figure 4b.
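The two ratios quoted in the runtime comparison can be combined into a rough estimate of the iteration counts. As a back-of-the-envelope check, assume that the total runtime $T$ is approximately the number of iterations $I$ times an average per-iteration time $\tau$ (an idealization, since the per-iteration cost grows with the size of the support set; the symbols $T$, $I$ and $\tau$ are introduced here for illustration only):

$$\frac{T_{\text{PROMP}}}{T_{\text{OMP}}} \approx \frac{I_{\text{PROMP}}\,\tau_{\text{PROMP}}}{I_{\text{OMP}}\,\tau_{\text{OMP}}} \quad\Rightarrow\quad \frac{I_{\text{PROMP}}}{I_{\text{OMP}}} \approx \frac{2.4}{7.3} \approx 0.33.$$

Under this assumption, PROMP would need roughly one third as many iterations as OMP in this example.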

#### *5.2. Results with Real Data*

The experimental data used in this paper was collected in the Mumma Radar Laboratory at the University of Dayton, Ohio, USA. Although this paper focuses on narrowband tomographic imaging, a wideband waveform at X-band with stepped frequency pulses over 101 regular frequency steps from 8 GHz to 12 GHz was used in the experiment. The aim of using wideband data was only for the removal of extraneous clutter components existing in the lab environment, as described in [13]. After that, only the measured data from one discrete frequency is actually used for algorithm performance evaluation for the problem of narrowband tomographic imaging under consideration.

Figure 7 shows photos of the experimental system configuration. The experimental setup involves transmitting and receiving horn antennas mounted on separate robotic arms in a controlled laboratory environment with radar-absorbing material to reduce the radar reflections from the floor and walls. These robotic arms could be oriented and positioned with high precision. During the experiment, the antennas were kept stationary while the target was rotated in 1° steps through 360°. At each step, the stepped-frequency waveform was transmitted and sampled (one sample for each frequency). Recall from above that only one frequency sample set was used for tomographic imaging purposes. The experimental target consisted of two vertical metallic rods with a 19 cm separation to emulate two point scatterers which rotate around a vertical pedestal. Figure 8 shows the wideband *k*-space tomographic image obtained from monostatic data with all 101 frequency steps. This will be used as a reference benchmark for the performance evaluation of narrowband bistatic imaging presented in this section.

**Figure 7.** Photos of experimental system configuration: (**a**) an antenna mounted on a robotic arm, and (**b**) two vertical metallic rods secured to a rotating pedestal.


**Figure 8.** A benchmark experimental target image using wideband signals received by the monostatic receiver for all 101 frequency steps between 8 GHz and 12 GHz.

Figure 9 shows the experimental narrowband images obtained by the *k*-space, OMP, and PROMP algorithms using a narrowband signal received by the bistatic receiver with *β* = 86° at a single frequency of 8.8 GHz for various values of CPI. Since the SNR is unknown, the OMP and PROMP iterations are halted when the change in the signal residual norm falls below 1% of the input signal norm. Compared to the benchmark image in Figure 8, the *k*-space method produces images of much lower quality when only a narrowband signal from a single frequency is available. We observe that the images obtained by the *k*-space method are distorted with numerous spurious sidelobe peaks. In particular, the number of spurious sidelobe peaks increases significantly for shorter CPI. Moreover, the main lobes corresponding to the true scatterers are also spread when the CPI is reduced. In contrast, the PROMP images contain two clear peaks at locations very close to the peaks of the reference image in Figure 8, even when the CPI is reduced to one third of the target rotation cycle. This observation demonstrates the performance superiority of the sparsity-based tomography approach over the conventional *k*-space tomography approach. Figure 9 also shows the images obtained by OMP to illustrate the dictionary mismatch problem associated with off-grid scatterers, where each true scatterer is split into multiple peaks, thus verifying the effectiveness of PROMP in tackling the dictionary mismatch problem.

**Figure 9.** Experimental narrowband images obtained by the *k*-space, OMP, and PROMP algorithms using a narrowband signal received by the bistatic receiver with *β* = 86° at a single frequency of 8.8 GHz, for various values of CPI.
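The halting rule used for the experimental data (stop when the change in the residual norm falls below 1% of the input signal norm) can be sketched as follows. This is one possible reading of that rule, taking the "change" as the absolute difference between the residual norms of two consecutive iterations; the function name and the test vectors are hypothetical.

```python
import numpy as np

def should_halt(residual_prev, residual_curr, s, rel_tol=0.01):
    """True when the residual norm changed by less than rel_tol * ||s||."""
    change = abs(np.linalg.norm(residual_prev) - np.linalg.norm(residual_curr))
    return bool(change < rel_tol * np.linalg.norm(s))

s = np.ones(100)                    # stand-in input signal, norm 10
r_prev = 0.5 * np.ones(100)         # residual norm 5
r_small = 0.495 * np.ones(100)      # norm 4.95, change 0.05 (below 0.1)
r_large = 0.3 * np.ones(100)        # norm 3, change 2 (above 0.1)

halt_small = should_halt(r_prev, r_small, s)
halt_large = should_halt(r_prev, r_large, s)
```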

Figure 10 shows the experimental results where the data is downsampled. Here, we observe a similar relative performance comparison to Figures 3–5 and 9, once again confirming the performance advantages of the proposed sparsity-based tomographic imaging method based on the PROMP algorithm.

**Figure 10.** Experimental narrowband images obtained by the *k*-space, OMP, and PROMP algorithms for downsampled data (CPI = 1 target rotation cycle).

#### **6. Conclusions**

In this paper, we have proposed a new sparsity-based bistatic radar tomographic imaging method exploiting the use of the PROMP algorithm. A new formulation for radar tomography building on the framework of compressive sensing and sparse reconstruction was presented, moving away from conventional *k*-space tomography which is prone to sidelobe responses and their interference. The PROMP algorithm was adopted to obtain a sparse solution for the resulting sparsity-based tomography formulation. By perturbing the dictionary atoms and allowing the estimated scatterers to go off the grid, PROMP is capable of tackling the dictionary mismatch problem arising from off-grid scatterers. The performance advantages of the proposed sparsity-based tomography method over the conventional *k*-space tomography method were demonstrated via numerical studies involving both simulated and real data.

**Author Contributions:** Conceptualization, N.H.N., H.-T.T. and P.B.; Methodology, N.H.N., H.-T.T. and P.B.; Simulation, N.H.N.; Validation, N.H.N., H.-T.T. and P.B.; Writing—Original Draft Preparation, N.H.N.; Writing—Review & Editing, N.H.N., H.-T.T. and P.B.

**Funding:** This research was funded by Defence Science and Technology Group, Australia.

**Acknowledgments:** Preliminary results of this work were reported in [41], where the feasibility and benefits of using compressive sensing and sparse reconstruction for narrowband bistatic radar tomographic imaging were demonstrated. This paper presents detailed algorithmic design and implementation as well as experimental evaluation of the PROMP-based tomographic imaging method.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 Commonwealth of Australia. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Two-Dimensional Augmented State–Space Approach with Applications to Sparse Representation of Radar Signatures**

#### **Kejiang Wu and Xiaojian Xu \***

School of Electronics and Information Engineering, Beihang University, Beijing 100191, China; wukejiang88@buaa.edu.cn

**\*** Correspondence: xiaojianxu@buaa.edu.cn; Tel.: +86-10-8231-6065

Received: 16 September 2019; Accepted: 21 October 2019; Published: 24 October 2019

**Abstract:** In this work, we focus on sparse representation of two-dimensional (2-D) radar signatures for man-made targets. Based on the damped exponential (DE) model, a 2-D augmented state–space approach (ASSA) is proposed to estimate the parameters of scattering centers on complex man-made targets, i.e., the complex amplitudes and the poles in down-range and aspect dimensions. An augmented state–space approach is developed for pole estimation of down-range dimension. Multiple-range search strategy, which applies one-dimensional (1-D) state–space approach (SSA) to the 1-D data for each down-range cell, is used to alleviate the pole-pairing problem occurring in previous algorithms. Effectiveness of the proposed approach is verified by the numerical and measured inverse synthetic aperture radar (ISAR) data.

**Keywords:** damped exponential (DE) model; inverse synthetic aperture radar (ISAR); radar signatures; state–space approach (SSA); sparse representation

#### **1. Introduction**

Sparse representation of two-dimensional (2-D) radar signatures has been widely used in many applications, such as super-resolution radar imaging, data compression, and target identification [1–4]. 2-D radar signatures can be reconstructed with fewer data, where a set of parameters including the locations, amplitudes, and damping factors are used to represent the returned signals from spatially distributed scattering centers. Moreover, 2-D ultra-wideband images of radar targets can be obtained through parameter interpolation or extrapolation [5].

Over the years, several model-based spectral estimation approaches have been developed for sparse representation of 2-D radar signatures, in which the common idea of these approaches is the development of a parametric model based on electromagnetic scattering mechanisms. In this way, a set of parameters may be found to represent the original signatures. Please note that most sparse representation approaches are considered to have super-resolution capabilities because they are supposed to estimate the parameters of scattering centers that cannot be distinguished by standard processing [6–21]. Typically, these approaches include the amplitude and phase estimation of a sinusoid (APES) algorithm [6], fast Fourier transform (FFT)-based technique (CLEAN) [7,8], the compressed sensing (CS)-based procedure [9–11] and the subspace-based approach [12–21]. The APES algorithm could provide accurate estimation of the complex amplitudes whereas it does not estimate the locations of scattering centers. The 2-D CLEAN technique is a computationally efficient procedure which uses the undamped exponential model and deconvolution algorithm for optimization [7,22]. The main process of the CS-based procedure is to represent the signatures using an over-complete dictionary and the corresponding coefficients [9,10]. The performance of this approach depends on the initial coefficients and the threshold used in terminating iterations.

Another important group of 2-D sparse representation techniques is the subspace-based approaches. The representative algorithms of this kind include 2-D multiple signal classification (MUSIC) [12], matrix enhancement and matrix pencil (MEMP) [13,14], 2-D total least squares (TLS) Prony [15], algebraically coupled matrix pencils (ACMP) [16], 2-D estimation of signal parameters via rotational invariance techniques (2-D ESPRIT) [17] and the 2-D system realization technique [18,19]. The rationale and pole-pairing scheme of these algorithms are listed in Table 1.



(1). Pole denotes a transfer function of the scattering center collected on target. Usually, the scattering centers are characterized via pairing the complex amplitudes and the poles in down-range and aspect dimensions.

While these subspace-based algorithms have been demonstrated to be useful for the signatures of targets or simulated scattering centers, there are still some technical challenges, especially for sparse representation of the wideband radar signatures collected on complex man-made targets. First, the pairing schemes used by many existing algorithms cannot provide correct pole pairs in certain circumstances; e.g., MEMP, ACMP and 2-D system realization encounter the pole-pairing problem when there are numerous repeated poles in either the down-range or aspect dimension [17,20,23]. Second, the model order, which directly affects the result of parameter estimation in algorithms such as 2-D MUSIC, MEMP, and 2-D ESPRIT, is difficult to determine in the measurement environment [8]. Third, the compromise between computational complexity and accuracy should be considered in the application of these algorithms [12]; for instance, the 2-D TLS Prony has smaller computational complexity than MEMP but at a loss in accuracy [15].

In this paper, we focus on developing an approach for sparse representation of 2-D radar signatures collected on man-made targets. An augmented state–space approach is proposed for pole estimation of the down-range dimension. A multiple-range search strategy is then applied to estimate the pairing poles and corresponding amplitudes along the aspect dimension. Compared to the existing methods [14,17,18], the advantages of the 2-D augmented state–space approach (ASSA) are: (1) the computational complexity of the algorithm is much reduced since the newly defined Hankel matrix and several time-saving operations are adopted, whereas the pole-estimation accuracy is still at the same level; (2) the pole-pairing problem can be alleviated because all the poles are adaptively paired by using the multiple-range search strategy; and (3) an eigenvalue sequences transform algorithm is proposed, which could provide fast model order selection.

The remainder of the paper is organized as follows. Section 2 briefly presents the damped exponential models for 1-D and 2-D signals. In Section 3, a two-step procedure for 2-D ASSA is developed. Results for numerical and measured inverse synthetic aperture radar (ISAR) data representation are demonstrated to validate the effectiveness of the proposed procedure in Section 4. We conclude the paper in Section 5. Appendices are given to show more mathematical details of the proposed approach.

#### **2. Data Model**

#### *2.1. 1-D Damped Exponential Model*

As described by the damped exponential (DE) model [22,24], the 1-D radar signatures *y*(*fn*) can be expressed as a sum of *K* scattering centers corrupted by noise *w*(*n*).

$$\begin{aligned} y(f\_n) &= \sum\_{k=1}^{K} A\_k \exp\left[ \left( \beta\_k + j2\pi \frac{r\_k}{c} \right) f\_n \right] + w(n) \\ &= \sum\_{k=1}^{K} a\_k p\_k^n + w(n) \end{aligned} \tag{1}$$

where *n* = 1, 2, ... , *N* and *N* denotes the number of pulses; *Ak* is the complex amplitude of the *k*-th scattering center; β*<sub>k</sub>* is the damping factor with respect to frequency; *rk* denotes the relative range; *fn* = *fc* + (*n* − 1 − *N*2)Δ*f* denotes the radar frequency, where *fc* is the center frequency and *N*2 = *ceil*(*N*/2) denotes the smallest integer greater than or equal to *N*/2; *ak* represents the amplitude of the *k*-th scattering center in pole form; the pole *pk* = exp[(β*<sub>k</sub>* + *j*2π*rk*/*c*)Δ*f*] represents the transfer function of the *k*-th scattering center; and *c* = 3 × 10<sup>8</sup> m/s is the propagation velocity. It is worth noting that all data models used in this paper assume a stepped-frequency radar [25,26].
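As a minimal sketch of (1), the NumPy snippet below synthesizes a noise-free 1-D signature and checks that the exponential form and the pole form agree. The function name `de_signal_1d` and all numerical values are illustrative assumptions rather than settings from the paper; the pole-form amplitude is obtained here by factoring out the frequency offset *fc* − (1 + *N*2)Δ*f*.

```python
import numpy as np

C_LIGHT = 3e8  # propagation velocity c in m/s


def de_signal_1d(amps, betas, ranges, fc, df, n_freq):
    """Return f_n, the noise-free y(f_n), the poles p_k and the pole-form amplitudes a_k."""
    amps = np.asarray(amps, dtype=complex)
    n = np.arange(1, n_freq + 1)                     # n = 1, ..., N
    n2 = int(np.ceil(n_freq / 2))                    # N2 = ceil(N / 2)
    freqs = fc + (n - 1 - n2) * df                   # f_n = f_c + (n - 1 - N2) * df
    expo = np.asarray(betas) + 2j * np.pi * np.asarray(ranges) / C_LIGHT
    y = (amps[:, None] * np.exp(expo[:, None] * freqs[None, :])).sum(axis=0)
    poles = np.exp(expo * df)                        # p_k
    # f_n = f_off + n * df, hence a_k = A_k * exp(expo_k * f_off)
    f_off = fc - (1 + n2) * df
    pole_amps = amps * np.exp(expo * f_off)
    return freqs, y, poles, pole_amps


df = 150e6  # assumed frequency step in Hz
freqs, y, poles, pole_amps = de_signal_1d(
    amps=[1.0, 0.5], betas=[-0.05 / df, -0.02 / df], ranges=[0.3, -0.5],
    fc=12e9, df=df, n_freq=41)

# Pole form of (1): y(f_n) = sum_k a_k p_k^n
n = np.arange(1, 42)
y_pole_form = (pole_amps[:, None] * poles[:, None] ** n[None, :]).sum(axis=0)
```

Both `y` and `y_pole_form` describe the same samples, which is why subspace methods can work directly with the poles and pole-form amplitudes.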

According to discrete-time control theory and the auto-regressive moving average (ARMA) model, Piou and Naishadham proposed a one-dimensional state–space approach (1-D SSA), which uses a state–space description of the 1-D radar signatures in (1) [22].

$$\mathbf{x}(n+1) = \mathbf{A}\mathbf{x}(n) + \mathbf{B}u(n) \tag{2}$$

$$\mathbf{y}(n) = \mathbf{C}\mathbf{x}(n) + \boldsymbol{\mu}(n) \tag{3}$$

where **x**(*n*) ∈ ℂ<sup>*K*×1</sup> is the state vector, *y*(*n*) denotes the signal sequence *y*(*fn*), *u*(*n*) is the input vector, **A** ∈ ℂ<sup>*K*×*K*</sup> represents the open-loop matrix, and **B** ∈ ℂ<sup>*K*×1</sup> and **C** ∈ ℂ<sup>1×*K*</sup> are constant matrices. Thus, the noiseless 1-D radar signatures corresponding to (1) can be expressed as

$$\begin{bmatrix} \ \widetilde{y}(1) & \ \widetilde{y}(2) & \dots & \ \widetilde{y}(N) \end{bmatrix} = \begin{bmatrix} \ \mathbf{C}\mathbf{B} & \mathbf{C}\mathbf{A}\mathbf{B} & \dots & \mathbf{C}\mathbf{A}^{N-1}\mathbf{B} \end{bmatrix} \tag{4}$$

As described in [22], 1-D SSA could precisely estimate the state matrices **A**, **B**, and **C**. Once these three state matrices are computed, the model parameters in (1) can be estimated by using the eigen-decomposition technique.
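To illustrate why the structure in (4) matches the pole form of (1), the sketch below uses one possible diagonal realization. This particular choice of (**A**, **B**, **C**) and the example poles and amplitudes are assumptions made for illustration, not the matrices estimated in [22].

```python
import numpy as np

poles = np.array([0.95 * np.exp(0.4j), 0.90 * np.exp(-1.1j)])  # assumed example poles p_k
amps = np.array([1.0, 0.5j])                                    # assumed amplitudes a_k
N = 16

A = np.diag(poles)                     # open-loop matrix, K x K
B = np.ones((2, 1), dtype=complex)     # K x 1
C = (amps * poles)[None, :]            # 1 x K, chosen so that C B = sum_k a_k p_k = y(1)

# Right-hand side of (4): C A^(n-1) B for n = 1..N
markov = np.array([(C @ np.linalg.matrix_power(A, n - 1) @ B).item()
                   for n in range(1, N + 1)])
# Pole form of (1) without noise: sum_k a_k p_k^n
direct = np.array([(amps * poles ** n).sum() for n in range(1, N + 1)])
```

Because the eigenvalues of **A** are the poles, any similarity transform of this realization produces the same sequence, which is what makes eigen-decomposition of the estimated **A** sufficient.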

#### *2.2. 2-D Damped Exponential Model*

As a 2-D extension of the 1-D DE model, the radar signatures obtained from different aspect angles are considered to be the summation of a finite number of dispersive scattering centers [13–18]. Typically, this model is applicable to 2-D radar signatures collected over small aspect ranges [27].

$$\begin{aligned} y(\theta\_{m}, f\_{n}) &= \sum\_{k=1}^{K} A\_{k} \exp\left[ \left(\frac{j2\pi r\_{1k}}{c} + \beta\_{1k}\right) f\_{c} \theta\_{m} \right] \exp\left[ \left(\frac{j2\pi r\_{2k}}{c} + \beta\_{2k}\right) f\_{n} \right] + w(m, n) \\ &= \sum\_{k=1}^{K} a\_{k} s\_{k}^{m} p\_{k}^{n} + w(m, n) \end{aligned} \tag{5}$$

In matrix notation, the 2-D radar signatures in (5) can be expressed as:

$$\mathbf{Y} = \begin{bmatrix} y(1,1) & y(1,2) & \dots & y(1,N) \\ y(2,1) & y(2,2) & \dots & y(2,N) \\ \vdots & \vdots & \ddots & \vdots \\ y(M,1) & y(M,2) & \dots & y(M,N) \end{bmatrix} \tag{6}$$

where *m* = 1, ... , *M* and *n* = 1, ... , *N*; {*r*1*k*, *r*2*k*} give the relative locations of the *k*-th scattering center; {β1*k*, β2*k*} characterize the aspect and frequency dependence of scattering, respectively; θ<sub>*m*</sub> = θ<sub>0</sub> + (*m* − 1)Δθ denotes the *m*-th aspect angle, where θ<sub>0</sub> is the starting angle and Δθ represents the angle interval; *w*(*m*, *n*) is zero-mean Gaussian noise; *y*(*m*, *n*) denotes the signal sequence *y*(θ<sub>*m*</sub>, *fn*); and {*sk*, *pk*} refer to the poles of the aspect and down-range dimensions, respectively.

$$s\_k = \exp\left[ \left( \frac{j2\pi r\_{1k}}{c} + \beta\_{1k} \right) f\_c \Delta \theta \right] \tag{7}$$

$$p\_k = \exp\left[\left(\frac{j2\pi r\_{2k}}{c} + \beta\_{2k}\right)\Delta f\right] \tag{8}$$
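A short sketch of how (5), (7) and (8) fit together is given below. The helper names `poles_2d` and `de_signal_2d`, the scatterer positions, and the use of unit pole-form amplitudes are assumptions for illustration; only *fc*, Δθ and Δ*f* are taken from the values quoted later in Section 4.1.

```python
import numpy as np

C_LIGHT = 3e8  # propagation velocity c in m/s


def poles_2d(r1, r2, beta1, beta2, fc, dtheta, df):
    """Poles s_k (aspect) and p_k (down-range) as written in (7) and (8)."""
    s = np.exp((2j * np.pi * np.asarray(r1) / C_LIGHT + np.asarray(beta1)) * fc * dtheta)
    p = np.exp((2j * np.pi * np.asarray(r2) / C_LIGHT + np.asarray(beta2)) * df)
    return s, p


def de_signal_2d(amps, s, p, M, N):
    """Noise-free M x N matrix with entries sum_k a_k s_k^m p_k^n, m = 1..M, n = 1..N."""
    m = np.arange(1, M + 1)[:, None]
    n = np.arange(1, N + 1)[:, None]
    S = s[None, :] ** m                      # M x K
    P = p[None, :] ** n                      # N x K
    return (S * np.asarray(amps)[None, :]) @ P.T


fc, dtheta, df = 12e9, 0.0125, 150e6
r1 = np.array([0.2, -0.3, 0.5])              # assumed positions along the first coordinate, in m
r2 = np.array([0.4, 0.4, -0.1])              # assumed positions along the second coordinate, in m
beta1 = np.full(3, -0.05 / (fc * dtheta))
beta2 = np.full(3, -0.05 / df)
s, p = poles_2d(r1, r2, beta1, beta2, fc, dtheta, df)
Y = de_signal_2d(np.ones(3), s, p, M=41, N=41)
```

The first two scatterers share the same `r2` and `beta2`, so they share the same pole `p`; this is the repeated-pole situation that the pairing step has to resolve.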

#### **3. Two-Dimensional Augmented State–Space Approach**

From the 2-D DE model in (5), the vector **Ỹ**(*n*), which represents the *n*-th column of the noiseless signature matrix **Ỹ** = **Y** − **W** (where **W** is the noise matrix), can be decomposed as

$$
\widetilde{\mathbf{Y}}(n) = \begin{bmatrix} a\_1 s\_1^1 p\_1 & a\_2 s\_2^1 p\_2 & \cdots & a\_K s\_K^1 p\_K \\ a\_1 s\_1^2 p\_1 & a\_2 s\_2^2 p\_2 & \cdots & a\_K s\_K^2 p\_K \\ \vdots & \vdots & \ddots & \vdots \\ a\_1 s\_1^M p\_1 & a\_2 s\_2^M p\_2 & \cdots & a\_K s\_K^M p\_K \end{bmatrix} \begin{bmatrix} p\_1 & 0 & \cdots & 0 \\ 0 & p\_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & p\_K \end{bmatrix}^{n-1} \mathbf{1}\_K \tag{9}
$$

where *K* denotes the number of scattering centers and **1**<sub>*K*</sub> denotes the column vector of ones with length *K*.

In practice, we often encounter the one-to-multiple matching situation, i.e., one pole *pk* corresponds to more than one pole *sk*1, *sk*2, .... These repeated poles can be merged, and (9) can be rewritten as

$$
\widetilde{\mathbf{Y}}(n) = \begin{bmatrix} \sum\_{t=1}^{l\_1} a\_{1t}s\_{1t}p\_1 & \sum\_{t=1}^{l\_2} a\_{2t}s\_{2t}p\_2 & \cdots & \sum\_{t=1}^{l\_{K\_1}} a\_{K\_1t}s\_{K\_1t}p\_{K\_1} \\ \sum\_{t=1}^{l\_1} a\_{1t}s\_{1t}^2p\_1 & \sum\_{t=1}^{l\_2} a\_{2t}s\_{2t}^2p\_2 & \cdots & \sum\_{t=1}^{l\_{K\_1}} a\_{K\_1t}s\_{K\_1t}^2p\_{K\_1} \\ \vdots & \vdots & \ddots & \vdots \\ \sum\_{t=1}^{l\_1} a\_{1t}s\_{1t}^Mp\_1 & \sum\_{t=1}^{l\_2} a\_{2t}s\_{2t}^Mp\_2 & \cdots & \sum\_{t=1}^{l\_{K\_1}} a\_{K\_1t}s\_{K\_1t}^Mp\_{K\_1} \end{bmatrix} \begin{bmatrix} p\_1 & 0 & \cdots & 0 \\ 0 & p\_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & p\_{K\_1} \end{bmatrix}^{n-1} \mathbf{1}\_{K\_1} \tag{10}$$

where *K*<sub>1</sub> represents the number of non-repeated poles in the down-range dimension, and *l*<sub>1</sub>, *l*<sub>2</sub>, ... , *l*<sub>*K*1</sub> denote the numbers of repeated poles for the corresponding poles *p*<sub>1</sub>, *p*<sub>2</sub>, ... , *p*<sub>*K*1</sub>, respectively.

We define the matrix **P** as

$$\mathbf{P} = \mathbf{Q} \begin{bmatrix} p\_1 & 0 & \cdots & 0 \\ 0 & p\_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & p\_{K\_1} \end{bmatrix} \mathbf{Q}^\* \tag{11}$$

where \* denotes the Hermitian (conjugate) transpose, and **Q** is a unitary matrix satisfying **Q**<sup>∗</sup>**Q** = **QQ**<sup>∗</sup> = **E**, with **E** the identity matrix.

Two constant matrices, i.e., **S** and **D**, are defined to simplify the expression in (10).

$$\mathbf{S} = \begin{bmatrix} \sum\_{t=1}^{l\_1} a\_{1t} s\_{1t} p\_1 & \sum\_{t=1}^{l\_2} a\_{2t} s\_{2t} p\_2 & \cdots & \sum\_{t=1}^{l\_{K\_1}} a\_{K\_1 t} s\_{K\_1 t} p\_{K\_1} \\ \sum\_{t=1}^{l\_1} a\_{1t} s\_{1t}^2 p\_1 & \sum\_{t=1}^{l\_2} a\_{2t} s\_{2t}^2 p\_2 & \cdots & \sum\_{t=1}^{l\_{K\_1}} a\_{K\_1 t} s\_{K\_1 t}^2 p\_{K\_1} \\ \vdots & \vdots & \ddots & \vdots \\ \sum\_{t=1}^{l\_1} a\_{1t} s\_{1t}^M p\_1 & \sum\_{t=1}^{l\_2} a\_{2t} s\_{2t}^M p\_2 & \cdots & \sum\_{t=1}^{l\_{K\_1}} a\_{K\_1 t} s\_{K\_1 t}^M p\_{K\_1} \end{bmatrix} \mathbf{Q}^\* \tag{12}$$
 
$$\mathbf{D} = \mathbf{Q} \mathbf{1}\_{K\_1} \tag{13}$$

Using (11)–(13), the expression in (10) simplifies to

$$\widetilde{\mathbf{Y}}(n) = \mathbf{S} \mathbf{P}^{n-1} \mathbf{D} \tag{14}$$

Thus, the noiseless signatures **Ỹ** can be written as

$$\widetilde{\mathbf{Y}} = \begin{bmatrix} \mathbf{S}\mathbf{D} & \mathbf{S}\mathbf{P}\mathbf{D} & \dots & \mathbf{S}\mathbf{P}^{N-1}\mathbf{D} \end{bmatrix} \tag{15}$$

Comparing (15) with (4), it can be seen that the two equations have the same structure. Thus, the parameters *ak*, *sk* and *pk* can be estimated by a 2-D augmentation of 1-D SSA, which is called the 2-D augmented state–space approach (2-D ASSA).

Here, the 2-D ASSA consists of two steps. First, an augmented state–space approach is applied to estimate the poles *pk* in the down-range dimension. Then, a multiple-range search strategy is used to estimate the matching poles *sk* and the corresponding amplitudes *ak*. Details are as follows.

#### *3.1. Pole Estimation of the Down-Range Dimension*

First, we introduce a single augmented Hankel matrix **H** of size *M*(*N* − *L* + 1) × *L* by analogy of the enhanced Hankel matrix in 1-D SSA:

$$\mathbf{H} = \begin{bmatrix} \mathbf{Y}(1) & \mathbf{Y}(2) & \dots & \mathbf{Y}(L) \\ \mathbf{Y}(2) & \mathbf{Y}(3) & \dots & \mathbf{Y}(L+1) \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{Y}(N-L+1) & \mathbf{Y}(N-L+2) & \dots & \mathbf{Y}(N) \end{bmatrix} \tag{16}$$

where each element **Y**(*n*), first mentioned in (9), represents the *n*-th column of the matrix **Y**; *L* is the step size of the correlation window, which is heuristically set to be *L*=*N*2 [22].
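The construction in (16) can be sketched in a few lines of NumPy. The function name `augmented_hankel` and the random placeholder data are assumptions used only to show the block layout and the resulting size *M*(*N* − *L* + 1) × *L*.

```python
import numpy as np


def augmented_hankel(Y, L):
    """Stack the M x L windows [Y(i) ... Y(i+L-1)], i = 1..N-L+1, as block rows."""
    M, N = Y.shape
    return np.vstack([Y[:, i:i + L] for i in range(N - L + 1)])


rng = np.random.default_rng(0)
M, N = 5, 8
Y = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))  # placeholder data
L = int(np.ceil(N / 2))                                             # L = N2
H = augmented_hankel(Y, L)
```

Each block row is simply a window of *L* consecutive columns of **Y**, so no data beyond **Y** itself is needed to form **H**.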

Next, the singular value decomposition (SVD) of the Hankel matrix is computed.

$$\mathbf{H} = \begin{bmatrix} \mathbf{U}\_{sn} & \mathbf{U}\_{n} \end{bmatrix} \begin{bmatrix} \Re\_{sn} & \\ & \Re\_{n} \end{bmatrix} \begin{bmatrix} \mathbf{V}\_{sn}^{\*} \\ \mathbf{V}\_{n}^{\*} \end{bmatrix} \tag{17}$$

where **U**<sub>*sn*</sub>, **U**<sub>*n*</sub>, **V**<sub>*sn*</sub><sup>∗</sup> and **V**<sub>*n*</sub><sup>∗</sup> are unitary matrices; ℜ<sub>*sn*</sub> and ℜ<sub>*n*</sub> are diagonal matrices; the matrices with subscript '*sn*' refer to the components of the signal space and those with subscript '*n*' denote the components of the noise space. The rank of the noiseless matrix **U**<sub>*sn*</sub>ℜ<sub>*sn*</sub>**V**<sub>*sn*</sub><sup>∗</sup> is known as the model order, which has been widely used in [14,22,27].

As proved in [14,17], the model order in the down-range dimension is equal to *K*<sub>1</sub> and cannot exceed the number of rows or columns of **H**. Thus, the number of non-repeated poles *K*<sub>1</sub> should satisfy the following condition.

$$\begin{cases} \begin{array}{c} MN-ML + M \geq K\_1 \\ L \geq K\_1 \end{array} \end{cases} \tag{18}$$

Model order selection is an unavoidable problem in modeling 2-D signatures [14,22,27]. A series of eigenvalue-based criteria, such as the Akaike information criterion (AIC), the minimum description length (MDL) criterion and the minimum eigenvalue (MEV) criterion, have been proposed for solving this problem [28,29]. These criteria are useful for model order selection but are time-consuming [30]. Here, an eigenvalue sequences transform algorithm is proposed for estimating the model order because of its low computational complexity. This algorithm provides fast estimation, but at the cost of reduced robustness to noise. More details of this algorithm are presented in Appendix A.

Based on linear systems theory [31] and the matrix expressions in (14) and (16), the noiseless matrix **H̃** can be further factorized as:

$$\widetilde{\mathbf{H}} = \mathbf{U}\_{sn} \Re\_{sn} \mathbf{V}\_{sn}^{\*} = \boldsymbol{\Omega} \boldsymbol{\Gamma} \tag{19}$$

where **Ω** is the (*N* − *L* + 1)*M* × *K*<sub>1</sub> observability matrix, which can be further expressed by the matrices **S** and **P**,

$$\boldsymbol{\Omega} = \mathbf{U}\_{sn}\Re\_{sn}^{1/2} = \begin{bmatrix} \mathbf{S} & \mathbf{S}\mathbf{P} & \dots & \mathbf{S}\mathbf{P}^{N-L} \end{bmatrix}^{\rm T} \tag{20}$$

and **Γ** is the *K*<sub>1</sub> × *L* controllability matrix, which can be expressed by the matrices **P** and **D**.

$$\boldsymbol{\Gamma} = \mathfrak{R}\_{sn}^{1/2} \mathbf{V}\_{sn}^\* = \begin{bmatrix} \mathbf{D} & \mathbf{P} \mathbf{D} & \dots & \mathbf{P}^{L-1} \mathbf{D} \end{bmatrix} \tag{21}$$

Because the computational cost of (19)–(21) grows considerably for large data sets, operations (22)–(24) are used as alternative steps, which have a lower computational load at the cost of a minimal calculation error.

$$\widetilde{\mathbf{H}}^{\*}\widetilde{\mathbf{H}} = \mathbf{V}\_{sn}\Re\_{sn}^2 \mathbf{V}\_{sn}^\* = \mathbf{V}\_{sn}\mathbf{Z}\mathbf{V}\_{sn}^\* \tag{22}$$

$$\boldsymbol{\Omega} \approx \mathbf{H}\mathbf{V}\_{sn}\mathbf{Z}^{-1/4} \tag{23}$$

$$
\boldsymbol{\Gamma} = \mathbf{Z}^{1/4} \mathbf{V}\_{sn}^\* \tag{24}
$$

where **Z** = ℜ<sub>*sn*</sub><sup>2</sup>.
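The idea behind these time-saving operations can be sketched as follows: instead of a full SVD of the tall matrix **H**, only the small *L* × *L* Gram matrix is eigen-decomposed. The function name `fast_factors` and the synthetic rank-3 test matrix are assumptions; the controllability factor is scaled by **Z**<sup>1/4</sup> so that it equals the square root of the singular values times **V**<sub>*sn*</sub><sup>∗</sup>, as in (21), and the product of the two factors reproduces the signal part of **H**.

```python
import numpy as np


def fast_factors(H, order):
    """Observability and controllability factors from the L x L Gram matrix H* H."""
    gram = H.conj().T @ H                          # L x L, Hermitian
    z, V = np.linalg.eigh(gram)                    # real eigenvalues, ascending order
    keep = np.argsort(z)[::-1][:order]             # indices of the largest eigenvalues
    Z, V_sn = z[keep], V[:, keep]
    Omega = (H @ V_sn) * Z ** -0.25                # H V_sn Z^(-1/4)
    Gamma = (Z ** 0.25)[:, None] * V_sn.conj().T   # Z^(1/4) V_sn^*
    return Omega, Gamma


rng = np.random.default_rng(1)
left = rng.standard_normal((60, 3)) + 1j * rng.standard_normal((60, 3))
right = rng.standard_normal((3, 10)) + 1j * rng.standard_normal((3, 10))
H = left @ right                                   # synthetic rank-3 "noiseless" matrix
Omega, Gamma = fast_factors(H, order=3)

# Reference singular values from the full SVD, for comparison only
sv = np.linalg.svd(H, compute_uv=False)
```

For a tall matrix the Gram matrix is much smaller than **H**, which is where the saving comes from; the price is that squaring the data squares its condition number, so small singular values are resolved less accurately than with a direct SVD.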

By analogy with the open-loop matrix in 1-D SSA, the augmented open-loop matrix **P** can be derived from the observability matrix by least squares.

$$\mathbf{P} = \left(\boldsymbol{\Omega}\_{rf}^{\*}\boldsymbol{\Omega}\_{rf}\right)^{-1}\boldsymbol{\Omega}\_{rf}^{\*}\boldsymbol{\Omega}\_{rl} \tag{25}$$

Alternatively, it can be computed from the controllability matrix **Γ**.

$$\mathbf{P} = \Gamma\_{cl} \Gamma\_{cf}^\* \left(\Gamma\_{cf} \Gamma\_{cf}^\*\right)^{-1} \tag{26}$$

where **Ω**<sub>*rf*</sub> is the first (*N* − *L*)*M* rows of **Ω** and **Ω**<sub>*rl*</sub> denotes the last (*N* − *L*)*M* rows of **Ω**; **Γ**<sub>*cf*</sub> represents the first (*L* − 1) columns of **Γ** and **Γ**<sub>*cl*</sub> is the last (*L* − 1) columns of **Γ**. From (20) and (21), these matrices can be rewritten as

$$\left[\begin{array}{cccc}\mathbf{S} & \mathbf{S}\mathbf{P} & \dots & \mathbf{S}\mathbf{P}^{N-L-1} \end{array}\right]^T = \boldsymbol{\Omega}\_{rf} \tag{27}$$

$$\left[\begin{array}{cccc}\textbf{SP} & \textbf{SP}^2 & \dots & \textbf{SP}^{N-L} \end{array}\right]^T = \Omega\_{rl} \tag{28}$$

$$\left[\begin{array}{cc}\mathbf{D} & \mathbf{P}\mathbf{D} & \dots & \mathbf{P}^{L-2}\mathbf{D} \end{array}\right] = \Gamma\_{cf} \tag{29}$$

$$\left[\begin{array}{cccc}\mathbf{P}\mathbf{D} & \mathbf{P}^{2}\mathbf{D} & \dots & \mathbf{P}^{L-1}\mathbf{D} \end{array}\right] = \Gamma\_{cl} \tag{30}$$

More details of the derivation of the matrices **S**, **P**, and **D** are listed in Appendix B.

According to (11), the vector [ *p*<sub>1</sub> *p*<sub>2</sub> ... *p*<sub>*K*1</sub> ] is obtained by performing the eigenvalue decomposition of **P**.

$$
\boldsymbol{\Lambda} = \mathbf{Q}^\* \mathbf{P} \mathbf{Q} \tag{31}
$$

$$\text{diag}(\boldsymbol{\Lambda}) = \begin{bmatrix} p\_1 & p\_2 & \dots & p\_{K\_1} \end{bmatrix} \tag{32}$$

where *diag*(**Λ**) denotes the diagonal elements of the matrix **Λ**.
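Putting the steps of this subsection together, a minimal noise-free sketch of the down-range pole estimation might look as follows. The function name `downrange_poles`, the example poles and the assumption that the model order *K*<sub>1</sub> is known are illustrative choices; the full SVD is used here for brevity instead of the faster operations (22)–(24).

```python
import numpy as np


def downrange_poles(Y, order, L=None):
    """Estimate the distinct down-range poles p_k from an M x N signature matrix."""
    M, N = Y.shape
    L = L or int(np.ceil(N / 2))
    H = np.vstack([Y[:, i:i + L] for i in range(N - L + 1)])   # augmented Hankel matrix (16)
    U, sv, _ = np.linalg.svd(H, full_matrices=False)           # SVD (17)
    Omega = U[:, :order] * np.sqrt(sv[:order])                 # observability matrix (20)
    # Shift invariance Omega_rl = Omega_rf P, solved in the least-squares sense as in (25)
    P = np.linalg.lstsq(Omega[:-M], Omega[M:], rcond=None)[0]
    return np.linalg.eigvals(P)                                # eigenvalues of P are the poles


# Assumed example: three scatterers, two of which share the same down-range pole
s = np.array([0.98 * np.exp(0.5j), 0.97 * np.exp(-0.8j), 0.99 * np.exp(1.2j)])
p = np.array([0.95 * np.exp(0.9j), 0.95 * np.exp(0.9j), 0.96 * np.exp(-0.4j)])
a = np.array([1.0, 0.8, 1.2])
M, N = 12, 20
m = np.arange(1, M + 1)[:, None]
n = np.arange(1, N + 1)[:, None]
Y = (s[None, :] ** m * a) @ (p[None, :] ** n).T                # entries sum_k a_k s_k^m p_k^n
p_est = downrange_poles(Y, order=2)                            # K1 = 2 distinct poles
p_est = p_est[np.argsort(np.angle(p_est))]
```

Although three scatterers are present, only two distinct poles are returned, which is the merging of repeated poles described by (10).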

#### *3.2. Pole Estimation of the Aspect Dimension*

In this step, multiple-range search, which applies one-dimensional (1-D) state–space approach (SSA) to the 1-D data for each down-range cell, is used for pole estimation of the aspect dimension and pole adaptive pairing. Details are as follows.

We introduce the Vandermonde sub-matrix **O**.

$$\mathbf{O} = \begin{bmatrix} p\_1^1 & p\_1^2 & \cdots & p\_1^N \\ p\_2^1 & p\_2^2 & \cdots & p\_2^N \\ \vdots & \vdots & \ddots & \vdots \\ p\_{K\_1}^1 & p\_{K\_1}^2 & \cdots & p\_{K\_1}^N \end{bmatrix} \tag{33}$$

From (5), the noiseless signatures **Ỹ** can be factorized as a product of the pairing matrix **G** and **O**, where each column of **G** is associated with the corresponding row of **O**.

$$\widetilde{\mathbf{Y}} = \begin{bmatrix} \sum\_{t=1}^{l\_1} a\_{1t} s\_{1t} & \sum\_{t=1}^{l\_2} a\_{2t} s\_{2t} & \cdots & \sum\_{t=1}^{l\_{K\_1}} a\_{K\_1 t} s\_{K\_1 t} \\ \sum\_{t=1}^{l\_1} a\_{1t} s\_{1t}^2 & \sum\_{t=1}^{l\_2} a\_{2t} s\_{2t}^2 & \cdots & \sum\_{t=1}^{l\_{K\_1}} a\_{K\_1 t} s\_{K\_1 t}^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sum\_{t=1}^{l\_1} a\_{1t} s\_{1t}^M & \sum\_{t=1}^{l\_2} a\_{2t} s\_{2t}^M & \cdots & \sum\_{t=1}^{l\_{K\_1}} a\_{K\_1 t} s\_{K\_1 t}^M \end{bmatrix} \mathbf{O} = \mathbf{G} \mathbf{O} \tag{34}$$

Thus, the pairing matrix **G** in (34) can be calculated by least squares, i.e.

$$\mathbf{G} = \widetilde{\mathbf{Y}} \mathbf{O}^\* (\mathbf{O} \mathbf{O}^\*)^{-1} \tag{35}$$
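A minimal sketch of this least-squares step is shown below. The function name `pairing_matrix` and the example data are assumptions; the transposed system is solved with `numpy.linalg.lstsq`, which gives the same least-squares solution as the right pseudo-inverse form in (35) without forming the inverse explicitly.

```python
import numpy as np


def pairing_matrix(Y, p_poles):
    """Least-squares solution of Y = G O, with O built from the distinct poles p_k as in (33)."""
    N = Y.shape[1]
    O = np.asarray(p_poles)[:, None] ** np.arange(1, N + 1)[None, :]   # K1 x N
    # O^T G^T = Y^T is the transposed form of Y = G O
    return np.linalg.lstsq(O.T, Y.T, rcond=None)[0].T                  # M x K1


# Assumed example with one-to-multiple matching: the first pole p is shared by two scatterers
s = np.array([0.98 * np.exp(0.5j), 0.97 * np.exp(-0.8j), 0.99 * np.exp(1.2j)])
p = np.array([0.95 * np.exp(0.9j), 0.95 * np.exp(0.9j), 0.96 * np.exp(-0.4j)])
a = np.array([1.0, 0.8, 1.2])
M, N = 12, 20
m = np.arange(1, M + 1)
n = np.arange(1, N + 1)
Y = (s[None, :] ** m[:, None] * a) @ (p[None, :] ** n[:, None]).T
G = pairing_matrix(Y, p_poles=[p[0], p[2]])

# The first column collects both scatterers that share the first pole
expected_col0 = a[0] * s[0] ** m + a[1] * s[1] ** m
```

Each column of `G` is a 1-D damped exponential sequence along the aspect dimension, already tied to one down-range pole.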

The *k*-th column of the pairing matrix **G** is given by

$$\mathbf{G}(k) = \left[ \sum\_{t=1}^{l\_k} a\_{kt} \mathbf{s}\_{kt}^1 \quad \sum\_{t=1}^{l\_k} a\_{kt} \mathbf{s}\_{kt}^2 \quad \dots \quad \sum\_{t=1}^{l\_k} a\_{kt} \mathbf{s}\_{kt}^M \right]^T \tag{36}$$

where *k* = 1, 2, ... , *K*1.

From (36), each column of the pairing matrix has the same structure as the 1-D DE model defined in (1). This indicates that the poles *skt* and amplitudes *akt*, which correspond to the *k*-th pole *pk* in (32), can be obtained by applying 1-D SSA [22] to each column of the pairing matrix **G**. For example, for the first column of the pairing matrix **G**, which corresponds to *p*<sub>1</sub>, the parameter matrices (**C**, **A**, **B**) can be obtained by using 1-D SSA as in (4).

$$\mathbf{G}(1) = \begin{bmatrix} \mathbf{C}\mathbf{B} & \mathbf{C}\mathbf{A}\mathbf{B} & \dots & \mathbf{C}\mathbf{A}^{M-1}\mathbf{B} \end{bmatrix}^{\mathrm{T}} \tag{37}$$

The eigenvalue decomposition of the open-loop matrix **A** leads to

$$\mathbf{A} = \mathbf{M}\_1 \boldsymbol{\Lambda}\_1 \mathbf{M}\_1^{-1} \tag{38}$$

The matching poles *s*<sub>1*t*</sub> and the corresponding amplitudes *a*<sub>1*t*</sub> are computed as

$$\begin{bmatrix} s\_{11}, s\_{12}, \dots, s\_{1l\_1} \end{bmatrix} = \begin{bmatrix} \ \Lambda\_1(1,1) & \Lambda\_1(2,2) & \dots & \Lambda\_1(l\_1,l\_1) \end{bmatrix} \tag{39}$$

$$\left[a\_{11}, a\_{12}, \dots, a\_{1l\_1}\right] = (\mathbf{CM}\_1) \odot \left(\mathbf{M}\_1^{-1}\mathbf{B}\right)^{\mathrm{T}} \tag{40}$$

where ⊙ denotes the element-wise product.
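The column-by-column search can be sketched as below. The helper names `ssa_1d` and `multiple_range_search`, the example data, and the assumption that the number of repeated poles per column (`orders`) is known are all illustrative; the 1-D routine is a generic state–space estimator written for this sketch, not the exact implementation of [22]. Because each column starts at the first power of the pole, the element-wise product of **CM** and **M**<sup>−1</sup>**B** is divided by the pole to obtain the amplitude.

```python
import numpy as np


def ssa_1d(g, order):
    """Estimate poles s_t and amplitudes a_t in g(m) = sum_t a_t s_t^m, m = 1..M."""
    M = g.size
    L = int(np.ceil(M / 2))
    H = np.array([g[i:i + L] for i in range(M - L + 1)])        # Hankel matrix of one column
    U, sv, Vh = np.linalg.svd(H, full_matrices=False)
    root = np.sqrt(sv[:order])
    Omega = U[:, :order] * root                                 # observability matrix
    Gamma = root[:, None] * Vh[:order]                          # controllability matrix
    A = np.linalg.lstsq(Omega[:-1], Omega[1:], rcond=None)[0]   # open-loop matrix
    C, B = Omega[:1], Gamma[:, :1]                              # first row and first column
    poles, Mv = np.linalg.eig(A)                                # A = Mv diag(poles) Mv^-1
    # g(m) = C A^(m-1) B, so the element-wise product equals a_t s_t
    amps = (C @ Mv).ravel() * np.linalg.solve(Mv, B).ravel() / poles
    return poles, amps


def multiple_range_search(G, p_poles, orders):
    """Search G column by column and return paired triples (a, s, p)."""
    triples = []
    for k, (p_k, l_k) in enumerate(zip(p_poles, orders)):
        s_k, a_k = ssa_1d(G[:, k], l_k)
        triples += [(a, s, p_k) for a, s in zip(a_k, s_k)]
    return triples


# Assumed noise-free pairing matrix: column 0 holds two scatterers, column 1 holds one
m = np.arange(1, 13)
s = np.array([0.98 * np.exp(0.5j), 0.97 * np.exp(-0.8j), 0.99 * np.exp(1.2j)])
a = np.array([1.0, 0.8, 1.2])
p1, p2 = 0.95 * np.exp(0.9j), 0.96 * np.exp(-0.4j)
G = np.column_stack([a[0] * s[0] ** m + a[1] * s[1] ** m, a[2] * s[2] ** m])
triples = multiple_range_search(G, [p1, p2], orders=[2, 1])
ordered = sorted(triples, key=lambda t: np.angle(t[1]))
```

Every estimated pole inherits the down-range pole of the column it was found in, so the pairing is a by-product of the search rather than a separate step.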

Thus, the pairing matrix **G** can be searched column by column until all poles *skt* and the corresponding amplitudes *akt* are estimated. No extra pairing scheme is required because the poles [*sk*1, *sk*2, ... , *sklk*] and the amplitudes [*ak*1, *ak*2, ... , *aklk*] have already been adaptively paired to the poles *pk* in (34). In addition, because the model order may sometimes be overestimated, the searched pole pairs are usually checked again according to their amplitudes. Finally, the locations and damping factors of all the scattering points can be obtained from (41) and (42).

$$(r\_{1k}, r\_{2k}) = \left(\frac{\text{Arg}\{s\_k\}}{4\pi f\_c \Delta \theta / c}, \frac{\text{Arg}\{p\_k\}}{4\pi \Delta f / c}\right) \tag{41}$$

$$(\beta\_{1k}, \beta\_{2k}) = (\frac{\ln(|s\_k|)}{f\_c \Delta \theta}, \frac{\ln(|p\_k|)}{\Delta f}) \tag{42}$$
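The mapping from poles back to physical parameters can be sketched as follows. The function name `pole_to_range_and_damping` and the test values are assumptions. The phase constant is exposed as a parameter because (41) is written with 4π, whereas the poles in (7) and (8) are written with *j*2π; whichever convention generated the poles has to be used when inverting them.

```python
import numpy as np

C_LIGHT = 3e8  # propagation velocity c in m/s


def pole_to_range_and_damping(pole, step, phase_const=4 * np.pi):
    """Map a pole to (range, damping); `step` is f_c * dtheta for s_k or df for p_k."""
    range_est = np.angle(pole) / (phase_const * step / C_LIGHT)   # form of (41)
    beta_est = np.log(np.abs(pole)) / step                        # form of (42)
    return range_est, beta_est


fc, dtheta = 12e9, 0.0125
step = fc * dtheta
beta_true, r_true = -0.05 / step, 0.4            # assumed damping factor and position
s = np.exp((2j * np.pi * r_true / C_LIGHT + beta_true) * step)   # pole written as in (7)

r_2pi, beta_est = pole_to_range_and_damping(s, step, phase_const=2 * np.pi)
r_4pi, _ = pole_to_range_and_damping(s, step)    # default 4*pi, as written in (41)
```

The damping factor does not depend on the phase convention, while the recovered range scales by a factor of two between the two conventions. The range is also only unambiguous while the pole phase stays within (−π, π].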

#### **4. Results and Discussion**

In this section, three examples are presented to demonstrate the usefulness of the proposed procedure: the numerical signatures with 14 point scattering centers, the numerical signatures of a sphere tipped cone-cylinder-frustum combination model, and the measured ISAR data for an aircraft model. The results obtained by 2-D ESPRIT are used for comparison since it is one of the very few techniques that have been used in real radar applications [17,32].

#### *4.1. Numerical Signatures with Point Scattering Centers*

In this example, the noisy signatures composed of 14 point scattering centers are generated as follows:

$$y(m,n) = \sum\_{k=1}^{14} a\_k \exp\left[ (j2\pi r\_{1k}/c + \beta\_{1k}) f\_c \theta\_m \right] \exp\left[ (j2\pi r\_{2k}/c + \beta\_{2k}) f\_n \right] + w(m,n) \tag{43}$$

where (*M*, *N*) = (41, 41); *ak* is set to 1; *fc* = 12 GHz and Δ*f* = 150 MHz; θ<sub>0</sub> = 0 rad and Δθ = 0.0125 rad; the coupled ranges are shown in Figure 1; all the damping factors β<sub>1*k*</sub> and β<sub>2*k*</sub> are set to −0.05/(*fc*Δθ) and −0.05/Δ*f*, respectively, except for the far left scattering point, for which (β<sub>11</sub>, β<sub>21</sub>) = (−0.02/(*fc*Δθ), −0.02/Δ*f*); *w*(*m*, *n*) denotes the additive white Gaussian noise.

**Figure 1.** Spatial distribution of the simulated poles.

As shown in Figure 1, these scattering points form a missile-like shape in Cartesian coordinates. The set contains 13 pole pairs that have repeated poles in either the down-range or aspect dimension. The noisy signatures in the spatial domain, obtained with a Fourier transform-based imaging algorithm (with a Taylor window), are displayed in Figure 2a,b. As can be seen, the positions and decay rates of these scattering points are consistent with the corresponding ranges and damping factors. To sparsely represent these noisy signatures, the parameter for 2-D ASSA is chosen as *L* = 20; the parameters for 2-D ESPRIT are set to (*P*, *Q*) = (20, 20) and β = 0.8, as suggested in [17]. For each signal-to-noise ratio (SNR), the number of Monte Carlo simulations is 200. Range estimation results for different SNRs are presented in Figure 3. Please note that the model order in 2-D ESPRIT is pre-specified because the singular value-based criteria [28,29] cannot be used when there are repeated poles in either the down-range or aspect dimension. Because of its different pairing strategy, *K*<sub>1</sub> and *K* in 2-D ASSA can be estimated by the eigenvalue sequences transform algorithm (Appendix A) when SNR > 10 dB. However, when the noise level is higher (SNR ≤ 10 dB), *K*<sub>1</sub> and *K* in 2-D ASSA should be pre-specified or determined by another criterion [30].

As can be seen in Figure 3a,b, the positions of the scattering points estimated by 2-D ESPRIT generally agree with the preset poles in Figure 1, except for some missing or incorrect points (marked by the black rectangles). A possible reason for this problem is that the pairing procedure in 2-D ESPRIT may provide incorrect pole pairs when there are repeated poles in either the down-range or aspect dimension, e.g., the six pole pairs (*s*1, *p*1), (*s*1, *p*2), (*s*1, *p*3), (*s*2, *p*1), (*s*2, *p*2), (*s*2, *p*3). In contrast, the results estimated by 2-D ASSA show higher accuracy than those of 2-D ESPRIT because of its different pairing strategy.

The estimation accuracy of the cross/down ranges and damping factors is displayed in Figure 4. The root mean square error (RMSE), defined in (44), is used as the evaluation indicator. As can be seen, the estimation accuracy of the two algorithms is basically at the same level, although the Hankel matrix used by 2-D ASSA is smaller than the block-Hankel matrix defined by 2-D ESPRIT.

**Figure 2.** Simulated radar image with different SNR by Fourier transform. (**a**) Simulated radar image with SNR = 5 dB; (**b**) Simulated radar image with SNR = 0 dB.

**Figure 3.** Estimation of the coupled ranges by 2-D ASSA and 2-D ESPRIT with different SNR. (**a**) Estimation by 2-D ESPRIT with SNR = 5 dB; (**b**) Estimation by 2-D ESPRIT with SNR = 0 dB; (**c**) Estimation by 2-D ASSA with SNR = 5 dB; (**d**) Estimation by 2-D ASSA with SNR = 0 dB.

**Figure 4.** Estimation accuracy of *r*11, *r*21, β<sup>11</sup> and β<sup>21</sup> with different SNR. (**a**) Estimation of *r*<sup>11</sup> with different SNR; (**b**) Estimation of *r*<sup>21</sup> with different SNR; (**c**) Estimation of β<sup>11</sup> with different SNR; (**d**) Estimation of β<sup>21</sup> with different SNR.

**Figure 5.** Estimation of the coupled ranges by 2-D ASSA with 2000 Monte Carlo runs (SNR = 0 dB).

$$\delta\_{RMSE} = 10 \log\_{10} \sqrt{\frac{1}{M\_{c}} \sum\_{t=1}^{M\_{c}} \left[ X\_{\text{est}} - X\_{\text{real}} \right]^2} \tag{44}$$

where *Mc* = 2000 denotes the number of Monte Carlo runs, and *X*est and *X*real are the estimated and real parameters.
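For reference, (44) amounts to the following small computation; the function name `rmse_db` and the four example estimates are assumptions chosen so that the result is easy to verify by hand.

```python
import numpy as np


def rmse_db(estimates, true_value):
    """RMSE over Monte Carlo runs, expressed as 10*log10 of the root mean square error."""
    err = np.abs(np.asarray(estimates) - true_value)
    return 10 * np.log10(np.sqrt(np.mean(err ** 2)))


# Four assumed runs that are each off by 0.1, so the RMSE is 0.1
value_db = rmse_db([1.1, 0.9, 1.1, 0.9], 1.0)
```

Because the error is reported on a logarithmic scale, every tenfold reduction of the RMSE lowers the indicator by 10.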

Figure 5 also presents the statistical result with 2000 Monte Carlo runs (SNR = 0 dB). The result demonstrates that the multiple-range search strategy used by 2-D ASSA is robust in pole-pairing at low SNR.

#### *4.2. Numerical Signatures of Computer-Aided Design (CAD) Model*

The numerical signatures are obtained using the method of moments (MoM), where the target is a computer-aided design (CAD) model of a sphere tipped cone-cylinder-frustum combination (shown in Figure 6). The data were calculated from 8 to 12 GHz with a frequency step of 10 MHz, with the view angle ranging from −5 to 5 deg in increments of 0.25 deg. Figure 7a presents the 2-D radar image processed using the 2-D FFT. As can be seen, strong scattering points are observed at differential discontinuities, such as the base-edge, the body groove, and the nosetip. To sparsely represent the numerical data, the parameters of this example are set as follows. For 2-D ASSA, *L* is set to *N*2. For 2-D ESPRIT, (*P*, *Q*) = (*M*2, *N*2), where *M*2 = *ceil*(*M*/2) denotes the smallest integer greater than or equal to *M*/2, and β = 0.8.

The location estimates of the key scattering points for the two algorithms are shown in Figure 7b–e. As can be seen, when the number of scattering centers is set to 14, all key scattering points are accurately estimated by both algorithms, except for some minor differences. However, when *K* is set to 18, 2-D ESPRIT encounters the pole-pairing problem and cannot provide the right estimation. The relative reconstruction error (RRE) δ*RRE* (defined in (45)) of the two algorithms is shown in Figure 8. The RRE of 2-D ASSA keeps falling as *K* increases from 10 to 30, whereas the RRE of 2-D ESPRIT stops falling after *K* = 14. The result shows that 2-D ASSA tends to be more robust in pole-pairing for different numbers of scattering centers.

$$\delta\_{RRE} = \frac{1}{MN} \sum\_{m=1}^{M} \sum\_{n=1}^{N} \left| 20 \log\_{10} [\mathbf{Y}\_{\text{recon}}(m, n) / \mathbf{Y}\_{\text{real}}(m, n)] \right| \tag{45}$$

where **Y***recon* denotes the signatures reconstructed from the estimated parameters and **Y***real* denotes the original signatures.
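A small sketch of (45) is given below. The function name `rre_db` and the test data are assumptions, and the ratio is taken between magnitudes, which is an interpretation made here because the signatures are complex-valued.

```python
import numpy as np


def rre_db(Y_recon, Y_real):
    """Mean absolute log-magnitude ratio between reconstructed and original signatures."""
    ratio = np.abs(Y_recon) / np.abs(Y_real)      # magnitudes: an assumption of this sketch
    return np.mean(np.abs(20 * np.log10(ratio)))


rng = np.random.default_rng(2)
Y_real = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))   # placeholder data
rre_same = rre_db(Y_real, Y_real)          # identical data
rre_scaled = rre_db(10 * Y_real, Y_real)   # every sample ten times too large
```

An RRE of zero therefore means the magnitudes match sample by sample, and a uniform tenfold amplitude error corresponds to an RRE of 20.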

**Figure 6.** CAD model of the sphere tipped cone-cylinder-frustum combination.

**Figure 7.** 2-D radar image of the CAD model and location estimation of main scattering points result by 2-D ESPRIT and 2-D ASSA. (**a**) 2-D radar image in aspect angle from −5 to 5 deg; (**b**) Pole estimation by 2-D ESPRIT(*K* = 14); (**c**) Pole Estimation by 2-D ASSA(*K* = 14); (**d**) Pole estimation by 2-D ESPRIT(*K* = 18); (**e**) Pole Estimation by 2-D ASSA(*K* = 18).

**Figure 8.** Relative reconstruction error of the numerical signatures of CAD model with different numbers of scattering centers.

Figure 9a displays the sub-band data image (8–10 GHz, from −5 to 5 deg), which was extracted from the full-band data. Compared to the full-band data image in Figure 7a, the scattering centers of the base-edge and the body groove cannot be distinguished when the signal bandwidth is only 2 GHz. Figure 9b presents the scattering points estimated by 2-D ASSA, and Figure 9c shows the full-band data image extrapolated from these extracted pole pairs. As can be seen, the scattering centers of the base-edge and the body groove are distinguished in the extracted scattering points. The full-band data image extrapolated by 2-D ASSA agrees with the main scattering centers distributed in the original full-band data image (Figure 7a). It is clear from these figures that 2-D ASSA results in higher-resolution images than traditional 2-D FFT imagery.

**Figure 9.** Sub-band data image and extrapolated data image by 2-D ASSA. (**a**) Sub-band data image by 2-D FFT (8–10 GHz); (**b**) Scattering points estimated by 2-D ASSA; (**c**) Full-band data image extrapolated by 2-D ASSA (8–12 GHz).

#### *4.3. Measured ISAR Signatures*

In the third example, the measured ISAR signatures are acquired in an indoor test range using a mock aircraft model. The data were collected using an S-band radar with a center frequency of 3 GHz and a bandwidth of 1.5 GHz. The aspect angle ranges from −4° to 4° with an interval of 0.1°. Figure 10a shows the ISAR image of the aircraft model generated by the 2-D FFT. As can be seen, the scattering distribution of this model is more complex than that of the previous model, i.e., many scattering points are densely distributed in the fuselage. The estimation parameter for 2-D ASSA is set to *L* = *N*2; for 2-D ESPRIT, (*P*, *Q*) = (*M*2, *N*2) and β = 0.8.

**Figure 10.** Pole-estimation result of the measured data with different *K*. (**a**) Measured ISAR image of the aircraft model by using 2-D FFT (**b**) Pole estimation by 2-D ESPRIT (*K* = 57); (**c**) Pole estimation by 2-D ASSA (*K* = 57); (**d**) Pole estimation by 2-D ESPRIT(*K* = 115); (**e**) Pole estimation by 2-D ASSA (*K* = 115).

**Figure 11.** Reconstruction result of the measured data with different *K*. (**a**) Reconstruction result by 2-D ESPRIT(*K* = 57); (**b**) Reconstruction result by 2-D ESPRIT (*K* = 115); (**c**) Reconstruction result by 2-D ESPRIT (*K* = 304); (**d**) Reconstruction result by 2-D ASSA(*K* = 57);(**e**) Reconstruction result by 2-D ASSA(*K* = 115); (**f**) Reconstruction result by 2-D ASSA(*K* = 304); (**g**) Reconstruction result by 2-D ASSA(*K* = 507); (**h**) Reconstruction result by 2-D ASSA(*K* = 607).

The location estimates of the main scattering centers for the measured data are shown in Figure 10b–e. As can be seen, both 2-D ESPRIT and 2-D ASSA estimate the correct positions of the strong scattering centers. Because of the different pole-pairing strategies, the scattering points estimated by 2-D ASSA are more densely distributed around the strong scattering centers than those estimated by 2-D ESPRIT. A few relatively weak scattering points, e.g., the red points in the rectangular box, are wrongly estimated by 2-D ESPRIT when the number of scattering points is set to 115, and more wrongly estimated pole pairs appear for *K* > 115, which are not displayed in Figure 10.

Figure 11a–h display a set of 2-D images reconstructed from the estimated pole pairs. Please note that the number of scattering points *K* in 2-D ESPRIT should be equal to the model order, whereas *K* in 2-D ASSA can be larger than the model order *K*<sub>1</sub> (the number of non-repeated poles in the down-range dimension) when there are one-to-multiple pairing poles. This is why *K* can be set to 607, which already exceeds the model order limit in [17]. As can be seen from the figures, both algorithms can reconstruct the strong scattering centers of the target, which are consistent with the estimated poles in Figure 10a–d. As displayed in Figure 11f–h, more and more relatively weak scattering centers are reconstructed by 2-D ASSA as *K* increases, and the reconstruction result with a compression ratio of 91.04% is extremely similar to the original data in Figure 10a.

The evaluation of the reconstructed results by 2-D ASSA is listed in Table 2. The compression ratio (CR) ε*CR* and the image similarity degree (ISD) are defined in (46) and (47). The RRE δ*RRE* is used for evaluating the reconstructed result in the frequency domain, whereas the ISD γ*ISD* is used for the result in the image domain. From the table, we can see that the number of scattering points in 2-D ASSA is negatively correlated with RRE (or ISD). In contrast, 2-D ESPRIT yields a higher RRE than 2-D ASSA when the number of scattering centers remains the same. Please note that part of the evaluation results of 2-D ESPRIT are abnormal because 2-D ESPRIT cannot produce a correct reconstruction when *K* > 307. A possible reason is that, for 2-D ESPRIT, all amplitude factors [*a*<sub>1</sub> *a*<sub>2</sub> ... *a*<sub>*K*</sub>] (defined in (5)) are estimated together by the least squares technique after all the pole pairs are confirmed. Thus, numerous incorrectly paired poles estimated by 2-D ESPRIT lead to an entirely wrong reconstruction. In comparison, 2-D ASSA is more robust for the measured data.

$$
\varepsilon\_{\rm CR} = \frac{M \times N - n\_{\rm P} \times K}{M \times N} \times 100\% \tag{46}
$$

$$\gamma\_{ISD} = \frac{\sum \mathbf{I}\_{recon} \mathbf{I}\_{real}}{\sqrt{\sum \mathbf{I}\_{recon}^2 \sum \mathbf{I}\_{real}^2}} \tag{47}$$

where *n*<sub>P</sub> = 3 represents the number of parameters of each paired pole (*ak*, *sk*, *pk*), *K* is the number of scattering points, and *M* × *N* denotes the size of the original signatures. **I***recon* and **I***real* represent the reconstructed and original 2-D image-domain data.
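The two metrics in (46) and (47) can be sketched as follows. The function names are assumptions, and the ISD is computed on image magnitudes, which is an interpretation made for this sketch. In the usage example, *M* = 81 follows from the −4° to 4° sweep in 0.1° steps, while *N* = 251 is an assumed number of frequency samples that is not stated in this section; with these values, *K* = 607 gives a compression ratio of about 91.04%, matching the figure quoted above.

```python
import numpy as np


def compression_ratio(M, N, K, n_p=3):
    """Percentage of data saved by storing n_p parameters for each of K scattering points."""
    return (M * N - n_p * K) / (M * N) * 100.0


def image_similarity(I_recon, I_real):
    """Normalized correlation between two image-domain magnitude arrays."""
    I_recon, I_real = np.abs(I_recon), np.abs(I_real)     # magnitudes: an assumption
    return np.sum(I_recon * I_real) / np.sqrt(np.sum(I_recon ** 2) * np.sum(I_real ** 2))


cr = compression_ratio(M=81, N=251, K=607)    # N = 251 is an assumed value

rng = np.random.default_rng(3)
image = rng.random((8, 8))                    # placeholder image
isd_scaled = image_similarity(2.0 * image, image)
```

The ISD is insensitive to a global scale factor, so it measures how well the spatial distribution of energy is preserved rather than the absolute level.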


**Table 2.** Evaluation of the reconstruction result of the measured data by 2-D ESPRIT and 2-D ASSA.

The running times of the processed examples are listed in Table 3. All the results were obtained with Matlab (R2016b) on the same hardware platform: an Intel Core i7-6900k at 3.2 GHz with 128 GB of memory. Please note that the running time of 2-D ESPRIT does not include the estimation of the model order. According to the table, 2-D ASSA is more efficient than 2-D ESPRIT in processing the same data, owing to the following differences.

(1) The size of the augmented Hankel matrix used by 2-D ASSA is approximately 1/(*M*/4) of that of the block-Hankel matrix used by 2-D ESPRIT when *L* = *Q* = *N*/2 and *P* = *M*/2. Moreover, the data size of the Hankel matrix can be further reduced by using the operations (22)–(24);

(2) No extra pairing scheme is required, because the multiple-range search strategy finishes the pairing process once the pole estimation of the aspect dimension is confirmed. Moreover, the calculation of the block-Hankel matrix for the aspect dimension in 2-D ESPRIT is simplified to calculating multiple small Hankel matrices of size *M*/2 × (*M* − *M*/2 + 1).


**Table 3.** Running time of the numerical radar signatures and the measured ISAR data by 2-D ESPRIT and 2-D ASSA.

#### **5. Conclusions**

In this paper, we have presented a two-dimensional augmented state–space approach for sparse representation of 2-D wideband radar signatures collected on man-made targets. To do this, a two-step procedure, i.e., an augmented state–space approach followed by a multiple-range search strategy, is proposed to estimate the complex amplitudes and poles in the down-range and aspect dimensions. This paper makes two main contributions. First, we develop a computationally efficient approach by adopting several time-saving operations, while the pole-estimation accuracy remains at the same level; second, the proposed approach can alleviate the pole-pairing problem by using the multiple-range search strategy.

Numerical as well as measured ISAR data are processed to validate the proposed approach. Experimental results demonstrate that 2-D ASSA is robust and accurate in pole-pairing for different SNRs, and is applicable to sparse representation of 2-D wideband radar signatures collected on man-made targets with low computational cost.

Future work is planned as follows. On the one hand, the physical meanings of the scattering-center parameters extracted by 2-D ASSA might be further studied for possible applications in automatic target recognition (ATR). On the other hand, the extension of the proposed approach to 3-D radar signatures obtained from different azimuth and elevation angles would also be an interesting study.

**Author Contributions:** Methodology, K.W.; validation, K.W. and X.X.; writing—original draft preparation, K.W.; writing—review and editing, K.W. and X.X.

**Funding:** This research was funded by National Natural Science Foundation of China: 61371005.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

In the appendix, we provide an eigenvalue sequences transform algorithm to estimate the model order *K*1. Details are as follows.

Suppose that the input matrix **H** has size *m* × *n* (*m* ≥ *n*), and compute the SVD of the matrix **H**<sup>∗</sup>**H**:

$$\mathbf{H}^\* \mathbf{H} = \mathbf{V} \mathfrak{R}^2 \mathbf{V}^\* \tag{A1}$$

where **V** is a unitary matrix and the middle factor in (A1) is the diagonal matrix.

Λ = [λ1, λ2, ... , λ*n*] is the diagonal vector of the diagonal matrix in (A1). It can be normalized as:

$$w\_i = \frac{\lambda\_i - \min(\Lambda)}{\max(\Lambda) - \min(\Lambda)} \tag{A2}$$

where *i* = 1, 2, ... , *n*; max(Λ) and min(Λ) denote the maximum and minimum elements of Λ.

The coordinate form of the normalized eigenvalue vector is:

$$[Q\_1, Q\_2, \dots, Q\_n] = [(1, w\_1), (2, w\_2), \dots, (n, w\_n)] \tag{A3}$$

where *Qn* denotes the *n*-th point.

Based on information theory and principal factor analysis [28], the separable signals and noises in **H** are often assumed to satisfy the following hypothesis.

$$\underbrace{w\_1 > \dots > w\_{K\_1}}\_{\text{signal space}} > \underbrace{w\_{K\_1+1} > \dots > w\_n}\_{\text{noise space}} \tag{A4}$$

$$
w\_{K\_1} - w\_{K\_1+1} > w\_{K\_1+1} - w\_{K\_1+2} \approx \dots \approx w\_{n-1} - w\_n \tag{A5}
$$

where *K*<sup>1</sup> is the number of separable signals which also denotes the model order.

From (A5), it can be deduced that the point *QK*1 belonging to [*Q*1, *Q*2, ... , *Qn*] has the longest distance to the line *Q*1*D*, which satisfies the following condition.

$$k\left(\overrightarrow{Q\_i Q\_{K\_1+1}}\right) > k\left(\overrightarrow{Q\_1 D}\right) > k\left(\overrightarrow{Q\_{K\_1+1} Q\_n}\right) \tag{A6}$$

where *i* = 1, 2, ... , *K*1; *k*(*QiQK*1+1) denotes the slope of the line *QiQK*1+1; and *D* is a point which satisfies the slope condition (A6).

Here, an approximate calculation is used to estimate the slope of the line *Q*1*D*.

$$k\left(\overrightarrow{Q\_1 D}\right) \approx k\left(\overrightarrow{Q\_{p-1} Q\_n}\right) \tag{A7}$$

$$g(p) = \max[g(1), g(2), \dots, g(n)] \tag{A8}$$

$$g(i) = \frac{\left| k\left(\overrightarrow{Q\_1 Q\_n}\right)(i - 1) - (w\_i - w\_1) \right|}{\sqrt{k\left(\overrightarrow{Q\_1 Q\_n}\right)^2 + 1}} \tag{A9}$$

where *Qp* represents the point in [*Q*1, *Q*2, ... , *Qn*] which has the longest distance to the line *Q*1*Qn*; *g*(*i*) denotes the distance from point *Qi* to the line *Q*1*Qn*; *i* = 1, 2, ... , *n*.

Thus, *QK*1 can be confirmed by searching for the longest distance from point *Qi* to the line *Q*1*D*.

$$g'(i) = \frac{\left| k\left(\overrightarrow{Q\_1 D}\right)(i - 1) - (w\_i - w\_1) \right|}{\sqrt{k\left(\overrightarrow{Q\_1 D}\right)^2 + 1}} \tag{A10}$$

$$g'(K\_1 + 1) = \max[g'(1), g'(2), \dots, g'(n)] \tag{A11}$$

where *i* = 1, 2, ... , *n*; *g*′(*i*) denotes the distance from point *Qi* to the line *Q*1*D*.
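The procedure of (A2)–(A11) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the eigenvalues are assumed to be sorted in descending order, and, following (A11), the index of the largest distance is interpreted as *K*1 + 1.

```python
import numpy as np

def gram_eigenvalues(H):
    """Eigenvalues of H* H in descending order (squared singular values of H)."""
    return np.linalg.svd(np.asarray(H), compute_uv=False) ** 2

def estimate_model_order(eigenvalues):
    """Estimate K1 from descending eigenvalues, following (A2)-(A11)."""
    lam = np.asarray(eigenvalues, dtype=float)
    n = lam.size
    w = (lam - lam.min()) / (lam.max() - lam.min())       # normalization (A2)
    idx = np.arange(1, n + 1)                             # abscissas of Q_1..Q_n (A3)

    def distance_to_line(slope):
        # Distance from each Q_i to the line through Q_1 with the given slope
        return np.abs(slope * (idx - 1) - (w - w[0])) / np.sqrt(slope**2 + 1)

    k_1n = (w[-1] - w[0]) / (n - 1)                       # slope of line Q_1 Q_n
    p = int(np.argmax(distance_to_line(k_1n))) + 1        # (A8)-(A9), 1-based index
    p_prev = max(p - 1, 1)                                # guard for the edge case p = 1
    k_1d = (w[-1] - w[p_prev - 1]) / (n - p_prev)         # approximation (A7)
    knee = int(np.argmax(distance_to_line(k_1d))) + 1     # (A10)-(A11): knee = K1 + 1
    return knee - 1
```

For example, `estimate_model_order(gram_eigenvalues(H))` would return the estimated order for a data matrix `H`; the input must contain at least two distinct eigenvalues for the normalization to be defined.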

#### **Appendix B**

From the observability matrix **Ω** given by (20), the controllability matrix **Γ** given by (21) and the matrices **Ω***rf*, **Ω***rl*, **Γ***cf*, **Γ***cl* in (27)–(30), it is not difficult to deduce that the augmented open-loop matrix **P** satisfies the following matrix equations.

$$
\boldsymbol{\Omega}\_{rf} \mathbf{P} = \boldsymbol{\Omega}\_{rl} \tag{A12}
$$

$$\mathbf{P} \boldsymbol{\Gamma}\_{cf} = \boldsymbol{\Gamma}\_{cl} \tag{A13}$$

Then the augmented open-loop matrix **P** can be obtained by least squares.

$$\mathbf{P} = \left(\boldsymbol{\Omega}\_{rf}^{\*}\boldsymbol{\Omega}\_{rf}\right)^{-1}\boldsymbol{\Omega}\_{rf}^{\*}\boldsymbol{\Omega}\_{rl} \tag{A14}$$

or

$$\mathbf{P} = \Gamma\_{cl} \Gamma\_{cf}^\* \left(\Gamma\_{cf} \Gamma\_{cf}^\*\right)^{-1} \tag{A15}$$
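The two least-squares solutions can be checked numerically with a short sketch. The definitions in (27)–(30) are not reproduced in this appendix, so the code assumes the usual shift-invariance convention: **Ω***rf* and **Ω***rl* are **Ω** without its last and first block row, and **Γ***cf* and **Γ***cl* are **Γ** without its last and first block column. The function names and the toy two-pole system are hypothetical.

```python
import numpy as np

def open_loop_from_observability(Omega, rows_per_block=1):
    """(A14): least-squares solution of Omega_rf P = Omega_rl."""
    Omega_rf = Omega[:-rows_per_block, :]   # assumed: Omega without last block row
    Omega_rl = Omega[rows_per_block:, :]    # assumed: Omega without first block row
    # For full column rank, pinv(A) equals (A* A)^-1 A*
    return np.linalg.pinv(Omega_rf) @ Omega_rl

def open_loop_from_controllability(Gamma, cols_per_block=1):
    """(A15): least-squares solution of P Gamma_cf = Gamma_cl."""
    Gamma_cf = Gamma[:, :-cols_per_block]   # assumed: Gamma without last block column
    Gamma_cl = Gamma[:, cols_per_block:]    # assumed: Gamma without first block column
    # For full row rank, pinv(A) equals A* (A A*)^-1
    return Gamma_cl @ np.linalg.pinv(Gamma_cf)

def demo():
    """Build a noise-free two-pole system and recover P both ways."""
    poles = np.array([0.95 * np.exp(0.4j), 0.90 * np.exp(-1.1j)])
    P = np.diag(poles)
    S = np.ones((1, 2), dtype=complex)
    D = np.array([[1.0], [0.5]], dtype=complex)
    Omega = np.vstack([S @ np.linalg.matrix_power(P, i) for i in range(8)])
    Gamma = np.hstack([np.linalg.matrix_power(P, i) @ D for i in range(8)])
    return P, open_loop_from_observability(Omega), open_loop_from_controllability(Gamma)
```

In the noise-free case both routes return the same **P**; with noisy data they generally differ, which is why the choice between (A14) and (A15) matters.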

Moreover, there are two ways to compute the corresponding constant matrices **S** and **D**. In the first way, for matrix **P** computed by (A14), the corresponding matrix **S** is defined as the first *M* rows of the observability matrix **Ω**.

$$\mathbf{S} = \begin{bmatrix} \Omega(1,1) & \cdots & \Omega(1,K\_1) \\ \vdots & \vdots & \vdots \\ \Omega(M,1) & \cdots & \Omega(M,K\_1) \end{bmatrix} \tag{A16}$$

Using (A14) and (A16), we calculate the matrix **Ω̃**:

$$\widetilde{\boldsymbol{\Omega}} = \begin{bmatrix} \mathbf{S} & \mathbf{S}\mathbf{P} & \dots & \mathbf{S}\mathbf{P}^{N-1} \end{bmatrix} \tag{A17}$$

From (15) and (A17), the original 2-D signatures **Y** can be written as

$$
\mathbf{Y} = \widetilde{\boldsymbol{\Omega}} \mathbf{D} \tag{A18}
$$

By using least squares again, the matrix **D** is

$$\mathbf{D} = \left(\widetilde{\boldsymbol{\Omega}}^{\*}\widetilde{\boldsymbol{\Omega}}\right)^{-1}\widetilde{\boldsymbol{\Omega}}^{\*}\mathbf{Y} \tag{A19}$$

In the second way, for matrix **P** computed by (A15), the corresponding matrix **D** is defined as

$$\mathbf{D} = \begin{bmatrix} \Gamma(1,1) \\ \vdots \\ \Gamma(K\_1,1) \end{bmatrix} \tag{A20}$$

Using (A15) and (A20), we calculate the matrix **Γ̃**:

$$
\widetilde{\boldsymbol{\Gamma}} = \begin{bmatrix} \mathbf{D} & \mathbf{P} \mathbf{D} & \dots & \mathbf{P}^{N-1} \mathbf{D} \end{bmatrix} \tag{A21}
$$

Similar to (A18), the original 2-D signatures **Y** can also be rewritten as

$$\mathbf{Y} = \mathbf{S}\widetilde{\boldsymbol{\Gamma}} \tag{A22}$$

By using least squares, the matrix **S** is

$$\mathbf{S} = \mathbf{Y} \widetilde{\boldsymbol{\Gamma}}^{\*} \left( \widetilde{\boldsymbol{\Gamma}} \widetilde{\boldsymbol{\Gamma}}^{\*} \right)^{-1} \tag{A23}$$

In this appendix, we provide two ways to estimate the parameter matrices (**S**, **P**, **D**). For robustness to noise, (A23), (A14) (or (A15)) and (A19) are preferred. It is also worth mentioning that 2-D ASSA has two ways to reconstruct the original signatures **Y**: the matrix form (**S**, **P**, **D**) and the pole form (*ak*, *sk*, *pk*). The matrix form can provide a more accurate reconstruction result than the pole form, but with many more parameters.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **On the Slow-Time** *k***-Space and its Augmentation in Doppler Radar Tomography**

**Hai-Tan Tran 1,\*, Emma Heading 1,2 and Brian W.-H. Ng <sup>2</sup>**


Received: 16 November 2019; Accepted: 11 January 2020; Published: 16 January 2020

**Abstract:** Doppler Radar Tomography (DRT) relies on spatial diversity from rotational motion of a target rather than spectral diversity from wide bandwidth signals. The slow-time *k*-space is a novel form of the spatial frequency space generated by the relative rotational motion of a target at a single radar frequency, which can be exploited for high-resolution target imaging by a narrowband radar with Doppler tomographic signal processing. This paper builds on a previously published work and demonstrates, with real experimental data, a unique and interesting characteristic of the slow-time *k*-space: it can be augmented by signal processing to significantly enhance imaging resolution. High resolution can reveal finer details in the image, providing more information to identify unknown targets detected by the radar.

**Keywords:** slow-time *k*-space; spatial frequency; Doppler radar tomography; radar imaging; *k*-space augmentation; high-resolution narrowband radar

#### **1. Introduction**

Tomography is a general imaging technique that is based on lower-dimensional projections of an object from different spatial aspects, which are then processed using the projection-slice theorem [1] to reconstruct an image of the object. Radar tomography uses reflective scattering phenomenology and radar waveforms for the measurements, which may be wideband or narrowband. Wideband waveforms exploit *spectral* diversity as system resources to facilitate radar imaging and have probably been the most exploited resources in practical applications in the last few decades. The well-known synthetic aperture radar (SAR) and inverse SAR (ISAR) imaging techniques may be described as two special forms of wideband tomography, in which another system resource—*spatial* diversity—is exploited only minimally [2]. Range-Doppler ISAR imaging, and stripmap SAR in particular, typically involve aspect angle changes of a few degrees [3–5]. This constraint of small rotation angles in the linear phase regimes allows the image inversion processing to take advantage of the computationally efficient fast Fourier transform (FFT) without needing signal interpolation onto rectangular grids.

Spotlight SAR makes use of wider angles [6], while circular SAR [7] may coherently process up to a complete cycle of target aspect rotation, with sophisticated and precise motion compensation in range. More notably, in the associated spatial frequency spaces, also known as *k*-spaces [2], traditionally intensive interpolation processing prior to image inversion processing may be necessary. Nevertheless, these forms of SAR and ISAR rely on the bandwidth resource to achieve high down-range resolution, and so can be considered as belonging to the category of 'wideband radar tomography'.

Radar tomographic imaging with ultra-narrowband or single-frequency waveforms relies on spatial diversity as the only system resource for image formation [8–11]. Spatial diversity may be realized by: (i) having a radar with multiple receivers looking at the target from diverse angular locations, the received signals from which are processed coherently, or (ii) using a single receiver looking at a target undergoing relative rotational motion, i.e., changing target aspect. Both cases widen the angular extents of the measurement support of the received signal in the *k*-spaces.

Previous work [2] showed that narrowband radar tomography can be most effectively formulated in the *slow-time k*-space in conjunction with the classical Doppler processing and Doppler radar tomography (DRT) [12,13]. The DRT algorithm applies the projection-slice theorem in which the inputs of the target's cross-range projections are formed from Doppler profiles. The slow-time *k*-space is not only convenient for describing the DRT algorithm, it is also a natural tool to formulate high-resolution DRT imaging with an *augmentation* of its measurement support. Augmentation is the process of significantly enlarging the support of the slow-time *k*-space by using longer coherent processing intervals (CPI) in the DRT algorithm and correcting for nonlinear phase effects due to strong rotational motion. This 'augmentability' is a unique characteristic of the slow-time *k*-space.

The introduction of nonlinear phase terms in the *k*-space augmentation causes a blurring effect in the resulting image. Spectral compression techniques for chirped signals can be used to address this problem using bilinear transforms such as the Wigner-Ville Distribution (WVD), the Cohen's class and the time-frequency distribution series (TFDS) as discussed in [14]. The problem with these techniques is the presence of undesirable cross terms when instantaneous component frequencies may overlap, which is the case for DRT imaging [13]. The combination of the fractional Fourier transform and S-method was used to overcome the problem of cross terms in [13], which demonstrated the slow-time *k*-space augmentation with DRT. The current work extends this to a novel technique based on orthogonal matching pursuit (OMP), inspired by related work in compressive sensing.

Radar imaging naturally is suited to compressive sensing techniques, given that real targets often resemble a sparse collection of discrete point scatterers [5,15]. OMP is fundamentally a technique for parameter estimation by matching a given signal to a dictionary of possible elemental functions spanning a finite parameter space. The dictionary is designed for the particular application and has been applied to the area of DRT imaging in varying contexts [16–18]. The particular application in this work is to estimate the non-linear phase term in the radar signal to reduce the image blurring for improved resolution. The main contribution of the paper is two-fold: to highlight the augmentability of the slow-time *k*-space as a fundamentally useful characteristic for narrowband radar imaging, and to present a novel application of the OMP technique to such augmentation processing.

The slow-time *k*-space processing technique as presented in this paper provides a complementary approach to traditional high-resolution ISAR imaging. The wide bandwidth signals on which ISAR imaging depends for high resolution are not always readily achievable within the confines of the available spectrum and limitations at lower frequency bands [9,19]. The proposed high-resolution imaging scheme can lead to improved target recognition despite an absence of wide bandwidth signals, provided sufficient spatial diversity is available. This is an important capability of great interest to the radar research community [20].

The rest of the paper is organized as follows. The next Section summarizes the fundamental theory: system geometry and signal model, cross-range bandwidth and resolution, and DRT. Section 3 describes the slow-time *k*-space and its augmentation with OMP processing. Section 4 describes the experimental setup using simple point scatterers on a rotating turntable with imaging results for both standard and augmented DRT processing. The final Section presents some relevant discussion points and concluding remarks.

#### **2. Background**

This Section defines the signal model, the fundamental concept of cross-range bandwidth and resolution, and summarizes the known theory of Doppler radar tomography (DRT) in its standard version.

#### *2.1. Signal Model*

Consider a monostatic radar system geometry as illustrated in Figure 1. Without loss of generality, an inertial local target reference frame, denoted as *T<sup>x</sup>* with origin *O* at the target's nominal centre of rotation, is chosen to have the *x*<sup>2</sup> ('down-range') axis aligned with the radar line of sight (LOS), with the *x*<sup>1</sup> axis denoting cross-range. The plane (*x*1, *x*2) is often known as the image projection plane (IPP, or just 'image plane'). The axis orthogonal to the IPP is denoted as the *x*3-axis (sometimes referred to as 'height'). The target's *effective* rotation vector **Ω***<sup>e</sup>* is defined as the projection of the target's total rotational velocity vector **Ω** along the *x*3-axis.

**Figure 1.** Imaging system geometry: The (*x*1, *x*2, *x*3) coordinates are defined in the local target frame, *Tx*. For clarity, a single point scatterer is shown at *xm*, which rotates around origin *O* with velocity Ω. Targets are modeled as a discrete, distributed collection of similar point scatterers.

Using the definition above, the total rotational velocity vector can be written in *T<sup>x</sup>* as

$$
\boldsymbol{\Omega} = (0, \Omega\_2, \Omega\_e). \tag{1}
$$

Physically, Ω*<sup>e</sup>* introduces cross-range dependent Doppler shifts in the radar backscatter and is the principal reason that motion-based target imaging is possible. In comparison, Ω<sup>2</sup> has minimal (sometimes deleterious) impact on radar imaging. For non-cooperative targets, neither Ω*<sup>e</sup>* nor Ω2, nor the orientation of the IPP itself, is known a priori. In this paper, we further assume that Ω<sup>2</sup> = 0, and that Ω*<sup>e</sup>* is approximately constant during a coherent processing interval (CPI).

For this paper, we use an idealized point-scatterer model for the target: it is adequately modeled as an ensemble of *M* point scatterers with reflectivity coefficients *σm*, located in the far field of the radar. The approximate range to the *m*th point-scatterer on the target with position vector *xm*, defined relative to *O*, can be defined as

$$R(\mathbf{x}\_m) \approx \mathbf{R}(\mathbf{x}\_m) \cdot \mathbf{i}\_{LOS} = R\_0 + r\_m \tag{2}$$

in which *R*(*xm*) is the range vector to the *m*th scatterer, *R*<sup>0</sup> is the radar range to *O*, the scatterer's local down range is

$$r\_m = \mathbf{x}\_m \cdot \mathbf{i}\_{LOS}, \tag{3}$$

and *iLOS* = (0, 1, 0) is the unit vector along the radar LOS in the *T<sup>x</sup>* frame.

Formulation in the *T<sup>x</sup>* frame is appropriate in traditional ISAR imaging where the change of aspect is small (a few degrees), or for signal analysis within a relatively short CPI. In contrast, radar tomography exploits spatial diversity through wide changes of target aspect. For this formulation, a second, dynamic local target frame denoted as *Ty*, is needed. This frame rotates with the target and coincides with *T<sup>x</sup>* at a reference time, usually assumed to be *tk* = 0. By the Ω<sup>2</sup> = 0 assumption, it follows also that *T<sup>x</sup>* and *T<sup>y</sup>* share the same *x*3-axis. The reason for choosing *T<sup>y</sup>* frame is that its axes are aligned with those of the underlying *k*-spaces and thus preserves angles across the *T<sup>y</sup>* frame and the *k*-spaces.

Let

$$s\_T(t\_k; f) \propto \exp\left\{j2\pi ft\_k\right\},$$

denote the simple transmit continuous waveform at a single frequency *f* , where only the slow time *tk* is involved; there are no pulses and hence no 'fast time' spanning a pulse. The slow-time index is *k* = 0, 1, 2, ... , *K* − 1, and we assume a total of *K* time samples in a CPI. The received signal *sR*(*tk*; *f*) is a delayed version of *sT*(*tk*; *f*), summed over all scatterers,

$$s\_R(t\_k; f) \propto \exp\left\{-j4\pi f \frac{R\_0(t\_k)}{c}\right\} \sum\_{m=1}^M \sigma\_m \exp\left\{-j\frac{4\pi f}{c} r\_m(t\_k)\right\}.\tag{4}$$

Here, we have also assumed that radar hardware perfectly removes the carrier frequency term exp{*j*2*π f tk*}. The first factor in (4) describes translational motion of the target as a whole; the second factor captures the target geometry and scattering reflectivities to be processed for imaging.

Furthermore, we shall assume a linear translational motion model for the target,

$$R\_0(t\_k) = R\_0(0) + \nu \, t\_k, \tag{5}$$

where *ν* is the velocity, assumed known prior to DRT processing, and *R*0(0) is target range at a reference time *tk* = 0.

#### *2.2. Cross-Range Bandwidth and Resolution*

The position of each scatterer executing rotational motion with rotation vector **Ω** is described to a second order approximation by

$$\mathbf{x}(t\_k) = \mathbf{x}\_0 + (\boldsymbol{\Omega} \times \mathbf{x}\_0)t\_k - \frac{1}{2} [\boldsymbol{\Omega}^2 \mathbf{x}\_0 - (\boldsymbol{\Omega} \cdot \mathbf{x}\_0)\boldsymbol{\Omega}]t\_k^2,$$

where *x*<sup>0</sup> ≡ *x*(0) for convenience. Relative to *T<sup>x</sup>*, the local down range *rm* in (4) can be expressed as

$$r\_m(t\_k) = x\_{m\_2} + x\_{m\_1} \Omega\_e t\_k - \frac{1}{2} x\_{m\_2} \Omega\_e^2 t\_k^2 + \cdots,\tag{6}$$

where *xm*<sup>1</sup>, *xm*<sup>2</sup> are the initial (*tk* = 0) cross range and range, respectively, of the *m*-th scatterer in *T<sup>x</sup>*. The CPI duration is denoted by *TCPI*. As has been thoroughly discussed in [2], although *x*<sup>2</sup> (dropping the subscript *m* for brevity) cannot be directly estimated with a zero-bandwidth signal, the *first*-order term of (6) suggests that a so-called *cross-range bandwidth*,

$$B\_{\perp} = f \, \Omega\_e T\_{CPI} = f \, \Delta \theta,\tag{7}$$

can be used to estimate cross range *x*1. In other words, the target's rotation generates an effective bandwidth which allows cross-range measurements to be resolved, as long as the rotation angle through *TCPI*,

$$
\Delta\theta = \Omega\_e \, T\_{CPI},
$$

is sufficiently small such that higher-order terms (quadratic and above) in (6) can be ignored. In practice, the Δ*θ* is limited to a few degrees, which is consistent with wideband ISAR imaging. Note that the presence of the (unknown) zeroth-order term *xm*<sup>2</sup> means *x*<sup>1</sup> cannot be directly estimated from the time-domain signal. Doppler tomography, as formulated below, overcomes such constraints to achieve target imaging.

Consider a segmented CPI of the received signal *sR*(*tk*, *l*) as illustrated in Figure 2. Taking a Fourier transform over *tk* produces a Doppler profile *SR*(*fd*) = F {*sR*(*tk*, *l*)}, with zero Doppler (*fd* = 0) corresponding to the centre of rotation at *O* (ignoring any residual translational motion after preprocessing). For a segment of duration *TCPI*, the achievable Doppler resolution is

$$
\Delta f\_d = \frac{1}{T\_{CPI}} = \frac{f \,\Omega\_e}{B\_\perp}. \tag{8}
$$

**Figure 2.** Illustration of the DRT narrowband imaging algorithm. *ks*2 and *ks*1 are the components of the slow-time *k*-vector *k<sup>s</sup>* aligned with the target's initial range and cross-range directions, respectively. Each radial line represents the slow-time *k*-space samples obtained from one segmented CPI.

Each Doppler profile contains contributions from all scatterers, with the down-range coordinates *xm*<sup>2</sup> encoded as constant phase terms. Since the cross-range of a scatterer is directly proportional to its Doppler frequency *fd*, namely

$$x\_1 = \frac{\lambda}{2\Omega\_e} f\_d, \tag{9}$$

it follows that the *magnitude* of the Doppler profile,

$$p\_{\theta}(\mathbf{x}) = |S\_R(f\_d)|,$$

represents a cross-range projection of the target's reflectivity function at angle *θ*, the average aspect angle over the CPI. The achievable *cross-range resolution* is

$$
\Delta x\_1 = \frac{\lambda}{2\Omega\_e} \,\Delta f\_d = \frac{c}{2B\_\perp}.\tag{10}
$$

This expression is exactly analogous to the down-range resolution Δ*x*<sup>2</sup> = *c*/2*B* for wideband imaging with spectral bandwidth *B*.
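The relations (7)–(10) can be illustrated numerically. The sketch below is a minimal example under stated assumptions: a single point scatterer, only the first-order phase term of (6), and hypothetical function names and parameter values. It computes *B*<sup>⊥</sup> and Δ*x*<sup>1</sup>, then recovers the cross range of the scatterer from the peak of its Doppler profile.

```python
import numpy as np

C = 299_792_458.0  # speed of light (m/s)

def cross_range_resolution(f, omega_e, t_cpi):
    """Cross-range bandwidth (7) and cross-range resolution (10)."""
    b_perp = f * omega_e * t_cpi
    return b_perp, C / (2.0 * b_perp)

def estimate_cross_range(x1, x2, f, omega_e, t_cpi, K):
    """Simulate one scatterer (first-order phase only) and locate it via (9)."""
    lam = C / f
    t = np.arange(K) * (t_cpi / K)                  # slow-time samples t_k
    r = x2 + x1 * omega_e * t                       # first-order local down range, (6)
    s = np.exp(-1j * 4.0 * np.pi * f / C * r)       # one term of the sum in (4)
    fd = np.fft.fftfreq(K, d=t_cpi / K)             # Doppler axis (Hz)
    f_peak = fd[np.argmax(np.abs(np.fft.fft(s)))]   # peak of the Doppler profile
    # With the exp(-j...) convention of (4), a positive x1 yields a negative
    # Doppler frequency, so the sign is flipped when applying (9).
    return -lam * f_peak / (2.0 * omega_e)
```

For instance, with *f* = 10 GHz, Ω*<sup>e</sup>* = 0.1 rad/s and *TCPI* = 0.5 s, the rotation angle is 0.05 rad, the cross-range bandwidth is 500 MHz and the cross-range resolution is about 0.3 m.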

#### *2.3. Doppler Radar Tomography (DRT)*

The *Projection-Slice Theorem* (PST) states that the Fourier transform *P<sup>θ</sup>* (*fs*⊥) of projection *p<sup>θ</sup>* (*x*) is a slice of the 2D FT of the target's reflectivity function at aspect angle *θ*. This theorem can be used to invert the cross-range profiles accumulated from a range of aspect angles *θ<sup>l</sup>* to recover the target reflectivity function, i.e., estimate the scatterer coordinates *xm*<sup>1</sup> and *xm*<sup>2</sup> in the *T<sup>y</sup>* frame. For this to be effective, the target's rotation must subtend a significant change in aspect angles; the 1D cross-range projections are computed in the frequency domain as discussed above, after which the target reflectivity function (image) can be reconstructed by a 2D inverse FT.

#### 2.3.1. The Monostatic DRT Algorithm

To perform radar imaging using the DRT method, it is necessary to populate the slow-time *k*-space from the radar backscatter. The algorithm to generate the slow-time *k*-space samples consists of the following steps:


$$p\_{\theta\_l}(\mathbf{x}) = \left| \mathcal{F} \{ s\_R(t\_k, l) \exp(j2\pi \nu t\_k) \} \right| \,, \tag{11}$$

is the cross-range (which is proportional to Doppler) profile for the target at an angle *θ<sup>l</sup>* from its original orientation. Accumulate all such cross-range profiles for all the corresponding aspect angles *θl*, i.e., for all *L* segmented CPIs.

c. *Populating the k-space:* The spatial Fourier transform of *pθ<sup>l</sup>* (*x*)

$$P\_{\theta\_l}(f\_{s\perp}) = \mathcal{F}\{p\_{\theta\_l}(\mathbf{x})\}\tag{12}$$

at target aspect angle *θ<sup>l</sup>* are then used as the 'measurement samples' in the slow-time *k*-space. As the target rotates, the measurements sweep out a region of support in slow-time *k*-space as indicated in Figure 2. Due to our choice of reference frames, the measurement population always starts close to the *ks*1-axis because *pθ*<sup>1</sup> (*x*) is the initial cross-range profile.

d. *Image inversion:* An inverse Fourier transform is applied to the populated support of the *k*-space to yield the target image. For this task, other works have either used filtered back projection or interpolated the samples onto a rectangular grid to utilise a standard 2D inverse Fourier transform [12,13]. In this paper, we use the non-uniform fast Fourier transform (NUFFT) [21–24].

It is worth noting that the image resolution is inversely proportional to the diameter of the span of the *k*-space samples which is dependent on the cross range bandwidth *B*<sup>⊥</sup> as defined in (10). The resulting supportable size of the image is then determined from the image resolution cell multiplied by a factor of *K* being the number of samples spanning the diameter of the *k*-space. Although limited amounts of target rotation can reduce image resolution in the sparsely populated direction, here we focus on the case where a half cycle of the target scatterers is visible to the radar to completely populate the *k*-space. Under this assumption, the angular sampling density of the *k*-space samples drives the image contrast and is a trade off with computational cost [25].
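The profile formation, *k*-space population and image inversion steps can be sketched end to end on simulated data. The code below is an illustration under several assumptions, not the processing chain used in the paper: translational motion is taken as already compensated, the scatterers rotate exactly at a known constant Ω*<sup>e</sup>*, and a direct (slow) non-uniform inverse Fourier sum stands in for the NUFFT. All function names and parameter values are hypothetical.

```python
import numpy as np

def simulate_segments(scatterers, wavelength, omega_e, t_seg, K, L):
    """Slow-time samples of (4) for L segmented CPIs of K samples each.

    scatterers: iterable of (y1, y2, sigma) in the target frame at t = 0.
    """
    t = (np.arange(L)[:, None] * K + np.arange(K)[None, :]) * (t_seg / K)
    theta = omega_e * t
    s = np.zeros((L, K), dtype=complex)
    for y1, y2, sigma in scatterers:
        r = y1 * np.sin(theta) + y2 * np.cos(theta)      # exact local down range
        s += sigma * np.exp(-1j * 4.0 * np.pi * r / wavelength)
    return s, theta.mean(axis=1)                          # samples, mean aspect per CPI

def drt_image(s, theta_l, wavelength, omega_e, t_seg, y_axis):
    """Form a DRT image on the square grid y_axis x y_axis."""
    L, K = s.shape
    fd = np.fft.fftshift(np.fft.fftfreq(K, d=t_seg / K))  # Doppler axis (Hz)
    x = -wavelength * fd / (2.0 * omega_e)                # (9), sign from exp(-j...) in (4)
    dx = wavelength / (2.0 * omega_e * t_seg)             # cross-range cell, (10)
    # Cross-range profiles: magnitude of the Doppler profile of each segment
    p = np.abs(np.fft.fftshift(np.fft.fft(s, axis=1), axes=1))
    # Spatial Fourier transform of each profile gives one radial k-space line
    k = 2.0 * np.pi * np.arange(-K // 2, K // 2) / (K * dx)
    P = p @ np.exp(-1j * np.outer(x, k))                  # shape (L, K)
    # Image inversion by direct summation over all radial samples
    Y1, Y2 = np.meshgrid(y_axis, y_axis, indexing="xy")
    image = np.zeros(Y1.shape, dtype=complex)
    for l in range(L):
        proj = Y1 * np.cos(theta_l[l]) - Y2 * np.sin(theta_l[l])  # pixel cross range
        image += np.exp(1j * proj[..., None] * k) @ P[l]
    return np.abs(image)

def peak_location(image, y_axis):
    """Return the (y1, y2) coordinates of the brightest pixel."""
    i2, i1 = np.unravel_index(np.argmax(image), image.shape)
    return y_axis[i1], y_axis[i2]
```

With a 3 cm wavelength, Ω*<sup>e</sup>* = 0.5 rad/s and 31 segments of 0.2 s (0.1 rad each, about half a cycle in total), a single scatterer placed at (0.3, −0.4) m should appear as the brightest pixel within roughly one 0.15 m resolution cell of its true position. The direct summation scales poorly with image size, which is one motivation for a NUFFT in practice.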

2.3.2. Standard DRT

By standard DRT, we refer to the case where the input cross-range profiles, as defined by (11), are Doppler migration free (DMF), and the rotation angle corresponding to each profile formed under this condition is said to be within the linear limit (of phase variation). The DMF condition can be satisfied when the segmented CPI lengths are sufficiently short such that the nonlinear phase terms in (6) are negligible and hence compensation is not necessary, or when |*xm*| is small. The former case is particularly sensitive for scatterers at larger radial distances from the centre of rotation, while the latter case applies more to scatterers sufficiently close to the centre of rotation whose Doppler frequencies are small and Doppler migration effects (if any) are also small.

As derived in Appendix A, the standard DRT constraint on the CPI rotation angle is

$$
\Delta\theta \le \min\left\{\Delta\theta\_{DM}, \Delta\theta\_{LM}\right\}, \quad \text{(rad)}\tag{13}
$$

where Δ*θDM* = (*λ*/(2 *r*max))<sup>1/2</sup> is an effective rotation angle required to induce a Doppler migration (DM) of one bin, Δ*θLM* is the 'linear limit', while DRT image resolution, in both range and cross range, is

$$
\Delta \mathbf{x}\_1 = \Delta \mathbf{x}\_2 \ge \left(\frac{\lambda \, r\_{\text{max}}}{2}\right)^{1/2}. \quad \text{(m)}\tag{14}
$$

Here, *rmax* is the maximum radial dimension of the target. Note that Δ*θ* and Δ*x* are independent of rotation rate and signal sampling rate, and depend only on the radar wavelength and the dimension of the target (through the maximum radial dimension *r*max to any scatterer). Δ*θLM* is roughly 10 degrees; Equations (13) and (14) can be used as a guide to predict the expected imaging performance or applicability of standard DRT for a specific radar wavelength and target size.
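As a quick worked example of (13) and (14), the following sketch evaluates the rotation-angle limit and the resulting resolution for a given wavelength and target size. The function name is hypothetical, the default linear limit of 10 degrees is the rough figure quoted above, and the resolution is computed as *λ*/(2Δ*θ*), which follows from (7) and (10) and reduces to the bound in (14) when the Doppler-migration limit is the active constraint.

```python
import math

def standard_drt_limits(wavelength, r_max, theta_lm=math.radians(10.0)):
    """Return (theta_dm, theta_max, resolution) for standard DRT."""
    theta_dm = math.sqrt(wavelength / (2.0 * r_max))   # one-bin Doppler migration
    theta_max = min(theta_dm, theta_lm)                # constraint (13), in rad
    resolution = wavelength / (2.0 * theta_max)        # from (7) and (10), in m
    return theta_dm, theta_max, resolution
```

For a 3 cm wavelength and *r*max = 1.5 m, this gives Δ*θDM* = 0.1 rad (about 5.7 degrees) and a resolution of 0.15 m, equal to (*λ r*max/2)<sup>1/2</sup>.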

The limitations imposed by these nonlinear effects at wider rotation angles can be compensated by a processing technique described in the next Section. For differentiation from standard DRT, such cases are referred to as 'Augmented DRT'.

#### **3. The Slow-Time** *k***-Space and Its Augmentation**

While it is possible to formulate the problem and solution entirely in terms of the spatial frequency space of *fs*⊥, we shall keep up with tradition and formulate it in terms of a '*k*-space', with

$$k\_s = 2\pi f\_{s\perp}.$$

#### *3.1. The Slow-Time k-Space*

In basic Fourier analysis, for a signal with a pulse repetition interval *PRI*, the Doppler frequency extent of the signal is *PRF* = 1/*PRI*, which spans the interval (−*PRF*/2, *PRF*/2). Analogously, from the spatial (cross-range) resolution Δ*x*<sup>1</sup> as given in (10), the values of the spatial frequency *fs*<sup>⊥</sup> span the interval (−*B*⊥/*c*, *B*⊥/*c*). It follows from (7) that the interval for *ks* is

$$k\_s \in \left( -\frac{2\pi f}{c} \Delta \theta , \frac{2\pi f}{c} \Delta \theta \right).$$

These limits are illustrated by extents of the radial dashed lines in Figure 2. Since the cross-range profiles are computed from FFT, both the discretized time and frequency domain vectors have *K* samples. That is, the slow-time *k*-vectors *k<sup>s</sup>* corresponding to each cross-range projection contains samples given by

$$\mathbf{k}\_s = k' \frac{2\pi f}{c} (\Omega\_e \, T\_{PRI}) \, \mathbf{i}\_{\perp}, \tag{15}$$

where *k*′ = −*K*, −*K* + 2, ... , −2, 0, 2, ... , *K* − 2; *TPRI* = *TCPI*/*K*, and *i*<sup>⊥</sup> is the cross-range unit vector (perpendicular to LOS) along the *x*1-axis of the *T<sup>y</sup>* frame.
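The sample positions in (15) are easy to generate explicitly. The sketch below, with a hypothetical function name, returns the signed radial coordinates of the *K* slow-time *k*-space samples of one cross-range projection; the direction *i*<sup>⊥</sup> is left out, since it depends on the aspect angle of each segmented CPI.

```python
import numpy as np

C = 299_792_458.0  # speed of light (m/s)

def slow_time_k_samples(f, omega_e, t_cpi, K):
    """Signed radial k-space coordinates of (15), in rad/m."""
    k_prime = np.arange(-K, K, 2)            # -K, -K+2, ..., K-2 (K samples)
    t_pri = t_cpi / K
    return k_prime * (2.0 * np.pi * f / C) * omega_e * t_pri
```

The first sample sits at −(2π*f*/*c*)Δ*θ*, the lower end of the interval given above, and consecutive samples are separated by (4π*f*/*c*)Δ*θ*/*K*.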

The slow-time *k*-space arises naturally out of DRT: its *radial support* determined by the cross-range bandwidth *B*<sup>⊥</sup> and its *populating samples* are *Pθ<sup>l</sup>* (*ks*) given by (12); as the target rotates, the slow-time *k*-space support is swept out in fan-like shapes around the *k*-space origin. Also, for a given *B*⊥, the number of *k<sup>s</sup>* points is a processing design parameter not necessarily fixed to *K*; its chosen value however would affect only the sidelobes of the impulse response, and hence image contrast, not image resolution.

An important and useful characteristic of the slow-time *k*-space is that it can be *augmented*. As implied by (7), *B*<sub>⊥</sub> can be increased by using a wider rotation angle Δ*θ*, provided that the processing can effectively correct for the nonlinear terms in the phase function of (6). In Section 3.2 below, we discuss one typical technique to correct for the *second*-order term, i.e., linear chirp components. In other words, augmentation of the *k*-space enhances resolution by permitting the CPI to be lengthened to the limit where the rotational motion of all point scatterers can be modelled as linear chirps.

By comparing a standard CPI *T*<sub>CPI</sub><sup>(s)</sup> and the corresponding rotation angle Δ*θ*<sup>(s)</sup> in the conventional linear limit of narrowband imaging with a longer CPI, we define an 'augmentation factor'

$$\kappa = \frac{\Delta\theta}{\Delta\theta^{(s)}} = \frac{T\_{CPI}}{T\_{CPI}^{(s)}},\tag{16}$$

where *T<sub>CPI</sub>* is the lengthened CPI and Δ*θ* is the corresponding larger rotation angle. The augmentation factor *κ* describes the expansion of the cross-range bandwidth *B*<sub>⊥</sub> or, equivalently, of the radial span of the slow-time *k*-space described in (7) and (15). The DRT image resolution is inversely proportional to *B*<sub>⊥</sub> defined in (9); hence, an improvement in resolution can be achieved with adequate compensation of the linear chirps, which is described further in Section 3.2. The concept of the slow-time *k*-space is illustrated in Figure 3. The DRT algorithm based on an augmented *k*-space is called augmented DRT.

**Figure 3.** An illustration of the augmentation of the slow-time *k*-space to generate longer segmented CPIs for cross-range profile formation, which compensates for nonlinear effects of rotation arising from wider angles. The circle indicates the boundary of support in standard DRT imaging.

#### *3.2. Augmented DRT with Orthogonal Matching Pursuit (OMP)*

This technique shares the same objective as the FrFTS-based technique [13] but instead makes use of a popular tool in the more modern approach of sparse signal approximation, OMP. Again, TMC is assumed to have been perfectly processed prior to this processing.

#### 3.2.1. Sparse Representation

With reference to (4) and (6), the segmented CPI signal received is represented in vector form as

$$\mathbf{s}\_{R} = \mathbf{\Psi} \boldsymbol{\sigma} + \mathbf{e}, \tag{17}$$

where **Ψ** is the dictionary matrix of size *K* × *N<sub>σ</sub>*; *σ* is a length-*N<sub>σ</sub>* column vector of (complex-valued) atom coefficients; and **e** is a length-*K* column vector of noise and/or clutter components. The columns of **Ψ** are the *chirp atoms*, of the form

$$g(t\_k) = \exp\left\{-j2\pi\left(f\_g t\_k + \frac{1}{2}c\_g t\_k^2\right)\right\},\tag{18}$$

where *k* = 0, 1, . . . , *K* − 1, and the parameters

$$f\_g = \frac{2 \,x\_1 \,\Omega\_c}{\lambda}, \quad \text{and} \; c\_g = -\frac{2 \,x\_2 \,\Omega\_c^2}{\lambda} \tag{19}$$

respectively represent the Doppler frequency and chirp rate of a scatterer due to rotation, at reference time *t<sub>k</sub>* = 0 of the current segmented CPI, which define the atom *g*(*t<sub>k</sub>*). Furthermore, if *f<sub>g</sub>* and *c<sub>g</sub>*, or equivalently *x*<sub>1</sub> and *x*<sub>2</sub>, are discretized as vectors of expected or possible values, of lengths *N<sub>f</sub>* and *N<sub>c</sub>* respectively, then *N<sub>σ</sub>* = *N<sub>f</sub>* *N<sub>c</sub>*.

Different options for discretizing (*fg*, *cg*) lead to different definitions of the dictionary **Ψ**. The above option in terms of (*x*1, *x*2) uses rectangular scatterer coordinates. Another option is by polar coordinates (*d*, *α*) with

$$x\_1 = d\cos(\alpha), \quad x\_2 = d\sin(\alpha), \tag{20}$$

which may be useful when prior knowledge about the expected scatterer locations is available. It is desirable to choose the coordinate grid for (*fg*, *cg*) such that the grid points efficiently span the target while keeping the total number of grid points (the dictionary size) to a minimum. A demonstration of these options is shown in Section 4.2.4.
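To make the dictionary construction concrete, the following is a minimal sketch of how **Ψ** could be assembled from (18) and (19) for either grid option. The function names, the rotation rate, the time axis and the grid extents are hypothetical choices for illustration, not the settings used in the experiments.

```python
import numpy as np

def rect_grid(x1_vals, x2_vals):
    """All (x1, x2) pairs of a rectangular scatterer grid, flattened."""
    X1, X2 = np.meshgrid(x1_vals, x2_vals, indexing="ij")
    return X1.ravel(), X2.ravel()

def polar_grid(d_vals, alpha_vals):
    """All (d, alpha) pairs of a polar grid, converted to (x1, x2) via (20)."""
    D, A = np.meshgrid(d_vals, alpha_vals, indexing="ij")
    return (D * np.cos(A)).ravel(), (D * np.sin(A)).ravel()

def chirp_dictionary(t, x1, x2, omega, wavelength):
    """Dictionary Psi (K x N_sigma) with one chirp atom per candidate scatterer."""
    f_g = 2.0 * x1 * omega / wavelength          # Doppler frequency, (19)
    c_g = -2.0 * x2 * omega**2 / wavelength      # chirp rate, (19)
    phase = np.outer(t, f_g) + 0.5 * np.outer(t**2, c_g)
    return np.exp(-2j * np.pi * phase), f_g, c_g  # atoms g(t_k), (18)

# Hypothetical values, for illustration only.
t = np.arange(64) / 1.0e3                        # K = 64 slow-time samples
x1, x2 = rect_grid(np.linspace(-0.4, 0.4, 9), np.linspace(-0.4, 0.4, 5))
Psi, f_g, c_g = chirp_dictionary(t, x1, x2, omega=1.0, wavelength=0.0375)
```

Passing the output of `polar_grid` instead of `rect_grid` gives the polar variant without changing the atom definition; only the set of candidate (*x*<sub>1</sub>, *x*<sub>2</sub>) locations differs.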

The OMP algorithm itself is well-known, hence will not be described here (see [26] for example). In fact, OMP is only one of several sparse approximation techniques that could be used in this algorithm.
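For readers who want a concrete reference point, a bare-bones greedy OMP loop is sketched below. It is a generic textbook-style version written for illustration, not the implementation used in this work, and the function name and interface are assumptions.

```python
import numpy as np

def omp(Psi, s, n_atoms):
    """Greedy OMP: returns selected column indices and their coefficients."""
    norms = np.linalg.norm(Psi, axis=0)
    residual = np.asarray(s, dtype=complex).copy()
    support = []
    coeffs = np.zeros(0, dtype=complex)
    for _ in range(n_atoms):
        # Correlate the residual with every (normalized) atom.
        corr = np.abs(Psi.conj().T @ residual) / norms
        if support:
            corr[support] = 0.0              # never reselect an atom
        support.append(int(np.argmax(corr)))
        # Re-fit all selected atoms jointly by least squares.
        coeffs, *_ = np.linalg.lstsq(Psi[:, support], s, rcond=None)
        residual = s - Psi[:, support] @ coeffs
    return support, coeffs
```

The joint least-squares re-fit at each iteration is what distinguishes OMP from plain matching pursuit: the residual is kept orthogonal to all atoms selected so far.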

#### 3.2.2. The OMP-Based Augmented DRT Algorithm

The augmented DRT algorithm is modified from standard DRT by simply lengthening the segmented (and overlapping) CPIs with an augmentation factor *κ*, as defined by (16); the target signal in each CPI can then be represented as a sum of linear chirp components. The aim is then to estimate such a representation and to correct for the chirps, i.e., to focus the profiles, before passing them to the remaining steps of the DRT algorithm.

For each augmented CPI, suppose the output of the OMP processing is a (sparse) representation {*f<sub>g</sub>*<sup>(*m*)</sup>, *c<sub>g</sub>*<sup>(*m*)</sup>} of size *M* with corresponding atoms {*g<sub>m</sub>*(*t<sub>k</sub>*)} and coefficients {*σ<sub>m</sub>*}; then a *dechirped* version of the segmented receive signal is

$$\tilde{s}\_R(t\_k) = \sum\_{m=1}^M \sigma\_m g\_m(t\_k) \to \sum\_{m=1}^M \sigma\_m \bar{g}\_m(t\_k). \tag{21}$$

The right arrow → above denotes a *replacement* of the *gm*(*tk*) atom with a corresponding *monotone* signal

$$\bar{g}\_m(t\_k) = \exp\left\{-j2\pi f\_g^{(\text{mid})} t\_k\right\},\tag{22}$$

with Doppler frequency

$$f\_g^{(\text{mid})} = f\_g^{(m)} + \frac{1}{2} c\_g^{(m)} t\_{\text{mid}}, \tag{23}$$

so defined as the instantaneous frequency at *t*<sub>mid</sub>, the middle time of the segmented CPI. The operation

$$p\_{\theta\_l}(\mathbf{x}) = |\mathcal{F}\{\tilde{s}\_R(t\_k)\}|\tag{24}$$

then would give a focused cross-range projection for tomographic processing.
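The dechirping step in (21) to (24) can be sketched in a few lines. In the example below, the function name, the sampling grid and the single-scatterer test values are hypothetical; the mid-CPI frequency is taken verbatim from (23), and *t*<sub>mid</sub> is assumed to be the midpoint of the supplied time axis.

```python
import numpy as np

def dechirped_projection(t, f_g, c_g, sigma):
    """Cross-range projection from an OMP output {f_g, c_g, sigma}."""
    f_g, c_g, sigma = (np.asarray(a) for a in (f_g, c_g, sigma))
    t_mid = 0.5 * (t[0] + t[-1])                 # middle of the segmented CPI
    f_mid = f_g + 0.5 * c_g * t_mid              # (23), as printed
    tones = np.exp(-2j * np.pi * np.outer(t, f_mid))   # monotone atoms, (22)
    s_tilde = tones @ sigma                      # dechirped signal, (21)
    return np.abs(np.fft.fftshift(np.fft.fft(s_tilde)))  # (24)

# Hypothetical single-scatterer example: K = 64 samples at 1 kHz.
t = np.arange(64) / 1.0e3
profile = dechirped_projection(t, f_g=[125.0], c_g=[0.0], sigma=[2.0])
```

Because each chirp is replaced by a single tone, its energy collapses into one Doppler bin of the FFT instead of being smeared across several, which is the focusing effect exploited by the augmented algorithm.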

The augmentation algorithm, applied to each CPI of the augmented DRT algorithm, can thus be summarized as follows.

- define or select expected intervals of Doppler frequency *fg* and chirp rate *cg*;
- define the corresponding chirp atoms and set up the dictionary **Ψ**;
- input segmented CPI data *sR*(*tk*);

#### **4. Experimental Results**

We present two different datasets using simple point-like scatterers on a rotating turntable to represent a target. This scenario is analogous to rotating components on a target, such as helicopter rotor blade tips [20,25,27]. The first dataset is a target with a small dimension and small scatterers to showcase improvements in resolution. The second dataset is representative of a much larger target, which highlights the blurring in the image that we aim to remove for improved image resolution.

#### *4.1. Small Target*

#### 4.1.1. Experimental Setup

The data was collected in the Mumma Radar Laboratory at the University of Dayton, Ohio, USA. Although the aim of the study is narrowband imaging, a wideband waveform at X-band was used with stepped-frequency pulses between 8 GHz and 12 GHz, over 101 regular frequency steps. Only the measured data from one of the available discrete frequencies *fk* was used to study narrowband tomographic radar imaging.

The transmit and receive horn antennas were mounted on separate robotic arms which could be oriented and positioned with high precision. The measurements were conducted in a controlled laboratory environment with some Radar Absorbing Material (RAM) reducing the radar reflections from the floor and walls. The experimental target consisted of two vertical metallic rods, separated by approximately 19 cm, emulating two point scatterers which rotated around a vertical pedestal, as illustrated in Figure 4. The maximum radial distance is 11 cm. The antennas were kept stationary whilst the target was rotated through 360° in 0.1° steps. At each step, the stepped-frequency waveform was transmitted and sampled, with one sample for each frequency.

**Figure 4.** An example of an antenna mounted on a robotic arm at the Mumma Radar Laboratory with the two vertical metallic rods secured to the rotating pedestal.

#### 4.1.2. System Requirements

Successful imaging does not depend on shifts in the relative velocity of the target from pulse to pulse; in fact, the target could stop completely at each sampled rotation angle [28]. This is the case when a target rotates on a turntable at a very slow rotation rate, where the Doppler frequency is derived from the change in phase over time across the different target perspectives. We therefore describe system parameters such as the target rotational speed and radar sampling rate in terms of the angular sampling rate.

While the full theoretical details are included in Appendix A, the key requirements are summarized as follows.


Realistic values for *PRF* and *ω* can also be chosen such that *PRF*/*ω* = 191; however, this is not necessary for DRT processing.

For each of the selected frequencies, an elliptic filter with a very narrow stop band is also applied to the signal as a pre-processing step for clutter removal. Results are shown in Sections 4.1.3 and 4.1.4.

#### 4.1.3. Standard DRT Imaging

To demonstrate *k*-space augmentation and the usefulness of sparse signal approximation, some typical results of standard DRT imaging are now shown. Figure 5 shows a spectrogram of the signal at 8 GHz and Figure 6 shows the corresponding slow-time *k*-space support and standard DRT image. We have used an overlapping factor *η* of 0.99 in the segmentation step to provide a very smooth angular coverage of the *k*-space. However, standard DRT imaging performance is poor: the *k*-space support is small, and the two scatterers (metallic rods) are not distinguishable in the image.

If longer CPIs are used with the standard DRT algorithm, image blurring occurs. If *κ*, as defined by (16), is set to 6, the Doppler resolution in the spectrogram of the signal becomes higher, as shown in Figure 7. When the corresponding cross-range profiles (with Doppler bin migration effects present) are applied to standard DRT, the resulting image is that shown in Figure 8.

**Figure 5.** Spectrogram using standard DRT processing for *f* = 8 GHz.

**Figure 6.** The slow-time *k*-space support (left) and corresponding standard DRT image (right), at *f* = 8 GHz.

**Figure 7.** Signal spectrogram with augmented CPIs, *κ* = 6, at *f* = 8 GHz.

**Figure 8.** The slow-time *k*-space support (left) and image (right) for standard DRT with *κ* = 6, at *f* = 8 GHz.

For a better insight into the electromagnetic scattering effects in this experiment, similar results using the highest frequency (12 GHz) available are shown in Figures 9 and 10. From Figures 7 and 9 it is clear that in addition to direct (specular) scattering off the inner side of a metallic rod, creeping waves around the rods are the most likely cause of the twin sinusoidal traces for each of the rods [29]. The effects are more pronounced with the shorter wavelength of 2.5 cm, which is more comparable to the rod diameter of approximately 2 cm. The double scattering effects are highlighted in the DRT image.

Note that image blurring in standard DRT imaging is only in the azimuthal direction; image focusing is still generally achieved in the radial direction.

**Figure 9.** Signal spectrogram with augmented CPIs, *κ* = 6, at *f* = 12 GHz.

**Figure 10.** The slow-time *k*-space support (left) and image (right) for standard DRT with *κ* = 6, at *f* = 12 GHz.

#### 4.1.4. Augmented DRT Imaging with OMP

To apply OMP for image focusing, the dictionary **Ψ** is set up with chirp atoms as defined in (18) and (19). As mentioned in Section 3.2.1, two coordinate options for spatial scatterer grids are possible: rectangular in (*x*1, *x*2) or polar in (*d*, *α*). In either case, prior knowledge can be used from the standard DRT processing to constrain the parameter span for the dictionary.

A rectangular scatterer grid was chosen spanning ±0.4 m with a nominal spacing proportional to *λ*/2. As for the standard DRT demonstration, the two frequencies of 8 GHz (*λ* = 3.75 cm) and 12 GHz (*λ* = 2.5 cm) are used. Over the selected discretization interval, the number *N<sub>σ</sub>* of atoms was 1849 (for 8 GHz) or 4096 (for 12 GHz).
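The quoted dictionary sizes can be sanity-checked with a short calculation. The sketch below assumes (this is not stated in the text) a spacing of exactly *λ*/2, a half-open span of 0.8 m along each axis, and the nominal *c* = 3 × 10<sup>8</sup> m/s implied by the quoted wavelengths; under these assumptions the atom counts agree with the figures above.

```python
import math

C = 3.0e8  # nominal speed of light implied by the quoted wavelengths, m/s

def n_atoms_rect(f_hz, half_span_m=0.4):
    """Atom count for a square grid with lambda/2 spacing (assumed rule)."""
    spacing = (C / f_hz) / 2.0                       # lambda / 2
    # Rounding guards against floating-point noise before taking the ceiling.
    n_axis = math.ceil(round(2 * half_span_m / spacing, 9))
    return n_axis * n_axis

print(n_atoms_rect(8e9), n_atoms_rect(12e9))  # prints: 1849 4096
```

The count grows with the square of the frequency, which is one reason the dictionary size, and hence the OMP cost, rises quickly at shorter wavelengths.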

The coefficient magnitudes of the first 20 atoms extracted from the 8 GHz signal show a clear convergence, as shown in Figure 11.

**Figure 11.** Magnitude of atom coefficients at *f* = 8 GHz for first 20 atoms, shown for a subset of the total number of CPIs.

To reduce signal processing noise effects in the resulting image, a simple thresholding method can be used to control the number of atoms kept in the sparse representation: for example, in each CPI, stop the OMP iteration when the atom coefficient magnitude falls below 20% of the maximum magnitude. This criterion can also save on computational cost, as fewer atoms need to be extracted in the processing.
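The stopping rule can be expressed compactly. In the sketch below, the helper name and the use of a running maximum (rather than a maximum known in advance) are assumptions made for illustration; the 20% default mirrors the example threshold mentioned above.

```python
import numpy as np

def atoms_to_keep(coeff_magnitudes, rel_threshold=0.2):
    """Number of leading atoms kept before the coefficient magnitude first
    drops below rel_threshold times the largest magnitude seen so far."""
    mags = np.asarray(coeff_magnitudes, dtype=float)
    running_max = np.maximum.accumulate(mags)
    below = np.flatnonzero(mags < rel_threshold * running_max)
    return int(below[0]) if below.size else int(mags.size)

# Hypothetical coefficient magnitudes in extraction order.
n_keep = atoms_to_keep([1.0, 0.6, 0.3, 0.15, 0.4])
```

In an actual OMP loop the same comparison would be made after each iteration, so that extraction stops as soon as the criterion is met rather than after a fixed number of atoms.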

The spectrogram of the reconstructed and 'OMP-focused' signal is shown in Figure 12 which clearly shows more resolvable sinusoidal traces compared to Figure 7.

An example is shown in Figure 13 for the 8 GHz signal. Compared to Figure 8, the image is clearly much better focused and the scatterers are more easily resolved.

For completeness, we also show results in Figure 14 for the 12 GHz signal, which also resolve the scatterers significantly better using the OMP processing as compared to Figure 10 for standard DRT. The double scattering effects resulting in 'double rods' are also enhanced.

**Figure 12.** Spectrogram of the reconstructed, OMP-focused signal with the first 20 atoms, at *f* = 8 GHz (compared to Figure 7).

**Figure 13.** The slow time *k*-space support (left) and image (right) after OMP processing using a 20% coefficient magnitude threshold at *f* = 8 GHz (compared to Figure 8).

**Figure 14.** The slow time *k*-space support (left) and image (right) after OMP processing using a 20% coefficient magnitude threshold at *f* = 12 GHz (compared to Figure 10).

#### *4.2. Large Target*

#### 4.2.1. Experimental Setup for Large Target

The experiment was carried out on the turntable at the RAAF Edinburgh airbase, which has a diameter of 17 m. The test target consists of three metallic cylinders as shown in Figure 15, with physical specifications listed in Table 1. The experimental X-band radar employed a vertically polarised pulsed stepped frequency waveform starting at 9 GHz with 4 MHz steps, spanning a total of 256 frequencies. The turntable was rotated at approximately one revolution per 15 minutes with a receiver sampling rate of *PRF* = 20 Hz at each frequency, which translates to an angular sampling interval of 0.02°.

**Figure 15.** The turntable at the RAAF Edinburgh airbase with three metallic cylinders as a test target.


**Table 1.** Metallic cylinder configuration.

#### 4.2.2. System Requirements

As previously described in Section 4.1.2, we designed the system requirements such that the backscattered radar signal is dependent on the angular sampling rate, as follows.


The angular sampling interval of 0.02° per sample in the experiment translates to *PRFa* = 2864 samp/rad, which satisfies (A1).
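These figures follow directly from the quoted rotation rate and sampling rate, as the short check below illustrates. Since the cylinder positions in Table 1 are not reproduced here, the check of (A1) uses the turntable radius as a conservative stand-in for *r*max, which is an assumption made purely for illustration.

```python
import math

rev_period_s = 15 * 60                      # one revolution per 15 minutes
prf_hz = 20.0                               # receiver sampling rate per frequency
omega = 2.0 * math.pi / rev_period_s        # rotation rate, rad/s
step_deg = math.degrees(omega / prf_hz)     # angular sampling interval, deg
prf_a = prf_hz / omega                      # angular sampling rate, samples/rad

wavelength = 3.0e8 / 9.0e9                  # about 3.33 cm at 9 GHz
r_max_assumed = 17.0 / 2.0                  # ASSUMPTION: turntable radius bounds r_max
prf_a_min = 4.0 * r_max_assumed / wavelength  # right-hand side of (A1)

print(round(step_deg, 4), int(prf_a), round(prf_a_min))
```

Even with this pessimistic choice of *r*max, the required angular sampling rate is well below the value achieved in the experiment.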

Similar to the previous data set, an elliptic filter with a very narrow stop band is applied to the signal for clutter removal. Results are shown in Sections 4.2.3 and 4.2.4.

#### 4.2.3. Standard DRT Imaging

Figure 16 shows a spectrogram of the signal at 9 GHz, featuring three distinct sinusoidal traces corresponding to the three cylinders. Figure 17 shows the corresponding slow-time *k*-space support and standard DRT image. The standard DRT imaging performance is poor due to the small diameter of the *k*-space support, with the cylinder locations represented by coarsely granulated pixels.

**Figure 16.** Spectrogram using standard DRT processing for *f* = 9 GHz. (The small gap near 600 s is due to an antenna pointing error during the measurements.)

**Figure 17.** Slow-time *k*-space support (left) and image (right) for standard DRT processing; *f* = 9 GHz.

When longer CPIs are used with the standard DRT algorithm, for example when *κ* in (16) is set to 6, the Doppler resolution in the spectrogram of the signal becomes higher, as is evident in Figure 18. The corresponding cross-range profiles (with Doppler bin migration effects present), when applied to standard DRT, result in the image of Figure 19.

Again, image blurring in standard DRT imaging is seen to occur only in the azimuthal direction; image focusing is still generally achieved in the radial direction. The blurring is more severe for scatterers at larger radial distances, which travel along greater arc lengths for a given rotation angle (i.e., exhibit larger Doppler effects) and hence suffer more severe Doppler bin migration.

#### 4.2.4. Augmented DRT Imaging with OMP

We choose the polar grid representation defined in Section 3.2.1 for this dataset. This approach is useful when some prior knowledge about the radial coordinate of the major scatterers is available from the standard DRT processing.

**Figure 18.** Signal spectrogram with augmented CPIs, *κ* = 6 at *f* = 9 GHz.

**Figure 19.** The slow time *k*-space support (left) and image (right) for standard DRT with *κ* = 6, at *f* = 9 GHz.

A relatively narrow window, derived from the Doppler information of the scatterers in Figure 16, is used for discretization of the *d*-dimension with a *λ*/2 spacing. The full 360° with 1° spacing is used for *α*. Twenty atoms were extracted in each CPI by the OMP process, giving a reconstructed spectrogram that is virtually identical to that in Figure 18 and affirming the sufficient accuracy of the sparse representation. After the de-chirping operation, the scatterer locations are much more focussed, as shown in Figure 20, compared to the same scatterers in Figure 19 with the same augmentation factor. The technique shows some degradation for the furthermost cylinder, which exhibited the most blurring.

**Figure 20.** The slow time *k*-space support (left) and image (right) after OMP processing using a 20% coefficient magnitude threshold at 9 GHz (compared to Figure 19).

Our current study focuses on imaging performance rather than computational cost; nevertheless, to give some idea of the computational cost, we ran the algorithm on the high-performance computer 'Phoenix' at the University of Adelaide, where it took approximately 1 hour to run on 16 CPUs using 64 GB of RAM.

#### **5. Further Discussion**

This paper is an expansion of the work reported earlier in [2], demonstrating high-resolution DRT imaging with real experimental data. As the target used here is not a real moving and rotating target in a typical operational scenario, a number of issues should be noted.

Firstly, the target's translational velocity is exactly zero for the entire data collection. Nevertheless, this is not expected to be a sensitive factor. For most real moving targets, translational velocity can be readily compensated by shifting the 'body Doppler' line to zero Doppler. Sensitive propagation phases, as in the case of fast-time *k*-spaces, do not enter the slow-time *k*-spaces.

Secondly, the measured data were collected at precise angular sampling rates *PRFa*, which can only be estimated in typical operational scenarios. Errors in *PRFa* or Ω<sub>*e*</sub> would translate into errors in the locations of the populated samples as well as in the image scaling factor. Hence both image focusing and image scaling could be affected. We have not fully addressed these issues in this work.

The experimental data does reveal interesting electromagnetic phenomenology, highlighting the limiting simplicity of the ideal point-scatterer assumption; creeping waves and nonlinear scattering effects do exist, which are not taken into account in the current DRT theory.

On the application of the OMP algorithm, what this work has demonstrated is its feasibility: techniques such as OMP can be used for slow-time *k*-space augmentation. Alternative sparse approximation techniques could possibly be used to yield higher performance. Numerous other aspects can also be considered, such as dictionary 'learning': how can an optimum spatial scatterer grid be selected for the best focusing performance while keeping computational cost at manageable levels? And how should the off-grid/mismatched scatterer problem be dealt with [30]? Many open questions remain, some of which will be addressed in future publications.

#### **6. Concluding Remarks**

We have demonstrated, with two datasets, the ability to improve image resolution using a rotating target with an ultra-narrowband radar. The enabling signal processing technique presented was a combination of Doppler radar tomography and a sparse reconstruction technique such as OMP, with a unifying mathematical framework based on the slow-time *k*-space. We have shown that closely spaced scatterers can be resolved by illustrating the creeping wave effect when the scatterer size is similar to the radar wavelength. The technique also performed well addressing the adverse effect of blurring in the image with scatterers at larger radial distances to the centre of rotation. By compensating for the blurred scatterer locations in the image, the ability to resolve closely spaced scatterers is improved providing finer details for target recognition.

Although the demonstration of this technique is effective, the application to a real complex target with many non-ideal scatterers may present additional challenges including discontinuous scattering effects, larger dictionaries affecting computational cost and inaccuracies due to signal mismatch with finite dictionary elements. In future work, we aim at investigating the use of multiple widely separated radar receivers to reduce the requirement on large target rotation angles for DRT imaging, where the direct application of OMP may not scale efficiently for large amounts of data. The increase in data may require a modified approach such as dictionary learning to help reduce the computational cost.

**Author Contributions:** Conceptualization, H.-T.T.; Methodology, E.H., H.-T.T. and B.W.-H.N.; Simulation, E.H.; Validation, B.W.-H.N., H.-T.T.; Writing—Original Draft Preparation, H.-T.T. and E.H.; Writing—Review & Editing, B.W.-H.N., E.H., and H.-T.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Defence Science and Technology Group, Australia.

**Acknowledgments:** We sincerely thank Lorenzo Lo Monte and Nihad Afaisali from the University of Dayton and Mark Ingham, John Senior, Peter Drake and Shane Hatty from DST Group for their technical expertise and professionalism in performing the radar measurements presented in this manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A. Standard DRT: System Parameters and Image Resolution**

Figure A1 illustrates the spectral composition of a segmented CPI in the DRT algorithm. The sinusoidal traces depict the instantaneous Doppler frequencies of scatterers as the target rotates, which are generally chirp signals. The chirps are approximately linear for short CPIs.

**Figure A1.** Instantaneous Doppler traces of point scatterers on a rotating target.

There are three main constraints on system parameters for standard DRT to be applicable. The first is the Doppler-ambiguity-free condition: the sampling rate *PRF* must be at least twice the largest Doppler component in the received signal, which is the well-known Nyquist criterion. With *r*max denoting the largest radial distance of scatterers on the target, this constraint can be written as

$$PRF \ge \frac{4\,\omega\,r\_{\text{max}}}{\lambda}, \quad \text{or} \quad PRF\_a \ge \frac{4\,r\_{\text{max}}}{\lambda},\tag{A1}$$

where *PRFa* = *PRF*/*ω* is the angular sampling rate (in units of samples/rad).

The second constraint is the Doppler-migration-free (DMF) condition, i.e., the variation of the instantaneous Doppler frequency of any scatterer must be less than a Doppler bin size. This constraint may be derived as follows: a scatterer's cross range is given by *x*<sub>1</sub> = *r* cos *θ*, hence the differential change in *x*<sub>1</sub> is

$$dx\_1 = -r\omega\sin\theta\,dt.$$

Maximum cross range migration occurs near *θ* = *nπ*/2 for odd integral values of *n*. If *dt* represents a CPI time, here denoted as *TCPI*, then the DMF requirement translates to having

$$|dx\_1| \equiv r\omega T\_{CPI} \tag{A2}$$

to be not larger than a cross range bin size, at all range bins.

A cross range bin size Δ*x*<sub>1</sub> is related to the Doppler filter size Δ*f* through the well-known relationship Δ*f* = (2 *ω*/*λ*)Δ*x*<sub>1</sub> for monostatic radars, hence

$$
\Delta x\_1 = \frac{\lambda}{2\omega} \Delta f = \frac{\lambda}{2\omega} \frac{PRF}{K}.\tag{A3}
$$

Here, *K* denotes both the number of samples spanning the CPI and the FFT length; and therefore

$$T\_{CPI} = \frac{K}{PRF}.\tag{A4}$$

Using (A2), (A3) and (A4) in the DMF requirement |*dx*<sub>1</sub>| ≤ Δ*x*<sub>1</sub> then leads to the condition

$$
\Delta\theta = \frac{\omega \, K}{PRF} \le \left(\frac{\lambda}{2\, r\_{\text{max}}}\right)^{1/2} \equiv \Delta\theta\_{DM}, \tag{A5}
$$

where Δ*θ* is the (segmented) CPI rotation angle (or angular extent), and Δ*θDM* is an effective angle the target would need to rotate to induce a Doppler migration (DM) through one frequency bin. Note that Δ*θDM* depends only on radar wavelength *λ* and maximum radial extent *r*max, not rotation speed. Another useful related expression is

$$T\_{CPI} = \frac{K}{PRF} \le \frac{\Delta\theta\_{DM}}{\omega} \,\tag{A6}$$

for the corresponding CPI time.
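To make the step from (A2), (A3) and (A4) to (A5) explicit, substitute (A4) into (A3) and impose the DMF requirement for a scatterer at radius *r*:

$$r\,\omega\,T\_{CPI} \le \frac{\lambda}{2\,\omega}\,\frac{PRF}{K} = \frac{\lambda}{2\,\omega\,T\_{CPI}} \quad \Longrightarrow \quad \left(\omega\,T\_{CPI}\right)^2 \le \frac{\lambda}{2\,r}.$$

The bound is tightest for *r* = *r*max, and with Δ*θ* = *ω* *TCPI* = *ω* *K*/*PRF* this is exactly (A5). Dividing by *ω* then gives (A6).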

The third constraint is the so-called 'linear limit': the maximum rotation angle at which the Taylor expansion in (6) up to the first order in time remains valid. In other words,

$$
\Delta\theta < \Delta\theta\_{LM}, \tag{A7}
$$

where Δ*θLM* ≈ 10°.

Combining the three constraints above, the system constraints on *PRF* and *K* are (A1) and

$$K \le PRF\_a \min \left\{ \Delta \theta\_{DM}, \Delta \theta\_{LM} \right\}. \tag{A8}$$

As an example, suppose Δ*θLM* = 8° is used, with *λ* = 5 cm, *r*max = 1 m and *ω* = 300 RPM; then (A5) gives Δ*θDM* ≈ 9.1° > Δ*θLM*, and the choice of *PRF* = 15 kHz and *K* = 64 would satisfy all constraints.
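The numbers in this example are easy to verify. The short script below evaluates (A1), (A5) and (A8) for the stated parameters; the function name and the dictionary keys are hypothetical and chosen only for readability.

```python
import math

def drt_constraints(wavelength, r_max, omega, prf, K, dtheta_lm_deg):
    """Evaluate the standard-DRT constraints (A1), (A5) and (A8)."""
    dtheta_dm = math.sqrt(wavelength / (2.0 * r_max))       # (A5), rad
    prf_min = 4.0 * omega * r_max / wavelength               # (A1), Hz
    prf_a = prf / omega                                      # samples/rad
    k_max = prf_a * min(dtheta_dm, math.radians(dtheta_lm_deg))  # (A8)
    return {
        "dtheta_dm_deg": math.degrees(dtheta_dm),
        "prf_min_hz": prf_min,
        "k_max": k_max,
        "ok": prf >= prf_min and K <= k_max,
    }

omega = 300.0 * 2.0 * math.pi / 60.0        # 300 RPM in rad/s
res = drt_constraints(0.05, 1.0, omega, 15e3, 64, 8.0)
print(res)
```

With these values the linear limit, not Doppler migration, sets the bound on *K* (about 66.7 samples), so *K* = 64 is admissible and the chosen *PRF* comfortably exceeds the minimum of roughly 2.5 kHz.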

The result in (A8) also highlights a useful comparison between the DMF condition and the linear limit Δ*θLM*: the CPI length *K* may be limited by either of the two factors; it is, however, more desirable to be DMF-limited, i.e., without strong dependence on wide rotation angles. Indeed, shorter radar wavelengths and larger target dimensions would induce more pronounced Doppler effects, which are required for the applicability of the DRT imaging algorithm itself.

Image resolution for standard DRT can be derived as follows. The minimum *PRF* for unambiguous Doppler effects, given by (A1), means a cross-range profile given by (11) exactly spans the cross-range extent of the target. Larger values would lead to outer range bins of the profile containing noise only, while those range bins spanning the target remain the same, in both bin size and number. For the case of minimum unambiguous *PRF*, the maximum selectable value of *K*, given by (A6), is

$$K\_{STD} = \left(\frac{8\, r\_{\text{max}}}{\lambda}\right)^{1/2}, \tag{A9}$$

or smaller if *K* is limited by Δ*θLM* in (A8). The resolution of a cross-range profile is therefore

$$
\Delta x\_{RB} = \frac{2 \, r\_{\text{max}}}{K\_{STD}} = \left(\frac{\lambda \, r\_{\text{max}}}{2}\right)^{1/2}. \tag{A10}
$$

The same result can be obtained from (10) and (7) by using the equality in (A5) for Δ*θ*. The result in (A10) is also the expected image resolution in standard DRT imaging.
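For completeness, the intermediate algebra can be written out. With the minimum angular sampling rate from (A1) and the DMF-limited rotation angle from (A5),

$$K\_{STD} = \frac{4\,r\_{\text{max}}}{\lambda}\left(\frac{\lambda}{2\,r\_{\text{max}}}\right)^{1/2} = \left(\frac{8\,r\_{\text{max}}}{\lambda}\right)^{1/2}, \qquad \Delta x\_{RB} = \frac{2\,r\_{\text{max}}}{K\_{STD}} = \left(\frac{\lambda\,r\_{\text{max}}}{2}\right)^{1/2}.$$

As a purely illustrative check using the example values quoted earlier (*λ* = 5 cm, *r*max = 1 m), this gives *KSTD* ≈ 12.6 and a cross-range resolution of about 15.8 cm.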

#### **References**


© 2020 Commonwealth of Australia. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Target Doppler Rate Estimation Based on the Complex Phase of STFT in Passive Forward Scattering Radar**

#### **Karol Abratkiewicz \*, Piotr Krysik, Zbigniew Gajo and Piotr Samczyński**

Institute of Electronic Systems, Faculty of Electronics and Information Technology, Warsaw University of Technology, 00-665 Warsaw, Poland

**\*** Correspondence: k.abratkiewicz@elka.pw.edu.pl

Received: 19 July 2019; Accepted: 17 August 2019; Published: 20 August 2019

**Abstract:** This article presents a novel approach to the estimation of motion parameters of objects in passive forward scattering radars (PFSR). In such systems, most of the frequency modulated signals which are used have parameters that depend on the geometry of the radar scene and the object's motion. It is worth noting that when forward scattering geometry is present in bistatic (or multistatic) radars, only Doppler measurements are available while the range measurement is unambiguous. In this article the modulation factor, also called the Doppler rate, was determined based on the chirp rate (equivalent Doppler rate) estimation concept in the time-frequency (TF) domain. This approach utilizes the idea of the complex phase of the short-time Fourier transform (STFT) and its modification known from the literature. Mathematical dependencies were implemented and verified and the simulation results were described. The accuracy of the considered estimators was also verified using the Cramer-Rao lower bound (CRLB), to which simulated data for the considered estimators were compared. The proposed method was validated using a real-life signal collected from a radar operating in PFSR geometry. The Doppler rate provided by a car crossing the baseline between the receiver and the GSM transmitter was estimated. Finally, the concept of using CR estimation, which in the case of PFSR can be understood as Doppler rate estimation, was confirmed on the basis of both simulated and real-life data.

**Keywords:** passive forward scattering radar; chirp rate estimation; passive radar; forward scattering radar; radar measurements; time-frequency analysis

#### **1. Introduction**

Passive forward scattering radars are a special class of passive bistatic radars (PBR), in which the bistatic angle *β* between the non-cooperating transmitter, the target and the receiver is *β* ≈ 180° [1–3]. In such kinds of passive radars, the illuminating signal can be a wave from a commercial transmitter of popular systems such as FM, DAB, DVB-T, GSM, and so forth [4–7]. The simplified passive forward scattering radar (PFSR) geometry is presented in Figure 1.

Unlike typical PBRs, the PFSR is characterized by the fact that objects cross the baseline, which has certain consequences. In such a case, a PBR using classical PCL (Passive Coherent Location) processing is blind, as the target is crossing the line of sight between the receiver and the transmitter, and a PBR in this geometry has no range resolution. Additionally, the target disturbs the reference signal; thus, based on PCL principles, it is difficult to detect the target on the line of sight to the transmitter, as only the reference antenna is pointed in this direction and the surveillance beams are pointed in other directions. In bistatic radars, a target moving at a velocity *V* produces the Doppler shift expressed as follows [8]:

$$f\_d = \frac{2V}{\lambda} \cos(\kappa) \cos(\beta/2). \tag{1}$$

**Figure 1.** Simplified passive forward scattering radar (PFSR) geometry. *β*—bistatic angle, Tx—non-cooperative transmitter, Rx—radar receiver, TGT—target, *L*—baseline, *R*1—distance from the transmitter to the target, *R*2—distance from the target to the receiver, *D*—distance from the receiver to the crossing point.

As previously mentioned, in PFSR *β* ≈ 180°, which makes *fd* ≈ 0 Hz. However, by observing the object over a slightly wider range, that is, *β* ∈ (180° − Δ, 180° + Δ), where Δ is a certain angle, it is possible to measure the Doppler rate (or, equivalently, the chirp rate (CR)). This parameter describes the kinematic properties of the measured object.
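The vanishing of the Doppler shift at the baseline crossing can be seen numerically from (1). In the sketch below, the function name and all numerical values are hypothetical, and the angle *κ* in (1), which is not defined in this excerpt, is assumed to be the angle between the target velocity and the bistatic bisector.

```python
import math

def bistatic_doppler(v, wavelength, kappa_deg, beta_deg):
    """Doppler shift from (1); both angles are given in degrees."""
    return (2.0 * v / wavelength) \
        * math.cos(math.radians(kappa_deg)) \
        * math.cos(math.radians(beta_deg) / 2.0)

# Hypothetical target: 15 m/s, wavelength 0.32 m (roughly the GSM-900 band).
for beta in (120.0, 170.0, 180.0):
    print(beta, bistatic_doppler(15.0, 0.32, kappa_deg=0.0, beta_deg=beta))
```

As *β* approaches 180° the factor cos(*β*/2) tends to zero, so the Doppler shift itself carries little information there and its rate of change becomes the quantity of interest.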

The literature describes some methods of Doppler rate estimation in PFSR. Ustalli et al. proposed in Reference [9] a four-step processing technique for the extraction of kinematic motion parameters of targets in forward scattering radar (FSR). The method is based on multiple matched filtration of the signal with simultaneous time-frequency analysis, resulting in the precise estimation of the motion parameters of a single object near the intersection of the baseline. However, a problem may be the analysis of several objects that intersect the baseline at the same time. In addition, due to multiple matched filtration and other processing operations, the complexity of the method is significant. The same authors developed this approach and described it in Reference [10]. Another solution is to use the Radon transformation to calculate the phase acceleration [11]. After transforming the signal into a two-dimensional distribution in the TF domain, the components responsible for the Doppler rate of objects are found. Analyzing the aforementioned papers, it can be noticed that the analysis in the TF domain is the proper approach to the PFSR signal's considerations. A waveform received by the radar can be treated as a non-stationary frequency modulated signal. In the vicinity of *fd* ≈ 0 (Hz) the signal can be approximated as a linear frequency modulated waveform. This methodology was also presented in References [12–14]. This is due to the fact that the phase of the received signal can be described by the dependency:

$$\phi(t) = -\frac{2\pi}{\lambda} \left[ R\_1(t) + R\_2(t) - L \right],\tag{2}$$

which can be approximated using Taylor expansion into the following formula [15]:

$$
\phi(t) \approx -\frac{\pi}{\lambda} v\_p^2 \left( \frac{1}{L - D} + \frac{1}{D} \right) (t - t\_0)^2,\tag{3}
$$

where *vp* is the velocity component perpendicular to the baseline *L*, *t*<sub>0</sub> is the moment when the target crosses the baseline and *D* is the range from the receiver to the crossing point (see Figure 1). As can be noted, Equation (3) is a second-order polynomial in time, which is characteristic of linear frequency modulated signals.
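The step from Equation (2) to the quadratic phase can be sketched as follows. The sketch assumes, which is not spelled out in the text, that the target moves along a straight line perpendicular to the baseline, with a lateral offset $y(t) = v_p (t - t_0)$ that is small compared with both $D$ and $L - D$:

```latex
\begin{aligned}
R_1(t) &= \sqrt{(L-D)^2 + y^2(t)} \approx (L-D) + \frac{y^2(t)}{2(L-D)}, \\
R_2(t) &= \sqrt{D^2 + y^2(t)} \approx D + \frac{y^2(t)}{2D}, \\
R_1(t) + R_2(t) - L &\approx \frac{v_p^2 (t-t_0)^2}{2}\left(\frac{1}{L-D} + \frac{1}{D}\right), \\
\phi(t) &\approx -\frac{\pi}{\lambda}\, v_p^2 \left(\frac{1}{L-D} + \frac{1}{D}\right)(t-t_0)^2, \\
\frac{1}{2\pi}\frac{d^2\phi}{dt^2} &\approx -\frac{v_p^2}{\lambda}\left(\frac{1}{L-D} + \frac{1}{D}\right).
\end{aligned}
```

The last line is the Doppler rate implied by this approximation, in Hz/s. The velocity enters squared through $y^2(t)$, which keeps the phase dimensionless, and the minus sign is inherited from Equation (2); the overall sign depends on the phase convention adopted.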

The above considerations prompted the authors to use the CR estimation in the TF domain to analyze signals from PFSR. This approach is based on short–time Fourier transform (STFT) modification and allows the CR at each point in the TF plane to be determined. This is consistent with the assumption that when the baseline and the trajectory of the object are crossed, the signal appearing in the receiver can be approximated with the linear frequency modulated waveform, and this technique is dedicated for such a problem. In addition, the method is computationally efficient, which reduces the calculation time.

This paper is organized as follows: Section 2 presents the CR estimation theory background, including Cramer–Rao lower bound (CRLB) analysis. Section 3 depicts simulation results, and Section 4 covers real-life signal analysis provided by the GSM PFSR. Discussion and comments close the article.

#### **2. Chirp Rate Estimation**

#### *2.1. Theory Background*

The pioneer of CR estimation in the TF domain using the complex phase of the STFT was Czarnecki, who proposed a method for determining the instantaneous frequency rate in a two-dimensional signal distribution [16,17]. In general, a signal described by the following model is considered:

$$x(t) = A\_x \exp(j\Phi\_x(t)),\tag{4}$$

for the amplitude *Ax*, *j* = √−1 and phase described as:

$$
\Phi\_x(t) = \phi\_x + \omega\_x t + 2\pi \cdot \alpha t^2/2 = \phi\_x + 2\pi t \left(f\_0 + \alpha t/2\right),
\tag{5}
$$

where *ω*<sub>x</sub> = 2*πf*<sub>0</sub> is the angular frequency with the carrier frequency *f*<sub>0</sub>, and *α* is the CR. *x*(*t*) can be transformed into a two-dimensional distribution using the STFT, which is given by the formula:

$$F\_x^h(t,\omega) = \int\_{\mathbb{R}} x(\tau)^\* h(t-\tau) e^{-j\omega\tau} d\tau,\tag{6}$$

where (·)<sup>∗</sup> is the complex conjugate, the upper index in *F*<sup>h</sup><sub>x</sub> denotes the analysis window *h*(*t*), and the lower index denotes the signal under consideration *x*(*t*). The energy distribution in the TF domain can be calculated as the squared absolute value of the STFT and is called a spectrogram:

$$S\_x^h(t,\omega) = \left| F\_x^h(t,\omega) \right|^2,\tag{7}$$

where |·| denotes the absolute value operator. The STFT can be expressed using the concept of a complex phase through the relation [16]:

$$F\_x^h(t,\omega) = \int\_{\mathbb{R}} x(\tau)^\* h(t-\tau) e^{-j\omega\tau} d\tau = A\_x^h(t,\omega) e^{j\phi\_x^h(t,\omega)} = e^{\Lambda\_x^h(t,\omega) + j\phi\_x^h(t,\omega)},\tag{8}$$

where the STFT phase is denoted *φ*<sup>h</sup><sub>x</sub>(*t*, *ω*), whereas *A*<sup>h</sup><sub>x</sub>(*t*, *ω*) denotes the STFT absolute value (*A*<sup>h</sup><sub>x</sub>(*t*, *ω*) > 0). Using the property described in Reference [18], the STFT in Equation (8) is written in exponential form with Λ<sup>h</sup><sub>x</sub>(*t*, *ω*) = ln(*A*<sup>h</sup><sub>x</sub>(*t*, *ω*)), so that the complex phase of the STFT is defined as:

$$\Psi\_x^{\rm h}(t,\omega) = \ln \left( F\_x^{\rm h}(t,\omega) \right) = \Lambda\_x^{\rm h}(t,\omega) + j\phi\_x^{\rm h}(t,\omega). \tag{9}$$

Such a transformation allows many useful signal parameters in the TF domain to be determined. By calculating the partial derivatives of the real part of the complex phase with respect to time and frequency, the instantaneous bandwidth and the local group delay are obtained, respectively. The ratio of these values gives the CR estimator (the K estimator) as follows:

$$\mathfrak{K}(t,\omega) = -\left(\frac{\partial \Lambda\_x^h(t,\omega)}{\partial t} \Big/ \frac{\partial \Lambda\_x^h(t,\omega)}{\partial \omega}\right). \tag{10}$$

The graphic interpretation of the estimator is presented in Figure 2.

In Reference [19] it was proposed that the K estimator can be calculated more efficiently using a modified analysis window. Additionally, two new estimators were introduced. The uncertainty effect occurring in the K estimator is reduced by differentiating the numerator and the denominator with respect to time (giving the D estimator) or frequency (giving the F estimator), which yields the following relationships:

$$\mathfrak{D}(t,\omega) = -\left(\frac{\partial^2 \Lambda\_x^h(t,\omega)}{\partial t^2} \Big/ \frac{\partial^2 \Lambda\_x^h(t,\omega)}{\partial \omega \partial t}\right),\tag{11}$$

$$\mathfrak{F}(t,\omega) = -\left(\frac{\partial^2 \Lambda\_x^h(t,\omega)}{\partial \omega \partial t} \Big/ \frac{\partial^2 \Lambda\_x^h(t,\omega)}{\partial \omega^2}\right). \tag{12}$$

This method has been tested on different types of signals, as described in the literature. Acoustic signals were processed using this method in References [17,20]. Radar applications utilizing this approach are presented in References [21–23]. In this paper, the PFSR application is proposed in order to verify the possibility of assessing the motion parameters of a target. Because the estimators given by Equations (10)–(12) are actually equivalent, further considerations are made using one of them to demonstrate the correctness of the concept. This is an example of using the idea in PFSR applications, and each of the estimators should give similar results. The estimators, their limitations and their computational complexity have been compared in the literature [19,21]. Due to the smaller variance in comparison to the estimator K (see Section 2.2) and the lower sensitivity to noise (see Reference [21]), the authors decided to perform the tests using the estimator F; however, the estimators K and D can be used in the same way.
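A minimal numerical sketch of the F estimator of Equation (12) is given below. It is an illustration under stated assumptions rather than the implementation of References [19,21]: the derivatives of Λ are approximated with finite differences (`np.gradient`) instead of modified analysis windows, the window is Gaussian, frequency is expressed in Hz so that the estimate is in Hz/s, and all function names and signal parameters are hypothetical.

```python
import numpy as np


def log_magnitude_stft(x, fs, sigma_s, hop, nfft):
    """Return frame times (s), frequencies (Hz) and ln|STFT| (frames x bins)."""
    half = int(round(5 * sigma_s * fs))            # Gaussian window cut at +/- 5 sigma
    m = np.arange(-half, half + 1)
    h = np.exp(-0.5 * (m / (sigma_s * fs)) ** 2)
    starts = np.arange(0, len(x) - len(h) + 1, hop)  # only frames fully inside x
    frames = np.stack([x[s:s + len(h)] * h for s in starts])
    spec = np.fft.fftshift(np.fft.fft(frames, n=nfft, axis=1), axes=1)
    t = (starts + half) / fs                        # time of each window centre
    f = np.fft.fftshift(np.fft.fftfreq(nfft, d=1.0 / fs))
    return t, f, np.log(np.abs(spec) + 1e-300)      # small offset avoids log(0)


def f_estimator(lam, t, f):
    """Finite-difference version of Eq. (12): -(d2L/df dt) / (d2L/df2)."""
    d_ft = np.gradient(np.gradient(lam, t, axis=0), f, axis=1)
    d_ff = np.gradient(np.gradient(lam, f, axis=1), f, axis=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        return -d_ft / d_ff


# Hypothetical test signal: complex chirp with a 100 Hz/s chirp rate.
fs = 1000.0
tt = np.arange(2000) / fs
f0, alpha = 100.0, 100.0
x = np.exp(1j * 2 * np.pi * (f0 * tt + alpha * tt ** 2 / 2))

t, f, lam = log_magnitude_stft(x, fs, sigma_s=0.02, hop=4, nfft=256)
cr_map = f_estimator(lam, t, f)                     # accelerogram, Hz/s

# Read the accelerogram along the spectrogram ridge, away from the edges.
ridge = np.argmax(lam, axis=1)
cr_on_ridge = cr_map[np.arange(len(t)), ridge]
cr_estimate = float(np.median(cr_on_ridge[10:-10]))
print(f"estimated chirp rate: {cr_estimate:.2f} Hz/s (true value {alpha} Hz/s)")
```

Away from the ridge the log-magnitude is dominated by numerical noise, so only values close to the signal component are meaningful in this sketch.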

**Figure 2.** CR estimation in the TF domain—an interpretation.

#### *2.2. Analysis of the Estimation Accuracy*

The accuracy of the estimators has been compared in this section to the CRLB for the considered signal model. The analyzed complex chirp signal is given by:

$$x[n] = A\_x \exp\left(j \alpha \frac{n^2}{2}\right),\tag{13}$$

where *Ax* = 1 and *n* ∈ [0, *N* − 1]. This signal is embedded in white Gaussian noise *w*[*n*] with variance *σ*<sub>w</sub><sup>2</sup>. Thus, the observation vector **x** = [*x*[0], *x*[1], ..., *x*[*N* − 2], *x*[*N* − 1]]<sup>T</sup> is normally distributed:

$$\mathbf{x} \sim \mathcal{N}(\mu(\alpha), C(\alpha)),\tag{14}$$

where

$$\mu(\alpha) = \mathbf{x} = \left[ A\_x, A\_x \exp\left( j \alpha \frac{1^2}{2}\right), A\_x \exp\left( j \alpha \frac{2^2}{2}\right), \dots, A\_x \exp\left( j \alpha \frac{(N-1)^2}{2}\right) \right]^T,\tag{15}$$

and *C*(*α*) is the covariance matrix of the observation vector, whereas (·)<sup>T</sup> is the matrix transpose. For such a Gaussian observation model depending on the scalar parameter *α*, the Fisher information matrix (FIM) is a 1 by 1 matrix (scalar) given by [24]:

$$I(\alpha) = 2\Re\left(\left[\frac{\partial\mu(\alpha)}{\partial\alpha}\right]^H C^{-1}(\alpha) \left[\frac{\partial\mu(\alpha)}{\partial\alpha}\right]\right) + \frac{1}{2}\operatorname{tr}\left[\left(C^{-1}(\alpha)\frac{\partial C(\alpha)}{\partial\alpha}\right)^2\right],\tag{16}$$

where (·)<sup>H</sup> is the Hermitian transpose, tr denotes the matrix trace and ℜ denotes the real part. Since the additive noise *w*[*n*] is white, the covariance matrix *C*(*α*) equals *σ*<sub>w</sub><sup>2</sup>*I* (with *I* here denoting the identity matrix) and does not depend on the parameter *α*. Thus ∂*C*(*α*)/∂*α* = 0 and the second term in Equation (16) vanishes. Moreover, *C*<sup>−1</sup>(*α*) = (1/*σ*<sub>w</sub><sup>2</sup>)*I*. The FIM now takes the following form:

$$I(\alpha) = 2\Re\left(\frac{1}{\sigma\_w^2} \left[\frac{\partial\mu\left(\alpha\right)}{\partial\alpha}\right]^H \left[\frac{\partial\mu\left(\alpha\right)}{\partial\alpha}\right]\right). \tag{17}$$

According to Equation (15) it can be written as:

$$\begin{split} I(\alpha) &= 2\Re\left(\frac{1}{\sigma\_w^2} \sum\_{n=0}^{N-1} \left(\frac{\partial}{\partial \alpha} A\_x \exp\left(\frac{-j\alpha n^2}{2}\right)\right) \left(\frac{\partial}{\partial \alpha} A\_x \exp\left(\frac{j\alpha n^2}{2}\right)\right)\right) = \\ &= 2\Re\left(\frac{A\_x^2}{\sigma\_w^2} \sum\_{n=0}^{N-1} \exp\left(\frac{-j\alpha n^2}{2}\right) \left(-j\frac{n^2}{2}\right) \exp\left(\frac{j\alpha n^2}{2}\right) \left(j\frac{n^2}{2}\right)\right) = \frac{A\_x^2}{2\sigma\_w^2} \sum\_{n=0}^{N-1} n^4. \end{split} \tag{18}$$

Finally, the CRLB for the estimator variance is given by:

$$
\sigma^2\left(\hat{\alpha}\right) \ge \left[I\left(\alpha\right)\right]^{-1} = \frac{2\sigma\_w^2}{A\_x^2 \sum\_{n=0}^{N-1} n^4}.\tag{19}
$$

Following the article describing the estimators used [19], the accuracy of the estimates was investigated and compared with the CRLB given by Equation (19). A linear complex chirp was considered. The signal model given by Equation (13) has the following parameters: *N* = 250, *α* = 2*π* · 0.36/*N*. In 1000 noise realizations, with the SNR ranging from −20 to 30 dB, the CR value was estimated and verified at point *n*<sub>0</sub> on the reference frequency *f*<sub>ref</sub> = (*α*<sub>ref</sub>/2*π*)*n*<sub>0</sub>, where *α*<sub>ref</sub> = 0.009. Results for the estimators K, D and F are presented in Figure 3.
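The bound of Equation (19) is straightforward to evaluate for the parameters quoted above. The following sketch assumes that the SNR is defined as *A*<sub>x</sub><sup>2</sup>/*σ*<sub>w</sub><sup>2</sup>, which the text does not state explicitly; the function name and the SNR grid are hypothetical.

```python
import numpy as np


def chirp_rate_crlb(n_samples, snr_db, amplitude=1.0):
    """Equation (19): CRLB = 2 * sigma_w^2 / (A_x^2 * sum_{n=0}^{N-1} n^4).

    The noise variance is derived from the SNR under the ASSUMED definition
    SNR = A_x^2 / sigma_w^2.
    """
    n = np.arange(n_samples, dtype=float)
    noise_var = amplitude ** 2 / 10.0 ** (snr_db / 10.0)
    return 2.0 * noise_var / (amplitude ** 2 * np.sum(n ** 4))


N = 250
alpha = 2 * np.pi * 0.36 / N            # chirp rate of the test signal
snr_grid_db = np.arange(-20, 31, 10)    # coarse grid over the quoted SNR range
bounds = np.array([chirp_rate_crlb(N, snr) for snr in snr_grid_db])

print(f"alpha = {alpha:.5f}")
for snr, bound in zip(snr_grid_db, bounds):
    print(f"SNR = {snr:4d} dB -> CRLB = {bound:.3e}")
```

Because the bound is inversely proportional to the linear SNR, each 10 dB increase lowers it by a factor of ten, which gives a straight line on a logarithmic variance axis.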

**Figure 3.** Comparison of the accuracy of the utilized estimators.

As can be seen, for signals with a signal-to-noise ratio (SNR) ≥ 0 dB the D and F estimators give very similar results. Signals from this range are considered later in the paper, so choosing one of the two more accurate tools is justified. Additionally, in the authors' experience, the chosen estimator is less sensitive to changes in the analysis window width. Because of the uncertainty problem in the K estimator, the CR was verified in the vicinity of *f*<sub>ref</sub>. For this reason, the variance of this tool is clearly greater than that of the other two estimators. Although none of the tools attains the CRLB, it should be kept in mind that for SNR ≥ 0 dB the variance of the D and F estimators is satisfactorily low, and in many practical cases is sufficient to estimate the CR.

#### **3. Results of the Simulation**

In order to verify the proposed method, simulations were carried out. Two radar scenes were considered in which one or two targets crossed the baseline. The first of the analyzed cases covers the situation presented in Figure 4.

**Figure 4.** Radar scene for the 1st simulation case.

The continuous wave (CW) transmitter, which emits a harmonic signal at the carrier frequency *fc* = 900 MHz, is 1500 m away from the target trajectory (at the closest point), which in turn is 500 m from the receiver. The point target (its dimensions are neglected) moves with the velocity *v* = 10 m/s perpendicularly to the baseline. The TF distribution of the signal energy in the form of a spectrogram is presented in Figure 5. White Gaussian noise was added to the signal at SNR = 30 dB. This choice was made because the real-life signal analysis later in the article considers waveforms characterized by SNR > 0 dB, and the simulations were focused on such conditions. STFT parameters during the simulations are as follows:


**Figure 5.** Spectrogram of the 1st simulation case.

The spectrogram shows a characteristic curve, typical for a bistatic geometry utilizing the forward scattering phenomenon. The received signal is defined only by the Doppler frequency (due to the fact that the transmitter works with the CW) expressed by Equation (1), the consequence of which is the frequency modulation of the wave. The Doppler rate, in this case, can be considered as the CR described in the previous section and the physical interpretation of both is the same. Thus, the proposed CR estimation methods were employed to verify their usability in such a context. A calculated accelerogram presenting instantaneous CR for each point in the TF plane is depicted in Figure 6.

**Figure 6.** Accelerogram of the 1st simulation case.

By analyzing this distribution, it can be noted that during the crossing of the baseline the target provided a Doppler rate of ∼−1.5 Hz/s. Additionally, the Doppler rate can be determined at each point of the observation time. This is an advantage of the proposed approach, because the estimation process automatically provides a set of information about the movement parameters of the target, allowing an unknown trajectory or velocity to be assessed. In comparison to the methods presented in the literature, this approach provides unique movement signatures, which allow additional information about the object to be distinguished.

For the purpose of validation, the theoretical Doppler rate, as the first order derivative of the *fd* given by Equation (1), was compared with the estimated value. This value was read in the maximum point of the spectrogram in each time frame. Because of the amplitude modulation, the estimated value in the minimum of the envelope is distorted, thus additionally results for the constant signal amplitude were plotted. A comparison of the true value and estimated CR for both cases (amplitude modulation and no amplitude modulation) is presented in Figure 7.

**Figure 7.** Comparison of the true Doppler rate and estimated CR for two amplitude cases.

As can be seen, the estimated Doppler rate (CR) for the constant amplitude signal is similar to the theoretical value. In Figure 7 the blue line overlaps completely with the yellow line, which confirms that the estimation results converge with the theory. However, because the signal amplitude is locally near or equal to zero, the estimation process returns distorted results (see the red line in Figure 7). This is caused by the character of the signal, not by an estimation error, as can be seen in the case where the amplitude is constant. In the simulation, the maximum value of the spectrogram in each time frame was extracted, so at moments when the signal amplitude is near zero the maximum energy value was found in the noise, which caused significant errors.
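The ridge readout described above can be sketched as follows. The power threshold used to flag frames in which the ridge is likely to lie in the noise is an illustrative addition and is not part of the procedure described in the text; the function name, the threshold value and the toy arrays are hypothetical.

```python
import numpy as np


def read_along_ridge(spectrogram, cr_map, min_rel_power_db=-20.0):
    """Read the chirp rate at the spectrogram maximum of every time frame.

    Both inputs are arrays of shape (frames, frequency bins). Frames whose
    ridge power is more than |min_rel_power_db| below the strongest ridge
    value are returned as NaN (ASSUMED gating, for illustration only).
    """
    rows = np.arange(spectrogram.shape[0])
    ridge = np.argmax(spectrogram, axis=1)          # strongest bin per frame
    ridge_power = spectrogram[rows, ridge]
    values = cr_map[rows, ridge].astype(float)
    threshold = ridge_power.max() * 10.0 ** (min_rel_power_db / 10.0)
    values[ridge_power < threshold] = np.nan        # flag weak (noise) frames
    return ridge, values


# Toy example: the middle frame has almost no signal energy.
spec = np.array([[0.10, 5.00, 0.20, 0.10],
                 [0.01, 0.02, 0.03, 0.01],
                 [0.10, 0.20, 4.00, 0.30]])
cr = np.array([[0.0, -1.5, 0.0, 0.0],
               [9.0, 9.0, 9.0, 9.0],
               [0.0, 0.0, -1.4, 0.0]])
ridge, values = read_along_ridge(spec, cr)
print(ridge, values)
```

In the toy example the weak middle frame is flagged instead of contributing a spurious chirp rate, which mirrors the distortion seen when the amplitude envelope passes through its minimum.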

In order to present the advantages of the proposed method, an extended scenario of the simulation was carried out. In this situation two objects are considered, however, one of them approaches the baseline from a different angle in comparison to the first one, and with slightly higher velocity. The simulation parameters are presented in Figure 8.

**Figure 8.** Radar scene for the 2nd simulation case.

A spectrogram presenting the energy distribution of the reflected signal from the two targets is depicted in Figure 9. The additional target introduces a second curve with a different Doppler rate depending on the velocity and trajectory. Thus, based on the previous results it should be possible to differentiate considered targets on the accelerogram because of the individual Doppler rate seen during the time of analysis.

In the simulated case, it was assumed that both targets cross the baseline at approximately the same time in order to verify the possibility of extracting particular targets in PFSR geometry. An accelerogram of the simulated data is presented in Figure 10.

**Figure 9.** Spectrogram of the 2nd simulation case.

**Figure 10.** Accelerogram of the 2nd simulation case.

Based on the simulation carried out it was proved that CR estimation may be an effective technique, allowing additional information in PFSR to be extracted. Even if the Doppler rate in the vicinity of *t* = 0 (s) is similar for both targets, the Doppler rate history from the entire observation time provides additional information about the trajectory of the objects. Such data can be used as an extension for PFSR systems. In the next section, the method is tested using real-life data to verify the adaptability and usability of the proposed approach.

#### **4. Real-Life Signal Analysis**

#### *4.1. Measurement Campaign*

The experimental data were gathered during a measurement campaign using a GSM-based passive radar for monitoring ground moving objects. The source of the illuminating signal was a GSM base transceiver station with an antenna mounted at the height of approximately 50 m. During the trials, a cooperating vehicle was used as an observed target moving with the velocity *v* ≈ 10 m/s [25]. Measurements were carried out a few times under similar conditions. This was done to confirm the repeatability of results in comparable scenarios, and to verify the possibility of distinguishing characteristic features in the movement.

The location of the radar's surveillance antenna and trajectory of the target were selected in order to ensure that the forward scattering effect of the vehicle occurred at the observation spot. The surveillance antenna was mounted 1.5 m above the ground and was located approximately 1.5 km from the transmitter. The cooperating car crossed the baseline between the transmitter and the receiver within approximately 1 km of the transmitter.

The reference signal was acquired at the same location as where the measurement was performed. In order to reduce the influence of the target echo in the reference channel, an antenna with vertical beamwidth of approximately 25 degrees was mounted on a 12 m mast and tilted upwards.

The measurement equipment consisted of commercial off-the-shelf components, with a two-channel receiver based on the National Instruments PXIe-5667 vector signal analyzer, low noise amplifiers and band-pass filters for the GSM900 band (925–960 MHz) connected after the antennas of both the surveillance and reference channels. The measurement scene utilized during the trials is presented in Figure 11, and the general measurement geometry diagram is depicted in Figure 12. More details about the measurement campaign and the signal processing chain are available in Reference [25].

**Figure 11.** Measurement scene. In white: range from the GSM transmitter to the receiver; in red: range from the receiver to the intersection point; in blue: the target trajectory.

**Figure 12.** Measurement geometry diagram. Tx—GSM transmitter of opportunity, Rxr—the reference antenna, Rxs—the surveillance antenna.

#### *4.2. Target Doppler Rate Estimation*

The obtained data were examined with one of the analyzed estimators in order to verify whether the proposed tool is effective in analyzing real-life signals from PFSR. The recorded waveform was processed using the cross-correlation of the reference signal and the signal reflected from the target. As in the simulation, the object's dimensions can be neglected, because the available GSM signal bandwidth is ∼200 kHz, giving a bistatic range resolution of ∼1500 m. Four cases were considered in which the examined object crossed the baseline between the transmitter and the receiver. During the real-life signal analysis, the following parameters were employed:


The spectrograms of the analyzed cases are presented in Figure 13.

**Figure 13.** Spectrograms of all considered cases.

Apart from the signal that is caused by the analyzed phenomenon, that is the signal with frequency modulation, additional components are visible on the spectrograms. They arise for several reasons. The first of these is the presence of a significant number of stationary objects in the radar beam. A strong echo from the surface of the earth, trees or buildings creates a strong permanent component at *f* ≈ 0 (Hz), the impact of which can be reduced, as described in Reference [25], for example. The second effect is visible as numerous components with a much higher modulation coefficient. However, these are only visible for a short time. This phenomenon, in turn, arises as a result of the presence of side lobes, which cause the receiving of a signal reflected from a moving object as well as from other objects in space. In addition, there was a highway near the measuring scene, and the cars moving on it are visible in the spectrograms. However, it is possible to distinguish a significant component, which is the echo coming from the analyzed object.

For such obtained data, accelerograms, that is a local (instantaneous) CR distribution on the TF plane, were determined. The results for the analyzed cases are shown in Figure 14.

**Figure 14.** Accelerograms of all considered cases.

As can be seen, similar results were obtained for all four measurements, which confirms the repeatability of the results. Although the most significant portion of the signal energy lies around the moment at which the frequency changes sign (around 0 Hz), it is possible, as in the simulation, to determine the CR for the entire recorded segment. Such information can be extremely useful when determining the trajectory or kinematic parameters of the target. In the previously mentioned solutions [9–15], the Doppler rate was determined only at the moment when the baseline is crossed. The proposed method therefore extends the range of possibilities, allowing more information to be extracted. For all cases the estimated CR value at time *t*<sub>0</sub> is CR ∈ [−0.5, −0.3] Hz/s, which agrees with the actual value. For example, in the first case shown in Figures 13a and 14a, the frequency decreases by less than 30 Hz within 60 s, that is, by less than 0.5 Hz/s on average, which is consistent with the estimated value. In practice, it is difficult to reproduce the vehicle's motion exactly several times, which resulted in temporary changes in the signal. This can be observed by comparing the 1st and 2nd cases (corresponding spectrograms Figure 13a,b and accelerograms Figure 14a,b) with the 3rd and 4th (corresponding spectrograms Figure 13c,d and accelerograms Figure 14c,d). For the first two cases the car maintains a constant velocity, while for the other two cases the velocity is not maintained in the second phase of the movement. In the third case the car noticeably moves more slowly, which results in a decrease in the absolute value of the chirp rate. In the fourth case the acceleration of the car is noticeable: the accelerogram shows more rapid changes, in the range from −0.8 to −0.5 Hz/s, in the final part of the observed movement.

The presented results confirm the correctness of the proposed method in estimating the motion parameters of objects observed by the PFSR, as well as in extracting characteristic features of the object's movement. The tested algorithms can help and/or speed up the estimation of the object's parameters, which is especially important because information about the distance to the object is lost in the PFSR. In this situation, any additional information about the object may be useful.

#### **5. Conclusions**

The article has presented a novel approach to the analysis of signals occurring in radars using the forward scattering phenomenon. Based on the concept of the complex STFT phase and the CR estimators known from the literature, the signals from the PFSR radar have been analyzed. In the first part, using a mathematical model, the possibility of applying the proposed method has been tested using simulated signals. A situation has been considered in which one object and two objects with different kinematic parameters intersect the baseline. Simulations have confirmed the applicability of the method, after which the method was verified using real-life data. The role of the illuminator of opportunity has been fulfilled by the GSM transmitter and the transmitter-receiver line was crossed by the car. The analysis allowed the Doppler rate to be determined in the real scenario. In addition, the method gives the Doppler rate estimation, which had not yet been available, not only at the time of the baseline intersection, but also for the entire observed trajectory. A valuable property is the ability to extract temporary changes in velocity, which increases the amount of information describing the observed object.

The accuracy of the considered tools has been verified by statistical analysis and the comparison of results to the CRLB, which mainly showed the differences between the estimators as well as the expected accuracy. In the future, the authors want to verify the accuracy of the estimators for a different analysis window length. This is an important parameter due to the fact that in the classical STFT the length of the analysis window affects the estimation variance and bias, and because the tools used are based on STFT, the bias and the variance are also related to the analysis window's length.

A promising direction for further work is the extension of the method with the possibility of classifying objects and estimating target tracks. Based on the Doppler rate, not only at the intersection of the baseline but also over the wider observation period, it is possible to classify objects. This can be particularly important in security systems, where the detection of fast and maneuvering missiles is difficult for both active and passive radars. The use of the proposed approach for a transmitter located on the ground and a receiver on the ground or in orbit would allow for detection and classification of dangerous rockets and flying objects.

**Author Contributions:** Project administration, K.A.; formal analysis, K.A. and Z.G.; investigation, K.A.; methodology, K.A. and P.S.; raw radar data collection, P.K.; software, K.A. and P.K.; supervision, P.S.; validation, K.A.; visualization, K.A. and P.K.; Writing—Original Draft, K.A., P.K. and Z.G.; Writing—Review & Editing, P.S.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

*Article*

## **The Use of the Reassignment Technique in the Time-Frequency Analysis Applied in VHF-Based Passive Forward Scattering Radar**

#### **Marek Płotka \*, Karol Abratkiewicz, Mateusz Malanowski, Piotr Samczyński and Krzysztof Kulpa**

Institute of Electronic Systems, Faculty of Electronics and Information Technology, Warsaw University of Technology, 00-665 Warsaw, Poland; k.abratkiewicz@elka.pw.edu.pl (K.A.); M.Malanowski@elka.pw.edu.pl (M.M.); P.Samczynski@elka.pw.edu.pl (P.S.); k.kulpa@elka.pw.edu.pl (K.K.)

**\*** Correspondence: m.plotka@elka.pw.edu.pl

Received: 12 May 2020; Accepted: 15 June 2020; Published: 17 June 2020

**Abstract:** This paper presents the application of the time-frequency (TF) reassignment technique in passive forward scattering radar (FSR) using Digital Video Broadcasting – Terrestrial (DVB-T) transmitters of opportunity operating in the Very High Frequency (VHF) band. The validation of the proposed technique was done using real-life signals collected by the passive radar demonstrator during a measurement campaign. The scenario was chosen to test detection ranges and the capability of estimating the kinematic parameters of a cooperative airborne target in passive FSR geometry. Additionally, in the experiment the possibility of utilizing FSR geometry in foliage penetration conditions taking advantage of the VHF band of a DVB-T illuminator of opportunity was tested. The results presented in this paper show that the concentrated (reassigned) energy distribution of the signal in the TF domain allows a more precise target Doppler rate to be estimated using the Hough transform.

**Keywords:** passive forward scattering radar; forward scattering radar; passive radar; radar measurements; time-frequency analysis; time-frequency reassignment

#### **1. Introduction**

Over the past decades, passive radars have evolved significantly [1–3], which can be seen in numerous demonstrations and works devoted to this topic [4–8]. This has resulted from the advantages of passive radars and the possibility of detecting targets that do not have their own emission. Although issues related to passive radars are widely described in the literature, there are still problems that require a specific approach.

The main problem in Passive Coherent Location (PCL) radar technology is that classical passive radar processing for air target detection [9] does not allow a target to be detected and localized in the direction of the illuminator of opportunity. The radar is therefore "blind" at this particular angle; additionally, the target is in the first range cell, which provides unclear detection results. This direction is reserved for the collection of the reference signal, which is used for cross-correlation with signals collected from other receiver channels whose measurement antennas are pointed in other surveillance directions where a target echo is expected. This paper, however, deals with a methodology that allows kinematic parameters of the object to be distinguished even if the range information is lost. Various answers to the problem of passive radar direct path "blindness" can be found in the literature [10–15].

A possible solution to this issue may be the employment of reference signal reconstruction [10]. However, this technique only works for digital signals and with sufficient signal-to-noise ratio (SNR) values. A beamforming technique can also be employed to reduce direct signal leakage into the surveillance channels [11,12]. Aubry et al. [13] used a Constrained Least Squares two-dimensional localization algorithm. Its performance, expressed in terms of Root Mean Square Error (RMSE), is even comparable to the square root of the Cramer Rao Lower Bound (CRLB) for some of the simulation scenarios presented. A significant disadvantage of this algorithm is the need to employ multiple transmitters of opportunity. A different solution to the localization issue can be found in Aubry et al. [14], where joint target location is based on PCL and Time Difference of Arrival (TDOA) measurement techniques. However, the TDOA method requires multiple spatially separated radar receivers. Another study on target location accuracy in a multistatic scenario is presented by Anastasio et al. [15].

Another solution to this problem, which requires only a single receiver and a single transmitter of opportunity, is to add FSR methods to the PCL processing chain. FSR geometry allows a passive radar to detect targets crossing the Tx–Rx baseline and to estimate their main movement parameters, such as the target velocity [16,17]. In such a case, the detections and velocity estimates from the FSR module applied in a PCL radar can be passed as additional information to the radar tracker working in the bistatic range-Doppler plane. Such a method is expected to improve the detection and tracking performance in PCL processing significantly. This led the authors to study in more detail the possibility of applying FSR geometry in passive radar and to test novel methods for target Doppler frequency rate estimation that might be applied in PCL processing. An additional motivation was the use of low-frequency DVB-T sources of illumination in passive radars and their ability to penetrate foliage. VHF DVB-T operates in the 174–230 MHz band [18]. As these are relatively low frequencies, they penetrate foliage well. The authors carried out an experiment in which a VHF DVB-T based passive radar was deployed in a forest on a low mast around 3 m in height, much lower than the surrounding trees, and successfully detected air targets. The results have been described by Plotka et al. [19]. These results also motivated the authors to check how effectively the VHF DVB-T illuminator of opportunity could be used in FSR geometry, where the reference signal is also received from the direction of the transmitter.

This paper has the following structure: Section 2 presents the passive FSR geometry principle that is considered in this work. Section 3 describes the proposed method for the target Doppler rate estimation. Section 4 presents the measurement campaign and the numerical results obtained from real-life signals. The paper closes with comments and conclusions.

#### **2. Passive FSR Geometry**

The forward scattering phenomenon [16,17] is schematically depicted in Figure 1.

**Figure 1.** A typical passive forward scattering radar (FSR) geometry. Tx—transmitter, Rx—receiver, TGT—target, *L*—baseline, *R*1—range from the transmitter to the target, *R*2—range from the receiver to the target, *D*—range from the receiver to the crossing point, *β*—bistatic angle, *α*—angle between the bistatic bisector and the velocity vector.

In bistatic radars using an electromagnetic wave of a length *λ*, the Doppler shift *fd* produced by the target moving at the velocity *V* can be expressed as

$$f\_d = \frac{2V}{\lambda} \cos\left(\alpha\right) \cos\left(\beta/2\right),\tag{1}$$

where *β* is the bistatic angle and *α* is the angle between the bistatic bisector and the movement vector (see Figure 1). In a spatial configuration where the target crosses the baseline, the range information is lost and *β* ≈ 180°, thus *fd* ≈ 0 Hz. However, the target can be observed in a range *β* ∈ (180° − Δ, 180° + Δ), where Δ is a small angle, which delivers more information about the target trajectory. In the vicinity of *fd* ≈ 0 Hz, the phase of the signal impinging on the receiving antenna can be defined as [20]

$$
\phi(t) = \frac{2\pi}{\lambda} \left[ R\_1(t) + R\_2(t) - L \right]. \tag{2}
$$

Assume that the target moves along a linear trajectory at a constant velocity *v* and that the velocity vector (composed of the components *vx* and *vy* along the *x*- and *y*-axis, respectively) makes an angle *α* = tan<sup>−1</sup>(*vy*/*vx*) with the *x*-axis, which is normal to the baseline. If the target crosses the baseline at the distance *D* from the receiver (see Figure 1), then *x*(*t*) = *vxt* and *y*(*t*) = *D* + *vyt*, which leads to

$$R\_1(t) = \sqrt{x(t)^2 + \left(L - y(t)\right)^2} \tag{3}$$

and

$$R\_2(t) = \sqrt{x(t)^2 + y(t)^2}.\tag{4}$$

Apart from the Doppler history, the target forward radar cross section (FRCS) contributes to the signal reaching the receiver in the passive FSR system. Namely, for a rectangular target with horizontal dimension *l<sub>h</sub>* and vertical dimension *l<sub>v</sub>*, such that *l<sub>h</sub>* ≫ *λ* and *l<sub>v</sub>* ≫ *λ*, this contribution is given by [21]

$$\sigma(t) = \frac{L^2}{R\_1(t)R\_2(t)} \frac{\cos\left(\theta\_T(t) - \alpha\right) + \cos\left(\theta\_R(t) + \alpha\right)}{2} \Pi\left(\frac{l\_h}{\lambda}\left(\sin\left(\theta\_T(t) - \alpha\right) + \sin\left(\theta\_R(t) + \alpha\right)\right)\right) \tag{5}$$

where the target aspect angle with respect to the transmitter is *θ<sub>T</sub>*(*t*) = tan<sup>−1</sup>[*x*(*t*)/(*L* − *y*(*t*))], the aspect angle with respect to the receiver is *θ<sub>R</sub>*(*t*) = tan<sup>−1</sup>[*x*(*t*)/*y*(*t*)], and Π is the function Π(*x*) = sin(*x*)/*x*. Then, the signal impinging on the FSR receiving antenna can be written as

$$s(t) = -\sigma(t)\sin\left(\phi(t)\right). \tag{6}$$

Approximating the phase in Equation (2) by its third-order Taylor polynomial around the crossing point *t* = *t*<sub>0</sub> gives

$$
\phi(t) \approx \pi \varepsilon (t - t\_0)^2 + \pi \zeta (t - t\_0)^3,\tag{7}
$$

where

$$
\varepsilon = \frac{v\_x^2}{\lambda} \left[ \frac{1}{L - D} + \frac{1}{D} \right],
\tag{8}
$$

and

$$\zeta = \frac{v\_x^2 v\_y}{\lambda} \left[ \frac{1}{(L-D)^2} - \frac{1}{D^2} \right],\tag{9}$$

and the latter equation goes to 0 for *vy* = 0 or *D* = *L*/2. Finally, the signal phase can be approximated as follows

$$
\phi(t) \approx \frac{\pi}{\lambda} v\_x^2 \left( \frac{1}{L - D} + \frac{1}{D} \right) (t - t\_0)^2,\tag{10}
$$

where *vx* denotes the velocity component perpendicular to the baseline *L*, *D* is the range from the receiver to the crossing point, and *t*<sub>0</sub> is the moment at which the target crosses the baseline. Equation (10) is a quadratic phase function, so the received signal is linearly frequency-modulated. The modulation factor (also known as the chirp rate, frequency rate, or frequency slope) is valuable information describing the target when the range measurement is ambiguous, which is the case in passive FSR. Thus, any additional characterization of the target movement is significant in this case. In the literature, different approaches have been proposed to estimate the motion parameters, such as spectrogram analysis [22], chirp rate estimation in the time-frequency (TF) domain [23], or the Radon transform [20]. This paper builds on the last of these approaches. The authors apply the TF reassignment technique, known from the literature, to improve the resolution of the TF distribution, and then extract the kinematic parameter of the cooperative target using the Hough transform, which can be interpreted as a discrete realization of the Radon transform [24,25]. The obtained outcomes are compared with the existing methodology [20,26–29], and the improvement in estimation precision is shown.
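As a rough numerical check of the approximation above, the following sketch compares the exact phase of Equation (2) with the quadratic term of Equation (7), using the coefficient of Equation (8). The baseline length, crossing distance, and speed are taken from Section 4.1; the wavelength `lam = 1.6` m and the perpendicular crossing (`vy = 0`) are assumptions made only for illustration, as are the function names.

```python
import numpy as np

def fsr_phase(t, vx, vy, L, D, lam):
    """Exact phase of Equation (2) for a linear trajectory crossing at t = 0."""
    x = vx * t
    y = D + vy * t
    r1 = np.hypot(x, L - y)   # Equation (3): transmitter-to-target range
    r2 = np.hypot(x, y)       # Equation (4): receiver-to-target range
    return 2.0 * np.pi / lam * (r1 + r2 - L)

def chirp_rate(vx, L, D, lam):
    """Quadratic coefficient of Equation (8), in Hz/s."""
    return vx**2 / lam * (1.0 / (L - D) + 1.0 / D)

# L, D and vx follow Section 4.1; lam and vy = 0 are assumed values.
L, D, vx, vy, lam = 27706.0, 280.0, 44.0, 0.0, 1.6

t = np.linspace(-1.0, 1.0, 2001)          # one second either side of the crossing
phi_exact = fsr_phase(t, vx, vy, L, D, lam)
eps = chirp_rate(vx, L, D, lam)
phi_quad = np.pi * eps * t**2             # quadratic term of Equation (7), t0 = 0
max_err = np.max(np.abs(phi_exact - phi_quad))
print(f"chirp rate = {eps:.3f} Hz/s, max phase error = {max_err:.4f} rad")
```

Under these assumptions the quadratic model stays close to the exact phase within about one second of the crossing, which is the regime in which the chirp-rate interpretation is used.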

#### **3. Target Doppler Rate Estimation**

Typically, the signal received by the passive FSR antenna is, after initial processing, presented in the TF domain as a quasi-linear frequency-modulated waveform in accordance with Equation (10). Next, the Doppler rate at the crossing point, which determines the kinematic parameters of the target, is estimated using the Radon transform [20,26,28,29]. As shown by Toft et al. [24,25], the Hough transform can be used equivalently as a discrete realization of the Radon transform, and thus for the estimation of the frequency slope of the signal in the TF domain. This approach is very fast and can be implemented in real-life systems; however, some details have to be taken into account. Namely, classical TF representations suffer from the limited resolution resulting from the Heisenberg–Gabor uncertainty principle [30], which degrades the estimation accuracy. In particular, when the SNR is low or when several objects cross the baseline at the same time, the method proposed by Ustalli et al. may require additional processing steps. Generally speaking, in many practical applications the short-time Fourier transform (STFT) based approach may be insufficient due to the finite resolution of the TF plane. Additionally, the resolution depends strongly on the processing parameters, such as the window type, window width, and overlap. Even a high-SNR signal can be distributed incorrectly over the TF plane if the processing parameters are poorly chosen. One of the popular and widely applied methods for TF resolution enhancement is TF reassignment [30–33]. This method can be implemented through the classical STFT-based method, as well as using a recursive version presented by Fourer et al. [34]. The latter is particularly interesting because it can be implemented efficiently and performs the energy relocation (*t*, *ω*) → (*t̂*, *ω̂*) quickly. As the recursive and the fast Fourier transform (FFT)-based implementations are equivalent, the FFT-based method is used in this paper as an example of the technique.

In general, the STFT of the signal *x*(*t*) can be computed as follows:

$$F\_x^h(t,\omega) = \int\_{\mathbb{R}} x(\tau) h^\*(t-\tau) e^{-j\omega\tau} d\tau = M\_x^h(t,\omega) e^{j\phi\_x^h(t,\omega)},\tag{11}$$

where *j* = √−1, (·)<sup>∗</sup> is the complex conjugate, ℝ denotes the set of real numbers, *M<sub>x</sub><sup>h</sup>*(*t*, *ω*) is the amplitude, and *φ<sub>x</sub><sup>h</sup>*(*t*, *ω*) is the phase of the transform. The energy distribution, commonly called a spectrogram, is defined as the squared absolute value of the STFT and is given by

$$E\_x^h(t,\omega) = |F\_x^h(t,\omega)|^2,\tag{12}$$

where |·| denotes the absolute value operator.
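A discrete counterpart of Equations (11) and (12) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper `stft`, the synthetic chirp, the sampling rate, and the Gaussian window length and width are all assumptions chosen so that the example runs quickly.

```python
import numpy as np

def stft(x, win, hop, nfft):
    """Discrete STFT with frames centred every `hop` samples (odd-length window)."""
    half = len(win) // 2
    xp = np.pad(x, (half, half))                      # zero-pad so frames are centred
    n_frames = 1 + (len(x) - 1) // hop
    frames = np.stack([xp[i * hop : i * hop + len(win)] * win
                       for i in range(n_frames)])
    return np.fft.fft(frames, n=nfft, axis=1).T       # shape (nfft, n_frames)

# Synthetic linear chirp crossing 0 Hz at t = 5 s (assumed test signal).
fs, c, hop, nfft = 200.0, 4.0, 10, 1024
t = np.arange(0.0, 10.0, 1.0 / fs)
x = np.exp(1j * np.pi * c * (t - 5.0) ** 2)

win = np.exp(-0.5 * ((np.arange(257) - 128) / 40.0) ** 2)   # Gaussian window
F = stft(x, win, hop, nfft)
E = np.abs(F) ** 2                                    # spectrogram, Equation (12)

freqs = np.fft.fftfreq(nfft, d=1.0 / fs)
frame_times = np.arange(E.shape[1]) * hop / fs
ridge = freqs[np.argmax(E, axis=0)]                   # peak frequency per frame
```

The ridge of the spectrogram follows the instantaneous frequency *c*(*t* − 5) of the test chirp, but the energy around it is smeared over several bins, which is the limitation that reassignment addresses.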

*Sensors* **2020**, *20*, 3434

Using the relationship defined by Hahn [35], Equation (11) can be transformed into a complex phase as follows:

$$\Phi\_x^h(t,\omega) = \ln \left( F\_x^h(t,\omega) \right) = \Lambda\_x^h(t,\omega) + j\phi\_x^h(t,\omega), \tag{13}$$

where *Λ<sub>x</sub><sup>h</sup>*(*t*, *ω*) = ln *M<sub>x</sub><sup>h</sup>*(*t*, *ω*). The concept of complex phase is widely used in the literature as an effective tool for the estimation of signal parameters in the TF domain, and it is also applied in the considered approach. TF reassignment requires the relocation operators to be estimated; these correspond to the vectors along the time and frequency axes by which the energy has to be moved. In the investigated approach, the reassignment operators may be estimated, respectively, as [31]:

$$\hat{t}(t,\omega) = -\Im\left(\frac{\partial \Phi\_x^h(t,\omega)}{\partial \omega}\right) = t - \Re\left(\frac{\partial \Phi\_x^h(t,\omega)}{\partial \omega}\right) = t - \Re\left(\frac{F\_x^{Th}(t,\omega)}{F\_x^h(t,\omega)}\right),\tag{14}$$

$$\hat{\omega}(t,\omega) = \omega + \Im \left( \frac{\partial \Phi\_x^h(t,\omega)}{\partial t} \right) = \omega + \Im \left( \frac{F\_x^{\mathcal{D}h}(t,\omega)}{F\_x^h(t,\omega)} \right),\tag{15}$$

where ℜ(·) is the real and ℑ(·) is the imaginary part, *t̂*(*t*, *ω*) denotes the relocation along the *t*-axis, and *ω̂*(*t*, *ω*) expresses the reassignment vector along the frequency axis. In contrast to other energy concentration techniques known from the literature, for example the TF synchrosqueezing transform [32], the reassignment method allows strong concentration to be obtained; however, this technique is irreversible. In Equation (14), the expression *Th* = *t* *h*(*t*) is the window multiplied by a linear time ramp with a root at 0, and 𝒟*h* in Equation (15) denotes the first-order derivative of the analyzing window, 𝒟*h* = d*h*(*t*)/d*t*. In fact, these operations can be interpreted as follows:

$$
\frac{\partial F\_x^h(t,\omega)}{\partial t} = \int x(\tau) \frac{\partial h^\*(t-\tau)e^{-j\omega(t-\tau)}}{\partial t} d\tau = F\_x^{\mathcal{D}h}(t,\omega), \tag{16}
$$

as well as:

$$\frac{\partial F\_x^h(t,\omega)}{\partial \omega} = \int x(t-\tau) h^\*(\tau) \frac{\partial e^{-j\omega\tau}}{\partial \omega} \mathrm{d}\tau = -\int x(t-\tau) h^\*(\tau) j\tau e^{-j\omega\tau} \mathrm{d}\tau = -jF\_x^{Th}(t,\omega), \tag{17}$$

which leads to:

$$\frac{\partial \ln \left( F\_x^h (t, \omega) \right)}{\partial t} = \frac{\partial F\_x^h (t, \omega)}{\partial t} \frac{1}{F\_x^h (t, \omega)} = \frac{F\_x^{\mathcal{D}h} (t, \omega)}{F\_x^h (t, \omega)} \tag{18}$$

and

$$\frac{\partial \ln(F\_x^h(t,\omega))}{\partial \omega} = \frac{\partial F\_x^h(t,\omega)}{\partial \omega} \frac{1}{F\_x^h(t,\omega)} = -j\frac{F\_x^{Th}(t,\omega)}{F\_x^h(t,\omega)}\tag{19}$$

which can be directly applied in Equations (14) and (15). This means that the reassignment operators can be easily computed through the STFT method using the modified analyzing window, which increases the utility of the method and reduces the computational effort. Equivalently, the method may be implemented using a recursive filter bank, as described in [34].

Finally, the energy relocation using the reassignment method can be expressed as [31]

$$R\_x^h(t,\omega) = \iint\_{\mathbb{R}^2} |F\_x^h(t,\omega)|^2 \delta(t - \hat{t}(t,\omega)) \delta(\omega - \hat{\omega}(t,\omega)) \mathrm{d}t \mathrm{d}\omega,\tag{20}$$

where *δ*(·) denotes the Dirac distribution. The distribution given by Equation (20) results in strongly concentrated energy on the TF plane, with an enhanced readability and separated components. In fact, the reassignment method usually does not relocate the maximum of the energy but only attracts the surrounding distribution, hence the signal localization remains stable whilst the readability of the transform is improved.
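The reassignment procedure described above (three STFTs, the two relocation operators, and the energy accumulation of Equation (20)) can be sketched in discrete form as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name, the Gaussian width `sigma` given in samples, the energy floor, and the test chirp are hypothetical. The signs of the operators follow the window-centred framing used in this code, which may differ from Equations (14) and (15) depending on the STFT convention.

```python
import numpy as np

def reassigned_spectrogram(x, fs, sigma, hop=2, nfft=512, floor=1e-6):
    """Return (freqs, times, spectrogram, reassigned spectrogram)."""
    half = int(4 * sigma)
    k = np.arange(-half, half + 1)            # local time index within a frame
    h = np.exp(-0.5 * (k / sigma) ** 2)       # analysing window h
    th = k * h                                # time-ramped window t*h
    dh = -(k / sigma**2) * h                  # derivative window dh/dt

    xp = np.pad(x, (half, half))
    starts = np.arange(0, len(x), hop)        # frame centres in samples
    frames = xp[starts[:, None] + np.arange(k.size)[None, :]]

    S_h = np.fft.fft(frames * h, n=nfft, axis=1)
    S_th = np.fft.fft(frames * th, n=nfft, axis=1)
    S_dh = np.fft.fft(frames * dh, n=nfft, axis=1)

    E = np.abs(S_h) ** 2
    keep = E > floor * E.max()                # skip bins with negligible energy
    safe = np.where(keep, S_h, 1.0)           # avoid division by ~0

    freqs = np.fft.fftfreq(nfft, d=1.0 / fs)
    t_hat = starts[:, None] + np.real(S_th / safe)                       # samples
    f_hat = freqs[None, :] - np.imag(S_dh / safe) * fs / (2.0 * np.pi)   # Hz

    ti = np.rint(t_hat / hop).astype(int)
    fi = np.mod(np.rint(f_hat * nfft / fs).astype(int), nfft)
    valid = keep & (ti >= 0) & (ti < starts.size)

    R = np.zeros_like(E)
    np.add.at(R, (ti[valid], fi[valid]), E[valid])   # move energy to (t_hat, f_hat)
    return freqs, starts / fs, E, R

# Assumed test signal: linear chirp with rate 4 Hz/s crossing 0 Hz at t = 5 s.
fs, c = 100.0, 4.0
t = np.arange(0.0, 10.0, 1.0 / fs)
x = np.exp(1j * np.pi * c * (t - 5.0) ** 2)
freqs, times, E, R = reassigned_spectrogram(x, fs, sigma=20.0)

def concentration(P):
    """Simple concentration measure: larger means energy sits in fewer bins."""
    return np.sum(P**2) / np.sum(P) ** 2
```

For the test chirp, `concentration(R)` should come out clearly larger than `concentration(E)`, reflecting the energy being gathered onto the chirp line, while the location of the ridge itself is unchanged.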

TF reassignment is a widely used technique in many applications, e.g., ultrasound signal processing [36], audio signal analysis [37,38], and sonar [39]. However, despite the high potential of this method, it is still not very popular in the radar community. The radar-related results that can be found in the literature concern energy concentration in micro-Doppler signature analysis [40–42], characterization of frequency shift keying (FSK) signals [43], improving the quality of inverse synthetic aperture radar (ISAR) imaging [44], direction of arrival estimation [45], and Doppler radar tomography imaging [46]. The novelty presented in this paper is to combine the TF reassignment method with the Hough transform in order to extract the Doppler rate of the target with improved accuracy compared with the classical method existing in the literature.

The valuable properties of the TF reassignment method prompted the authors to apply this technique to the analysis of radar signals in a passive FSR application. Its "energy gathering" properties may improve the accuracy of the Doppler rate estimation in passive FSR systems. This is the novelty proposed in this paper, since to the authors' knowledge there is no similar application that uses the TF reassignment method to improve the accuracy of the Doppler rate estimate in a passive FSR system. Additionally, the outcomes are compared with the classical approach existing in the literature, with particular emphasis on the usability in real-life data processing, which is investigated in the next section.

#### **4. Numerical Experiments**

#### *4.1. Measurement Campaign*

The measurements took place during the APART GAS 2019 (Active PAssive Radar Trials Ground based, Airborne, Sea-borne) trials. The trials were described by Plotka et al. [19]; nevertheless, some details of the geometry of the FSR measurement scenario are given here. The positions of the receiving station, transmitter of opportunity, aircraft trajectory, and transmitter–receiver baselines are depicted in Figure 2.

**Figure 2.** Scenario geometry: transmitter–receiver baseline (white line), cross point marked by a green circle, aircraft trajectory in red. Panel (**b**): zoom on the baseline cross point.

During the passive FSR measurements, the radar receiving station was placed in an open space on an airfield (see Figure 3). The location was chosen so that FSR geometry could also be tested in foliage penetration conditions: the receiver was placed close to the forest line, with the trees in the direction of the transmitter of opportunity (see Figure 2b). The radar demonstrator was equipped with six antennas, but only two of them were used in the presented FSR experiment. During the trials, the signals from both V- and H-polarized receiving antennas were gathered; however, due to the lack of significant differences between the results, only one selected pair of receiving channels was analyzed. Hence, the remainder of this paper presents the signal from the V-polarized antennas mounted on a tripod mast at a height of ca. 3 m above the ground. The parameters of the employed transmitter of opportunity are listed in Table 1.

**Figure 3.** Radar demonstrator during measurements.


**Table 1.** Main parameters of the transmitter of opportunity.

During the measurements a cooperative target was used—a light Cessna aircraft (see Figure 4). At the moment of crossing the transmitter–receiver baseline, the aircraft flight parameters were as follows: the target altitude was 196 m above terrain level and velocity was 44 m/s. The distance between receiver and target was 280 m, the distance between transmitter and target was 27.5 km, and the distance between receiver and transmitter was 27.706 km.

**Figure 4.** The cooperative aerial target.

The composition of the receiving station was as follows. Antennas: commercial-off-the-shelf (COTS) 4-element Yagi–Uda antennas, with directivity from 6 dBi to 8 dBi, operating in the upper VHF frequency band (170 MHz up to 230 MHz). Analog front-end: COTS channel amplifier, operating in the 87–230 MHz frequency band, with 25 dB gain. Digital signal recorder (see Figure 5): Vector Signal Analyzer (VSA) based on National Instruments PXIe components, with six independent input channels synchronized coherently with the GPS signal, operating in the frequency range 10 MHz–6.6 GHz with a maximum bandwidth of 50 MHz. A more detailed description of the radar demonstrator hardware has been presented by Plotka et al. [19].

**Figure 5.** Digital multichannel signal recorder.

#### *4.2. Results*

Signals recorded by the passive radar demonstrator were processed with the passive FSR signal processing chain as presented in Figure 6.

**Figure 6.** Passive FSR signal processing scheme.

First, the signals from two channels (reference and surveillance) were selected for further calculations. All further computations were performed on signal blocks with a 200 ms integration period (which resulted in a velocity resolution of 8.13 m/s). Next, a clutter filter was used to remove the reference signal from the surveillance signal [9]. This operation normally reduces the target echo power significantly when the target reaches zero Doppler velocity. In order to limit the scale of this phenomenon, the clutter removal filter coefficients were fixed for the time when the target was crossing the baseline [47]. An additional step of signal processing was the removal of the direct current (DC) offset, which was achieved by subtracting the mean value from the signal after clutter filtering. The last step of the processing was a multi-stage decimation. According to the simulations carried out, the expected Doppler frequency was not greater than 100 Hz. The input sampling rate of the recorded signals was 8 MHz, so the filtered signal could be down-sampled by a factor of a few thousand without losing valuable information. This step considerably reduced the number of subsequent calculations. It should be mentioned that the bistatic range resolution for the processed signal is 37.5 m (this is the size of the first range cell).
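The DC removal and multi-stage decimation steps can be illustrated with a short sketch. The stage factors `(10, 10, 10, 8)`, giving a total factor of 8000 and an output rate of 1 kHz, are an assumption: the text only states that the 8 MHz signal was down-sampled by a factor of a few thousand. The helper name and the 50 Hz test tone are likewise hypothetical, and a real-valued signal is used for simplicity.

```python
import numpy as np
from scipy.signal import decimate

def remove_dc_and_decimate(x, factors):
    """Subtract the mean, then decimate in several anti-aliased FIR stages."""
    y = x - np.mean(x)                        # DC offset removal
    for q in factors:
        # Each stage low-pass filters before keeping every q-th sample.
        y = decimate(y, q, ftype="fir", zero_phase=True)
    return y

fs_in = 8e6                                   # input sampling rate from the text
factors = (10, 10, 10, 8)                     # assumed stage factors (total 8000)
n_in = 1_600_000                              # 0.2 s of data at 8 MHz

t = np.arange(n_in) / fs_in
x = 0.3 + np.cos(2.0 * np.pi * 50.0 * t)      # 50 Hz "Doppler" tone plus DC offset

y = remove_dc_and_decimate(x, factors)
fs_out = fs_in / np.prod(factors)             # 1 kHz, well above 2 x 100 Hz

spectrum = np.abs(np.fft.rfft(y))
peak_hz = np.fft.rfftfreq(y.size, d=1.0 / fs_out)[np.argmax(spectrum)]
```

Splitting the total factor into small stages keeps each anti-aliasing filter short, which is the usual reason for preferring multi-stage decimation over a single very narrow filter.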

Next, the signal was transformed into the TF domain using Equation (11), giving the classical signal distribution on the 2D TF plane shown in Figure 7. The Doppler history is clearly visible for the entire trajectory; however, some interference related to multipath propagation and the clutter removal algorithm is apparent, especially at the point where the waveform changes its frequency sign. Additionally, the clutter cannot be suppressed at this point due to the presence of the useful signal, which should not be filtered out.

**Figure 7.** The energy distributions of the measured signal obtained using the classical short-time Fourier transform (STFT).

The same signal was processed using the concept of energy concentration. The outcomes of this approach are depicted in Figure 8, where a significant concentration was obtained. In such a case, component extraction and separation are possible even at low SNR.

**Figure 8.** The energy distributions of the measured signal obtained using the concentrated STFT.

Both the classical and the reassigned distributions were obtained using an 8192-point FFT and a Gaussian window with a standard deviation *σ* = 0.2. The shift of the window between consecutive processing steps was equal to 1 sample in order to provide a precise signal representation. Then, in accordance with the approach proposed by Ustalli et al. [20,26,28,29], the Hough transform was applied to estimate the signal Doppler rate as the slope of the straight line formed on the TF plane by the Doppler history of the signal near the zero-crossing point. The results are depicted in Figure 9.

**Figure 9.** Results of the Hough transform obtained for the two methods investigated—the classical and the reassigned STFT.

As can be observed, both the classical and the concentrated distributions allowed the Doppler rate to be estimated. The selected sections of both distributions, *f* ∈ (−20, 20) Hz and *t* ∈ (40, 60) s, containing the most important parts of the signal, were processed using the Hough transform, which gave the frequency slope at the point of interest. The estimated values were *f<sub>S</sub>* = −3.6392 Hz/s for the classical STFT distribution and *f<sub>R</sub>* = −3.9873 Hz/s for the reassigned spectrogram.
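The slope estimation step can be sketched with a simple Hough accumulator. This sketch parametrises the line as *f* = *a*(*t* − *t*<sub>0</sub>), with the Doppler rate *a* and the crossing time *t*<sub>0</sub> as the two Hough parameters; this parametrisation, the function name, the threshold, and the synthetic TF image are assumptions for illustration, and the implementation used in the paper may differ (for example, by using the normal form of the line).

```python
import numpy as np

def hough_chirp(P, times, freqs, slopes, t0_grid, rel_thresh=0.5):
    """Accumulate TF energy along candidate lines f = a * (t - t0)."""
    acc = np.zeros((slopes.size, t0_grid.size))
    ti, fi = np.nonzero(P >= rel_thresh * P.max())   # strong TF cells only
    w = P[ti, fi]
    t, f = times[ti], freqs[fi]
    dt0 = t0_grid[1] - t0_grid[0]
    for i, a in enumerate(slopes):
        t0 = t - f / a                               # crossing time implied by each cell
        j = np.rint((t0 - t0_grid[0]) / dt0).astype(int)
        ok = (j >= 0) & (j < t0_grid.size)
        np.add.at(acc[i], j[ok], w[ok])              # energy-weighted votes
    i, j = np.unravel_index(np.argmax(acc), acc.shape)
    return slopes[i], t0_grid[j], acc

# Synthetic TF image on the same section as in the text: a line with
# slope -4 Hz/s crossing 0 Hz at t = 50 s (assumed test data).
times = np.linspace(40.0, 60.0, 401)
freqs = np.linspace(-20.0, 20.0, 201)
P = np.zeros((times.size, freqs.size))
f_line = -4.0 * (times - 50.0)
inside = np.abs(f_line) <= 20.0
cols = np.rint((f_line[inside] + 20.0) / 0.2).astype(int)
P[np.nonzero(inside)[0], cols] = 1.0

slopes = np.arange(-6.0, -1.999, 0.05)               # candidate Doppler rates, Hz/s
t0_grid = np.arange(45.0, 55.001, 0.05)              # candidate crossing times, s
a_est, t0_est, acc = hough_chirp(P, times, freqs, slopes, t0_grid)
```

Because each TF cell votes with its energy, a more concentrated distribution produces a sharper peak in the accumulator, which is the mechanism by which reassignment can improve the slope estimate.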

Both estimated lines coincide with the Doppler history, and visual analysis does not reveal which of them is more precise. Thus, in order to verify the correctness of the estimates, an additional line was created. The Hough transform was applied to the curve fragment derived from GPS data (see the green line in Figure 9), allowing a precise reference Doppler rate to be assessed. Next, the error between the reference (GPS data) and the two estimated lines was computed. However, due to the limited precision of the DVB-T transmitter localization, as well as the smoothing of the GPS-based trajectory, the actual crossing point is mismatched. Additional sources of mismatch error may be connected with the GPS logger: the accuracy of its coordinate estimates is limited, and its location inside the aircraft also mattered. Another factor that affected the accuracy is the information about the transmitter (Tx) position. For the analysis, the authors took the Tx position from an open database and validated the Tx coordinates using Google Maps. However, the Tx position is often given with a precision of several meters, which might have had an impact on the presented results. Therefore, an additional simulation was performed in which the geometry was modified by changing the transmitter position in order to reduce this fault. After these modifications the error was eliminated, and the results for both the initial and modified trajectories are depicted in Figure 10. The plots show the absolute value of the estimation error *δ*.

**Figure 10.** Absolute value of estimated errors of the Doppler rate estimation—initial error and after correction.


The proposed method, which concentrates the signal energy in the TF domain, improved the precision of the Doppler rate estimation. As can be observed, the error was reduced for both parameters: the Doppler rate and the time at which the target crossed the baseline. This result indicates the effectiveness of the proposed approach.

For the Hough transform, the computational complexity increases at a rate of O(*A*<sup>m−2</sup>), where *A* denotes the size of the image space and *m* corresponds to the number of parameters applied in the processing pipeline. In the proposed method, the Hough transform is the same as in the classical approach with STFT. Therefore, the only difference results from the signal processing associated with the reassignment operation. For this reason, only the STFT and the reassigned STFT computational complexity are compared in Table 2 [40].

**Table 2.** Computational complexity for the STFT and the reassigned STFT in the O notation. *N* is the number of points in the FFT analysis, *M* is the signal length in samples, *H* is the window shift in samples, and *K* = *M*/*H* is the number of time instants at which the FFT has to be computed.


The increase in computational complexity of the concentrated spectrogram technique results from the fact that three distributions have to be computed in order to evaluate Equation (20). Namely, the first distribution is the classical STFT with the original window according to Equation (11). The second one corresponds to Equation (16), and the last one to Equation (17). Hence, the precision of the Doppler rate can be improved at the expense of computational complexity. However, the processing time may be reduced by adjusting the processing parameters. For the purpose of this paper, the number of frequency bins and the window shift were chosen with some redundancy in order to obtain high-resolution distributions. In practice, these parameters can be reduced to ensure fast processing. For the parameters defined above, the computation times for the spectrogram (*t<sub>S</sub>*) and the reassigned spectrogram (*t<sub>R</sub>*) were *t<sub>S</sub>* = 0.42 s and *t<sub>R</sub>* = 1641.09 s, respectively, but for a 1024-point FFT these times were reduced to *t<sub>S</sub>* = 0.17 s and *t<sub>R</sub>* = 90.73 s. The calculations were performed in the Matlab environment on a computer with an Intel i7-7700HQ 2.8 GHz processor, 16 GB of DDR4 RAM, an SSD drive, and 64-bit Windows 10. By further reducing the distribution quality with the window shift *H* = 2, the processing time was additionally decreased to *t<sub>S</sub>* = 0.06 s and *t<sub>R</sub>* = 29.62 s with nearly the same readability of the distribution. This analysis shows that the processing time can be easily reduced while the resolution is almost preserved, so that enhanced estimation of the motion parameters remains possible.

#### **5. Discussion and Conclusions**

In this paper, the concept of applying the reassignment technique in passive FSR applications has been proposed. The main purpose of using this method was to enhance the readability of the energy distribution in the TF domain, which improved the result of the Hough transform and, finally, the precision of the Doppler rate estimation in the passive FSR system. In the considered case, the passive FSR system used a DVB-T VHF signal as the source of illumination, and the cooperative target (a Cessna aircraft) crossed the baseline between the signal transmitter and the passive radar demonstrator. The investigation showed that the low-frequency illuminating signal allows detection in this specific geometry and, additionally, that the signal processing pipeline may be improved by using the concept of energy concentration. Such an improved energy representation can be utilized in further processing for different purposes, such as the estimation of a maneuvering target's trajectory in the passive FSR configuration, where the range information is ambiguous and the only data describing the target is its Doppler rate, which is related to its velocity and trajectory. The concept of using passive FSR systems may be implemented in various real-life systems known from the literature as well as in completely new applications. The authors consider realization of the technique in the following applications:


In the future, the authors intend to work on the above applications and their implementation in passive FSR systems. An additional perspective is to investigate the possibility of applying real-time processing for the methods presented in this paper.

**Author Contributions:** Formal analysis, M.P. and K.A.; investigation, M.P. and K.A.; methodology, M.P., K.A., M.M., P.S., and K.K.; project administration, M.P.; software, M.P. and K.A.; supervision, M.M., P.S., and K.K.; validation, M.M., P.S., and K.K.; visualization, M.P. and K.A.; writing—original draft, M.P. and K.A.; writing—review and editing, M.M., P.S., and K.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Noise Suppression for GPR Data Based on SVD of Window-Length-Optimized Hankel Matrix**

**Wei Xue 1,2,\*, Yan Luo 1,2, Yue Yang 1,2 and Yujin Huang 1,2**


Received: 16 July 2019; Accepted: 31 August 2019; Published: 3 September 2019

**Abstract:** Ground-penetrating radar (GPR) is an effective tool for subsurface detection. Due to the influence of the environment and equipment, the echoes of GPR contain significant noise. In order to suppress noise for GPR data, a method based on singular value decomposition (SVD) of a window-length-optimized Hankel matrix is proposed in this paper. First, SVD is applied to decompose the Hankel matrix of the original data, and the fourth root of the fourth central moment of singular values is used to optimize the window length of the Hankel matrix. Then, the difference spectrum of singular values is used to construct a threshold, which is used to distinguish between components of effective signals and components of noise. Finally, the Hankel matrix is reconstructed with singular values corresponding to effective signals to suppress noise, and the denoised data are recovered from the reconstructed Hankel matrix. The effectiveness of the proposed method is verified with both synthetic and field measurements. The experimental results show that the proposed method can effectively improve noise removal performance under different detection scenarios.

**Keywords:** ground-penetrating radar; noise suppression; singular value decomposition; Hankel matrix; window length optimization

#### **1. Introduction**

Ground-penetrating radar (GPR) is a geophysical detection instrument that transmits high-frequency electromagnetic waves and receives their reflections [1]. GPR has been widely used in fields such as civil engineering, archaeology, geology, and military exploration [2–6] because it is nondestructive, continuous, rapid, and efficient. Due to the effects of the complex underground environment [7] and the ultra-wide bandwidth of the receiver [8], the echoes of GPR contain significant noise. The noise collected by the system can easily mask the effective signals. Therefore, noise suppression is very important for improving the signal quality and interpretation accuracy.

Different approaches for GPR noise suppression have been reported in the literature [9–21]. The wavelet transform is a popular method for GPR data denoising [9,10], and it is simple and effective. However, the selection of the mother wavelet function, the decomposition level, and the threshold function still relies on subjective experience. Frequency-wavenumber (F-K) filtering, which originated in seismic data denoising, has also been applied to remove noise in GPR data [11,12] and can remove cross rebar reflections and ringing noise effectively. However, the filter design in the F-K domain is relatively complex and the method is only suitable for point targets. The ensemble empirical mode decomposition (EEMD) method is an improved empirical mode decomposition (EMD) method that carries out the EMD over an ensemble of the signal plus Gaussian white noise. The EEMD method can extract the effective signal components from noisy GPR data [13,14]. However, the EEMD method is time-consuming and incapable of processing raw data with a low signal-to-noise ratio (SNR). The robust principal component analysis (RPCA) method can recover a low-rank matrix from noisy measurements and it has been employed to suppress the clutter and noise of GPR data [15–17]. However, the RPCA method is sensitive to the choice of thresholds. Singular value decomposition (SVD) is a convenient method to decompose a matrix, which can decompose GPR data into different subspaces that correspond to different components [18–21]. The noise can be suppressed by selecting components that contain effective signals to reconstruct GPR signals. Since each component corresponds to one singular value, the key problem of denoising is the selection of appropriate singular values corresponding to effective signals. A criterion based on the SNR of recovered data has been applied for GPR signal denoising [22], which shows better performance than the wavelet threshold denoising method. The local energy ratio rule has been used to remove background noise of GPR signals [23], which exhibits good robustness under different detection conditions. The fuzzy c-means (FCM) clustering rule has been used to extract multiple targets in heavily cluttered GPR images [24], which can accurately separate the overlapping boundaries of clutter, noise, and target signals and improve the performance of conventional SVD.

Although the denoising methods based on SVD are effective and easy to implement, they are designed to decompose a matrix (two-dimensional data) and cannot fully separate effective signals from the noise in one-dimensional data. To resolve this problem, the one-dimensional data can be transformed into many kinds of matrices, such as the Toeplitz matrix, cycle matrix, and Hankel matrix. The difference lies in the method of creating the matrix, which will affect signal processing of SVD. Among the matrices, SVD of the Hankel matrix can achieve a similar signal processing effect to the wavelet transform [25]. Therefore, SVD of the Hankel matrix is more suitable for noise suppression. A scheme based on SVD of the Hankel matrix has been used to reduce noise for radar cross-section (RCS) data [26], which can improve the accuracy of target recognition greatly. The Hankel matrix-based SVD can eliminate the false peak in processing an impulse signal with strong trend and enhance the SNR in the reconstructed signal [27], which helps to improve the fault diagnosis performance for rolling bearings. The SVD and Hankel matrix-based denoising process has also been applied to the ball bearing vibration signals in both time and frequency domain for the elimination of the background noise [28]. It was found that denoising in the frequency domain yields better fault identification results than the denoising in the time domain. The SVD method based on the Hankel matrix in the local frequency domain has been applied to eliminate random noise in GPR data [29], which can improve suppression of random noise around non-horizontal phase reflection events.

Although the aforementioned papers have proven the effectiveness of SVD of the Hankel matrix in noise suppression, little research has been conducted with respect to the influence of the Hankel matrix size on denoising performance. The size of the Hankel matrix depends on the length of the sliding window, which affects the information quantity that can be extracted from this matrix [30]. Based on this previous research, this paper proposes SVD of a window-length-optimized Hankel matrix to suppress noise for GPR data. First, the Hankel matrix formed by one-dimensional GPR data is decomposed with SVD, and the fourth root of the fourth central moment (FRFCM) of singular values is used to select the optimal window length of the Hankel matrix. Then, one threshold is generated by the difference spectrum of singular values, which is used to select effective signal components. Finally, the Hankel matrix is reconstructed with singular values corresponding to effective signals to suppress noise, and the denoised data are recovered from the reconstructed Hankel matrix. The performance of the proposed method is verified with a series of synthetic and field measurements. The experimental results of the proposed method are also compared with those of the conventional SVD method based on the local energy ratio rule and the wavelet transform method. The results show that the proposed method can effectively improve the denoising performance for GPR data.

#### **2. Methodology**

#### *2.1. Denoising Method Based on SVD of the Hankel Matrix*

The two-dimensional GPR data can be denoted by *B* ∈ *R*<sup>*N*×*L*</sup>, where *L* is the number of traces and *N* is the number of sampling points in each trace. For the data of one trace (one-dimensional data) *X* = [x(1), x(2), ... , x(*N*)], a Hankel matrix can be formed by sliding a window over the corresponding vector [25], which can be written as

$$A = \begin{bmatrix} \mathbf{x}(1) & \mathbf{x}(2) & \cdots & \mathbf{x}(n) \\ \mathbf{x}(2) & \mathbf{x}(3) & \cdots & \mathbf{x}(n+1) \\ \vdots & \vdots & \vdots & \vdots \\ \mathbf{x}(m) & \mathbf{x}(m+1) & \cdots & \mathbf{x}(N) \end{bmatrix} \tag{1}$$

where *m* = *N* − *n* + 1, 1 < *n* ≤ *m* < *N*, *A* ∈ *R*<sup>*m*×*n*</sup>, and *n* is the window length.
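As an illustration of Equation (1), the following minimal sketch builds the Hankel matrix of one trace with NumPy. The function name `hankel_matrix` and the small integer test vector are hypothetical and are only used to make the index pattern visible; they are not taken from the paper.

```python
import numpy as np

def hankel_matrix(x, n):
    """Build the m-by-n Hankel matrix of Equation (1), with m = N - n + 1."""
    x = np.asarray(x, dtype=float)
    N = x.size
    m = N - n + 1
    if not (1 < n <= m < N):
        raise ValueError("window length must satisfy 1 < n <= m < N")
    # Row i (0-based) holds the window x[i], x[i+1], ..., x[i+n-1].
    idx = np.arange(m)[:, None] + np.arange(n)[None, :]
    return x[idx]

# Illustrative input only: N = 7 samples and window length n = 3 give m = 5 rows.
A = hankel_matrix(np.arange(1, 8), n=3)
```

Each row is the window shifted by one sample, so every anti-diagonal of `A` holds a single sample value.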

The SVD of Hankel matrix A can be expressed as

$$A = USV^T \tag{2}$$

where *U* ∈ *R*<sup>*m*×*m*</sup> and *V* ∈ *R*<sup>*n*×*n*</sup> are the left and right singular orthogonal matrices, respectively. *S* = *diag*(σ<sub>1</sub>, σ<sub>2</sub>, ... , σ<sub>*r*</sub>) is a singular value matrix with σ<sub>1</sub> ≥ σ<sub>2</sub> ≥ ... ≥ σ<sub>*r*</sub> ≥ 0, and *r* = min(*m*, *n*). According to the definition of Equation (1), the number of singular values *r* is equal to the window length *n*.

Then, Equation (2) can be written as

$$A = \sum_{i=1}^{r} \sigma_i u_i v_i^T = \sum_{i=1}^{n} \sigma_i u_i v_i^T \tag{3}$$

where *u*<sub>*i*</sub> ∈ *R*<sup>*m*×1</sup> and *v*<sub>*i*</sub> ∈ *R*<sup>*n*×1</sup>. *u*<sub>*i*</sub>*v*<sub>*i*</sub><sup>*T*</sup> ∈ *R*<sup>*m*×*n*</sup> is a rank-one matrix, which is the *i*th eigenimage of *A*. It is obvious that σ<sub>*i*</sub> is actually the projection of matrix *A* on the basis *u*<sub>*i*</sub>*v*<sub>*i*</sub><sup>*T*</sup>.

As singular values are arranged in descending order, the first few larger singular values generally correspond to effective signals with strong correlations, while the smaller singular values correspond to the noise with weak correlation. Therefore, matrix *A* can be written as

$$A = \sum_{i=1}^{k} \sigma_i u_i v_i^T + \sum_{i=k+1}^{n} \sigma_i u_i v_i^T \tag{4}$$

where *k* is the demarcation point of singular values, and the first *k* singular values correspond to effective signals. Then the Hankel matrix with noise suppression can be reconstructed as

$$A_s = \sum_{i=1}^{k} \sigma_i u_i v_i^T \tag{5}$$

According to the construction rule of the Hankel matrix, the denoised one-dimensional data can be given by

$$X_s = \left[ A_s(1, :),\ A_s(2:m, n) \right] \tag{6}$$

where *A*<sub>*s*</sub>(1, :) is the first row of matrix *A*<sub>*s*</sub> and *A*<sub>*s*</sub>(2 : *m*, *n*) is the last column without the first element.
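The chain from Equation (1) to Equation (6) can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation; the function name `svd_denoise_trace` is hypothetical, and the demarcation point `k` is passed in by hand because its selection is treated separately in Section 2.3.

```python
import numpy as np

def svd_denoise_trace(x, n, k):
    """Rank-k reconstruction of one trace via SVD of its Hankel matrix."""
    x = np.asarray(x, dtype=float)
    m = x.size - n + 1
    # Equation (1): m-by-n Hankel matrix of the trace.
    A = x[np.arange(m)[:, None] + np.arange(n)[None, :]]
    # Equation (2): economy-size SVD, singular values in descending order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Equation (5): keep only the first k singular components.
    A_s = (U[:, :k] * s[:k]) @ Vt[:k, :]
    # Equation (6): first row, then the last column without its first element.
    return np.concatenate([A_s[0, :], A_s[1:, -1]])
```

The recovered vector has *n* + (*m* − 1) = *N* samples, the same length as the input. As a sanity check, the Hankel matrix of a noise-free sinusoid has rank two, so `k = 2` reproduces such a trace up to rounding error.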

#### *2.2. Optimization Method of Window Length*

The window length *n* is the only parameter of the Hankel matrix which not only affects the information quantity extracted from the matrix but also the performance of SVD. As an example, synthetic one-dimensional GPR data are used to analyze the effect of the window length *n* on the performance of SVD. The synthetic data are generated by the "gprMax" simulator [31].

Figure 1 shows the geometry of the simulation model for the scenario. The background medium is concrete. The relative permittivity and conductivity are 6 and 0.01, respectively. The target is a perfect metal cylinder, with 0.4-m diameter, which is buried at a depth of 0.6 m. The Ricker wavelet with a center frequency of 900 MHz is adopted. There are 80 traces in total and the trace interval is 0.035 m. The time window for each trace is 12 ns and each trace contains 2036 sampling points.

**Figure 1.** Geometry of the simulation model for point target detection.

The Gaussian white noise is added to the original GPR image and the SNR is −5.00 dB. Figure 2 shows the original GPR image and the noisy GPR image. Figure 3 shows the original data and noisy data of trace 38. The direct wave and target echoes are near the 250th and the 1300th sampling points, respectively. The noisy data are used to form the Hankel matrices with different window lengths, and SVD is applied to the Hankel matrices.

**Figure 2.** Synthetic ground-penetrating radar (GPR) image: (**a**) original image; (**b**) noisy image.

**Figure 3.** Data of trace 38: (**a**) original data; (**b**) noisy data.
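The noise addition described above can be reproduced with a short sketch. The paper does not state its SNR formula, so the code assumes the common definition SNR = 10 log<sub>10</sub>(signal energy / noise energy); the function names `snr_db` and `add_white_noise` are hypothetical.

```python
import numpy as np

def snr_db(clean, noisy):
    """SNR in dB, assuming SNR = 10*log10(signal energy / noise energy)."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noisy, dtype=float) - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

def add_white_noise(clean, target_snr_db, rng=None):
    """Add Gaussian white noise scaled to hit the target SNR on this array."""
    rng = np.random.default_rng() if rng is None else rng
    clean = np.asarray(clean, dtype=float)
    noise = rng.standard_normal(clean.shape)
    # Scale the noise so that the realized SNR matches the target.
    ratio = 10.0 ** (target_snr_db / 10.0)
    scale = np.sqrt(np.sum(clean ** 2) / (np.sum(noise ** 2) * ratio))
    return clean + scale * noise
```

If the scaling is applied to the whole two-dimensional image, the SNR of an individual trace will generally differ from the image-level value, because the signal energy varies from trace to trace.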

Figure 4 shows the probability distribution of singular values for Hankel matrices with different window lengths. The few larger singular values corresponding to effective signals are distributed in a relatively wide range, and the distribution is sparse. However, the smaller singular values corresponding to the noise are distributed in a narrow range, and the distribution is approximately normal. Moreover, the window length has an obvious effect on the probability distribution of singular values.

**Figure 4.** Probability distribution of singular values for Hankel matrices with different window lengths.

For noise suppression, when the distance between the distribution of singular values corresponding to effective signals and the distribution of singular values corresponding to the noise increases, it is easier to distinguish between effective signal components and noise components, which helps to improve noise removal performance. Based on the analysis of distribution characteristics of singular values in Figure 4, the fourth root of the fourth central moment (FRFCM) of singular values is proposed to measure the distance between the two distributions, which is defined by

$$P(n) = \left(\frac{1}{n} \sum_{i=1}^{n} (\sigma_i - \overline{\sigma})^4\right)^{\frac{1}{4}} \tag{7}$$

where *n* is the number of singular values, σ<sub>*i*</sub> is the *i*th singular value, and σ̄ is the mean of singular values.

In order to obtain optimal noise suppression performance, *P*(*n*) should be maximized. Therefore, the optimal window length can be given by

$$n_{\text{opt}} = \underset{n}{\operatorname{argmax}}\,[P(n)] \tag{8}$$
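Equations (7) and (8) translate directly into a search over candidate window lengths, sketched below. The paper does not state which window lengths are scanned or with what step, so the `candidates` list is an assumption supplied by the caller; the function names are hypothetical.

```python
import numpy as np

def frfcm(sigma):
    """Equation (7): fourth root of the fourth central moment."""
    sigma = np.asarray(sigma, dtype=float)
    return np.mean((sigma - sigma.mean()) ** 4) ** 0.25

def optimal_window_length(x, candidates):
    """Equation (8): the candidate window length that maximizes the FRFCM."""
    x = np.asarray(x, dtype=float)
    scores = []
    for n in candidates:
        m = x.size - n + 1
        A = x[np.arange(m)[:, None] + np.arange(n)[None, :]]  # Equation (1)
        sigma = np.linalg.svd(A, compute_uv=False)            # singular values only
        scores.append(frfcm(sigma))
    scores = np.array(scores)
    return candidates[int(np.argmax(scores))], scores
```

Only the singular values are needed at this stage, so `compute_uv=False` avoids forming the singular vectors for every candidate. The cost is still one SVD per candidate window length.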

#### *2.3. Selection Method of Singular Values*

The number of singular values selected results in a trade-off between noise suppression and recovery of the signal of interest. The selection methods based on SNR of recovered data [22] and local energy ratio [23] merely consider the energy of singular values, and their performance degrades when the SNR is relatively low. The selection method based on FCM clustering [24] uses a membership function to find suitable singular values corresponding to effective signals, which is relatively complex.

In order to obtain an efficient and accurate selection of singular values, the synthetic data in Section 2.2 are used to analyze the variation of singular values. Figure 5 shows the variation of singular values for Hankel matrices (the window length is 300) under different SNRs. For simplicity, only the first 80 singular values are shown in Figure 5.

**Figure 5.** Variation of singular values for Hankel matrices under different signal-to-noise ratios (SNRs): (**a**) SNR = −5 dB; (**b**) SNR = 0 dB; (**c**) SNR = 5 dB; (**d**) SNR = 10 dB.

As shown in Figure 5, the first few singular values correspond to effective signals, and they are larger and decrease quickly with the increase of order; the remaining singular values correspond to the noise, and they are smaller and decrease slowly with the increase of order. For noise, when the SNR increases, the amplitude of singular values decreases obviously and the number of singular values also decreases slightly. For effective signals, when the SNR increases, the amplitude of singular values changes little and the number of singular values increases slightly.

Based on the analysis of variation characteristics of singular values, the difference spectrum of singular values is used to find the demarcation point between singular values corresponding to effective signals and singular values corresponding to the noise. The difference spectrum of singular values [32] can be defined as

$$b_i = \sigma_i - \sigma_{i+1}, \quad i = 1, 2, \dots, r - 1 \tag{9}$$

where σ<sub>*i*</sub> is the *i*th singular value and *r* is the number of singular values.

The mean of the difference spectrum of singular values is calculated, and a threshold is given by

$$T = \frac{\rho}{r - 1} \sum_{i=1}^{r-1} b_i \tag{10}$$

where ρ is a weight coefficient that adjusts the threshold.

Then, the threshold *T* is used to select singular values corresponding to effective signals. To improve the accuracy of the selection, three adjacent difference spectra are compared with the threshold *T* to obtain the demarcation point

$$k_1 = i \,\Big|\, b_i < T \text{ and } b_{i+1} < T \text{ and } b_{i+2} < T, \quad i = 1, 2, \dots, r-3 \tag{11}$$

where the first *k*<sub>1</sub> singular values correspond to effective signals.
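The selection rule of Equations (9) to (11) can be sketched as follows. Three points are assumptions rather than statements from the paper: the smallest index satisfying Equation (11) is taken as the demarcation point, the default ρ = 1 is arbitrary because no value is given in the text, and all singular values are kept when no index qualifies. The function name is hypothetical.

```python
import numpy as np

def demarcation_point(sigma, rho=1.0):
    """Return k1 (1-based) from the difference spectrum of singular values."""
    sigma = np.asarray(sigma, dtype=float)
    b = sigma[:-1] - sigma[1:]      # Equation (9): b_i for i = 1, ..., r-1
    T = rho * b.mean()              # Equation (10): weighted mean of b_i
    below = b < T
    # Equation (11): first i with b_i, b_{i+1}, b_{i+2} all below T (assumption:
    # the smallest such i is used).
    for i in range(len(b) - 2):
        if below[i] and below[i + 1] and below[i + 2]:
            return i + 1            # convert 0-based position to 1-based index
    return len(sigma)               # assumption: keep everything if no run is found

# Illustrative spectrum: a steep drop followed by a flat noise floor.
k1 = demarcation_point([10.0, 6.0, 3.0, 1.0, 0.9, 0.8, 0.7, 0.6])
```

For this illustrative spectrum the differences are roughly 4, 3, 2 and then 0.1, so the first run of three sub-threshold differences starts at *i* = 4 and four singular values are retained.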

For two-dimensional GPR data *B* ∈ *R*<sup>*N*×*L*</sup>, the noise suppression method based on SVD of a window-length-optimized Hankel matrix can be summarized by the following steps:

1. Select the data of one trace (one-dimensional data) from two-dimensional GPR data and use the one-dimensional data to form a Hankel matrix with a certain window length by Equation (1).

2. Decompose the Hankel matrix by Equation (3) and compute the FRFCM of singular values by Equation (7).

3. Repeat steps 1 and 2 for different window lengths and obtain the optimal window length by Equation (8).

4. For the Hankel matrix with optimal window length, calculate the difference spectrum of singular values and obtain a threshold by Equations (9) and (10).

5. Select the demarcation point between singular values corresponding to effective signals and singular values corresponding to the noise by Equation (11).

6. Reconstruct the denoised Hankel matrix with singular values corresponding to effective signals by Equation (5) and obtain the denoised one-dimensional data by Equation (6).

7. Repeat steps 1–6 for all the traces and implement noise removal for two-dimensional GPR data.
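The seven steps above can be combined into one compact sketch that processes an image trace by trace. It is a minimal illustration under stated assumptions, not the authors' code: the candidate window lengths and the weight `rho` are supplied by the caller, the smallest index satisfying Equation (11) is used, and all helper names are hypothetical.

```python
import numpy as np

def _hankel(x, n):
    # Equation (1): m-by-n Hankel matrix with m = N - n + 1.
    m = x.size - n + 1
    return x[np.arange(m)[:, None] + np.arange(n)[None, :]]

def _frfcm(s):
    # Equation (7).
    return np.mean((s - s.mean()) ** 4) ** 0.25

def _k1(s, rho):
    # Equations (9)-(11); falls back to all singular values (assumption).
    b = s[:-1] - s[1:]
    low = b < rho * b.mean()
    hits = np.flatnonzero(low[:-2] & low[1:-1] & low[2:])
    return int(hits[0]) + 1 if hits.size else s.size

def denoise_trace(x, candidates, rho=1.0):
    x = np.asarray(x, dtype=float)
    # Steps 1-3: scan the candidate window lengths and keep the FRFCM maximizer.
    n_opt = max(candidates,
                key=lambda n: _frfcm(np.linalg.svd(_hankel(x, n), compute_uv=False)))
    # Steps 4-5: threshold the difference spectrum to find the demarcation point.
    U, s, Vt = np.linalg.svd(_hankel(x, n_opt), full_matrices=False)
    k = _k1(s, rho)
    # Step 6: rank-k reconstruction (Equation (5)) and recovery (Equation (6)).
    A_s = (U[:, :k] * s[:k]) @ Vt[:k]
    return np.concatenate([A_s[0], A_s[1:, -1]]), n_opt, k

def denoise_bscan(B, candidates, rho=1.0):
    # Step 7: B has shape (N, L); each column is one trace.
    B = np.asarray(B, dtype=float)
    out = np.empty_like(B)
    for j in range(B.shape[1]):
        out[:, j], _, _ = denoise_trace(B[:, j], candidates, rho)
    return out
```

Because the window length is re-optimized for every trace, the number of SVDs grows with both the number of traces and the number of candidate window lengths.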

#### **3. Results and Discussion**

A series of synthetic and real data is used to evaluate the proposed method. In addition, the performance of the proposed method is also compared with those of the conventional SVD method based on the local energy ratio rule and the wavelet transform method. The synthetic data are also generated by the "gprMax" simulator [31] based on the finite difference time domain (FDTD) method [33]. All the programs are executed on a computer with a 3.60 GHz CPU and 32 GB of memory.

#### *3.1. Synthetic Example 1*

The example shows the scenario of point target detection. Figure 6 shows the geometry of the simulation model. The targets are three perfectly conducting metal cylinders with 0.4 m diameter, buried at the same depth of 0.6 m. The interval between the targets is 0.6 m. The transmitting antenna is placed in the air layer and excited by a Ricker wavelet with a center frequency of 900 MHz. There are 80 traces in total and the trace interval is 0.035 m. The time window for each trace is 12 ns and each trace contains 2036 sampling points. Figure 7 shows the original GPR image and the noisy GPR image (SNR = −5.00 dB).

**Figure 6.** Geometry of the simulation model for point target detection.

**Figure 7.** Synthetic GPR image: (**a**) original image; (**b**) noisy image.

First, the performance of the proposed method is analyzed using one-dimensional data. Figure 8 shows the data of trace 30.

**Figure 8.** Data of trace 30: (**a**) original data; (**b**) noisy data (SNR = −4.62 dB).

Figure 9 shows the FRFCM of singular values for Hankel matrices with different window lengths. It can be seen that when the window length increases, the FRFCM of singular values first increases and then decreases and reaches the maximum when the window length is 250. Therefore, the optimal window length for the Hankel matrix is 250.

According to the selection method of singular values, the demarcation point *k*<sub>1</sub> is 6. Then the Hankel matrix is reconstructed with the first 6 singular values, and the denoised data are recovered from the reconstructed Hankel matrix.

**Figure 9.** The fourth root of the fourth central moment (FRFCM) of singular values for Hankel matrices with different window lengths.

In order to verify the performance of the window length optimization method, the denoised results with several different window lengths are shown in Figure 10. When the window length is 100, the denoised data contain many burrs; when the window length is 250, the denoised data are relatively smooth; when the window length is 400 and 700, the denoised data also contain some noise. The results preliminarily verify the effectiveness of the window length optimization method.

**Figure 10.** Denoised results with different window lengths for the data of trace 30: (**a**) *n* = 100 (SNR = 5.43 dB); (**b**) *n* = 250 (SNR = 7.45 dB); (**c**) *n* = 400 (SNR = 6.66 dB); (**d**) *n* = 700 (SNR = 5.19 dB).

In order to quantitatively analyze the performance of the window length optimization method, the SNR of denoised data with different window lengths is shown in Figure 11. The SNR exhibits a fluctuation similar to the FRFCM of singular values, and reaches the maximum 7.45 dB at the optimal window length (*n* = 250), which shows that the window length optimization method can obtain the best noise removal performance for SVD of the Hankel matrix.

**Figure 11.** SNR of denoised data with different window lengths.

Then, the performance of the proposed method is verified using two-dimensional data. In addition, the experimental results of the proposed method are compared with those of the conventional SVD method based on the local energy ratio rule and the wavelet transform method. Figure 12 shows the denoised results of the three methods. As shown in Figure 12a, the conventional SVD method can remove noise, but it also removes some of the target signals. As shown in Figure 12b, the wavelet transform method retains complete target signals, but it also retains a small amount of noise. As shown in Figure 12c, the proposed method can retain complete target signals while removing more noise.

**Figure 12.** Denoised results of the three methods for a GPR image: (**a**) singular value decomposition (SVD) method based on the local energy ratio rule; (**b**) wavelet transform method; (**c**) proposed method.

Table 1 lists the SNR, processing time, and amount of RAM required for the three methods. As shown in Table 1, the proposed method yields a higher SNR than the other two methods, but it also needs more processing time and more RAM because the SVD of a Hankel matrix must be computed for every trace (one-dimensional data).


**Table 1.** Results of the three methods.

#### *3.2. Synthetic Example 2*

The example shows the scenario of layer detection. Figure 13 shows the geometry of the simulation model. The model contains two layers: clay and sand. The transmitting antenna is placed in the air layer and excited by a Ricker wavelet with a center frequency of 900 MHz. There are 41 traces in total and the trace interval is 0.02 m. The time window for each trace is 10 ns and each trace contains 1696 sampling points. Figure 14 shows the original GPR image and the noisy GPR image (SNR = −5.00 dB).

**Figure 13.** Geometry of the simulation model for layer detection.

**Figure 14.** Synthetic GPR image: (**a**) original image; (**b**) noisy image.

First, the one-dimensional data are used to verify the performance of the proposed method. Figure 15 shows the data of trace 20. Figure 16 shows the FRFCM of singular values for Hankel matrices with different window lengths. Evidently, FRFCM reaches the maximum when the window length is 300. Therefore, the optimal window length for the Hankel matrix is 300. The demarcation point *k*<sub>1</sub> is set to 6 by the selection method of singular values. Then the Hankel matrix is reconstructed with the first 6 singular values, and the denoised data are recovered from the reconstructed Hankel matrix.

**Figure 15.** Data of trace 20: (**a**) original data; (**b**) noisy data (SNR = −5.12 dB).

**Figure 16.** FRFCM of singular values for Hankel matrices with different window lengths.

The denoised results with several different window lengths are shown in Figure 17. As the figure shows, the optimal window length can obtain the best compromise between noise suppression and retaining effective signals.

In order to quantitatively analyze the performance of the window length optimization method, the SNR of denoised data with different window lengths is shown in Figure 18. The results further show the window length optimization method can achieve the best noise removal performance for SVD of the Hankel matrix.

Then, the two-dimensional data are used to verify the performance of the proposed method. The experimental results of the proposed method are also compared with those of the conventional SVD method based on the local energy ratio rule and wavelet transform method. Figure 19 shows the denoised results of the three methods. As shown in Figure 19a, the layer signals are relatively weak, and some horizontal noise is also introduced. Figure 19b shows that the layer signals are obvious, but a small amount of noise is also retained; and Figure 19c shows that the layer signals are relatively strong, and the noise is also removed more thoroughly.

**Figure 17.** Denoised results with different window lengths for the data of trace 20: (**a**) *n* = 150 (SNR = 5.91 dB); (**b**) *n* = 300 (SNR = 7.86 dB); (**c**) *n* = 450 (SNR = 5.82 dB); (**d**) *n* = 600 (SNR = 5.40 dB).

**Figure 18.** SNR of denoised data with different window lengths.

**Figure 19.** Denoised results of the three methods for a GPR image: (**a**) SVD method based on the local energy ratio rule; (**b**) wavelet transform method; (**c**) proposed method.

Table 2 lists the SNR, processing time, and amount of RAM required for the three methods. Table 2 also shows that the proposed method yields a higher SNR and consumes more memory than the other two methods.


**Table 2.** Results of the three methods.

#### *3.3. Synthetic Example 3*

In this section, the performance of the proposed method is investigated in the presence of correlated noise. This example uses the same original GPR image as synthetic example 1. The autocorrelation function of the noise is an exponential function and the correlation length of the noise is 10. Figure 20 shows the original GPR image and the noisy GPR image (SNR = −5.00 dB).

**Figure 20.** Synthetic GPR image: (**a**) original image; (**b**) noisy image.
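The paper does not describe how the correlated noise is generated. One common way to obtain Gaussian noise with an exponential autocorrelation function is a first-order autoregressive process, sketched below as an assumption. The correlation length is interpreted here in samples along the trace, which the text does not specify, and the function name is hypothetical.

```python
import numpy as np

def exp_correlated_noise(size, corr_len, rng=None):
    """Unit-variance Gaussian noise with autocorrelation exp(-|lag| / corr_len).

    Assumption: an AR(1) recursion is used, which is one of several ways to
    obtain an exponential autocorrelation function.
    """
    rng = np.random.default_rng() if rng is None else rng
    phi = np.exp(-1.0 / corr_len)            # lag-one correlation coefficient
    innovation_std = np.sqrt(1.0 - phi ** 2) # keeps the variance at one
    w = rng.standard_normal(size)
    e = np.empty(size)
    e[0] = w[0]                              # start in the stationary distribution
    for t in range(1, size):
        e[t] = phi * e[t - 1] + innovation_std * w[t]
    return e
```

The resulting sequence can then be rescaled to a target SNR in the same way as white noise. With a correlation length of 10 samples, the autocorrelation at lag 10 is exp(−1), or about 0.37.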

First, the performance of the proposed method is analyzed using one-dimensional data. Figure 21 shows the data of trace 30. Figure 22 shows the FRFCM of singular values for Hankel matrices with different window lengths. In this case, it is evident that the value of FRFCM is greater than that in the case of white noise and the optimal window length for the Hankel matrix is 300. The Hankel matrix is reconstructed with the first eight singular values, and the denoised data are recovered from the reconstructed Hankel matrix.

**Figure 21.** Data of trace 30: (**a**) original data; (**b**) noisy data (SNR = −4.55 dB).

**Figure 22.** FRFCM of singular values for Hankel matrices with different window lengths.

Figure 23 shows the denoised results with several different window lengths. When the window length is 100, the denoised data contain some oscillating components; when the window length is 500 and 650, the denoised data also contain a lot of interference with large amplitude; when the window length is 300, the denoised data contain the least noise. The results confirm that the window length optimization method is also effective in the case of correlated noise.

**Figure 23.** Denoised results with different window lengths for the data of trace 30: (**a**) *n* = 100 (SNR = 1.72 dB); (**b**) *n* = 300 (SNR = 4.31 dB); (**c**) *n* = 500 (SNR = 2.96 dB); (**d**) *n* = 650 (SNR = 2.39 dB).

Figure 24 shows the SNR of denoised data with different window lengths. The results show that SVD of the Hankel matrix obtains the best noise removal performance at the optimal window length (*n* = 300).

**Figure 24.** SNR of denoised data with different window lengths.

Then, the performance of the proposed method is verified using two-dimensional data. Moreover, the experimental results of the proposed method are compared with those of the conventional SVD method based on the local energy ratio rule and the wavelet transform method. Figure 25 shows the denoised results of the three methods. As shown in Figure 25a, the conventional SVD method removes some target signals while denoising. Figure 25b shows that the wavelet transform method also retains some noise while retaining target signals; and Figure 25c shows that the proposed method retains more target signals while removing more noise.

**Figure 25.** Denoised results of the three methods for a GPR image: (**a**) SVD method based on the local energy ratio rule; (**b**) wavelet transform method; (**c**) proposed method.

Table 3 lists the SNR, processing time, and amount of RAM required for the three methods. Compared with the results of synthetic example 1, the SNRs of all three methods decrease due to the correlation of the noise. The proposed method yields an obviously higher SNR with an appropriate increase in processing time compared with the other two methods. In addition, the wavelet transform method requires more RAM due to the correlation of the noise.



#### *3.4. Synthetic Example 4*

In this section, the performance of the proposed method is also investigated in the presence of correlated noise. This example also uses the same original GPR image as synthetic example 1. The autocorrelation function of the noise is an exponential function and the correlation length of the noise is 20. Figure 26 shows the original GPR image and the noisy GPR image (SNR = −5.00 dB).

**Figure 26.** Synthetic GPR image: (**a**) original image; (**b**) noisy image.

First, the performance of the proposed method is analyzed using one-dimensional data. Figure 27 shows the data of trace 30. Figure 28 shows the FRFCM of singular values for Hankel matrices with different window lengths. Evidently, as the correlation length of the noise increases, the value of FRFCM also increases, and the optimal window length for the Hankel matrix is 300. The Hankel matrix is reconstructed with the first nine singular values and the denoised data are recovered from the reconstructed Hankel matrix.

**Figure 27.** Data of trace 30: (**a**) original data; (**b**) noisy data (SNR = −4.49 dB).

**Figure 28.** FRFCM of singular values for Hankel matrices with different window lengths.

Figure 29 shows the denoised results with several different window lengths. As the figure shows, denoised data of the optimal window length contain less noise than those of other window lengths.

**Figure 29.** Denoised results with different window lengths for the data of trace 30: (**a**) *n* = 100 (SNR = 1.61 dB); (**b**) *n* = 300 (SNR = 3.02 dB); (**c**) *n* = 500 (SNR = 2.16 dB); (**d**) *n* = 650 (SNR = 1.66 dB).

Figure 30 shows the SNR of denoised data with different window lengths. The SNR exhibits more fluctuation, and it reaches maximum at the optimal window length (*n* = 300).

**Figure 30.** SNR of denoised data with different window lengths.

Then, the performance of the proposed method is verified using two-dimensional data. The experimental results of the proposed method are also compared with those of the conventional SVD method based on the local energy ratio rule and the wavelet transform method. Figure 31 shows the denoised results of the three methods. As shown in Figure 31, the conventional SVD method loses a lot of target signals; the wavelet transform method also retains a lot of noise while retaining target signals; the proposed method achieves a good compromise between retaining target signals and removing the noise.

**Figure 31.** Denoised results of the three methods for a GPR image: (**a**) SVD method based on the local energy ratio rule; (**b**) wavelet transform method; (**c**) proposed method.

Table 4 lists the SNR, processing time, and amount of RAM required for the three methods. Compared with the results of synthetic example 3, the increase of the correlation length of the noise obviously degrades the SNR of the three methods. The proposed method also achieves a higher SNR than the other two methods, at the cost of increased processing time and memory usage.



#### *3.5. Field Measurements 1*

The example shows the scenario of pipeline detection. The antenna center frequency is 400 MHz. There are 251 traces in total and each trace contains 301 sampling points. Figure 32 shows the original noisy GPR image. As the figure shows, there is a lot of noise around the target hyperbolic signals, which affects the target detection.

**Figure 32.** Real GPR image of pipeline detection.

The optimal window length for the Hankel matrix is 90. The denoised results of the three methods are shown in Figure 33. As shown in Figure 33a, the conventional SVD method removes some of the noise, but it generates some false target hyperbolic signals. Figure 33b shows that the wavelet transform method removes most of the noise and retains complete target signals, but it introduces a small amount of vertical noise; Figure 33c shows that the proposed method removes most of the noise and preserves complete target signals without introducing any other signals. The results show that the proposed method achieves better noise removal performance than the other two methods and helps to detect the pipeline accurately.

**Figure 33.** Denoised results of the three methods for a real GPR image: (**a**) SVD method based on the local energy ratio rule; (**b**) wavelet transform method; (**c**) proposed method.

The processing time of the conventional SVD method, the wavelet transform method, and the proposed method is 0.45 s, 1.32 s, and 1.77 s, respectively.

#### *3.6. Field Measurements 2*

The example shows the scenario of road layer detection. The antenna center frequency is 400 MHz. There are 46 traces in total and each trace contains 450 sampling points. Figure 34 shows the original noisy GPR image. As the figure shows, there is some horizontal noise around the layer signals, which interferes with layer recognition.

**Figure 34.** Real GPR image of road layer detection.

The optimal window length for the Hankel matrix is 110. The denoised results of the three methods are shown in Figure 35. As shown in Figure 35a, the conventional SVD method removes some of the noise, but it still retains some noise between the 80th and the 150th sampling points. Figure 35b shows that the wavelet transform method retains a small amount of noise between the 80th and the 150th sampling points, but it removes part of the layer signals near the 240th sampling point; Figure 35c shows that the proposed method removes most of the noise, and it retains the layer signals completely. The results show that the proposed method obtains the best noise removal performance and provides the best profile for layer detection.

The processing time of the conventional SVD method, the wavelet transform method, and the proposed method is 0.14 s, 0.47 s, and 0.67 s, respectively.

**Figure 35.** Denoised results of the three methods for a real GPR image: (**a**) SVD method based on the local energy ratio rule; (**b**) wavelet transform method; (**c**) proposed method.

#### **4. Conclusions**

In this paper, a method based on SVD of a window-length-optimized Hankel matrix is proposed to improve the noise suppression performance for GPR data. The fourth root of the fourth central moment of singular values is used to determine the window length of the Hankel matrix, which provides a solution to optimize the size of the Hankel matrix. Then, the difference spectrum of singular values is used to construct a threshold, which provides a solution to select singular values corresponding to effective signals.

The proposed method is verified using a series of synthetic and field data sets. The results show that the proposed method obtains the best noise removal performance for both white noise and correlated noise. It also achieves better denoising performance than the conventional SVD method based on the local energy ratio rule and the wavelet transform method, at the cost of increased processing time and memory use. Future work will investigate more efficient ways to optimize the SVD of the Hankel matrix to further improve noise removal performance.

**Author Contributions:** Investigation, Y.H.; methodology, W.X.; software, Y.L.; validation, Y.Y.; writing—original draft, W.X.; writing—review and editing, W.X.

**Funding:** This research was funded by the National Key Research and Development Program of China, grant number 2018YFC1503702; the Fundamental Research Funds for the Central Universities of China, grant number CUG2018JM15; and the State Scholarship Fund of China Scholarship Council, grant number 201806415018.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Communication* **New Concept of Combined Microwave Delay Lines for Noise Radar-Based Remote Sensors**

#### **Zenon Szczepaniak and Waldemar Susek \***

Faculty of Electronics, Military University of Technology, gen. Sylwestra Kaliskiego St. No. 2, 00-908 Warsaw, Poland; zenon.szczepaniak@wat.edu.pl

**\*** Correspondence: waldemar.susek@wat.edu.pl; Tel.: +48-261-839-831

Received: 31 August 2019; Accepted: 4 November 2019; Published: 6 November 2019

**Abstract:** Delay lines with a tunable length are used in a number of applications in the field of microwave techniques. The digitally-controlled analogue wideband delay line is particularly useful in noise radar applications as a precise detector of movement. In order to perform coherent reception in the noise radar, a delay line with a variable delay value is required. To address this issue, this paper comprises a new concept of a digitally-controlled delay line with a set of fine distance gates. In the paper, a solution for micro-movement detection is proposed, which is based on direct signal processing in the time domain with the use of a microwave analogue correlator. This concept assumes the use of a microwave analogue tapped delay line structure. It was found that the optimal solution for a noise radar with an analogue signal correlator is a combined delay line consisting of switched reference sections, a tapped delay line, and a precision phase shifter. The combined delay line presented in this paper is dedicated to serving as the adjustable reference delay for a noise radar intended for the detection of micro-movement. The paper contains the calculation results and delay line implementation for a given example. The new structure of the analogue tapped delay line with the calculation of optimal parameters is also presented. The precise detector of movement can be successfully used for the remote sensing of human vital signs (especially through-the-wall), e.g., breathing and heart beating, with the simultaneous determination of position.

**Keywords:** noise radar; radar signal processing techniques; analogue correlation; modern radar applications; delay line

#### **1. Introduction**

Delay lines with a tunable length are used in a number of applications in the field of microwave techniques [1–4]. The possible applications depend on the length of the line and generally they may be divided as follows: phase shifters, phase correctors, impedance tuning stubs, or time delay references.

The simplest form of a fixed microwave delay line is a section of a transmission line with a specified length, preferably without the effect of group velocity dispersion. In this case, the group delay *Tg* may be expressed by

$$T\_{g} = \frac{L}{v\_{g}},\tag{1}$$

where *L* is the length of the transmission line and *vg* is the group velocity, defined as

$$v\_{g} = \frac{\partial \omega}{\partial \beta}.\tag{2}$$

In Equation (2), variable β is the propagation constant and ω is the angular frequency. For the frequency range where the group velocity dispersion may be neglected, the group velocity may be approximated by the expression for the phase velocity *v*<sub>ϕ</sub> = ω/β, which gives

$$T\_{g} = \frac{L\beta}{\omega}.\tag{3}$$

The time delay of a section of a transmission line is proportional to its physical length. In order to achieve large time delays, adequately long sections of the transmission line have to be used. Because Relationship (3) also contains the phase constant β, and hence the electrical length β*L*, the time delay is also a function of the parameters (especially the permittivity) of the material filling the transmission line. It is therefore possible to obtain larger time delays per unit length of the transmission line when it is filled with a material of high relative permittivity ε<sub>*r*</sub>. For a TEM line, one may write

$$T\_{g} = \frac{\sqrt{\varepsilon\_r}L}{c}.\tag{4}$$

An example of this relationship is as follows: one meter of free space propagation or a perfect TEM line with air filling corresponds to 3.33 ns of time delay. In order to have a properly working delay line set, the propagation of the delayed wave should not be affected by the group velocity dispersion. This requirement is fulfilled when TEM lines are used, for example, coaxial lines or planar lines such as microstrip or coplanar lines. The bandwidth of the transmitted signal should be limited so as not to excite unwanted waveguide (non-TEM) modes of propagation.
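Equation (4) is easy to evaluate numerically. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and the permittivity value of 2.1 is only an illustrative assumption (roughly that of a PTFE-filled coaxial cable). It computes the delay of a TEM line and, conversely, the line length needed for a given delay.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def tem_delay_ns(length_m: float, eps_r: float = 1.0) -> float:
    """Group delay of a dispersion-free TEM line, Equation (4), in nanoseconds."""
    return math.sqrt(eps_r) * length_m / C0 * 1e9


def length_for_delay_m(delay_ns: float, eps_r: float = 1.0) -> float:
    """Physical line length needed for a given delay (Equation (4) solved for L)."""
    return delay_ns * 1e-9 * C0 / math.sqrt(eps_r)


if __name__ == "__main__":
    # One metre of air-filled line: about 3.3 ns, as quoted in the text.
    print(f"1 m, air-filled: {tem_delay_ns(1.0):.3f} ns")
    # Illustrative assumption: eps_r = 2.1, 80 ns reference delay.
    print(f"80 ns at eps_r = 2.1: {length_for_delay_m(80.0, 2.1):.2f} m")
```

The second call illustrates why long reference delays are costly in cable form: even with a dielectric filling, a delay of tens of nanoseconds needs many metres of line.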

The digitally-tunable delay line allows a number of pre-defined values of the time delay or length corresponding to the smallest assumed delay step to be set [5]. In the case of analogue tuning of a line's electrical length, theoretically, it is possible to obtain any value of delay from a predefined delay range.

An electronically-controlled microwave delay line is part of an analogue correlation detector used in a noise radar. The most basic design of a noise radar consists of an analogue receiver and analogue delay line with the ability to adjust the time delay. The current development of noise radars mainly concerns the use of advanced techniques of digital signal processing in order to obtain fully-digital correlation receivers [6–8]. However, a noise radar with an analogue correlation receiver with a tunable reference delay line may still be a useful enough solution, especially for micro-movement detection with range determination. In comparison, a typical CW (Continuous Wave) Doppler radar only detects the micro-movement speed, e.g., breathing or heartbeat, without information about the distance to the measured object [9]. Ultra WideBand (UWB) radars including noise or pseudorandom noise-based radars are used in various applications, including remote vital sign detection for rescue, security, and medical care or diagnostics [10,11].

To perform coherent reception in noise radar, a delay line with a constant or variable delay value is required. By analyzing the possibility of micro-movement detection, which is also described in the literature by the term micro-Doppler detection [12], one may notice that current investigations concern the development of algorithms to perform digital signal processing in the baseband in the frequency domain (spectrum analysis). The main issue is finding the spectrum shift of the received signal with respect to the transmitted signal. This shift may be equal to an order of MHz for a spectrum width equal to dozens of MHz. It is very difficult to detect such a small shift and new specialized methods should be developed.

The research question is to find the optimized analogue microwave tunable delay line in order to adjust the operating point of the correlation detector. In this paper, a solution for micro-movement detection is proposed, which is based on direct signal processing in the time domain with the use of a microwave analogue correlator. Therefore, to address this issue, further sections of the paper comprise a new concept of a digitally-controlled delay line with a set of fine distance gates. This concept assumes the use of a combined set of three lines, including a new version of a tapped delay line. In order to verify the concept, the example of micro-movement detection is presented, with the following assumptions: target distance 12 m, micro-movement amplitude 1 mm, noise radar with bandwidth 1 GHz, and center frequency 6.5 GHz.

The combined delay line presented in this paper is dedicated to serving as the adjustable reference delay for a noise radar intended for the detection of micro-movement. This approach allows the measurement setup to be simplified and increases the possibility of micro-movement detection. The delay line's physical structure depends on the dedicated application of a given radar system. The source of micro-movement may be different, for example, it may be a vital activity of the human body or its organs, such as chest movement due to heartbeat and breathing.

#### **2. Principle of Noise Radar Technology**

Noise radars belong to radars which use random or pseudorandom signals for probing purposes and coherent detection techniques for receiving signals. Their fundamental parameters are the following: wide bandwidth, low power density, and high accuracy for distance and velocity measurements, which results from the properties of the ambiguity function of the wideband noise signal. A correlation receiver is a typical element of a noise radar. Coherent reception needs delay lines of constant or variable parameters to be applied in the receiving systems. The delay line allows one to memorize a sample of the transmitted noise signal for an amount of time delay resulting from the round-trip of the transmitted signal, from the radar transmitter to the target and back to the radar receiver. In general, the principle of the operation of a noise radar and its various structures has been widely described in the scientific literature, including the basic use of the radar, which is target range and radial velocity estimation [13–20].

The radar transmitter generates a noise signal with a bandwidth of about 2 GHz, located within the frequency range from 1 to 18 GHz. The primary noise source in the transmitter may be realized by means of a semiconductor avalanche diode or a Zener diode. The signal from the transmitter is fed to a transmitting antenna and, through a directional coupler, to the delay line. The signal collected by a receiving antenna is amplified and filtered in the front-end receiver and further correlated with a copy of the transmitted signal delayed by a delay line. When the time delay value *TDL* is equal to the time *T* corresponding to the round-trip of the transmitted signal (from the radar to the target and back to the radar), a so-called "correlation peak" will appear at the output of the correlation detector. The existence of a distinguishable value of the signal *SOUT*(*t*) occurring for a given value of time delay *TDL* provided by a delay line allows target detection and distance estimation according to the formula *R* = c*TDL*/2.

For the following consideration, it is assumed that the transmitter generates a signal in the form of noise with a limited bandwidth and normal distribution, with an average value equal to zero and a variance equal to σ<sup>2</sup>. The signal generated in the transmitter can be described by Expression (5), whereas signals at the individual points of the system (Figure 1) are described by Relations (6) and (7):

$$S\_T(t) = x(t)\cos(\omega\_0 t) - y(t)\sin(\omega\_0 t) \tag{5}$$

$$S\_{DL}(t) = k\_1 S\_T(t - T\_{DL}) \tag{6}$$

$$S\_R(t) = k\_2 S\_T \Big( t - T - \frac{2D(t)}{c} \Big) \tag{7}$$

where *T* is the signal delay time on the radar-object-radar path, *TDL* is the delay time of the delay line, *D*(*t*) is the micro-movement of certain object parts, ω<sub>0</sub> is the median frequency of the band occupied by the noise signal, *x*(*t*) and *y*(*t*) are independent realizations of the stationary random process with a Gaussian distribution having an average value equal to zero, and *k*<sub>1</sub> and *k*<sub>2</sub> are the propagation coefficients.

**Figure 1.** Outline of a noise radar.

As a result of the multiplication of signals described by (6) and (7), and afterwards, integration of the obtained products, the output signal *SOUT*(*t*) can be expressed in the following Form (8):

$$S\_{OUT}(t) = A(\Delta T) \cos \left[ \omega\_0 \left( \Delta T + \frac{2D(t)}{c} \right) \right] \tag{8}$$

where Δ*T* = *T* − *TDL* and

$$A(\Delta T) = \frac{k\_1 k\_2}{T\_p} \int\_0^{T\_p} \left( x(t - T)x(t - T\_{DL}) + y(t - T)y(t - T\_{DL}) \right) \mathrm{d}t. \tag{9}$$

The result of Integration (9) is a value independent of time *t*. However, this value (i.e., integration result) depends on the time difference *T* − *TDL*. When the value of *TDL* is set by the delay line and the value of time *T* is constant (constant position of the target), the output signal from correlator *SOUT* is also constant, and it reaches its maximum for *T* = *TDL*. Another situation takes place when a target is making micro-movement, for example, that described by the harmonic expression *D*(*t*) = *D* × cos(ω<sub>0</sub> × *t*), where *D* is the amplitude of this movement. Then, the round-trip time *T* is also harmonically dependent on *t*. In effect, the output signal from the integrating circuit also depends on *t*. This means that the integrator output signal *SOUT* varies in time according to the varying distance from the radar to target and its frequency corresponds to the frequency of the target micro-movement.

There is the requirement that the micro-movement should be a slow-varying function of time compared to the integration time *Tp* of the integrating circuit. The value of micro-Doppler frequency must be lower than the cut-off frequency of the lowpass filter at the output of the correlator multiplying circuit.

Equation (8) describes the value of the correlation function for the noise signal transmitted and received by the noise radar [21–24].

#### **3. New Concept of a Combined Microwave Delay Line with a Set of Fine Distance Gates**

#### *3.1. Combined Structure*

In order to control the position of the detector operating point, an adjustable analogue delay system for micro-movement in a noise radar has to be used and it may be realized with the use of a combined delay line structure.

Two structures of microwave delay lines are considered: the digitally-controlled cascaded line with switched delay sections and the analogue tapped delay line. They may be combined with one another by cascading, and the resulting structure gains additional interesting features. The line with switched delay sections sets one time delay value from a predefined finite set, which may be called the coarse delay. The delayed input signal then enters the second line, which is the tapped delay line. The tapped line introduces several smaller time delays, one at each tap output, simultaneously. These delays may be called the fine delays, and they are added to the coarse time delay set by the first line. As a result, there is a comb of time delays corresponding to radar range gates (spaced by the unit time delay of the tapped line), which is shifted up and down in steps of the unit time delay of the digitally-controlled first line (coarse step). The fine time delay offsets should be chosen to evenly cover the unit time delay of the coarse line.

The combined line mentioned above does not ensure that the optimal operating point on the detector characteristic (with the meaning of point P1 on the autocorrelation function) corresponds to one of the fine gate delays. In order to find the optimal operating point for detection, the third delay line has to be cascaded. This additional delay line is the adjustable one, preferably in the form of an analogue precision phase shifter.

The innovative concept of such a combined line is shown in Figure 2, with an example of specified time delay implementation.

**Figure 2.** Example of a digitally-controlled delay line with a set of fine distance gates and an adjustable phase shifter for finding the optimal operating point of a correlation detector.

#### *3.2. Analogue Tapped Cascaded Delay Line*

Digitally-controlled microwave delay lines are considered here to realize a reference delay for a noise radar with analogue correlation. This type of radar performs analogue correlation of the received signal with a delayed version of the transmitted signal. However, compared to a radar with digital signal processing including signal cross-correlation, the analogue correlator only performs convolution for one selected value of time delay called the time gate.

In order to bring the functionality of a radar with an analogue correlator closer to the digital one, a special kind of delay line may be used, known as the analogue tapped delay line.

In general, the analogue tapped delay line, known from the literature, consists of a number of unit delay sections and microwave couplers, having the same coupling factor, cascaded as shown in Figure 3 [25]. One unit delay line with one coupler forms one delay line stage. Every tap, which is a coupler's coupled signal port, is the output of the delayed input signal, with a time delay equal to the unit time delay τ<sub>1</sub> multiplied by the stage number.

**Figure 3.** General scheme of an analogue tapped cascaded delay line.

According to this idea, every signal tap corresponds to one time gate and one signal correlator. The general scheme of an analogue tapped delay line is shown in Figure 3.

The application of this circuit requires *N* analogue correlator units, one per tap, each consisting of a signal mixer and a lowpass filter.

It is important to note that the analogue tapped delay line allows all signals corresponding to all time gates to be obtained quasi-simultaneously. Here, the term "quasi" means that the time gate output signals appear after subsequent unit time delay, but there is no need to use a signal switch (SPDT or SPST) to set the one desired value of time delay per one specified state of line. The analogue tapped cascaded delay line can be characterized by a number of features.

Advantages:


Disadvantages:


The design of a new concept of the analogue tapped delay line, which is proposed below, requires the design and optimization of dedicated microwave couplers with precisely chosen values of the coupling factor in order to obtain the same signal power level at each tap.

#### **4. Numerical Validation**

#### *4.1. Optimization of the Correlation Detector Operating Point*

A normalized Function (8) for *T* = 80 ns, 6–7 GHz band, and *D*(*t*) = 0 is shown in Figure 4.

**Figure 4.** Plot of a normalized Function (8) for *T* = 80 ns, 6–7 GHz band, and *D*(*t*) = 0.

As can be seen in Figure 4, the noise radar may be applied for micro-movement detection. The operating points P1 and P2 of the correlation detector can be selected from the operational range shown in Figure 4. An example of the output signal of the correlation detector *SOUT*(*t*) for P1 and P2 is shown in Figure 5. The assumptions for this calculation are as follows: distance to object *R* = 12 m, harmonic micro-movement *D*(*t*) with amplitude 1 mm and frequency equal to 1 Hz, and a transmitted noise signal with bandwidth *B* = 1 GHz and center frequency *f*<sub>0</sub> = 6.5 GHz. In this case, the point P1 corresponds to delay *TDL*<sub>1</sub> = 80.0385 ns and P2 corresponds to delay *TDL*<sub>2</sub> = 80.0770 ns. The use of a tunable microwave delay line is crucial in this case. The best operating point P1 for proper micro-movement detection does not lie at *T* = 80 ns, the delay resulting from the distance between the radar and the target. Therefore, there is a need to correct the position of the operating point by introducing an additional time delay. The precise adjustment of time delay provided by the tunable delay line makes it possible to shift the operating point of the micro-movement detector.

**Figure 5.** Illustration for the conversion of harmonic movement of an object to the output signal of the microwave correlation detector.

The operating point P1 is placed on the linear part of the detector characteristic and the detector output signal (black line in Figure 5) at this point properly corresponds to the shape and frequency of the micro-movement.

In contrast, the operating point P2 is placed on the nonlinear part of the detector characteristic and the detector output signal at this point (red line in Figure 5) incorrectly replicates the shape and frequency of the micro-movement. In particular, the micro-movement frequency is incorrect and equal to the doubled value of the proper frequency due to the even shape of the characteristic in the vicinity of the operating point.
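The behaviour at P1 and P2 can be checked with a short numerical experiment. The sketch below is not the authors' code: it evaluates a normalized form of Equation (8) under the added assumption of a flat noise spectrum over the band, for which the envelope *A*(Δ*T*) is proportional to a sinc function. The speed of light is rounded to 3 × 10<sup>8</sup> m/s so that *R* = 12 m gives *T* = 80 ns as in the text, and all function and variable names are hypothetical.

```python
import numpy as np

C = 3.0e8       # m/s, rounded so that R = 12 m gives T = 80 ns as in the text
F0 = 6.5e9      # centre frequency, Hz
B = 1.0e9       # noise bandwidth, Hz
R = 12.0        # target distance, m
D_AMP = 1.0e-3  # micro-movement amplitude, m
F_M = 1.0       # micro-movement frequency, Hz


def s_out(t, t_dl):
    """Normalized Equation (8); the sinc envelope assumes a flat noise spectrum."""
    delta_t = 2.0 * R / C - t_dl                 # Delta T = T - T_DL
    d = D_AMP * np.cos(2.0 * np.pi * F_M * t)    # harmonic micro-movement D(t)
    envelope = np.sinc(B * delta_t)              # np.sinc(x) = sin(pi x) / (pi x)
    return envelope * np.cos(2.0 * np.pi * F0 * (delta_t + 2.0 * d / C))


def dominant_frequency(signal, fs):
    """Frequency (Hz) of the strongest non-DC spectral line."""
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    return np.fft.rfftfreq(signal.size, d=1.0 / fs)[np.argmax(spectrum)]


FS = 100.0                   # sampling rate of the detector output, Hz
t = np.arange(400) / FS      # 4 s, i.e. four movement periods
f_p1 = dominant_frequency(s_out(t, 80.0385e-9), FS)  # operating point P1
f_p2 = dominant_frequency(s_out(t, 80.0770e-9), FS)  # operating point P2
print(f"P1: {f_p1:.2f} Hz, P2: {f_p2:.2f} Hz")
```

With these parameters, the strongest spectral line should fall at the movement frequency for P1 and at twice that frequency for P2, in line with the description above. The delay offset of P1 from 80 ns is close to a quarter of the carrier period, which places the operating point near a zero crossing of the cosine term.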

#### *4.2. Implementation of Time Delay Values*

The example of the implementation of specified time delay values in a combined delay line structure, according to the proposed concept (Figure 2), assumes the case of micro-movement detection introduced above in Section 4.1.

The first line is the fixed delay line, which sets the time delay τfix corresponding to the distance 12 m lowered by the value of 1 ns, i.e., half of the autocorrelation function width (as shown in Figure 4). The digitally-controlled line with switched sections consists of three sections with the following time delay values: 0.25, 0.5, and 1 ns. It corresponds to a 3-bit control with unit time delay equal to 0.25 ns, and a maximal value of coarse time delay equal to 1.75 ns. Next, there is the tapped delay line with three taps and fine unit time delay equal to 166.7 ps. In effect, for every state of the digitally-controlled line, there are three values of fine gates shifted by 166.7 ps, which cover the range of 0.5 ns evenly.
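The resulting grid of reference delays can be enumerated directly. The following sketch is only an illustration of the example values given above: it assumes that tap *n* of the tapped line adds *n* fine unit delays (following the general scheme of Section 3.2) and that the phase shifter only adds delay. The helper names are hypothetical.

```python
from itertools import product

TAU_FIX_NS = 79.0               # fixed line: 80 ns (12 m round trip) minus 1 ns
SECTIONS_NS = (0.25, 0.5, 1.0)  # switched sections (3-bit control)
FINE_STEP_NS = 0.5 / 3.0        # tapped-line unit delay, about 166.7 ps
N_TAPS = 3


def gate_delays_ns():
    """Map (control bits, tap number) -> total reference delay in ns."""
    gates = {}
    for bits in product((0, 1), repeat=len(SECTIONS_NS)):
        coarse = sum(s for s, on in zip(SECTIONS_NS, bits) if on)
        for tap in range(1, N_TAPS + 1):
            # Assumption: tap n adds n unit delays of the tapped line.
            gates[(bits, tap)] = TAU_FIX_NS + coarse + tap * FINE_STEP_NS
    return gates


def nearest_gate_below(target_ns):
    """Setting with the largest delay not exceeding the target, and the residual in ps."""
    gates = gate_delays_ns()
    setting = max((k for k, v in gates.items() if v <= target_ns), key=gates.get)
    return setting, gates[setting], (target_ns - gates[setting]) * 1e3


setting, delay_ns, residual_ps = nearest_gate_below(80.0385)
print(setting, f"{delay_ns:.4f} ns", f"residual {residual_ps:.1f} ps")
```

Under these assumptions, the residual returned for the P1 delay of 80.0385 ns is the part that the analogue phase shifter would have to supply on top of the selected coarse and fine gates.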

The combined line presented in Figure 2 comprises one analogue phase shifter preceded by an SP3T switch. In this application, only one micro-Doppler detector is needed, and it is placed at the output of the phase shifter. This structure allows the optimal operating point for a correlation detector to be found in the case of a change of target position.

There is another variant of this solution that is possible when there are three correlation detectors connected to the subsequent output taps of the tapped line, and the phase shifter is placed between switched and tapped delay lines. In this case, there is no need for an SP3T switch (or in general, an N-way SPNT switch).

#### *4.3. Calculations of Optimal Parameters of an Analogue Tapped Delay Line*

According to the new concept, the main assumption for optimization of the tapped line design is that the same level of signal power is obtained at each tap, i.e., the coupled signal port.

Denoted in terms of power,


where *N* is the maximal number of taps, for lossless couplers, one may notice

$$x\_1 = (1 - C\_1)x \tag{10}$$

and

$$y\_1 = C\_1 x.\tag{11}$$

Substituting further for *y*<sub>2</sub> and *x*<sub>2</sub> results in

$$y\_2 = C\_2 x\_1 = C\_2 (1 - C\_1) x \tag{12}$$

and

$$x\_2 = (1 - C\_2)x\_1 = (1 - C\_1)(1 - C\_2)x.\tag{13}$$

The considered situation is shown in Figure 6.

**Figure 6.** General scheme of a cascaded tapped delay line used in considerations, with variables denoted.

The condition for equal coupled signal powers is

$$y\_1 = y\_2 = \dots = y\_N, \tag{14}$$

which gives the result

$$C\_2 = \frac{C\_1}{1 - C\_1} \tag{15}$$

and

$$C\_N = \frac{C\_{N-1}}{1 - C\_{N-1}}.\tag{16}$$

Finally, by substituting coupling factors for subsequent taps, one may obtain, for tap number *n*,

$$C\_n = \frac{C\_1}{1 - (n - 1)C\_1}.\tag{17}$$

Because the last tap does not need a coupler, the value of *C<sub>N</sub>* equals 1 and, using Equation (17), one may find

$$C\_n = \frac{1}{N + 1 - n}. \tag{18}$$

The equation above (18) expresses the value of the power coupling factor as a function of the tap number *n* for the assumed overall number *N* of taps. With the use of this equation, the example values of coupling factors were calculated and are presented in Table 1. The calculations were done for three values of the overall tap number, i.e., *N* = 4, 8, and 16.
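The step from the recursion (16) to the closed form (18) can be made explicit. As a sketch, writing the recursion for a general tap index *n* and taking reciprocals (this intermediate form is not given in the original text):

$$\frac{1}{C\_n} = \frac{1 - C\_{n-1}}{C\_{n-1}} = \frac{1}{C\_{n-1}} - 1 \quad\Rightarrow\quad \frac{1}{C\_n} = \frac{1}{C\_1} - (n - 1),$$

which is Equation (17) after inversion. Setting *n* = *N* with *C<sub>N</sub>* = 1 gives *C*<sub>1</sub> = 1/*N*, and substituting this back yields

$$\frac{1}{C\_n} = N - (n - 1) = N + 1 - n,$$

i.e., Equation (18).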

**Table 1.** Calculated values of power coupling factors for values of the overall tap number: *N* = 4, 8, and 16.


When the design guidelines above are implemented, the power transmission factor for every tap output with respect to the main input of the whole delay line is the same and equals 1/*N*. This is true for lossless unit delay lines and couplers; when these devices are lossy, the insertion losses accumulate and increase with the tap number.
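The equal-power property can be verified numerically. The sketch below is an illustration under the lossless assumption stated above, not the authors' implementation. It takes *x* as the input power, *x<sub>n</sub>* as the through power after the *n*-th coupler, and *y<sub>n</sub>* as the power coupled to tap *n*, as inferred from Equations (10)–(13); the function names are hypothetical. Exact fractions are used so that the 1/*N* result is not blurred by rounding.

```python
import math
from fractions import Fraction


def coupling_factors(n_taps):
    """Power coupling factors C_1..C_N from Equation (18)."""
    return [Fraction(1, n_taps + 1 - n) for n in range(1, n_taps + 1)]


def tap_powers(factors, p_in=Fraction(1)):
    """Coupled power at each tap of a lossless cascade (Equations (10)-(13))."""
    powers, through = [], p_in
    for c in factors:
        powers.append(c * through)  # y_n = C_n * x_(n-1)
        through *= 1 - c            # x_n = (1 - C_n) * x_(n-1)
    return powers


def to_db(power_ratio):
    """Power ratio expressed in decibels."""
    return 10.0 * math.log10(power_ratio)


for n_taps in (4, 8, 16):
    factors = coupling_factors(n_taps)
    # Every tap should receive exactly 1/N of the input power.
    assert all(p == Fraction(1, n_taps) for p in tap_powers(factors))
    print(n_taps, [f"{to_db(c):.2f} dB" for c in factors])
```

The loop covers the three tap counts considered in the text (*N* = 4, 8, and 16) and prints each coupling factor in decibels, which is the form in which couplers are usually specified.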

Assuming that the constant tapped power rule is taken into account, the design methodology of the microwave tapped analogue delay line should follow


Further optimization of the whole delay line structure, e.g., possible integration of coupling structures with circuits of unit delay lines, can be pursued in the case of one's own design (especially possible when planar technology is chosen).

#### **5. Discussion**

For very high frequencies in the microwave range, simple analogue systems of the correlative detector with a controlled microwave delay line can be used, because digital realization of the micro-movement detector at microwave frequencies is very difficult and demands very high-speed analogue-to-digital conversion. As far as existing delay line technologies are concerned, the conclusion is that, in spite of constant development, there is no ideal solution that combines low losses, a high delay, a low cost, and small dimensions.

The choice of delay line solution depends on the expected application. One may use commercially available components or design one's own dedicated solutions. When the type and technology of unit delay lines is chosen, the whole structure of the delay line set should be considered in order to fulfill the requirements of the given project.

For a radar (especially a noise radar) with an analogue correlator, the optimal solution seems to be a combined delay line consisting of switched and tapped parts. Parameters such as the number of control bits, the number of delay states, and the unit delay value are chosen to optimize signal losses, bandwidth, structure complexity, maximal detection range, and the range gate grid.

There are various possible applications of combined delay lines when noise radar is used:


The natural competitors of analogue signal processing methods in a noise radar are digital techniques, which rely on calculation of the correlation function in the digital domain. This solution requires direct analogue-to-digital (AD) conversion of the microwave signal in a given bandwidth. However, for very high frequencies in the microwave range, simple analogue systems of the correlative detector can still be used, because direct sampling of the microwave signal and digital realization of the autocorrelation function for these frequencies are very difficult and require extremely fast AD conversion.

The methods presented in this paper use analogue correlation detection in the microwave band, in contrast to processing of the received signal in the primary band. This allows movement to be detected precisely by exploiting the internal structure of the correlation function of the noise signal. The solution that cascades the digitally-controlled switched delay line with the tapped delay line combines the ability to synthesize large delay values (switched line) with the simultaneous availability of several time delay values (tapped line). An additional precision analogue phase shifter allows the optimal operating point of a micro-Doppler detector to be found.

**Author Contributions:** Conceptualization, Z.S. and W.S.; methodology, Z.S.; investigation, W.S.; writing—original draft preparation, Z.S.; writing—review and editing, W.S.

**Funding:** This research was funded by the Polish Ministry of Defense, grant number GBMON/13-996/2018.

**Conflicts of Interest:** The authors declare no conflicts of interest. The sponsors had no role in the design, execution, interpretation, or writing of the study.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Sensors* Editorial Office E-mail: sensors@mdpi.com www.mdpi.com/journal/sensors
