1. Introduction
High-resolution inverse synthetic aperture radar (ISAR) transmits a wideband signal to achieve a high-resolution range profile and synthesizes a virtual aperture through the motion of the target to achieve high resolution along the azimuth direction. Unlike optical imaging, ISAR can operate in all-day and all-weather environments and has been used in various military applications, e.g., space situation awareness and air target surveillance [1,2]. Generally, a well-focused ISAR image can be obtained from high signal-to-noise ratio (SNR) and complete echoes using the range-Doppler (RD) algorithm or the polar format algorithm (PFA) [3]. In practice, the long observation distance in a low-elevation-angle environment may result in low SNR. Additionally, in complex electromagnetic environments with active and passive interference, the non-cooperative nature of the target and radar resource scheduling [4] may result in incomplete echoes, and the available imaging algorithms based on Fourier analysis cannot obtain satisfactory results. Considering the sparse nature of the scattering centers in the image domain, high-resolution ISAR imaging based on sparse signal reconstruction has received intensive attention in recent years [5,6].
Sparse signal reconstruction methods convert the sparse ISAR imaging problem into a sparse signal reconstruction problem and then search for the optimal solution via $\ell_0$-norm or $\ell_1$-norm optimization. The $\ell_0$-norm methods, such as orthogonal matching pursuit (OMP) [7] and smoothed L0 (SL0) [8], are sensitive to noise and prone to local optima. For $\ell_1$-norm optimization, the fast iterative shrinkage-thresholding algorithm (FISTA) [9] and the alternating direction method of multipliers (ADMM) [10] can guarantee the sparsest solution. However, these methods usually require careful tuning of the regularization parameters, which is still an open problem. For 2D imaging, the vectorized optimization requires a long processing time and a large amount of memory. To tackle this problem, 2D SL0 [11], 2D FISTA [12], and 2D ADMM [13], which are based on matrix operations, have been proposed. Although the abovementioned sparse signal reconstruction methods have clear physical significance and strong theoretical support, their performance degrades rapidly under improper parameter settings.
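To illustrate the matrix-operation idea behind these 2D methods, the following sketch shows the element-wise complex soft-thresholding step that 2D FISTA- and 2D ADMM-type solvers apply to the image matrix at each iteration; the threshold `lam` and the array sizes are arbitrary placeholders, not values from the cited papers.

```python
import numpy as np

def soft_threshold(X: np.ndarray, lam: float) -> np.ndarray:
    """Element-wise complex soft-thresholding: shrinks magnitudes toward zero.

    This is the proximal operator of the l1-norm applied directly to a 2D
    (complex) image matrix, as used inside matrix-form sparse solvers.
    """
    mag = np.abs(X)
    # Avoid division by zero; zero entries stay zero after shrinkage anyway.
    scale = np.maximum(mag - lam, 0.0) / np.maximum(mag, 1e-12)
    return scale * X

# Usage: shrink a noisy complex image estimate with an illustrative threshold
X_hat = np.random.randn(64, 64) + 1j * np.random.randn(64, 64)
X_sparse = soft_threshold(X_hat, lam=0.1)
```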
Apart from traditional optimization methods, deep networks have recently provided unprecedented performance gains in ISAR imaging. The available deep networks mainly include (1) model-driven methods and (2) data-driven methods. Model-driven methods are generally based on the unrolling technique, which was first proposed by expanding the iterative shrinkage-thresholding algorithm (ISTA) to improve the computational efficiency of sparse coding through end-to-end training [14]. Specifically, in ISAR imaging, the model-driven methods expand the iterative steps of a sparse signal reconstruction method into a deep network with a finite number of layers, treat the adjustable parameters as network parameters, and obtain their optimal values by network training. Finally, they output the focused image of an unknown target from the trained network. Typical model-driven networks include AF-AMPNet [15], 2D-ADMM-Net (2D-ADN) [16], and convolution iterative shrinkage-thresholding (CIST) [17]. Although model-driven methods have strong interpretability and satisfactory reconstruction performance, the optimal parameters are sensitive to SNR. Therefore, to obtain a well-focused image under various SNRs, it is necessary to build a model set, which increases both time and space complexity.
The data-driven methods directly learn the nonlinear mapping between the input (e.g., the RD image) and the label image to achieve high-resolution imaging by designing and training deep networks. Facilitated by off-line network training, these methods can reconstruct multiple images rapidly. Typical data-driven methods include the fully convolutional neural network (FCNN) [18] and UNet [19]. Data-driven methods generally adopt a hierarchical architecture composed of many layers and a large number of parameters (possibly millions); thus, they are capable of learning complicated mappings [14]. Specifically, in ISAR imaging, they have demonstrated robustness to various noise levels, i.e., a single trained network can achieve well-focused imaging of echoes with various SNRs. However, the subjective network design process lacks a unified criterion and theoretical support, which makes it difficult to analyze the influence of the network structure and parameter settings on the reconstruction performance. In addition, they usually suffer from poor generalization performance and fail to obtain satisfactory imaging results with incomplete data, as will be demonstrated in Section 5.
Plug-and-play (PnP) [20] is a non-convex framework that combines proximal algorithms with advanced denoiser priors [21], e.g., block-matching and 3D filtering (BM3D) [22] or the deep denoising network DnCNN [23]. Recently, PnP has achieved great empirical success in a large variety of imaging applications [24,25,26,27], owing to its effectiveness and flexibility, especially with the integration of a deep denoising network. Compared with the original iterative methods, PnP methods offer more promising imaging results, due to their powerful denoising performance. However, PnP is highly sensitive to parameter selection. In addition, parameter tuning requires several trials, which is cumbersome and time-consuming.
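As a concrete illustration of the framework, the sketch below shows a generic PnP-ADMM loop in which the proximal step of the prior is replaced by a call to an off-the-shelf denoiser. The function names `forward_update` and `denoiser`, the fixed penalty `rho`, and the iteration count are illustrative assumptions rather than the formulation derived later in this article.

```python
import numpy as np

def pnp_admm(Y, forward_update, denoiser, rho=1.0, n_iter=30):
    """Generic plug-and-play ADMM skeleton.

    Y              : observed (possibly incomplete) echo matrix
    forward_update : callable solving the data-fidelity subproblem given Y,
                     the current target Z - M, and the penalty rho
    denoiser       : any denoiser (BM3D, DnCNN, ...) acting on a 2D image
    """
    X = np.zeros_like(Y)
    Z = np.zeros_like(Y)
    M = np.zeros_like(Y)          # scaled Lagrange multiplier
    for _ in range(n_iter):
        X = forward_update(Y, Z - M, rho)   # data-consistency step
        Z = denoiser(X + M)                 # prior step: plugged-in denoiser
        M = M + (X - Z)                     # multiplier (dual) update
    return X
```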
To tackle the abovementioned issues, this article proposes the plug-and-play 2D ADMM-Net (PAN) for high-resolution 2D ISAR imaging in complex environments. The key contributions include the following:
The 2D ADMM algorithm for sparse ISAR imaging is derived. On this basis, PnP 2D ADMM is proposed by combining the deep denoising network DnCNN with the 2D ADMM algorithm, which significantly improves the reconstruction performance.
To tackle the issues of parameter selection and tuning in PnP 2D ADMM, PAN is designed by unrolling PnP 2D ADMM, and the adjustable parameters are estimated through end-to-end training. In this way, the sensitivity of model-driven deep networks to noise and the poor performance of data-driven deep networks on incomplete data are effectively addressed.
Although PAN is trained only on simulated data, experiments show that it generalizes to measured data with different SNRs and obtains well-focused images.
The remainder of this article is organized as follows: Section 2 establishes the sparse ISAR observation model, provides the iterative formulae of 2D ADMM, and proposes PnP 2D ADMM for high-resolution 2D imaging. Section 3 introduces the structure of PAN in detail. Section 4 describes the training of PAN. Section 5 presents various experiments to demonstrate the effectiveness of PAN. Section 6 discusses the depth of PAN, and Section 7 concludes the article with suggestions for future work.
3. Structure of PAN
To tackle the issue of optimal internal parameter selection in PnP 2D ADMM, we modify the structure of PnP 2D ADMM and expand it into a learnable deep architecture, i.e., PAN, employing the unrolling technique. As shown in Figure 1, the network has N stages, and stage n, n = 1, ..., N, implements the n-th iteration described in Table 1. Typically, one stage consists of three layers, i.e., the reconstruction layer, the denoising layer (whose weights are network parameters), and the multiplier update layer, which correspond to (10), (13), and (12), respectively. The connections in Figure 1 indicate the inputs of each layer. By these means, the penalty parameters and the update rate are trainable, and the internal parameters are adjustable in each iteration.
3.1. Reconstruction Layer
The inputs of the reconstruction layer are the outputs of the denoising layer and the multiplier update layer of the previous stage, and the output of the reconstruction layer is given by Equation (14), where the penalty parameter is the adjustable network parameter.
In the first stage, the outputs of the denoising layer and the multiplier update layer are initialized to zero matrices and, thus, Equation (14) can be rewritten as Equation (15). Except in the last stage, the output of the reconstruction layer serves as the input of the denoising layer and the multiplier update layer of the same stage; in the last stage, the output of the reconstruction layer is the input of the loss function.
3.2. Denoising Layer
As shown in Figure 2, the inputs of the denoising layer are the output of the reconstruction layer and the output of the previous multiplier update layer, and the output of the denoising layer is given by Equation (16), where the embedded deep network is a DnCNN without residual learning and batch normalization. As shown in Figure 2, the first three layers are convolution layers followed by the ReLU activation function, and each layer has 64 convolution kernels of the same size. In addition, we add a final convolutional layer with a single convolution kernel.
In the first stage, the output of the previous multiplier update layer is initialized as a zero matrix, and Equation (16) reduces to Equation (17). The output of the denoising layer serves as the input of the multiplier update layer of the same stage and the reconstruction layer of the next stage.
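A minimal PyTorch sketch of such a denoising network is given below: three convolution layers with 64 kernels each, followed by ReLU, plus a final convolution with a single kernel, without residual learning or batch normalization. The 3 × 3 kernel size and the single-channel input (e.g., real and imaginary parts handled separately) are assumptions, as these details are not restated here.

```python
import torch
import torch.nn as nn

class DenoiserCNN(nn.Module):
    """DnCNN-style denoiser without residual learning or batch normalization."""

    def __init__(self, channels: int = 1, features: int = 64, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        layers = []
        in_ch = channels
        # First three layers: convolution (64 kernels each) followed by ReLU.
        for _ in range(3):
            layers += [nn.Conv2d(in_ch, features, kernel_size, padding=pad),
                       nn.ReLU(inplace=True)]
            in_ch = features
        # Final layer: a single convolution kernel mapping back to the image.
        layers.append(nn.Conv2d(in_ch, channels, kernel_size, padding=pad))
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Usage: denoise a batch of 1-channel 64x64 images
z = DenoiserCNN()(torch.randn(4, 1, 64, 64))
```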
3.3. Multiplier Update Layer
The inputs of the multiplier update layer are the output of the previous multiplier update layer, the output of the reconstruction layer, and the output of the denoising layer, and the output of the multiplier update layer is given by Equation (18), where the update rate is the adjustable network parameter.
In the first stage, the output of the previous multiplier update layer is initialized as a zero matrix, and Equation (18) can be expressed as Equation (19). Except for the last multiplier update layer, the output of the multiplier update layer is the input of the reconstruction layer, the denoising layer, and the multiplier update layer of the next stage; the output of the last multiplier update layer is only the input of the next reconstruction layer.
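Putting the three layers together, one unrolled stage and the overall network can be sketched in PyTorch as follows. The data-consistency solver `reconstruct`, the stage count, and the real-valued tensors are placeholders; the closed-form updates of Equations (10), (13), and (12) and the handling of complex-valued echoes are not reproduced here.

```python
import torch
import torch.nn as nn

class PANStage(nn.Module):
    """One unrolled stage: reconstruction -> denoising -> multiplier update."""

    def __init__(self, denoiser: nn.Module):
        super().__init__()
        self.denoiser = denoiser                      # per-stage DnCNN-style net
        self.rho = nn.Parameter(torch.tensor(1.0))    # learnable penalty parameter
        self.eta = nn.Parameter(torch.tensor(1.0))    # learnable update rate

    def forward(self, Y, Z, M, reconstruct):
        # Reconstruction layer: data-consistency update given Y, Z, M, and rho.
        X = reconstruct(Y, Z, M, self.rho)
        # Denoising layer: plugged-in network applied to X + M.
        Z_new = self.denoiser(X + M)
        # Multiplier update layer with learnable update rate.
        M_new = M + self.eta * (X - Z_new)
        return X, Z_new, M_new

class PAN(nn.Module):
    """N unrolled stages with stage-wise learnable parameters (N assumed here)."""

    def __init__(self, make_denoiser, n_stages: int = 8):
        super().__init__()
        self.stages = nn.ModuleList([PANStage(make_denoiser()) for _ in range(n_stages)])

    def forward(self, Y, reconstruct):
        Z = torch.zeros_like(Y)   # auxiliary variable initialized to zero
        M = torch.zeros_like(Y)   # multiplier initialized to zero
        for stage in self.stages:
            X, Z, M = stage(Y, Z, M, reconstruct)
        return X                  # final reconstruction feeds the training loss
```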
5. Experimental Results
In this section, we demonstrate the effectiveness of PAN for high-resolution imaging from 2D incomplete data. The missing-data pattern of the 2D incomplete data is shown in Figure 3a, where the white bars denote the available data, the black ones denote the missing data, and the loss rate is 50%.
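For illustration, a random missing-data mask of this kind can be generated as sketched below; whether the gaps lie along the range or azimuth dimension (or both) is an assumption, since only the bar pattern of Figure 3a is described here.

```python
import numpy as np

def random_loss_mask(shape, loss_rate=0.5, axis=1, rng=None):
    """Boolean mask that randomly discards whole columns (or rows) of the echo."""
    rng = np.random.default_rng() if rng is None else rng
    n = shape[axis]
    keep = rng.permutation(n)[: int(round(n * (1 - loss_rate)))]
    mask = np.zeros(shape, dtype=bool)
    if axis == 1:
        mask[:, keep] = True   # keep the selected azimuth columns
    else:
        mask[keep, :] = True   # keep the selected range rows
    return mask

# Example: keep 50% of the azimuth pulses of a 64x64 echo matrix
Y_incomplete = np.ones((64, 64), dtype=complex) * random_loss_mask((64, 64), 0.5)
```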
For network training, 1000 image pairs were generated as the simulated data set, where each label image contains randomly distributed scattering centers with Gaussian amplitudes. The training set consisted of 800 samples and the test set of 200 samples. The SNR of the range-compressed echoes of the training set ranged from 2 dB to 20 dB, while the SNR of the range-compressed echoes of the test set was set to 5 dB, 10 dB, and 15 dB, respectively. The label image of a typical noise-free test sample is shown in Figure 3b.
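As an illustration of how such label images might be generated, the sketch below places a random number of point scattering centers at random pixel positions with Gaussian-distributed amplitudes; the image size and the range of scatterer counts are assumed values, not taken from the actual data set.

```python
import numpy as np

def make_label_image(size=64, n_scatterers=(5, 30), rng=None):
    """Generate one sparse label image with randomly placed scattering centers."""
    rng = np.random.default_rng() if rng is None else rng
    img = np.zeros((size, size), dtype=np.float32)
    k = rng.integers(*n_scatterers)                     # random number of scatterers
    rows = rng.integers(0, size, k)
    cols = rng.integers(0, size, k)
    img[rows, cols] = np.abs(rng.normal(1.0, 0.3, k))   # Gaussian-distributed amplitudes
    return img

# Build a toy data set of 1000 label images
dataset = [make_label_image() for _ in range(1000)]
```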
To evaluate the imaging and generalization performance of PAN on measured data, two additional test sets of airplanes were selected. For complete data and high SNR, their RD images are shown in Figure 3c,d. Then, the SNR of the range-compressed echoes was set to 5 dB, 10 dB, and 15 dB, respectively, by adding Gaussian noise.
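A common way to set the SNR of the range-compressed echoes, shown here only as an illustrative sketch, is to scale zero-mean complex Gaussian noise against the measured signal power:

```python
import numpy as np

def add_noise_at_snr(echo: np.ndarray, snr_db: float, rng=None) -> np.ndarray:
    """Add zero-mean complex Gaussian noise so the result has the given SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    sig_power = np.mean(np.abs(echo) ** 2)
    noise_power = sig_power / (10.0 ** (snr_db / 10.0))
    noise = np.sqrt(noise_power / 2) * (rng.standard_normal(echo.shape)
                                        + 1j * rng.standard_normal(echo.shape))
    return echo + noise

# Example: generate 10 dB test echoes from a clean range-compressed echo matrix
noisy = add_noise_at_snr(np.ones((64, 64), dtype=complex), snr_db=10)
```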
For comparison, we provide the imaging results of 2D-ADN [16], UNet [19], and PnP 2D ADMM. According to the analysis given in Section 1, a model set needs to be built for the noise-level-dependent, model-driven 2D-ADN, as the SNR varies among echoes. Therefore, we trained three 2D-ADNs for SNRs of 5 dB, 10 dB, and 15 dB, respectively.
For quantitative performance evaluation, we calculated the normalized mean square error (NMSE), the peak signal-to-noise ratio (PSNR), the structural similarity index measure (SSIM), the image entropy (ENT), and the running time.
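For reference, minimal NumPy implementations of NMSE, PSNR, and image entropy, consistent with their usual definitions, are sketched below; SSIM is omitted, since a library implementation such as scikit-image's `structural_similarity` is typically used, and the exact normalization conventions of the paper may differ.

```python
import numpy as np

def nmse(x_hat, x_ref):
    """Normalized mean square error between reconstruction and reference."""
    return np.sum(np.abs(x_hat - x_ref) ** 2) / np.sum(np.abs(x_ref) ** 2)

def psnr(x_hat, x_ref):
    """Peak signal-to-noise ratio in dB, using the reference peak value."""
    mse = np.mean(np.abs(x_hat - x_ref) ** 2)
    peak = np.max(np.abs(x_ref))
    return 10.0 * np.log10(peak ** 2 / mse)

def entropy(img):
    """Image entropy: lower values indicate a better-focused ISAR image."""
    p = np.abs(img) ** 2
    p = p / np.sum(p)
    p = p[p > 0]
    return -np.sum(p * np.log(p))
```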
The computing platform was an Intel i9-10920X 3.50-GHz computer with a 12-core processor and 64 GB RAM. UNet, PnP 2D ADMM, and PAN were implemented on an NVIDIA RTX 3090 GPU with 24 GB RAM using the PyTorch framework, whereas 2D-ADN was implemented on the CPU in MATLAB.
5.1. 2D Incomplete Data
For the test data illustrated in Figure 3b, the imaging results are shown in Figure 4, and for the test set, the average metrics of the 200 samples with SNRs of 5 dB, 10 dB, and 15 dB are shown in Table 1, Table 2, and Table 3, respectively. It can be observed that the data-driven model, i.e., UNet, gives unsatisfactory imaging results on incomplete 2D data, due to the lack of theoretical support and sparse constraints. On the contrary, the remaining methods are based on 2D ADMM and demonstrate better reconstruction performance on the incomplete data.
In addition, the numbers of parameters of the different methods are shown in Table 4. Although PAN has only 1/70 of the parameters of UNet, it achieved the highest PSNR and SSIM and the smallest NMSE (except for the 2D-ADN models trained for a single SNR), demonstrating its superior imaging performance and robustness to different noise levels. Moreover, the penalty parameters of PnP 2D ADMM had to be adjusted manually to obtain better imaging results, which led to low efficiency. PnP 2D ADMM also had the longest running time, as more iterations were required to obtain well-focused imaging results.
For the measured data illustrated in Figure 3c, the imaging results are shown in Figure 5, and the corresponding entropy (ENT) and running time are shown in Table 5. It is observed that although 2D-ADN achieved a lower entropy in low-SNR scenarios, its generalization performance was poor, as shown in Figure 5. On the contrary, PAN reconstructed a more complete target structure, demonstrating its superior reconstruction performance on the measured data. In addition, the UNet results contained artifacts, indicating its poor generalization performance.
For another set of measured data, illustrated in Figure 3d, the imaging results are shown in Figure 6, and the corresponding entropy and running time are listed in Table 6. It can be seen that PAN achieved the best imaging results, indicating its good generalization performance on the measured data.
5.2. Different Loss Rates
In order to further analyze the imaging performance of PAN under different loss rates, we designed additional experiments with only 25% and 10% of the data available; the missing-data patterns for the 75% and 90% loss rates are shown in Figure 7a,b, respectively.
For the test data illustrated in Figure 3b, the imaging results are shown in Figure 8, and for the test set, the average metrics of the 200 samples with SNRs of 5 dB, 10 dB, and 15 dB are shown in Table 7, Table 8, and Table 9, respectively. It is observed that the test data with a 90% loss rate yielded the worst imaging result at an SNR of 5 dB. However, when the SNR was raised to 10 dB or 15 dB, the imaging performance improved rapidly. Likewise, when the loss rate was reduced to 75% or 50%, the imaging performance also improved rapidly.
For the measured data illustrated in Figure 3c, the imaging results are shown in Figure 9, and the corresponding entropy (ENT) and running time are shown in Table 10. It is observed that the results for the 90% loss rate contained many false points and artifacts. However, when the loss rate was reduced to 75% or 50%, the imaging performance improved rapidly.
For another measured data set, illustrated in Figure 3d, the imaging results are shown in Figure 10, and the corresponding entropy and running time are listed in Table 11. The test data with a 75% loss rate yielded a better imaging result than that with a 90% loss rate, demonstrating that the loss rate has a significant impact on the imaging performance.