Technical Note

A Two-Stage SAR Image Generation Algorithm Based on GAN with Reinforced Constraint Filtering and Compensation Techniques

Ming Liu, Hongchen Wang, Shichao Chen *, Mingliang Tao and Jingbiao Wei
1 School of Computer Science, Shaanxi Normal University, Xi’an 710119, China
2 School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710072, China
3 Army Aviation Research Institute, Beijing 101121, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(11), 1963; https://doi.org/10.3390/rs16111963
Submission received: 26 April 2024 / Revised: 24 May 2024 / Accepted: 28 May 2024 / Published: 30 May 2024

Abstract

Generative adversarial networks (GANs) can generate diverse, high-resolution images for data augmentation. However, when a GAN is applied to a synthetic aperture radar (SAR) dataset, the generated categories are not of uniform quality, and unrealistic categories degrade the performance of subsequent automatic target recognition (ATR). To overcome this problem, we propose a reinforced constraint filtering with compensation afterwards GAN (RCFCA-GAN) algorithm to generate SAR images. The proposed algorithm includes two stages. In Stage 1, we focus on improving the quality of the easily generated categories. In Stage 2, we record the categories that are hard to generate and compensate for them using traditional augmentation methods. Thus, the overall quality of the generated images is improved. We conduct experiments on the moving and stationary target acquisition and recognition (MSTAR) dataset. The recognition accuracy and Fréchet inception distance (FID) achieved by the proposed algorithm indicate its effectiveness.

1. Introduction

Synthetic aperture radar (SAR) enjoys a good reputation in the domain of remote sensing due to its imaging capability, which is independent of flight altitude and weather conditions. It is widely used in environmental surveillance [1], military reconnaissance [2], automatic target recognition (ATR) [3], crop monitoring [4], and other civil applications [5].
In recent years, deep learning recognition algorithms have been applied to many domains such as face recognition [6,7], plant disease detection [8,9], autonomous driving [10], etc. Deep learning algorithms rely on deep networks that usually consist of layers of convolution, batch normalization, and activation. These algorithms are able to automatically extract the feature representation of the dataset and make reliable predictions for real-world samples [11,12,13]. Deep learning algorithms are also widely used in SAR ATR, since the network can be trained without human interference and can achieve better results compared to hand-crafted features used in traditional SAR ATR methods [14,15,16].
However, the number of available SAR images cannot meet the requirement of deep learning algorithms for sufficient training data [17,18,19]. SAR image acquisition is complex and time-consuming [20]. Also, it is laborious to label a sufficient number of SAR images for the deep learning algorithms [21,22]. Therefore, exploring an appropriate method to augment small-sized SAR datasets becomes an important issue.
In general, there are three main kinds of augmentation algorithms to expand SAR datasets.
Firstly, the traditional augmentation methods used in SAR include rotating, shifting, noise adding, and pose synthesis [23,24]. These methods are easy to implement and the results are quick to obtain. However, the additional useful information brought by these operations is limited [25].
Secondly, SAR image simulators are often used to expand the dataset, such as ray-tracing-based RaySAR [26] and coherent ray tracing SAR simulator (CohRaS) [27]. The images are generated using electromagnetic computation in these simulators. However, the generation results are easily influenced by geometric and radiation accuracy [20].
Another way of expanding the dataset is to use deep learning augmentation methods. There are two main branches of generative models: the generative adversarial network (GAN) [28] and the auto-encoder (AE) [29]. AE encodes the data into a low-dimensional representation with an encoder and decodes that representation back to the original data. However, because AE uses a small latent dimension, it cannot fit complex distributions, and some of its generation results are unsatisfactory [30].
GAN generates images by pitting a generator against a discriminator. The generator tries to make its outputs good enough to deceive the discriminator, while the discriminator attempts to distinguish the generated results from the real data. In this way, high-quality images are generated through a min–max game played until the two roles reach the Nash equilibrium [28]. However, the original GAN still has some limitations, and various improved architectures have been proposed. Wasserstein GAN with gradient penalty (WGAN-GP) [31] uses the Wasserstein distance instead of the Jensen–Shannon (JS) divergence [32], which alleviates the unstable training caused by gradient explosion [33,34]. Conditional GAN (CGAN) [35] feeds a conditional vector, such as a category, semantic label, or other conditional information, to the generator so that the label of the generated data is controllable. Auxiliary classifier GAN (ACGAN) [36] forces the discriminator’s network to reconstruct the label of the generated image, enhancing the ability of the network and feeding additional information back to the generator.
As for GAN-based methods used in SAR datasets, Cao et al. [37] propose label-directed GAN (LDGAN), which combines the architectures and the loss functions of WGAN-GP and CGAN to generate an SAR image of the desired label. Multi-constraint GAN (MCGAN) [38] introduces a pretrained classifier in order to ensure the correctness of the category of the generated image. Popular modules are also introduced to GAN, such as improved self-attention GAN (ISGAN) [39], which integrates a self-attention module with GAN to generate images of higher quality. Methods like dual discriminator and high frequency pass filter (DH-GAN) [40] attempt to combine physical characteristics with a GAN generation procedure to generate realistic images. Recently, there has also been a series of GAN methods which concentrate on generating SAR images of designated azimuth angles, including pose estimator and auxiliary classifier GAN (PeaceGAN) [41], azimuth-controllable GAN (AG-GAN) [42], and angle transformation GAN (ATGAN) [43].
However, from the generation results of these GANs, we observe a common problem: the generated images are not of uniform quality, and some of them belong to incorrect categories. When images of poor quality are fed to SAR ATR models, performance is severely affected. To solve this problem, we propose a reinforced constraint filtering with compensation afterwards GAN (RCFCA-GAN) algorithm. The proposed algorithm has two stages. Stage 1 aims to improve the quality and ensure the categorical correctness of the easily generated images. It is implemented by applying reinforced constraint filtering to the images that are returned from the discriminator and the auxiliary classifier. The reinforced constraint filtering retains the images that are both good in quality and correct in category, while images of poor quality or of the wrong category are discarded. Learning from the easy and correct images helps to adjust the updating direction and distance of the generator’s parameters. Stage 2 deals with the less trained categories that are discarded in Stage 1 and augments them using traditional augmentation methods as a compensation. We use traditional methods on the bottom-n categories because the GAN generates poor results for them; in this scenario, traditional methods provide new information to the dataset without introducing wrong feature information to the ATR model. The generated SAR images are evaluated using the Fréchet inception distance (FID) and recognition accuracy. The proposed RCFCA-GAN algorithm shows improvements on these experimental metrics compared with other GAN-based methods.
This paper is organized as follows: Section 2 briefly introduces the related works of the proposed algorithm, including auxiliary classifier GAN (ACGAN) and the original top-k method. In Section 3, we describe the proposed RCFCA-GAN algorithm. In Section 4, we conduct experiments on recognition accuracy and the FID metric to validate the effectiveness of the proposed algorithm. Finally, Section 5 concludes the paper.

2. Related Works

2.1. ACGAN

ACGAN is a variant of GAN featuring an auxiliary classifier in its architecture. The auxiliary classifier shares the same network and parameters with the discriminator, so the network becomes dual-purposed: it not only measures the authenticity of the generated images, but also reconstructs their labels. Forcing the network to perform both tasks results in better performance for the generator, as the extra information introduced by the auxiliary classifier is fed back to the generator [36]. To serve both purposes and output both the scalar score and the label prediction, the final layer of the network is split into two branches. One ends with a sigmoid that outputs a scalar score indicating the fidelity of the generated images. The other ends with a Softmax layer that outputs a prediction composed of a series of probabilities, each representing the probability that an image belongs to a specific category.
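For illustration, a minimal PyTorch sketch of such a dual-headed network is given below; the layer sizes, input resolution, and class count are placeholders rather than the architecture used in this paper.

```python
import torch
import torch.nn as nn

class ACGANDiscriminator(nn.Module):
    """Shared trunk with two heads: a sigmoid real/fake score and a
    Softmax class prediction, as in ACGAN (sizes are illustrative)."""
    def __init__(self, num_classes=10, img_channels=1):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(img_channels, 64, 4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),            # 32x32 -> 16x16
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2),
            nn.Flatten(),
        )
        feat_dim = 128 * 16 * 16
        self.adv_head = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())            # fidelity score
        self.cls_head = nn.Sequential(nn.Linear(feat_dim, num_classes), nn.Softmax(dim=1))  # class probabilities

    def forward(self, x):
        feat = self.trunk(x)
        return self.adv_head(feat), self.cls_head(feat)

# usage sketch: score, class_probs = ACGANDiscriminator()(torch.randn(8, 1, 64, 64))
```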
In ACGAN, the loss function is divided into two parts: the source loss $L_S$ and the label reconstruction loss $L_C$. They can be described as follows:

$$L_C = \mathbb{E}\left[\log P(C = c \mid X_{real})\right] + \mathbb{E}\left[\log P(C = c \mid X_{fake})\right]$$

$$L_S = \mathbb{E}\left[\log P(S = real \mid X_{real})\right] + \mathbb{E}\left[\log P(S = fake \mid X_{fake})\right]$$

where $X_{fake}$ denotes the fake image and $X_{real}$ denotes the real image. $C$ indicates a certain category label, and $S$ indicates the source of the image; two sources are considered: $real$ and $fake$. $\mathbb{E}$ denotes the expectation. $P(C \mid X)$ is the conditional probability of the label reconstruction result $C$ given the image $X$, and $P(S \mid X)$ is the conditional probability of the source result $S$ given the image $X$. The generator tries to maximize $L_C - L_S$, as it attempts to make the generated images as realistic as possible. The discriminator tries to maximize $L_C + L_S$, so that it can tell fake images from real ones. The two roles of GAN continue to play the min–max game until they reach the Nash equilibrium.
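In minimization form (the negative of the maximization objectives above), the two losses could be computed as in the following sketch; the exact loss implementation used in this paper may differ.

```python
import torch
import torch.nn.functional as F

def acgan_losses(d_real, d_fake, c_real, c_fake, labels):
    """Illustrative ACGAN losses in minimization form.
    d_real, d_fake: sigmoid scores in (0, 1); c_real, c_fake: Softmax
    class probabilities; labels: ground-truth / conditioned class indices."""
    # L_S: separate real from fake (binary cross entropy)
    l_s = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
          F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    # L_C: reconstruct the class label of real and fake images
    l_c = F.nll_loss(torch.log(c_real + 1e-8), labels) + \
          F.nll_loss(torch.log(c_fake + 1e-8), labels)
    # the discriminator is updated to minimize l_s + l_c; the generator typically
    # uses the fake-sample parts of these terms, with the source term flipped
    return l_s, l_c
```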

2.2. Top-k Training

In GAN, the discriminator is essentially a binary classifier that separates fake images from real ones. For example, in the original GAN, the discriminator outputs a scalar score ranging from 0 to 1, which indicates the probability of an image being real; a higher output can be interpreted as a better generation result. The idea of top-k training [44] is to exploit this meaning of the discriminator’s output to improve the quality of the generated images. The top-k method first ranks the generated images according to the scores assigned by the discriminator. Then, it returns only the top-k images to the generator to optimize the update direction of the generator’s parameters. After the top-k selection, the parameter updates move closer to the global optimum, as measured by cosine similarity and change in distance.
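A minimal sketch of the top-k idea is given below; it uses a non-saturating generator loss as an example, which may differ from the exact formulation in [44].

```python
import torch

def top_k_generator_loss(d_scores, k):
    """Keep only the k generated samples rated most realistic by the
    discriminator and compute the generator loss on that subset only."""
    k = min(int(k), d_scores.numel())
    top_scores, _ = torch.topk(d_scores.view(-1), k)   # highest scores = best samples
    return -torch.log(top_scores + 1e-8).mean()        # non-saturating generator loss
```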

3. The Proposed RCFCA-GAN Algorithm

3.1. Problems with Traditional GAN Models

From the generation results of previous GAN models, we noticed that not all of the generated images are of the same quality. Some images are completely blurred, and their categories can hardly be recognized from their shape and characteristics. Augmenting the original dataset with these unrealistic images will therefore negatively influence target recognition, and the additional information they provide is unreliable. Figure 1 shows the generation results of a traditional GAN. We can see that the quality of some generated images is poor, so they cannot be used in subsequent recognition model training.
The reason is that some data in the dataset are hard for the GAN to generate, which results in inconsistent generation quality within the augmented dataset. To solve this problem, we augment the dataset in two stages.
In Stage 1, we concentrate on generating the images that are easy to train during GAN training. We put a reinforced constraint on the returned images by taking advantage of the outputs of both the auxiliary classifier (AC) and the discriminator (D). The main idea is to discard the generated images that do not perform well in AC and D, and preserve those that perform well and rank in the top-k. In this way, we can weaken the bad influences caused by the data that are difficult to generate when we update G. This is helpful for G to focus on the data that are easily learnt.
However, discarding the images in Stage 1 means some categories in the dataset are less trained compared to those that always rank in the top-k. To compensate for this, we record these categories and augment them using the traditional augmentation method in Stage 2.

3.2. Stage 1 of RCFCA-GAN Algorithm: Reinforced Constraint Filtering Based on Top-k

Firstly, we conduct prediction filtering on AC. Among all the predictions made by AC, we determine which predictions are good enough to be returned to G.
However, we cannot directly judge how good a prediction is, since for each input image AC outputs a prediction containing a series of probabilities, each indicating the probability that the image belongs to a specific category.
In order to determine whether a prediction is good or not, we use the L2 distance to evaluate the difference between AC’s prediction and the input label of G. A smaller L2 distance indicates a better-quality generated image. We select the predictions whose L2 distance ranks in the top-k. Because AC is not fully trained in the early epochs, its predictions are unreliable, and selecting among them would be meaningless; we therefore set k to the full batch size at the very start to avoid the selection. We then apply a decay rate to k, decaying it each epoch to gradually tighten the selection standard until k descends to a certain value, which can be expressed as follows:
$$k_{cur} = \begin{cases} k_{prev} \times \varphi, & k_{prev} > \eta_1 \\ k_{prev}, & k_{prev} \le \eta_1 \end{cases}$$

where $k_{cur}$ is the value of k in the current training epoch, $k_{prev}$ is the value in the previous epoch, and $\varphi$ is the decay factor; k decays by the factor $\varphi$ as long as it is greater than the threshold $\eta_1$. After the selection is finished, a filtered set $S_{AC}$ is formed. The original auxiliary classifier returns to the generator the binary cross entropy (BCE) loss calculated over all generated images for its parameter update. The proposed reinforced constraint filtering instead applies top-k filtering at the auxiliary classifier and selects the generated images whose categories are correct according to its output. The generator is then updated using the BCE loss calculated over the filtered set $S_{AC}$. The easy and correct images help to adjust its parameter updating direction and distance, ensuring the correctness of the categories of the generated images.
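A rough sketch of the k schedule and the AC-side selection is given below, assuming one-hot input labels and the settings of Section 4.1 (φ = 0.99, η₁ = 70% of a batch of 64); it is illustrative rather than the exact implementation.

```python
import torch

def decay_k(k_prev, phi=0.99, eta1=0.7 * 64):
    """Per-epoch schedule: k decays by the factor phi while it exceeds eta1."""
    return k_prev * phi if k_prev > eta1 else k_prev

def filter_by_classifier(ac_probs, input_labels_onehot, k):
    """Indices of the k generated images whose AC prediction is closest
    (L2 distance) to the label fed to the generator -- the set S_AC."""
    l2 = torch.norm(ac_probs - input_labels_onehot, dim=1)   # smaller = better
    k = min(int(round(k)), l2.numel())
    _, idx = torch.topk(-l2, k)                              # k smallest distances
    return idx
```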
As for D, we also apply top-k filtering to the generated images according to D’s output: high-score images are preserved, while low-score ones are discarded, forming the filtered set $S_D$. The two filtering steps on AC and D are stacked together; we refer to this combination as reinforced constraint filtering.
Finally, we return the common part of $S_{AC}$ and $S_D$ to G, which updates its parameters accordingly. The three steps above can be summarized as: (1) calculating the distance between the predictions and the input labels and retaining the images whose distance ranks in the top-k; (2) sorting and retaining the images whose discriminator scores rank in the top-k; (3) finding their common part and returning it to the generator. The three steps are illustrated in Figure 2.
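The three steps could be combined roughly as in the following sketch, where the D-side selection mirrors the AC-side selection shown earlier; the generator’s loss is then computed only over the returned indices.

```python
import torch

def reinforced_constraint_filter(d_scores, ac_probs, labels_onehot, k):
    """Indices of generated images ranking in the top-k for BOTH the
    discriminator score (S_D) and the AC label reconstruction (S_AC)."""
    k = min(int(round(k)), d_scores.numel())
    _, s_d = torch.topk(d_scores.view(-1), k)                    # step 2: S_D
    l2 = torch.norm(ac_probs - labels_onehot, dim=1)
    _, s_ac = torch.topk(-l2, k)                                 # step 1: S_AC
    common = sorted(set(s_d.tolist()) & set(s_ac.tolist()))      # step 3: intersection
    return torch.tensor(common, dtype=torch.long)
```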
In each training epoch, when D and AC give positive results, i.e., a scalar close to 1 or a prediction close to the original category, it means the generated images are of good quality. More importantly, it also indicates that G has learnt this part of the data correctly, so that it can generate realistic images that deceive D and AC. We can interpret this as: this part of the data is easy for G to learn. Therefore, by returning only the well-behaved images to G, we intentionally force G to learn the data that are easily learnt.

3.3. Stage 2 of the RCFCA-GAN Algorithm: Compensation Afterwards for Less Trained Categories in Bottom-n

In Stage 1, we discard the images that D and AC judge to be poor in quality and avoid updating the parameters with them. Many generated images are discarded because of their low probabilities of being real. As a result, some categories are discarded more frequently than others; they are seldom returned to the generator and therefore receive fewer training updates. We refer to these as “less trained categories”. To identify them, we record the categories that occur in the bottom-n in each epoch during training; the occurrence counts are accumulated and ranked. We augment the categories whose occurrence counts rank in the top w with traditional dataset augmentation methods, including noising and shifting, and do not use the GAN-generated results to augment the original dataset for them. Specifically, we add 5 dB noise to the images belonging to the bottom-n categories and, considering that the target is positioned in the center of a SAR image, we crop the image from left to right and from top to bottom while keeping the target intact. This serves as compensation for the less trained categories. For example, as demonstrated in Figure 3, in epoch 100 and epoch 101, the categories of the images that appear in the bottom 4 are recorded and ranked. The records are updated in each epoch.
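A sketch of the two compensating operations, under our reading of the text (noise added at a 5 dB signal-to-noise ratio, and shifted crops swept across the centered target); the crop size and step are placeholders, not values from the paper.

```python
import numpy as np

def add_noise_5db(img, snr_db=5.0, seed=None):
    """Add white Gaussian noise at the given SNR (dB) relative to image power."""
    rng = np.random.default_rng(seed)
    sig_power = np.mean(img.astype(np.float64) ** 2)
    noise_power = sig_power / (10.0 ** (snr_db / 10.0))
    return img + rng.normal(0.0, np.sqrt(noise_power), img.shape)

def shifted_crops(img, crop=112, step=8):
    """Slide a crop window left-to-right and top-to-bottom over the
    centered target, producing several shifted copies of the image."""
    h, w = img.shape
    crops = []
    for top in range(0, h - crop + 1, step):
        for left in range(0, w - crop + 1, step):
            crops.append(img[top:top + crop, left:left + crop])
    return crops
```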
Note that since the generator produces poor results for the bottom-w categories, these results cannot be directly used to augment the original dataset. Therefore, in Stage 2, we augment the categories that are discarded in Stage 1 using traditional augmentation methods. The purpose of traditional augmentation is to provide the SAR dataset with extra images that do not introduce incorrect features which might mislead the ATR model.
As for the value of n, at the very beginning of training, D and AC are unreliable, so records of the occurrence counts are also meaningless. We therefore let n ascend from a small value as the epochs progress until it reaches the threshold $\eta_2$, which can be expressed as follows:

$$n_{cur} = \begin{cases} n_{prev} \times \theta, & n_{prev} < \eta_2 \\ n_{prev}, & n_{prev} \ge \eta_2 \end{cases}$$

where $\theta$ is the ascending rate, $n_{prev}$ is the value of n in the previous epoch, and $n_{cur}$ is the value of n in the current epoch.
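The per-epoch bookkeeping can be summarized by a small sketch; the initial n = 6, θ = 1.05, and η₂ = 25% of the batch size follow Section 4.1, while everything else is illustrative.

```python
from collections import Counter

def ascend_n(n_prev, theta=1.05, eta2=0.25 * 64):
    """n grows by the factor theta each epoch until it reaches eta2."""
    return n_prev * theta if n_prev < eta2 else n_prev

def record_bottom_n(d_scores, labels, counter, n):
    """Count how often each category falls into the bottom-n discriminator
    scores of the current epoch (lowest scores = hardest to generate)."""
    order = sorted(range(len(d_scores)), key=lambda i: d_scores[i])[:int(n)]
    counter.update(labels[i] for i in order)
    return counter

# usage: counter = Counter(); n = 6
# each epoch: counter = record_bottom_n(scores, labels, counter, n); n = ascend_n(n)
# after training: counter.most_common() gives the categories to compensate in Stage 2
```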

3.4. Summary of the Proposed RCFCA-GAN Algorithm

In the proposed RCFCA-GAN algorithm, we firstly conduct reinforced constraint filtering according to the outputs of D and AC to select top-k images. In this way, we can make G concentrate on the images that are easy to generate. Then, for the data that are hard to generate, we record the categories that occur most frequently in bottom-n and augment these categories by traditional augmentation methods to compensate.
The loss function adopted in the RCFCA-GAN algorithm can be described as follows:

$$L_{WGAN\text{-}GP} = \mathbb{E}\left[D(X_{real})\right] - \mathbb{E}\left[D(X_{fake})\right] + \lambda \left( \left\lVert \nabla_{Z} D(Z) \right\rVert_2 - 1 \right)^2$$

$$Z = X_{real} + \beta \left( X_{fake} - X_{real} \right)$$

where $D(X_{real})$ is the discriminator’s output for real images, $D(X_{fake})$ is the discriminator’s output for fake images, and the last term is the gradient penalty that implements the Lipschitz constraint, with $\lambda$ as its coefficient. The second equation defines $Z$ in the Lipschitz constraint term: a mixture of real and fake images weighted by $\beta$, a random number ranging from 0 to 1. The final losses of the discriminator and the generator are given by:

$$L_D = L_{WGAN\text{-}GP} + L_C$$

$$L_G = L_{WGAN\text{-}GP} + L_C$$
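A sketch of the gradient penalty term, assuming a critic that returns one scalar per image; λ = 10 is a common default and is not taken from the paper.

```python
import torch

def gradient_penalty(critic, x_real, x_fake, lam=10.0):
    """WGAN-GP penalty evaluated on Z = x_real + beta * (x_fake - x_real)."""
    beta = torch.rand(x_real.size(0), 1, 1, 1, device=x_real.device)   # one beta per image
    z = (x_real + beta * (x_fake - x_real)).requires_grad_(True)
    d_z = critic(z)                                                    # scalar score per image
    grads = torch.autograd.grad(outputs=d_z.sum(), inputs=z, create_graph=True)[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return lam * ((grad_norm - 1.0) ** 2).mean()
```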
An overall procedure of the RCFCA-GAN algorithm is demonstrated in Figure 4.

4. Experiments

In order to verify the effectiveness of the proposed algorithm, we set up two experiments: (1) FID score; (2) Recognition accuracy on the SAR dataset augmented using the generated images. We use the same hyperparameter settings such as batch size, total training epochs, etc., for the RCFCA-GAN algorithm and other algorithms. We carry out the experiments on the moving and stationary target acquisition and recognition (MSTAR) dataset [45], in which 10 categories of targets are included. The images with depression angle 17° are used for training and those of 15° are used for testing. Detailed information of the dataset, including the categories and numbers of samples, is shown in Table 1.

4.1. Network Architecture and Parameter Settings

The architecture of the proposed RCFCA-GAN algorithm is presented in Figure 5.
As for the parameters, we set the learning rate to 0.0003 and the batch size to 64. For k, the initial value is set to the full batch size, and the decay rate $\varphi$ is set to 0.99. For the threshold $\eta_1$, we conduct a Fréchet inception distance (FID) experiment to evaluate the influence of different $\eta_1$ settings. FID measures the similarity between the real images and the images generated by the GAN by computing the distance between feature statistics of the two datasets, and it is an indicator with good discriminability, robustness, and computational efficiency [46]. In the experiment, we use Inception Net V3 [47] to extract features; the network is pretrained on a large dataset to ensure its feature extraction ability. After the image features are extracted by Inception Net V3, the FID score is calculated with the following equation:
$$FID = \left\lVert \mu_r - \mu_f \right\rVert^2 + \mathrm{Tr}\left( \Sigma_r + \Sigma_f - 2\left( \Sigma_r \Sigma_f \right)^{1/2} \right)$$

where $\mu_r$ and $\mu_f$ are the mean feature vectors of the real and fake images, respectively; $\mathrm{Tr}$ denotes the trace of a matrix; and $\Sigma_r$ and $\Sigma_f$ are the covariance matrices of the real and fake image features. The FID results are shown in Table 2, where BS is the abbreviation for batch size and BS × 90% indicates the setting in which $\eta_1$ equals 90% of the batch size. From Table 2, we can see that if $\eta_1$ is too large, the FID score of the generated images is high, because a large $\eta_1$ effectively loosens the filtering constraint, so more unrealistic generated images are returned to G. If $\eta_1$ is too small, the FID also becomes high, since too many samples are discarded during training and G cannot obtain enough feedback to update. When $\eta_1$ is set to 70% of the batch size, the number of remaining images is appropriate, and we obtain the best FID result.
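For reference, a common way to compute the FID in the equation above from two sets of Inception features is sketched below; this is a generic implementation, not necessarily the one used in our experiments.

```python
import numpy as np
from scipy import linalg

def fid_score(feats_real, feats_fake):
    """FID between two feature sets of shape (N, D) extracted by Inception Net V3."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_f = np.cov(feats_fake, rowvar=False)
    covmean, _ = linalg.sqrtm(sigma_r @ sigma_f, disp=False)  # matrix square root
    covmean = covmean.real                                    # drop numerical imaginary parts
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(sigma_r + sigma_f - 2.0 * covmean))
```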
For n, we set the initial value to 6 and the increase rate $\theta$ to 1.05. Its threshold $\eta_2$ is set to 25% of the batch size, which means we record, at most, the categories of the bottom 25% of images in each epoch. The number of categories augmented using traditional methods is dynamic: if a category accounts for more than 15% of the final occurrence records, it is deemed hard to generate and is augmented using traditional methods.

4.2. FID of Generated Images

In this part, we evaluate the FID of different GAN algorithms to demonstrate the effectiveness of the proposed RCFCA-GAN algorithm. We train ACGAN [37], ACGAN with the WGAN-GP loss function [30] (named “WACGAN-GP”), WACGAN-GP using the single top-k method (named “single top-k”), and the proposed algorithm implementing only Stage 1 (named “RCFCA-GAN (Stage 1)”), respectively. We implement only Stage 1 in this experiment because FID evaluates the generation results of the GAN model, and it would not be appropriate to compute the FID of a dataset augmented by the traditional methods of Stage 2. The results are shown in Table 3, from which we can observe that the FID score of the RCFCA-GAN algorithm is 38.3, the lowest of the four. The improvement in FID is due to the reinforced constraint filtering, which reduces the bad influence of the hard-to-learn images.
Figure 6 shows the real images and the images generated by the four GAN methods. The images in each blue rectangular frame belong to one target category; within each frame, the images from left to right are the real image and the images generated by ACGAN, WACGAN-GP, single top-k, and RCFCA-GAN (Stage 1), respectively. From Figure 6, we can see that the first three generation methods can hardly reconstruct the features of the targets. Blurred and unrealistic areas are highlighted with yellow and red circles. For example, in the T72 result generated by ACGAN, the generated image has two tank guns, whereas there is only one in the real image. In the WACGAN-GP result, the generated 2S1 image has an unrealistic bright spot that does not exist in the real images. Many of the single top-k results are blurred. In contrast, the images obtained by the proposed RCFCA-GAN (Stage 1) show more realistic characteristics than the other three methods.

4.3. Recognition Accuracy

In this part, we utilize multiple recognition models, including AlexNet [48], AConvNet [49], VGG [50], and ResNet [51], to evaluate the target recognition performance of the dataset expanded by the proposed RCFCA-GAN algorithm. The training set contains the 17° depression angle images in the MSTAR dataset together with the generated images, and the testing set contains the 15° depression angle images, as shown in Table 1. The parameters of the four recognition models follow the default settings in their original papers. Five GAN-based augmentation algorithms, including ACGAN, WACGAN-GP, single top-k, RCFCA-GAN (Stage 1), and RCFCA-GAN (Stage 1 + Stage 2), are compared in terms of recognition accuracy. The results are demonstrated in Table 4. From Table 4, we can see that the accuracy improves when the original dataset is expanded using ACGAN, WACGAN-GP, and single top-k, which validates the effectiveness of GAN-based dataset augmentation. When RCFCA-GAN (Stage 1) is used, the accuracy of the four recognition models improves further, since the proposed algorithm imposes a filtering constraint on the returned images in Stage 1; this demonstrates the effectiveness of Stage 1. When the compensation with traditional augmentation methods for the less trained categories is also applied (RCFCA-GAN (Stage 1 + Stage 2)), we achieve the best accuracy of 97.9% with the VGG recognition model, validating the effectiveness of the compensational augmentation in Stage 2. The confusion matrices of the proposed RCFCA-GAN algorithm on these four recognition models are presented in Figure 7.

5. Conclusions

In this paper, we proposed an RCFCA-GAN algorithm to improve the overall quality of the generated images and thus enhance the performance of SAR ATR models. From the lowest FID result obtained by the proposed algorithm, we validate the effectiveness of the reinforced constraint filtering in Stage 1 on improving the quality of the images that are easy to generate. The constraint forces the generator to learn the easy parts of the data and reduces the influences of bad images on GAN training. From the highest SAR recognition accuracy achieved by the proposed RCFCA-GAN algorithm, we validate the usefulness of combining the filtering in Stage 1 with the compensational augmentation in Stage 2. The proposed RCFCA-GAN algorithm can be used to augment SAR datasets effectively.

Author Contributions

Conceptualization, M.L.; methodology, M.L., H.W. and S.C.; software, H.W.; validation, M.L., H.W. and S.C.; writing—original draft preparation, M.L. and H.W.; writing—review and editing, S.C., M.T. and J.W.; supervision, S.C.; funding acquisition, S.C., M.T. and J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported in part by the National Natural Science Foundation of China under Grant 62271408 and Grant 61971355, in part by the Special Support Program for High Level Talents of Shaanxi Province, in part by the Project of Xi’an Science and Technology Planning Foundation under Grant 22GXFW0021, and in part by the Fundamental Research Funds for the Central Universities under Grant G2024KY05104 (Corresponding author: Shichao Chen).

Data Availability Statement

The MSTAR dataset used in the paper can be obtained at: https://www.sdms.afrl.af.mil/index.php?collection=mstar (accessed on 27 May 2024).

Acknowledgments

We would like to thank the reviewers for their valuable and constructive feedback, which has greatly contributed to the enhancement of this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wu, X.; Zhang, Z.; Xiong, S.; Zhang, W.; Tang, J.; Li, Z.; An, B.; Li, R. A Near-Real-Time Flood Detection Method Based on Deep Learning and SAR Images. Remote Sens. 2023, 15, 2046. [Google Scholar] [CrossRef]
  2. Sun, Z.; Leng, X.; Lei, Y.; Xiong, B.; Ji, K.; Kuang, G. BiFA-YOLO: A Novel YOLO-Based Method for Arbitrary-Oriented Ship Detection in High-Resolution SAR Images. Remote Sens. 2021, 13, 4209. [Google Scholar] [CrossRef]
  3. Ding, B.; Wen, G.; Huang, X.; Ma, C.; Yang, X. Target Recognition in Synthetic Aperture Radar Images via Matching of Attributed Scattering Centers. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 3334–3347. [Google Scholar] [CrossRef]
  4. Zhou, X.; Wang, J.; Shan, B.; He, Y. Early-Season Crop Classification Based on Local Window Attention Transformer with Time-Series RCM and Sentinel-1. Remote Sens. 2024, 16, 1376. [Google Scholar] [CrossRef]
  5. Sajjad, M.; Wang, J.; Afzal, Z.; Hussain, S.; Siddique, A.; Khan, R.; Iqbal, J. Assessing the Impacts of Groundwater Depletion and Aquifer Degradation on Land Subsidence in Lahore, Pakistan: A PS-InSAR Approach for Sustainable Urban Development. Remote Sens. 2023, 15, 5418. [Google Scholar] [CrossRef]
  6. Shakeel, M.S.; Lam, K.M. Deep-Feature Encoding-Based Discriminative Model for Age-Invariant Face Recognition. Pattern Recognit. 2019, 93, 442–457. [Google Scholar] [CrossRef]
  7. He, X.; Yan, S.; Hu, Y.; Niyogi, P.; Zhang, H.J. Face Recognition Using Laplacianfaces. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 328–340. [Google Scholar]
  8. Alabi, T.R.; Adewopo, J.; Duke, O.P.; Kumar, P.L. Banana Mapping in Heterogenous Smallholder Farming Systems Using High-Resolution Remote Sensing Imagery and Machine Learning Models with Implications for Banana Bunchy Top Disease Surveillance. Remote Sens. 2022, 14, 5206. [Google Scholar] [CrossRef]
  9. Tetila, E.C.; Machado, B.B.; Menezes, G.K.; Oliveira, A.S. Automatic Recognition of Soybean Leaf Diseases Using UAV Images and Deep Convolutional Neural Networks. IEEE Geosci. Remote Sens. Lett. 2020, 17, 903–907. [Google Scholar] [CrossRef]
  10. Muhammad, K.; Ullah, A.; Lloret, J.; Del, J.; Abuquerque, V.H.C. Deep Learning for Safe Autonomous Driving: Current Challenges and Future Directions. IEEE Trans. Intell. Transp. Syst. 2020, 22, 4316–4336. [Google Scholar] [CrossRef]
  11. Du, P.; Li, E.; Xia, J.; Samat, A.; Bai, X. Feature and Model Level Fusion of Pretrained CNN for Remote Sensing Scene Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 2600–2611. [Google Scholar] [CrossRef]
  12. Zhang, X.; Wang, L.; Su, Y. Visual place recognition: A Survey From Deep Learning Perspective. Pattern Recognit. 2021, 113, 107760–107802. [Google Scholar] [CrossRef]
  13. Rouast, P.V.; Adam, M.T.; Chiong, R. Deep Learning for Human Affect Recognition: Insights and New Developments. IEEE Trans. Affect. Comput. 2019, 12, 524–543. [Google Scholar] [CrossRef]
  14. Yu, Z.; Yu, L.; Cheng, P.; Chen, J.; Chi, C. A Comprehensive Survey on SAR ATR in Deep-Learning Era. Remote Sens. 2023, 15, 1454. [Google Scholar] [CrossRef]
  15. Zhou, J.; Shi, Z.; Xiao, C.; Qiang, F. Automatic Target Recognition of SAR Images Based on Global Scattering Center Model. IEEE Trans. Geosci. Remote Sens. 2011, 49, 3713–3729. [Google Scholar]
  16. Wen, X.; Zhang, S.; Wang, J.; Yao, T.; Tang, Y. A CFAR-Enhanced Ship Detector for SAR Images Based on YOLOv5s. Remote Sens. 2024, 16, 733. [Google Scholar] [CrossRef]
  17. Huang, Z.; Pan, Z.; Lei, B. Transfer Learning with Deep Convolutional Neural Network for SAR Target Classification with Limited Labeled Data. Remote Sens. 2017, 9, 907. [Google Scholar] [CrossRef]
  18. Wang, L.; Qi, Y.; Mathiopoulos, P.T.; Zhao, C.; Mazhar, S. An Improved SAR Ship Classification Method Using Text-to-Image Generation-Based Data Augmentation and Squeeze and Excitation. Remote Sens. 2024, 16, 1299. [Google Scholar] [CrossRef]
  19. Yang, Y.; Chen, J.; Sun, L.; Zhou, Z.; Huang, Z.; Wu, B. Unsupervised Domain-Adaptive SAR Ship Detection Based on Cross-Domain Feature Interaction and Data Contribution Balance. Remote Sens. 2024, 16, 420. [Google Scholar] [CrossRef]
  20. Balz, T.; Stilla, U. Hybrid GPU-Based Single-and Double-Bounce SAR Simulation. IEEE Trans. Geosci. Remote Sens. 2009, 47, 3519–3529. [Google Scholar] [CrossRef]
  21. Yu, Q.; Hu, H.; Geng, X.; Jiang, Y.; An, J. High-Performance SAR Automatic Target Recognition under Limited Data Condition Based on a Deep Feature Fusion Network. IEEE Access 2019, 7, 165646–165658. [Google Scholar] [CrossRef]
  22. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  23. Hao, X.; Liu, L.; Yang, R.; Yin, L.; Zhang, L.; Li, X. A Review of Data Augmentation Methods of Remote Sensing Image Target Recognition. Remote Sens. 2023, 15, 827. [Google Scholar] [CrossRef]
  24. Yang, R.; Wang, R.; Deng, Y.; Jia, X.; Zhang, H. Rethinking the Random Cropping Data Augmentation Method Used in the Training of CNN-Based SAR Image Ship Detector. Remote Sens. 2020, 13, 34. [Google Scholar] [CrossRef]
  25. Khalifa, N.E.; Loey, M.; Mirjalili, S. A Comprehensive Survey of Recent Trends in Deep Learning for Digital Images Augmentation. Artif. Intell. Rev. 2022, 55, 2351–2377. [Google Scholar] [CrossRef] [PubMed]
  26. Tao, J.; Auer, S.; Palubinskas, G.; Reinartz, P.; Bamler, R. Automatic SAR Simulation Technique for Object Identification in Complex Urban Scenarios. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 994–1003. [Google Scholar] [CrossRef]
  27. Hammer, H.; Schulz, K. Coherent Simulation of SAR Images. SPIE 2009, 7477, 74771G-1–74771G-9. [Google Scholar]
  28. Wang, C.; Xu, C.; Yao, X.; Tao, D. Evolutionary Generative Adversarial Networks. IEEE Trans. Evol. Comput. 2019, 23, 921–934. [Google Scholar] [CrossRef]
  29. Gao, R.; Hou, X.; Qin, J. Zero-VAE-GAN: Generating Unseen Features for Generalized and Transductive Zero-Shot Learning. IEEE Trans. Image Process. 2020, 29, 3665–3680. [Google Scholar] [CrossRef]
  30. Liu, Y.; Liu, Z.; Li, S.; Yu, Z.; Guo, Y.; Liu, Q.; Wang, G. Cloud-VAE: Variational Autoencoder with Concepts Embedded. Pattern Recognit. 2023, 140, 109530–109572. [Google Scholar] [CrossRef]
  31. He, J.; Ouyang, M.; Chen, Z.; Chen, D.; Liu, S. A Deep Transfer Learning Fault Diagnosis Method Based on WGAN and Minimum Singular Value for Non-Homologous Bearing. IEEE Trans. Instrum. Meas. 2022, 71, 3509109. [Google Scholar] [CrossRef]
  32. Yang, W.; Song, H.; Huang, X.; Xu, X.; Liao, M. Change Detection in High-Resolution SAR Images Based on Jensen–Shannon Divergence and Hierarchical Markov Model. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 3318–3327. [Google Scholar] [CrossRef]
  33. Shi, Y.; Han, L.; Chang, S.; Hu, T.; Dancey, D. A Latent Encoder Coupled Generative Adversarial Network (LE-GAN) for Efficient Hyperspectral Image Super-Resolution. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5534819. [Google Scholar] [CrossRef]
  34. Li, W.; Fan, L.; Wang, Z.; Ma, C.; Cui, X. Tackling Mode Collapse in Multi-Generator GANs with Orthogonal Vectors. Pattern Recognit. 2021, 110, 107646–107690. [Google Scholar] [CrossRef]
  35. Yang, X.; Zhao, J.; Wei, Z.; Wang, N.; Gao, X. SAR-to-Optical Image Translation Based on Improved CGAN. Pattern Recognit. 2022, 121, 108208–108217. [Google Scholar] [CrossRef]
  36. Zhang, X.; Wang, Z.; Lu, K.; Pan, Q.; Li, Y. Data Augmentation and Classification of Sea–Land Clutter for Over-the-Horizon Radar Using AC-VAEGAN. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5104416. [Google Scholar] [CrossRef]
  37. Cao, C.; Cao, Z.; Cui, Z. LDGAN: A Synthetic Aperture Radar Image Generation Method for Automatic Target Recognition. IEEE Trans. Geosci. Remote Sens. 2020, 58, 3495–3508. [Google Scholar] [CrossRef]
  38. Du, S.; Wang, J.; Qi, Y. A High-quality Multicategory SAR Images Generation Method with Multi-Constraint GAN for ATR. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5. [Google Scholar]
  39. Shi, X.; Xing, M.; Zhang, J.; Sun, G. ISAGAN: A High-Fidelity Full-Azimuth SAR Image Generation Method. In Proceedings of the 2022 3rd China International SAR Symposium (CISS), Shanghai, China, 2–4 November 2022; pp. 1–41. [Google Scholar]
  40. Oghim, S.; Kim, Y.; Bang, H. SAR Image Generation Method Using DH-GAN for Automatic Target Recognition. Sensors 2024, 24, 670. [Google Scholar] [CrossRef]
  41. Sun, Y.; Wang, Y.; Hu, L. Attribute-Guided Generative Adversarial Network with Improved Episode Training Strategy for Few-Shot SAR Image Generation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 1785–1801. [Google Scholar] [CrossRef]
  42. Wang, C.; Pei, J.; Liu, X. SAR Target Image Generation Method Using Azimuth-Controllable Generative Adversarial Network. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 9381–9397. [Google Scholar] [CrossRef]
  43. Zeng, Z.; Tan, X.; Zhang, X. ATGAN: A SAR Target Image Generation Method for Automatic Target Recognition. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 2986–3001. [Google Scholar] [CrossRef]
  44. Sinha, S.; Zhao, Z.; Goyal, A.C.; Odena, A. Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad Samples. Proc. Adv. Neural Inf. Process. Syst. 2020, 33, 14638–14649. [Google Scholar]
  45. Keydel, E.R.; Lee, S.W.; Moore, J.T. MSTAR Extended Operating Conditions: A Tutorial. Algorithms SAR Imagery 1996, 2757, 228–242. [Google Scholar]
  46. Szegedy, C.; Vanhoucke, V.; Ioffe, S. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; Volume 85, pp. 2818–2826. [Google Scholar]
  47. Borji, A. Pros and Cons of GAN Evaluation Measures. Comput. Vis. Image Underst. 2019, 179, 41–65. [Google Scholar] [CrossRef]
  48. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet Classification with Deep Convolutional Neural Networks. Proc Adv. Neural Inf. Process. Syst. 2012, 25, 84–90. [Google Scholar] [CrossRef]
  49. Chen, S.; Wang, H.; Xu, F.; Jin, Y.Q. Target Classification Using the Deep Convolutional Networks for SAR Images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 4806–4817. [Google Scholar] [CrossRef]
  50. Ye, M.; Ni, R.; Zhang, C.; Gong, H.; Hu, T.; Li, S.; Sun, Y.; Zhang, T.; Guo, Y. A Lightweight Model of VGG-16 for Remote Sensing Image Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 6916–6922. [Google Scholar] [CrossRef]
  51. Li, L.; Wang, C.; Zhang, H.; Zhang, B. SAR Image Ship Object Generation and Classification with Improved Residual Conditional Generative Adversarial Network. IEEE Geosci. Remote Sens. Lett. 2020, 19, 4000105. [Google Scholar] [CrossRef]
Figure 1. Generation results of the traditional GAN.
Figure 2. Three steps in Stage 1 of RCFCA-GAN algorithm. (a) Top-k selection of auxiliary classifier; (b) Top-k selection of discriminator; (c) Calculate common parts of two Top-k results.
Figure 3. Category recording during training.
Figure 4. Overall procedure of the RCFCA-GAN algorithm.
Figure 5. Architecture of the proposed RCFCA-GAN algorithm.
Figure 6. Real images and generation images of ten target categories. Images from left to right are real images and generated images from ACGAN, WACGAN-GP, single top-k and RCFCA-GAN (Stage 1), respectively.
Figure 7. Confusion matrices of RCFCA-GAN algorithm on four recognition models. (a) AlexNet. (b) AConvNet. (c) VGG. (d) ResNet.
Table 1. Category and number information of MSTAR dataset of ten categories.
Target Category | Depression Angle | Number of Samples
BMP2            | 17°              | 233
BMP2            | 15°              | 587
BTR70           | 17°              | 233
BTR70           | 15°              | 196
T72             | 17°              | 232
T72             | 15°              | 582
BTR60           | 17°              | 256
BTR60           | 15°              | 195
2S1             | 17°              | 299
2S1             | 15°              | 274
BRDM2           | 17°              | 298
BRDM2           | 15°              | 274
D7              | 17°              | 299
D7              | 15°              | 274
T62             | 17°              | 299
T62             | 15°              | 273
ZIL131          | 17°              | 299
ZIL131          | 15°              | 274
ZSU23/4         | 17°              | 299
ZSU23/4         | 15°              | 274
Table 2. FID under different thresholds.
$\eta_1$ | BS × 90% | BS × 70% | BS × 50% | BS × 30%
FID      | 39.1     | 38.3     | 38.9     | 45.3
Table 3. FID of the generated images using four methods.
Generation Model     | FID Score
ACGAN                | 57.2
WACGAN-GP            | 41.5
Single top-k         | 39.6
RCFCA-GAN (Stage 1)  | 38.3
Table 4. Recognition accuracy of original dataset and five augmented datasets on multiple recognition models.
Model    | Original | ACGAN | WACGAN-GP | Single Top-k | RCFCA-GAN (Stage 1) | RCFCA-GAN (Stage 1 + Stage 2)
AlexNet  | 90.5     | 91.6  | 92.1      | 92.3         | 92.8                | 95.1
AConvNet | 95.4     | 95.8  | 96.4      | 96.9         | 97.4                | 97.8
VGG      | 94.9     | 95.6  | 96.5      | 97.0         | 97.3                | 97.9
ResNet   | 94.8     | 95.5  | 96.6      | 96.8         | 97.2                | 97.5