Article

Detection and Type Recognition of SAR Artificial Modulation Targets Based on Multi-Scale Amplitude-Phase Features

State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(12), 2107; https://doi.org/10.3390/rs16122107
Submission received: 15 May 2024 / Accepted: 6 June 2024 / Published: 11 June 2024

Abstract

For the detection and recognition of Artificial Modulation Targets (AMTs) with different modulation types, state-of-the-art neural network solutions generally suffer from overfitting and insufficient generalization. To address these problems, this paper proposes a multi-scale amplitude-phase feature discrimination method for AMTs in SAR images. First, a multi-type modulated AMT dataset (the AMT Detection and Modulation Type Recognition Dataset, ADMTR Dataset) is generated, wherein the jamming position, jamming-to-signal ratio (JSR), and modulation parameters are varied to enhance generalization. Second, a Multi-Input Multi-Output Fusion Wavelet Neural Network (MIMOFWTNN) is established, which not only uses the amplitude information of the scene but also makes full use of the phase and high-frequency information. This allows AMTs to be detected in a higher-dimensional feature space so that type recognition can be carried out with greater certainty. Comparison and ablation experiments demonstrate that the proposed network achieves an average accuracy of 96.96% on the cross-validation set and an accuracy of 99.0% on a completely independent test set, outperforming the compared methods.

1. Introduction

1.1. Relevant Background

Synthetic Aperture Radar (SAR) is a radar system that actively transmits broadband microwave signals and receives and processes the reflected echoes to form microwave images of an area scene [1,2,3]. To achieve high-resolution imaging, a broadband waveform is transmitted in the range dimension to obtain high range resolution, and a synthetic aperture is formed along the flight direction to synthesize an equivalent narrow antenna beam and obtain high azimuth resolution. SAR has high application value in remote sensing, geology, information, and other fields [4,5,6] and has attracted the attention of many scholars.
The AMT is the imaging result produced when unintentional interference signals or intentional jamming signals enter the SAR receiver and are processed by the imaging algorithm. In this paper, we focus on AMTs based on Coherent Modulation and Forwarding (CMF) technology, and especially on the specific echo signals that lead to the formation of AMTs.

1.2. Existing Problem

After the SAR receives and processes the echo signal, analyzing and interpreting the received echoes is an important subsequent task [7]. The current global electromagnetic environment is complicated. In many cases, the SAR receives unintentional or intentional CMF signals in addition to its own echoes, and these signals form AMTs during imaging, affecting the use and interpretation of the final imaging results. It is therefore necessary to analyze the received echo, determine whether AMTs are present, and, if so, identify their modulation type so that subsequent information processing can proceed. Scholars have proposed many methods to address this problem.
On the one hand, artificial intelligence and neural network technology have developed rapidly and have been applied successfully in many fields [8,9]. Some scholars in the field of computer vision therefore adopt optical image-processing methods, directly feeding the final gray-level SAR image into a neural network model and analyzing the result after imaging [10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]. Some of these methods use only one input channel containing the gray-level SAR image [26], some copy the gray-level image three times to fill the three RGB channels used for optical images [27], and some directly apply classical network structures with transfer learning, using parameters pre-trained on optical images. The mainstream deep learning approach currently adopts CNNs [28,29,30], which can abstract the data in SAR images at a higher level and make the analysis more accurate. However, some problems remain. Although these approaches can achieve good results under fixed scene conditions, they ignore the gap between SAR and optical images in the basic principle of imaging. We believe that SAR echo information and optical images should be processed differently.
On the other hand, existing AMT detection and recognition methods based on conventional algorithms or neural networks show significant performance degradation when faced with relatively rare coherent modulation types, large changes in the imaging scene, or large changes in the coherent modulation parameters [31]. A network is needed that can perform signal-level analysis of SAR echoes based on the SAR imaging principle, with generalization performance good enough to cope with large changes in the imaging scene and in the coherent modulation parameters. In addition, when datasets are established, the SAR signal principle and the types of AMT must be analyzed so that the modulation styles are richly covered and composite modulation styles are considered.

1.3. Our Work

To solve the above problems, we established the ADMTR Dataset, proposed the Multi-scale Amplitude-Phase Feature analysis method, and designed the MIMOFWTNN. This network includes three input processing branches: an amplitude information processing branch, a phase information processing branch, and a multi-scale information processing branch. Its two outputs detect the existence of AMTs and identify their CMF type. The ADMTR Dataset includes the original echo information, which can be used to train networks of different methods and can also be pulse-compressed with the saved parameters to obtain imaging results.
According to the definitions in the literature [32,33,34], the proposed network is not strictly a composite classical MIMO network but is closer to the concept of the MIMO Fusion network in [34]; therefore, the network is named MIMOFWTNN. Moreover, a rich dataset is constructed to represent the problems to be solved, which can be used to train various types of networks. The workflow of the whole article is shown in Figure 1.
The overall framework model of the method we propose in the paper is shown in Figure 2.
Finally, after training and tuning, AMT detection and modulation type recognition in SAR are improved effectively. The average validation accuracy of the network is as high as 96.96%, and there is no overfitting on the independent test set, where the accuracy is 99.0%. For further verification, we introduced InceptionV3, ResNet18, ResNet50, VGG16, DenseNet201, Xception, DarkNet19, and SqueezeNet into the comparison experiment; these networks were either far less accurate than MIMOFWTNN on average or overfitted on the completely independent test set. This demonstrates that the performance and generalization ability of MIMOFWTNN are excellent. In addition, in the ablation experiment of MIMOFWTNN, individual branches were removed for testing, which verified the effectiveness of the branches designed in this paper. The limitation of the network is that its absolute accuracy on the training and validation sets is slightly lower.
In summary, the contributions of this paper are as follows:
  • A new dataset, the AMT Detection and Modulation Type Recognition Dataset (ADMTR Dataset), is established, wherein different jamming positions, different Jamming-to-Signal Ratios (JSRs), and varying modulation parameters are considered to enhance generalization, conforming more closely to the reality of complex electronic countermeasure environments.
  • The Multi-scale Amplitude-Phase Feature analysis method is proposed to detect and identify AMTs formed in echoes, which not only uses the amplitude information of the scene but also makes full use of the phase and high-frequency information.
  • The MIMOFWTNN network is designed to realize the detection and recognition task. The network structure is improved and optimized, especially in the phase processing and multi-scale processing branches, achieving a higher accuracy on the completely independent test set than the state-of-the-art methods.
  • The focus of this work is the AMT, which is a special modulated signal rather than the real echoes and targets of SAR Automatic Target Recognition (ATR). To the best of our knowledge, related work in this field is rare, and the proposed network can not only detect AMTs but also recognize AMTs with different modulation types.
The rest of the paper is organized as follows. In Section 2, we briefly describe some of the preliminary preparations for our study and the generation of the dataset. Section 3 describes the architecture of MIMOFWTNN. Section 4 shows the MIMOFWTNN training results, as well as the comparison experiment and ablation experiment. Finally, we summarize and provide an outlook for our MIMOFWTNN in Section 5.

2. Model Construction

In this section, the scenarios of the SAR and AMT are modeled. Based on the constructed model, the ADMTR Dataset is created to train and test the neural network according to the signal processing principle.

2.1. SAR Model

Our SAR physical model is shown in Figure 3. The SAR platform moves forward along the azimuth axis Y at height h above the ground. The X axis is the horizontal projection from the SAR to the detected target on the ground, and the down-looking angle from the radar to the target is γ. The coordinate of the target can then be written as (x, y, 0), and the slant range R between the radar and the target is obtained as the 2-norm of the difference between the target coordinate and the SAR coordinate.
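As a simple illustration of this geometry, the Python sketch below computes the slant range as the 2-norm between the SAR position and the target; all numerical values are assumptions chosen only for the example and do not correspond to our simulation settings.

```python
import numpy as np

# Geometry of Figure 3: the platform flies along the azimuth axis Y at height h,
# the target lies at (x, y, 0). All values below are illustrative assumptions.
h = 5000.0            # platform height above ground (m)
v = 150.0             # platform velocity along Y (m/s)
x, y = 3000.0, 200.0  # target ground coordinates (m)

def slant_range(t_a):
    """Instantaneous slant range R(t_a): 2-norm between the SAR position and the target."""
    sar_pos = np.array([0.0, v * t_a, h])
    target = np.array([x, y, 0.0])
    return np.linalg.norm(sar_pos - target)

print(slant_range(0.0))  # slant range at slow time t_a = 0
```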
Based on the SAR model established by computer simulation, we used the inverse ωK algorithm to invert echo signals from the MiniSAR dataset, which serves as the scattering scene. To facilitate subsequent processing and control the total data size, 512 × 512-point scattering scenes were cropped from MiniSAR for imaging. Visual grayscale examples of partial scene slices are shown in Figure 4. SAR imaging adopts the Range-Doppler (RD) algorithm. Abundant training scattering scenes are used to enhance the generalization ability of the network.

2.2. AMT Modulation Signal Model

We built an AMT model and selected nine CMF modes in the countermeasure scenario. Compared with the usual case, the innovation is that in addition to 5 single modulation methods, we also chose 4 composite modulation methods as more difficult recognition options. The specific principles of each modulation are shown in Table 1.
$A_0$ is the echo amplitude. $K_r$ is the range modulation frequency of the LFM signal transmitted by the SAR. $f_c$ is the carrier frequency of the SAR signal. $t_r$ is fast time, $t_a$ is slow time, and $t_{ac}$ is the center slow time. $\omega_r$ and $\omega_a$ represent the transmitting antenna pattern weighting functions of the point target in range and azimuth. $R(t_r,t_a)$ is the distance from the SAR to the target. $c$ is the speed of light. $\mathrm{rect}(\cdot)$ is the rectangular function. $f_d$ is the shift frequency. $g(t_r)$ is the range intermittent sampling signal, $\tau$ is its pulse width, and $T_s$ is its pulse repetition period. $\delta(t_r)$ is the impulse function. $g(t_a)$ is the azimuth intermittent sampling signal. $A_m$ is the modulation amplitude, $f_m$ is the modulation frequency, and $\varphi_m$ is the initial phase of the modulated signal. $f_a(t_a)$ is the randomly generated frequency value. $f_{sh}$ is the step frequency shift of the modulated and forwarded signal. $f_p$ is the SAR pulse repetition frequency. $\Delta f_{sh}$ is the fixed frequency shift increment. $t_0$ is the time for the SAR signal to reach the modulation and forwarding device. $m$ and $n$ are integers, and the initial value of $m$ is $m_1$.
In the above, upright symbols denote scalars, italic symbols denote variables, bold upright symbols denote matrices, and bold italic symbols denote scalar matrices.
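To make the modulation models in Table 1 concrete, the sketch below generates the range intermittent sampling gate g(t_r) and applies it to a single baseband LFM pulse (the RJ case). The waveform parameters are illustrative assumptions and do not correspond to the simulation settings used for the ADMTR Dataset.

```python
import numpy as np

# Illustrative sketch of the range intermittent sampling signal g(t_r) in Table 1 (RJ):
# a rectangular gate of width tau repeated every T_s, applied to one baseband LFM pulse.
# All parameter values below are assumptions chosen only for illustration.
fs = 100e6                     # range sampling rate (Hz)
Kr = 1e12                      # range modulation frequency K_r (Hz/s)
Tp = 10e-6                     # pulse width (s)
tau, Ts = 0.5e-6, 2e-6         # sampling gate width and repetition period of g(t_r)

t_r = np.arange(-Tp / 2, Tp / 2, 1 / fs)              # fast time axis
lfm = np.exp(1j * np.pi * Kr * t_r ** 2)              # baseband LFM echo term
g = (np.mod(t_r - t_r[0], Ts) < tau).astype(float)    # intermittent sampling gate g(t_r)
rj_echo = lfm * g                                     # RJ-modulated echo (single pulse)
```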

2.3. Establish Dataset

According to the above model, the ADMTR Dataset is generated by computer simulation. Twenty 512 × 512-point scattering scenes cropped from the MiniSAR dataset are used as basic templates. They are divided into three parts at a ratio of 0.7:0.15:0.15: the echo signals generated from 14 scattering scenes are used as the training set, 3 scattering scenes as the test set, and 3 scattering scenes as the cross-validation set. The scattering scenes used by the training, test, and cross-validation sets are completely isolated from one another, so that the generalization performance of the trained network can be verified and the final trained network is robust to different scattering scenes.
To enhance the generalization ability of the network, in all scattering scenes the CMF transmitter is placed at 9 positions, 7 Jamming-to-Signal Ratios (JSRs) are used, and the modulation parameters are randomly generated within an interval. Taking the center of the 512 × 512-point scattering scene as the origin (0, 0), the 9 modulation transmitter positions are: (−128, −128), (−128, 0), (−128, 128), (0, −128), (0, 0), (0, 128), (128, −128), (128, 0), and (128, 128). The JSRs are 0 dB, 5 dB, 10 dB, 12.5 dB, 15 dB, 17.5 dB, and 20 dB.
These generation settings ensure that the trained neural network is also robust to the position of the modulator, the JSR, and the modulation parameters. The JSR is defined as the ratio between the power of the modulating and transmitting signal $P_J$ and the power of the original echo signal $P_S$, as shown in Equation (1).
$\mathrm{JSR} = 10 \lg \dfrac{P_J}{P_S}$
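A minimal sketch of how Equation (1) can be applied when constructing such samples is given below: the jamming signal is rescaled so that the ratio of its power to the echo power matches a requested JSR before superposition. The function and signal names are hypothetical.

```python
import numpy as np

def scale_to_jsr(echo, jam, jsr_db):
    """Scale the jamming signal so that 10*lg(P_J / P_S) equals the requested JSR (Eq. (1))."""
    p_s = np.mean(np.abs(echo) ** 2)          # original echo power P_S
    p_j = np.mean(np.abs(jam) ** 2)           # current jamming power P_J
    target_pj = p_s * 10 ** (jsr_db / 10)     # jamming power required by the JSR
    return jam * np.sqrt(target_pj / p_j)

# Example with assumed toy signals: scale a jamming echo to JSR = 10 dB before superposition
rng = np.random.default_rng(0)
echo = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)
jam = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)
mixed = echo + scale_to_jsr(echo, jam, 10.0)
```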
In addition, the ADMTR Dataset was augmented by adding random thermal noise, with power matched to the JSR level, to each sample, ensuring that the echo of a scene differs even when no AMT is present and that the trained network is robust to thermal noise. The ADMTR Dataset generation method is summarized in Figure 5.
Therefore, a total of 20 × 9 × 10 × 7 = 12,600 samples were generated, resulting in a dataset size of 47.2 GB. Typical samples with AMTs produced by modulation and forwarding are shown in Figure 6.
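The factor structure behind the 12,600-sample count can be sketched as follows; the class labels reuse the abbreviations defined later for Figure 16, and the enumeration is only illustrative of how scenes, positions, classes, and JSRs combine.

```python
from itertools import product

# Sketch of how the 12,600-sample count of the ADMTR Dataset arises
# (20 scenes x 9 jamming positions x 10 classes x 7 JSRs).
# Per Section 2.3, the no-AMT class is paired with matched-power thermal noise.
scenes = range(20)
positions = [(-128, -128), (-128, 0), (-128, 128), (0, -128), (0, 0),
             (0, 128), (128, -128), (128, 0), (128, 128)]
classes = ['None', 'SF', 'RJ', 'AJ', 'RAJ', 'MJ', 'MRJ', 'RJSF', 'AJSF', 'MJSF']
jsrs_db = [0, 5, 10, 12.5, 15, 17.5, 20]

samples = list(product(scenes, positions, classes, jsrs_db))
print(len(samples))   # 12600
```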
At this point, the data used for training, testing, and cross-validation have been generated.

3. Method Design

This section shows the network structure of the designed MIMOFWTNN, focusing on the phase information processing branch and multi-scale information processing branch designed in the network, and explains their physical significance.

3.1. Network Structure

For AMT recognition under CMF technology, traditional methods that extract SAR texture features based on the Short-Time Fourier Transform (STFT) or spectral estimation require a large amount of computation. Moreover, these methods only recognize a single CMF style well, lack generality across classes, and are limited in the face of complex and variable electromagnetic environments. Intelligent recognition methods based on decision trees, support vector machines, and random forests are sensitive to the JSR, with limited generalization and recognition ability in practical applications.
At the same time, although some deep learning methods based on convolutional neural networks achieve good recognition performance, they only feed the gray-level image of the conventional SAR processing result into an optical image-processing neural network, or replicate the gray-level image across the three RGB channels of a network originally designed for optical images. This ignores the fact that a SAR image is formed from echo signals rather than from the three RGB bands of an optical picture, and it discards the internal relationships between the signals. When these deep learning methods are trained, most of them directly transfer network parameters pre-trained on optical images, ignoring the difference in principle between optical imaging and SAR imaging and omitting the SAR phase information. When the modulation parameters change, overfitting easily occurs.
Therefore, MIMOFWTNN is designed to solve the above defects through the combination of multi-input multi-output fusion neural network technology and wavelet analysis technology, as well as the amplitude, phase, and multi-scale joint analysis of the signal. MIMOFWTNN achieves better network performance at the same scale, and its processing technology conforms to the analysis principle of radar signals, which has certain practical value. The detailed network structure is shown in Figure 7.
The detailed network information is shown in Table 2.
For the proposed MIMOFWTNN, the following two sections focus on its phase information processing branch and multi-scale information processing branch, as well as explaining its practical physical significance corresponding to SAR signals.
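As a rough, hypothetical sketch of the three-input, two-output layout (not the authors' actual implementation, which per Table 3 was built in MATLAB, and not the layer stack of Table 2), a PyTorch-style skeleton could look as follows; branch depths and widths are placeholders.

```python
import torch
import torch.nn as nn

class MIMOSketch(nn.Module):
    """Hypothetical three-input / two-output skeleton; layer sizes are placeholders."""
    def __init__(self, num_types=10):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.amp_branch = branch()        # amplitude information branch
        self.phase_branch = branch()      # phase information branch
        self.wavelet_branch = branch()    # multi-scale (wavelet) information branch
        self.detect_head = nn.Linear(32 * 3, 2)           # AMT present / absent
        self.type_head = nn.Linear(32 * 3, num_types)     # CMF type recognition

    def forward(self, amp, phase, wavelet):
        feat = torch.cat([self.amp_branch(amp),
                          self.phase_branch(phase),
                          self.wavelet_branch(wavelet)], dim=1)
        return self.detect_head(feat), self.type_head(feat)

# Example forward pass with assumed 512 x 512 single-channel inputs
model = MIMOSketch()
x = torch.randn(2, 1, 512, 512)
detect_logits, type_logits = model(x, x, x)
```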

3.2. Phase Information Processing Branch

After SAR echoes are received and processed, a complex-valued matrix is usually formed, which carries rich physical information. The amplitude of each complex value generally represents the scattering intensity of the scattering point. The phase is related to the wave path difference, the scatterer shape, and the scatterer surface type. Because scattering scenes are complex, with rich shapes and a variety of scatterer materials, the phases in a scattering scene are usually random and evenly distributed. However, if the scene contains an AMT echo introduced by CMF technology, its strong intensity produces an obvious texture in the phase diagram.
The purpose of the phase branch in MIMOFWTNN is to detect this texture, determine the presence of AMTs, and determine the type of modulation. Phase texture extraction is achieved using residual blocks and a multi-layer perceptron, with a focus on the phase texture of regions rather than the specific numerical values of the phases. Detecting specific phase values is not a good way to achieve the goal because the phase value at any particular point is randomly distributed. A typical visual phase diagram is shown in Figure 8, where the red box marks the region in which the AMT is present. The phase texture shows a center-symmetric trend and is similar to the interference pattern of light.
For the same scattering scene, the presence or absence of the modulated signal is clearly reflected in the visual phase diagram, as shown in Figure 9, where the phase texture presents a center-symmetric style. In fact, in terms of specific phase values, the parts outside the red box are basically the same.
In this branch, we compress the echo signal stored in the ADMTR Dataset and take the phase information as one input of MIMOFWTNN. The flow chart of the phase information processing branch is shown in Figure 10.
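A minimal sketch of this input preparation, assuming frequency-domain matched filtering for pulse compression and NumPy arrays for the stored echo, is given below; the helper names and the placeholder data are hypothetical.

```python
import numpy as np

def range_compress(raw_echo, range_ref):
    """Pulse compression along fast time via frequency-domain matched filtering."""
    ref_spec = np.conj(np.fft.fft(range_ref, n=raw_echo.shape[1]))
    return np.fft.ifft(np.fft.fft(raw_echo, axis=1) * ref_spec, axis=1)

def phase_input(compressed):
    """Phase matrix in [-pi, pi]; the branch looks at its texture, not point values."""
    return np.angle(compressed).astype(np.float32)

# Example with a random placeholder echo (512 pulses x 1024 range samples)
rng = np.random.default_rng(0)
raw = rng.standard_normal((512, 1024)) + 1j * rng.standard_normal((512, 1024))
ref = np.exp(1j * np.pi * 1e12 * (np.arange(256) / 100e6) ** 2)   # assumed LFM replica
phase = phase_input(range_compress(raw, ref))
```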
The phases of various typical CMF techniques will be significantly different after compression and extraction, and there will be differences in texture styles. A typical diagram of these nine styles is shown in Figure 11. This is why phase information can aid in type identification.
Because the figures are small in the layout, the phase disturbance region with its characteristic texture has been enclosed in a red box for clarity.

3.3. Multi-Scale Information Processing Branch

Since SAR generally images an area scene and the objects in the scene are complex, the echo signal is rich in high-frequency detail. To better analyze the high-frequency components of the signal and overcome the shortcomings of existing processing schemes that only consider the SAR gray-level image, we designed the multi-scale processing branch of MIMOFWTNN. This branch adopts the wavelet-domain analysis method.
The physical meaning of wavelet-domain analysis is similar to that of time-frequency analysis, of which it is a further development. Time-frequency analysis is usually applied to non-stationary signals: through windowed, segmented Fourier transforms, the spectral components of a non-stationary signal can be observed at different times, allowing the information in the signal to be analyzed better. The STFT is a typical time-frequency analysis method, but it has an obvious limitation in the choice of window length: if the window is narrow, the frequency resolution of the STFT is poor; if the window is long, the time resolution is poor. In addition, the STFT window is difficult to define when processing SAR echoes, because each SAR pulse is relatively independent and a single pulse is mostly an LFM signal. The STFT therefore requires a large amount of computation, and sliding windows across pulses are difficult to handle.
The wavelet transform of wavelet-domain analysis solves these problems effectively: it replaces the trigonometric basis used by the STFT with a wavelet basis, changing the basis functions from infinite to finite length, so it can resolve frequency content while also localizing short time intervals. Wavelet analysis is also better suited to extracting the high-frequency details of SAR signals. Moreover, two-dimensional wavelet analysis can be carried out jointly in range and azimuth, avoiding the problem of sliding windows between pulses, and the two-dimensional operation can be performed in parallel, which greatly improves efficiency. Therefore, two-dimensional wavelet analysis is selected for the multi-scale branch design in this paper.
When the modulation signal that produces an AMT is present, it introduces abrupt changes into the otherwise relatively smooth echo signal, which appear in the wavelet decomposition as a significant increase in the high-frequency components. For the same scattering scene, a characteristic high-frequency component is formed after visualization, as shown in Figure 12. The purpose of the multi-scale processing branch is to extract high-frequency components with significant modulation characteristics and thereby assist in identifying the AMT modulation type. After visualization, it can be seen that the echo component of the scene part is weaker, while the contour and features of the modulated signal part are more significant.
In this branch, the echo signal stored in the ADMTR Dataset is compressed first, then two-dimensional wavelet decomposition is used, and finally, the wavelet information is processed as the input of MIMOFWTNN. The flow chart of the multi-scale information processing branch is shown in Figure 13.
For the two-dimensional wavelet decomposition, we used 'db2' as the wavelet basis and carried out a first-order decomposition. The wavelet basis relation is shown in Equation (2), where $H(\omega)$ is the frequency-domain wavelet filter and $h(n)$ is the time-domain wavelet filter.
$|H(\omega)|^2 = \left[\cos^2(\omega/2)\right]^{2} P\!\left[\sin^2(\omega/2)\right], \qquad H(\omega) = \frac{1}{\sqrt{2}} \sum_{n=0}^{3} h(n)\, e^{-jn\omega}, \qquad P(x) = \sum_{n=0}^{1} C_{1+n}^{\,n}\, x^{n}, \qquad C_{m}^{\,n} = \frac{m!}{n!\,(m-n)!}$
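The relation in Equation (2) can be checked numerically against the 'db2' filter coefficients, for example with PyWavelets; the sketch below assumes the H(ω) normalization written above and compares it with the closed form P(x) = 1 + 2x.

```python
import numpy as np
import pywt

# 'db2' scaling filter (4 taps); dec_lo vs rec_lo ordering does not affect |H(w)|^2
h = np.array(pywt.Wavelet('db2').rec_lo)

w = np.linspace(0, np.pi, 512)
# Frequency response H(w) = (1/sqrt(2)) * sum_n h(n) * exp(-j*n*w)
H = (1 / np.sqrt(2)) * np.sum(h[None, :] * np.exp(-1j * np.outer(w, np.arange(4))), axis=1)

# Closed form of Eq. (2) for N = 2: |H(w)|^2 = cos^4(w/2) * (1 + 2*sin^2(w/2))
closed = np.cos(w / 2) ** 4 * (1 + 2 * np.sin(w / 2) ** 2)

print(np.max(np.abs(np.abs(H) ** 2 - closed)))   # ~1e-16: the two expressions agree
```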
From the four components of the two-dimensional wavelet decomposition, the high-frequency information within each pulse and the high-frequency information between pulses are extracted and fused into a high-frequency information matrix. The visualized high-frequency information matrices of the nine typical CMF technologies are shown in Figure 14.
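A minimal sketch of this step, assuming PyWavelets for the first-order 2-D 'db2' decomposition and a simple magnitude-sum rule (our own assumption here) for fusing the high-frequency subbands, is shown below.

```python
import numpy as np
import pywt

def high_frequency_input(compressed):
    """First-order 2-D 'db2' decomposition; fuse the three high-frequency subbands."""
    cA, (cH, cV, cD) = pywt.dwt2(np.abs(compressed), 'db2')   # LL, (LH, HL, HH)
    return np.abs(cH) + np.abs(cV) + np.abs(cD)               # fused high-frequency matrix

# Example with a random placeholder for a compressed 512 x 512 ADMTR sample
rng = np.random.default_rng(0)
hf = high_frequency_input(rng.standard_normal((512, 512)))
print(hf.shape)   # (257, 257) for a 512 x 512 input with 'db2'
```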
At this point, network design and model linking are complete. Experiments can be conducted in the next section to verify the specific effects.

4. Experimental Verification

In this section, MIMOFWTNN is trained and tested to show the practical effect of the network. The advantage of the network compared with other networks is proven through a comparison experiment, and the effectiveness of the designed branch is proven through an ablation experiment.

4.1. Training and Testing

After completing the network programming and dataset preparation, we used a desktop computer for training. The training environment and parameters are shown in Table 3.
According to the above training parameters, the neural network is trained and tested 10 times to verify the convergence stability and the effectiveness of the results. The results of these 10 tests are shown in Table 4.
Owing to the dataset size and the mini-batch size, there are 275 mini-batches per epoch, giving a total of 2750 mini-batches over 10 epochs. Cross-validation tests are performed every 100 mini-batches according to the settings. Finally, the network with the best cross-validation result over the 10 epochs is selected as the final training output, and this output network is evaluated on the complete cross-validation set.
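For reference, the 275 mini-batches per epoch follow directly from the dataset split in Section 2.3 and the mini-batch size in Table 3:

```python
# Consistency check of the mini-batch count (values from Sections 2.3 and 4.1, Table 3)
total_samples = 20 * 9 * 10 * 7           # 12,600 ADMTR samples
train_samples = total_samples * 14 // 20  # 14 of 20 scenes -> 8,820 training samples
per_epoch = train_samples // 32           # mini-batch size 32 -> 275 full mini-batches
print(per_epoch, per_epoch * 10)          # 275 per epoch, 2750 over 10 epochs
```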
Among these ten runs, the training and test data of the eighth run are randomly selected as a typical sample, and its training curve is shown in Figure 15. The blue line is the training-set accuracy curve during training, smoothed to some extent for display. The black dashed line and black markers accompanying the blue line are the cross-validation accuracy curve. The red line is the training-set loss curve, also smoothed for display. The black dashed line and black markers accompanying the red line are the cross-validation loss curve. The golden dots are the accuracy and loss of the final output network on the complete cross-validation set, plotted at the selected mini-batch round. It can be seen that the network converges well and no overfitting occurs.
The confusion matrix for the test results on a separate test set is shown in Figure 16. The output class in the figure is the analysis result after network operation, and the target class is the label truth value. A value of 0 means no interference, and values 1–9 mean CMF methods: SF, RJ, AJ, RAJ, MJ, MRJ, RJSF, AJSF, and MJSF. The final accuracy rate of 99.0% shows that the network still performs well on the test set where the SAR scene is completely independent and the modulation parameters are also independent, with only some errors occurring in the recognition of individual types. It is proven that this method can learn the essential characteristics of a SAR signal, and that the network generalization performance is good.

4.2. Comparison Experiment

We compared the proposed MIMOFWTNN with several common neural networks to verify its effectiveness: InceptionV3, ResNet18, ResNet50, VGG16, DenseNet201, Xception, DarkNet19, and SqueezeNet, each configured with amplitude-phase dual input. For each type of network, only the first six training results are used in the statistics, and the experimental results are shown in Table 5.
The fitting curves of the networks used for the comparison experiments are shown in Figure 17, and it can be seen that each of the eight networks has corresponding shortcomings. Combined with the data in Table 5, it can be found that, except for MIMOFWTNN, all networks overfit to different degrees on the independent test set, with a clear drop in accuracy. The validation accuracies of InceptionV3, ResNet18, DenseNet201, Xception, DarkNet19, and SqueezeNet are very low, and the training times of VGG16 and DenseNet201 are very long. In addition, the stored model file of VGG16 is much larger than those of the other networks. Subsequently, we conducted 20 separate training runs of MIMOFWTNN, all of which achieved good results on the independent test set without overfitting. Some of the released supporting materials can be downloaded at the relevant link.
MIMOFWTNN performs equally well on the completely independent test set; its test accuracy does not drop relative to the cross-validation accuracy, i.e., there is no overfitting, which indicates excellent generalization performance.
The line colors in Figure 17 have the same meaning as in Figure 15. It is proven that the MIMOFWTNN proposed in this paper is better in terms of convergence speed and convergence stability, and it also shows that a more effective network can be designed by combining the SAR imaging principle and the signal processing principle in network design.

4.3. Ablation Experiment

To verify the actual effect of the three designed branches, we conducted an ablation experiment on MIMOFWTNN to assess the influence of each branch on the final performance. With the other network modules kept consistent, the compared variants are: amplitude input branch only, phase input branch only, amplitude-phase dual input (addition combination), and amplitude-phase dual input (depth combination). For each network, only the first six training results were used, and the experimental results are shown in Table 6.
It can be seen from the table that the MIMOFWTNN proposed in this paper can obtain better and more stable convergence and improve the accuracy rate at the cost of sacrificing some early training time. MIMOFWTNN avoids overfitting caused by the network using only SAR grayscale pictures, and the network size does not grow much. The fitting curves of the remaining four networks used for ablation experiments are shown in Figure 18, and the color meanings of the lines in the figure are consistent with those in Figure 15. Some of the released supporting materials can be downloaded using the relevant link.
The analysis of the ablation experiments proves that the branches designed for the MIMOFWTNN proposed in this paper are effective. According to the results, the influence and importance of the multi-scale information processing branch are greater than those of the phase information processing branch. This again shows that a more effective network can be designed by combining the SAR imaging principle and the signal processing principle in network design.

5. Conclusions and Prospects

In the field of SAR information processing, existing network structures have great limitations in AMT detection and type recognition because jamming signal characteristics have been insufficiently studied. In view of this, this paper proposes the Multi-scale Amplitude-Phase Feature analysis method to detect AMTs and distinguish between different CMF types. On the one hand, we built the ADMTR Dataset, wherein nine jamming positions, seven JSRs, and varying modulation parameters are considered to enhance generalization, which is more in line with the complexity of reality. On the other hand, we designed MIMOFWTNN. The network is constructed from signal phase characteristics, wavelet analysis theory, branch fusion design, and a residual block structure, improving the detection and recognition of the interference signals that form AMTs.
Experimental results show that the average validation accuracy of MIMOFWTNN is as high as 96.96%, the accuracy on the independent test set is 99.0%, and no overfitting occurred in 20 training runs. The network performs equally well on the completely independent test set, with no drop relative to the cross-validation accuracy, indicating excellent generalization performance.
The advantages of our method are stronger generalization ability, better accuracy for independent test sets than other networks, and more consistency with the signal processing principle and physical practice. The limitation is that the absolute accuracy of the training set and cross-validation set is slightly decreased.
In future research, improvement of the training speed by optimizing the network architecture will be considered. For further communication, please contact the corresponding author’s email address.

Author Contributions

S.X. and S.Q. are co-corresponding authors. (Both of them provide funding and effort for this research as PI). Conceptualization, W.M. and S.X.; methodology, W.M. and S.X.; software, W.M.; validation, W.M., Z.C. and F.F.; formal analysis, W.M. and Z.C.; investigation, W.M. and J.W.; resources, D.F., S.X. and S.Q.; data curation, W.M. and S.Q.; writing—original draft preparation, W.M.; writing—review and editing, F.F., S.Q. and S.X.; visualization, W.M. and Z.C.; supervision, S.X. and D.F.; project administration, S.X. and S.Q.; funding acquisition, S.Q. and S.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant numbers 62001487 and 62201589.

Data Availability Statement

Parts of research data can be found on the website https://www.sandia.gov/radar/pathfinder-radar-isr-and-synthetic-aperture-radar-sar-systems/complex-data/ (accessed on 10 July 2022). We give sincere thanks to their open-source sharing.

Acknowledgments

We give sincere thanks to the administrative support from the College of Electronic Science and Technology.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, M.; Wang, R.; Deng, Y.; Wu, L.; Zhang, Z.; Zhang, H.; Li, N.; Liu, Y.; Luo, X. A Synchronization Algorithm for Spaceborne/Stationary BiSAR Imaging Based on Contrast Optimization with Direct Signal from Radar Satellite. IEEE Trans. Geosci. Remote Sens. 2016, 54, 1977–1989. [Google Scholar] [CrossRef]
  2. Fu, S.; Xu, F.; Jin, Y.-Q. Reciprocal Translation between SAR and Optical Remote Sensing Images with Cascaded-Residual Adversarial Networks. Sci. China Inf. Sci. 2021, 64, 122301. [Google Scholar] [CrossRef]
  3. Zhang, H.; Deng, Y.; Wang, R.; Li, N.; Zhao, S.; Hong, F.; Wu, L.; Loffeld, O. Spaceborne/Stationary Bistatic SAR Imaging with TerraSAR-X as an Illuminator in Staring-Spotlight Mode. IEEE Trans. Geosci. Remote Sens. 2016, 54, 5203–5216. [Google Scholar] [CrossRef]
  4. Song, Q.; Xu, F. Zero-Shot Learning of SAR Target Feature Space with Deep Generative Neural Networks. IEEE Geosci. Remote Sens. Lett. 2017, 14, 2245–2249. [Google Scholar] [CrossRef]
  5. Zhang, Z.; Wang, H.; Xu, F.; Jin, Y.-Q. Complex-Valued Convolutional Neural Network and Its Application in Polarimetric SAR Image Classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 7177–7188. [Google Scholar] [CrossRef]
  6. Hou, X.; Ao, W.; Xu, F. End-to-End Automatic Ship Detection and Recognition in High-Resolution Gaofen-3 Spaceborne SAR Images. In Proceedings of the IGARSS 2019–2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 9486–9489. [Google Scholar]
  7. Potter, L.C.; Moses, R.L. Attributed Scattering Centers for SAR ATR. IEEE Trans. Image Process. 1997, 6, 79–91. [Google Scholar] [CrossRef] [PubMed]
  8. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 6000–6010. [Google Scholar]
  9. Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to Sequence Learning with Neural Networks. In Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; MIT Press: Cambridge, MA, USA, 2014; Volume 2, pp. 3104–3112. [Google Scholar]
  10. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
  11. Ravuri, S.; Vinyals, O. Classification Accuracy Score for Conditional Generative Models. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Curran Associates Inc.: Red Hook, NY, USA, 2019; pp. 12268–12279. [Google Scholar]
  12. Metz, L.; Poole, B.; Pfau, D.; Sohl-Dickstein, J. Unrolled Generative Adversarial Networks. arXiv 2017, arXiv:1611.02163. [Google Scholar]
  13. Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A. Improved Training of Wasserstein GANs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 5769–5779. [Google Scholar]
  14. Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein GAN. In Proceedings of the International Conference on Machine Learning, Sydney, ACT, Australia, 6–11 August 2017. [Google Scholar]
  15. Menon, S.; Damian, A.; Hu, S.; Ravi, N.; Rudin, C. PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 2434–2442. [Google Scholar]
  16. Dahl, R.; Norouzi, M.; Shlens, J. Pixel Recursive Super Resolution. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 5449–5458. [Google Scholar]
  17. Chen, Y.; Tai, Y.; Liu, X.; Shen, C.; Yang, J. FSRNet: End-to-End Learning Face Super-Resolution with Facial Priors. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2492–2501. [Google Scholar]
  18. Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv 2016, arXiv:1511.06434. [Google Scholar]
  19. Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive Growing of GANs for Improved Quality, Stability, and Variation. arXiv 2018, arXiv:1710.10196. [Google Scholar]
  20. Krichen, M. Generative Adversarial Networks. In Proceedings of the 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), Delhi, India, 6–8 July 2023; pp. 1–7. [Google Scholar]
  21. Kingma, D.P.; Dhariwal, P. Glow: Generative Flow with Invertible 1×1 Convolutions. arXiv 2018, arXiv:1807.03039. [Google Scholar]
  22. Vahdat, A.; Kautz, J. NVAE: A Deep Hierarchical Variational Autoencoder. arXiv 2021, arXiv:2007.03898. [Google Scholar]
  23. Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. arXiv 2022, arXiv:1312.6114. [Google Scholar]
  24. van den Oord, A.; Kalchbrenner, N.; Vinyals, O.; Espeholt, L.; Graves, A.; Kavukcuoglu, K. Conditional Image Generation with PixelCNN Decoders. arXiv 2016, arXiv:1606.05328. [Google Scholar]
  25. van den Oord, A.; Dieleman, S.; Zen, H.; Simonyan, K.; Vinyals, O.; Graves, A.; Kalchbrenner, N.; Senior, A.; Kavukcuoglu, K. WaveNet: A Generative Model for Raw Audio. arXiv 2016, arXiv:1609.03499. [Google Scholar]
  26. Amrani, M.; Jiang, F.; Xu, Y.; Liu, S.; Zhang, S. SAR-Oriented Visual Saliency Model and Directed Acyclic Graph Support Vector Metric Based Target Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 3794–3810. [Google Scholar] [CrossRef]
  27. Coman, C.; Thaens, R. A Deep Learning SAR Target Classification Experiment on MSTAR Dataset. In Proceedings of the 2018 19th International Radar Symposium (IRS), Bonn, Germany, 20–22 June 2018; pp. 1–6. [Google Scholar]
  28. Tang, J.; Deng, C.; Huang, G.-B.; Zhao, B. Compressed-Domain Ship Detection on Spaceborne Optical Image Using Deep Neural Network and Extreme Learning Machine. IEEE Trans. Geosci. Remote Sens. 2015, 53, 1174–1185. [Google Scholar] [CrossRef]
  29. Jiao, J.; Zhang, Y.; Sun, H.; Yang, X.; Gao, X.; Hong, W.; Fu, K.; Sun, X. A Densely Connected End-to-End Neural Network for Multiscale and Multiscene SAR Ship Detection. IEEE Access 2018, 6, 20881–20892. [Google Scholar] [CrossRef]
  30. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-Based Learning Applied to Document Recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
  31. Li, X.; Zhang, G.; Cui, H.; Hou, S.; Wang, S.; Li, X.; Chen, Y.; Li, Z.; Zhang, L. MCANet: A Joint Semantic Segmentation Framework of Optical and SAR Images for Land Use Classification. Int. J. Appl. Earth Obs. Geoinf. 2022, 106, 102638. [Google Scholar] [CrossRef]
  32. Park, J. Efficient Ensemble via Rotation-Based Self- Supervised Learning Technique and Multi-Input Multi-Output Network. IEEE Access 2024, 12, 36135–36147. [Google Scholar] [CrossRef]
  33. Ferianc, M.; Rodrigues, M. MIMMO: Multi-Input Massive Multi-Output Neural Network. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada, 17–24 June 2023; IEEE: Vancouver, BC, Canada, 2023; pp. 4564–4569. [Google Scholar]
  34. Zhong, F.; Wang, G.; Chen, Z.; Yuan, X.; Xia, F. Multiple-Input Multiple-Output Fusion Network for Generalized Zero-Shot Learning. In Proceedings of the ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; IEEE: Toronto, ON, Canada, 2021; pp. 1725–1729. [Google Scholar]
Figure 1. Workflow of the whole article.
Figure 2. Overall framework of our method. Each part is explained in detail in the following sections; the detailed structure of MIMOFWTNN is shown in Figure 7.
Figure 3. SAR basic model.
Figure 4. Visual grayscale image of part of the scattering scene (using the RD algorithm).
Figure 5. ADMTR Dataset generation method.
Figure 6. Typical samples in the dataset.
Figure 7. Detailed structural explanation of MIMOFWTNN. Overall structure is shown in Figure 2.
Figure 8. A typical visual phase diagram with an AMT. Key sections are marked in red boxes.
Figure 9. Visual phase contrast diagram. No AMT on the left, AMT on the right. Key sections are marked in red boxes.
Figure 10. The flow chart of the phase information processing branch.
Figure 11. Typical phase diagram of these nine types.
Figure 12. Visualization of the difference between high-frequency components in the same scene.
Figure 13. The flow chart of the multi-scale information processing branch.
Figure 14. Typical high-frequency information diagram of these nine types.
Figure 15. Typical training progress of MIMOFWTNN.
Figure 16. A typical confusion matrix of MIMOFWTNN.
Figure 17. Typical training progress of the eight networks in the comparison experiment.
Figure 18. Typical training progress of the remaining four networks in the ablation experiment.
Table 1. AMT Modulation Signal Equations.

Shift Frequency CMF (SF):
$s_{r\_sf}(t_r,t_a)=A_0\,\omega_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)\omega_a(t_a-t_{ac})\exp\!\left\{-j\frac{4\pi f_c R(t_r,t_a)}{c}\right\}\exp\!\left\{j\pi K_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)^{2}\right\}\exp\{j2\pi f_d t_r\}$

Range intermittent sampling Jamming CMF (RJ):
$g(t_r)=\mathrm{rect}\!\left(\frac{t_r}{\tau}\right)\otimes\sum_{n}\delta(t_r-nT_s)$,
$s_{r\_rj}(t_r,t_a)=A_0\,\omega_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)\omega_a(t_a-t_{ac})\exp\!\left\{-j\frac{4\pi f_c R(t_r,t_a)}{c}\right\}\exp\!\left\{j\pi K_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)^{2}\right\}g(t_r)$

Azimuth intermittent sampling Jamming CMF (AJ):
$g(t_a)=\mathrm{rect}\!\left(\frac{t_a}{\tau}\right)\otimes\sum_{n}\delta(t_a-nT_s)$,
$s_{r\_aj}(t_r,t_a)=A_0\,\omega_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)\omega_a(t_a-t_{ac})\exp\!\left\{-j\frac{4\pi f_c R(t_r,t_a)}{c}\right\}\exp\!\left\{j\pi K_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)^{2}\right\}g(t_a)$

Range-Azimuth intermittent sampling Jamming CMF (RAJ):
$s_{r\_raj}(t_r,t_a)=A_0\,\omega_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)\omega_a(t_a-t_{ac})\,g(t_r)\,g(t_a)\exp\!\left\{-j\frac{4\pi f_c R(t_r,t_a)}{c}\right\}\exp\!\left\{j\pi K_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)^{2}\right\}$

Micromotion frequency Jamming CMF (MJ):
$s_{r\_mj}(t_r,t_a)=A_0\,\omega_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)\omega_a(t_a-t_{ac})\exp\!\left\{-j\frac{4\pi f_c R(t_r,t_a)}{c}\right\}\exp\!\left\{j\pi K_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)^{2}\right\}\exp\{jA_m\sin(2\pi f_m t_a+\varphi_m)\}$

Micromotion frequency and Range intermittent sampling Jamming CMF (MRJ):
$s_{r\_mrj}(t_r,t_a)=A_0\,\omega_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)\omega_a(t_a-t_{ac})\,g(t_r)\exp\!\left\{-j\frac{4\pi f_c R(t_r,t_a)}{c}\right\}\exp\!\left\{j\pi K_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)^{2}\right\}\exp\{jA_m\sin(2\pi f_m t_a+\varphi_m)\}$

Range intermittent sampling Jamming and Shift Frequency CMF (RJSF):
$s_{r\_rjsf}(t_r,t_a)=A_0\,\omega_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)\omega_a(t_a-t_{ac})\,g(t_r)\exp\!\left\{-j\frac{4\pi f_c R(t_r,t_a)}{c}\right\}\exp\!\left\{j\pi K_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)^{2}\right\}\exp\{j2\pi f_a(t_a)\,t_a\}$

Azimuth intermittent sampling Jamming and Shift Frequency CMF (AJSF):
$f_{sh}=m\,\Delta f_{sh}=f_p\,\Delta f_{sh}\,t_a-\Delta f_{sh}(t_0 f_p-m_1)$,
$s_{r\_ajsf}(t_r,t_a)=A_0\,\omega_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)\omega_a(t_a-t_{ac})\,g(t_a)\exp\!\left\{-j\frac{4\pi f_c R(t_r,t_a)}{c}\right\}\exp\!\left\{j\pi K_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)^{2}\right\}\exp\{j2\pi f_{sh}t_a\}$

Micromotion frequency Jamming and Shift Frequency CMF (MJSF):
$s_{r\_mjsf}(t_r,t_a)=A_0\,\omega_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)\omega_a(t_a-t_{ac})\exp\{jA_m\sin(2\pi f_m t_a+\varphi_m)\}\exp\!\left\{-j\frac{4\pi f_c R(t_r,t_a)}{c}\right\}\exp\!\left\{j\pi K_r\!\left(t_r-\frac{2R(t_r,t_a)}{c}\right)^{2}\right\}\exp\{j2\pi f_{sh}t_a\}$
Table 2. Network information.
Feature | Value
Total Network Layers | 187 layers
Input Branches | 3 branches
Output Branches | 2 branches
Output Classes | 10 classes
Parameter Scale | 23.6 million
Number of Residual Blocks | 16 blocks
Table 3. Training environment and parameter information.
Environment | Value | Parameter | Value
CPU | Intel i9-13900K | Optimizer | SGDM
GPU | Nvidia RTX 4090 | Initial Learning Rate | 0.01
RAM | 64 GB @ 4000 MHz | Learning Rate Strategy | Piecewise Decline
OS Version | Windows 11 | Decline Strategy | 0.5 Every 2 Epochs
Environment Version | MATLAB R2023b | Max Epochs | 10
GPU Driver Version | 546.33 | Mini-Batch Size | 32
CUDA Version | 12.3 | Validation Frequency | Every 100 Mini-Batches
Training Option | GPU Only | Output Network | Best Validation Loss
Table 4. Training results of 10 times.
Test Number | Training Duration | Verification Accuracy
1 | 82 m 17 s | 98.68%
2 | 66 m 19 s | 86.67%
3 | 83 m 26 s | 99.37%
4 | 81 m 17 s | 99.42%
5 | 66 m 04 s | 98.20%
6 | 65 m 54 s | 99.42%
7 | 68 m 36 s | 99.15%
8 | 66 m 16 s | 99.15%
9 | 67 m 51 s | 99.37%
10 | 65 m 54 s | 99.21%
Table 5. Comparison experiment result.
Network | Average Validation Accuracy | Average Training Time | Test Set Accuracy | Average Storage File Size
MIMOFWTNN | 96.96% | 74.21 m | 99.0% | 85.8 MB
InceptionV3 | 17.84% | 38.04 m | 23.5% | 81.2 MB
ResNet18 | 38.46% | 10.11 m | 57.4% | 40.2 MB
ResNet50 | 95.46% | 32.70 m | 88.4% | 85.5 MB
VGG16 | 82.79% | 369.39 m | 71.9% | 1.94 GB
DenseNet201 | 26.59% | 242.57 m | 60.9% | 73.6 MB
Xception | 16.09% | 68.34 m | 82.8% | 76.5 MB
DarkNet19 | 9.61% | 28.22 m | 13.0% | 70.6 MB
SqueezeNet | 10.48% | 11.38 m | 10.0% | 3.34 MB
Table 6. Ablation experiment result.
Network | Average Validation Accuracy | Average Training Time | Test Set Accuracy | Average Storage File Size
MIMOFWTNN | 96.96% | 74.21 m | 99.0% | 85.8 MB
Only Amplitude | 99.05% | 31.05 m | 83.7% | 85.5 MB
Only Phase | 11.69% | 31.64 m | 11.1% | 85.5 MB
Dual Addition | 74.21% | 52.64 m | 20.1% | 85.6 MB
Dual Depth | 92.50% | 64.70 m | 39.1% | 85.6 MB
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
