LiOSR-SAR: Lightweight Open-Set Recognizer for SAR Imageries

Yang, Jie; Gu, Jihong; Xin, Jingyu; Cong, Zhou; Ding, Dazhi

doi:10.3390/rs16193741

Open AccessArticle

LiOSR-SAR: Lightweight Open-Set Recognizer for SAR Imageries

by

Jie Yang

¹,

Jihong Gu

²

,

Jingyu Xin

¹,

Zhou Cong

² and

Dazhi Ding

^2,*

¹

School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

²

School of Microelectronics, Nanjing University of Science and Technology, Nanjing 210014, China

^*

Author to whom correspondence should be addressed.

Remote Sens. 2024, 16(19), 3741; https://doi.org/10.3390/rs16193741

Submission received: 15 July 2024 / Revised: 29 September 2024 / Accepted: 7 October 2024 / Published: 9 October 2024

(This article belongs to the Special Issue Advances in Remote Sensing Video Data Processing: Theories, Technologies and Applications)

Download

Browse Figures

Versions Notes

Abstract

:

Open-set recognition (OSR) from synthetic aperture radar (SAR) imageries plays a crucial role in maritime and terrestrial monitoring. Nevertheless, numerous deep learning-based SAR classifiers struggle with unknown targets outside of the training dataset, leading to a dilemma, namely that a large model is difficult to deploy, while a smaller one sacrifices accuracy. To address this challenge, the novel “LiOSR-SAR” lightweight recognizer is proposed for OSR in SAR imageries. It incorporates the compact attribute focusing and open-prediction modules, which collectively optimize its lightweight structure and high accuracy. To validate LiOSR-SAR, “fast image simulation using bidirectional shooting and bouncing ray (FIS-BSBR)” is exploited to construct the corresponding dataset. It enhances the details of targets for more accurate recognition significantly. Extensive experiments show that LiOSR-SAR achieves remarkable recognition accuracies of 97.9% and 94.1% while maintaining a compact model size of 7.5 MB, demonstrating its practicality and efficiency.

Keywords:

open-set recognition; synthetic aperture radar; deep learning; lightweight network; fast image simulation

1. Introduction

Synthetic aperture radar (SAR) operates independently from weather conditions and times, offering distinct advantages in both maritime and terrestrial reconnaissance [1,2]. The capabilities of robust detection and classification are especially valuable in environmental monitoring, where identifying uncharacterized targets such as ships and tanks is critical [3,4]. Open-set recognition (OSR), which addresses the challenges of classifying undefined target categories in SAR imageries, has recently gained significant attention [5,6,7]. Traditional SAR identification methods, such as the constant false-alarm rate (CFAR), often rely on manual settings [8]. However, the introduction of deep learning (DL) has revolutionized automatic target recognition (ATR), with DL-based target recognition methods outperforming traditional techniques in both efficiency and accuracy [9,10,11]. As DL-based methods gain prominence in ATR, new factors must be considered when optimizing these models.

The first critical task is achieving a balance between model size and accuracy, which is essential for developing efficient and effective models. Researchers are continually striving to balance model size and accuracy in DL-based ATR systems. However, it is worth noting that the utilization of large models often results in decreased recognition efficiency [12,13]. Consequently, scholars have been focusing on optimizing DL-based methods to address these existing limitations [14,15,16]. Yu Pan proposes a lightweight classifier based on a dense convolutional network (CNN), achieving improved recognition performance in satellite-borne SAR images [17]. The convolutional block attention module (CBAM) was integrated into a CNN to solve the SAR target recognition task, considering that it is a lightweight attention module [18,19]. The multiple kernel sizes block, channel shuffle, and split modules were utilized in CNNs, defining a lightweight feature extraction block (LSCB) to tackle challenges in SAR recognition [20]. A recent study introduced HDLM, a lightweight method for improving SAR ATR with limited data [21]. A YOLOv5-based lightweight method for SAR vehicle detection and recognition has been introduced, with a focus on parameter reduction and efficiency enhancement [22]. Meanwhile, a lightweight detector specifically for ship detection has been designed to address the problem of large network sizes [23].

In the field of DL-based ATR, there is an ongoing issue that requires continuous attention, which is the adaptability to open datasets, particularly in intricate environments and with unknown targets, which restricts their applicability in diverse scenarios. This problem underscores the necessity for novel optimization strategies to improve the universality and effectiveness of the model. Previous methods, assuming that known samples contain all target categories, were referred to as closed-set recognition (CSR) problems, which fail to address unknown targets, i.e., the OSR problem. The open-world recognition problem was first formally defined, and solutions have since been proposed to address it [24,25]. Recent work has applied OSR algorithms, including the one-vs.-set machine, W-SVM, and POS-SVM, to the MSTAR SAR dataset, demonstrating their effectiveness in classifying known targets and rejecting unknown ones in cluttered environments [26]. A new module was developed to handle unknown targets and was tested on optical images [27]. Matthew Scherreik limited the risk of unknown targets being labeled as known at the output of the support vector machine, employing threshold restrictions for reject-option recognition on infrared images [28]. Intermediate features like activation functions were also considered to be integrated into the recognition layer, enhancing computational efficiency [29]. The incremental learning joint OpenMax module was also given to maintain a state of continuous learning [30].

Additionally, the effectiveness of DL methods heavily relies on high-quality datasets, essential for training models and exploring new techniques. Thus, efficient data collection and the use of simulated data are becoming key research priorities. Addressing the scarcity of real measurement data, scholars have made many efforts to obtain simulated SAR/ISAR images using techniques like the range Doppler algorithm, chirp scaling algorithm, and back projection algorithm [31,32,33]. However, these methods are time-consuming, which slows down image generation and hampers efficient dataset production. An image-domain ray tube integration formula based on the shooting and bouncing ray (SBR) technique was derived by R. Bhalla et al. to address this issue [34]. This method, founded on the concept of equivalence between monostatic and bistatic stations [35], links the contribution of each ray tube to the overall ISAR image under small-angle approximations. Another research study utilizes precomputed scattering center representations derived from ISAR to simulate radar channels efficiently, enabling the visualization of vehicle scattering centers while reducing computational costs [36]. An ISAR fast imaging method using the SBR technique at arbitrary angles was employed to enhance imaging efficiency [37]. With these advancements, Yun et al. have refined the convolution (conv) scheme of imaging, reducing distortion caused by interpolation errors [38].

This paper addresses key challenges in SAR imageries, focusing on model efficiency, the adaptability to open sets, and the need for high-quality datasets. To this end, we propose a novel DL architecture, a lightweight recognizer for open sets in SAR imageries (LiOSR-SAR). It benefits from two modules, the compact attribute focusing (CAF) and open-prediction (OP) modules, which together improve accuracy and compactness for open datasets. Additionally, we propose an efficient dataset construction method, the fast image simulation using bidirectional shooting and bouncing rays (FIS-BSBRs), which includes a dynamic learning strategy to optimize training efficiency. The main contributions of this paper are as follows:

(1): The LiOSR-SAR is introduced to enhance OSR, demonstrating substantial accuracy improvements on both simulated and MSTAR datasets.
(2): By integrating the compact attribute focusing (CAF) module, the LiOSR-SAR achieves a balance between the compactness and the performance of the model. It maintains a size below 7.5 MB while achieving an accuracy of up to 97.9% on simulated datasets and 94.1% on the MSTAR dataset, confirming its effectiveness in resource-limited environments.
(3): An open-prediction (OP) module is featured within the LiOSR-SAR framework, enhancing the recognition of open-category targets. It has improved accuracies from 33.3% to 86.9% and 97.9% on simulated datasets, and from 55.7% to 94.1% on the MSTAR dataset, demonstrating robust performance in complex recognition tasks.
(4): A novel bidirectional ray tube integration approach in the image domain, the FIS-BSBR method, is employed to construct an effective dataset for validating the performance of LiOSR-SAR.

2. LiOSR-SAR

2.1. Overview

The LiOSR-SAR framework, as depicted in Figure 1, is designed to address the challenges in target recognition within the open datasets. Inspired by ResNet-18 (shown in Figure 1a) [35], the system features a core CAF module. Modified residual blocks (MRBs) improve accuracy and reduce model size by effectively capturing scattering information from SAR imageries. The architecture is further refined with the OP module, which is crucial for producing decision outputs and improving the identification of targets from unknown categories. Additionally, convergence performance is augmented by dynamically adjusting the learning rate using a cosine annealing algorithm.

2.2. CAF Module

To effectively enhance multi-scale feature processing capabilities while simultaneously reducing computational costs, the CAF module is introduced (marked as a yellow cube in Figure 1b). It begins with initial feature extraction via a 7 × 7 c-conv, and further explores associations in scattering information within SAR images, significantly reducing the model size. The structure of the CAF module, as depicted in Figure 2, includes three layers of MRBs and one CBAM block.

SAR imageries are characterized by complex scattering centers and prominent speckle noise, which pose unique challenges for feature extraction and noise suppression. To effectively address these issues, the model proposed in this paper is specifically optimized for these characteristics of SAR imageries. First, depth-wise separable convolutions (DS-conv) are introduced to reduce computational cost while enhancing the ability to capture spatial and channel-specific features in SAR imageries [39]. By separating spatial information from scattering features, DS-conv significantly enhances feature extraction efficiency while maintaining low computational demands. To tackle the speckle noise commonly present in SAR images, batch normalization is employed to stabilize feature distribution and ensure robustness during the training process. Simultaneously, ReLU activation functions are utilized to suppress irrelevant signals, thereby further enhancing the extraction of valid features. To enhance the focus on key targets, the CBAM block applies both channel attention and spatial attention mechanisms. This approach reduces the interference of background noise during feature extraction, improving overall accuracy [19,40]. Through this series of design optimizations, the model maintains computational efficiency while improving its accuracy in extracting critical features from SAR images. The learned feature maps from different scales are interconnected via residual links, which are structured as follows:

Θ = \{θ_{M R B_{1}}, θ_{M R B_{2}}, θ_{M R B_{3}}, θ_{M R B_{4}}\},

(1)

o u t p u t = M_{S} [m_{i n}, Θ] \circ \{M_{C} [m_{i n}, Θ] \circ m_{4}\} .

(2)

2.3. OP Module

In OSR tasks, a fundamental challenge is effectively recognizing unknown or novel categories within the dataset. To address this, we introduce an advanced module, highlighted as a green cube in Figure 1b, designed to enhance the capability to manage such special cases.

The core of the OP module is to construct a feature space that identifies unknown categories by deeply learning the characteristics of known ones. The block first determines if a test sample belongs to a known category, then estimates the likelihood of it belonging to an unknown category. It reinterprets and expands the traditional classifier outputs, enabling the architecture to make reliable predictions about targets outside the established categories in the training dataset. Thus, the OP module enhances adaptability and accuracy in complex environments, as detailed in Figure 3.

Step 1: Establish the feature space.

In addressing the multi-category classification problem, a DL model labeled CLA is employed to classify the input image

x_{i}

, as shown in Figure 3a. It is trained to recognize

K

distinct categories using a training set

ℂ_{k}, k = 1, 2, \dots, K

. The original output of the CLA, denoted as

z_{k}

, represents the “score” reflecting its prediction that the input sample

x_{i}

belongs to the category

k

. Each score

z_{k}

is converted into a probability

f_{k} (x_{i})

using the SoftMax function, described in (3), with each probability lying within

[0, 1]

. This probability reflects the confidence of the model in associating the sample

x

with a specific category, with the highest value indicating the most likely category [29].

f_{k} (x_{i}) = \frac{e^{z_{k}}}{\sum_{j = 1}^{K} e_{j}^{z}}, j = 1, 2, \dots, K .

(3)

Step 2: Measure similarity scores in feature space.

The mean vector

μ_{k}

for each category

ℂ_{k}

is computed from the average of the feature vectors, improving the accuracy and robustness of the classifier CLA.

μ_{k} = \frac{1}{N_{k}} \sum_{x \in ℂ_{k}} {(x_{i} - \frac{1}{N_{k}} \sum_{x \in ℂ_{k}} Z (x_{i}))}^{2}, k = 1, 2, \dots, K,

(4)

where

Z (x_{i}) = [z_{1} (x_{i}), z_{2} (x_{i}), \dots, z_{K} (x_{i})]

is the original output of the classifier CLA and

N_{k}

is the number of samples in the category

k

.

During testing, for a new sample

x_{i}

, its similarity score

d (x_{i}, μ_{k})

to the mean vector

μ_{k}

of each category is calculated, as described in (5). It integrates initial probabilities with similarity to the center of the category, offering a more precise basis for classification. As shown in Figure 3b, these similarity values are transformed into normalized probability adjustment factors to produce the normalized similarity vector

D (x_{i}) = [D_{1} (x_{i}), D_{2} (x_{i}), \dots, D_{k} (x_{i}), \dots, D_{K} (x_{i})]

. Here

D_{k} (x)

corresponds to the category

k

.

d (z_{k}, μ_{k}) = {‖z_{k} - μ_{k}‖}_{2} .

(5)

Step 3: Make category determinations.

The transition from activation vector

A (x_{i}) = [A_{1} (x_{i}), A_{2} (x_{i}), \dots, A_{k} (x_{i}), \dots, A_{K} (x_{i})]

to the final open-set identification is illustrated in Figure 3c. Here, activation vector

A (x_{i})

and the final classification scores

Ω (x)

are calculated using (6) and (7), respectively. Equation (6) implies that if the sample

x

is far away from the center of category

k

, its activation value

A_{k} (x_{i})

decreases accordingly. It combines the initial classification probabilities and similarity data, providing a comprehensive solution to the multi-category classification task. If all category scores fall below a predefined threshold or if the highest score remains low, the model demonstrates low confidence in the classification. In such instances, the sample is classified as “unknown”, an approach that is particularly useful in open-set classification scenarios.

A_{k} (x_{i}) = f_{k} (x_{i}) \times (1 - D_{k} (x_{i})),

(6)

Ω (x_{i}) = SoftMax (ReLU (A (x_{i}))) .

(7)

3. Datasets

Both simulated and measured datasets play crucial roles in engineering applications; the former enhances the performance of models by improving feature recognition capabilities, while the latter ensures its accuracy and reliability. This work primarily focuses on facial targets of SAR imageries, specifically military vehicles and vessels measuring 7 to 10 m in length. Due to the relatively indistinct characteristics of small and linear targets, they are not the focus of this study. In this section, three datasets are introduced, comprising two simulated and one measured dataset. The first simulation dataset is constructed using the fast image simulation based on shooting and bouncing rays (FIS-SBRs), which captures basic scattering characteristics but is limited in detail. The second is derived from the fast image simulation based on bidirectional shooting and bouncing rays (FIS-BSBRs), which enhances accuracy by capturing both forward and backward scattering paths, allowing for a more detailed feature extraction. These two methods are used to evaluate the adaptability and accuracy of LiOSR-SAR under different scattering conditions. Additionally, the characteristics and usage of the MSTAR dataset are discussed to provide a more comprehensive understanding of the application and performance of the proposed model. By testing on both simulated and measured datasets, the robustness of LiOSR-SAR is ensured across varying levels of complexity in scattering environments. The combination of diverse datasets enables a thorough evaluation of its generalization and effectiveness.

3.1. Construction of Simulated Datasets

Traditional methods generate two-dimensional (2D) inverse synthetic aperture radar (ISAR) images by performing inverse Fourier transforming on computed scattering fields over the frequency–aspect domain. Scattering field calculations using the SBR technique require ray tracing, making the generation of 2D ISAR images time-consuming.

3.1.1. Image Formation

A critical formula is derived, incorporating the ray tracing process into the ISAR image formation formula [41]. This image-domain ray tube integration formula, developed under small-angle imaging conditions, adheres to the equivalence principle of a radar cross-section (RCS) in both monostatic and bistatic configurations. Given the small-angle imaging condition, second-order terms concerning angle

θ

can be neglected, as expressed in [18].

O_{f} (x, z) = \sum_{i = 1}^{M_{1}} \{α_{i} e^{- j k_{0} (2 z - z_{i} + d_{i})} \sin c [Δ k (z - \frac{z_{i} - d_{i}}{2})]\} \cdot \{k_{0} \sin c [k_{0} θ_{0} (x - x_{i})]\} .

(8)

From (8), the sum of the contributions from all rays to the 2D image is determined, a process referred to as fast imaging based on forward ray tracing. In (8), the imaging plane is considered as the

x o z

plane, with incident plane waves propagating in the

- z

direction, and the observation point is in the

φ = 0 °

plane, within a small angle range around

θ = 0 °

. Here,

x_{i}

and

z_{i}

represent the respective components where the ray exits the target,

d_{i}

is the total path traveled by the ray, and

M_{1}

is the number of ray tubes in the forward ray tracing process.

Δ k

is the bandwidth,

k_{0}

is the central wave number, and

[- θ_{0}, θ_{0}]

is the observation angle range under the bistatic configuration. The response amplitude of each ray tube

α_{i}

in the image domain is represented as follows:

α_{i} = - \frac{2}{π^{2}} Δ k θ_{0} B_{θ, φ} {(Δ A)}_{e x i t},

(9)

where

{(Δ A)}_{e x i t}

represents the ray-integrating surface element on the target and

B_{θ, φ}

represents the geometrical optics (GO) field at the exit of each ray tube.

The computation time required by (8) primarily depends on the number of rays and the discrete points in the imaging scene. Let

M

be the number of rays,

X

the number of grids in the range direction, and

Z

the number in the azimuth direction, with the total computational load for the entire imaging process given by

M \times X \times Z

. When calculating electrically large targets in rough backgrounds, the increase in the number of ray tubes and discrete points on the imaging window leads to a high time cost. To improve efficiency, the decaying nature of function

\sin c

can be exploited for truncation and acceleration of the computation.

The accuracy of solving the scattering field directly impacts the quality of radar images. As illustrated in Figure 4,

B

,

C

, and

E

represent three different facets on the target. SBR [42] techniques follow the path

A \to B \to C \to D

(forward ray tracing) from the transmitter to the receiver. It may miss certain scattering contributions, especially in complex targets where some regions might not be directly illuminated.

In our proposed bidirectional shooting and bouncing rays (BSBRs) [43], the scattered path

C \to E \to F

(backward ray tracing) is also considered in addition to the path from the transmitter to the receiver (i.e.,

A \to B \to C \to D

). Here, point

C

acts not only as a reflecting point but also as a scattered source. The inductive electric current generated at point

C

on the facet scatters electromagnetic waves along path

C \to E \to F

. Since paths

C D

and

E F

are parallel, additional scattered contributions are captured at the same receiver. Capturing scattered energy from complex surfaces more thoroughly provides additional illumination in areas not directly illuminated, enhancing both the accuracy and completeness of the solution.

The incident electric field in Figure 4 is denoted as

E_{A B}

. During forward ray tracing, the electric field leaving the target is obtained from the incident and reflected electric fields as

E_{f} = E_{B C} + E_{C D}

. The inductive current induced by the incident wave on the target, represented by

J_{f} = n \times H_{f}

, acts as a new source of scattered electric field

E_{C E}

. The interaction between facets, donated by

C \to E

in Figure 4, involves the re-radiation of the inductive current. According to Huygens’ principle,

E_{C E}

is expressed using the dyadic Green function

G (r, r^{'})

in free space as follows:

E_{C E} (r) = \int_{s} - j k η G (r, r^{'}) \cdot J_{f} (r^{'}) d s^{'} .

(10)

During the backward ray tracing

C \to E \to F

, the incident electric field is represented as

E_{C E}

. The ray intersects at point

E

with its plane and undergoes reflection. The reflected field of the ray tube during backward ray tracing is given by

E_{E F} = {(D F)}_{E} {(R)}_{E} E_{C E} e^{- j k \cdot d_{i}},

(11)

where

{(D F)}_{E}

is the scattering factor on the reflecting facet and

{(R)}_{E}

is the reflection coefficient. The induced electric current during the backward tracing is

J_{b} = n \times H_{b} = n \times (H_{C E} + H_{E F}) = n \times (\frac{1}{Z_{0}} k_{i} \times E_{C E} + \frac{1}{Z_{0}} k_{i} \times E_{E F}) .

(12)

The backward scattering field formula is

E_{b} (r) = j k η \frac{e^{- j k r}}{4 π r} \iint_{s} r \times r \times J_{b} (r^{'}) e^{j k r \cdot r^{'}} d s^{'} .

(13)

Ray tracing is employed to establish

M_{1}

forward paths and

M_{2}

backward paths. The total electric field resulting from this BSBR process is computed as

E_{S} = \sum_{i = 1}^{M_{1}} E_{f} + \sum_{i = 1}^{M_{2}} E_{b} .

(14)

It is assumed that electromagnetic waves radiated by the scattered field behave as plane waves. During the current iterations, facets that have already produced effects are recorded. Marked facets are excluded from further interactions to avoid redundant calculations.

The contribution of each ray to the 2D image in forward ray tracing is defined as

O_{f} (x, z)

and in backward ray tracing as

O_{b} (x, z)

. The representation of the 2D image based on BSBRs is given by

O (x, z) = \sum_{r a y_{1} = 1}^{M_{1}} O_{f} (x, z) + \sum_{r a y_{2} = 1}^{M_{2}} O_{b} (x, z) = \sum_{r a y}^{M_{1} + M_{2}} \{α_{i} e^{- j k_{0} (2 z + d_{i} - z_{i})} \sin c [Δ k (z + \frac{d_{i} - z_{i}}{2})]\} \cdot \{k_{0} \sin c [k_{0} θ_{0} (x - x_{i})]\} .

(15)

It should be noted that although (15) appears formally similar to (8), the total number of ray tubes in (15) and (8) are

M_{1} + M_{2}

and

M_{1}

, respectively. It implies that a more comprehensive illumination of regions is achieved with BSBRs, enhancing the accuracy and detail of the resulting 2D images.

3.1.2. Validation of Simulated Datasets

A fast image simulation is conducted, followed by a series of experiments on both simulated datasets to validate the simulated images and demonstrate the superiority of the proposed LiOSR-SAR. The experiments were performed on a personal computer (PC) equipped with an Intel i5-10400 CPU, 32 GB of RAM, and a GTX 1660 SUPER graphics card. The development of the architecture was carried out using Pytorch 1.7.0 [44], with GPU acceleration applied exclusively for the training of LiOSR-SAR.

A simulated dataset was constructed using the FIS-SBR method. Another dataset employed the proposed FIM-BSBR technique. For clarity, these two datasets are referred to as the FIS-SBR and FIM-BSBR. Both datasets utilize consistent radar parameters, as detailed in Table 1. The imaging scene size was set to 25.5 m by 25.5 m. This scene size is appropriate for the typical dimensions of the vessels used in the experiments, particularly small boats under 10 m in length. It ensures the capture of detailed structural features and the distribution of scattering centers, while also aligning with the required range and azimuth resolution.

The simulated dataset included three target categories, ship 1, ship 2, and ship 3. Each category has distinct dimensions. Ship 1 measures 7.9 m in length, 1 m in width, and 2.2 m in height. Ship 2 has a length of 9.2 m, a width of 1.2 m, and a height of 2.16 m. Ship 3 measures 7.5 m in length, 0.9 m in width, and 2.2 m in height. The dataset contains targets of varying sizes and complexities. These differences reflect the variety in structural features and scattering characteristics [45]. The size and complexity of the targets directly influence scattering center resolution and classification accuracy. To verify the correctness as well as the efficiency of the dataset, a comparison is presented among three image simulation methods, namely the range Doppler algorithm (RDA) [46], the FIS-SBR method, and the FIS-BSBR technique.

The comparative imaging results for ship 1 using three different methods—the RDA, FIS-SBR, and the proposed FIS-BSBR—are shown in Figure 5. To simulate SAR imageries with different polarizations, separate simulations are performed for each polarization mode (HH, HV, VH, and VV). In each mode, the scattering characteristics are calculated based on the specific polarization of the transmitted and received signals. For example, the HH mode uses horizontal polarization for both transmission and reception, while the HV mode uses horizontal transmission and vertical reception. All methods accurately pinpoint the positions of scattering centers, as demonstrated by the discernible basic outline of the ship and the distribution of scattering centers in the ISAR images. It indicates that our method aligns well with the comparative methods in terms of numerical results. However, traditional imaging methods struggle with complex geometric shapes and diverse materials, especially in analyzing coupling effects between scattering centers in intricate structure of the ship. These methods are also hindered by high computational demands and reduced efficiency.

While the FIS-SBR efficiently generates radar images for large targets, it overlooks scattering contributions in complex targets, such as those with cavities, and fails to reveal certain coupling effects. In contrast, as shown in Figure 5c, the FIS-BSBR significantly enhances the detail representation and the description of coupling among scattering centers compared to Figure 5a,b. This detailed visualization of interactions between scattering centers is crucial for subsequent open-set target recognition and characteristic analysis.

Quantitative results in Table 2 confirm that the FIS-BSBR takes more time for image formation compared to the FIS-SBR (64 s vs. 12 s). This is due to the incorporation of both forward and backward ray tracing, which enhances image quality by capturing more detailed scattering information. Unlike the RDA, the FIS-BSBR employs convolution based on the

\sin c

function, which significantly improves data processing efficiency. This approach leverages the convolution theorem and fast Fourier transform (FFT) to simplify the imaging process by conducting calculations in the frequency domain. As a result, the time spent on multiple scans in the range and azimuth directions is greatly reduced. The data formation time for the FIS-BSBR is considerably shortened, and most of the processing time is concentrated in the convolution step of the imaging process.

Table 3 and Table 4 detail the distribution of samples per category in the FIS-SBR and FIS-BSBR datasets. Ship 1 and ship 2 are categorized for training as known categories, while ship 3 is designated as an unknown category for testing.

3.2. MSTAR Dataset

The moving and stationary target acquisition and recognition (MSTAR) dataset, developed by Sandia National Laboratories [13], is a cornerstone in the development and evaluation of ATR based on SAR. MSTAR employs a high-resolution spotlight-mode SAR sensor with a resolution of 0.3 m × 0.3 m. It includes a diverse array of vehicle targets captured from multiple azimuth angles, making it a benchmark dataset in SAR-based target recognition.

Specifically designed for terrestrial target classification, the MSTAR dataset [43,44] supports our work across ten categories, namely BRDM-2, BTR-60, BTR-70, T-72, ZSU23-4, ZIL-131, D7, T-62, BMP-2, T-62, and 2S1 categories are used only as unknown classes during testing and are excluded from the training phase. It is divided based on pitch angles, with the training set featuring seven targets at pitch angles ranging from 14° to 16°, and the test dataset comprising ten target types at a pitch angle of 17°. The detailed composition of the MSTAR dataset used in the following experiments is listed in Table 5, and a sample from each category is shown in Figure 6.

4. Experiment

To optimize data loading time, the size of SAR images was reduced to 64 × 64 pixels for input processing. For the simulation dataset, the output layer of LiOSR-SAR was configured with three neurons, which correspond to two known categories and one unknown category. For the MSTAR dataset (provided by Sandia National Laboratories, Albuquerque, NM, USA), it was set up with four neurons, covering three known categories and one unknown category.

The LiOSR-SAR was trained for 50 epochs, with a training batch size of 64. To mitigate the risk of converging to local optima, a cosine annealing scheduler was utilized to adjust the learning rate dynamically. The relationship between the adaptive learning rate and the training process is defined in (16), where

η_{t}

is the current learning rate,

η_{m a x} = 0.005

and

η_{\min} = 0

are the maximum and minimum learning rates, respectively,

T_{cur}

represents the current epoch,

T = 50

denotes the number of epochs in a cycle, and

T_{warmup} = 10

is the number of epochs used for warm up [48]. The effects of these learning rate adjustments on LiOSR-SAR are discussed in the following section.

\{\begin{matrix} η_{t} = η_{m i n} + \frac{T_{cur}}{T_{warm up}} \cdot (η_{m a x} - η_{m i n}), T_{cur} \leq T_{warmup} \\ η_{t} = η_{m i n} + \frac{1}{2} (η_{m a x} - η_{m i n}) \cdot [1 + \cos (\frac{T_{cur} - T_{warmup}}{T} π)], T_{cur} > T_{warmup} \end{matrix} .

(16)

The accuracy assessment for LiOSR-SAR utilizes a cross-entropy loss function [49], defined as

l o s s = - \frac{1}{N_{z}} \sum_{i = 1}^{N_{z}} z_{i}^{'} \log (z_{i}),

(17)

where

z_{i}

is the classification result of sample

i

,

z_{i}^{'}

is the ground truth label of sample

i

, and

N_{z}

indexes the number of all training samples.

Accuracy is quantified by [49]

A c c u r a c y = \frac{@ T P + @ T N}{@ T P + @ T N + @ F P + @ F N},

(18)

where @ denotes the number. True positive (TP), false positive (FP), false negative (FN), and true negative (TN) represent true positives, false positives, false negatives, and true negatives, respectively. The terms “positive” and “negative” indicate whether a sample is classified as belonging to a specific category or not. In the absence of the OP module, thresholds were adjusted based on the dataset to improve the recognition of unknown targets.

The efficiency of LiOSR-SAR was assessed by comparing the model size and the time spent on training.

4.1. Classification Performance on Diverse Datasets

4.1.1. Accuracy Analysis

Table 6 shows the target recognition accuracy of LiOSR-SAR across the FIS-SBR, FIS-BSBR, and MSTAR datasets. The simulated datasets (FIS-SBR, FIS-BSBR) contain ISAR images in four polarizations, namely HH, HV, VH, and VV, whereas the measured dataset (MSTAR) operates exclusively in the HH polarization. Recognition accuracy on the MSTAR dataset reaches 94.1%, which is higher than the 86.9% achieved with the FIS-SBR, but slightly lower than the 97.9% with the FIS-BSBR. The superior accuracy of the FIS-BSBR is due to its ability to capture detailed scattering features in ISAR images, thereby improving target recognition performance.

4.1.2. Confusion Matrix Analysis

In Table 7, Table 8 and Table 9, the “Unknown” category refers to targets not seen during training. In the FIS-SBR and FIS-BSBR datasets, ship 3 is labeled as “Unknown” to assess the ability of LiOSR-SAR to identify unseen targets. In the MSTAR dataset, “Unknown” refers to targets different from the trained vehicle classes. This setup tests the capability of LiOSR-SAR to handle unknown targets in real-world scenarios.

Table 7, Table 8 and Table 9 display the confusion matrices for the three datasets. In Table 7, the performance on the FIS-SBR dataset for the unknown category is significantly lower than for the known categories, with only a 34.3% accuracy for unknown targets. This indicates limitations in the ability of the FIS-SBR to capture sufficient scattering detail for unknown target recognition. However, in Table 8, the FIS-BSBR dataset shows a marked improvement, with the accuracy for the unknown category increasing to 94.0%. This result highlights the advantage of the FIS-BSBR method in capturing more detailed and comprehensive scattering information, particularly in challenging unknown category recognition tasks. The higher accuracy demonstrates the effectiveness of using bidirectional ray tracing in improving feature extraction and classification performance.

Furthermore, Table 9 presents the confusion matrix for the real-world MSTAR dataset. The classification accuracy on this dataset remains consistently high across various vehicle types, with most diagonal elements close to or exceeding 90%, reflecting correct classifications. For example, targets like BRDM-2 and D7 achieve recognition rates of 97.8% and 100%, respectively, showcasing the robustness of the model in identifying known targets. Even for the unknown category, the model maintains a solid performance, with a classification accuracy of 93.4%.

These results confirm that LiOSR-SAR performs well not only on simulated datasets but also on measured datasets, demonstrating a strong generalization ability and robustness in real-world applications.

4.1.3. Model Size and Time

Table 10 presents the training duration, memory usage, and model size for the LiOSR-SAR across the three datasets. The model size consistently remains at 7.5 MB, showcasing the lightweight nature of LiOSR-SAR. Additionally, its minimal memory and training time requirements make it highly advantageous for engineering deployment. This compact design not only allows the model to be efficiently deployed in environments with limited computational resources but also ensures faster model updates and lower maintenance costs. The ability of LiOSR-SAR to balance performance and resource efficiency underlines its potential for real-time applications, where quick response times are critical.

4.2. Comparison with Other DL-Based Methods

In the experimental section, separate discussions of closed-set recognition (CSR) and open-set recognition (OSR) are conducted to assess the adaptability and efficacy of LiOSR-SAR under different conditions. Closed-set testing evaluates recognition performance on known categories, while open-set testing measures accuracy in identifying new, unseen categories, which is crucial for real-world applications.

4.2.1. Comparison with CSR Methods

Figure 7 depicts the performance of LiOSR-SAR against other DL methods such as ResNet-50 [50], ResNet-101 [51], EfficientNet-B3 [52], DenseNet-121 [53], MobileNetV3 [54], and RegNet-32 [55] across three datasets, the FIS-SBR, FIS-BSBR, and MSTAR. LiOSR-SAR demonstrates superior accuracy and lower training loss, particularly excelling with the FIS-BSBR dataset due to its rich detail. Despite the complexities of the MSTAR dataset, LiOSR-SAR maintains robust performance, showcasing its potential for effective application in radar image classification.

Table 11 and Table 12 highlight the advantages of LiOSR-SAR in terms of training time and model size, showing significant efficiency gains compared to other models. Specifically, it maintains high accuracy with substantially reduced model complexity and faster processing times.

Our LiOSR-SAR model demonstrates significant advantages in model size and computational efficiency. As evidenced in Table 11, our model significantly reduces complexity, with a size of just 7.5 MB, and outperforms other methods in processing speed, completing tasks in only 851.0 s. Despite its smaller size and faster processing, LiOSR-SAR maintains a 100% accuracy on the FIS-SBR and FIS-BSBR datasets and achieves a competitive 94.2% accuracy on the MSTAR dataset, as detailed in Table 12.

4.2.2. Comparison with OSR Methods

Currently, few models are specifically designed for OSR in SAR imageries. Both classic DL-based models and OSR methods from the literature have been benchmarked, providing a comprehensive evaluation. In Table 13, accuracy comparisons of complete OSR methods, directly referenced from the literature, are presented. Table 14 highlights combinations of feature extraction and classification strategies from different sources, allowing for a more granular comparison of their individual contributions to OSR performance in SAR imageries.

Our method achieved superior performance across all datasets in Table 13, with a 97.9% accuracy on the FIS-BSBR and 94.1% on MSTAR, outperforming the OSMIL [25] and extended OpenMax [29] methods. This suggests that our approach not only handles the complexities of SAR images effectively but also shows strong generalization capabilities in both simulated and real-world data.

Table 14 provides a comparison of model sizes between our proposed method and those from the literature. As shown, our model significantly reduces the size to 7.5 MB, making it much more lightweight compared to the literature models, which range from 36.7 MB to 44.2 MB.

Table 15 outlines the comparative effectiveness of different models, incorporating different classification layers such as SoftMax [27], OSmIL [25], and extended OpenMax [29]. It not only showcases the performance on closed sets but also demonstrates the effectiveness in managing unknown categories within open sets. LiOSR-SAR shows superior OSR accuracy, maintaining this lead even amidst the complexities of the MSTAR dataset. The consistency in model size across both open- and closed-set scenarios, due to the tests being differentiated only in the test phase, underscores the robustness and adaptability of our model.

4.3. Ablation Experiments

To validate the effectiveness of the integrated modules within our approach, we conducted three ablation experiments focusing on the compact attribute focusing (CAF) module and the open-prediction (OP) module. These experiments provide a detailed analysis of how each component affects the overall performance of our system.

4.3.1. On CAF Module

Table 16 presents a comparison of results with and without the CAF module. After incorporating the CAF module, recognition accuracy improved by 46.1% on the FIS-SBR, 18.7% on the FIS-BSBR, and 7.7% on MSTAR. Furthermore, as shown in Table 17, the inclusion of the CAF module leads to a substantial reduction in model size by 83.0% and a decrease in training time by 4.9%. The averages for model size and training times reported in Table 17 reflect data across the three datasets, indicating consistent performance enhancements due to the CAF module.

4.3.2. On OP Module

Table 18 details the improvements in recognition accuracy achieved by integrating the OP module compared to scenarios where it is absent. Table 19 presents specific data regarding model size and training time for both configurations. With the inclusion of the OP module, recognition accuracy increased by 53.6% for the FIS-SBR; 64.6% for the FIS-BSBR; and 38.4% for MSTAR. This demonstrates that the OP module is crucial for handling unknown classes, which is a core challenge in open-set recognition. Furthermore, despite the significant improvement in accuracy, the inclusion of the OP module does not increase model size or extend training time, as shown in Table 19. This balance between model efficiency and performance enhancement highlights the practical applicability of the OP module in real-world deployments, where computational resources may be limited.

4.3.3. Learning Rate Cycle Discussion

The proposed LiOSR-SAR employs a cosine annealing regulator to adjust the learning rate during the training process, as depicted in Figure 8. The model trains over 50 epochs, with the first five serving as a warm-up phase, where the learning rate linearly ascends to its peak. It is illustrated through three learning rate variation curves, namely a black dashed line for a 10-epoch cycle, a blue dotted line for a 25-epoch cycle, and a solid red line for a 45-epoch cycle.

Table 20 presents the accuracy performance across three datasets for cycles of 10, 25, and 45 epochs. The results demonstrate the crucial impact of the learning rate on model performance. For instance, on the FIS-SBR dataset, the accuracy peaks at a learning rate of 25 with 85.6%. It drops slightly at 45, indicating potential overfitting or instability. Conversely, for the FIS-BSBR dataset, a higher learning rate of 45 gives the best result, reaching 97.9%. This suggests that a more aggressive learning rate optimizes performance for this dataset. Similarly, on the MSTAR dataset, the model performs best at the highest learning rate, achieving 94.1%. These results show the importance of adjusting the learning rate according based on the characteristics of the dataset. This helps strike a balance between convergence speed and model stability.

To further assess the performance and learning dynamics, Figure 9 and Figure 10 plot the training loss and accuracy across epochs. During the initial 10 epochs, there is a rapid decrease in loss and a marked improvement in accuracy, signaling effective early learning. Examining different cosine annealing cycles (T = 10, 25, 45) reveals that a T = 45 cycle yields greater stability in accuracy during later training stages. Magnified views of the loss curves for the final 10 epochs, especially under logarithmic scaling, reveal smoother declines with a T = 45 cycle. It indicates that a longer cosine annealing cycle not only stabilizes but also enhances the generalizability and effectiveness across datasets, especially under unknown data conditions.

5. Conclusions

A novel LiOSR-SAR is proposed in this paper to address the dual challenges of maintaining high accuracy and managing model size in classifiers, particularly when dealing with unknown categories. The LiOSR-SAR integrates two advanced modules, the CAF module and the OP module. These are specifically designed to bolster classifier performance against unknown categories while preserving a lightweight model architecture.

The effectiveness of LiOSR-SAR, augmented by the CAF and OP modules, is demonstrated through a comparative analysis with existing DL-based classifiers. The efficacy of these modules is substantiated by detailed ablation studies. Additionally, the FIS-BSBR technique is proposed for the rapid creation of simulated datasets, generating radar images with enhanced target details that significantly benefit the testing of cost-effective classifiers.

The experimental results confirm that LiOSR-SAR achieves a 97.9% accuracy on the FIS-BSBR dataset and 94.1% on the MSTAR dataset, outperforming other lightweight classifiers. Ablation studies demonstrate that removing the CAF or OP modules leads to a significant drop in performance, highlighting their importance. These findings validate that LiOSR-SAR maintains an excellent balance between accuracy and model size, making it suitable for practical applications.

Author Contributions

Conceptualization, D.D.; methodology, J.Y.; software, J.X.; validation, J.Y. and J.X.; formal analysis, J.Y.; investigation, J.X.; data curation, J.Y.; writing—original draft preparation, J.Y.; writing—review and editing, J.Y. and J.G.; visualization, J.Y.; supervision, D.D.; project administration, D.D. and Z.C.; funding acquisition, D.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China (grant no. 61931021, grant no. 62301258) and the Natural Science Foundation of Jiangsu Province (grant no. BK20230918).

Data Availability Statement

The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Harger, R.O. Synthetic Aperture Radar System Design for Random Field Classification. IEEE Trans. Aerosp. Electron. Syst. 1973, AES-9, 732–740. [Google Scholar] [CrossRef]
Kirk, J. A Discussion of Digital Processing in Synthetic Aperture Radar. IEEE Trans. Aerosp. Electron. Syst. 1975, AES-11, 326–337. [Google Scholar] [CrossRef]
Zhang, Z.; Guo, W.; Zhu, S.; Yu, W. Toward Arbitrary-Oriented Ship Detection with Rotated Region Proposal and Discrimination Networks. IEEE Geosci. Remote Sens. Lett. 2018, 15, 1745–1749. [Google Scholar] [CrossRef]
Chen, M.; Xia, J.-Y.; Liu, T.; Liu, L.; Liu, Y. Open Set Recognition and Category Discovery Framework for SAR Target Classification Based on K-Contrast Loss and Deep Clustering. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 3489–3501. [Google Scholar] [CrossRef]
Musman, S.; Kerr, D.; Bachmann, C. Automatic recognition of ISAR ship images. IEEE Trans. Aerosp. Electron. Syst. 1996, 32, 1392–1404. [Google Scholar] [CrossRef]
Kyung-Tae, K.; Dong-Kyu, S.; Hyo-Tae, K. Efficient classification of ISAR images. IEEE Trans. Antennas Propagat. 2005, 53, 1611–1621. [Google Scholar] [CrossRef]
Zhou, X.; Luo, C.; Ren, P.; Zhang, B. Multiscale Complex-Valued Feature Attention Convolutional Neural Network for SAR Automatic Target Recognition. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 2052–2066. [Google Scholar] [CrossRef]
Schwartz, R. Minimax CFAR detection in additive Gaussian noise of unknown covariance (Corresp.). IEEE Trans. Inform. Theory 1969, 15, 722–725. [Google Scholar] [CrossRef]
Pei, J.; Huang, Y.; Huo, W.; Zhang, Y.; Yang, J.; Yeo, T.-S. SAR Automatic Target Recognition Based on Multiview Deep Learning Framework. IEEE Trans. Geosci. Remote Sens. 2018, 56, 2196–2210. [Google Scholar] [CrossRef]
Zhao, D.; Zhang, Z.; Lu, D.; Kang, J.; Qiu, X.; Wu, Y. CVGG-Net: Ship Recognition for SAR Images Based on Complex-Valued Convolutional Neural Network. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5. [Google Scholar] [CrossRef]
Liu, Q.; Xiao, L.; Yang, J.; Wei, Z. CNN-Enhanced Graph Convolutional Network With Pixel- and Superpixel-Level Feature Fusion for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2021, 59, 8657–8671. [Google Scholar] [CrossRef]
Xu, M.; Li, H.; Yun, Y.; Yang, F.; Li, C. End-to-End Pixel-Wisely Detection of Oceanic Eddy on SAR Images with Stacked Attention Network. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 9711–9724. [Google Scholar] [CrossRef]
Chen, S.; Wang, H.; Xu, F.; Jin, Y.-Q. Target Classification Using the Deep Convolutional Networks for SAR Images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 4806–4817. [Google Scholar] [CrossRef]
Feng, Y.; Chen, J.; Huang, Z.; Wan, H.; Xia, R.; Wu, B.; Sun, L.; Xing, M. A Lightweight Position-Enhanced Anchor-Free Algorithm for SAR Ship Detection. Remote Sens. 2022, 14, 1908. [Google Scholar] [CrossRef]
Yang, Y.; Ju, Y.; Zhou, Z. A Super Lightweight and Efficient SAR Image Ship Detector. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5. [Google Scholar] [CrossRef]
Xiong, B.; Sun, Z.; Wang, J.; Leng, X.; Ji, K. A Lightweight Model for Ship Detection and Recognition in Complex-Scene SAR Images. Remote Sens. 2022, 14, 6053. [Google Scholar] [CrossRef]
Xu, X.; Zhang, X.; Zhang, T. Lite-YOLOv5: A Lightweight Deep Learning Detector for On-Board Ship Detection in Large-Scene Sentinel-1 SAR Images. Remote Sens. 2022, 14, 1018. [Google Scholar] [CrossRef]
Zhang, M.; An, J.; Yu, D.H.; Yang, L.D.; Wu, L.; Lu, X.Q. Convolutional Neural Network with Attention Mechanism for SAR Automatic Target Recognition. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
Li, L.; Yuan, R.; Lv, Y.; Xu, S.; Hu, H.; Song, G. An efficient robotic-assisted bolt-ball joint looseness monitoring approach using CBAM-enhanced lightweight ResNet. Smart Mater. Struct. 2023, 32, 125008. [Google Scholar] [CrossRef]
Zhu, X.X.; Montazeri, S.; Ali, M.; Hua, Y.; Wang, Y.; Mou, L.; Shi, Y.; Xu, F.; Bamler, R. Deep Learning Meets SAR. arXiv 2021, arXiv:2006.10027. [Google Scholar] [CrossRef]
Wang, C.; Pei, J.; Yang, J.; Liu, X.; Huang, Y.; Mao, D. Recognition in Label and Discrimination in Feature: A Hierarchically Designed Lightweight Method for Limited Data in SAR ATR. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–13. [Google Scholar] [CrossRef]
Wu, Z.; Xie, H.; Hu, X.; He, J.; Wang, G. Lightweight Vehicle Detection and Recognition Method Based on Improved YOLOv5 in SAR Images. In Proceedings of the 2022 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), Xi’an, China, 25–27 October 2022; IEEE: Xi’an, China, 2022; pp. 1–6. [Google Scholar]
Zhang, T.; Zhang, X. ShipDeNet-20: An Only 20 Convolution Layers and <1-MB Lightweight SAR Ship Detector. IEEE Geosci. Remote Sens. Lett. 2021, 18, 1234–1238. [Google Scholar] [CrossRef]
Bendale, A.; Boult, T. Towards Open World Recognition. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; IEEE: Boston, MA, USA, 2015; pp. 1893–1902. [Google Scholar]
Dang, S.; Cao, Z.; Cui, Z.; Pi, Y.; Liu, N. Open Set Incremental Learning for Automatic Target Recognition. IEEE Trans. Geosci. Remote Sens. 2019, 57, 4445–4456. [Google Scholar] [CrossRef]
Scherreik, M.; Rigling, B. Multi-Class Open Set Recognition for SAR Imagery. In Proceedings of the SPIE Defense + Security, Baltimore, MD, USA, 19–21 April 2016; Sadjadi, F.A., Mahalanobis, A., Eds.; p. 98440M. [Google Scholar]
Bendale, A.; Boult, T.E. Towards Open Set Deep Networks. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: Las Vegas, NV, USA, 2016; pp. 1563–1572. [Google Scholar]
Scherreik, M.; Rigling, B. Automatic Threshold Selection for Multi-Class Open Set Recognition. In Proceedings of the SPIE Defense + Security, Anaheim, CA, USA, 11–14 September 2017; Sadjadi, F.A., Mahalanobis, A., Eds.; p. 102020K. [Google Scholar]
Oveis, A.H.; Giusti, E.; Ghio, S.; Martorella, M. Extended Openmax Approach for the Classification of Radar Images with a Rejection Option. IEEE Trans. Aerosp. Electron. Syst. 2023, 59, 196–208. [Google Scholar] [CrossRef]
Ma, X.; Ji, K.; Feng, S.; Zhang, L.; Xiong, B.; Kuang, G. Open Set Recognition with Incremental Learning for SAR Target Classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–14. [Google Scholar] [CrossRef]
Delisle, G.Y.; Wu, H. Moving target imaging and trajectory computation using ISAR. IEEE Trans. Aerosp. Electron. Syst. 1994, 30, 887–899. [Google Scholar] [CrossRef]
Zhao, Y.; Zhang, M.; Chen, H.; Yuan, X.-F. Radar Scattering From the Composite Ship-Ocean Scene: Doppler Spectrum Analysis Based on the Motion of Six Degrees of Freedom. IEEE Trans. Antennas Propagat. 2014, 62, 4341–4347. [Google Scholar] [CrossRef]
Dong, C.-L.; Guo, L.-X.; Meng, X.; Wang, Y. An Accelerated SBR for EM Scattering from the Electrically Large Complex Objects. Antennas Wirel. Propag. Lett. 2018, 17, 2294–2298. [Google Scholar] [CrossRef]
Bhalla, R.; Hao, L. Fast inverse synthetic aperture radar image simulation of complex targets using ray shooting. In Proceedings of the 1st International Conference on Image Processing, Austin, TX, USA, 13–16 November 1994; IEEE Comput. Soc. Press: Austin, TX, USA, 1994; Volume 1, pp. 461–465. [Google Scholar]
Kell, R.E. On the derivation of bistatic RCS from monostatic measurements. Proc. IEEE 1965, 53, 983–988. [Google Scholar] [CrossRef]
Buddendick, H.; Eibert, T.F. Application of a fast equivalent currents based algorithm for scattering center visualization of vehicles. In Proceedings of the 2010 IEEE Antennas and Propagation Society International Symposium, Toronto, ON, USA, 11–17 July 2010; IEEE: Toronto, ON, USA, 2010; pp. 1–4. [Google Scholar]
He, X.-Y.; Wang, X.-B.; Zhou, X.; Zhao, B.; Cui, T.-J. Fast Isar Image Simulation of Targets at Arbitrary Aspect Angles Using a Novel Sbr Method. PIER B 2011, 28, 129–142. [Google Scholar] [CrossRef]
Yun, D.-J.; Lee, J.-I.; Bae, K.-U.; Kwon, K.-I.; Myung, N.-H. Improvement in Accuracy of ISAR Image Formation Using the Shooting and Bouncing Ray. Antennas Wirel. Propag. Lett. 2015, 14, 970–973. [Google Scholar] [CrossRef]
Shang, R.; He, J.; Wang, J.; Xu, K.; Jiao, L.; Stolkin, R. Dense connection and depthwise separable convolution based CNN for polarimetric SAR image classification. Knowl.-Based Syst. 2020, 194, 105542. [Google Scholar] [CrossRef]
He, K.; Tang, J.; Liu, Z.; Yang, Z. HAFE: A Hierarchical Awareness and Feature Enhancement Network for Scene Text Recognition. Knowl.-Based Syst. 2024, 284, 111178. [Google Scholar] [CrossRef]
Yun, D.J.; Lee, J.I.; Yoo, J.H.; Myung, N.H. Fast bistatic ISAR image generation for realistic cad model using the shooting and bouncing ray technique. In Proceedings of the 2014 Asia-Pacific Microwave Conference, Sendai, Japan, 4–7 November 2014; pp. 1324–1326. [Google Scholar]
Ling, H.; Chou, R.-C.; Lee, S.-W. Shooting and bouncing rays: Calculating the RCS of an arbitrarily shaped cavity. IEEE Trans. Antennas Propagat. 1989, 37, 194–205. [Google Scholar] [CrossRef]
Taygur, M.M.; Sukharevsky, I.O.; Eibert, T.F. A Bidirectional Ray-Tracing Method for Antenna Coupling Evaluation Based on the Reciprocity Theorem. IEEE Trans. Antennas Propagat. 2018, 66, 6654–6664. [Google Scholar] [CrossRef]
Deleu, T.; Würfl, T.; Samiei, M.; Cohen, J.P.; Bengio, Y. Torchmeta: A Meta-Learning library for PyTorch. arXiv 2019, arXiv:1909.06576. [Google Scholar] [CrossRef]
Tong, X.; Zuo, Z.; Su, S.; Wei, J.; Sun, X.; Wu, P.; Zhao, Z. ST-Trans: Spatial-Temporal Transformer for Infrared Small Target Detection in Sequential Images. IEEE Trans. Geosci. Remote Sens. 2024, 1, 1. [Google Scholar] [CrossRef]
Munoz-Ferreras, J.M.; Perez-Martinez, F. On the Doppler Spreading Effect for the Range-Instantaneous-Doppler Technique in Inverse Synthetic Aperture Radar Imagery. IEEE Geosci. Remote Sens. Lett. 2010, 7, 180–184. [Google Scholar] [CrossRef]
Yun, D.-J.; Lee, J.-I.; Bae, K.-U.; Lim, H.; Myung, N.-H. Accurate and fast ISAR image formation for complex CAD using the shooting and bouncing ray. In Proceedings of the 2015 Asia-Pacific Microwave Conference (APMC), Nanjing, China, 6–9 December 2015; IEEE: Nanjing, China, 2015; pp. 1–3. [Google Scholar]
Ross, T.D.; Mossing, J.C. MSTAR Evaluation Methodology. In Proceedings of the AeroSense ’99, Orlando, FL, USA, 5–9 April 1999; Zelnio, E.G., Ed.; pp. 705–713. [Google Scholar]
Wang, C.; Shi, J.; Zhou, Y.; Yang, X.; Zhou, Z.; Wei, S.; Zhang, X. Semisupervised Learning-Based SAR ATR via Self-Consistent Augmentation. IEEE Trans. Geosci. Remote Sens. 2021, 59, 4862–4873. [Google Scholar] [CrossRef]
Wen, L.; Li, X.; Gao, L. A transfer convolutional neural network for fault diagnosis based on ResNet-50. Neural Comput. Applic 2020, 32, 6111–6124. [Google Scholar] [CrossRef]
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; IEEE: Las Vegas, NV, USA, 2016; pp. 770–778. [Google Scholar]
Mekhalfi, M.L.; Nicolo, C.; Bazi, Y.; Rahhal, M.M.A.; Alsharif, N.A.; Maghayreh, E.A. Contrasting YOLOv5, Transformer, and EfficientDet Detectors for Crop Circle Detection in Desert. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
Zhang, J.; Li, Y.; Si, Y.; Peng, B.; Xiao, F.; Luo, S.; He, L. A Low-Grade Road Extraction Method Using SDG-DenseNet Based on the Fusion of Optical and SAR Images at Decision Level. Remote Sens. 2022, 14, 2870. [Google Scholar] [CrossRef]
Jiang, S.; Zhou, X. DWSC-YOLO: A Lightweight Ship Detector of SAR Images Based on Deep Learning. JMSE 2022, 10, 1699. [Google Scholar] [CrossRef]
Li, C.; Cui, H.; Tian, X. A Novel CA-RegNet Model for Macau Wetlands Auto Segmentation Based on GF-2 Remote Sensing Images. Appl. Sci. 2023, 13, 12178. [Google Scholar] [CrossRef]

Figure 1. (a) Existing approach using ResNet-18. (b) Proposed approach, illustrating the overall network architecture of the LiOSR-SAR.

Figure 2. Workflow of CAF module, where ⊕ denotes element-wise addition.

Figure 3. Workflow of OP module. (a) Step 1: Establish the feature space. (b) Step 2: Measure similarity scores in feature space. (c) Step 3: Make category determinations.

Figure 4. BSBR schematic.

Figure 5. Comparison of 2D ISAR images of a ship model under four polarization modes. (a) RDA [46]. (b) FIS-SBR [47]. (c) FIS-BSBR.

Figure 6. Sample presentation in the MSTAR dataset. (a) BRDM-2; (b) BTR-60; (c) BTR-70; (d) T-72; (e) ZSU23-4; (f) ZIL-131; (g) D7; (h) BMP-2; (i) T-62; (j) 2S1.

Figure 7. Training loss and training accuracy curves for different methods. (a) FIS-SBR: (a-1) loss–epoch curves on the FIS-SBR; (a-2) accuracy–epoch curves on the FIS-SBR. (b) FIS-BSBR: (b-1) loss–epoch curves on the FIS-BSBR; (b-2) accuracy–epoch curves on the FIS-BSBR. (c) MSTAR: (c-1) loss–epoch curves on MSTAR; (c-2) accuracy–epoch curves on MSTAR.

Figure 8. Modulating learning rates through various cosine cycles.

Figure 9. Different learning rates on different datasets. (a) FIS-SBR: (a-1) training loss–epoch curve; (a-2) training loss (40–50)–epoch curve. (b) FIS-BSBR: (b-1) training loss–epoch curve; (b-2) training loss (40–50)–epoch curve. (c) MSTAR: (c-1) training loss–epoch curve; (c-2) training loss (40–50)–epoch curve.

Figure 10. Different learning rates on different datasets. (a) Training accuracy–epoch curve on FIS-SBR. (b) Training accuracy–epoch curve on FIS-BSBR. (c) Training accuracy–epoch curve on MSTAR.

Table 1. Setting of simulation parameters.

Parameter	Value
Center frequency	4.8 GHz
Bandwidth	0.75 GHz
Pitch angle	30°
Azimuth angle	[40.525°, 49.475°]/0.0705°
Polarization mode	VV, VH, HV, HH
Range resolution	0.2 m
Azimuth resolution	0.2 m
Imaging plane size	25.5 m × 25.5 m

Table 2. Performance comparison.

Methods	* Data/s	* Imaging/s	* Total/s
RDA [46]	240	0.1	240.1
FIS-SBR [47]	0.1	12	12.1
FIS-BSBR (proposed)	0.1	64	64.1

* Denotes generation time.

Table 3. The sample distribution in the FIS-SBR.

Category	@ Training	@ Test	@ All
Ship 1	1788	196	1984
Ship 2	1783	201	1984
Ship 3	0	183	183

@ Denotes the number.

Table 4. The sample distribution in the FIS-BSBR.

Category	@ Training	@ Test	@ All
Ship 1	1796	188	1984
Ship 2	1774	209	1983
Ship 3	0	183	183

@ Denotes the number.

Table 5. The sample distribution in the MSTAR dataset.

Category	@ Training	@ Test	@ All
BRDM-2	298	274	572
BTR-60	256	195	451
BTR-70	233	196	429
T-72	232	196	428
ZSU23-4	299	274	573
ZIL-131	299	274	573
D7	299	274	573
BMP-2	0	195	195
T-62	0	273	273
2S1	0	274	274

@ Denotes the number.

Table 6. Recognition accuracy of LiOSR-SAR.

Datasets	Overall Accuracy
FIS-SBR	86.9%
FIS-BSBR	97.9%
MSTAR	94.1%

Table 7. Confusion matrix of LiOSR-SAR on the unknown category recognition task for the FIS-SBR.

Dataset		Ship 1	Ship 2	Unknown
Dataset	True	Ship 1	Ship 2	Unknown
FIS-SBR	Ship 1	96.9%	0%	3.1%
	Ship 2	0%	98.5%	1.5%
	Unknown	65.7%	0%	34.3%
Overall Accuracy		86.9%

Blue color indicates the probability of correct classification.

Table 8. Confusion matrix of LiOSR-SAR on the unknown category recognition task for the FIS-BSBR.

Dataset		Ship 1	Ship 2	Unknown
Dataset	True	Ship 1	Ship 2	Unknown
FIS-BSBR	Ship 1	99.5%	0%	0.5%
	Ship 2	0%	98.1%	1.9%
	Unknown	3.6%	2.4%	94.0%
Overall Accuracy		97.9%

Blue color indicates the probability of correct classification.

Table 9. Confusion matrix of LiOSR-SAR on the unknown category recognition task for MSTAR.

Dataset		BRDM-2	BTR-60	BTR-70	T-72	ZSU23-4	ZIL-131	D7	Unknowns
Dataset	True	BRDM-2	BTR-60	BTR-70	T-72	ZSU23-4	ZIL-131	D7	Unknowns
MSTAR	BRDM-2	97.8%	0.0%	0.0%	0.0%	0.0%	0.0%	0.4%	1.8%
	BTR-60	0.5%	91.3%	0.0%	0.0%	0.0%	0.0%	0.0%	8.2%
	BTR-70	0.0%	0.0%	95.9%	0.0%	0.0%	0.0%	0.0%	4.1%
	T-72	0.0%	0.0%	0.0%	79.9%	0.0%	0.0%	0.7%	19.3%
	ZSU23-4	0.0%	0.0%	0.0%	0.0%	98.0%	0.0%	0.0%	2.0%
	ZIL-131	0.0%	0.0%	0.0%	0.0%	0.0%	98.9%	0.0%	1.1%
	D7	0.0%	0.0%	0.0%	0.0%	0.0%	0.0%	100.0%	0.0%
	Unknowns	0.4%	0.7%	0.5%	0.0%	2.8%	0.5%	1.6%	93.4%
Overall Accuracy		94.1%

Blue color indicates the probability of correct classification.

Table 10. Performance of LiOSR-SAR on three datasets.

Dataset	Size/MB	Time/s
FIS-SBR	7.5	880.4
FIS-BSBR	7.5	869.0
MSTAR	7.5	703.0

Table 11. Model size comparison with DL-based methods for CSR.

Methods		ResNet-50 [50]	ResNet-101 [51]	EfficientNet-B3 [52]	DenseNet-121 [53]	MobileNetV3 [54]	RegNet-32 [55]	Ours
FIS-SBR	Size/MB	90.0	162.7	41.3	27.1	16.2	9.0	7.5
FIS-SBR	Time/s	1347.5	1344.4	1341.1	1345.0	1344.1	1341.8	882.1
FIS-BSBR	Size/MB	90.0	162.7	41.3	27.1	16.2	9.0	7.5
FIS-BSBR	Time/s	1341.5	1347.3	1360.8	1367.3	1385.2	1374.7	869.6
MSTAR	Size/MB	90.0	162.8	41.4	27.1	16.3	9.0	7.5
MSTAR	Time/s	987.6	1264.4	1039.5	1041.3	873.9	849.8	851.0

Table 12. Accuracy comparison with DL-based methods for CSR.

Methods	FIS-SBR	FIS-BSBR	MSTAR
ResNet-50 [50]	100%	100%	82.6%
ResNet-101 [51]	100%	100%	85.9%
EfficientNet-B3 [52]	100%	100%	73.7%
DenseNet-121 [53]	99.7%	100%	81.3%
MobileNetV3 [54]	100%	100%	73.6%
RegNet-32 [55]	99.5%	100%	78.5%
Ours	100%	100%	97.3%

Table 13. Accuracy comparison with DL-based methods for OSR from the literature.

Methods	FIS-SBR	FIS-BSBR	MSTAR
Literature [25]	83.5%	90.2%	88.2%
Literature [29]	84.4%	88.3%	85.8%
Ours	86.9%	97.9%	94.1%

Table 14. Model size comparison with DL-based methods for OSR from the literature.

Methods	FIS-SBR	FIS-BSBR	MSTAR
Literature [25]	36.7 MB	36.7 MB	36.7 MB
Literature [29]	44.2 MB	44.2 MB	44.2 MB
Ours	7.5 MB	7.5 MB	7.5 MB

Table 15. Accuracy comparison of combined feature extraction and classification strategies for OSR.

Methods	FIS-SBR	FIS-BSBR	MSTAR
ResNet-50 [50] + SoftMax [27]	83.1%	86.0%	79.5%
ResNet-50 [50] + OSmIL [25]	84.7%	88.9%	75.7%
ResNet-50 [50] + Extended OpenMax [29]	76.0%	76.2%	81.1%
ResNet-101 [51] + SoftMax [27]	83.3%	87.7%	78.6%
ResNet-101 [51] + OSmIL [25]	81.8%	84.3%	80.7%
ResNet-101 [51] + Extended OpenMax [29]	62.5%	77.2%	74.5%
DenseNet-121 [53] + SoftMax [27]	83.3%	89.5%	81.4%
DenseNet-121 [53] + OSmIL [25]	81.6%	85.8%	74.1%
DenseNet-121 [53] + Extended OpenMax [29]	75.0%	85.8%	82.1%
MobileNetV3 [54] + SoftMax [27]	83.7%	83.7%	75.0%
MobileNetV3 [54] + OSmIL [25]	86.2%	85.6%	69.6%
MobileNetV3 [54] + Extended OpenMax [29]	53.7%	89.1%	76.5%
EfficientNet-B3 [52] + SoftMax [27]	85.2%	87.0%	73.1%
EfficientNet-B3 [52] + OSmIL [25]	89.1%	86.2%	63.1%
EfficientNet-B3 [52] + Extended OpenMax [29]	63.9%	88.5%	72.3%
RegNet-32 [55] + SoftMax [27]	84.3%	89.1%	75.7%
RegNet-32 [55] + OSmIL [25]	89.1%	91.6%	69.5%
RegNet-32 [55] + Extended OpenMax [29]	87.9%	81.6%	77.2%
Ours	86.9%	97.9%	94.1%

Table 16. Ablation study for the CAF module on accuracy.

Dataset	CAF Module	Accuracy
FIS-SBR	×	40.8%
FIS-SBR	√	86.9%
FIS-BSBR	×	79.2%
FIS-BSBR	√	97.9%
MSTAR	×	82.4%
MSTAR	√	94.1%

Table 17. Ablation study for the CAF module on size and training time.

CAF Module	Size/MB	Training Time/s
×	44.2	910.4
√	7.5	867.6

Table 18. Ablation study for the OP module on accuracy.

Dataset	OP Module	Accuracy
FIS-SBR	×	33.3%
FIS-SBR	√	86.9%
FIS-BSBR	×	33.3%
FIS-BSBR	√	97.9%
MSTAR	×	55.7%
MSTAR	√	94.1%

Table 19. Ablation study for the OP module on size and training time.

OP Module	Size/MB	Training Time/s
×	7.5	867.6
√	7.5	867.6

Table 20. Accuracy under different learning rates of LiOSR-SAR.

	10	25	45
Datasets	10	25	45
FIS-SBR	83.5%	85.6%	84.4%
FIS-BSBR	94.0%	80.1%	97.9%
MSTAR	89.1%	85.9%	94.1%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Yang, J.; Gu, J.; Xin, J.; Cong, Z.; Ding, D. LiOSR-SAR: Lightweight Open-Set Recognizer for SAR Imageries. Remote Sens. 2024, 16, 3741. https://doi.org/10.3390/rs16193741

AMA Style

Yang J, Gu J, Xin J, Cong Z, Ding D. LiOSR-SAR: Lightweight Open-Set Recognizer for SAR Imageries. Remote Sensing. 2024; 16(19):3741. https://doi.org/10.3390/rs16193741

Chicago/Turabian Style

Yang, Jie, Jihong Gu, Jingyu Xin, Zhou Cong, and Dazhi Ding. 2024. "LiOSR-SAR: Lightweight Open-Set Recognizer for SAR Imageries" Remote Sensing 16, no. 19: 3741. https://doi.org/10.3390/rs16193741

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

LiOSR-SAR: Lightweight Open-Set Recognizer for SAR Imageries

Abstract

1. Introduction

2. LiOSR-SAR

2.1. Overview

2.2. CAF Module

2.3. OP Module

3. Datasets

3.1. Construction of Simulated Datasets

3.1.1. Image Formation

3.1.2. Validation of Simulated Datasets

3.2. MSTAR Dataset

4. Experiment

4.1. Classification Performance on Diverse Datasets

4.1.1. Accuracy Analysis

4.1.2. Confusion Matrix Analysis

4.1.3. Model Size and Time

4.2. Comparison with Other DL-Based Methods

4.2.1. Comparison with CSR Methods

4.2.2. Comparison with OSR Methods

4.3. Ablation Experiments

4.3.1. On CAF Module

4.3.2. On OP Module

4.3.3. Learning Rate Cycle Discussion

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI