Article

Ship Infrared Automatic Target Recognition Based on Bipartite Graph Recommendation: A Model-Matching Method

1 School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu 610095, China
2 Institute of Systems Engineering, Academy of Military Sciences (AMS), People’s Liberation Army of China (PLA), Beijing 100091, China
3 Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(1), 168; https://doi.org/10.3390/math12010168
Submission received: 26 November 2023 / Revised: 26 December 2023 / Accepted: 30 December 2023 / Published: 4 January 2024
(This article belongs to the Section Computational and Applied Mathematics)

Abstract: Deep learning technology has greatly propelled the development of intelligent and information-driven research on ship infrared automatic target recognition (SIATR). In future scenarios, there will be various recognition models with different mechanisms to choose from. However, in complex and dynamic environments, ship infrared (IR) data exhibit a rich feature space distribution, causing performance to vary among SIATR models; no single model is universally superior across all recognition scenarios. In light of this, this study proposes a model-matching method for SIATR tasks based on bipartite graph theory. This method establishes evaluation criteria based on recognition accuracy and feature learning credibility, uncovering the underlying connections between the IR attributes of ships and the candidate models. The objective is to selectively recommend the optimal candidate model for a given sample, enhancing the overall recognition performance and applicability of the model. We separately conducted tests for the optimization of accuracy and credibility on high-fidelity simulation data, achieving an Accuracy of 95.86% and an EDMS (our credibility metric) of 0.7781, improvements of 1.06% and 0.0274, respectively, over the best of the six candidate models. Subsequently, we created a recommendation system that balances the two objectives, yielding improvements of 0.43% (Accuracy) and 0.0071 (EDMS). Additionally, considering the relationship between model resources and performance, we achieved a 28.35% reduction in memory usage while realizing enhancements of 0.33% (Accuracy) and 0.0045 (EDMS).

1. Introduction

Ship targets are crucial combat units in modern maritime warfare, and it is important to recognize them accurately and credibly to enhance maritime situational awareness and gain an advantage. Infrared (IR) imaging technology is advantageous due to its all-weather capability, long-range perception, and strong concealment. It, along with visible light and synthetic aperture radar (SAR) imaging, forms an important means of acquiring feature information about ship targets and is widely used in ship automatic target recognition (SATR) tasks [1,2,3]. In the marine environment, different types of ships exhibit varying degrees of thermal characteristics and IR radiation spectra within the IR spectrum. This paper focuses on ship IR automatic target recognition (SIATR) technology, which utilizes sensors to capture IR images of ship targets. By combining image processing, recognition algorithms, and other techniques, this technology automatically and accurately extracts shape, IR radiation, and other feature information from the targets. This extracted information is then compared and matched against a pre-established feature information database to determine the type and identity of the target [4]. In the military domain, this technology provides category–priority information for subsequent tasks like target tracking, threat assessment, and target engagement, thereby delivering reliable target recognition and intelligence support for maritime operations. Additionally, it has significant applications in civilian sectors such as maritime surveillance, safety rescue, and related activities.
In recent decades, SATR technology has witnessed rapid development. Traditional SATR research primarily relies on machine learning and pattern recognition algorithms. Specifically, the recognition system preprocesses the acquired images, including steps such as image enhancement and target extraction. Subsequently, through feature extraction and selection, the texture, shape, size, and other information of the targets are transformed into numerical features. Finally, by utilizing a machine learning classifier, the numerical features are analyzed and evaluated, enabling the automated recognition of different targets. For instance, in [5], the original IR images are first segmented to obtain the contour features of the ship body. Next, the moment functions of the contour are calculated to ensure translation, rotation, and scale invariance. Finally, the extracted features are input into a backpropagation neural network to achieve category prediction. Luo et al. [6] utilized a moment-based shape analysis method to extract features from optical images. They employed a complex angle radial transform for the shape of binary images to generate feature vectors. Then, they employed the k-nearest neighbors algorithm to make recognition decisions. Li et al. [7] first detected the salient features of the targets and segmented the useful regions. Afterward, they extracted IR features by utilizing the moment functions of the target boundaries and solid contours and fed them into a support vector machine to predict the category. However, traditional methods often rely on manual intervention and rule definition, which not only consume time and effort but also lead to the possibility of misjudgments and omissions. Additionally, these methods have limited adaptability and require major adjustments in different contexts, making it difficult to exhibit good robustness in large-scale data and complex environments [8].
In recent years, deep learning methods have emerged as a prominent force, injecting vitality into image recognition tasks across numerous domains. Their remarkable capability lies in their ability to automatically and accurately learn complex and diverse feature representations from vast amounts of data [9]. These methods have an end-to-end mechanism that enables instant decision making and is expected to provide more robust and powerful technical support for SATR research [3]. Currently, several researchers have begun exploring the application of deep learning techniques in SATR research and have achieved promising initial results. Initially, the focus of research in this field was to construct ship datasets and simultaneously develop deep learning algorithms. For instance, reference [10] introduced the VAIS dataset, which comprised over 1000 ship images in both IR and visible light, covering six different categories. The study then used a 16-layer convolutional neural network (CNN) model for validation on the VAIS dataset, demonstrating the advantages of deep learning methods over traditional methods in SATR tasks. Karabayir et al. [11] employed a CAD modeling approach to synthesize ship images encompassing multiple targets for military and civilian purposes, as a means to augment the dataset size. They concurrently validated the feasibility of this approach in providing parameter training for CNN models. With the development of intelligent SATR research, some scholars have begun to focus on improving the internal design or embedding additional mechanisms on the basis of a single model. These refinements have resulted in better accuracy and generalization of SATR models. We summarize the key techniques and results of some related studies in Table 1.
Overall, the successful application of deep learning technologies in the SATR field not only signifies their current prominence but also suggests their potential to shape the future direction of SATR research [18,19]. Furthermore, it is anticipated that an array of models incorporating diverse mechanisms will emerge, offering researchers a broader spectrum of options for their specific recognition tasks.
However, ship targets in marine environments are influenced by various factors such as environmental conditions, target posture, and target behavior [20], exhibiting diverse IR characteristics and a rich feature distribution space. This poses significant challenges to the performance generalization and usability of recognition models. In fact, the performance of SATR models is influenced by their mechanisms and structures, making it particularly difficult to rely on a single model to achieve absolute performance advantages across all scenarios. The key focus of this paper is on how to further enhance the overall performance of SATR tasks while keeping the quantity and performance of existing deep learning-based SATR models unchanged. One strategy involves using ensemble learning to fuse the features of multiple models to enhance the ability to infer categories. However, this approach significantly increases overall complexity, resulting in resource and computational burdens that typically surpass those of all base models combined [21,22], and weakens the interpretability of the model [23]. Another viable approach is to adopt model-adaptive recommendation. This method explores and captures the inherent connection between specific scene samples and model performance, constructing a prior knowledge system to adaptively recommend the most suitable model, thus obtaining better recognition judgments with a lower-complexity structure.
Against this background, this paper focuses on IR-based SATR tasks and proposes a novel model-matching method for SIATR based on bipartite graph recommendation [24] (SIATR-BGR), aiming to enhance the performance and applicability of the model in complex environments. The SIATR-BGR method considers the IR characteristic attributes of samples and candidate models as nodes of different categories in a bipartite graph. Through a reward–penalty combination, the method systematically explores the matching relationship between sample attributes and models, i.e., the edge weights in the bipartite graph. In designing and computing the edge weights, the method establishes not only the relationship between attributes and the recognition accuracy of candidate models but also quantitatively evaluates the credibility of model feature extraction. When recommending a model, the method calculates a recommendation score for each candidate model to reflect its degree of match with a specific sample, thus selecting the most suitable candidate model. The proposal of this method offers a new approach to enhance the overall performance of SIATR tasks while also providing decision makers with clearer model selection criteria, aiding in a better understanding of model selection behavior. It is worth noting that there is currently no large-scale publicly available IR dataset that covers various complex scenarios for real ship targets. With the continuous maturation of simulation modeling technology, some studies have successfully utilized simulated data for SATR research [25,26,27]. In light of this inspiration, the validation process of this study will rely on highly realistic simulated IR data.
The contributions of this paper are as follows:
  • We propose a novel SIATR model recommendation method, distinguished by its ability to adaptively match the optimal model based on sample attributes, which enhances adaptability to complex scenarios and improves overall performance;
  • We have created a measure of SIATR model feature learning credibility. This measure, in combination with traditional accuracy metrics, provides a more comprehensive assessment of model usability;
  • During the experimental validation phase, we analyze both recognition accuracy and feature learning credibility, as well as the relationship between resource consumption and the performance of the recommendation system.
In the remaining sections of the paper, Section 2 will introduce the materials and methods used in this study, Section 3 will present the experiments and results, and Section 4 will summarize the work of this paper.

2. Materials and Methods

2.1. Dataset

Deep learning recognition models require comprehensive data as the fundamental input, where the quality and quantity of data directly impact the performance of the model [28]. Open datasets suitable for deep IR learning research on ships, such as VAIS and IRships [29], are relatively rare due to factors such as time and funding. Additionally, these datasets are limited by insufficient data volume, which makes it challenging to accurately reflect the changes in IR characteristics in different scenarios. As image simulation technology continues to advance, the use of IR simulation modeling to synthesize data for neural network parameter training has emerged as an alternative approach. The ship IR data used in this paper are obtained from our independently developed, physically credible simulation modeling software for maritime scenarios. This software integrates highly credible meteorological model modules, allowing realistic simulation and analysis of various environmental factors, and can generate a large number of images and corresponding bounding boxes [30] in a short period.
For this study, we used a dataset consisting of 10,368 images with a resolution of 1024 × 768 pixels. The dataset encompasses three distinct categories: container freighter (3456 images), cruise ship (3456 images), and warship (3456 images). To ensure a comprehensive reflection of the differences in IR characteristics within real-world scenarios, we considered various factors that affect the generation and propagation of radiation, as well as diverse sensor capture elements when constructing the dataset. The relevant information is summarized in Table 2. In Figure 1, we showcase a selection of typical image samples and the automatically generated bounding boxes in sliced format.

2.2. Proposed Approach

IR images of ships can vary significantly due to different background environments and shooting conditions. This variation can result in differences in the accuracy and credibility of models under different mechanisms. Consequently, there exists an inherent correlation between the attribute information of the images and the efficacy of the models. If one conceptualizes these entities as nodes of different categories (sample attributes, models), it becomes evident that nodes within the same category lack direct connections, while relationships between nodes of different types can be represented as weighted edges. Thus, the relationship between attributes and models can be abstracted as the matching between nodes of a weighted bipartite graph. The SIATR-BGR method explores the potential impact of IR attributes on model performance from the perspectives of accuracy and credibility, and uses a bipartite graph to represent the relationship. Inspired by the analysis of causal factors influencing model decisions, as discussed in references [31,32], SIATR-BGR follows the approach illustrated in Figure 2 to obtain masked images of the background region. The output differences of the model before and after masking are then quantified. This is because the training of the SIATR model mainly involves learning the feature space distribution of the samples, including the shape of the target, the brightness of the imaging, and the characteristics of the background region. When making prediction decisions, excellent candidate models primarily rely on the shape and brightness of the target, with minimal interference from the background region.
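For concreteness, the masking step can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation: the zero fill value and the (x_min, y_min, x_max, y_max) box layout are assumptions, since the text only prescribes that the region outside the annotation box is masked.

```python
import numpy as np

def mask_background(image: np.ndarray, bbox: tuple) -> np.ndarray:
    """Keep the pixels inside the target bounding box; zero out the rest.

    `bbox` is assumed to be (x_min, y_min, x_max, y_max) in pixels; the
    zero fill value is likewise an assumption of this sketch.
    """
    x_min, y_min, x_max, y_max = bbox
    masked = np.zeros_like(image)
    # Copy only the target region; the background stays at the fill value.
    masked[y_min:y_max, x_min:x_max] = image[y_min:y_max, x_min:x_max]
    return masked
```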
The SIATR-BGR method consists of two main components, namely, bipartite graph knowledge construction and model-adaptive recommendation, as shown in Figure 3. In the knowledge construction phase, SIATR-BGR establishes a bipartite graph mapping relationship between sample attributes and model selection by integrating the output values of the samples with their inherent attributes. The strength of these relationships is reflected by the edge weights of the bipartite graph. In the model-adaptive recommendation phase, SIATR-BGR uses the pre-established prior knowledge, together with the sample's attributes, to extract the relevant information. It then calculates the recommended model priority order, i.e., it performs the targeted selection of the optimal recognition model.

2.2.1. Preliminaries

In this section, we will select and partition the sample IR attributes used to construct the SIATR-BGR framework and define some of the mathematical symbols that will be used. In the given simulated data, five types of attributes, including time, motion state, air temperature, zenith angle, and distance, can be perceived in advance through various means. This paper mainly focuses on these attributes, and their division is presented in Table 3. In our method, the following symbols represent the elements of the bipartite graph: $G = (M, A, E, U, P)$ denotes the bipartite graph, where $M = \{\mathrm{Model}_j \mid j = 1, \dots, J\}$ and $A = \{\mathrm{Attribute}_a^t \mid a = 1, \dots, A;\ t = 1, \dots, -1\}$ represent the vertex sets corresponding to the models and the sample attributes, respectively. In $\mathrm{Attribute}_a^t$, the subscript $a$ determines the broad category of the sample attribute, and the superscript $t$ determines the small category (subclass) of the attribute after division; if $t$ equals $-1$, it represents the last subclass of the current category. $E \subseteq A \times M$ represents the edge set between $A$ and $M$. Lastly, $U = \{U_{j,a}^t\}$ and $P = \{P_{j,a}^t\}$ represent the weight sets for the accuracy and credibility of model recognition, respectively.
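To make the attribute partition concrete, the sketch below maps a sample's raw readings to one subclass per attribute group, following the intervals of Table 3. The flattened slot layout and the handling of the Near/Far distance split (whose numeric threshold is not tabulated) are illustrative assumptions.

```python
def attribute_slots(hour: float, speed: float, temp_c: float,
                    zenith_deg: float, is_far: bool) -> list:
    """Map raw readings to one flattened subclass slot per attribute group.

    Slot layout (an assumption): time(3) -> 0..2, motion(2) -> 3..4,
    air temperature(2) -> 5..6, zenith angle(2) -> 7..8, distance(2) -> 9..10.
    The Near/Far threshold is not given in Table 3, so the caller supplies
    the Boolean directly.
    """
    if 8 <= hour < 17:
        t_time = 0                          # Daytime
    elif 5 <= hour < 8 or 17 <= hour < 20:
        t_time = 1                          # Morning and evening
    else:
        t_time = 2                          # Nighttime
    t_motion = 0 if speed > 0 else 1        # Moving / Static
    t_temp = 0 if temp_c < 10 else 1        # Table 3's two temperature bins
    t_zenith = 0 if zenith_deg > 45 else 1  # Zenith_high / Zenith_low
    t_dist = 1 if is_far else 0             # Near / Far
    return [0 + t_time, 3 + t_motion, 5 + t_temp, 7 + t_zenith, 9 + t_dist]
```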

2.2.2. Knowledge Construction

Knowledge construction aims to acquire the model–attribute weight matrix $UP = (U_{j,a}^t, P_{j,a}^t)$ that carries the prior knowledge information. Part a of Figure 4 illustrates the process of obtaining the weight matrix. Assuming there are $n$ samples used for constructing the weight matrix, the specific steps include the following parts:
  • For the i-th input image, the mask image is obtained using the method in Figure 2, and the attribute partition interval of the image is obtained using the following formula:
    $$A^i = \left(\mathrm{Attribute}_1^{t_i}, \dots, \mathrm{Attribute}_a^{t_i}, \dots, \mathrm{Attribute}_A^{t_i}\right) \tag{1}$$
    Here, $t_i$ is used to determine the specific attribute subclass corresponding to the sample.
  • Let $f_{\mathrm{model}\,j}(\cdot)$ be the predicted output function of candidate $\mathrm{model}\ j$; calculate the predicted scores for image $i$ before and after masking using (2) and (3):
    $$\mathrm{score}_{j,l}^{i} = f_{\mathrm{model}\,j}(\mathrm{image}\ i) \tag{2}$$
    $$\mathrm{score}_{j,l}^{i,\mathrm{mask}} = f_{\mathrm{model}\,j}(\mathrm{masked\ image}\ i) \tag{3}$$
  • Here, $l = 1, \dots, L$ indexes the predicted categories. Record the predicted scores in vector form using (4) and (5):
    $$\mathbf{score}_j^{i} = \left(\mathrm{score}_{j,1}^{i}, \dots, \mathrm{score}_{j,L}^{i}\right) \tag{4}$$
    $$\mathbf{score}_j^{i,\mathrm{mask}} = \left(\mathrm{score}_{j,1}^{i,\mathrm{mask}}, \dots, \mathrm{score}_{j,L}^{i,\mathrm{mask}}\right) \tag{5}$$
  • Based on this, adopt a reward–penalty approach to obtain the weights $U_{j,i}$ and $P_{j,i}$ for the model with respect to the sample (to be introduced separately below).
  • Construct the model–attribute weight matrix $UP_i = (U_{j,a}^{t,i}, P_{j,a}^{t,i})$ for image $i$. Set the weights of the image's true attribute subclasses equal to $U_{j,i}$ and $P_{j,i}$, and the weights of the non-corresponding subclasses to 0, i.e., assign them according to the following equation:
    $$U(P)_{j,a}^{t,i} = \begin{cases} U(P)_{j,i}, & \text{if } t = t_i \\ 0, & \text{otherwise} \end{cases} \tag{6}$$
  • Combine the $UP_i$ of all $n$ images and compute the average using Equation (7) to obtain the final model–attribute weight matrix $UP = (U_{j,a}^t, P_{j,a}^t)$; a minimal array sketch of this scatter-and-average step follows the list.
    $$U(P)_{j,a}^{t} = \frac{1}{n}\sum_{i=1}^{n} U(P)_{j,a}^{t,i} \tag{7}$$
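Under the assumption that the five attribute groups of Table 3 are flattened into $T$ subclass slots, Equations (6) and (7) reduce to a scatter-and-average over a (J, T, 2) array. The array layout below is ours, not prescribed by the text:

```python
import numpy as np

J, T = 6, 11    # six candidate models; 11 flattened subclass slots (see above)

def scatter_weights(U_i: np.ndarray, P_i: np.ndarray, slots: list) -> np.ndarray:
    """Eq. (6): write the per-image weights U_{j,i}, P_{j,i} into the slots
    matching the image's true attribute subclasses; all other slots stay 0."""
    UP_i = np.zeros((J, T, 2))
    for s in slots:                 # one flattened slot per attribute group
        UP_i[:, s, 0] = U_i         # accuracy weights, one value per model
        UP_i[:, s, 1] = P_i         # credibility weights, one value per model
    return UP_i

def build_knowledge(per_image_weights) -> np.ndarray:
    """Eq. (7): average the per-image matrices UP_i over all n images.

    `per_image_weights` is an iterable of (U_i, P_i, slots) triples, where
    U_i and P_i each have shape (J,).
    """
    UP, n = np.zeros((J, T, 2)), 0
    for U_i, P_i, slots in per_image_weights:
        UP += scatter_weights(U_i, P_i, slots)
        n += 1
    return UP / n
```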
Part b of Figure 4 illustrates the calculation process for the weights $U_{j,i}$ and $P_{j,i}$ using a sample whose true label is 0. For the sake of simplicity, only the calculation results for three candidate models are listed in the figure. The calculation process for $U_{j,i}$ is as follows:
  • Obtain each candidate model's class prediction score matrix $SCORE^i = (\mathrm{score}_{j,l}^{i})$ for the original image $i$. The specific form of $SCORE^i$ is shown in Equation (8):
    $$SCORE^i = \left(\mathbf{score}_1^{i}, \dots, \mathbf{score}_J^{i}\right)^{\mathsf T} \tag{8}$$
  • To simplify the data and focus on the category of interest, extract the sub-matrix $SCORE^{i,l_i} = (\mathrm{score}_{j,l_i}^{i})$ from $SCORE^i$, where $l_i$ represents the true category of image $i$.
  • Normalize the elements of $SCORE^{i,l_i}$ using the softmax function, and denote the updated elements as $\mathrm{score}_{j,\mathrm{softmax}}^{i}$:
    $$f_{\mathrm{softmax}}(\mathrm{score}_{j,l_i}^{i}) = \frac{e^{\mathrm{score}_{j,l_i}^{i}}}{\sum_{j=1}^{J} e^{\mathrm{score}_{j,l_i}^{i}}} \tag{9}$$
  • Use the accuracy-optimized reward–penalty function $f_U(\cdot)$ to calculate the weights $U_{j,i}$, such that the weights of correctly predicting models are positive and the weights of incorrectly predicting models are negative. In the following equation, $\alpha$ represents the accuracy-optimized penalty factor (a minimal code sketch follows the list):
    $$f_U(\mathrm{score}_{j,\mathrm{softmax}}^{i}) = \begin{cases} \mathrm{score}_{j,\mathrm{softmax}}^{i}, & \mathrm{pre\_label} = \mathrm{real\_label} \\ -\alpha\,(1 - \mathrm{score}_{j,\mathrm{softmax}}^{i}), & \text{otherwise} \end{cases} \tag{10}$$
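A minimal NumPy sketch of Equations (9) and (10) follows. The max-subtraction inside the softmax is a standard numerical-stability detail not mentioned above, and the `correct` mask is assumed to come from comparing each model's top-1 prediction with the true label:

```python
import numpy as np

def f_U(scores_true_class: np.ndarray, correct: np.ndarray,
        alpha: float) -> np.ndarray:
    """Eqs. (9)-(10): softmax-normalize the J models' scores for the true
    class, then reward correct models and penalize incorrect ones.

    `scores_true_class` has shape (J,); `correct` is a Boolean mask of the
    models whose top-1 prediction matched the true label.
    """
    e = np.exp(scores_true_class - scores_true_class.max())  # stable softmax
    s = e / e.sum()                                          # Eq. (9)
    # Eq. (10): positive weight if correct, -alpha * (1 - s) otherwise.
    return np.where(correct, s, -alpha * (1.0 - s))
```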
The credibility indicator $P_{j,i}$ of SIATR-BGR is obtained by masking the pixel values in the background area outside the image annotation box and quantifying the credibility of feature learning by comparing the output values before and after masking. A smaller difference between the output values indicates that the model's decision is less affected by irrelevant environmental information, while a larger difference indicates the opposite. The specific steps for computing $P_{j,i}$ are as follows:
  • Obtain each candidate model's class prediction score matrix $SCORE^{i}_{\mathrm{mask}} = (\mathrm{score}_{j,l}^{i,\mathrm{mask}})$ for the masked image $i$. The specific form of $SCORE^{i}_{\mathrm{mask}}$ is shown in Equation (11):
    $$SCORE^{i}_{\mathrm{mask}} = \left(\mathbf{score}_1^{i,\mathrm{mask}}, \dots, \mathbf{score}_J^{i,\mathrm{mask}}\right)^{\mathsf T} \tag{11}$$
  • Calculate the Euclidean distance between $SCORE^{i}_{\mathrm{mask}}$ and $SCORE^i$ to measure the similarity of each model's outputs before and after image masking, and denote the result as $ED = (\mathrm{score}_j^{i,d})$:
    $$f_d(\mathbf{score}_j^{i,\mathrm{mask}}, \mathbf{score}_j^{i}) = \sqrt{\sum_{l=1}^{L} \left(\mathrm{score}_{j,l}^{i,\mathrm{mask}} - \mathrm{score}_{j,l}^{i}\right)^2} \tag{12}$$
  • Using Equation (9), normalize the element values of $ED$, and denote the result as $\mathrm{score}_{j,\mathrm{softmax}}^{i,d}$.
  • When the original image is predicted correctly, an excellent candidate model's decision should be less sensitive to the background region of the target, resulting in a smaller $\mathrm{score}_{j,\mathrm{softmax}}^{i,d}$, and it should thus be assigned a larger weight. Conversely, when the prediction is incorrect, negative weights should be assigned, and smaller values of $\mathrm{score}_{j,\mathrm{softmax}}^{i,d}$ should incur greater penalties. To achieve this, the absolute value is introduced into Equation (13) to design a reward–penalty function $f_P(\cdot)$ for calculating the weight $P_{j,i}$, where $\beta$ represents the credibility penalty factor (a minimal code sketch follows the list):
    $$f_P(\mathrm{score}_{j,\mathrm{softmax}}^{i,d}) = \begin{cases} 1 - \mathrm{score}_{j,\mathrm{softmax}}^{i,d}, & \mathrm{pre\_label} = \mathrm{real\_label} \\ -\beta\left|1 - \mathrm{score}_{j,\mathrm{softmax}}^{i,d}\right|, & \text{otherwise} \end{cases} \tag{13}$$
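Analogously, Equations (11)-(13) can be sketched as follows; again the `correct` mask is assumed to come from comparing each model's top-1 prediction on the original image with the true label:

```python
import numpy as np

def f_P(scores: np.ndarray, scores_mask: np.ndarray,
        correct: np.ndarray, beta: float) -> np.ndarray:
    """Eqs. (11)-(13): credibility weights from the output shift under masking.

    `scores` and `scores_mask` have shape (J, L): per-model class scores
    before and after background masking.
    """
    d = np.linalg.norm(scores_mask - scores, axis=1)  # Eq. (12), one per model
    e = np.exp(d - d.max())
    s_d = e / e.sum()                                 # normalized via Eq. (9)
    # Eq. (13): a small shift earns a large reward when correct; when
    # incorrect, a small shift incurs a larger (negative) penalty.
    return np.where(correct, 1.0 - s_d, -beta * np.abs(1.0 - s_d))
```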

2.2.3. Model-Adaptive Recommendation

In this section, we use the prior knowledge constructed in the previous section to adaptively recommend the optimal model to recognize the actual image based on its attribute information. Part a of Figure 5 demonstrates the adaptive recommendation process in the form of a bipartite graph, and its computational process is equivalent to matrix operations. Part b provides a corresponding schematic diagram of matrix operations. For each known IR attribute of image i , the process of model recommendation for candidate models includes two steps.
Extracting the bipartite subgraph: image $i$ can be represented in the form of Equation (1) by its attribute information, on the basis of which the corresponding bipartite subgraph, denoted $G_{\mathrm{sub}}$, is extracted from the original model–attribute bipartite graph $G$. This process can also be understood as retaining all $M$ nodes in $G$ while pruning the attribute nodes absent from $A^i$, along with their corresponding edges. The corresponding matrix description is as follows: extract the weights corresponding to image $i$ from the weight matrix $UP = (U_{j,a}^t, P_{j,a}^t)$ to obtain the sub-weight matrix $UP_{\mathrm{sub}} = (U_{j,a}^{t_i}, P_{j,a}^{t_i})$.
The calculation of model recommendation scores: this process involves computing the recommendation score $\mathrm{score\_model}_j^i$ of each candidate model for image $i$ in $G_{\mathrm{sub}}$, which is essentially the weighted sum of the edge weights of each $M$ node in $G_{\mathrm{sub}}$. Subsequently, the recommendation scores of the various models are compared to determine the optimal model $\mathrm{Model}_{j^*}$ under the prior knowledge, which is then used for category prediction. The value of $\mathrm{score\_model}_j^i$ and the subscript $j^*$ of $\mathrm{Model}_{j^*}$ are determined via Formulas (14) and (15), respectively:
$$\mathrm{score\_model}_j^i = W_U \sum_{a=1}^{A} U_{j,a}^{t_i} + W_P \sum_{a=1}^{A} P_{j,a}^{t_i} \tag{14}$$
$$j^* = \arg\max_j \left\{\mathrm{score\_model}_1^i, \dots, \mathrm{score\_model}_J^i\right\} \tag{15}$$
where $W_U$ and $W_P$ are the weights used for accuracy and credibility optimization, respectively. They satisfy the condition $W_U + W_P = 1$ and adjust the preference emphasis of the entire recommendation system. In summary, we present the computational process of the SIATR-BGR method in Algorithm A1 (refer to Appendix A).
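Reusing the (J, T, 2) array layout assumed in the knowledge-construction sketch above, Equations (14) and (15) amount to a masked weighted sum followed by an argmax:

```python
import numpy as np

def recommend(UP: np.ndarray, slots: list, W_U: float, W_P: float) -> int:
    """Eqs. (14)-(15): score every candidate model on the subgraph induced
    by the image's attribute subclasses and return the index j* of the best.

    `UP` has shape (J, T, 2); `slots` holds one flattened subclass index
    per attribute group, e.g. from attribute_slots(...) above.
    """
    U_sub = UP[:, slots, 0]   # shape (J, A): accuracy edge weights
    P_sub = UP[:, slots, 1]   # shape (J, A): credibility edge weights
    score_model = W_U * U_sub.sum(axis=1) + W_P * P_sub.sum(axis=1)  # Eq. (14)
    return int(np.argmax(score_model))                               # Eq. (15)

# Example (hypothetical values): pick the model for one image's attributes.
# j_star = recommend(UP, attribute_slots(9.5, 0.0, 12.0, 50.0, False),
#                    W_U=0.7, W_P=0.3)
```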

3. Experiments and Results

3.1. Evaluation Metrics

This article will quantitatively evaluate the predictive accuracy and reliability of the models. As in other image recognition research, metrics such as $Accuracy$ ($Acc$), $Precision$ ($Prec$), $Recall$ ($Rec$), and the $F1$ $score$ ($F1$) will be used to evaluate the models' performance in predicting categories. $Acc$ represents the ratio of correctly identified samples to the total number of samples; $Prec$ indicates the proportion of samples predicted as positive that are truly positive; $Rec$ refers to the ratio of actual positive samples that are correctly predicted as positive; and $F1$ is the harmonic mean of $Prec$ and $Rec$, thus providing a comprehensive assessment of the model's performance. The calculation formulas for each metric are as follows:
$$Acc = \frac{tp + tn}{tp + fp + fn + tn}$$
$$Rec = \frac{tp}{tp + fn}$$
$$Prec = \frac{tp}{tp + fp}$$
$$F1 = \frac{2 \cdot Rec \cdot Prec}{Rec + Prec}$$
where $tp$ and $tn$ represent the numbers of true positive and true negative samples, while $fp$ and $fn$ represent the numbers of false positive and false negative samples, respectively. In Section 2.2.2, the paper compares the Euclidean distance of model output values before and after image masking to construct a knowledge matrix.
Similarly, we evaluate the credibility of the output values using the Euclidean distance and, with Formula (13) as a reference, design the Euclidean distance mean of samples ($EDMS$) as the evaluation metric. A larger $EDMS$ indicates better reliability. The calculation formula is as follows:
$$EDMS = \frac{1}{n}\sum_{i \in \mathrm{test\ set}} \eta_i \left(1 - \mathrm{score}_{\mathrm{softmax}}^{i,d}\right)$$
where $\eta_i = 1$ when sample $i$ is predicted correctly, and $\eta_i = -1$ otherwise.
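A minimal sketch of the $EDMS$ formula above follows; we read $\mathrm{score}_{\mathrm{softmax}}^{i,d}$ as the normalized distance score of the model selected for sample $i$, which the notation leaves implicit:

```python
import numpy as np

def edms(s_softmax_d: np.ndarray, correct: np.ndarray) -> float:
    """Mean of eta_i * (1 - score_softmax^{i,d}) over the test set, with
    eta_i = +1 for correctly predicted samples and -1 otherwise.

    `s_softmax_d` and `correct` each have one entry per test sample.
    """
    eta = np.where(correct, 1.0, -1.0)
    return float(np.mean(eta * (1.0 - s_softmax_d)))
```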

3.2. Experimental Settings

The proposed SIATR-BGR method is validated using the ship IR simulation data from Section 2.1, with the data divided into training, validation, and test sets in a ratio of 7:1:2. We utilize six CNN models, namely ResNet18 [33], SqueezeNet1_1 [34], DenseNet121 [35], MobileNet_v3_large [36], MnasNet1_3 [37], and ShuffleNet_v2_x1_0 [38], as candidate models for constructing the SIATR-BGR framework (hereinafter, the CNN version numbers are omitted; for example, ResNet refers to ResNet18). For model training, we used an 11 GB RTX 2080 Ti GPU running in the PyTorch environment. Each CNN model was trained for 100 epochs, with a batch size of 32 and a learning rate of 0.01. We set the momentum to 0.9, and parameter updates were performed using the stochastic gradient descent (SGD) optimizer. We evaluated each CNN model on the test set; a summary of the performance of each model is shown in Table 4, where MobileNet outperforms the other models across all metrics.
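For reproducibility, the stated training configuration can be sketched in PyTorch as follows; the cross-entropy loss and the torchvision model constructor are assumptions, since the text specifies only the optimizer, learning rate, momentum, batch size, and epoch count:

```python
import torch
import torchvision.models as models

# One of the six candidate CNNs; the choice of loss is an assumption.
model = models.resnet18(num_classes=3)      # 3 ship categories
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()

def train_one_epoch(loader):
    """One pass over the training loader (batches of 32 images)."""
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

# for epoch in range(100):
#     train_one_epoch(train_loader)
```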
In the preceding sections, Formulas (10) and (13) used $\alpha$ and $\beta$ as penalty factors for calculating the weights. We varied $\alpha$ and $\beta$ within the range [0, 3] with a step size of 0.1 and employed a grid search on the validation set to determine their optimal values. Figure 6 shows the results of the penalty-factor search: the best $Acc$, 95.91%, was achieved at $\alpha = 0.4$ ($W_U = 1$, $W_P = 0$), while the best $EDMS$, 0.7787, was achieved at $\beta = 0.6$ ($W_U = 0$, $W_P = 1$).

3.3. Results and Discussion

In this section, we first consider only recognition performance as the evaluation criterion and present an SIATR recommendation system that optimizes overall accuracy performance. Secondly, we only consider credibility as the evaluation criterion and present an SIATR model recommendation system that optimizes overall credibility. Finally, we design and construct a recommendation system that considers both recognition accuracy and credibility.

3.3.1. The Recommendation System Aims to Improve the Accuracy

To further validate the effectiveness of our method, we will also compare the performance of our approach with the voting ensemble learning method [39] (Hard Voting and Soft Voting). Hard Voting determines the final predicted label through a majority vote. In contrast, Soft Voting takes into account the predicted probabilities of each model and performs a weighted average to determine the final predicted label. The computation formulas for them are as follows:
$$\mathrm{Predicted\_label}_{\mathrm{Hard\_Voting}} = \arg\max_l \sum_{j=1}^{J} I(y_{j,i} = l)$$
$$\mathrm{Predicted\_label}_{\mathrm{Soft\_Voting}} = \arg\max_l \frac{1}{J}\sum_{j=1}^{J} f_{\mathrm{model}\,j}^{l}(\mathrm{image}\ i)$$
where $y_{j,i}$ represents the predicted label of $\mathrm{model}\ j$ for image $i$; $I(\cdot)$ denotes the indicator function, which takes the value 1 when $y_{j,i} = l$ and 0 otherwise; and $f_{\mathrm{model}\,j}^{l}(\mathrm{image}\ i)$ represents the predicted score of $\mathrm{model}\ j$ for label $l$.
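Both baselines reduce to a few lines of NumPy; the sketch below assumes integer class labels, and tie-breaking, which the formulas leave unspecified, here falls to the lowest class index:

```python
import numpy as np

def hard_voting(labels: np.ndarray) -> int:
    """Majority vote over the J models' predicted labels (shape (J,))."""
    return int(np.bincount(labels).argmax())

def soft_voting(probs: np.ndarray) -> int:
    """Average the J models' class-score vectors (shape (J, L)) and pick
    the class with the highest mean score."""
    return int(probs.mean(axis=0).argmax())
```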
Table 5 shows a performance comparison between the SIATR-BGR method ($\alpha = 0.4$, $W_U = 1$, $W_P = 0$), the best candidate model MobileNet, and the two voting algorithms. Our SIATR-BGR method achieves 95.86% across all four metrics on the test set. Taking $Acc$ as an example, compared to MobileNet, Soft Voting, and Hard Voting, the performance of our method is higher by 1.06%, 0.63%, and 0.58%, respectively, which demonstrates the superiority of the SIATR-BGR method.
Additionally, to further analyze the performance of our proposed method compared to other approaches in specific scenarios, we generated an $Acc$ heatmap depicting the relationship between sample attributes and candidate models (refer to Figure 7). The heatmap reveals performance differences among the candidate models, making it challenging for a single model to establish absolute superiority across all conditions; instead, a relative balance of advantages is maintained among the candidate models. For example, among the various candidate models, SqueezeNet exhibits the best performance under the "Far" condition (94.63%), while it performs the worst under the "Zenith_low" condition (91.53%). In terms of the time conditions, encompassing the "Nighttime", "Morning and Evening", and "Daytime" attributes, the best performances are achieved by MobileNet (88.95%), ResNet (98.39%), and SqueezeNet (98.10%), respectively, rather than a single model consistently ranking first across all conditions. Compared to the candidate models, the SIATR-BGR method demonstrates excellent recognition performance across the various scenarios. This further validates the original intention behind the design of our recommendation system: to adaptively recommend the optimal model based on the differences in recognition performance across different scenarios, thereby ensuring overall performance superiority over consistently using a single model. Furthermore, compared to the two voting algorithms, the SIATR-BGR method is inferior to Soft Voting and Hard Voting only under the "Far" and "Zenith_low" conditions, respectively; in most scenarios, our method achieves the highest $Acc$. Observing Figure 7, it is also noticeable that there are performance variations among the different methods within the same major scenario. Specifically, under the "Morning and Evening" and "Daytime" attributes, the overall model performance is considerably better than under the "Nighttime" attribute. This is because, under nighttime conditions, the ship's hull emits little radiation outward owing to the lack of solar radiation and lower temperatures; this results in a less clear ship contour, which negatively affects recognition. Additionally, Figure 8 provides the corresponding confusion matrix for our method, which shows relatively balanced predictions for the three categories.

3.3.2. The Recommendation System Aims to Improve the Credibility

The SIATR-BGR method in this study (with parameters $\beta = 0.6$, $W_U = 0$, and $W_P = 1$) exhibits an $EDMS$ of 0.7781 on the test set, an improvement of 0.0274 over MobileNet, which achieves the highest $EDMS$ among the candidate models. Analogous to Figure 7, a heatmap depicting the $EDMS$ between sample attributes and candidate models is presented in Figure 9. Only under the "Far" condition is the $EDMS$ of our approach, 0.7201, lower than MnasNet's 0.7319; in all other scenarios, our method demonstrates superior recognition credibility. Additionally, upon contrasting Figure 7 and Figure 9, it is evident that models with better overall $Acc$ performance under a specific attribute also tend to show superior $EDMS$ performance. For instance, the models under the "Near" condition outperform those under the "Far" condition in both Figure 7 and Figure 9. This suggests that a model's recognition accuracy and credibility are both significantly influenced by changes in the scene and exhibit a consistent pattern of variation.

3.3.3. The Recommendation System Aims to Improve the Accuracy and Credibility

In Section 3.3.1 and Section 3.3.2, we observed that the SIATR-BGR method performed best at optimizing accuracy and credibility independently when the penalty factors $\alpha$ and $\beta$ were set to 0.4 and 0.6, respectively. Based on these findings, we keep $\alpha = 0.4$ and $\beta = 0.6$ when developing a recommendation system that combines accuracy and credibility. In Figure 10, we present the changes in the validation set's $Acc$ and $EDMS$ for the SIATR-BGR method under different values of the weight $W_U$ ($W_P = 1 - W_U$). As $W_U$ gradually increases, $Acc$ correspondingly improves, while $EDMS$ exhibits the opposite trend. The shaded region $W_U \in [0.45, 0.8]$ in Figure 10 marks the segment where both metrics outperform the MobileNet model. When $W_U$ is between 0.7 and 0.8, the $Acc$ significantly surpasses that obtained for $W_U$ between 0.45 and 0.65, slightly exceeding the performance of Hard Voting on the test set. However, over the range of $W_U$ from 0.45 to 0.7, $EDMS$ decreases gradually. Comparing values across $W_U \in [0.45, 0.8]$, a $W_U$ of 0.7 achieves the best balance between $Acc$ and $EDMS$; the corresponding performance metrics of the SIATR-BGR method on the test set are presented in Table 6.
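The balance search of Figure 10 can be sketched as a one-dimensional sweep. The `evaluate` callable below is a placeholder for running the recommender of Section 2.2.3 over the validation set; it is not part of the paper's code, and the 0.05 step size is likewise an assumption:

```python
import numpy as np

def sweep_W_U(evaluate, step: float = 0.05) -> list:
    """Sweep W_U over [0, 1] with W_P = 1 - W_U; collect (W_U, Acc, EDMS).

    `evaluate(W_U, W_P)` is a caller-supplied (hypothetical) function that
    scores the recommender on the validation set for the given weights.
    """
    results = []
    for W_U in np.arange(0.0, 1.0 + 1e-9, step):
        acc, edms_val = evaluate(W_U, 1.0 - W_U)
        results.append((float(W_U), acc, edms_val))
    return results

# One way to pick the balance point of Figure 10: among settings where both
# metrics beat the best single model, keep the largest W_U (here 0.7).
```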

3.3.4. The Recommendation System with Reduced Resource Consumption

In this section, we analyze the relationship between resource consumption and the performance of the SIATR-BGR method. In Figure 11, we provide statistical plots of the number of model recommendations and the model sizes. Subfigure a of Figure 11 shows the number of times each model is recommended for the accuracy-optimization and credibility-optimization tasks. We found that the top three models in Table 4 account for a significant proportion of the total recommendations. For example, SqueezeNet, MobileNet, and MnasNet, which have higher accuracy, are primarily recommended for accuracy optimization, while ResNet, MobileNet, and MnasNet, which have higher $EDMS$, are mainly recommended for credibility optimization. Subfigure b compares the parameter counts and memory usage of the six models in the PyTorch environment. We observed that some models (DenseNet and ShuffleNet) are rarely recommended in practice, yet including them in the recommendation system increases its overall complexity and memory size. Especially for devices with limited memory, it is necessary to selectively reduce the resource consumption of the recommendation system.
In subfigure a of Figure 11, the combinations of the three most frequently recommended models for the two tasks are referred to as the dominant models for accuracy (DMA) and the dominant models for credibility (DMC). Furthermore, Table 7 and Table 8 provide a comparative analysis of the test-set performance of different candidate-model combinations for the two tasks. It is evident that the overall performance of SIATR-BGR is predominantly determined by the DMA (DMC), with the absence of the other low-performance candidate models having a relatively minor impact on the recommendation system's performance. This occurs because, when recommending a model for a given sample, even if the model chosen by the original full-size recommendation system is not available in the reduced recommendation system, the candidate model chosen by the latter typically remains capable of accurately recognizing the sample. To optimize both accuracy and credibility, and considering the limited frequency of recommendations for DenseNet and ShuffleNet across the two tasks, this study opts to use ResNet, SqueezeNet, MobileNet, and MnasNet to construct a resource-efficient SIATR-BGR recommendation system. When $W_U$ and $W_P$ are individually set to 1, the accuracy and credibility performances of this recommendation system align with DMA + ResNet ($\alpha = 0.4$) in Table 7 and DMC + SqueezeNet ($\beta = 0.7$) in Table 8, respectively. In contrast to the original SIATR-BGR model, the resource-efficient SIATR-BGR model exhibits a 28.35% reduction in memory size, with the $Acc$ and $EDMS$ metrics experiencing marginal decreases of 0.05% and 0.0027, respectively. Analogously to Table 6, we present the overall performance of the resource-efficient SIATR-BGR recommendation system under appropriate $W_U$ and $W_P$ settings in Table 9.

4. Conclusions

This paper focuses on the application of ship IR automatic target recognition (SIATR) technology, with a dedicated effort to enhance overall recognition performance and applicability under the constraint of a given number of CNN models and consistent performance levels. To achieve this objective, we propose an innovative method for adaptive recommendation models, called SIATR-BGR. This method, guided by optimization goals centered on recognition accuracy and feature learning credibility, selects six classes of CNN models as candidate models. By establishing a bipartite graph mapping relationship between sample attributes and models, our method can effectively reflect the relationships of influence using edge weight values. In the model recommendation phase, the method extracts corresponding subgraphs based on sample IR attributes and calculates the priority recommendation order of models with knowledge priors. The proposed method is validated using high-fidelity simulation data. Initially, we conduct separate analyses for the optimization of recognition accuracy and credibility. The results indicate that compared to the six candidate models, our approach effectively enhances various performance metrics. Subsequently, we further analyze the accuracy and credibility results under different recommendation biases of W U and W P , selecting appropriate values to better balance the performance of these two metrics. Finally, we explore the relationship between candidate model resource consumption and the performance of the recommendation system. We propose a recommendation system that considers resource consumption, accuracy, and credibility in a balanced manner. The introduction of this method not only provides new avenues and insights for improving the performance of SIATR tasks but also offers valuable references for similar studies in the SAR and visible light domains.
However, there are some limitations to our method's practical application. Firstly, to evaluate credibility, our method requires the bounding box information of the target, which means that it needs the support of accurate object detection algorithms in practical applications. Additionally, we have not yet tested the effectiveness of our method on real-world data. Going forward, we plan to further test and optimize our approach under appropriate conditions to promote its application in real-world scenarios.

Author Contributions

Conceptualization, H.Z. and C.L.; Methodology, H.Z., C.L., J.M. and H.S.; Software, H.Z. and J.M.; Validation, H.Z.; Data curation, J.M. and H.S.; Writing—original draft, H.Z.; Writing—review and editing, H.Z. and C.L.; Visualization, H.Z.; Supervision, C.L.; Project administration, C.L.; Funding acquisition, C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant number: 62105370).

Data Availability Statement

The data used in this study originate from internally developed modeling software that is not publicly available. Unfortunately, we are not authorized to publicly disclose the data generated by this software.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Algorithm A1. The proposed SIATR-BGR model-matching method.
Part 1: Knowledge construction
Input: $n$ ship IR images.
Output: Model–attribute weight matrix $UP = (U_{j,a}^t, P_{j,a}^t)$.
01: for $i = 1$ to $n$ do
02:      Obtain attribute interval $A^i$ according to Equation (1).
03:      Generate masked image $i$ using the method shown in Figure 2.
04:      for $j = 1$ to $J$ do
05:              Calculate model output scores $\mathrm{score}_{j,l}^{i}$ and $\mathrm{score}_{j,l}^{i,\mathrm{mask}}$ using Equations (2) and (3).
06:              Construct vectors $\mathbf{score}_j^{i}$ and $\mathbf{score}_j^{i,\mathrm{mask}}$ according to Equations (4) and (5).
07:      end for
              Reward–penalty calculation of weights:
08:              Construct matrices $SCORE^i$ and $SCORE^{i}_{\mathrm{mask}}$ according to Equations (8) and (11), respectively.
09:              Extract submatrix $SCORE^{i,l_i} = (\mathrm{score}_{j,l_i}^{i})$ from $SCORE^i$.
10:              Calculate $U_{j,i}$ using Equations (9) and (10) and $SCORE^{i,l_i}$.
11:              Construct matrix $ED = (\mathrm{score}_j^{i,d})$ using Equation (12).
12:              Calculate $P_{j,i}$ using Equation (9), Equation (13), and $ED$.
13:      Construct matrix $UP_i = (U_{j,a}^{t,i}, P_{j,a}^{t,i})$ using $U_{j,i}$, $P_{j,i}$, and Equation (6).
14: end for
15: Construct matrix $UP = (U_{j,a}^t, P_{j,a}^t)$ using Equation (7).
Part 2: Model-adaptive recommendation
Input: Unknown IR ship image $i$.
Output: Recommended model $\mathrm{Model}_{j^*}$.
16: Obtain attribute interval $A^i$ according to Equation (1).
17: Extract submatrix $UP_{\mathrm{sub}} = (U_{j,a}^{t_i}, P_{j,a}^{t_i})$ from $UP = (U_{j,a}^t, P_{j,a}^t)$.
18: Calculate $\mathrm{score\_model}_j^i$ using Equation (14).
19: Obtain $j^*$ using Equation (15).

References

  1. He, Y.; Deng, B.; Wang, H.; Cheng, L.; Zhou, K.; Cai, S.; Ciampa, F. Infrared machine vision and infrared thermography with deep learning: A review. Infrared Phys. Technol. 2021, 116, 103754.
  2. Pawar, S.; Gandhe, S. SAR (Synthetic Aperture Radar) Image Study and Analysis for Object Recognition in Surveillance. IJISAE 2023, 11, 552–573.
  3. Wang, N.; Wang, Y.; Er, M.J. Review on deep learning techniques for marine object recognition: Architectures and algorithms. Control. Eng. Pract. 2022, 118, 104458.
  4. Özertem, K.A. Key features for ATA/ATR database design in missile systems. In Proceedings of the Automatic Target Recognition XXVII, Anaheim, CA, USA, 10–11 April 2017; pp. 106–111.
  5. Alves, J.A.; Rowe, N.C. Recognition of Ship Types from an Infrared Image Using Moment Invariants and Neural Networks; Naval Postgraduate School: Monterey, CA, USA, 2001.
  6. Luo, Q.; Khoshgoftaar, T.M.; Folleco, A. Classification of ships in surveillance video. In Proceedings of the 2006 IEEE International Conference on Information Reuse & Integration, Waikoloa, HI, USA, 16–18 September 2006; pp. 432–437.
  7. Li, H.; Wang, X. Automatic recognition of ship types from infrared images using support vector machines. In Proceedings of the 2008 International Conference on Computer Science and Software Engineering, Wuhan, China, 12–14 December 2008; pp. 483–486.
  8. Dargan, S.; Kumar, M.; Ayyagari, M.R.; Kumar, G. A survey of deep learning and its applications: A new paradigm to machine learning. Arch. Comput. Methods Eng. 2020, 27, 1071–1092.
  9. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
  10. Zhang, M.M.; Choi, J.; Daniilidis, K.; Wolf, M.T.; Kanan, C. VAIS: A dataset for recognizing maritime imagery in the visible and infrared spectrums. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Boston, MA, USA, 7–12 June 2015; pp. 10–16.
  11. Karabayır, O.; Yücedağ, O.M.; Kartal, M.Z.; Serim, H.A. Convolutional neural networks-based ship target recognition using high resolution range profiles. In Proceedings of the 2017 18th International Radar Symposium (IRS), Prague, Czech Republic, 28–30 June 2017; pp. 1–9.
  12. Khellal, A.; Ma, H.; Fei, Q. Convolutional neural network based on extreme learning machine for maritime ships recognition in infrared images. Sensors 2018, 18, 1490.
  13. Ren, Y.; Yang, J.; Guo, Z.; Zhang, Q.; Cao, H. Ship classification based on attention mechanism and multi-scale convolutional neural network for visible and infrared images. Electronics 2020, 9, 2022.
  14. Huang, L.; Wang, F.; Zhang, Y.; Xu, Q. Fine-grained ship classification by combining CNN and swin transformer. Remote Sens. 2022, 14, 3087.
  15. Zhang, Y.; Lei, Z.; Yu, H.; Zhuang, L. Imbalanced high-resolution SAR ship recognition method based on a lightweight CNN. IEEE Geosci. Remote Sens. Lett. 2021, 19, 1–5.
  16. Liu, B.; Xiao, Q.; Zhang, Y.; Ni, W.; Yang, Z.; Li, L. Intelligent recognition method of low-altitude squint optical ship target fused with simulation samples. Remote Sens. 2021, 13, 2697.
  17. Sharifzadeh, F.; Akbarizadeh, G.; Seifi Kavian, Y. Ship classification in SAR images using a new hybrid CNN–MLP classifier. J. Indian Soc. Remote Sens. 2019, 47, 551–562.
  18. Teixeira, E.; Araujo, B.; Costa, V.; Mafra, S.; Figueiredo, F. Literature Review on Ship Localization, Classification, and Detection Methods Based on Optical Sensors and Neural Networks. Sensors 2022, 22, 6879.
  19. Li, J.; Yu, Z.; Yu, L.; Cheng, P.; Chen, J.; Chi, C. A Comprehensive Survey on SAR ATR in Deep-Learning Era. Remote Sens. 2023, 15, 1454.
  20. Baorong, X.; Yu, Z.; Shuyi, F.; Xian, L.; Songfeng, D. Research of the infrared ship target recognition technology based on the complex background. In Proceedings of the 2018 IEEE/ACIS 17th International Conference on Computer and Information Science (ICIS), Singapore, 6–8 June 2018; pp. 850–854.
  21. Dong, X.; Yu, Z.; Cao, W.; Shi, Y.; Ma, Q. A survey on ensemble learning. Front. Comput. Sci. 2020, 14, 241–258.
  22. Mohammed, A.; Kora, R. A comprehensive review on ensemble deep learning: Opportunities and challenges. J. King Saud Univ. Comput. Inf. Sci. 2023, 35, 757–774.
  23. Bai, X.; Wang, X.; Liu, X.; Liu, Q.; Song, J.; Sebe, N.; Kim, B. Explainable deep learning for efficient and robust pattern recognition: A survey of recent developments. Pattern Recognit. 2021, 120, 108102.
  24. Tanimoto, S.L.; Itai, A.; Rodeh, M. Some matching problems for bipartite graphs. J. ACM 1978, 25, 517–525.
  25. Westlake, S.T.; Volonakis, T.N.; Jackman, J.; James, D.B.; Sherriff, A. Deep learning for automatic target recognition with real and synthetic infrared maritime imagery. In Proceedings of the Artificial Intelligence and Machine Learning in Defense Applications II, Online, 21–25 September 2020; pp. 41–53.
  26. Ward, C.M.; Harguess, J.; Hilton, C. Ship classification from overhead imagery using synthetic data and domain adaptation. In Proceedings of the OCEANS 2018 MTS/IEEE Charleston, Charleston, SC, USA, 22–25 October 2018; pp. 1–5.
  27. Rizaev, I.G.; Achim, A. SynthWakeSAR: A synthetic SAR dataset for deep learning classification of ships at sea. Remote Sens. 2022, 14, 3999.
  28. Mathew, A.; Amudha, P.; Sivakumari, S. Deep learning techniques: An overview. Adv. Mach. Learn. Technol. Appl. Proc. AMLTA 2021, 1141, 599–608.
  29. Mumtaz, A.; Jabbar, A.; Mahmood, Z.; Nawaz, R.; Ahsan, Q. Saliency based algorithm for ship detection in infrared images. In Proceedings of the 2016 13th International Bhurban Conference on Applied Sciences and Technology (IBCAST), Islamabad, Pakistan, 12–16 January 2016; pp. 167–172.
  30. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
  31. Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part I; pp. 818–833.
  32. Kortylewski, A.; He, J.; Liu, Q.; Yuille, A.L. Compositional convolutional neural networks: A deep architecture with innate robustness to partial occlusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 8940–8949.
  33. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  34. Iandola, F.N.; Han, S.; Moskewicz, M.W.; Ashraf, K.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv 2016, arXiv:1602.07360.
  35. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
  36. Howard, A.; Sandler, M.; Chu, G.; Chen, L.-C.; Chen, B.; Tan, M.; Wang, W.; Zhu, Y.; Pang, R.; Vasudevan, V. Searching for MobileNetV3. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1314–1324.
  37. Tan, M.; Chen, B.; Pang, R.; Vasudevan, V.; Sandler, M.; Howard, A.; Le, Q.V. MnasNet: Platform-aware neural architecture search for mobile. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2820–2828.
  38. Ma, N.; Zhang, X.; Zheng, H.-T.; Sun, J. ShuffleNet V2: Practical guidelines for efficient CNN architecture design. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 116–131.
  39. Leon, F.; Floria, S.-A.; Bădică, C. Evaluating the effect of voting methods on ensemble-based classification. In Proceedings of the 2017 IEEE International Conference on INnovations in Intelligent Systems and Applications (INISTA), Gdynia, Poland, 3–5 July 2017; pp. 1–6.
Figure 1. Example images and bounding-box localization for the dataset: cruise ship (a), warship (b), and container freighter (c). Panels (b,c) also illustrate the diversity of target imaging brightness variations and posture distributions, using the warship and container freighter as examples, respectively.
Figure 2. The generation process of masking the target background area.
Figure 3. The basic framework of the SIATR-BGR method.
Figure 4. The acquisition method of knowledge construction in the SIATR-BGR method. (a) The calculation process of the weight matrix $UP$; (b) three candidate model examples and the acquisition method of the weights $U_{j,i}$ and $P_{j,i}$ for a single sample whose true label is 0.
Figure 5. Illustration of model-adaptive recommendation for the SIATR-BGR method. (a) Candidate model selection in the form of a bipartite graph, and (b) the corresponding matrix numerical computation method.
Figure 6. The impact of the penalty factors $\alpha$ ($W_U = 1$, $W_P = 0$) and $\beta$ ($W_U = 0$, $W_P = 1$) on the $Acc$ and $EDMS$ of the SIATR-BGR recommendation system under different search values.
Figure 7. Heat map matrix of the various methods under multi-class scenarios. The numerical values in each cell represent the $Acc$ of the method in the corresponding scenarios.
Figure 8. The confusion matrix of the SIATR-BGR method in this paper when $\alpha = 0.4$ ($W_U = 1$, $W_P = 0$). (a) Class-wise counts; (b) the corresponding percentages.
Figure 9. Heat map matrix of the various methods under multi-class scenarios. The numerical values in each cell represent the $EDMS$ of the method in the corresponding scenarios.
Figure 9. Heat map matrix of various methods under multi-class scenarios. The numerical values in each cell of the figure represent the E D M S of the method in the corresponding scenarios.
Mathematics 12 00168 g009
Figure 10. The impact of varying values of W_U on the Acc and EDMS performance of the SIATR-BGR recommendation system under conditions α = 0.4 and β = 0.6.
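The balance study of Figure 10 can be scripted as a sweep over W_U. We assume W_P = 1 − W_U, which is consistent with the reported settings, though the constraint is not stated explicitly; evaluate_weights() is again a placeholder.

```python
import numpy as np

def evaluate_weights(W_U, W_P, alpha, beta):
    """Placeholder: run the recommendation system with the given objective weights
    and return (Acc, EDMS) on the test set."""
    raise NotImplementedError

for W_U in np.linspace(0.0, 1.0, 11):
    W_P = 1.0 - W_U  # assumed complementary weighting (our reading)
    acc, edms = evaluate_weights(W_U, W_P, alpha=0.4, beta=0.6)
    print(f"W_U={W_U:.1f}  Acc={acc:.2f}%  EDMS={edms:.4f}")
```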
Figure 11. Statistics of the recommendation frequencies of the candidate models relative to their sizes. (a) The recommendation frequency of each model under the settings α = 0.4, W_U = 1, W_P = 0 and β = 0.4, W_U = 0, W_P = 1; (b) the resource consumption statistics of each model.
Table 1. Some recent research on SATR tasks based on deep learning.

| Reference | Data Source | Method Description | Result |
| --- | --- | --- | --- |
| Khellal et al. [12] | IR | CNN + extreme learning machine | Compared to a backpropagation-trained CNN, the method achieves better accuracy and faster training. |
| Ren et al. [13] | IR/visible light | Multi-scale CNN + attention mechanism | The proposed method outperforms traditional machine learning methods and some CNN-based methods. |
| Huang et al. [14] | Visible light | CNN + Swin Transformer | The parallel network structure extracts features more fully; it performs best among multiple image recognition models. |
| Liu et al. [15] | Visible light | CNN + feature fusion mechanism + sample-size supplementation using simulation data | Compared to the backbone CNN, the feature fusion mechanism and sample-size supplementation effectively improve overall recognition performance. |
| Sharifzadeh et al. [16] | SAR | CNN + multilayer perceptron | Compared with a CNN or multilayer perceptron alone, the hybrid approach extracts features better and achieves higher accuracy. |
| Zhang et al. [17] | SAR | CNN + deep metric learning + gradually balanced sampling | The residual network embedded with the new mechanisms performs more accurately on multiple public datasets. |
Table 2. IR attribute information of the dataset.

| Elements | Attributes |
| --- | --- |
| Classes | Container freighter (CF), cruise ship (CS), warship (WS) |
| Radiation factors | Time, air temperature, water temperature, maritime area, motion state, wind speed, and weather (sunny, rainy, foggy, etc.) |
| Camera shooting factors | Distance, azimuth, zenith angle |

Note: in the simulated environment, the number of samples for each class is equal.
Table 3. The selected IR attributes and their division information for constructing the recommendation system.

| Attributes | Classes | Interval |
| --- | --- | --- |
| Time | Daytime | 8:00–17:00 |
| | Morning and evening | 5:00–8:00, 17:00–20:00 |
| | Nighttime | 20:00–5:00 |
| Motion state | Moving | Speed greater than 0 |
| | Static | Speed equals 0 |
| Air temperature | Temperature_high | 0–10 °C |
| | Temperature_low | 10–20 °C |
| Zenith angle | Zenith_high | Angle greater than 45° |
| | Zenith_low | Angle less than 45° |
| Distance | Near | 3–6 km |
| | Far | 6–9 km |
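Mapping a raw sample onto these divisions is a simple binning step; a sketch for three of the attributes follows, where the field names ('hour', 'speed', 'distance_km') are hypothetical.

```python
def discretize(sample: dict) -> list:
    """Map raw IR attributes to the class divisions of Table 3 (illustrative)."""
    hour, speed, dist = sample["hour"], sample["speed"], sample["distance_km"]
    if 8 <= hour < 17:
        time_cls = "Daytime"
    elif 5 <= hour < 8 or 17 <= hour < 20:
        time_cls = "Morning and evening"
    else:
        time_cls = "Nighttime"
    return [
        time_cls,
        "Moving" if speed > 0 else "Static",
        "Near" if dist <= 6 else "Far",
    ]
```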
Table 4. Comparison of the performance of the candidate CNN models on the test set.

| CNN Model | Acc (%) | Prec (%) | Rec (%) | F1 (%) | EDMS |
| --- | --- | --- | --- | --- | --- |
| ResNet | 93.83 | 93.90 | 93.83 | 93.85 | 0.7455 |
| SqueezeNet | 94.46 | 94.48 | 94.46 | 94.45 | 0.7386 |
| DenseNet | 93.35 | 94.05 | 93.35 | 93.41 | 0.7382 |
| MobileNet | 94.80 | 94.90 | 94.80 | 94.81 | 0.7507 |
| MnasNet | 94.36 | 94.41 | 94.36 | 94.37 | 0.7405 |
| ShuffleNet | 93.64 | 93.65 | 93.64 | 93.65 | 0.7049 |

Note: since the number of samples in each class is equal in the test set, the Acc and Rec of each model are equal.
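The four accuracy-family metrics in Table 4 can be computed with scikit-learn. The paper does not state the averaging mode; macro averaging is a natural choice given the class-balanced test set, so that assumption is made here.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def classification_metrics(y_true, y_pred):
    """Return (Acc, Prec, Rec, F1) in percent, macro-averaged over the classes."""
    acc = accuracy_score(y_true, y_pred)
    prec, rec, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="macro")
    return tuple(100.0 * m for m in (acc, prec, rec, f1))
```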
Table 5. Comparison of the prediction accuracy of the SIATR-BGR method against MobileNet, Soft Voting, and Hard Voting.

| Model/Recommendation System | Acc (%) | Prec (%) | Rec (%) | F1 (%) |
| --- | --- | --- | --- | --- |
| SIATR-BGR (α = 0.4) | 95.86 | 95.86 | 95.86 | 95.86 |
| Hard Voting | 95.28 | 95.30 | 95.28 | 95.28 |
| Soft Voting | 95.23 | 95.31 | 95.23 | 95.24 |
| MobileNet | 94.80 | 94.90 | 94.80 | 94.81 |
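For reference, the two ensemble baselines in Table 5 are standard constructions [39]: hard voting takes a majority vote over the candidate models' predicted labels, while soft voting averages their class-probability outputs before taking the argmax.

```python
import numpy as np

def hard_voting(labels: np.ndarray) -> np.ndarray:
    """Majority vote over predicted labels; `labels` has shape (n_models, n_samples).
    Ties resolve to the smallest class index."""
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, labels)

def soft_voting(probs: np.ndarray) -> np.ndarray:
    """Average the models' class probabilities; `probs` has shape
    (n_models, n_samples, n_classes)."""
    return probs.mean(axis=0).argmax(axis=-1)
```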
Table 6. The performance metrics of the SIATR-BGR recommendation system with six candidate models when jointly optimizing accuracy and credibility.

| Configuration | Acc (%) | Prec (%) | Rec (%) | F1 (%) | EDMS |
| --- | --- | --- | --- | --- | --- |
| SIATR-BGR (α = 0.4, β = 0.6, W_U = 0.7) | 95.23 | 95.31 | 95.23 | 95.24 | 0.7578 |
Table 7. The performance and resource consumption of the SIATR-BGR recommendation system under different combinations of candidate models when accuracy (W_U = 1) optimization is the objective.

| Evaluation Metrics | DMA (α = 0.5) | DMA + DenseNet (α = 0.5) | DMA + ResNet (α = 0.4) | DMA + ShuffleNet (α = 0.4) | DMA + DenseNet + ResNet (α = 0.4) | DMA + DenseNet + ShuffleNet (α = 0.5) | DMA + ResNet + ShuffleNet (α = 0.5) | All Models (α = 0.4) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Acc (%) | 95.62 | 95.71 | 95.81 | 95.76 | 95.81 | 95.76 | 95.86 | 95.86 |
| Prec (%) | 95.62 | 95.72 | 95.81 | 95.76 | 95.81 | 95.76 | 95.86 | 95.86 |
| Rec (%) | 95.62 | 95.71 | 95.81 | 95.76 | 95.81 | 95.76 | 95.86 | 95.86 |
| F1 (%) | 95.62 | 95.71 | 95.81 | 95.75 | 95.81 | 95.75 | 95.86 | 95.86 |
| Parameters (million) | 13.00 | 20.98 | 24.69 | 14.37 | 32.67 | 22.35 | 26.06 | 34.04 |
| Memory size (MB) | 38.39 | 65.49 | 81.09 | 43.37 | 108.19 | 70.47 | 86.07 | 113.17 |
Table 8. The performance and resource consumption of the SIATR-BGR recommendation system under different combinations of candidate models when credibility (W_P = 1) optimization is the objective.

| Evaluation Metrics | DMC (β = 0.8) | DMC + SqueezeNet (β = 0.7) | DMC + DenseNet (β = 0.8) | DMC + ShuffleNet (β = 0.5) | DMC + SqueezeNet + DenseNet (β = 0.6) | DMC + SqueezeNet + ShuffleNet (β = 0.6) | DMC + DenseNet + ShuffleNet (β = 0.8) | All Models (β = 0.6) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EDMS | 0.7657 | 0.7754 | 0.7737 | 0.7682 | 0.7781 | 0.7778 | 0.7778 | 0.7781 |
| Parameters (million) | 23.45 | 24.69 | 31.43 | 24.82 | 32.67 | 26.06 | 32.80 | 34.04 |
| Memory size (MB) | 78.30 | 81.09 | 105.40 | 83.28 | 108.19 | 86.07 | 110.38 | 113.17 |
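The resource columns of Tables 7 and 8 can be approximated directly from the model weights, assuming the candidates are PyTorch modules: parameters are summed over the combination, and memory is estimated from 32-bit weights. This is a rough estimate and need not match the reported file sizes exactly.

```python
import torch

def combo_cost(models):
    """Return (parameters in millions, approximate float32 memory in MB)
    for a combination of candidate PyTorch models."""
    n_params = sum(p.numel() for m in models for p in m.parameters())
    return n_params / 1e6, n_params * 4 / 1024**2
```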
Table 9. The performance metrics of the SIATR-BGR recommendation system with three candidate models when jointly optimizing accuracy and credibility.

| Configuration | Acc (%) | Prec (%) | Rec (%) | F1 (%) | EDMS |
| --- | --- | --- | --- | --- | --- |
| SIATR-BGR (α = 0.4, β = 0.6, W_U = 0.6) | 95.13 | 95.14 | 95.13 | 95.13 | 0.7552 |