Article

Total Variation Weighted Low-Rank Constraint for Infrared Dim Small Target Detection

1
Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
2
University of Chinese Academy of Sciences, Beijing 100039, China
3
Key Laboratory of Space-Based Dynamics and Rapid Optical Imaging Technology, Chinese Academy of Sciences, Changchun 130033, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(18), 4615; https://doi.org/10.3390/rs14184615
Submission received: 2 August 2022 / Revised: 7 September 2022 / Accepted: 10 September 2022 / Published: 15 September 2022

Abstract

Infrared dim small target detection is a critical technology in the field of situational awareness. Detection algorithms that combine the infrared patch image (IPI) model with a total variation term are a recent research focus, but they exhibit an obvious staircase effect, which reduces detection accuracy to some extent. This paper investigates the accurate detection of infrared dim small targets and proposes a novel method based on a total variation weighted low-rank constraint (TVWLR). Using the overlapping edge information of the image background structure, the weights that constrain the low-rank term are determined adaptively, which suppresses the staircase effect and enhances details. Moreover, an optimization algorithm combined with the augmented Lagrange multiplier method is proposed to solve the TVWLR model. Experimental results on multiple image sequences indicate that the proposed algorithm clearly improves detection accuracy, as measured by the receiver operating characteristic (ROC) curve, background suppression factor (BSF) and signal-to-clutter ratio gain (SCRG). Furthermore, the proposed method is more robust under complex background conditions such as buildings and trees.


1. Introduction

An infrared imaging system forms images from the target radiation received by its sensor, which gives it the advantages of passive imaging, small size and insensitivity to the environment. As a crucial technique of situational awareness, the detection and tracking of infrared dim and small targets plays a significant role in precision strike, perception and early warning systems [1] and has been extensively studied [2,3,4]. Generally, the overall contrast and signal-to-noise ratio of infrared images are low, so the target is easily submerged in background clutter and noise and is difficult to distinguish. Furthermore, the target occupies few pixels. The Society of Photo-optical Instrumentation Engineers (SPIE) defines an infrared dim and small target as one whose imaging area is less than 0.12% of the total pixel number, that is, no more than 81 pixels in a 256 × 256 image [5]. In the actual imaging process, however, the target is often much smaller than this and lacks shape and texture features, which significantly increases the difficulty of target detection. Therefore, infrared dim target detection has become a challenging and active research topic.
Generally, infrared dim small target-detection algorithms are divided into multi-frame detection and single-frame detection. Multi-frame detection uses the continuity and correlation of moving targets across multiple frames to achieve detection, while single-frame detection uses a single image to extract the gradient, grayscale, contrast and other characteristics of the small target. Compared with multi-frame detection, single-frame detection has the advantages of low complexity, high execution efficiency and easy hardware implementation.
The traditional single-frame-detection algorithm divides an infrared image into the background region, the target region and the noise region and the model is as follows:
$$ f_D(i,j) = f_E(i,j) + f_T(i,j) + f_B(i,j) \tag{1} $$
where $(i,j)$, $f_D$, $f_E$, $f_T$ and $f_B$ represent the position of a specific pixel, the infrared small target image, the noise region, the target region and the background region, respectively.
It can be seen that the performance of conventional algorithms depends on assumptions about the background and target, which has great limitations. Since the image has nonlocal autocorrelation properties, $f_B(i,j)$ can be considered a low-rank matrix. In the meantime, since the target occupies very few image pixels, $f_T(i,j)$ can be treated as a sparse matrix. Therefore, the traditional infrared image model is extended to the infrared patch image (IPI) model.
The IPI model regards target detection as the optimization problem of separating a sparse matrix from a low-rank matrix, which is solved by using principal component analysis. The target image is redefined by the IPI model as follows:
$$ D = E + T + B \tag{2} $$
where $D$, $E$, $T$ and $B$ represent the infrared patch image, noise region, target region and background region, respectively.
Neglecting the noise part, target detection is carried out by constraining the sparse matrix and the low-rank matrix, respectively:
$$ \min_{B,T} \|B\|_* + \lambda \|T\|_1, \quad \text{s.t.} \quad D = T + B \tag{3} $$
where $\lambda$ is the positive balance parameter, $\|\cdot\|_*$ represents the nuclear norm and $\|\cdot\|_1$ represents the $l_1$ norm.
However, constraining the sparse term with the $l_1$ norm leaves part of the background in the target image or excessively shrinks the target. Some low-brightness non-target points also appear sparse under the $l_1$ norm constraint, which causes false detections. Moreover, when there is a strong edge in the image, edge residuals remain in the target image and the estimated background becomes blurred. This paper proposes the TVWLR model to address these issues. The overlapping edge information and the total variation ($TV$) regularization term are combined to characterize the background structural features, and the constraint on the low-rank term is strengthened to reduce the false detection rate. Meanwhile, we employ an adaptively weighted low-rank term to estimate the background image accurately.
The following are the main contributions of this paper:
(1) Considering the problem that it is difficult to accurately detect targets in complicated backgrounds, a total variational weighted low-rank constraint method is proposed. The proposed method strengthens the constraints on low-rank terms, which can better evaluate the background image and improves target-detection probability.
(2) By applying overlapping edge information (OEI) to determine the weights that constrain low-rank terms, the staircase effect is effectively suppressed. Meanwhile, the $l_{2,1}$ norm is introduced to remove strong edges so as to solve the problem of false detection caused by low-brightness non-target points.
(3) An optimization algorithm combined with the alternating direction method of multipliers (ADMM) is given to resolve the TVWLR model accurately. Moreover, the solution process is simplified by using tolerance error as a stopping condition.
(4) We conduct many experiments on some of the scene images after determining the specific values of the pivotal parameters. The feasibility of the suggested method is verified by qualitative and quantitative analysis of the experimental results.
The following shows the organization of the remaining parts of this paper: Section 2 briefly introduces the related works on infrared small target detection; we describe in detail the process of proposing the TVWLR model and related optimization methods in Section 3; we carry out experiments on six sequential images and conduct qualitative and quantitative analysis, respectively, in Section 4; Section 5 is the discussion; Section 6 summarizes the conclusions of this paper.

2. Related Works

2.1. Sequence Image-Detection Methods

Sequence image-detection methods require multi-frame image information, which leads to low detection efficiency and poor practicability. When the background is uniformly distributed, methods such as dynamic programming [6], spatial filtering [7] and matched filtering [8] suppress the background well. In practice, however, the image detector and the target move quickly relative to each other, so a uniform background is difficult to guarantee and detection performance is poor [9].

2.2. Single-Frame Image-Detection Methods

Single-frame methods utilize the gray-level and contrast characteristics of the image to achieve detection, with low complexity and high detection efficiency. They mainly comprise traditional filtering methods, methods based on human vision, optimization-based methods and deep-learning-based methods.
Traditional filtering methods such as Tophat transform [10], maximum mean and maximum median [11] utilize the residual image of the original image and the filtered image to achieve target enhancement and background noise suppression. Due to the background complexity of the actual application environment, the algorithms mentioned above usually cannot meet the detection accuracy requirements.
The methods based on human vision take the saliency of the target in the adjacent area as the detection basis. Based on the spatially discontinuous features of the target [12], Chen et al. developed the local contrast map (LCM) algorithm [13]. The gray difference of a 3 × 3 neighborhood is utilized to estimate the saliency of pixels in the neighborhood. To improve the detection speed of the algorithm, an improved local contrast metric method (ILCM) was proposed by Han et al. [14]. Based on the characteristics of bright and dark targets, Wei et al. created a multi-scale patch-based contrast method (MPCM) [15]. By using the matching filter and the principle of closest mean, Han et al. designed an enhanced closest-mean background estimation (ECMBE) model to suppress high brightness backgrounds and improve the signal-to-noise ratio [16]. Bai et al. fused the contrast measurement mapping derived from different derivative sub-bands and proposed a contrast measurement method based on derivative entropy (DECM) [17]. However, these methods depend on the brightness difference between the background and target and cannot achieve ideal detection results when the brightness difference is low.
The optimization-based methods treat target detection as an ill-posed inverse problem. Combining this idea, Gao et al. [18] created the IPI model, which exploited the nonlocal autocorrelation of the background to turn target detection into an optimization problem of the background matrix and the target matrix. According to the thermal characteristics of the target, Dai et al. developed a non-negative infrared patch-image model (NIPPS) [19,20]. Zhang et al. [21] introduced an advanced local prior graph that simultaneously encodes background-related and target-related information and proposed a detection method incorporating the partial sum of tensor kernel norm (PSTNN), which can significantly reduce the algorithm complexity and computational time. Wang et al. [22] designed the total variation regularization and principal component pursuit (TV-PCP) model to effectively preserve the background edge information. Zhang et al. [23] used self-regularization terms to describe background features and devised the self-regularized weighted sparse model (SRWS). The above methods reach good detection results, but the detection accuracy of images with special strong edges is poor and the false alarm rate and missed detection rate are high.
The methods based on deep learning are the latest technology in the field of target detection. Wang et al. [24] adopted a dictionary learning method and considered the non-local characteristics of background and target and developed a more flexible stable multi-subspace learning model (SMSL). Shi et al. [25] designed a denoising autoencoder model (CDAE), which regarded small targets as noise, used a denoising autoencoder for denoising reconstruction and obtained a detection image by subtracting the original image from the reconstructed image. In order to improve the performance of network-detection targets, Du et al. [26] proposed a target-oriented shallow-deep features (TSDFs) model based on deep semantic features and shallow detail features of targets. Gao et al. [27] devised a feature mapping deep neural network (FMDNN) to solve the problem that small target features are difficult to extract. For star maps with non-uniform backgrounds, Xue et al. [28] designed a StarNet that employed pixel-level classification to quickly separate backgrounds and targets. To extract targets in cluttered backgrounds, Zhou et al. [29] proposed a 3D-based convolutional network that could reconstruct small targets. These methods show good detection ability. However, there are few infrared dim small target datasets publicly available at present, resulting in unsatisfactory robustness in diverse backgrounds.

3. Proposed Method

First, we briefly introduce the total variational models to characterize background features and preserve background information in this section. Second, we explain the concept and structure of overlapping edge information, which is utilized to constrain the image background and eliminate the staircase effect created by total variation. Third, the total variational weighted low-rank model and associated optimization algorithm are proposed. Finally, the quantitative evaluation metrics and qualitative evaluation methods are described.

3.1. TV Model

Rudin et al. [30] first presented a total variation model to remove image noise. This model smooths the interior regions of the image, reducing the difference between adjacent pixels, while smoothing the image edges as little as possible. Therefore, it is an anisotropic model. If the infrared small target image is represented by $X \in \mathbb{R}^{m \times n}$ and the pixel in row $i$ and column $j$ of image $X$ is $x_{i,j}$, the $TV$ norm is defined as:
$$ TV(X) = \sum_{j=1}^{n-1} \left| x_{m,j+1} - x_{m,j} \right| + \sum_{i=1}^{m-1} \left| x_{i+1,n} - x_{i,n} \right| + \sum_{j=1}^{n-1} \sum_{i=1}^{m-1} \sqrt{ \left( x_{i,j+1} - x_{i,j} \right)^2 + \left( x_{i+1,j} - x_{i,j} \right)^2 } \tag{4} $$
It can be seen from Equation (4) that, if the edge information is not considered, the total variation norm can be regarded as the $l_2$ norm of the image derivative. If we convert image $X$ to a column vector and use $P_i$ to represent the corresponding gradient operator, the discrete gradient at pixel $i$ can be represented by $P_i X \in \mathbb{R}^2$. Therefore, the following non-differentiable function can be obtained:
$$ TV(X) = \sum_i \left\| P_i X \right\|_2 \tag{5} $$
The total variation model is an effective regularization term for maintaining image smoothness [31]. The model reduces the variation of the image while closely matching the original image, removing unwanted details and retaining crucial details such as edges. In addition, the total variation model can accurately evaluate discontinuities in infrared images. Thus, we introduce total variation to characterize the image background features.
The total variation term enables detection algorithms to better preserve background information such as strong edges, which leads to a better estimate of the background image. Some sparse non-target points are removed, which reduces the false detection rate. However, the total variation model exhibits a significant staircase effect in practical applications [32,33,34], which makes it difficult to detect the target accurately.
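To make Equation (4) concrete, the following is a minimal sketch in pure Python that evaluates the $TV$ norm of a small image stored as a list of rows. The function name `tv_norm` and the toy images are illustrative assumptions, not part of the original method description.

```python
import math

def tv_norm(x):
    """Total variation of a 2-D image (list of equal-length rows), per Equation (4)."""
    m, n = len(x), len(x[0])
    total = 0.0
    # Last row: only horizontal differences are available.
    for j in range(n - 1):
        total += abs(x[m - 1][j + 1] - x[m - 1][j])
    # Last column: only vertical differences are available.
    for i in range(m - 1):
        total += abs(x[i + 1][n - 1] - x[i][n - 1])
    # Remaining pixels: l2 norm of the (horizontal, vertical) forward-difference pair.
    for i in range(m - 1):
        for j in range(n - 1):
            dh = x[i][j + 1] - x[i][j]
            dv = x[i + 1][j] - x[i][j]
            total += math.hypot(dh, dv)
    return total

# Hypothetical toy inputs: a constant image and an image with one vertical edge.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
edge = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
print(tv_norm(flat), tv_norm(edge))
```

A constant image has zero total variation, while a single sharp edge contributes in proportion to its length and height, which is why the term penalizes noise more heavily than a few clean edges.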

3.2. Overlapping Edge Information

To address the staircase effect problem, the structural features of the image are characterized by OEI. It can be found from Figure 1 that the edge portion of the OEI feature image is rather visible and numerous. Based on this property, we utilize OEI to obtain the equivalent weight to constrain low-rank terms that suppress the staircase effect. To get the OEI of image X, the matrix Q is obtained by combining the overlapping matrix of horizontal and vertical derivatives of the image:
$$ Q(i,j) = Q_v(i,j) + Q_h(i,j) \tag{6} $$
where $Q_v(i,j) = \sum_{i=-m_1}^{m_2} \sum_{j=-m_1}^{m_2} G_v(i,j)$, $Q_h(i,j) = \sum_{i=-m_1}^{m_2} \sum_{j=-m_1}^{m_2} G_h(i,j)$, and $G_v(i,j)$ and $G_h(i,j)$ represent the first derivative of pixel $(i,j)$ in the vertical direction and horizontal direction, respectively. Here $m_1 = \lfloor (l-1)/2 \rfloor$, $m_2 = \lfloor l/2 \rfloor$ and $l$ represents the number of overlapping information groups; the operator $\lfloor n \rfloor$ denotes the largest integer less than or equal to $n$.
The smaller the element difference in the OEI, the better it can characterize the structural features and suppress the background. Therefore, we use OEI to constrain the low-rank term to highlight the target in the image, thereby improving the background suppression ability and detection accuracy of the detection algorithm.
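The OEI construction can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the derivative stencil, whether magnitudes are used, or the boundary handling, so the sketch assumes forward differences, absolute values, a window running from $-m_1$ to $m_2$ around each pixel, and zero contribution from positions outside the image. The function name `overlapping_edge_information` is hypothetical.

```python
def overlapping_edge_information(x, l=3):
    """Sum first-derivative magnitudes over an overlapping window around each pixel."""
    m, n = len(x), len(x[0])
    m1, m2 = (l - 1) // 2, l // 2  # floor((l-1)/2) and floor(l/2)

    # Assumed stencil: forward differences, set to zero on the last row/column.
    gv = [[abs(x[i + 1][j] - x[i][j]) if i + 1 < m else 0.0 for j in range(n)]
          for i in range(m)]
    gh = [[abs(x[i][j + 1] - x[i][j]) if j + 1 < n else 0.0 for j in range(n)]
          for i in range(m)]

    def window_sum(g, i, j):
        # Assumed boundary rule: positions outside the image contribute nothing.
        s = 0.0
        for di in range(-m1, m2 + 1):
            for dj in range(-m1, m2 + 1):
                r, c = i + di, j + dj
                if 0 <= r < m and 0 <= c < n:
                    s += g[r][c]
        return s

    return [[window_sum(gv, i, j) + window_sum(gh, i, j) for j in range(n)]
            for i in range(m)]

# Hypothetical toy image with a vertical edge between columns 4 and 5.
image = [[0, 0, 0, 0, 0, 1] for _ in range(5)]
q = overlapping_edge_information(image, l=3)
```

In this toy case the map is zero far from the edge and largest next to it, which matches the observation that edges stand out in the OEI feature image.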
Figure 1. The featured image of OEI. (a) original image. (b) OEI feature image. (c) three-dimensional image of OEI.

3.3. TVWLR Model

As shown in Figure 2, there are large residuals in the target image detected by the IPI model, which can cause false detection. To improve this phenomenon, we introduce the total variation regularization term into the IPI model to retain the background edge well:
$$ \min_{B,T} \|B\|_* + \lambda_1 TV(B) + \lambda_2 \|T\|_1, \quad \text{s.t.} \quad D = E + T + B, \; \|E\|_F \le \delta \tag{7} $$
where $\|\cdot\|_F$ represents the Frobenius norm, $TV(\cdot)$ represents the $TV$ norm, $\lambda_1$ and $\lambda_2$ are two positive balance parameters and $\delta$ is a positive parameter that changes with the image.
According to Section 3.1, the total variation term can be expressed by Equation (5), which gives:
$$ \min_{B,T} \|B\|_* + \lambda_1 \sum_i \left\| P_i B \right\|_2 + \lambda_2 \|T\|_1, \quad \text{s.t.} \quad D = E + T + B, \; \|E\|_F \le \delta \tag{8} $$
where $P_i$ represents the gradient operator. Both the nuclear norm and the $TV$ norm constrain the background, which yields a better background estimate.
Figure 2. The IPI model detection results. (a,b) original images. (c,d) target images with clutter.
According to the background structure information, the low-rank term is weighted to suppress the staircase effect so as to accurately characterize the image background feature and obtain a clear background estimation. First, the matrix Q is converted into patch image Q. Then, the following weight equation is obtained by the OEI of the infrared image:
$$ \omega = \exp\left( \alpha \cdot \frac{Q - Q_{\min}}{Q_{\max} - Q_{\min}} \right) \tag{9} $$
where $\alpha$ is the tensile coefficient, and $Q_{\min}$ and $Q_{\max}$ are the minimum value and the maximum value of the matrix $Q$, respectively.
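The weight mapping can be sketched in a few lines. This sketch follows the weight equation as written above (min-max normalization followed by an exponential); the function name `oei_weights`, the default value of `alpha` and the handling of a constant map are assumptions made for illustration.

```python
import math

def oei_weights(q, alpha=1.0):
    """Map an OEI matrix to weights exp(alpha * (Q - Qmin) / (Qmax - Qmin))."""
    values = [v for row in q for v in row]
    q_min, q_max = min(values), max(values)
    span = q_max - q_min
    if span == 0:
        # Assumed convention for a constant map: normalized value 0, so weight exp(0) = 1.
        return [[1.0 for _ in row] for row in q]
    return [[math.exp(alpha * (v - q_min) / span) for v in row] for row in q]

# Hypothetical OEI values and stretching coefficient.
w = oei_weights([[0, 1], [2, 4]], alpha=2.0)
```

With this form the weights lie between 1 and $e^{\alpha}$, so $\alpha$ controls how strongly the weights are stretched between flat regions and strong edges.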
In order to remove the residuals that the strong edges of a complex infrared image leave in the target image, we introduce the $l_{2,1}$ norm, defined as:
$$ \|E\|_{2,1} = \sum_i \sqrt{ \sum_j E_{ij}^2 } \tag{10} $$
In summary, the proposed total variation weighted low-rank constraint (TVWLR) model is as follows:
$$ \min \; \|B\|_{\omega,*} + \lambda_1 \sum_i \left\| P_i B \right\|_2 + \lambda_2 \|T\|_1 + \beta \|E\|_{2,1}, \quad \text{s.t.} \quad D = E + T + B \tag{11} $$
where $P_i$ represents the gradient operator and $\beta$ is the penalty factor. We obtain the detected target image and related target information after solving Equation (11).
Figure 3 describes the framework of the proposed method:
1. Specify a sliding window and step size, obtain each patch in turn and then vectorize these patches into the column vectors to form a new matrix, thereby obtaining the patch image.
2. Calculate the OEI of the original image and then use the same step and sliding window size as the previous step to obtain the patch weight.
3. Initialize the relevant parameters, input the patch image and patch weight into Algorithm 1 and solve it through the designed optimization algorithm.
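Step 1 of the framework can be sketched as follows. This is an illustration only: the function name `build_patch_image`, the toy window size and step, and the column-major vectorization order are assumptions, since the text does not fix them.

```python
def build_patch_image(img, patch=2, step=1):
    """Slide a patch x patch window over img and stack each vectorized patch as a column."""
    m, n = len(img), len(img[0])
    columns = []
    for top in range(0, m - patch + 1, step):
        for left in range(0, n - patch + 1, step):
            # Assumed ordering: read the patch column by column.
            col = [img[top + r][left + c] for c in range(patch) for r in range(patch)]
            columns.append(col)
    # Transpose so that every patch becomes one column of the returned matrix.
    return [list(row) for row in zip(*columns)]

# Hypothetical 3 x 3 image; real patches and steps are much larger.
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
d = build_patch_image(img, patch=2, step=1)
```

Applying the same function, with the same window and step, to the weight map gives the patch weight described in step 2, so that every entry of the patch image has a matching weight.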
Figure 3. The framework of the proposed method.
Algorithm 1: The solution process of the TVWLR model.

3.4. Optimization Algorithm

In this section, we propose an ADMM-based optimization method to solve Equation (11). First, Equation (11) is equivalent to:
$$ \min \; \|Z_1\|_{\omega,*} + \lambda_1 \sum_i \|z_i\|_2 + \lambda_2 \|T\|_1 + \beta \|E\|_{2,1}, \quad \text{s.t.} \quad Z_1 = B, \; Z_2 = [z_1, z_2, \dots, z_{mn}], \; z_i = P_i B, \; D = B + T + E \tag{12} $$
Then, Equation (12) is transformed into the augmented Lagrange function:
$$ L_A = \|Z_1\|_{\omega,*} + \lambda_1 \sum_i \|z_i\|_2 + \lambda_2 \|T\|_1 + \beta \|E\|_{2,1} + \langle Y_1, Z_1 - B \rangle + \frac{\mu}{2} \|Z_1 - B\|_F^2 + \sum_i \left( \langle y_i, z_i - P_i B \rangle + \frac{\mu_i}{2} \|z_i - P_i B\|_F^2 \right) + \langle Y_3, D - T - B - E \rangle + \frac{\mu}{2} \|D - T - B - E\|_F^2 \tag{13} $$
where $\|\cdot\|_F$ is the Frobenius norm, $\langle \cdot, \cdot \rangle$ represents the inner product of two matrices, $Y_1$, $Y_3$ and $Y_2$ ($Y_2 = [y_1, y_2, \dots, y_{mn}] \in \mathbb{R}^{2 \times mn}$) denote the Lagrange multipliers and $\mu$ is the penalty factor.
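Equation (13) can be solved iteratively with our designed optimization algorithm. At the $(t+1)$th iteration, $Z_1$ is updated by:
$$ Z_1^{t+1} = \arg\min_{Z_1} \|Z_1\|_{\omega,*} + \langle Y_1^t, Z_1 - B^t \rangle + \frac{\mu^t}{2} \|Z_1 - B^t\|_F^2 \tag{14} $$
The singular value thresholding method can be used to solve Equation (14). The singular value thresholding function is:
$$ SVT_{\varepsilon}(M) = U \, \mathrm{diag}\left[ (\tau - \varepsilon)_+ \right] V^T, \qquad (\tau - \varepsilon)_+ = \begin{cases} \tau - \varepsilon, & \tau > \varepsilon \\ 0, & \text{otherwise} \end{cases} \tag{15} $$
where $M = U \Sigma V^T$ represents the singular value decomposition of matrix $M$ and $\tau$ denotes the singular values on the diagonal of $\Sigma$.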
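As a small worked illustration of the thresholding function (the numbers are made up for illustration and are not taken from the experiments), suppose $M$ has singular values 5, 2 and 0.5 and the threshold is 1:

```latex
\tau = (5,\ 2,\ 0.5), \quad \varepsilon = 1
\;\Longrightarrow\;
(\tau - \varepsilon)_+ = (4,\ 1,\ 0),
\qquad
SVT_{1}(M) = U \,\mathrm{diag}(4,\ 1,\ 0)\, V^T .
```

The smallest singular value is removed, so the rank drops from 3 to 2, which is how the operator promotes a low-rank background estimate.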
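The variable $Z_2$ is then updated by:
$$ Z_2^{t+1} = \arg\min_{Z_2} \sum_i \left( \lambda_1 \|z_i\|_2 + \langle y_i, z_i - P_i B \rangle + \frac{\mu_i^t}{2} \|z_i - P_i B\|_F^2 \right) \tag{16} $$
According to Ref. [35], Equation (16) can be solved by using a two-dimensional shrinkage-like formula, where the threshold $\lambda_1 / \mu_i^t$ follows from the weight $\lambda_1$ on $\|z_i\|_2$ in Equation (13):
$$ z_i = \max\left\{ \left\| P_i B - \frac{y_i}{\mu_i^t} \right\|_2 - \frac{\lambda_1}{\mu_i^t}, \; 0 \right\} \cdot \frac{ P_i B - \dfrac{y_i}{\mu_i^t} }{ \left\| P_i B - \dfrac{y_i}{\mu_i^t} \right\|_2 } \tag{17} $$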
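The two-dimensional shrinkage can be sketched for a single pixel as follows. The function name `shrink_2d` and the argument `lam` are illustrative assumptions; `lam` scales the threshold to `lam / mu`, and the default `lam=1.0` gives the plain threshold `1 / mu`.

```python
import math

def shrink_2d(p, y, mu, lam=1.0):
    """Shrink the 2-vector p - y/mu toward zero by lam/mu, keeping its direction.

    p   : discrete gradient (P_i B) at one pixel, as a pair of floats
    y   : Lagrange multiplier y_i for that pixel, as a pair of floats
    mu  : penalty factor for that pixel
    lam : weight on the l2 term (assumed parameter; threshold is lam / mu)
    """
    vx, vy = p[0] - y[0] / mu, p[1] - y[1] / mu
    norm = math.hypot(vx, vy)
    if norm == 0.0:
        # Zero vector: nothing to shrink, and the direction is undefined.
        return (0.0, 0.0)
    scale = max(norm - lam / mu, 0.0) / norm
    return (scale * vx, scale * vy)

# Hypothetical values: a strong gradient survives, a weak one is set to zero.
strong = shrink_2d((3.0, 4.0), (0.0, 0.0), mu=1.0)
weak = shrink_2d((0.3, 0.4), (0.0, 0.0), mu=1.0)
```

Gradients whose magnitude falls below the threshold are zeroed, while larger gradients are shortened but keep their direction, which is what preserves strong background edges.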
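The update of $B^{t+1}$ is obtained by setting the derivative of $L_A$ with respect to $B$ to zero:
$$ \left. \frac{\partial L_A}{\partial B} \right|_{B = B^{t+1}} = 0 \tag{18} $$
Equation (18) is a linear problem whose solution is:
$$ B^{t+1} = \frac{ Y_1^t + Y_3^t + \sum_i \left( P_i^T y_i + \mu_i P_i^T z_i \right) + \mu \left( Z_1^{t+1} + D - T^t - E^t \right) }{ 2\mu + \sum_i \mu_i P_i^T P_i } \tag{19} $$
where the division denotes applying the inverse of the operator $2\mu I + \sum_i \mu_i P_i^T P_i$ and $P_i^T$ is the transpose of the gradient operator.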
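The step from the stationarity condition to the closed form can be made explicit. The following is a sketch that differentiates the augmented Lagrange function of Equation (13) with respect to $B$, treating all other variables as fixed; it assumes $P_i$ is the linear gradient operator of Equation (13) and writes $I$ for the identity.

```latex
\begin{aligned}
\frac{\partial L_A}{\partial B}
&= -Y_1 - \mu (Z_1 - B)
   - \sum_i P_i^T y_i - \sum_i \mu_i P_i^T (z_i - P_i B)
   - Y_3 - \mu (D - T - B - E) = 0 \\
\Longrightarrow\quad
\Big( 2\mu I + \sum_i \mu_i P_i^T P_i \Big) B
&= Y_1 + Y_3 + \sum_i P_i^T \left( y_i + \mu_i z_i \right)
   + \mu \left( Z_1 + D - T - E \right)
\end{aligned}
```

The two quadratic penalties that contain $B$ directly each contribute $\mu B$, which is where the factor $2\mu$ comes from, and the gradient penalties contribute the $\mu_i P_i^T P_i$ terms.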
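The updates of $E^{t+1}$ and $T^{t+1}$ are as follows:
$$ E^{t+1} = \arg\min_{E} \; \beta \|E\|_{2,1} + \frac{\mu^t}{2} \left\| D - B^{t+1} - T^{t+1} - E + \frac{Y_3^t}{\mu^t} \right\|_F^2 \tag{20} $$
$$ T^{t+1} = \arg\min_{T} \; \lambda_2 \|T\|_1 + \frac{\mu^t}{2} \left\| D - B^{t+1} - T - E^t + \frac{Y_3^t}{\mu^t} \right\|_F^2 \tag{21} $$
According to Refs. [36] and [9], Equations (20) and (21) are solved, respectively, by:
$$ E^{t+1}(:,i) = \begin{cases} \dfrac{ \| M(:,i) \|_2 - \dfrac{\beta}{\mu^t} }{ \| M(:,i) \|_2 } \, M(:,i), & \text{if } \| M(:,i) \|_2 > \dfrac{\beta}{\mu^t} \\ 0, & \text{otherwise} \end{cases} \tag{22} $$
$$ T^{t+1} = \varepsilon_{\lambda_2 / \mu^t} \left( D - B^{t+1} - E^t + \frac{Y_3^t}{\mu^t} \right) \tag{23} $$
In Equation (22), $M = D - B^{t+1} - T^{t+1} + Y_3^t / \mu^t$ and $M(:,i)$ denotes the $i$th column of $M$. In Equation (23), $\varepsilon(\cdot)$ represents the soft threshold operation [37]. The sign of the $Y_3^t / \mu^t$ term follows from completing the square in Equation (13).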
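The two closed-form operators used for the $E$ and $T$ updates can be sketched as follows: an element-wise soft threshold and a column-wise shrinkage. The function names and the toy inputs are illustrative assumptions; in the text the thresholds would be $\lambda_2 / \mu^t$ and $\beta / \mu^t$, respectively.

```python
import math

def soft_threshold(a, thr):
    """Element-wise soft threshold: sign(v) * max(|v| - thr, 0)."""
    return [[math.copysign(max(abs(v) - thr, 0.0), v) for v in row] for row in a]

def column_shrink(mat, thr):
    """Column-wise shrinkage: scale each column by (norm - thr) / norm, or zero it."""
    rows, cols = len(mat), len(mat[0])
    out = [[0.0] * cols for _ in range(rows)]
    for c in range(cols):
        norm = math.sqrt(sum(mat[r][c] ** 2 for r in range(rows)))
        if norm > thr:
            scale = (norm - thr) / norm
            for r in range(rows):
                out[r][c] = scale * mat[r][c]
    return out

# Hypothetical inputs with threshold 1.0.
t_new = soft_threshold([[3.0, -0.5, -2.0]], 1.0)
e_new = column_shrink([[3.0, 0.1], [4.0, 0.1]], 1.0)
```

The soft threshold zeroes individual small entries, which suits isolated target pixels, whereas the column-wise operator keeps or removes whole columns, which suits structured residuals such as strong edges.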
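The updates of $Y_i^{t+1}$ and $\mu^{t+1}$ are as follows:
$$ Y_1^{t+1} = Y_1^t + \mu^t \left( Z_1^{t+1} - B^{t+1} \right), \quad Y_2^{t+1} = Y_2^t + \mu^t \left( Z_2^{t+1} - P B^{t+1} \right), \quad Y_3^{t+1} = Y_3^t + \mu^t \left( D - B^{t+1} - T^{t+1} - E^{t+1} \right) \tag{24} $$
$$ \mu^{t+1} = \rho \mu^t \tag{25} $$
where $\rho > 0$ and $P B^{t+1} = [P_1 B^{t+1}, P_2 B^{t+1}, \dots, P_{mn} B^{t+1}]$ collects the discrete gradients of $B^{t+1}$.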
Finally, we describe the whole iterative optimization process in Algorithm 1.

3.5. Evaluation Metrics

We introduce the definitions of several evaluation metrics in this subsection, including the receiver operating characteristic (ROC) curve, the background suppression factor (BSF) and the signal-to-clutter ratio gain (SCRG).
SCRG and BSF evaluate target-detection ability and background-suppression ability and are expressed by the following two formulas:
$$ SCRG = \frac{ S_{out} / C_{out} }{ S_{in} / C_{in} } \tag{26} $$
$$ BSF = \frac{ C_{in} }{ C_{out} } \tag{27} $$
where $C_{in}$ and $C_{out}$ represent the standard deviation of the background region of the original infrared image and the output target image, respectively. $S_{in}$ and $S_{out}$ represent the amplitude of the target region of the original infrared image and the detected target image, respectively. The amplitude is defined as:
$$ S = T_{\max} - T_{\min} \tag{28} $$
where $T_{\max}$ and $T_{\min}$ are the maximum and minimum gray values of the target region, respectively.
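The two metrics can be computed directly from the pixel values of the target and background regions. The sketch below is illustrative: the function names are hypothetical, the regions are passed as flat lists of gray values, and the population standard deviation is assumed because the text does not say which estimator is used.

```python
from statistics import pstdev

def amplitude(target_pixels):
    """Amplitude S of a target region: maximum gray value minus minimum gray value."""
    return max(target_pixels) - min(target_pixels)

def scrg_and_bsf(target_in, background_in, target_out, background_out):
    """Return (SCRG, BSF) from region pixels before and after processing."""
    c_in = pstdev(background_in)    # clutter level of the original image
    c_out = pstdev(background_out)  # clutter level of the output target image
    s_in = amplitude(target_in)
    s_out = amplitude(target_out)
    bsf = c_in / c_out
    scrg = (s_out / c_out) / (s_in / c_in)
    return scrg, bsf

# Hypothetical pixel values for one image.
scrg, bsf = scrg_and_bsf(
    target_in=[10, 14], background_in=[0, 4, 0, 4],
    target_out=[0, 8], background_out=[0, 1, 0, 1],
)
```

If the output background is perfectly flat, `c_out` is zero and both ratios are undefined (the division fails in this sketch), which is the infinity problem that the choice of background region is meant to avoid.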
In order to avoid infinity (Inf) [38] when calculating SCRG and BSF, we adopt the definition of the background region in Ref. [23], as shown in Figure 4. The red square and black square in the figure represent the target region of size a and the background region of size d, respectively. To ensure that all target pixels are included in the selected region, we set a = 11 and d = 81.
To comprehensively assess the detection capability of all methods, two important evaluation metrics are introduced: the false alarm rate $F_a$ and the probability of detection $P_d$, which are expressed by the following two formulas:
$$ F_a = \frac{N_f}{N_I} \tag{29} $$
$$ P_d = \frac{N_t}{N_T} \tag{30} $$
where $N_f$ and $N_I$ represent the number of falsely detected small targets and the number of images, respectively; $N_t$ and $N_T$ represent the number of small targets correctly detected and the actual number of small targets, respectively.
When drawing the ROC curve, we use $F_a$ and $P_d$ as the horizontal axis and vertical axis, respectively. Therefore, the closer the ROC curve is to the upper left corner, the better the target detection ability. We also quantify the detection effect by calculating the area under the ROC curve (AUC) of all methods. Generally, the better the detection, the higher the AUC.
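The ROC points and the AUC can be computed as in the following sketch. The function names, the per-threshold counts and the use of the trapezoidal rule are assumptions for illustration; the text does not state how the curve is integrated or over which range of $F_a$.

```python
def roc_points(counts, n_images, n_targets):
    """Turn per-threshold (true detections, false detections) counts into (Fa, Pd) points."""
    return sorted((n_false / n_images, n_true / n_targets) for n_true, n_false in counts)

def auc(points):
    """Area under a piecewise-linear curve through sorted (Fa, Pd) points (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical counts at three segmentation thresholds over 10 images with 10 targets.
points = roc_points([(0, 0), (8, 2), (10, 10)], n_images=10, n_targets=10)
area = auc(points)
```

Because $F_a$ is defined per image, it can exceed 1 when a method produces several false detections per frame, so AUC values are only comparable between methods when they are computed over the same $F_a$ range.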

4. Experiments and Results

First, the experimental parameters are determined; then we compare the TVWLR model with seven baseline methods.

4.1. Parameter Setting

The main parameters that influence the proposed method’s detection performance are determined in this part. The low-rank term and the T V regularization term are both balanced by λ 1 . It is an empirical value of around 0.01, so we set it to 0.005. λ 2 is utilized to balance the mutual influence between the target region and background region. Considering the influence of the T V term, we use the value λ 2 = 150 max ( p , q ) in the experiment, where p and q, respectively, represent the width and length of the original patch image. μ directly affects the soft threshold operator of the calculation target and the convergence speed of the iterative process, which is a penalty operator. If μ is too small, the target will not be recognized; if μ is too large, the target image will have a lot of noise. To make μ change adaptively, we set μ = z max ( m , n ) , where we set the range of μ from 0.5 to 5. For six sequential images, the detection performance is best when z = 2 or z = 3 . In order to ensure the convergence speed, we choose z = 3 . For the proposed method, we define the tolerance error as follows:
$$ tol = \frac{ \left\| D - T^t - B^t - E^t \right\|_F }{ \| D \|_F } \tag{31} $$
where $t$ represents the iteration number of the optimization method. The iterative process stops when $tol < 10^{-5}$, at which point it is considered convergent.
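The stopping rule can be sketched as a relative-residual check. The function names are hypothetical, matrices are stored as lists of rows, and the default threshold of `1e-5` is this sketch's reading of the tolerance stated above.

```python
import math

def frobenius(a):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(v * v for row in a for v in row))

def relative_residual(d, t, b, e):
    """Relative reconstruction error ||D - T - B - E||_F / ||D||_F."""
    resid = [[d[i][j] - t[i][j] - b[i][j] - e[i][j] for j in range(len(d[0]))]
             for i in range(len(d))]
    return frobenius(resid) / frobenius(d)

def converged(d, t, b, e, tol=1e-5):
    """True when the decomposition reproduces D to within the tolerance."""
    return relative_residual(d, t, b, e) < tol

# Hypothetical 1 x 2 example: an exact decomposition and an inexact one.
d = [[3.0, 4.0]]
exact = converged(d, [[1.0, 0.0]], [[2.0, 4.0]], [[0.0, 0.0]])
rough = relative_residual(d, [[1.0, 0.0]], [[2.0, 3.5]], [[0.0, 0.0]])
```

Normalizing by the norm of $D$ makes the criterion independent of image size and gray-level scale, so the same threshold can be reused across sequences.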

4.2. Experimental Preparation

First, two scenes in Figure 2 are tested with the proposed method, which verifies the effectiveness of our detection algorithm. The processing results and the corresponding three-dimensional views are shown in Figure 5, where the red box marks the target.
In order to further test the performance of the proposed model, a large number of experiments are performed in this paper. The experimental scenes include high-light clutter, cloud background, ground, sea level, etc., and the results of some of the scenes are selected for display. These six scenes and three-dimensional views are shown in Figure 6 and Figure 7, respectively. These images feature various target sizes, diverse backgrounds and low signal-to-noise ratios, which make it difficult to successfully detect the target using traditional methods.
Each image sequence consists of a series of images of one scene in Figure 6, and Table 1 lists the specific information of each sequence. The proposed method is compared with seven representative methods: Tophat transform, LCM, MPCM, IPI, TV-PCP, PSTNN and SRWS.

4.3. Qualitative Results

In this section, the experiments on six groups of sequential images are carried out to test the detection performance of the eight algorithms. Figure 8 and Figure 9 show the detected target images of all detection methods of six image sequences. In particular, we mark targets in different scenes with red boxes. The three-dimensional views of gray images of target images are shown in Figure 10 and Figure 11.
As shown in Figure 8 and Figure 9, Tophat is particularly sensitive to noise and background edges. Under complex background clutter such as seq 2, seq 5 and seq 6, a large amount of background information is left in the target image, which results in a high false alarm rate. LCM achieves fast and efficient detection, but the targets are easily overwhelmed when the gray values of the target and the background differ only slightly. IPI generally detects well, but with a complex background and a low signal-to-noise ratio it can miss the target, so its detection accuracy cannot be guaranteed, as shown in seq 2 and seq 3. TV-PCP recovers the target well. However, as shown in seq 2, seq 5 and seq 6, the $TV$ regularization constraint makes TV-PCP produce the staircase effect, especially when the target moves rapidly. PSTNN and SRWS give good detection results, but when the background information is extremely rich they leave some non-target sparse points in the target image that are difficult to distinguish from the real target, as in seq 5 and seq 6. Compared with the above seven baseline methods, our method separates the background and target more effectively, estimates the background image more accurately and detects the target more precisely.
Figure 11. From left to right are the 3D gray images of the detection results of the 1∼6 sequences of images (Continued).
As shown in Figure 10 and Figure 11, in the case of a complex background, the detection results of Tophat, LCM, IPI and TV-PCP all have much background noise. The detection images of MPCM, PSTNN and SRWS are good, but in the particularly complex background, such as seq 5 and seq 6, the target image has multiple peaks and the detection is not accurate enough. As shown in Figure 8, Figure 9, Figure 10 and Figure 11, our method has a better detection effect on various backgrounds, improves the accuracy of target detection and enhances the robustness of detection.

4.4. Quantitative Results

We quantitatively evaluate the detection effects of the eight algorithms in this part. SCRG and BSF are two important evaluation metrics. The specific results of all methods are shown in Table 2.
As shown in Table 2, due to the extremely low signal-to-noise ratio of the images in seq 2, IPI fails to detect and the corresponding metrics cannot be obtained. The evaluation metrics obtained by the proposed method are almost always the maximum or sub-maximum values. The SCRG and BSF are significantly improved compared with the other seven algorithms, which indicates that the proposed method can separate background and target well and has better background suppression and robustness.
A large number of experiments to test the ROC curve of each sequence image are made to more comprehensively evaluate the detection ability of each method and reflect the advantages of our method, as shown in Figure 12. Meanwhile, Table 3 summarizes the AUC of the ROC curve.
It can be seen that Tophat and LCM have the worst performance; they cannot handle complex background images well. The target is easily submerged when MPCM processes cluttered and noisy images. The detection performance of IPI is unstable; in seq 2 in particular, its detection accuracy is extremely poor. When the background information is extraordinarily rich, TV-PCP cannot accurately estimate the background image. PSTNN and SRWS have good detection performance, but their robustness cannot be guaranteed in the complex backgrounds of seq 5 and seq 6. Based on the above evaluation metrics, our method outperforms the other methods in terms of background-suppression ability and target-detection ability, and it is robust to various complex backgrounds.
We also measure the running time of all methods on the six sequences. All experiments are run on a computer with 16 GB of memory and an Intel Celeron 2.90 GHz CPU. As shown in Table 4, Tophat requires the least computation time, and LCM and MPCM also achieve fast detection, because all three filter in the spatial domain. Both IPI and TV-PCP greatly increase the computational complexity and therefore require a long computation time. While estimating the background more accurately, we propose a solution strategy combined with ADMM, which simplifies the solution process, improves the convergence speed and greatly reduces the running time compared with IPI and TV-PCP.
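ADMM-type solvers for low-rank plus sparse models typically alternate between a few closed-form sub-problem updates. The sketch below shows two generic building blocks, element-wise soft-thresholding and a weighted singular value thresholding step. It is an illustration under assumptions, not the authors' update rules: the weight vector stands in for whatever weighting the model prescribes, the function names are hypothetical, and the weighted shrinkage is the commonly used form rather than a step confirmed by the text.

```python
import numpy as np

def soft_threshold(x, tau):
    """Element-wise shrinkage: proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def weighted_svt(matrix, weights, tau):
    """Weighted singular value thresholding.

    Each singular value s_i is shrunk by tau * weights[i] and clipped at 0.
    `weights` must have length min(matrix.shape); how the weights are chosen
    (for example from edge information) is outside this sketch.
    """
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau * np.asarray(weights, dtype=float), 0.0)
    return (u * s_shrunk) @ vh

# Small usage example on a diagonal matrix with singular values 3 and 1.
m = np.diag([3.0, 1.0])
print(weighted_svt(m, [1.0, 1.0], tau=1.0))   # singular values become 2 and 0
print(soft_threshold(np.array([-3.0, 0.5, 2.0]), 1.0))
```

Each update costs one SVD of the patch matrix, which is consistent with optimization-based methods in Table 4 being slower than the spatial-domain filters.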

5. Discussion

Traditional filtering methods are simple and computationally cheap, but they only suppress uniform backgrounds to some extent and cannot handle complex backgrounds. The methods based on human vision are mainly suitable for scenes where the target brightness differs significantly from the surrounding background. The optimization-based methods are applicable to almost all kinds of complex and rapidly changing backgrounds and have strong robustness. In the early IPI model, the background data are represented by the nuclear norm, which works well for slowly changing, uniform backgrounds but still cannot deal with images with complex backgrounds. The TV-PCP model improves the clarity of edges and corners and reduces noise interference, but residual dim edges remain in the background image, showing a distinct staircase effect. PSTNN and SRWS fully consider the target characteristics, but their parameter settings limit their robustness. To improve the detection ability in complex backgrounds, we propose the TVWLR model.
The model introduces the TV regularization term to constrain the target, which addresses the defect of the $l_1$ norm sparsity measurement, and uses OEI to weight the background data, which eliminates the obvious staircase effect. The proposed method is superior to the other methods: Figure 8, Figure 9, Figure 10 and Figure 11 show that it has higher detection accuracy, and Table 2 and Table 3 demonstrate its robustness to various complex backgrounds.
The above experimental results indicate that, among the spatial-domain algorithms, Tophat has an extremely poor detection probability on images with complex backgrounds, LCM leaves a lot of background clutter in the target image, and MPCM suffers from missed and false detections. Compared with the spatial-domain algorithms, the optimization-based detection algorithms have better detection performance.
Although the proposed method performs well in terms of detection ability, like other optimization-based algorithms it requires many iterative operations. Compared with the traditional spatial-domain algorithms, it has higher computational complexity and a longer running time. Our future research will focus on solving this problem. In addition, driven by big data and artificial intelligence, small target detection algorithms based on deep learning have made great progress. Using a semantic segmentation model to detect small targets in complex backgrounds is also a promising idea and will be part of our future research.

6. Conclusions

In this paper, a new detection algorithm, TVWLR, is proposed to improve the detection accuracy of infrared dim small targets. The algorithm uses OEI to characterize the structural features of the image background and adaptively determines the weight of the low-rank constraint term. It suppresses the staircase effect caused by the TV regularization term, enhances the details and edge information of the image and effectively reduces the false detection rate. The $l_{2,1}$ norm is introduced to remove strong edges and residuals in the image, which greatly improves the background suppression ability of the algorithm. Finally, we propose a solution algorithm for the TVWLR model that incorporates ADMM. Extensive experimental results demonstrate that the proposed method has better detection accuracy, better subjective and objective consistency and stronger robustness than the other seven methods.
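For readers unfamiliar with the two regularizers named above, the following sketch computes an $l_{2,1}$ norm, its column-wise shrinkage operator and an anisotropic total variation value with NumPy. The conventions are assumptions made for illustration: the $l_{2,1}$ norm is taken as the sum of the Euclidean norms of the columns (some papers sum over rows), the TV is the anisotropic first-order form, and all function names are hypothetical.

```python
import numpy as np

def l21_norm(x):
    """Sum of the Euclidean norms of the columns of x (assumed convention)."""
    return float(np.linalg.norm(x, axis=0).sum())

def prox_l21(x, tau):
    """Column-wise shrinkage: proximal operator of tau * ||x||_{2,1}.

    Columns whose norm is below tau are set to zero; the others are scaled
    towards zero, which removes weak structured residuals column by column.
    """
    norms = np.linalg.norm(x, axis=0)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return x * scale

def anisotropic_tv(image):
    """Sum of absolute vertical and horizontal first differences."""
    vertical = np.abs(np.diff(image, axis=0)).sum()
    horizontal = np.abs(np.diff(image, axis=1)).sum()
    return float(vertical + horizontal)

x = np.array([[3.0, 0.0],
              [4.0, 0.0]])
print(l21_norm(x))        # column norms are 5 and 0
print(prox_l21(x, 1.0))   # first column scaled by 0.8, second stays zero
print(anisotropic_tv(np.array([[0.0, 0.0], [0.0, 1.0]])))
```

Because the $l_{2,1}$ shrinkage acts on whole columns rather than single entries, it tends to discard structured residuals that element-wise thresholding would keep.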

Author Contributions

Conceptualization, X.C. and W.X.; Methodology, X.C.; Software, X.C. and Y.P.; Investigation, X.C. and T.G.; Formal Analysis, X.C.; Writing—Original Draft, X.C. and S.T.; Funding Acquisition, W.X.; Resources, W.X. and Q.F.; Supervision, W.X. and Q.F.; Writing—Review and Editing, W.X.; Data Curation, S.T.; Visualization, T.G.; Validation, Y.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the National Natural Science Foundation of China under Grants 62075219 and 61805244 and in part by the Key Technological Research Projects of Jilin Province, China, under Grant 20190303094SF.

Data Availability Statement

The image data used in this paper are available at the following links: https://github.com/Tianfang-Zhang/SRWS and https://github.com/YimianDai/sirst, accessed on 25 December 2021.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Dawson, J.A.; Bankston, C.T. Space debris characterization using thermal imaging systems. In Proceedings of the Advanced Maui Optical and Space Surveillance Technologies Conference, Wailea, Maui, HI, USA, 14–17 September 2010; Volume E40. [Google Scholar]
  2. Kim, S. Analysis of small infrared target features and learning-based false detection removal for infrared search and track. Pattern Anal. Appl. 2014, 17, 883–900. [Google Scholar] [CrossRef]
  3. Sadjadi, F.A.; Chun, C.S.L. Automatic detection of small objects from their infrared state-of-polarization vectors. Opt. Lett. 2003, 28, 531–533. [Google Scholar] [CrossRef] [PubMed]
  4. Kim, S.; Lee, J. Small infrared target detection by region-adaptive clutter rejection for sea-based infrared search and track. Sensors 2014, 14, 13210–13242. [Google Scholar] [CrossRef]
  5. Tartakovsky, A.; Kligys, S.; Petrov, A. Adaptive sequential algorithms for detecting targets in a heavy IR clutter. SPIE Proc, Signal Data Process. Small Targets (SDPST) 1999, 3809, 231–242. [Google Scholar]
  6. Tonissen, S.M.; Evans, R.J. Performance of dynamic programming techniques for track-before-detect. IEEE Trans. Aerosp. Electron. Syst. 1996, 32, 1440–1451. [Google Scholar] [CrossRef]
  7. Luo, J.H.; Ji, H.B.; Liu, J. An algorithm based on spatial filter for infrared small target detection and its application to an all directional IRST system. In Proceedings of the 27th International Congress on High-Speed Photography and Photonics, Xi’an, China, 17–22 September 2006; Volume 6279. [Google Scholar]
  8. Reed, I.S.; Gagliardi, R.M.; Stotts, L.B. Optical moving target detection with 3-D matched filtering. IEEE Trans. Aerosp. Electron. Syst. 2002, 24, 327–336. [Google Scholar] [CrossRef]
  9. Guo, J.; Wu, Y.; Dai, Y. Small target detection based on reweighted infrared patch-image model. IET Image Process. 2018, 12, 70–79. [Google Scholar] [CrossRef]
  10. Tom, V.T.; Peli, T.; Leung, M.; Bondaryk, J.E. Morphology-based algorithm for point target detection in infrared backgrounds. Signal Data Process. Small Targets 1993, 1954, 2–11. [Google Scholar]
  11. Deshpande, S.D.; Meng, H.E.; Ronda, V.; Chan, P. Max-Mean and Max-Median Filters for Detection of Small-Targets. In Proceedings of the SPIE’s International Symposium on Optical Science, Engineering and Instrumentation, Denver, CO, USA, 18 July 1999; Volume 3809. [Google Scholar]
  12. Xia, C.; Li, X.; Zhao, L.; Yu, S. Modified Graph Laplacian Model With Local Contrast and Consistency Constraint for Small Target Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5807–5822. [Google Scholar] [CrossRef]
  13. Chen, C.; Li, H.; Wei, Y.; Xia, T.; Tang, Y.Y. A Local Contrast Method for Small Infrared Target Detection. IEEE Trans. Geosci. Remote Sens. 2013, 52, 574–581. [Google Scholar] [CrossRef]
  14. Han, J.; Yong, M.; Bo, Z.; Fan, F.; Liang, K.; Yu, F. A Robust Infrared Small Target Detection Algorithm Based on Human Visual System. IEEE Geosci. Remote Sens. Lett. 2014, 11, 2168–2172. [Google Scholar]
  15. Wei, Y.; You, X.; Li, H. Multiscale patch-based contrast measure for small infrared target detection. Pattern Recognit. 2016, 58, 216–226. [Google Scholar] [CrossRef]
  16. Han, J.; Liu, C.; Liu, Y.; Luo, Z.; Zhang, X.; Niu, Q. Infrared Small Target Detection Utilizing the Enhanced Closest-Mean Background Estimation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 645–662. [Google Scholar] [CrossRef]
  17. Bai, X.; Bi, Y. Derivative Entropy-Based Contrast Measure for Infrared Small-Target Detection. IEEE Trans. Geosci. Remote Sens. 2018, 56, 2452–2466. [Google Scholar] [CrossRef]
  18. Gao, C.; Meng, D.; Yang, Y.; Wang, Y.; Zhou, X.; Hauptmann, A.G. Infrared Patch-Image Model for Small Target Detection in a Single Image. IEEE Trans. Image Process. 2013, 22, 4996–5009. [Google Scholar] [CrossRef] [PubMed]
  19. Oh, T.H.; Tai, Y.W.; Bazin, J.C.; Kim, H.; Kweon, I.S. Partial sum minimization of Singular Values in Robust PCA: Algorithm and applications. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 744–758. [Google Scholar] [CrossRef]
  20. Dai, Y.; Wu, Y.; Song, Y.; Guo, J. Non-negative infrared patch-image model: Robust target-background separation via partial sum minimization of singular values. Infrared Phys. Technol. 2017, 81, 182–194. [Google Scholar] [CrossRef]
  21. Zhang, L.; Peng, Z. Infrared Small Target Detection Based on Partial Sum of the Tensor Nuclear Norm. Remote Sens. 2019, 11, 382. [Google Scholar] [CrossRef]
  22. Wang, X.; Peng, Z.; Kong, D.; Zhang, P.; He, Y. Infrared dim target detection based on total variation regularization and principal component pursuit. Image Vis. Comput. 2017, 63, 1–9. [Google Scholar] [CrossRef]
  23. Tz, A.; Zp, A.; Hao, W.A.; Yh, A.; Cl, B.; Cy, A. Infrared small target detection via self-regularized weighted sparse model - ScienceDirect. Neurocomputing 2021, 420, 124–148. [Google Scholar]
  24. Wang, X.; Peng, Z.; Kong, D.; He, Y. Infrared dim and small target detection based on stable multisubspace learning in heterogeneous scene. IEEE Trans. Geosci. Remote Sens. 2017, 55, 5481–5493. [Google Scholar] [CrossRef]
  25. Shi, M.; Wang, H. Infrared dim and small target detection based on denoising autoencoder network. Mob. Netw. Appl. 2020, 25, 1469–1483. [Google Scholar] [CrossRef]
  26. Du, J.; Lu, H.; Hu, M.; Zhang, L.; Shen, X. CNN-based infrared dim small target detection algorithm using target-oriented shallow-deep features and effective small anchor. IET Image Process. 2021, 15, 1–15. [Google Scholar] [CrossRef]
  27. Gao, Z.; Dai, J.; Xie, C. Dim and small target detection based on feature mapping neural networks. J. Vis. Commun. Image Represent. 2019, 62, 206–216. [Google Scholar] [CrossRef]
  28. Xue, D.; Sun, J.; Hu, Y.; Zheng, Y.; Zhu, Y.; Zhang, Y. Dim small target detection based on convolutinal neural network in star image. Multimed. Tools Appl. 2020, 79, 4681–4698. [Google Scholar] [CrossRef]
  29. Zhou, S.; Gao, Z.; Xie, C. Dim and small target detection based on their living environment. Digit. Signal Process. 2022, 120, 103271. [Google Scholar] [CrossRef]
  30. Rudin, L.I.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Phys. D Nonlinear Phenom. 1992, 60, 259–268. [Google Scholar] [CrossRef]
  31. Stefan, W.; Renaut, R.A.; Gelb, A. Improved Total Variation-Type Regularization Using Higher Order Edge Detectors. SIAM J. Imaging Sci. 2010, 3, 232–251. [Google Scholar] [CrossRef]
  32. Chan, T.; Marquina, A.; Mulet, P. High-order total variation-based image restoration. SIAM J. Sci. Comput. 2001, 22, 503–516. [Google Scholar] [CrossRef]
  33. Bredies, K.; Kunisch, K.; Pock, T. Total Generalized Variation. SIAM J. Imaging Sci. 2010, 3, 492–526. [Google Scholar] [CrossRef]
  34. Liu, X.; Chen, Y.; Peng, Z.; Wu, J. Infrared Image Super-Resolution Reconstruction Based on Quaternion and High-Order Overlapping Group Sparse Total Variation. Sensors 2019, 19, 5139. [Google Scholar] [CrossRef] [PubMed]
  35. Li, C. An Efficient Algorithm for Total Variation Regularization with Applications to the Single Pixel Camera and Compressive Sensing. Ph.D. Thesis, Rice University, Houston, TX, USA, 2011. [Google Scholar]
  36. Yuan, M.; Lin, Y. Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2006, 68, 49–67. [Google Scholar] [CrossRef]
  37. Hale, E.T.; Yin, W.; Zhang, Y. Fixed-point continuation for l1-minimization: Methodology and convergence. SIAM J. Optim. 2008, 19, 1107–1130. [Google Scholar] [CrossRef]
  38. Gao, C.; Wang, L.; Xiao, Y.; Zhao, Q.; Meng, D. Infrared small-dim target detection based on Markov random field guided noise modeling. Pattern Recognit. 2018, 76, 463–475. [Google Scholar] [CrossRef]
Figure 4. The background region and target region of the infrared small image.
Figure 5. The TVWLR model detection results. (a,b) the target images processed by the TVWLR model. (c,d) the target images processed by the TVWLR model.
Figure 6. Infrared images of some real scenes. (a) Scene 1 (Seq1). (b) Scene 2 (Seq2). (c) Scene 3 (Seq3). (d) Scene 4 (Seq4). (e) Scene 5 (Seq5). (f) Scene 6 (Seq6).
Figure 7. Infrared images of some real scenes. (a) Scene 1 (Seq1). (b) Scene 2 (Seq2). (c) Scene 3 (Seq3). (d) Scene 4 (Seq4). (e) Scene 5 (Seq5). (f) Scene 6 (Seq6).
Figure 8. From left to right are the detection results of 1∼6 sequence images.
Figure 9. From left to right are the detection results of 1∼6 sequence images (Continued).
Figure 10. From left to right are the 3D gray images of the detection results of the 1∼6 sequences of images.
Figure 12. ROC curve of eight methods. (a) Seq1. (b) Seq2. (c) Seq3. (d) Seq4. (e) Seq5. (f) Seq6.
Table 1. The specific information of the six image sequences.

| Sequence | Size | Number | Target Description | Background Description |
|---|---|---|---|---|
| 1 | 320 × 240 | 50 | Irregular shape; low contrast | Cloudy background; background changes quickly |
| 2 | 319 × 192 | 67 | Move quickly | Heavy noise; bright background |
| 3 | 407 × 272 | 185 | Small; vague and unclear | Complex background with trees |
| 4 | 298 × 186 | 40 | Tiny; very low contrast | Dim background; heavy noise |
| 5 | 320 × 240 | 200 | Small and bright; slow-motion | Sea background with bridge |
| 6 | 332 × 221 | 300 | The cloud obscures the target; size variation | Heavy cloud background; clouds change quickly |
Table 2. SCRG and BSF of eight methods (bold red number: maximum value; bold blue number: second-highest value). A dash marks the metrics that could not be obtained because IPI fails to detect the target in seq 2.

| Method | Seq1 SCRG | Seq1 BSF | Seq2 SCRG | Seq2 BSF | Seq3 SCRG | Seq3 BSF | Seq4 SCRG | Seq4 BSF | Seq5 SCRG | Seq5 BSF | Seq6 SCRG | Seq6 BSF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Tophat | 4.4421 | 4.6007 | 3.8941 | 3.8419 | 3.2609 | 3.3832 | 1.0877 | 1.1718 | 2.7220 | 2.7965 | 2.2323 | 2.2788 |
| LCM | 1.4192 | 0.6568 | 1.6625 | 0.7981 | 1.6013 | 0.5912 | 0.8250 | 0.2148 | 1.6755 | 0.4726 | 1.2586 | 0.2192 |
| MPCM | 7.2178 | 2.6165 | 7.3079 | 1.6609 | 2.6907 | 0.8652 | 0.9695 | 0.3156 | 1.9363 | 1.1390 | 1.4127 | 1.0858 |
| IPI | 7.2045 | 0.8598 | – | – | 3.8504 | 1.5899 | 1.4695 | 0.5808 | 4.2557 | 3.4883 | 2.6592 | 2.3908 |
| TV-PCP | 7.1253 | 8.0967 | 1.6206 | 1.5801 | 3.5875 | 4.3371 | 1.4335 | 1.8847 | 3.8874 | 4.8042 | 2.5588 | 2.8488 |
| PSTNN | 7.0204 | 1.5193 | 7.2397 | 0.7621 | 3.4487 | 1.8467 | 1.3657 | 1.0496 | 4.0430 | 2.6367 | 2.6204 | 2.0141 |
| SRWS | 19.5110 | 14.7651 | 5.6929 | 4.9813 | 4.2770 | 3.3859 | 5.6838 | 5.4759 | 4.7064 | 4.2971 | 5.0728 | 4.7346 |
| Proposed | 20.9825 | 15.8786 | 5.3570 | 4.6874 | 6.8538 | 5.4259 | 9.4167 | 9.0722 | 6.8972 | 6.2975 | 8.8536 | 8.2634 |
Table 3. AUC values of eight methods (bold red number: maximum value; bold blue number: second-highest value).

| Method | Seq1 | Seq2 | Seq3 | Seq4 | Seq5 | Seq6 |
|---|---|---|---|---|---|---|
| Tophat | 0.7207 | 0.7473 | 0.6483 | 0.8082 | 0.5718 | 0.4698 |
| LCM | 0.7158 | 0.7432 | 0.6717 | 0.8661 | 0.6388 | 0.4761 |
| MPCM | 0.8458 | 0.8048 | 0.7372 | 0.9278 | 0.6693 | 0.5418 |
| IPI | 0.8174 | 0.7252 | 0.7442 | 0.9251 | 0.6579 | 0.5601 |
| TV-PCP | 0.8567 | 0.8032 | 0.7312 | 0.9247 | 0.6601 | 0.5531 |
| PSTNN | 0.9041 | 0.8784 | 0.8470 | 0.9677 | 0.7463 | 0.7604 |
| SRWS | 0.9428 | 0.9209 | 0.8571 | 0.9898 | 0.8780 | 0.6521 |
| Proposed | 0.9484 | 0.9168 | 0.9078 | 1.0000 | 0.8867 | 0.8850 |
Table 4. Computation time (seconds) for all methods (bold red number: minimum value; bold blue number: second-smallest value).

| Method | Seq1 | Seq2 | Seq3 | Seq4 | Seq5 | Seq6 |
|---|---|---|---|---|---|---|
| Tophat | 0.0012 | 0.0014 | 0.0011 | 0.0009 | 0.0010 | 0.0009 |
| LCM | 0.0964 | 0.0402 | 0.0443 | 0.0399 | 0.0304 | 0.0295 |
| MPCM | 0.2184 | 0.4967 | 0.4971 | 0.5490 | 0.5148 | 0.9059 |
| IPI | 39.6778 | 9.3576 | 55.9365 | 11.9990 | 30.4720 | 30.1724 |
| TV-PCP | 59.1727 | 15.4766 | 123.7174 | 64.4702 | 90.5041 | 88.1987 |
| PSTNN | 0.0974 | 0.0430 | 0.0539 | 0.0298 | 0.1032 | 0.0504 |
| SRWS | 0.9783 | 0.8261 | 2.4877 | 0.6208 | 1.3301 | 1.1024 |
| Proposed | 7.8637 | 7.3609 | 15.4934 | 6.2538 | 11.6141 | 8.6011 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Chen, X.; Xu, W.; Tao, S.; Gao, T.; Feng, Q.; Piao, Y. Total Variation Weighted Low-Rank Constraint for Infrared Dim Small Target Detection. Remote Sens. 2022, 14, 4615. https://doi.org/10.3390/rs14184615
