Many single-frame algorithms for infrared small target detection have been proposed in recent years. In this paper, these methods are divided into two groups: traditional methods based on the characteristics of targets, and methods based on RPCA theory. The two groups differ markedly in both solution approach and detection performance. The traditional approaches take a localized view of detection. They argue that the presence of a target destroys the local correlation between background pixels, so that targets can be detected by designing corresponding filter operators. Alternatively, from the perspective of target saliency, local contrast is studied to achieve detection. In contrast, RPCA solutions take a holistic view. The RPCA method treats the infrared image as a superposition of the background and the targets, where the background is considered low-rank and the small targets are sparse. By transforming target detection into the problem of decomposing a matrix into its low-rank and sparse components, RPCA solutions achieve detection from the perspective of the whole image. In terms of detection performance, the traditional methods are fast and can increase the brightness of the target area, but their background suppression is poor and leaves considerable residual clutter; RPCA methods suppress the background well while enhancing targets, but they are slower than the traditional approaches.
2.1. Traditional Methods
Traditional single-frame infrared small target detection methods fall mainly into two categories. The first rests on a background assumption, namely that the local background is uniform. The appearance of a target destroys this uniformity and thus breaks the correlation between background pixels; exploiting this feature accomplishes target detection. Algorithms of this type include the top-hat filter (Tophat) [4,5], the max-mean filter [6], and the two-dimensional adaptive least mean square algorithm [7,8]. They are computationally very fast and can detect targets in real time, but they perform poorly in background suppression. The second category is based on target saliency: drawing on the human visual system [9], it calculates the local contrast to obtain a saliency map in which infrared small targets are detected. These methods include the local contrast measure (LCM) [10], the improved local contrast measure (ILCM) [11], etc. Motivated by the local contrast method, the Laplacian of Gaussian operator [12] is incorporated into the method due to its sensitivity to noise. To improve detection performance, the multi-scale contrast method [13] and the weighted singular local contrast method [14] have been proposed. These methods enhance true targets, but they are sensitive to strong clutter and edges.
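To make the saliency idea concrete, the following is a minimal single-scale sketch in the spirit of the local contrast measure. It is not the implementation of [10]: the function name `local_contrast_map`, the cell size of 3, and the edge padding are assumptions made for illustration. For every pixel, the maximum of the central cell is squared and divided by the largest mean among the eight surrounding cells, so a small bright region on a uniform background receives a high score.

```python
import numpy as np

def local_contrast_map(img, cell=3):
    """Simplified single-scale local contrast map (illustrative sketch).

    For each pixel, a 3x3 grid of cells (each `cell` x `cell` pixels) is
    centred on it. The score is max(centre cell)^2 divided by the largest
    mean of the 8 neighbouring cells.
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    r = cell // 2
    pad = cell + r                      # reach of the 3x3 grid of cells
    p = np.pad(img, pad, mode="edge")   # assumption: replicate the border
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            cy, cx = y + pad, x + pad
            centre = p[cy - r:cy + r + 1, cx - r:cx + r + 1]
            neighbour_means = []
            for dy in (-cell, 0, cell):
                for dx in (-cell, 0, cell):
                    if dy == 0 and dx == 0:
                        continue
                    nb = p[cy + dy - r:cy + dy + r + 1,
                           cx + dx - r:cx + dx + r + 1]
                    neighbour_means.append(nb.mean())
            # Small constant avoids division by zero on all-black regions.
            out[y, x] = centre.max() ** 2 / (max(neighbour_means) + 1e-12)
    return out
```

The double loop keeps the sketch readable rather than fast. Because the score depends only on a local neighbourhood, bright edges and strong clutter can also score highly, which is the sensitivity noted above.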
2.2. RPCA Methods
Wright et al. [3] proposed the RPCA algorithm in 2009. This algorithm solves the problem of recovering a low-rank matrix and provides theoretical support for the application of low-rank theory to target detection. The original objective function of RPCA is:

$$\min_{A,E}\ \operatorname{rank}(A) + \gamma \|E\|_0 \quad \text{s.t.} \quad D = A + E \tag{1}$$

where $D$ is the input image matrix, $A$ represents the low-rank part, $E$ represents the sparse part, and $\gamma$ represents the coefficient of the sparse part, which can be tuned.
Wright et al. proposed to solve the problem by replacing the rank of $A$ with the nuclear norm of $A$ and the $\ell_0$ norm of $E$ with the $\ell_1$ norm of $E$. The optimized objective function is shown in Equation (2):

$$\min_{A,E}\ \|A\|_* + \lambda \|E\|_1 \quad \text{s.t.} \quad D = A + E \tag{2}$$

where $\|\cdot\|_*$ indicates the matrix nuclear norm, the sum of the singular values of the matrix; $\|\cdot\|_1$ indicates the $\ell_1$ norm, the sum of the absolute values of the elements of the matrix; and $\lambda$ represents the coefficient of the sparse part, which can be adjusted.
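The two convex surrogates are convenient because each has a closed-form proximal operator. The sketch below shows both under assumed names (`soft_threshold`, `singular_value_threshold`): element-wise shrinkage for the $\ell_1$ norm, and the same shrinkage applied to the singular values for the nuclear norm. These are the standard operators, not code from the cited works.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1: shrink every entry towards zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def singular_value_threshold(m, tau):
    """Proximal operator of tau * ||M||_*: shrink the singular values."""
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    # Scale the columns of u by the shrunk singular values, then recompose.
    return (u * soft_threshold(s, tau)) @ vt
```

Shrinking singular values to zero lowers the rank of the result, and shrinking entries to zero makes it sparser, which is how these operators serve the low-rank and sparse terms respectively.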
In order to solve Equation (2), Wright et al. used the Frobenius norm to constrain the loss term of the objective function. After mathematical derivation, Wright et al. used Equation (3) as the final form for solving this optimization problem.
where $D$ is the input image matrix and $\|\cdot\|_F$ represents the Frobenius norm.
RPCA has distinctive characteristics. For the infrared small target detection problem, RPCA does not need to take the target’s morphology and location information into account, so it applies widely and can achieve robust detection in complex scenarios. In addition, RPCA directly strips away the background through low-rank and sparse matrix decomposition, which provides excellent background suppression and yields a clean target image. However, RPCA also has drawbacks. Its algorithmic structure is complex and requires many iterations, so it is slower than the traditional methods. It is also difficult for RPCA to eliminate the interference of clutter that is brighter than the target.
Based on RPCA theory, a series of infrared small target detection methods have been derived. Lin et al. proposed the inexact augmented Lagrange multiplier method (IALM) [15] to solve the low-rank matrix recovery problem, and weighted tensor robust principal component analysis (WTRPCA) [16] adopted IALM to accomplish infrared small target detection. The infrared patch image model (IPI) [17] introduced a patch model that samples the full image step by step through a fixed-size sliding window, successfully separating the low-rank and sparse parts of the infrared image. The Markov on Gaussian operator [18] combined a Markov random field (MRF) with IPI and reduced the background clutter through the local correlation characteristics of the MRF. The non-negative image patch model [19], the non-convex rank approximation method (NRAM) [20], and the non-convex optimization with $\ell_p$ norm constraint model (NOLC) [21] were successively proposed under the inspiration of IPI. Dai et al. [19] employed the singularity minimization theory, and NRAM combined the alternating direction method of multipliers [22] with difference of convex programming [23]. Zhang et al. [21] invoked the $\ell_p$ norm for optimization. Zhang et al. [24] introduced the tensor nuclear norm and the weighted $\ell_1$ norm, and solved the problem on three-dimensional patches. Inspired by this, Guan et al. [25] improved the low-rank patch background tensor constraint by non-convex tensor rank second generation, enhancing the robustness of the tensor algorithm. In addition, morphological theory and RPCA theory can be fused in some areas: Zhu et al. [5] successfully fused the Tophat regularization operator into low-rank tensor completion, exploiting prior knowledge of the target structure.
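As a rough illustration of why RPCA-type solvers are iterative, the sketch below solves the relaxed problem of Equation (2) with an inexact augmented Lagrange multiplier scheme. It follows the commonly used form of the algorithm rather than reproducing the code of [15]; the function name `rpca_ialm`, the default $\lambda = 1/\sqrt{\max(m, n)}$, and the penalty schedule (`mu`, `rho`) are assumptions taken from typical implementations. It also assumes the input matrix is not all zeros.

```python
import numpy as np

def rpca_ialm(D, lam=None, tol=1e-7, max_iter=500):
    """Sketch of inexact ALM for: min ||A||_* + lam * ||E||_1  s.t.  D = A + E."""
    D = np.asarray(D, dtype=float)
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))        # assumed default weight
    norm_two = np.linalg.norm(D, 2)           # largest singular value
    norm_fro = np.linalg.norm(D, "fro")
    Y = D / max(norm_two, np.abs(D).max() / lam)   # multiplier initialisation
    mu, rho = 1.25 / norm_two, 1.5            # assumed penalty schedule
    mu_max = mu * 1e7
    A = np.zeros_like(D)
    E = np.zeros_like(D)
    for _ in range(max_iter):
        # Low-rank update: singular value thresholding with threshold 1/mu.
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Sparse update: element-wise soft thresholding with threshold lam/mu.
        R = D - A + Y / mu
        E = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        # Multiplier and penalty updates.
        Z = D - A - E
        Y = Y + mu * Z
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(Z, "fro") / norm_fro < tol:
            break
    return A, E
```

In a detection setting, `D` would be the image or patch matrix, `A` the estimated background, and `E` the candidate target component. Each iteration costs a full singular value decomposition, which is the main reason such methods are slower than filter-based approaches.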
2.3. Motivation
Influential methods such as IPI, NRAM, and NOLC are all based on RPCA theory. However, each has shortcomings in infrared small target detection. IPI introduces the patch model, but its patches are highly redundant, which leads to a long detection time. IPI also adopts the Frobenius norm and suppresses strong edges poorly. Methods such as NRAM and NOLC suppress strong edges by adding constraints, but they suffer from target loss.
It can be seen that redundant patches are very time-consuming, and that constraints on strong edges lead to target loss. We conclude that these methods share a common problem: the effectiveness and the timeliness of infrared small target detection cannot both be kept at a high level.
The time consumption problem results from the redundant patches of the above methods. Adjusting the patch size and step size to better detect targets is one research direction [2]. Wang et al. [26] introduced the idea of non-repeating sampling into the construction of the patch image; Zhou et al. [27] reduced the patch size while keeping the classical step size, which reduces the overlap. Their work has helped improve the performance of the patch image. Speed can also be gained by changing the sampling method: Li et al. [28] used an observation vector to project the original image and obtain an observation matrix for detection, and Reference [24] adopted three-dimensional patches to handle very complex scenes.
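The effect of the step size on redundancy can be seen in a small sketch of patch-image construction. The function name `build_patch_image` and the sizes used below are illustrative assumptions, not the settings of any cited method: each window is vectorized into one column, and setting the step equal to the patch size gives non-overlapping sampling. Pixels that do not fill a complete window at the right and bottom borders are simply dropped in this sketch.

```python
import numpy as np

def build_patch_image(img, patch, step):
    """Slide a patch x patch window over img and stack each window as a column."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    cols = []
    for y in range(0, h - patch + 1, step):
        for x in range(0, w - patch + 1, step):
            cols.append(img[y:y + patch, x:x + patch].ravel())
    # Result shape: (patch * patch, number_of_windows)
    return np.stack(cols, axis=1)
```

For a 64 x 64 image and 16 x 16 patches, a step of 8 produces 49 columns, whereas a step of 16 produces 16 columns that together contain every pixel exactly once. The matrix passed to the decomposition is therefore about three times smaller in the non-overlapping case, which reduces the cost of every iteration.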
The detection effect problem arises because the above algorithms accept the relaxation of the $\ell_0$ norm and work on variants of the $\ell_1$ norm, while the Frobenius norm cannot withstand clutter and noise. To solve this problem, we investigated methods that express the sparse term through the $\ell_0$ norm and its variants.
The $\ell_p$ norm, log-sum [29], and weighted $\ell_1$ norm [30], combined with minimization, have been employed to replace Equation (1). Chartrand [31] proposed a proximal $p$-norm to replace the $\ell_1$ norm. Sun et al. [32] proposed truncating the $\ell_1$ norm and applying it to the singular values of the sparse and low-rank parts. Under the influence of these studies, Zhou et al. [33] proposed Equation (4) based on SPCP [34], which is very close to Equation (1):
In Equation (4), the $\ell_0$ norm is adopted on the sparse term, which is the biggest difference from Equation (3). In addition, Equation (4) adds an adjustable coefficient to the last term. Further, Liu et al. [35] proposed SRPCP, which introduces a regularization term to extract the low-rank part of the image and applies the $\ell_0$ norm directly to enhance the sparse term. The idea of SRPCP, which adopts the $\ell_0$ norm directly without approximation, matches the two goals identified above: fast detection and good detection results.
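The practical difference between keeping the $\ell_0$ norm and relaxing it to the $\ell_1$ norm can be illustrated with the corresponding proximal operators. This is a generic sketch with assumed function names, not the heuristic solver used by SRPCP: the $\ell_0$ penalty leads to hard thresholding, which keeps surviving entries unchanged, while the $\ell_1$ penalty leads to soft thresholding, which also shrinks them.

```python
import numpy as np

def hard_threshold(v, lam):
    """Minimiser of 0.5 * (x - v)^2 + lam * [x != 0]: keep entries above sqrt(2 * lam)."""
    v = np.asarray(v, dtype=float)
    return np.where(np.abs(v) > np.sqrt(2.0 * lam), v, 0.0)

def soft_threshold(v, lam):
    """Minimiser of 0.5 * (x - v)^2 + lam * |x|: shrink every entry by lam."""
    v = np.asarray(v, dtype=float)
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)
```

With `v = [3.0, -0.5, 1.2]` and `lam = 0.5`, hard thresholding returns `[3.0, 0.0, 1.2]`, whereas soft thresholding returns `[2.5, 0.0, 0.7]`. The shrinkage of the retained entries is one way to picture the relaxation error that a direct $\ell_0$ formulation avoids.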
SRPCP [36] is a new model based on robust principal component analysis theory. Unlike the traditional relaxation methods, SRPCP is inspired by the model proposed by Zhou et al. [33] and adopts the $\ell_0$ norm to constrain the sparse part of the matrix. Further, to address the shortcomings of the Frobenius norm in detecting infrared small targets, SRPCP instead restricts the loss term of the objective function with the $\ell_1$ norm. The objective function proposed by the SRPCP model is shown in Equation (5) [35]:
Compared with Equations (1)–(4), Equation (5) adopts the $\ell_0$-$\ell_1$ norm structure to perform the sparse decomposition of the matrix. SRPCP reduces the error caused by relaxing the $\ell_0$ norm and significantly improves the recovery of the low-rank part [36]. It is also robust against intensive noise in the input. In SRPCP, the objective function is Equation (6):
The solution adopts a heuristic method [36]. The procedure is mainly divided into two steps [35], which are explained in detail in the solution of the NOP model.
In summary, to overcome the shortcomings of models based on RPCA and its derivatives, a non-overlapping sliding window model via $\ell_0$-$\ell_1$ sparse regularization is proposed. The detailed contributions of this paper are as follows.
- (1) To reduce the redundancy of the patch image, an adaptive non-overlapping patch (NOP) sampling model is proposed. The model has a lightweight structure and automatically adjusts its size according to the size of the input image. Compared with fixed-patch-size methods, NOP is more flexible when facing images of different sizes, and it greatly reduces the detection time while maintaining a high detection performance. We apply NOP to many different scenes and detect small targets successfully. NOP thus provides a simple framework for detecting infrared small targets.
- (2) SRPCP is applied to the proposed NOP framework. To the best of our knowledge, this is the first application of SRPCP to infrared small target detection. A direct objective function is adopted: the sparse part is represented via the $\ell_0$ norm, and the objective function is constrained via the $\ell_1$ norm. The $\ell_0$-$\ell_1$ norm structure improves the background reconstruction and noise immunity, and NOP achieves good results in infrared small target detection.
The rest of the paper is organized as follows. In the third part, we briefly introduce the SRPCP theory. In the fourth part, the proposed NOP model and its solution method are presented in detail, showing the complete small target detection process. A series of relevant experiments and experimental results are presented and discussed in detail in Section 4. Finally, the conclusion of this paper is given.