Hyperspectral Anomaly Detection via Low-Rank Representation with Dual Graph Regularizations and Adaptive Dictionary

Cheng, Xi; Mu, Ruiqi; Lin, Sheng; Zhang, Min; Wang, Hai

doi:10.3390/rs16111837


by Xi Cheng, Ruiqi Mu, Sheng Lin, Min Zhang and Hai Wang *

School of Aerospace Science and Technology, Xidian University, Xi’an 710126, China

* Author to whom correspondence should be addressed.

Remote Sens. 2024, 16(11), 1837; https://doi.org/10.3390/rs16111837

Submission received: 18 April 2024 / Revised: 16 May 2024 / Accepted: 17 May 2024 / Published: 21 May 2024

(This article belongs to the Special Issue Feature Extraction and Data Classification in Hyperspectral Imaging II)


Abstract

In a hyperspectral image, there is a close correlation between spectra and a certain degree of correlation in the pixel space. However, most existing low-rank representation (LRR) methods struggle to utilize these two characteristics simultaneously to detect anomalies. To address this challenge, a novel low-rank representation with dual graph regularization and an adaptive dictionary (DGRAD-LRR) is proposed for hyperspectral anomaly detection. To be specific, dual graph regularization, which combines spectral and spatial regularization, provides a new paradigm for LRR, and it can effectively preserve the local geometrical structure in the spectral and spatial information. To obtain a robust background dictionary, a novel adaptive dictionary strategy is utilized for the LRR model. In addition, extensive comparative experiments and an ablation study were conducted to demonstrate the superiority and practicality of the proposed DGRAD-LRR method.

Keywords:

hyperspectral anomaly detection; low-rank representation; dual graph regularization; adaptive dictionary


1. Introduction

In contrast to RGB images [1,2,3,4], multispectral images [5], and others [6], a hyperspectral image (HSI) possesses several hundred bands that form a nearly continuous spectral curve for each pixel. Differences between spectral curves make different ground features easy to discriminate. Therefore, HSIs [7] have been employed in various fields [8,9,10,11,12] in recent years. Hyperspectral image processing encompasses many tasks, such as change detection [13], classification [14,15], anomaly detection [16,17,18], fusion [5,19,20], band selection [21], and others [22,23,24,25].

Hyperspectral anomaly detection (HAD) [26,27] is one of the most challenging tasks in hyperspectral technology because no prior knowledge about the targets is available. This unsupervised setting is in line with actual detection scenarios. For this reason, HAD has been applied to mineral detection, food safety, search and rescue, ocean exploration and other fields. Recently, low-rank representation (LRR) has become a hot topic in HAD, and most existing LRR approaches have yielded satisfying detection performance. However, two issues still need to be explored, namely the constraint strategies for the LRR optimization problem and the dictionary construction method.

(1) Strategies for constraining the LRR optimization problem: The LRR model effectively captures the global structure of a hyperspectral image by dividing it into a low-rank background part and a sparse anomaly part. In hyperspectral data, local spectra are significantly correlated, which helps to distinguish surface targets, and similar ground targets are often observed in local spatial neighborhoods. These local patterns in the spectral and spatial data aid the discrimination between backgrounds and anomalies. Most LRR methods make limited use of the local spectral and local spatial information, which limits their ability to capture anomalies. Specifically, some enhanced LRR models incorporate total variation, graph theory, and structural sparsity to improve spectral utilization in LRR, but they neglect the significance of spatial information.

(2) Dictionary construction methods: The background dictionary plays a critical role in accurately depicting the background component in the LRR model. Various approaches have been devised to enhance the LRR model through diverse dictionary construction methods. Typically, unsupervised techniques, like clustering-based methods, are employed to establish the background dictionary. Throughout the dictionary creation process, these unsupervised methods often necessitate manual parameter tuning to achieve a robust background dictionary. This procedure in dictionary construction can be arduous and time-consuming in real-world applications.

To address the aforementioned issues, a novel low-rank representation with dual graph regularization and an adaptive dictionary is proposed in this paper. This work is inspired by GTVLRR [28], which uses graph and total variation regularization to mine the local geometrical structure of the spatial information and exhibits advanced performance. We believe that spatial correlation must be considered alongside the relationship among spectra, as effective spectral features are essential for distinguishing various land features. A spectral-spatial graph offers a comprehensive representation of an HSI in the LRR model. Consequently, a novel form of regularization, termed spectral-spatial graph regularization, is developed to constrain the LRR in this paper. Furthermore, following the principles of concept factorization, the original HSI is used to construct the background dictionary. Throughout the iterative optimization of the LRR, the background dictionary is dynamically updated to enhance detection performance.

The main contributions of the DGRAD-LRR are summarized as follows.

(1) A new paradigm with the spectral-spatial graph is proposed for the LRR model in the HAD. This dual graph strategy can effectively exploit the local geometrical structure in the spectral and spatial dimensions to enhance the representation ability for the hyperspectral image.

(2) To obtain the relatively robust background dictionary, an adaptive dictionary method based on concept factorization is designed for the LRR model. The background dictionary is obtained by the linear transformation of an original HSI. In the optimization stage, the background dictionary can be constantly updated in each iteration. In this way, the separation effect of backgrounds and anomalies can be improved.

2. Related Work

In the past two decades, scholars have observed that anomalies occupy a very small part of an HSI and differ from their surroundings. Based on this observation, various HAD algorithms have been proposed. The Reed-Xiaoli (RX) algorithm [29] is the earliest approach in HAD and is regarded as a benchmark. It assumes that the entire HSI conforms to a multivariate Gaussian distribution. Thus, the mean vector and covariance matrix are used to model the background, and the anomaly degree of each pixel is given by its Mahalanobis distance from the background. Subsequently, several researchers found that this benchmark has some flaws. Concretely, a local RX with a double-window method was designed to deal with the effect of background homogeneity. The kernel RX detector [30] adopts a nonlinear mapping strategy to exploit the correlation of the original HSI space. To improve the RX detector, further variants have been proposed, such as the fractional Fourier entropy-based RX, weighted RX [31], and segment RX. In practice, the background is difficult to fit with a Gaussian distribution, so this assumption is often unreasonable. To address this, reconstruction-based methods have been put forward; their underlying idea is that backgrounds in an HSI can be well represented but anomalies cannot. The reconstruction residual measures the anomaly degree of each pixel. These reconstruction algorithms are of two types: deep-learning algorithms and representation-based algorithms.
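As a point of reference, the global RX benchmark described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation of [29]: the function name `rx_scores`, the use of a pseudo-inverse, and the synthetic data are all choices made here for illustration.

```python
import numpy as np

def rx_scores(X):
    """Global RX anomaly scores for an HSI matrix X of shape (b, n): b bands, n pixels."""
    mean = X.mean(axis=1, keepdims=True)        # background mean vector, shape (b, 1)
    centered = X - mean
    cov = centered @ centered.T / X.shape[1]    # background covariance, shape (b, b)
    cov_inv = np.linalg.pinv(cov)               # pseudo-inverse guards against a singular covariance
    # Squared Mahalanobis distance of every pixel to the background statistics.
    return np.einsum("in,ij,jn->n", centered, cov_inv, centered)

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(5, 200))   # synthetic data: 5 bands, 200 pixels
X_demo[:, 17] += 8.0                 # one implanted anomalous pixel
scores = rx_scores(X_demo)           # the implanted pixel receives the largest score
```

Because the mean and covariance are estimated from all pixels, anomalies contaminate the background statistics, which is one motivation for the local and weighted RX variants mentioned above.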

2.1. Deep-Learning Algorithms

As deep learning models are widely applied in various fields [32,33,34], they have also become an effective tool for HAD. Existing deep learning methods can be roughly classified into two parts: autoencoder (AE)-based methods and generative adversarial network (GAN)-based methods.

AE models focus on the background representation, so anomalies are reconstructed poorly. This property leads to high reconstruction errors for anomalies and thereby highlights them. To be specific, the manifold-constrained AE network (MC-AEN) [35] integrates manifold learning into AE to preserve the intrinsic structure of an HSI. Similarly, to extract the intrinsic feature in HAD, SAOCNN introduces the self-attention mechanism and one-class strategy to construct an AE model [36]. In the one-class classification network, a novel deep support vector data description is combined with spectral-spatial information [37]. As for the inter-band correlation, SSRICAD [38] adopts the approach of spatial-spectral joint reconstruction. A novel two-branch AE adopts 3D convolution to mine the spectral-spatial information of the HSI [39]. Due to the lack of prior knowledge in HAD, some models leverage low-rank representation to guide the AE optimization, such as DeepLR [40], LELRP-AD [41], DFAN [42] and DLRSPs-DAEs [43]. To improve the background suppression, a differential convolution network is put forward for HAD [44].

In the GAN-based methods, Jiang et al. developed an adaptive semi-supervised algorithm [45] to acquire pure backgrounds and enhance the background representation. To optimize distinguishing between backgrounds and anomalies, a spectral adversarial feature learning framework [46] is put forward for HAD, which employs spectral loss and adversarial loss to mine the intrinsic spectral characteristics. As for the weakly supervised method, a novel discriminative learning network [47] with spectral constraints was designed to improve detection accuracy. Considering the influence of noise and spatial resolution, the ability to separate the abnormal from the background is limited. To deal with this, FTGAN [48] maps the original spectral information to the fractional Fourier space, and it adopts FrFD features to construct the GAN model.

In addition, some other types of approaches [49] have been proposed to obtain good performance recently. Among them, the CL-GaGAN [50] combines continual learning with a capsule network to achieve a unified detector. The pixel-associated AE [51] is proposed to enhance the global pixel similarity in an HSI.

2.2. Representation-Based Algorithms

Deep learning methods have achieved satisfactory performance. Nevertheless, the efficacy of deep learning techniques relies heavily on the quantity of training data, whereas HAD operates in an unsupervised fashion without access to training labels. Additionally, deep learning models exhibit limited interpretability. These issues pose significant challenges for deep learning methodologies. In contrast, traditional methods offer simplicity and stability in unsupervised tasks along with superior interpretability. Among them, the representation-based approach is a classic and effective method. It assumes that each pixel of an HSI can be represented by a certain dictionary under a suitable model, with the residual acting as a measurement of the degree of the anomaly. Most representation-based methods fall into three categories, namely sparse representation (SR) approaches, collaborative representation (CR) approaches, and low-rank representation (LRR) approaches.

SR methods typically assume that each background pixel in an HSI can be represented by a few atoms of an overcomplete background dictionary, whereas anomalies cannot be represented well. The larger the representation error of a testing pixel, the higher its degree of anomaly. The first approach [52] in HAD utilizes local sparsity divergence to construct the SR model, and the dictionary acquisition is achieved using a sliding dual window method. In [53], BJSR leverages an adaptive orthogonal background complementary subspace method to obtain the pure local normal atoms, and it employs an adaptive subspace detector to optimize the detection result. In order to ensure physical relevance, a linear mixed model is utilized to impose a summation constraint and a non-negativity constraint on the abundance vector, while removing the upper bound constraint on the sparsity level, with the aim of improving the recovery of the test pixels [54]. To enhance the construction of the background dictionary, [55] combines background estimation with adaptive weighted sparse representation.

Due to the competitive relationship between dictionary atoms in SR, the CR model leverages the collaboration of dictionary atoms to optimize HSI representation; that is, it allows all dictionary atoms to take part in the linear representation. Based on this, it assumes that each background pixel can be easily represented using all background atoms, but an anomalous pixel cannot [56]. To mine the contribution of each normal pixel, weight regularization is introduced to adaptively model the backgrounds even when they are contaminated by anomalies [57]. A modified CR approach [58] adopts a dual window method to evaluate the backgrounds and remove some anomaly pixels that are different from most other pixels. To handle noise contamination, a fractional Fourier transform method is integrated into the sliding dual window CR detector [59]. To strengthen hyperspectral image representation, an effective method that integrates sparse representation and collaborative representation [60] has been proposed. The nonlinear function and the guided filter are employed to reprocess the original detection result. In terms of dictionary purification, a novel approach utilizes saliency weight to guide collaborative-competitive representation for HAD [61].

SR and CR detect anomalies pixel by pixel, whereas LRR captures the global characteristics of an HSI [27]. This is the difference between LRR and the other two types (SR and CR). Further, in an HSI, backgrounds are globally homogeneous and therefore low-rank; in contrast, anomalies occupy a very small portion and are therefore sparse. For this reason, an HSI can be separated into two parts, namely the sparse anomalies and the low-rank background. Low-rank and sparse representation (LRASR) [62] is the most classical work in the LRR literature as it presents the first LRR implementation for HAD. LRASR effectively utilizes global and local information: it designs a sparsity-inducing regularization to mine the local structure in an HSI and leverages a stable and discriminative dictionary. To obtain a robust dictionary, the PAB-DC (potential anomaly and background dictionary construction) detector [63] was proposed for HAD; it adopts local-nonlocal similarity to build the background dictionary and uses a residual method based on the local area to construct the anomaly dictionary. To improve the ability to isolate anomalies and noise, the LSDM-MoG [64] was put forward. To improve the separability of anomalies and backgrounds, AHMID [65] employs a joint regularization including the L_1,1 and Frobenius norms to represent the complex noise part of an HSI, and it adopts a hierarchical structure strategy to separate anomalies and backgrounds. Based on its good performance in multiple fields of image processing, the generalized nonconvex low-rank tensor approximation (GNLTR) [66] detector was designed to optimize the LRR model. In addition, some other variants [60,67,68] based on the LRR have been put forward. ALRTT adopts a novel adaptive low-rank transformed tensor approach [69]. Structured sparsity is integrated into the LRR model to maintain local information [70].

3. Materials and Methods

As shown in Figure 1, the processing of DGRAD-LRR includes three steps: (1) the optimization problem construction of DGRAD-LRR; (2) the optimization solution of DGRAD-LRR; and (3) anomaly detection. The detailed description is as follows.

3.1. Low Rank Representation

The observed hyperspectral dataset is

X \in R^{b \times n}

, where n denotes the number of pixels and b is the number of spectral bands. In HAD, an inherent attribute is that anomalies occupy a very small portion of an HSI while the background occupies most of it, and many background pixels are similar and thus have low-rank characteristics. For this reason, an HSI can be separated into two components, namely the low-rank background and the sparse anomaly. It can be formulated by

X = A H + E

(1)

where

A H

represents the background part of an HSI;

A \in R^{b \times r}

is the background dictionary and

H \in R^{r \times n}

means the coefficient of the background dictionary;

E \in R^{b \times n}

denotes the anomaly part;

r

is the number of atoms in the background dictionary.

Low-rank representation is an effective tool for modeling Equation (1). LRR imposes prior regularizations that constrain the background coefficients to be low-rank and the anomaly part to be sparse. Thereby, the optimization problem of LRR can be expressed by

\begin{array}{l} \min_{H, E} \operatorname{rank}(H) + λ {‖E‖}_{0} \\ \text{s.t.} \; X = A H + E \end{array}

(2)

where

λ > 0

is a weighting coefficient; the

\operatorname{rank}(H)

is the rank of

H

;

{‖\cdot‖}_{0}

is the L_0 norm, which counts the number of nonzero entries. Equation (2) is NP-hard because of the rank function and the L_0 norm. To deal with this, a convex surrogate is utilized, formulated by

\begin{array}{l} \min_{H, E} {‖H‖}_{*} + λ {‖E‖}_{2, 1} \\ \text{s.t.} \; X = A H + E \end{array}

(3)

where

{‖\cdot‖}_{*}

means the nuclear norm and

{‖\cdot‖}_{2, 1}

denotes the L_2,1 norm, i.e., the sum of the Euclidean norms of the columns of a matrix.
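To make the two surrogate terms in Equation (3) concrete, the following sketch evaluates them with NumPy. The helper names `nuclear_norm`, `l21_norm`, and `lrr_objective` are hypothetical and introduced only for illustration; the L_2,1 norm is taken column-wise here, assuming one column per pixel as in X.

```python
import numpy as np

def nuclear_norm(H):
    """Sum of singular values, the convex surrogate of rank(H)."""
    return float(np.linalg.svd(H, compute_uv=False).sum())

def l21_norm(E):
    """Sum of the Euclidean norms of the columns of E (one column per pixel)."""
    return float(np.linalg.norm(E, axis=0).sum())

def lrr_objective(H, E, lam):
    """Value of the relaxed LRR objective in Equation (3) for given H and E."""
    return nuclear_norm(H) + lam * l21_norm(E)

H_demo = np.diag([2.0, 3.0])                     # singular values 2 and 3
E_demo = np.array([[3.0, 0.0], [4.0, 0.0]])      # one nonzero column of norm 5
value = lrr_objective(H_demo, E_demo, lam=0.5)   # 5 + 0.5 * 5 = 7.5
```

The column-wise grouping in the L_2,1 norm encourages entire pixels, rather than isolated entries, to be zero in E, which matches the assumption that anomalies are a few whole pixels.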

3.2. Adaptive Dictionary Construction

The construction of the background dictionary is an important step for LRR optimization. Existing LRR methods usually leverage unsupervised models to select the background dictionary, and some manual settings need to be adjusted to obtain a purer background. However, this operation is tedious and time-consuming. To address this, a novel adaptive strategy based on concept factorization (CF) is designed to construct the background dictionary in this paper. It embeds the dictionary construction into the representation model. During the iterative optimization of the LRR, the background dictionary and the coefficient matrix are constantly updated. Compared with previous dictionary construction methods, this strategy removes the separate, manually tuned dictionary construction step.

(1) Nonnegative Matrix Factorization (NMF): It is necessary to quickly and effectively reduce the dimensionality of information since the image contains a lot of redundant information. To this end, NMF can be applied as it can decompose a non-negative matrix into two non-negative matrix factors. Specifically, the given data can be defined by a non-negative matrix

X = [x_{1}, x_{2}, \dots, x_{n}] \in R^{m \times n}

, where n is the quantity of data and m is the feature dimension of data. The NMF can be denoted by

\begin{array}{l} \min_{U, V} {‖X - U V‖}_{F}^{2} \\ \text{s.t.} \; U \geq 0, V \geq 0 \end{array}

(4)

where

U \in R^{m \times r}

is a basis matrix and each column of

U

is a basis vector to represent

X

;

V \in R^{r \times n}

is a coefficient matrix;

{‖\cdot‖}_{F}

is the Frobenius norm. As shown in Equation (4),

U V

approximates

X

by NMF, namely

X \approx U V

.
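The text does not state which solver is used for Equation (4). As one common choice, the sketch below uses the classical multiplicative update rules for the Frobenius-norm NMF objective; this is a swapped-in technique for illustration, and the function name `nmf_multiplicative`, the iteration count, and the small constant `eps` are assumptions.

```python
import numpy as np

def nmf_multiplicative(X, r, iters=200, eps=1e-10, seed=0):
    """Approximate a non-negative X (m x n) as U @ V with U (m x r) >= 0 and V (r x n) >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, r)) + eps
    V = rng.random((r, n)) + eps
    for _ in range(iters):
        # Multiplicative updates keep both factors non-negative by construction.
        V *= (U.T @ X) / (U.T @ U @ V + eps)
        U *= (X @ V.T) / (U @ V @ V.T + eps)
    return U, V

rng = np.random.default_rng(1)
X_nmf = rng.random((6, 3)) @ rng.random((3, 40))   # synthetic non-negative data of rank 3
U_nmf, V_nmf = nmf_multiplicative(X_nmf, r=3)
rel_err = np.linalg.norm(X_nmf - U_nmf @ V_nmf) / np.linalg.norm(X_nmf)
```

The relative error `rel_err` gives a quick check of how well the product U V approximates X for the chosen r.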

(2) Concept Factorization (CF): NMF yields satisfying decomposition results, but it has two drawbacks [71]: (1) the data to be decomposed must be non-negative; (2) it can only be performed in the original data space and cannot be effectively executed in a transformed space. Thus, CF is introduced to address these two problems. In CF, each column of

U

is represented as a linear combination of the columns of the original matrix

X

. The operation is denoted by

u_{j} = \sum_{i = 1}^{n} w_{i j} x_{i}, \quad w_{i j} > 0, \quad W = \begin{bmatrix} w_{11} & \dots & w_{1 r} \\ \vdots & \ddots & \vdots \\ w_{n 1} & \dots & w_{n r} \end{bmatrix}

(5)

Using Equation (5),

X \approx U V

is transformed to

X \approx X W V

. Meanwhile, the objective function of CF is

\begin{array}{l} \min_{W, V} {‖X - X W V‖}_{F}^{2} \\ \text{s.t.} \; W \geq 0, V \geq 0 \end{array}

(6)

where

W \in R^{n \times r}

and

V \in R^{r \times n}

are the incidence matrix and the representation matrix, respectively.

Based on CF theory, the original HSI data (

X \in R^{b \times n}

) can be utilized to represent the background dictionary (

A

), and

A

is defined by

A = X W

(7)
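A small numerical check may help to see what Equation (7) means: every dictionary atom is a weighted combination of the original pixels, as in Equation (5). The sizes and the random W below are purely illustrative assumptions; in the method itself W is learned during optimization.

```python
import numpy as np

rng = np.random.default_rng(0)
b, n, r = 4, 30, 3             # bands, pixels, dictionary atoms (illustrative sizes)
X_cf = rng.random((b, n))      # synthetic HSI matrix, one column per pixel
W_cf = rng.random((n, r))      # non-negative weights; random here, learned in the method

A_cf = X_cf @ W_cf             # Equation (7): background dictionary of shape (b, r)

# Equation (5) for the first atom: a weighted sum of all pixels.
atom0 = sum(W_cf[i, 0] * X_cf[:, i] for i in range(n))
```

Since the dictionary is parameterized by W, updating W during the optimization updates the dictionary without a separate clustering or selection step.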

Using Equation (7), we adopt an adaptive dictionary strategy in the LRR. Inspired by [72,73], the squared Frobenius norm

{‖\cdot‖}_{F}^{2}

is used to characterize the residual

E

, which contains the noise, and an orthogonality constraint is applied to

W

, namely

W W^{T} = I

. Thus, the objective function of LRR based on an adaptive dictionary is formulated by

\begin{array}{l} \min_{H, W} \frac{1}{2} {‖X - X W H‖}_{F}^{2} + λ {‖H‖}_{*} \\ \text{s.t.} \; W W^{T} = I, H \geq 0, W \geq 0 \end{array} .

(8)

3.3. Low-Rank Representation Based on Dual Graph Regularization

Previous LRR-based methods have two limitations. First, they focus on the global Euclidean structure of the spatial features and ignore the importance of the local geometric structure. Second, their utilization of spectral information is limited. Overall, they do not fully utilize the HSI information, which results in suboptimal representation ability. Inspired by [28], a spectral-spatial graph is constructed to constrain the LRR model. In this way, the model not only uncovers potential spatial manifolds but also preserves the inherent geometric structure of the spectrum, which enhances the learning ability of LRR.

(1) Spatial Graph Regularization: The spatial graph is built with the k-nearest neighbors (k-NN). An HSI can be denoted by

X = \{x_{1}, x_{2}, \dots, x_{n}\} \in R^{b \times n}

, and the spatial graph is described by the adjacency matrix

W

W_{i j} = \begin{cases} \exp (- \frac{{‖x_{i} - x_{j}‖}_{2}^{2}}{2 σ^{2}}), & x_{i} \in K (x_{j}) \text{ or } x_{j} \in K (x_{i}) \\ 0, & \text{otherwise} \end{cases}

(9)

where

K (\cdot)

denotes the set of k-nearest neighbors of a pixel, and the number of neighbors is set to 5;

σ

is a scale parameter whose value is set to 1 in this paper. The weights

W_{i j}

form the adjacency matrix

W

. A weight

W_{i j}

is nonzero only if

x_{i}

is among the k-nearest neighbors of

x_{j}

(x_{i} \in K (x_{j}))

or

x_{j}

is among the k-nearest neighbors of

x_{i}

(x_{j} \in K (x_{i}))

; otherwise, the value of

W_{i j}

is 0.
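The adjacency matrix of Equation (9) can be sketched as follows. This is an illustrative dense implementation that is only practical for a small number of pixels; the function name `knn_gaussian_adjacency` is hypothetical, and the defaults k = 5 and σ = 1 follow the values stated in the text.

```python
import numpy as np

def knn_gaussian_adjacency(X, k=5, sigma=1.0):
    """Symmetric k-NN adjacency with Gaussian weights; X has shape (b, n), one column per sample."""
    n = X.shape[1]
    sq = np.sum(X**2, axis=0)
    # Pairwise squared Euclidean distances between columns (clipped at 0 for round-off).
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X.T @ X), 0.0)
    np.fill_diagonal(d2, np.inf)                  # a sample is not its own neighbor
    nearest = np.argsort(d2, axis=1)[:, :k]       # indices of the k nearest samples per row
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), nearest.ravel()] = True
    mask = mask | mask.T                          # x_i in K(x_j) or x_j in K(x_i)
    return np.where(mask, np.exp(-d2 / (2.0 * sigma**2)), 0.0)

rng = np.random.default_rng(0)
X_graph = rng.random((3, 20))                     # synthetic data: 3 bands, 20 pixels
W_s = knn_gaussian_adjacency(X_graph)
```

The same function applied to the transposed data matrix would give a band-to-band graph, since it only assumes that each column is one sample.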

As shown in Equation (8),

H

is a coefficient matrix in the LRR with adaptive dictionary. According to the manifold hypothesis, if two pixels

x_{i}

and

x_{j}

in a hyperspectral image are close to each other, their representations

h_{i}

and

h_{j}

are also close. Here,

x_{i}

is the i-th column of the hyperspectral dataset

X

and

h_{i}

is the i-th column of

H

. The weight

W_{i j}

decreases as the distance between

x_{i}

and

x_{j}

increases, so nearby pixels impose a stronger penalty on the distance between

h_{i}

and

h_{j}

.

Based on the above analysis, the derivation of the spatial graph regularization model is as follows:

\begin{array}{l} \frac{1}{2} \sum_{i, j = 1}^{n} {‖h_{i} - h_{j}‖}_{2}^{2} W_{i j} \\ = \sum_{i, j = 1}^{n} (h_{i}^{T} W_{i j} h_{i} - h_{i}^{T} W_{i j} h_{j}) \\ = \operatorname{Tr}(H D_{s} H^{T}) - \operatorname{Tr}(H W H^{T}) \\ = \operatorname{Tr}(H L_{s} H^{T}) \end{array}

(10)

where

W

is an adjacency matrix;

D_{s}

is a degree matrix;

L_{s}

is a Laplacian matrix of spatial graphs. It should be noted that

L_{s} = D_{s} - W

. The optimization problem of the spatial graph regularization is

\min_{H} \operatorname{Tr}(H L_{s} H^{T})

.
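The identity in Equation (10) can be verified numerically. The sketch below uses a random symmetric weight matrix instead of a k-NN graph, which is an assumption made only to keep the check short; the identity holds for any symmetric W.

```python
import numpy as np

rng = np.random.default_rng(0)
r, n = 3, 8
H_chk = rng.normal(size=(r, n))          # columns h_i are the pixel representations
W_chk = rng.random((n, n))
W_chk = (W_chk + W_chk.T) / 2.0          # symmetric weights
np.fill_diagonal(W_chk, 0.0)

D_chk = np.diag(W_chk.sum(axis=1))       # degree matrix
L_chk = D_chk - W_chk                    # graph Laplacian

# Left-hand side of Equation (10): weighted sum of pairwise squared distances.
pairwise = 0.5 * sum(
    W_chk[i, j] * np.sum((H_chk[:, i] - H_chk[:, j]) ** 2)
    for i in range(n) for j in range(n)
)
# Right-hand side of Equation (10): trace form with the Laplacian.
trace_form = np.trace(H_chk @ L_chk @ H_chk.T)
```

Both quantities agree up to floating-point error, which is why the pairwise penalty can be handled through the single matrix L_s in the optimization.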

(2) Spectral Graph Regularization: Compared with traditional images, HSIs have many spectral bands. The spectra in a hyperspectral image are strongly correlated, and this characteristic helps to distinguish different land covers. The local information is important for HAD and needs to be preserved. Therefore, analogously to the spatial graph regularization, a spectral graph is built to preserve the local geometric structure of the spectral features. The HSI is arranged band-wise as

Y = \{y_{1}, y_{2}, \dots, y_{b}\}, Y = X^{T}

, and the adjacency matrix of the spectral graph is denoted by

A

A_{i j} = \begin{cases} \exp (- \frac{{‖y_{i} - y_{j}‖}_{2}^{2}}{2 ψ^{2}}), & y_{i} \in K (y_{j}) \text{ or } y_{j} \in K (y_{i}) \\ 0, & \text{otherwise} \end{cases}

(11)

where

{‖y_{i} - y_{j}‖}_{2}^{2}

means the squared Euclidean distance between

y_{i}

and

y_{j}

;

K (y_{j})

is the set of k-nearest neighbors of

y_{j}

and

K (y_{i})

is the set of k-nearest neighbors of

y_{i}

; the scale parameter

ψ

is set to 1.

Since

Y = X^{T}

and

X \in R^{b \times n}

, each row vector of an HSI represents the spectral characteristics of one band, namely

X_{i, :}

is the i-th row of

X

. In Equation (8), the relationship of spectral bands can be denoted by

X W

. According to the manifold hypothesis, the closer

X_{i, :}

and

X_{j, :}

are, the closer

X_{i, :} W

and

X_{j, :} W

are. In the meantime, the closer

X_{i, :}

is to

X_{j, :}

, the greater the value assigned to the weight

A_{i j}

.

A_{i j}

is the weight between the i-th band and the j-th band, i.e., the weight of the edge that connects the two corresponding vertices of the spectral graph with adjacency matrix

A

. Thus, the spectral graph regularization model is shown below:

\begin{array}{l} \frac{1}{2} \sum_{i, j = 1}^{b} {‖X_{i, :} W - X_{j, :} W‖}_{2}^{2} A_{i j} \\ = \sum_{i, j = 1}^{b} ((X_{i, :} W) A_{i j} {(X_{i, :} W)}^{T} - (X_{i, :} W) A_{i j} {(X_{j, :} W)}^{T}) \\ = \operatorname{Tr}({(X W)}^{T} D_{m} X W) - \operatorname{Tr}({(X W)}^{T} A X W) \\ = \operatorname{Tr}({(X W)}^{T} L_{m} X W) \end{array}

(12)

where

A

is the adjacency matrix of the spectral band graph;

D_{m}

is the diagonal degree matrix of

A

, whose i-th diagonal entry is the sum of the similarities between

X_{i, :}

and all other bands

X_{j, :}

;

L_{m}

is the Laplacian matrix of the spectral graph, and

L_{m} = D_{m} - A

.

In conclusion, the optimization problem of the spectral graph regularization is

\min_{W} \operatorname{Tr}({(X W)}^{T} L_{m} X W)

.

(3) LRR with Dual Graph Regularization: Adding the regularization terms of the spectral graph and the spatial graph to the objective function of Equation (8) yields the following optimization problem:

\begin{array}{l} \min_{H, W} \frac{1}{2} {‖X - X W H‖}_{F}^{2} + λ {‖H‖}_{*} + β \operatorname{tr}(H L_{s} H^{T}) + γ \operatorname{tr}({(X W)}^{T} L_{m} X W) \\ \text{s.t.} \; W W^{T} = I, H \geq 0, W \geq 0 \end{array}

(13)

where

λ

,

β

, and

γ

are weighted coefficients;

H \in R^{r \times n}

and

W \in R^{n \times r}

; n is the number of pixels of an HSI; r is the number of low-dimensional subspaces, which is an important hyper-parameter in this paper and satisfies

r ≪ \min (n, b)

. Also, as shown in Equation (13), the adaptive dictionary and the spectral-spatial graph regularization are integrated into the LRR model, and this approach comprehensively considers the global geometric information of hyperspectral data and the geometric structure information of local space and spectrum. To some extent, the proposed method improves the representation ability of LRR and enhances the separation of backgrounds and anomalies.

3.4. Model Optimization

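The DGRAD-LRR problem in Equation (13) has multiple constraints and coupled variables. It is not jointly convex in W and H because of the bilinear term X W H and the orthogonality constraint, but each subproblem can be solved in closed form when the other variables are fixed. Based on this, the alternating direction method of multipliers (ADMM) is employed to optimize the DGRAD-LRR. The detailed description is as follows.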

Since the objective function (13) is not separable [74], the following auxiliary variables are introduced:

Z_{1}, Z_{2}, Z_{3}, Z_{4}, V_{1}, V_{2}

. The optimization problem is converted to

\begin{array}{l} \min_{H, W} \frac{1}{2} {‖X - Z_{2} H‖}_{F}^{2} + λ {‖V_{1}‖}_{*} + β \operatorname{tr}(V_{2} L_{s} V_{2}^{T}) + γ \operatorname{tr}(Z_{4}^{T} L_{m} Z_{4}) \\ \text{s.t.} \; W W^{T} = I, H \geq 0, W \geq 0, Z_{1} = W, Z_{2} = X Z_{1}, Z_{3} = W, Z_{4} = X Z_{3}, V_{1} = H, V_{2} = H \end{array}

(14)

The corresponding augmented Lagrangian function is

\begin{array}{l} L (H, W, V_{1}, V_{2}, Z_{1}, Z_{2}, Z_{3}, Z_{4}) \\ = \frac{1}{2} {‖X - Z_{2} H‖}_{F}^{2} + λ {‖V_{1}‖}_{*} + β \operatorname{tr}(V_{2} L_{s} V_{2}^{T}) + γ \operatorname{tr}(Z_{4}^{T} L_{m} Z_{4}) \\ + \frac{μ}{2} {‖Z_{1} - W - D_{1}‖}_{F}^{2} + \frac{μ}{2} {‖Z_{2} - X Z_{1} - D_{2}‖}_{F}^{2} + \frac{μ}{2} {‖Z_{3} - W - D_{3}‖}_{F}^{2} \\ + \frac{μ}{2} {‖Z_{4} - X Z_{3} - D_{4}‖}_{F}^{2} + \frac{μ}{2} {‖V_{1} - H - D_{5}‖}_{F}^{2} + \frac{μ}{2} {‖V_{2} - H - D_{6}‖}_{F}^{2} \end{array}

(15)

where

μ

is a penalty parameter, and

D_{1}, D_{2}, D_{3}, D_{4}, D_{5}, D_{6}

are the Lagrange multipliers. ADMM updates one variable at a time while keeping all other variables fixed, and it cycles through the variables in turn.

(1) Update

H

, the subproblem is formulated by

\begin{array}{l} \min_{H} \frac{1}{2} {‖X - Z_{2} H‖}_{F}^{2} + \frac{μ}{2} {‖V_{1} - H - D_{5}‖}_{F}^{2} + \frac{μ}{2} {‖V_{2} - H - D_{6}‖}_{F}^{2} \\ = \min_{H} \frac{1}{2} {‖Z_{2} H - X‖}_{F}^{2} + \frac{μ}{2} {‖H - (V_{1} - D_{5})‖}_{F}^{2} + \frac{μ}{2} {‖H - (V_{2} - D_{6})‖}_{F}^{2} \end{array} .

(16)

Taking the derivative of Equation (16) with respect to

H

and setting it to 0 gives

H = {(Z_{2}^{T} Z_{2} + 2 μ I)}^{- 1} (Z_{2}^{T} X + μ (V_{1} - D_{5}) + μ (V_{2} - D_{6})) .

(17)
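The closed-form update of Equation (17) can be sketched with a linear solve instead of an explicit inverse. The function name `update_H` is hypothetical. Note that this sketch implements Equation (17) as written and does not add any projection for the constraint H ≥ 0.

```python
import numpy as np

def update_H(X, Z2, V1, V2, D5, D6, mu):
    """Closed-form H update of Equation (17); X is (b, n), Z2 is (b, r), the rest are (r, n)."""
    r = Z2.shape[1]
    lhs = Z2.T @ Z2 + 2.0 * mu * np.eye(r)              # (r, r), positive definite for mu > 0
    rhs = Z2.T @ X + mu * (V1 - D5) + mu * (V2 - D6)    # (r, n)
    return np.linalg.solve(lhs, rhs)
```

Because the system matrix is only r x r, this step is cheap when the number of dictionary atoms r is small.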

(2) Update

W

, the subproblem is converted to

\begin{array}{l} \min_{W} \frac{μ}{2} {‖Z_{1} - W - D_{1}‖}_{F}^{2} + \frac{μ}{2} {‖Z_{3} - W - D_{3}‖}_{F}^{2} \\ \text{s.t.} \; W W^{T} = I \end{array} .

(18)

Under the orthogonality constraint, the squared Frobenius norm of W is constant, so expanding the squared norms turns the minimization in Equation (18) into the maximization of the cross terms:

\begin{array}{l} \max_{W} \operatorname{tr}(W^{T} (Z_{1} - D_{1})) + \operatorname{tr}(W^{T} (Z_{3} - D_{3})) \\ \text{s.t.} \; W W^{T} = I \end{array} .

(19)

In order to solve

W

, let

Q_{1} = Z_{1} - D_{1}, Q_{2} = Z_{3} - D_{3}

, Equation (19) is equal to

\begin{array}{l} \max_{W} \operatorname{tr}(W^{T} Q_{1}) + \operatorname{tr}(W^{T} Q_{2}) \\ \text{s.t.} \; W W^{T} = I \end{array} .

(20)

By introducing symmetric matrix multipliers

Λ

into the above equation to construct the Lagrange function, it is obtained that:

\begin{array}{l} L (W, Λ) = \operatorname{tr}(W^{T} Q_{1}) + \operatorname{tr}(W^{T} Q_{2}) - \frac{1}{2} \operatorname{tr}(Λ^{T} (W^{T} W - I)) \\ = \operatorname{tr}(W^{T} Q_{1}) + \operatorname{tr}(W^{T} Q_{2}) - \frac{1}{2} \operatorname{tr}(W Λ W^{T}) \end{array} .

(21)

Meanwhile, the following equation can be generated

\begin{array}{l} L_{W} = Q_{1} + Q_{2} - W Λ = 0 \\ \Rightarrow Λ = W^{T} (Q_{1} + Q_{2}) \end{array} .

(22)

Let

Q = Q_{1} + Q_{2} = Z_{1} - D_{1} + Z_{3} - D_{3}

and

Λ = W^{T} Q

, the following equation can be generated

\begin{array}{l} Λ^{T} Λ = Q^{T} W W^{T} Q = Q^{T} Q = V Ω U^{T} U Ω V^{T}, Λ = Λ^{T} \\ \Rightarrow Λ = V Ω V^{T} \end{array} .

(23)

We can use singular value decomposition to find an optimal

W

that is

Q = U Ω V^{T}, W = U V^{T}, Ω = \operatorname{diag}(ω)

(24)

where

(U, Ω, V)

is the singular value decomposition (SVD) of the matrix

Q

.
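The SVD-based update of Equation (24) can be sketched as follows. The function name `update_W` is hypothetical, and the thin SVD is an assumption: for a tall Q of shape (n, r) it yields a W with orthonormal columns, which is how the orthogonality constraint is interpreted in this sketch. The matrix U V^T is the orthonormal matrix that maximizes tr(W^T Q), which is equivalent to being closest to Q in the Frobenius norm.

```python
import numpy as np

def update_W(Z1, Z3, D1, D3):
    """W update of Equation (24): W = U V^T from the thin SVD of Q = (Z1 - D1) + (Z3 - D3)."""
    Q = (Z1 - D1) + (Z3 - D3)                           # shape (n, r)
    U, _, Vt = np.linalg.svd(Q, full_matrices=False)    # U: (n, r), Vt: (r, r)
    return U @ Vt
```

As with Equation (24) itself, this sketch does not enforce the additional constraint W ≥ 0.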

(3) Update

V_{1}

, the subproblem is defined by

\min_{V_{1}} λ {‖V_{1}‖}_{*} + \frac{μ}{2} {‖V_{1} - H - D_{5}‖}_{F}^{2} .

(25)

The above problem is equivalent to

\min_{V_{1}} \frac{λ}{μ} {‖V_{1}‖}_{*} + \frac{1}{2} {‖V_{1} - (H + D_{5})‖}_{F}^{2} .

(26)

By the SVT operator, the solution

V_{1}

is

V_{1} = Θ_{\frac{λ}{μ}} (H + D_{5})

(27)

where

Θ

is the singular value thresholding (SVT) operator.
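The SVT operator in Equation (27) soft-thresholds the singular values of its argument. A minimal sketch, with the hypothetical function name `svt`, is given below; the V_1 update then corresponds to calling `svt(H + D5, lam / mu)`.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: shrink every singular value of M by tau and clip at 0."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt          # scales the columns of U by the shrunk singular values

M_demo = np.diag([3.0, 1.0])
V1_demo = svt(M_demo, tau=2.0)          # singular values 3 and 1 become 1 and 0
```

Singular values below the threshold are set to zero, so a larger ratio λ/μ produces a lower-rank V_1.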

(4) Update

V_{2}

, the subproblem is converted to

\begin{array}{l} \min_{V_{2}} β \operatorname{tr}(V_{2} L_{s} V_{2}^{T}) + \frac{μ}{2} {‖V_{2} - H - D_{6}‖}_{F}^{2} \\ = \min_{V_{2}} β \operatorname{tr}(V_{2} L_{s} V_{2}^{T}) + \frac{μ}{2} {‖V_{2} - (H + D_{6})‖}_{F}^{2} \end{array} .

(28)

Similarly, taking the derivative of Equation (28) with respect to

V_{2}

and setting it to 0 gives

V_{2} = (μ (H + D_{6})) {(2 β L_{s} + μ I)}^{- 1} .

(29)
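Equation (29) can be evaluated without forming an explicit inverse. The sketch below is one possible NumPy version under stated assumptions: the function name `update_V2`, the chain-graph Laplacian, and the values of β and μ are all hypothetical and serve only to make the example runnable.

```python
import numpy as np

def update_V2(H, D6, L_s, beta, mu):
    """V2 update of Equation (29): V2 (2*beta*L_s + mu*I) = mu * (H + D6)."""
    n = L_s.shape[0]
    A = 2.0 * beta * L_s + mu * np.eye(n)  # (n, n) system matrix
    B = mu * (H + D6)                      # (r, n) right-hand side
    # Solve V2 @ A = B through the transposed system A.T @ V2.T = B.T.
    return np.linalg.solve(A.T, B.T).T

# Toy spatial graph (hypothetical): a chain of n = 6 pixels, r = 3.
n, r = 6, 3
adj = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
L_s = np.diag(adj.sum(axis=1)) - adj       # unnormalized graph Laplacian
rng = np.random.default_rng(0)
H, D6 = rng.standard_normal((r, n)), rng.standard_normal((r, n))
V2 = update_V2(H, D6, L_s, beta=0.2, mu=1.0)
```

Substituting the result back into the derivative of Equation (28) should give a matrix that is zero up to rounding error.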

(5) Update

Z_{1}

, the subproblem of

Z_{1}

is

\begin{array}{l} \min_{Z_{1}} \frac{μ}{2} {‖Z_{1} - W - D_{1}‖}_{F}^{2} + \frac{μ}{2} {‖Z_{2} - X Z_{1} - D_{2}‖}_{F}^{2} \\ = \min_{Z_{1}} {‖Z_{1} - (W + D_{1})‖}_{F}^{2} + {‖X Z_{1} - (Z_{2} - D_{2})‖}_{F}^{2} \end{array} .

(30)

We take the derivative of Equation (30) with respect to

Z_{1}

and set it to 0.

Z_{1}

is

Z_{1} = {(I + X^{T} X)}^{- 1} ((W + D_{1}) + X^{T} (Z_{2} - D_{2})) .

(31)
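A minimal NumPy sketch of Equation (31) is given below. The function name `update_Z1` and the toy dimensions are assumptions; the code solves the linear system directly rather than inverting the matrix.

```python
import numpy as np

def update_Z1(X, W, D1, Z2, D2):
    """Z1 update of Equation (31): (I + X^T X) Z1 = (W + D1) + X^T (Z2 - D2)."""
    n = X.shape[1]
    A = np.eye(n) + X.T @ X                # (n, n), symmetric positive definite
    rhs = (W + D1) + X.T @ (Z2 - D2)       # (n, r)
    return np.linalg.solve(A, rhs)

# Toy sizes (hypothetical): b = 5 bands, n = 20 pixels, r = 3.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))
W, D1 = rng.standard_normal((20, 3)), rng.standard_normal((20, 3))
Z2, D2 = rng.standard_normal((5, 3)), rng.standard_normal((5, 3))
Z1 = update_Z1(X, W, D1, Z2, D2)
```

Because the system matrix I + X^T X does not change between iterations, an implementation could factorize it once and reuse the factorization; that choice is a suggestion of this sketch rather than something stated in the paper.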

(6) Update

Z_{2}

, the subproblem

Z_{2}

can be denoted by

\begin{array}{l} \min_{Z_{2}} \frac{1}{2} {‖X - Z_{2} H‖}_{F}^{2} + \frac{μ}{2} {‖Z_{2} - X Z_{1} - D_{2}‖}_{F}^{2} \\ = \min_{Z_{2}} \frac{1}{2} {‖Z_{2} H - X‖}_{F}^{2} + \frac{μ}{2} {‖Z_{2} - (X Z_{1} + D_{2})‖}_{F}^{2} \end{array} .

(32)

We take the derivative of Equation (32) with respect to

Z_{2}

and set it to 0.

Z_{2}

is

Z_{2} = (X H^{T} + μ (X Z_{1} + D_{2})) {(H H^{T} + μ I)}^{- 1} .

(33)
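Equation (33) is a right-sided linear solve with a small r × r system matrix. The following NumPy sketch shows one way to evaluate it; the function name `update_Z2`, the toy sizes, and the value of μ are assumptions for illustration.

```python
import numpy as np

def update_Z2(X, H, Z1, D2, mu):
    """Z2 update of Equation (33): Z2 (H H^T + mu*I) = X H^T + mu*(X Z1 + D2)."""
    r = H.shape[0]
    A = H @ H.T + mu * np.eye(r)           # (r, r), symmetric
    B = X @ H.T + mu * (X @ Z1 + D2)       # (b, r)
    # Z2 = B A^{-1}; since A is symmetric, solve A @ Z2.T = B.T instead.
    return np.linalg.solve(A, B.T).T

# Toy sizes (hypothetical): b = 5 bands, n = 20 pixels, r = 3.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))
H = rng.standard_normal((3, 20))
Z1 = rng.standard_normal((20, 3))
D2 = rng.standard_normal((5, 3))
Z2 = update_Z2(X, H, Z1, D2, mu=0.7)
```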

(7) Update

Z_{3}

, the subproblem

Z_{3}

can be represented by

\begin{array}{l} \min_{Z_{3}} \frac{μ}{2} {‖Z_{3} - W - D_{3}‖}_{F}^{2} + \frac{μ}{2} {‖Z_{4} - X Z_{3} - D_{4}‖}_{F}^{2} \\ = \min_{Z_{3}} {‖Z_{3} - (W + D_{3})‖}_{F}^{2} + {‖X Z_{3} - (Z_{4} - D_{4})‖}_{F}^{2} \end{array} .

(34)

We take the derivative of

Z_{3}

in Formula (34) and set it to 0.

Z_{3}

is

Z_{3} = {(I + X^{T} X)}^{- 1} ((W + D_{3}) + X^{T} (Z_{4} - D_{4})) .

(35)

(8) Update

Z_{4}

, the subproblem of

Z_{4}

is

\begin{array}{l} \min_{Z_{4}} γ \mathrm{tr} (Z_{4}^{T} L_{m} Z_{4}) + \frac{μ}{2} {‖Z_{4} - X Z_{3} - D_{4}‖}_{F}^{2} \\ = \min_{Z_{4}} γ \mathrm{tr} (Z_{4}^{T} L_{m} Z_{4}) + \frac{μ}{2} {‖Z_{4} - (X Z_{3} + D_{4})‖}_{F}^{2} \end{array} .

(36)

We take the derivative of Equation (36) with respect to

Z_{4}

and set it to 0.

Z_{4}

is

Z_{4} = {(2 γ L_{m} + μ I)}^{- 1} (μ (X Z_{3} + D_{4})) .

(37)

When all variable updates are completed, the Lagrange multipliers (

D_{1}, D_{2}, D_{3}, D_{4}, D_{5}, D_{6}

) and the penalty parameter μ are updated as follows:

\begin{array}{l} D_{1} = D_{1} - (Z_{1} - W) \\ D_{2} = D_{2} - (Z_{2} - X Z_{1}) \\ D_{3} = D_{3} - (Z_{3} - W) \\ D_{4} = D_{4} - (Z_{4} - X Z_{3}) \\ D_{5} = D_{5} - (V_{1} - H) \\ D_{6} = D_{6} - (V_{2} - H) \\ μ = \min (ρ μ, μ_{\max}) \end{array}

(38)
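The updates in Equation (38) can be written compactly as below. This is only a sketch: the dictionary-based interface, the function name `update_multipliers`, and the default values of `rho` and `mu_max` are assumptions, since the values of ρ and μ_max are not given in this section.

```python
import numpy as np

def update_multipliers(X, v, D, mu, rho=1.1, mu_max=1e6):
    """Apply Equation (38) once. v and D are dicts of NumPy arrays.

    rho and mu_max are placeholder defaults, not values taken from the paper.
    """
    D_new = {
        "D1": D["D1"] - (v["Z1"] - v["W"]),
        "D2": D["D2"] - (v["Z2"] - X @ v["Z1"]),
        "D3": D["D3"] - (v["Z3"] - v["W"]),
        "D4": D["D4"] - (v["Z4"] - X @ v["Z3"]),
        "D5": D["D5"] - (v["V1"] - v["H"]),
        "D6": D["D6"] - (v["V2"] - v["H"]),
    }
    return D_new, min(rho * mu, mu_max)
```

Each multiplier changes only by the residual of its own constraint, so at a point where all constraints hold exactly the multipliers are left unchanged.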

In the ADMM procedure, the optimization stops when the stopping criterion is satisfied or the predetermined maximum number of iterations is reached. The maximum number of iterations (Itr_max) is set to 400 in this paper. The stopping criterion is as follows:

\begin{array}{l} {‖X - X W H‖}_{F} + {‖V_{1} - H‖}_{F} + {‖V_{2} - H‖}_{F} + \\ {‖Z_{1} - W‖}_{F} + {‖Z_{2} - X Z_{1}‖}_{F} + {‖Z_{3} - W‖}_{F} + {‖Z_{4} - X Z_{3}‖}_{F} \leq ε \end{array}

(39)

where

ε

denotes a convergence parameter, and it is set to 1 × 10⁻⁷ in this article. Further details are listed in Algorithm 1.
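The stopping rule described above can be sketched as follows. The function names `stopping_residual` and `should_stop` are hypothetical; the defaults for `eps` and `itr_max` follow the values stated in the text.

```python
import numpy as np

def stopping_residual(X, W, H, V1, V2, Z1, Z2, Z3, Z4):
    """Left-hand side of Equation (39): a sum of Frobenius norms."""
    fro = np.linalg.norm                   # Frobenius norm by default for 2-D arrays
    return (fro(X - X @ W @ H) + fro(V1 - H) + fro(V2 - H)
            + fro(Z1 - W) + fro(Z2 - X @ Z1)
            + fro(Z3 - W) + fro(Z4 - X @ Z3))

def should_stop(residual, itr, eps=1e-7, itr_max=400):
    """Stop when (39) holds or the iteration budget is used up."""
    return residual <= eps or itr >= itr_max
```

Equivalently, the loop continues only while the residual exceeds ε and the iteration counter is still below Itr_max.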

Algorithm 1: Optimization of the proposed DGRAD-LRR

1. Input: an HSI

X \in R^{b \times n}

, regularization parameters of λ, β, γ, Itr_max = 400

2. Initialize:

Z_{1} = Z_{2} = Z_{3} = Z_{4} = V_{1} = V_{2} = D_{1} = D_{2} = D_{3} = D_{4} = D_{5} = D_{6} = 0

,

ε

= 1 × 10⁻⁷

3. While (39) is not satisfied and Itr < Itr_max, do

4. Update

H

by (17)

5. Update

W

by (24)

6. Update

V_{1}

by (27)

7. Update

V_{2}

by (29)

8. Update

Z_{1}

by (31)

9. Update

Z_{2}

by (33)

10. Update

Z_{3}

by (35)

11. Update

Z_{4}

by (37)

12. Update

D_{1}, D_{2}, D_{3}, D_{4}, D_{5}, D_{6}

by (38)

13. End while

14. Output:

H

,

W

3.5. Anomaly Detection

After the above optimization processing, the reconstructed background can be denoted by

X W H

, and the anomaly part is obtained by

O = X - X W H

. To obtain the anomaly degree of each pixel in a hyperspectral image, the

{‖\cdot‖}_{2}

norm is computed for each pixel (column) of

O

.
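The scoring step can be illustrated with a short NumPy sketch. The function name `anomaly_map` and the row-major reshape to an image are assumptions of this example; the column-wise norm follows the definition X ∈ R^{b×n}, in which each column is one pixel.

```python
import numpy as np

def anomaly_map(X, W, H, height, width):
    """Per-pixel anomaly score: l2 norm of each column of O = X - X W H."""
    O = X - X @ W @ H                      # anomaly part, shape (b, n)
    scores = np.linalg.norm(O, axis=0)     # one score per pixel (column)
    # Row-major reshape is an assumption about how the image was flattened.
    return scores.reshape(height, width)

# Toy example (hypothetical): b = 2 bands, n = 4 pixels, empty background model.
X = np.array([[3.0, 0.0, 1.0, 0.0],
              [4.0, 0.0, 0.0, 2.0]])
W = np.zeros((4, 2))
H = np.zeros((2, 4))
amap = anomaly_map(X, W, H, height=2, width=2)
```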

4. Results

4.1. Experimental Setup

In this section, parameter experiments are presented to illustrate the influence of their settings, and quantitative and qualitative comparisons are conducted to reflect the superiority of the proposed method.

4.1.1. Hyperspectral Dataset

Six hyperspectral datasets [26,27] are employed for the experimental comparisons: Gulfport, Texas Coast, Los Angeles, Salinas, San Diego-1, and San Diego-2. Only Salinas is a simulated dataset; the rest are real datasets. All datasets were acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor. The pseudo-color maps and ground truth are shown in Figure 2, and the related descriptions are listed in Table 1.

(1) Gulfport: It captures the airport area in Gulfport, MS, USA. The size is 100 × 100 and the spatial resolution is 3.4 m. The dataset has 191 spectral bands, and its wavelengths range from 550 nm to 1850 nm. In this scene, three airplanes are labeled as anomalies.

(2) Texas Coast: The size of the data is 100 × 100 × 207, and the spatial resolution is 17.2 m. The spectral coverage channels are from 450 nm to 1350 nm. The labeled anomalies are buildings.

(3) Los Angeles: The picture covers the city of Los Angeles and the buildings are labeled as anomaly objects. It has 100 × 100 pixels and the pixel resolution is 7.1 m. The band wavelength is 430–860 nm and the band number is 205.

(4) Salinas: The scene is from the Salinas Valley, CA, USA in 1998. The anomalies are synthetic targets. The 3D size of this data is 120 × 120 × 204 and its spatial resolution is 3.7 m.

(5) San Diego-1: This image was collected from San Diego, CA, USA, and the anomaly targets are three airplanes. The image has 189 bands, the wavelengths of which are from 370 nm to 2510 nm. The spatial size is 100 × 100 and its resolution is 3.5 m.

(6) San Diego-2: The characteristic parameter in this dataset is the same as in San Diego-1. The difference is in the location of anomalies. The size of the San Diego-2 dataset is 100 × 100. The color map and ground truth are shown in Figure 2.

4.1.2. Compared Methods

In order to illustrate the superiority of the proposed DGRAD-LRR algorithm, ten advanced methods were selected for comparison in this article: Reed-Xiaoli (RX), collaborative representation detector (CRD), non-negative constrained joint collaborative representation (NJCR), the fully convolutional detector Auto-AD, guided AE-based detector (GAED), low-rank and sparse representation (LRASR), graph and total variation regularized LRR (GTVLRR), low-rank and sparse decomposition with mixture of Gaussian (LSDM-MoG), anti-noise hierarchical mutual-incoherence-induced discriminative learning (AHMID), and generalized nonconvex low-rank tensor representation (GNLTR).

(1) RX [31]: the most classic HAD method is RX, and it is based on the idea of statistics. It adopts the mean vector and covariance matrix to denote the backgrounds of an HSI, and it leverages the Mahalanobis distance to evaluate the anomaly score of each pixel.

(2) CRD [57]: CRD is the most representative CR detector, and it leverages a distance-weighted regularization and the sum-to-one constraint to optimize the CR model.

(3) NJCR [75]: NJCR combines the nonnegative and sum-to-one constraint with the background-anomaly union dictionary to improve HSI representation.

(4) Auto-AD [76]: The framework adopts the fully convolutional layer with the skip connection, and a novel adaptive-weight loss function is designed to effectively optimize the Auto-AD.

(5) GAED [77]: It is a deep-learning-based method, and it introduces curvature filtering to mine the spectral similarity and guide the optimization of the autoencoder. The remaining methods are representation-based models.

(6) LRASR [62]: In the LRR model, the LRASR is the baseline due to its first utilization for HAD; it proposes that backgrounds can be denoted by a low-rank matrix and anomalies can be represented by sparse noises. It employs sparsity-inducing regularization to mine the local structure in an HSI.

(7) GTVLRR [28]: To improve the detection performance, the GTVLRR introduces graph and total variation regularization to effectively constrain the LRR model.

(8) LSDM-MoG [64]: LSDM-MoG integrates Gaussian mixture into LRR to deal with the problem of a single distribution assumption in the traditional approaches.

(9) AHMID [65]: AHMID utilizes the structure incoherent constraint and the first-order statistics constraint to address the mixed noise interference.

(10) GNLTR [66]: The GNLTR designs some nonconvex penalty functions to apply for tensor tubal rank and optimize backgrounds with low-rank characteristics.
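As a point of reference for the simplest baseline in the list above, a global RX detector can be sketched in a few lines of NumPy. This is an illustrative version under stated assumptions, not the implementation used in the experiments: the function name `rx_scores`, the use of a pseudo-inverse, and the synthetic data are all choices made for this example.

```python
import numpy as np

def rx_scores(X):
    """Global RX: squared Mahalanobis distance of each pixel to the scene mean.

    X has shape (bands, pixels), matching X in R^{b x n}.
    """
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean                          # centered spectra
    cov = Xc @ Xc.T / X.shape[1]           # (b, b) sample covariance
    cov_inv = np.linalg.pinv(cov)          # pseudo-inverse guards against singular cov
    # x_j^T C^{-1} x_j for every column j.
    return np.einsum("ij,ij->j", Xc, cov_inv @ Xc)

# Synthetic scene (hypothetical): 5 bands, 200 pixels, one implanted outlier.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 200))
X[:, 17] += 10.0
scores = rx_scores(X)
```

In this toy scene the implanted pixel receives the largest score, while the background pixels receive much smaller ones.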

4.1.3. Evaluation Metrics

To quantitatively measure the detection results, several evaluation indexes are utilized in this paper: the receiver operating characteristic (ROC) curve, the area under the curve (AUC), running time, and the separation map of backgrounds and anomalies. The ROC curve records the detection probability P_D and the false alarm probability P_F as the threshold τ takes different values, and the AUC value is the quantitative form of the ROC curve. The basic AUC scores are AUC_(D,τ), AUC_(F,τ) and AUC_(D,F); AUC_TD, AUC_ODP, and AUC_TD-BS are combinations of them. They can be denoted by

\begin{array}{l} {AUC}_{TD} = {AUC}_{(D, F)} + {AUC}_{(D, τ)} \\ {AUC}_{ODP} = {AUC}_{(D, τ)} + (1 - {AUC}_{(F, τ)}) \\ {AUC}_{TD-BS} = {AUC}_{(D, τ)} - {AUC}_{(F, τ)} \end{array} .

(40)

Among the AUC values, AUC_(D,τ), AUC_(D,F), and AUC_TD measure target detectability; AUC_(F,τ) evaluates background suppressibility; AUC_ODP and AUC_TD-BS reflect the performance of joint target detectability with background suppressibility. As regards the separation degree of backgrounds and anomalies, the separation map of backgrounds and anomalies is employed to analyze the distribution of backgrounds and anomalies. Also, running time reflects the efficiency of detecting anomalies.
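The AUC scores and the combinations in Equation (40) can be computed from a score map and a binary ground truth as in the sketch below. Several details are assumptions of this example rather than statements from the paper: scores are min-max normalized so that τ lies in [0, 1], the threshold grid has 1001 points, and the names `auc_metrics` and `_trapezoid` are hypothetical.

```python
import numpy as np

def _trapezoid(y, x):
    """Trapezoidal rule for samples y taken at non-decreasing x."""
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

def auc_metrics(scores, labels, num_thresholds=1001):
    """Basic AUC scores and the combinations of Equation (40)."""
    s = np.asarray(scores, dtype=float).ravel()
    y = np.asarray(labels).ravel().astype(bool)
    # Min-max normalization (assumes the scores are not all identical).
    s = (s - s.min()) / (s.max() - s.min())
    tau = np.linspace(0.0, 1.0, num_thresholds)
    p_d = np.array([(s[y] >= t).mean() for t in tau])    # detection probability
    p_f = np.array([(s[~y] >= t).mean() for t in tau])   # false alarm probability
    auc_d_tau = _trapezoid(p_d, tau)
    auc_f_tau = _trapezoid(p_f, tau)
    # P_F decreases as tau grows: reverse the curves and add the point (0, 0).
    pf_inc = np.concatenate(([0.0], p_f[::-1]))
    pd_inc = np.concatenate(([0.0], p_d[::-1]))
    auc_d_f = _trapezoid(pd_inc, pf_inc)
    return {
        "AUC_D_F": auc_d_f,
        "AUC_D_tau": auc_d_tau,
        "AUC_F_tau": auc_f_tau,
        "AUC_TD": auc_d_f + auc_d_tau,
        "AUC_ODP": auc_d_tau + (1.0 - auc_f_tau),
        "AUC_TD_BS": auc_d_tau - auc_f_tau,
    }

# Toy example (hypothetical): three background pixels and two anomalies.
m = auc_metrics([0.0, 0.1, 0.2, 0.9, 1.0], [0, 0, 0, 1, 1])
```

In the toy example the anomalies are perfectly separated from the background, so AUC_(D,F) reaches 1, while AUC_(D,τ) and AUC_(F,τ) still depend on how high the anomaly and background scores are.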

4.1.4. Implementation Details

All experiments in this paper were performed on a computer with an Intel i7-9700 CPU @ 3.00 GHz and 16 GB of RAM, and both the compared algorithms and the proposed algorithm were implemented in MATLAB R2017b on Windows 10. In the optimization stage, the maximum number of iterations is set to 400 to acquire a relatively ideal optimization result with the proposed method. The proposed method has four parameters, and their optimal settings are obtained by the parameter analyses.

4.2. Parameter Analysis

In the proposed DGRAD-LRR, four parameters need to be analyzed experimentally to obtain the optimal detection performance: the number of dimensions in the transformation space (r), the trade-off coefficient of F norm constraint terms (λ), the trade-off coefficient of the spatial graph regularization term (β), and the trade-off coefficient of the spectral graph regularization term (γ). The AUC score of (P_D, P_F) is utilized to measure the detection performance of different parameter settings in this experiment. The parameter r originates from the incidence matrix (

W \in R^{n \times r}

) and the representation matrix (

V \in R^{r \times n}

) during the adaptive dictionary construction process, and its configuration impacts the efficiency of eliminating redundant information and representing spectral information. For the LRR model, the trade-off coefficients of λ, β, and γ show different degrees of low-rank properties, local spatial structure, and local spectral structure in various hyperspectral images. Thus, achieving the optimal solution to the optimization problem necessitates a judicious balance among these three coefficients.

(1) Number of dimensions in the transformation space (r): To get the optimal setting, the value of r is selected from [1, 2, 3, 4, 5, 10, 20, 30, 40, 60]. Its influence on the detection rates in the six hyperspectral datasets is displayed in Figure 3a. We can observe that the change of AUC_(D,F) scores is significant in the Gulfport, San Diego-1 and San Diego-2 datasets. The AUC values in the Gulfport dataset first increase, then decrease, then increase again, and finally fluctuate within a small range. In the other two datasets, the detection rates first rise rapidly and then fluctuate in a small range; their highest values are at 3 and 4, respectively. As for the Texas Coast, Los Angeles, and Salinas datasets, their detection accuracy varies within a limited range; the variation in the Salinas dataset is relatively larger than in the remaining two datasets. From these results, we can see that the setting of r is vital for adaptive dictionary building.

(2) Trade-off coefficient of F norm constraint terms (λ): As shown in Figure 3b, the preset values of λ are [0.001, 0.01, 0.1, 1, 10, 100]. Except for the Salinas dataset, the influence of different settings on the detection performance is weak, because the AUC_(D,F) scores of each dataset vary only within a small range. In the Salinas dataset, however, the score gradually decreases in the latter part of the range.

(3) Trade-off coefficient of the spatial graph regularization term (β): Its impact is significant in all hyperspectral datasets except Los Angeles and San Diego-2. Specifically, in the Texas Coast and Salinas datasets, most of the curve remains almost unchanged, followed by a small portion that decreases; in the Gulfport and San Diego-1 datasets, the overall curve has a single peak, with peaks occurring at 0.2 and 1.

(4) Trade-off coefficient of the spectral graph regularization term (γ): Its settings are [0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1, 10, 100, 1000]. In four datasets (Texas Coast, Los Angeles, San Diego-1, and San Diego-2), the AUC scores are almost unchanged, fluctuating around 0.9994, 0.9971, 0.9933, and 0.9958, respectively. In the Gulfport dataset, we find that a very large weight coefficient reduces the detection ability for anomalies. For example, when γ is 1000, the AUC value decreases rapidly to 0.9928. In the Salinas dataset, the detection rate shows a unimodal change across the settings of γ.

According to the above experimental analyses, we can obtain the optimal settings of four parameters in the proposed DGRAD-LRR. The detailed settings are listed in Table 2.

4.3. Detection Performance

4.3.1. Qualitative Performance

In order to display the detection results of ten HAD approaches, detection maps are shown in Figure 4.

Firstly, in the datasets where airplanes are labeled as anomalies (Gulfport, San Diego-1, and San Diego-2), the detection maps of the RX, CRD, Auto-AD, GAED, and AHMID methods in Figure 4 appear dim overall, especially those of the RX and CRD detectors. This makes it difficult to distinguish the backgrounds from the anomalies. However, in the San Diego-1 dataset, this problem is slightly alleviated; the two airplanes in the bottom left corner can be captured by these algorithms, although the detection effect for the remaining airplane is limited. As for NJCR, LRASR, GTVLRR, and LSDM-MoG, although they can effectively highlight part of the anomalies, some parts of the anomalous airplanes are not detected. For example, in the Gulfport dataset, the relatively big airplane is detected well, whereas the detection results for the two small ones are unsatisfying. With GNLTR and DGRAD-LRR, all anomaly targets are captured and their edges are intact; DGRAD-LRR performs slightly better than GNLTR in the three datasets, but it suffers from more false positives.

Then, in the Texas Coast and Los Angeles datasets where the anomalies are buildings, ten HAD approaches achieve satisfying performance. In regard to the detection results for Texas Coast, RX, CRD, Auto-AD and LSDM-MoG can only detect the buildings on the left and cannot properly detect the buildings on the right; AHMID can capture part of the anomalies on the right; the other methods (NJCR, GAED, LRASR, GNLTR, and DGRAD-LRR) can highlight all buildings, with GNLTR and DGRAD-LRR giving outstanding detection performance. In the Los Angeles dataset, only LSDM-MoG, GNLTR, and DGRAD-LRR highlight most anomalies; among these three detectors, DGRAD-LRR has the best detection effect, followed by GNLTR.

In the Salinas dataset where the anomalies are simulated, the anomalies are clearly highlighted by the NJCR, GTVLRR, LSDM-MoG, GNLTR, and DGRAD-LRR methods. Among them, LSDM-MoG and the proposed DGRAD-LRR can capture more anomaly pixels. With regard to the false positive problem, LRASR produces the most false positives of the algorithms tested; NJCR, GTVLRR, GNLTR, and DGRAD-LRR also show obvious false positives; CRD, Auto-AD, GAED and AHMID can suppress the background well.

In addition, we find that RX, CRD, Auto-AD and AHMID have good performances in suppressing the backgrounds in all hyperspectral datasets. However, the highlighting of anomalies is limited to some extent.

4.3.2. Quantitative Performance

To quantitatively evaluate the detection results of ten approaches, four types of ROC curves and six kinds of AUC scores are utilized in this article.

(1) Comprehensive indicator performance: As shown in Figure 5, the 3D ROC curves of the proposed model are above the 3D curves of the compared methods in all HSI datasets. Meanwhile, the AUC_ODP and AUC_TD-BS scores are comprehensive evaluation indicators of joint target detectability with background suppressibility, and the scores of the two AUCs in our method are the highest in most datasets. This result illustrates that our method has a better overall performance compared to the state-of-the-art algorithms. Notably, among all LRR methods (LRASR, GTVLRR, AHMID, LSDM-MoG, GNLTR, and DGRAD-LRR), the proposed approach is the most outstanding, which suggests that the design of DGRAD-LRR is reasonable and effective. Concretely, the AUC_ODP value of DGRAD-LRR is the second highest in the San Diego-2 dataset and the highest in the other HSI datasets; as for the AUC_TD-BS scores, the proposed method achieves the best performance in the Gulfport, Texas Coast, Los Angeles, and San Diego-1 datasets, and its performance is suboptimal in the San Diego-2 dataset.

(2) Anomaly detectability: The 2D ROC_(D,F) curve, 2D ROC_(D,τ) curve, and three types of AUC values (AUC_(D,τ), AUC_(D,F), and AUC_TD) are employed to evaluate the target detectability of different detectors. In the 2D ROC_(D,F) curve results, the curve of the proposed method in each dataset is closer to the upper left corner, and the values of P_D in the proposed method are always higher than the values of P_D in the compared methods; correspondingly, the AUC_(D,F) scores of the DGRAD-LRR detector in the Gulfport, Texas Coast, Los Angeles, Salinas, San Diego-1 and San Diego-2 datasets are the highest among all algorithms, at 0.9960, 0.9994, 0.9971, 0.9962, 0.9935 and 0.9958, respectively (Table 3). With respect to the results of 2D ROC_(D,τ) curves and AUC_(D,τ), the proposed model is outstanding in all HSI datasets except San Diego-2. At the same time, our overall performance of AUC_TD is excellent in all hyperspectral datasets. In addition, we observe that the detection accuracy of the proposed method is the highest among all LRR methods; this is attributed to the dual graph regularization and adaptive dictionary improving the ability of the LRR model to capture anomalies.

(3) Background suppressibility: As shown in Figure 5, CRD, NJCR, Auto-AD and GAED exhibit good performance in the 2D ROC_(F,τ) curves, as their curves are located in the lower-left corner of the plots. Meanwhile, their AUC_(F,τ) values are relatively lower than those of the other algorithms. Among the LRR models, AHMID alleviates the problem of false positives to some extent because AHMID utilizes the structure incoherent constraint and the first-order statistics constraint to address the mixed noise interference. However, the other LRR methods suffer from different degrees of false positives. From the overall results of the ROC_(F,τ) curve and AUC_(F,τ) score, the false alarms of the proposed DGRAD-LRR are not very serious. Given its excellent performance in anomaly detection, it is acceptable for hyperspectral anomaly detection.

(4) Separation effect of backgrounds and anomalies: In Figure 6, a separation map is employed to analyze the distribution of the pixel values in the backgrounds and anomalies. The length of the red box indicates the degree to which anomalies are highlighted, namely the anomaly detectability, and the length of the blue box indicates the background suppressibility: the smaller the length and range of the blue box, the better the background suppressibility. The distance between the red box and the blue box denotes the separation effect of backgrounds and anomalies; the larger the distance, the better the separation. Regarding separation performance, the red and blue boxes overlap in the RX, CRD, LRASR, and GTVLRR detectors, with the overlap being largest in LRASR; GAED, AHMID, and LSDM-MoG slightly separate the backgrounds and the anomalies; the separation is obvious in the NJCR, GNLTR, and DGRAD-LRR methods, with DGRAD-LRR performing best and GNLTR second best. Meanwhile, the proposed framework exhibits the best performance among the LRR methods, which to some extent illustrates the effectiveness of the DGRAD-LRR design. As regards background suppressibility, CRD, Auto-AD, GAED, and AHMID achieve excellent performance, as the lengths and ranges of their blue boxes are very small; the effects of the other methods are satisfactory. In terms of anomaly detection, as displayed in Figure 6, our overall performance is slightly better than that of the compared algorithms. Concretely, the proposed model exhibits the highest median value of the red box in the Gulfport, Texas Coast, Los Angeles, and San Diego-1 datasets, and its median values are the second highest in the remaining two HSI datasets. All in all, the separation effect of backgrounds and anomalies is superior in the proposed method.

(5) Running time: To measure the detection efficiency of the different detectors, running time is employed in this paper. The running time results are listed in Table 4. We can observe that RX takes very little time for HAD in each HSI dataset and is the fastest detector of those tested; however, the detection accuracy of RX is not competitive. Among the deep learning methods, the average running times of Auto-AD and GAED are 21.532 s and 42.305 s, respectively; therefore, Auto-AD is faster than GAED. The time taken by LRASR is the second lowest of all methods and the lowest among the LRR methods (LRASR, GTVLRR, AHMID, LSDM-MoG, GNLTR, and the proposed method); this is because LRASR is one of the earliest LRR approaches and its design is simple. GTVLRR needs the most time for model optimization, with a running time of 206.186 s; it utilizes graph regularization and total variation regularization and requires more time to mine the local geometrical structure and spatial relationship. Similarly, the proposed method is also time-consuming due to the utilization of the spectral-spatial graph. Nevertheless, the adaptive dictionary technique of our method is simpler than the dictionary construction techniques of the compared methods, which reduces the time cost, and the running time of the proposed approach is less than that of GTVLRR. In addition, considering the outstanding performance in detection accuracy, the proposed model is deemed acceptable [78].

5. Discussion

In this section, an ablation study is performed to explain the effectiveness of each designed module in the proposed method. The components analyzed are the adaptive dictionary (AD) module and the dual graph regularization (DGR) module. The DGRAD-LRR design is based on the LRR model, and the baseline LRR model adopts a normal dictionary, the same as that used in LRASR. To measure the detection effect, the AUC_(D,F) score is utilized. The experimental results are shown in Figure 7. When the adaptive dictionary was added to the LRR model, a significant improvement in detection accuracy was achieved. To be specific, the AUC_(D,F) scores increased by at least 1% in each dataset and the average score over all datasets increased by 2.17%. This indicates that the adaptive dictionary construction is effective in covering more comprehensive information of surface features and improving the ability to detect anomalies. With dual graph regularization integrated into the LRR + AD approach, the AUC_(D,F) scores in the Gulfport, Texas Coast, Los Angeles, Salinas, San Diego-1, and San Diego-2 datasets improve by 0.83%, 0.14%, 0.17%, 1.52%, 0.37%, and 0.67%, respectively. The detection performance is thus further improved in each HSI dataset; the average detection accuracy increases by 0.61% and reaches 0.9964. Therefore, the dual graph regularization design is deemed reasonable and efficient for the optimization of the LRR model. Also, compared with the baseline (LRR), the performance of LRR + AD + DGR is significantly improved. The combination of AD and DGR is helpful for LRR to capture anomalies.

6. Conclusions

In this article, a novel LRR with dual graph regularization and adaptive dictionary is proposed for HAD. To be specific, spectral and spatial graphs are designed for HSI analysis; they preserve the local geometrical structure of the spectral-spatial information and effectively constrain the LRR model. An adaptive dictionary strategy is employed for background dictionary building, in which concept factorization linearly transforms the original hyperspectral image and generates a relatively robust background dictionary. To highlight the superiority of the proposed DGRAD-LRR, ten state-of-the-art HAD methods were employed for comparative experiments. In addition, we conducted component analysis experiments to illustrate the practicality of the proposed DGRAD-LRR.

Although the performance of the DGRAD-LRR method is advanced, its effect is limited in complex images. Since HAD is performed as an unsupervised task, LRR can serve as prior knowledge for a deep learning model. This approach promotes the interaction between the LRR and deep learning models to effectively capture anomalies. Specifically, LRR provides purer backgrounds and improves background reconstruction in the deep learning model, while the deep learning model offers more effective hyperspectral features and optimizes the separation of backgrounds and anomalies for the LRR model. Therefore, we plan to explore this strategy for HAD to improve detection accuracy in future works.

Author Contributions

X.C., R.M. and H.W. provided the methodology; X.C. wrote the original draft; X.C., S.L. and R.M. performed experiments; S.L., R.M., H.W. and M.Z. revised the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No.12003018), Fundamental Research Funds for the Central Universities, and the Innovation Fund of Xidian University.

Data Availability Statement

The hyperspectral data used in this paper are available at http://xudongkang.weebly.com/ (accessed on 11 January 2013).

Acknowledgments

The authors gratefully acknowledge the School of Aerospace Science and Technology, Xidian University.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Tolie, H.F.; Ren, J.; Elyan, E. DICAM: Deep inception and channel-wise attention modules for underwater image enhancement. Neurocomputing 2024, 584, 127585.
Zhao, S.; Lin, S.; Cheng, X.; Zhou, K.; Zhang, M.; Wang, H. Dual-GAN complementary learning for real-world image denoising. IEEE Sens. J. 2024, 24, 355–366.
Egmont-Petersen, M.; de Ridder, D.; Handels, H. Image processing with neural networks—A review. Pattern Recognit. 2002, 35, 2279–2301.
Chen, Y.; Tang, Y.; Xiao, Y.; Yuan, Q.; Zhang, Y.; Liu, F.; He, J.; Zhang, L. Satellite video single object tracking: A systematic review and an oriented object tracking benchmark. ISPRS J. Photogramm. Remote Sens. 2024, 210, 212–240.
Li, J.; Zheng, K.; Yao, J.; Gao, L.; Hong, D. Deep unsupervised blind hyperspectral and multispectral data fusion. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6007305.
Tang, S.; Zhang, X.; He, Z.; Chen, Z.; Du, W.; Li, Y.; Zhang, J.; Guo, P.; Zhang, L.; So, H.C. Practical issue analyses and imaging approach for hypersonic vehicle-borne SAR with near-vertical diving trajectory. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5204316.
Chakrabarti, A.; Zickler, T. Statistics of real-world hyperspectral images. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, USA, 20–25 June 2011; pp. 193–200.
Huo, Y.; Cheng, X.; Lin, S.; Zhang, M.; Wang, H. Memory-augmented autoencoder with adaptive reconstruction and sample attribution mining for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2024, 1–19.
Li, J.; Zheng, K.; Liu, W.; Li, Z.; Yu, H.; Ni, L. Model-guided coarse-to-fine fusion network for unsupervised hyperspectral image super-resolution. IEEE Geosci. Remote Sens. Lett. 2023, 20, 5508605.
Li, J.; Zheng, K.; Gao, L.; Ni, L.; Huang, M.; Chanussot, J. Model-informed multistage unsupervised network for hyperspectral image super-resolution. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5516117.
Cheng, X.; Yu, H.; Lin, S.; Dong, Y.; Zhao, S.; Zhang, M.; Wang, H. Deep feature aggregation network for hyperspectral anomaly detection. IEEE Trans. Instrum. Meas. 2024.
Yan, Y.; Ren, J.; Sun, H.; Williams, R. Nondestructive quantitative measurement for precision quality control in additive manufacturing using hyperspectral imagery and machine learning. IEEE Trans. Ind. Informat. 2024.
Li, Y.; Ren, J.; Yan, Y.; Liu, Q.; Ma, P.; Petrovski, A.; Sun, H. CBANet: An end-to-end cross band 2-D attention network for hyperspectral change detection in remote sensing. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5513011.
Cui, X.; Zheng, K.; Gao, L.; Zhang, B.; Yang, D.; Ren, J. Multiscale spatial-spectral convolutional network with image-based framework for hyperspectral imagery classification. Remote Sens. 2019, 11, 2220.
Zhao, C.; Qin, B.; Feng, S.; Zhu, W.; Zhang, L.; Ren, J. An unsupervised domain adaptation method towards multi-level features and decision boundaries for cross-scene hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5546216.
Cheng, X.; Zhang, M.; Lin, S.; Zhou, K.; Wang, L.; Wang, H. Multiscale superpixel guided discriminative forest for hyperspectral anomaly detection. Remote Sens. 2022, 14, 4828.
Wang, D.; Zhuang, L.; Gao, L.; Sun, X.; Huang, M.; Plaza, A.J. PDBSNet: Pixel-shuffle downsampling blind-spot reconstruction network for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5511914.
Wang, D.; Zhuang, L.; Gao, L.; Sun, X.; Zhao, X.; Plaza, A. Sliding dual-window-inspired reconstruction network for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5504115.
Li, J.; Hong, D.; Gao, L.; Yao, J.; Zheng, K.; Zhang, B.; Chanussot, J. Deep learning in multimodal remote sensing data fusion: A comprehensive review. Int. J. Appl. Earth Obs. 2022, 112, 102926.
Gao, L.; Li, J.; Zheng, K.; Jia, X. Enhanced autoencoders with attention-embedded degradation learning for unsupervised hyperspectral image super-resolution. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5509417.
Sun, W.; Du, Q. Hyperspectral band selection: A review. IEEE Geosci. Remote Sens. Mag. 2019, 7, 118–139.
Luo, F.; Zhou, T.; Liu, J.; Guo, T.; Gong, X.; Ren, J. Multiscale diff-changed feature fusion network for hyperspectral image change detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5502713.
Ma, P.; Ren, J.; Sun, G.; Zhao, H.; Jia, X.; Yan, Y.; Zabalza, J. Multiscale superpixelwise prophet model for noise-robust feature extraction in hyperspectral images. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5508912.
Chen, Y.; Yuan, Q.; Tang, Y.; Xiao, Y.; He, J.; Zhang, L. SPIRIT: Spectral awareness interaction network with dynamic template for hyperspectral object tracking. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5503116.
Li, J.; Zheng, K.; Li, Z.; Gao, L.; Jia, X. X-shaped interactive autoencoders with cross-modality mutual learning for unsupervised hyperspectral image super-resolution. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5518317.
Xu, Y.; Zhang, L.; Du, B.; Zhang, L. Hyperspectral anomaly detection based on machine learning: An overview. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2022, 15, 3351–3364.
Su, H.; Wu, Z.; Zhang, H.; Du, Q. Hyperspectral anomaly detection: A survey. IEEE Geosci. Remote Sens. Mag. 2022, 10, 64–90.
Cheng, T.; Wang, B. Graph and total variation regularized low-rank representation for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2020, 58, 391–406.
Reed, I.S.; Yu, X. Adaptive multiple-band CFAR detection of an optical pattern with unknown spectral distribution. IEEE Trans. Acoust. Speech Signal Process. 1990, 38, 1760–1770.
Heesung, K.; Nasrabadi, N.M. Kernel RX-algorithm: A nonlinear anomaly detector for hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2005, 43, 388–397. [Google Scholar] [CrossRef]
Guo, Q.; Zhang, B.; Ran, Q.; Gao, L.; Li, J.; Plaza, A. Weighted-RXD and linear filter-based RXD: Improving background statistics estimation for anomaly detection in hyperspectral imagery. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2014, 7, 2351–2366. [Google Scholar] [CrossRef]
Huo, Y.; Qian, X.; Li, C.; Wang, W. Multiple instances complementary detection and difficulty evaluation for weakly supervised object detection in remote sensing images. IEEE Geosci. Remote Sens. Lett. 2023, 20, 6006505. [Google Scholar] [CrossRef]
Qian, X.; Cheng, X.; Cheng, G.; Yao, X.; Jiang, L. Two-stream encoder GAN with progressive training for co-saliency detection. IEEE Signal Process. Lett. 2021, 28, 180–184. [Google Scholar] [CrossRef]
Cheng, X.; Zhang, M.; Lin, S.; Zhou, K.; Zhao, S.; Wang, H. Two-stream isolation forest based on deep features for hyperspectral anomaly detection. IEEE Geosci. Remote Sens. Lett. 2023, 20, 5504205. [Google Scholar] [CrossRef]
Lu, X.; Zhang, W.; Huang, J. Exploiting embedding manifold of autoencoders for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2020, 58, 1527–1537. [Google Scholar] [CrossRef]
Wang, J.; Ouyang, T.; Duan, Y.; Cui, L. SAOCNN: Self-Attention and One-Class Neural Networks for Hyperspectral Anomaly Detection. Remote Sens. 2022, 14, 5555. [Google Scholar] [CrossRef]
Li, K.; Ling, Q.; Qin, Y.; Wang, Y.; Cai, Y.; Lin, Z.; An, W. Spectral-spatial deep support vector data description for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5522316. [Google Scholar] [CrossRef]
Zhu, D.; Du, B.; Dong, Y.; Zhang, L. Spatial-spectral joint reconstruction with interband correlation for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5529513. [Google Scholar] [CrossRef]
Lv, S.; Zhao, S.; Li, D.; Pang, B.; Lian, X.; Liu, Y. Spatial–spectral joint hyperspectral anomaly detection based on a two-branch 3D convolutional autoencoder and spatial filtering. Remote Sens. 2023, 15, 2542. [Google Scholar] [CrossRef]
Wang, S.; Wang, X.; Zhang, L.; Zhong, Y. Deep low-rank prior for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5527017. [Google Scholar] [CrossRef]
Wang, S.; Wang, X.; Zhong, Y.; Zhang, L. Hyperspectral anomaly detection via locally enhanced low-rank prior. IEEE Trans. Geosci. Remote Sens. 2020, 58, 6995–7009. [Google Scholar] [CrossRef]
Cheng, X.; Zhang, M.; Lin, S.; Li, Y.; Wang, H. Deep self-representation learning framework for hyperspectral anomaly detection. IEEE Trans. Instrum. Meas. 2024, 73, 5002016. [Google Scholar] [CrossRef]
Lin, S.; Zhang, M.; Cheng, X.; Shi, L.; Gamba, P.; Wang, H. Dynamic low-rank and sparse priors constrained deep autoencoders for hyperspectral anomaly detection. IEEE Trans. Instrum. Meas. 2024, 73, 2500518. [Google Scholar] [CrossRef]
Zhang, J.; Xiang, P.; Teng, X.; Zhao, D.; Li, H.; Song, J.; Zhou, H.; Tan, W. Enhancing hyperspectral anomaly detection with a novel differential network approach for precision and robust background suppression. Remote Sens. 2024, 16, 434. [Google Scholar] [CrossRef]
Jiang, K.; Xie, W.; Li, Y.; Lei, J.; He, G.; Du, Q. Semisupervised spectral learning with generative adversarial network for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2020, 58, 5224–5236. [Google Scholar] [CrossRef]
Xie, W.; Liu, B.; Li, Y.; Lei, J.; Chang, C.; He, G. Spectral adversarial feature learning for anomaly detection in hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2020, 58, 2352–2365. [Google Scholar] [CrossRef]
Tao, J.; Weiying, X.; Yunsong, L.; Jie, L.; Qian, D. Weakly supervised discriminative learning with spectral constrained generative adversarial network for hyperspectral anomaly detection. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6504–6517. [Google Scholar]
Wang, D.; Gao, L.; Qu, Y.; Sun, X.; Liao, W. Frequency-to-spectrum mapping GAN for semisupervised hyperspectral anomaly detection. CAAI Trans. Intell. Technol. 2023, 8, 1258–1273. [Google Scholar] [CrossRef]
Zhao, R.; Yang, Z.; Meng, X.; Shao, F. A novel fully convolutional auto-encoder based on dual clustering and latent feature adversarial consistency for hyperspectral anomaly detection. Remote Sens. 2024, 16, 717. [Google Scholar] [CrossRef]
Wang, J.; Guo, S.; Hua, Z.; Huang, R.; Hu, J.; Gong, M. CL-CaGAN: Capsule differential adversarial continual learning for cross-domain hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5517315. [Google Scholar] [CrossRef]
Xiang, P.; Ali, S.; Zhang, J.; Jung, S.K.; Zhou, H. Pixel-associated autoencoder for hyperspectral anomaly detection. Int. J. Appl. Earth Obs. 2024, 129, 103816. [Google Scholar] [CrossRef]
Yuan, Z.; Sun, H.; Ji, K.; Li, Z.; Zou, H. Local sparsity divergence for hyperspectral anomaly detection. IEEE Geosci. Remote Sens. Lett. 2014, 11, 1697–1701. [Google Scholar] [CrossRef]
Li, J.; Zhang, H.; Zhang, L.; Ma, L. Hyperspectral anomaly detection by the use of background joint sparse representation. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2015, 8, 2523–2533. [Google Scholar] [CrossRef]
Ling, Q.; Guo, Y.; Lin, Z.; An, W. A constrained sparse representation model for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2018, 57, 2358–2371. [Google Scholar] [CrossRef]
Zhu, L.; Wen, G. Hyperspectral anomaly detection via background estimation and adaptive weighted sparse representation. Remote Sens. 2018, 10, 272. [Google Scholar] [CrossRef]
Lin, S.; Zhang, M.; Cheng, X.; Zhao, S.; Shi, L.; Wang, H. Hyperspectral anomaly detection using spatial–spectral-based union dictionary and improved saliency weight. Remote Sens. 2023, 15, 3609. [Google Scholar] [CrossRef]
Li, W.; Du, Q. Collaborative representation for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2015, 53, 1463–1474. [Google Scholar] [CrossRef]
Su, H.; Wu, Z.; Du, Q.; Du, P. Hyperspectral Anomaly Detection Using Collaborative Representation With Outlier Removal. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2018, 11, 5029–5038. [Google Scholar] [CrossRef]
Zhao, C.; Li, C.; Feng, S. A spectral–spatial method based on fractional fourier transform and collaborative representation for hyperspectral anomaly detection. IEEE Geosci. Remote Sens. Lett. 2021, 18, 1259–1263. [Google Scholar] [CrossRef]
Lin, S.; Zhang, M.; Cheng, X.; Zhou, K.; Zhao, S.; Wang, H. Hyperspectral Anomaly Detection via Sparse Representation and Collaborative Representation. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2022, 16, 946–961. [Google Scholar] [CrossRef]
Yang, Y.; Su, H.; Wu, Z.; Du, Q. Saliency-guided collaborative-competitive representation for hyperspectral anomaly detection. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2023, 16, 6843–6859. [Google Scholar] [CrossRef]
Xu, Y.; Wu, Z.; Li, J.; Plaza, A.; Wei, Z. Anomaly detection in hyperspectral images based on low-rank and sparse representation. IEEE Trans. Geosci. Remote Sens. 2016, 54, 1990–2000. [Google Scholar] [CrossRef]
Huyan, N.; Zhang, X.; Zhou, H.; Jiao, L. Hyperspectral anomaly detection via background and potential anomaly dictionaries construction. IEEE Trans. Geosci. Remote Sens. 2019, 57, 2263–2276. [Google Scholar] [CrossRef]
Lu, L.; Wei, L.; Qian, D.; Ran, T. Low-rank and sparse decomposition with mixture of gaussian for hyperspectral anomaly detection. IEEE Trans. Cybern. 2020, 51, 4363–4372. [Google Scholar]
Guo, T.; He, L.; Luo, F.; Gong, X.; Li, Y.; Zhang, L. Anomaly detection of hyperspectral image with hierarchical anti-noise mutual-incoherence-induced low-rank representation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5510213. [Google Scholar]
Qin, H.; Shen, Q.; Zeng, H.; Chen, Y.; Lu, G. Generalized nonconvex low-rank tensor representation for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5526612. [Google Scholar] [CrossRef]
Lin, S.; Zhang, M.; Cheng, X.; Zhao, S.; Wang, H. Dual collaborative constraints regularized low rank and sparse representation via robust dictionaries construction for hyperspectral anomaly detection. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2022, 16, 2009–2024. [Google Scholar] [CrossRef]
Lin, S.; Zhang, M.; Cheng, X.; Wang, L.; Xu, M.; Wang, H. Hyperspectral anomaly detection via dual dictionaries construction guided by two-stage complementary decision. Remote Sens. 2022, 14, 1784. [Google Scholar] [CrossRef]
Sun, S.; Liu, J.; Zhang, Z.; Li, W. Hyperspectral anomaly setection based on adaptive low-rank transformed tensor. IEEE Trans. Neural Netw. Learn. Syst. 2023, 1–13. [Google Scholar] [CrossRef] [PubMed]
Zhao, Y.-P.; Li, H.; Chen, Y.; Wang, Z.; Li, X. Hyperspectral anomaly detection via structured sparsity plus enhanced low-rankness. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5515115. [Google Scholar] [CrossRef]
Cai, D.; He, X.; Han, J. Locally consistent concept factorization for document clustering. IEEE Trans. Knowl. Data Eng. 2010, 23, 902–913. [Google Scholar] [CrossRef]
Candes, E.J.; Plan, Y. Matrix completion with noise. Proc. IEEE. 2010, 98, 925–936. [Google Scholar] [CrossRef]
Chen, J.; Mao, H.; Wang, Z.; Zhang, X. Low-rank representation with adaptive dictionary learning for subspace clustering. Knowl. Based Syst. 2021, 223, 107053. [Google Scholar] [CrossRef]
Wang, X.; Wang, L.; Wang, J.; Sun, K.; Wang, Q. Reweighted nuclear norm and total variation regularization with sparse dictionary construction for hyperspectral anomaly detection. IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. 2022, 15, 1775–1790. [Google Scholar] [CrossRef]
Chang, S.; Ghamisi, P. Nonnegative-constrained joint collaborative representation with union dictionary for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5534913. [Google Scholar] [CrossRef]
Wang, S.; Wang, X.; Zhang, L.; Zhong, Y. Auto-AD: Autonomous hyperspectral anomaly detection network based on fully convolutional autoencoder. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5503314. [Google Scholar] [CrossRef]
Xiang, P.; Ali, S.; Jung, S.K.; Zhou, H. Hyperspectral anomaly detection with guided autoencoder. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5538818. [Google Scholar] [CrossRef]
Gao, L.; Wang, D.; Zhuang, L.; Sun, X.; Huang, M.; Plaza, A. BS³LNet: A new blind-spot self-supervised learning network for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5504218. [Google Scholar] [CrossRef]

Figure 1. Flowchart of DGRAD-LRR.

Figure 2. Pseudo-color images and ground-truth maps of the six hyperspectral datasets. (Ⅰ) pseudo-color image; (Ⅱ) ground truth.

Figure 3. Parameter analyses in the DGRAD-LRR model. (a) The number of dimensions in the transformation space; (b) the trade-off coefficient of F norm constraint terms; (c) the trade-off coefficient of the spatial graph regularization term; (d) the trade-off coefficient of the spectral graph regularization term.

Figure 4. Detection maps of the eleven HAD methods on the six hyperspectral datasets. (a–f) represent Gulfport, Texas Coast, Los Angeles, Salinas, San Diego-1, and San Diego-2, respectively. GT—ground truth.

Figure 5. ROC curves of eleven HAD algorithms in the six hyperspectral datasets.

Figure 6. Separation maps of backgrounds and anomalies for the eleven HAD approaches. a: RX; b: CRD; c: NJCR; d: Auto-AD; e: GAED; f: LRASR; g: GTVLRR; h: AHMID; i: LSDM-MoG; j: GNLTR; k: Ours. (Ⅰ–Ⅵ) denote the Gulfport, Texas Coast, Los Angeles, Salinas, San Diego-1, and San Diego-2 datasets, respectively.

Figure 7. Component analysis of the proposed DGRAD-LRR. LRR denotes the original LRR model, which serves as the baseline of DGRAD-LRR and uses a conventional dictionary, the same as in the LRASR algorithm; AD denotes the adaptive dictionary strategy; DGR denotes the dual graph regularization.

Table 1. Detailed description of six hyperspectral datasets.

Dataset	Spatial Size	Spectral Bands	Spatial Resolution	Wavelength	Anomaly Type
Gulfport	100 × 100	191	3.4 m	400–2500 nm	Airplanes
Texas Coast	100 × 100	207	17.2 m	450–1350 nm	Buildings
Los Angeles	100 × 100	205	7.1 m	430–860 nm	Buildings
Salinas	120 × 120	204	3.7 m	not reported	Simulated Targets
San Diego-1	100 × 100	189	3.5 m	370–2510 nm	Airplanes
San Diego-2	120 × 120	189	3.5 m	370–2510 nm	Airplanes

Table 2. Detailed parameter settings in the proposed DGRAD-LRR.

Dataset	Gulfport	Texas Coast	Los Angeles	Salinas	San Diego-1	San Diego-2
r	40	3	2	40	3	4
λ	0.1	0.01	30	0.1	10	1
β	0.2	0.5	10	0.001	1	0.1
γ	0.7	0.1	0.1	100	100	0.3

Table 3. AUC scores of the eleven HAD algorithms (ten compared methods and the proposed method, listed as Ours) on the six hyperspectral datasets. In the original article, red marks the best performance and green marks the second-best performance. “↑” indicates that a larger value means better detection performance; “↓” indicates that a smaller value is better.

Dataset	Metrics	RX	CRD	NJCR	Auto-AD	GAED	LRASR	GTVLRR	AHMID	LSDM-MoG	GNLTR	Ours
Gulfport	AUC_(D,F)↑	0.9526	0.8921	0.9760	0.9938	0.9830	0.8377	0.9647	0.9827	0.9435	0.9839	0.9960
	AUC_(D,τ)↑	0.0736	0.0231	0.5034	0.4025	0.2719	0.4603	0.4061	0.1695	0.3115	0.4175	0.8184
	AUC_(F,τ)↓	0.0248	0.0108	0.0984	0.0141	0.0342	0.2878	0.1151	0.0208	0.1021	0.0919	0.2110
	AUC_TD↑	1.0262	0.9152	1.4794	1.3962	1.2549	1.2980	1.3709	1.1522	1.2550	1.4014	1.8144
	AUC_ODP↑	1.0015	0.9045	1.3810	1.3821	1.2207	1.0102	1.2558	1.1314	1.1529	1.3094	1.6034
	AUC_TD-BS↑	0.0489	0.0124	0.4050	0.3884	0.2377	0.1725	0.2911	0.1487	0.2094	0.3256	0.6074
Texas Coast	AUC_(D,F)↑	0.9946	0.9431	0.9950	0.9941	0.9945	0.7569	0.9105	0.9956	0.9786	0.9953	0.9994
	AUC_(D,τ)↑	0.1178	0.0627	0.2517	0.0798	0.1026	0.1662	0.1820	0.1299	0.1014	0.2557	0.3548
	AUC_(F,τ)↓	0.0135	0.0044	0.0252	0.0008	0.0047	0.0803	0.0465	0.0265	0.0197	0.0462	0.0862
	AUC_TD↑	1.1124	1.0057	1.2467	1.0739	1.0971	0.9231	1.0924	1.1255	1.0800	1.2510	1.3541
	AUC_ODP↑	1.0989	1.0013	1.2215	1.0731	1.0924	0.8428	1.0459	1.0990	1.0603	1.2049	1.2680
	AUC_TD-BS↑	0.1043	0.0582	0.2265	0.0790	0.0978	0.0858	0.1355	0.1033	0.0817	0.2096	0.2686
Los Angeles	AUC_(D,F)↑	0.9887	0.9794	0.9591	0.9954	0.9852	0.8551	0.9526	0.9943	0.9745	0.9884	0.9971
	AUC_(D,τ)↑	0.0891	0.0366	0.1077	0.0736	0.0306	0.0601	0.0933	0.0103	0.1362	0.1187	0.2574
	AUC_(F,τ)↓	0.0114	0.0018	0.0111	0.0044	0.0007	0.0172	0.0154	0.0002	0.0331	0.0173	0.0559
	AUC_TD↑	1.0778	1.0161	1.0668	1.0646	1.0158	0.9152	1.0459	1.0046	1.1107	1.1071	1.2544
	AUC_ODP↑	1.0664	1.0143	1.0556	1.0689	1.0150	0.8980	1.0305	1.0044	1.0776	1.0898	1.1986
	AUC_TD-BS↑	0.0777	0.0349	0.0965	0.0692	0.0299	0.0429	0.0779	0.0100	0.1031	0.1014	0.2015
Salinas	AUC_(D,F)↑	0.8073	0.9635	0.9888	0.9925	0.9528	0.8137	0.9755	0.9641	0.9938	0.9683	0.9962
	AUC_(D,τ)↑	0.2143	0.3012	0.5373	0.4096	0.2318	0.4975	0.5741	0.3749	0.5274	0.5620	0.5778
	AUC_(F,τ)↓	0.0314	0.0069	0.0353	0.0026	0.0023	0.1263	0.0482	0.0083	0.0168	0.0555	0.0952
	AUC_TD↑	1.0216	1.2647	1.5261	1.4021	1.1846	1.3112	1.5496	1.3390	1.5212	1.5304	1.5740
	AUC_ODP↑	0.9903	1.2577	1.4908	1.3995	1.1823	1.1849	1.5014	1.3306	1.5044	1.4748	1.4788
	AUC_TD-BS↑	0.1829	0.2942	0.5020	0.4070	0.2295	0.3712	0.5259	0.3665	0.5106	0.5065	0.4826
San Diego-1	AUC_(D,F)↑	0.9403	0.9412	0.9736	0.9856	0.9907	0.8940	0.9360	0.9732	0.9388	0.9764	0.9935
	AUC_(D,τ)↑	0.1778	0.0911	0.3807	0.0653	0.2337	0.3681	0.3354	0.1930	0.1744	0.3425	0.4837
	AUC_(F,τ)↓	0.0589	0.0214	0.0692	0.0038	0.0079	0.1199	0.0738	0.0094	0.0515	0.0592	0.0980
	AUC_TD↑	1.1181	1.0323	1.3542	1.0509	1.2244	1.2621	1.2714	1.1662	1.1132	1.3189	1.4772
	AUC_ODP↑	1.0592	1.0110	1.2850	1.0471	1.2165	1.1422	1.1976	1.1569	1.0617	1.2597	1.3792
	AUC_TD-BS↑	0.1189	0.0698	0.3114	0.0615	0.2258	0.2482	0.2616	0.1837	0.1229	0.2832	0.3857
San Diego-2	AUC_(D,F)↑	0.9111	0.9791	0.9568	0.9797	0.9871	0.9610	0.9858	0.9878	0.9320	0.9938	0.9958
	AUC_(D,τ)↑	0.0791	0.1375	0.2560	0.0469	0.1292	0.2836	0.2618	0.1073	0.1619	0.4301	0.3131
	AUC_(F,τ)↓	0.0406	0.0360	0.1021	0.0068	0.0095	0.0975	0.0402	0.0081	0.0882	0.1012	0.0387
	AUC_TD↑	0.9902	1.1166	1.2128	1.0266	1.1162	1.2446	1.2476	1.0951	1.0939	1.4240	1.3088
	AUC_ODP↑	0.9496	1.0806	1.1107	1.0198	1.1068	1.1471	1.2074	1.0870	1.0057	1.3228	1.2701
	AUC_TD-BS↑	0.0385	0.1015	0.1539	0.0401	0.1197	0.1861	0.2215	0.0992	0.0737	0.3290	0.2744
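
The three composite scores in Table 3 appear to be simple combinations of the three basic areas. The tabulated values are consistent, up to rounding in the last digit, with AUC_TD = AUC_(D,F) + AUC_(D,τ), AUC_ODP = AUC_(D,F) + AUC_(D,τ) − AUC_(F,τ), and AUC_TD-BS = AUC_(D,τ) − AUC_(F,τ). The sketch below checks this for two Gulfport columns. These formulas are inferred from the numbers in the table rather than quoted from the article's metric definitions, and the names `composite_auc` and `max_deviation` are illustrative.

```python
def composite_auc(auc_df, auc_dt, auc_ft):
    """Combine the three basic areas into the composite scores.

    The formulas are an assumption inferred from the values in Table 3.
    """
    return {
        "TD": auc_df + auc_dt,            # target detectability
        "ODP": auc_df + auc_dt - auc_ft,  # overall detection score
        "TD-BS": auc_dt - auc_ft,         # detectability minus background term
    }


# Gulfport values copied from Table 3:
# (AUC_(D,F), AUC_(D,tau), AUC_(F,tau)) followed by the tabulated composites.
rows = {
    "RX": ((0.9526, 0.0736, 0.0248),
           {"TD": 1.0262, "ODP": 1.0015, "TD-BS": 0.0489}),
    "Ours": ((0.9960, 0.8184, 0.2110),
             {"TD": 1.8144, "ODP": 1.6034, "TD-BS": 0.6074}),
}


def max_deviation(table_rows):
    """Largest absolute gap between recomputed and tabulated composites."""
    worst = 0.0
    for basic, tabulated in table_rows.values():
        recomputed = composite_auc(*basic)
        for name, value in tabulated.items():
            worst = max(worst, abs(recomputed[name] - value))
    return worst


if __name__ == "__main__":
    print(f"largest deviation: {max_deviation(rows):.4f}")
```

A deviation of about one unit in the fourth decimal place is what one would expect if the composites were computed from unrounded basic areas and rounded afterwards.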

Table 4. Running time of the eleven HAD algorithms on the six hyperspectral datasets.

Dataset	RX	CRD	NJCR	Auto-AD	GAED	LRASR	GTVLRR	AHMID	LSDM-MoG	GNLTR	Ours
Gulfport	0.283	7.218	30.820	36.386	36.104	1.215	227.397	19.583	11.009	3.467	86.197
Texas Coast	0.311	3.683	31.309	22.921	37.430	1.323	212.385	21.668	17.040	3.302	12.803
Los Angeles	0.361	5.624	17.736	16.213	37.241	1.480	156.376	25.378	18.136	3.498	41.781
Salinas	0.528	6.382	43.480	21.471	56.359	1.992	214.851	28.854	12.424	4.515	32.592
San Diego-1	0.305	10.579	21.905	15.896	35.778	1.580	206.854	19.560	4.697	3.319	13.101
San Diego-2	0.531	15.137	36.115	16.307	50.915	2.415	219.254	31.136	8.453	4.816	89.660
Average	0.387	8.104	30.228	21.532	42.305	1.668	206.186	24.363	11.960	3.820	46.022
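
The Average row of Table 4 can be reproduced from the per-dataset rows as a plain arithmetic mean over the six datasets. The following minimal sketch recomputes it for four of the columns. The values are copied from the table, the time unit is left as given there, and the helper name `column_averages` is illustrative.

```python
from statistics import mean

# Running times copied from Table 4, ordered as Gulfport, Texas Coast,
# Los Angeles, Salinas, San Diego-1, San Diego-2.
running_time = {
    "RX": [0.283, 0.311, 0.361, 0.528, 0.305, 0.531],
    "LRASR": [1.215, 1.323, 1.480, 1.992, 1.580, 2.415],
    "GNLTR": [3.467, 3.302, 3.498, 4.515, 3.319, 4.816],
    "Ours": [86.197, 12.803, 41.781, 32.592, 13.101, 89.660],
}

# Average row as printed in Table 4.
reported_average = {"RX": 0.387, "LRASR": 1.668, "GNLTR": 3.820, "Ours": 46.022}


def column_averages(times):
    """Arithmetic mean of each method's running time over the datasets."""
    return {method: mean(values) for method, values in times.items()}


if __name__ == "__main__":
    for method, avg in column_averages(running_time).items():
        print(f"{method}: recomputed {avg:.3f}, reported {reported_average[method]:.3f}")
```

The recomputed means agree with the printed Average row to within rounding of the third decimal place.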


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

