1. Introduction
As opposed to visible and multispectral images, hyperspectral images (HSIs) have been regarded as a remarkable invention in the field of remote sensing imaging sciences, due to their practical capacity to capture high-dimensional spectral information from different scenes on the Earth’s surface [
1]. HSIs consist of numerous contiguous spectral bands that span the electromagnetic spectrum, providing rich and detailed physical attributes of land covers, which facilitates the development of various applications such as change detection [
2,
3,
4,
5], land-cover classification [
6,
7,
8], retrieval [
9], scene classification [
10,
11] and anomaly detection [
12,
13].
Anomaly detection aims to find abnormal patterns whose distribution is inconsistent with most instances in data [
14,
15]. As one significant branch of hyperspectral remote sensing target detection, hyperspectral anomaly detection (HAD) involves the unsupervised identification of targets (e.g., plastic plates in a field or military camouflage targets [
16,
17,
18]) that exhibit spatial or spectral dissimilarities from their surrounding background without relying on prior information in practical situations [
19,
20]. Therefore, in essence, HAD is a binary classification with training in an unsupervised manner. Although the precise definition of an anomaly has yet to be established [
21,
25], it is generally accepted that anomalies in HAD typically exhibit the following characteristics: (1) anomalies constitute a very small proportion of the entire hyperspectral image; (2) anomalies can be distinguished from the background in terms of spectral or spatial characteristics; (3) there is a lack of spatial and spectral prior information about the anomalies or the background; (4) in real-world situations, spectral mixing of anomalies and background may occur, so that anomalies appear as pixels or subpixels. These attributes render hyperspectral anomaly detection a prominent yet challenging topic in the field of remote sensing [
23,
24,
25]. Over recent decades, numerous HAD algorithms have been proposed, which can be broadly classified into two categories based on their motivation and theoretical basis: traditional and deep-learning-based methods.
Traditional HAD methods have two important branches: statistics- and representation-based methods. Statistics-based HAD methods generally assume that the background of an HSI obeys a known distribution, such as the multivariate Gaussian distribution. Anomalous objects that are inconsistent with this distribution can then be identified based on the Mahalanobis distance or the Euclidean distance. Among these statistics-based HAD methods, the Reed-Xiaoli (RX) [26] detector, one of the most well-known benchmarks, estimates the background statistics from the entire image under a multivariate Gaussian assumption, and pixels that deviate from this distribution are identified as anomalies. Because RX neglects local context, local variants of RX (LRX) [
27,
28] evaluate each test pixel against local background statistics estimated from a small neighborhood. However, both versions of RX are limited by the fact that real-world scenes may follow complex, high-order distributions. Thus, some nonlinear variants of the RX detector, kernel RX (KRX) [
29,
30,
31], were presented, which nonlinearly map the entire image into a high-dimensional feature space through different kernel functions. Since most RX-based methods ignore spatial information, He et al. [
32] developed RX with extended multi-attribute profiles in a recursive manner. In addition to the RX and its variants, several effective methods based on statistics have also been developed. As a unified approach to object and change detection, the Cluster-Based Anomaly Detection (CBAD) [
33] calculates background statistics across clusters rather than sliding windows, enabling the detection of objects with varying sizes and shapes. Drawing on statistical learning, the kernel isolation forest detector (KIFD) [34] and its improved variant [35] have also been applied to HAD; both assume that anomalies are more easily isolated in the kernel space.
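As an illustration of the statistics-based idea described above, the following is a minimal sketch of a global RX-style detector, assuming the image is stored as a NumPy array of shape (height, width, bands). The function name `global_rx` and the use of a pseudo-inverse for the covariance matrix are choices made for this sketch and are not taken from [26].

```python
import numpy as np

def global_rx(cube):
    """Score every pixel by its squared Mahalanobis distance to the
    global background statistics (mean and covariance of all pixels)."""
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b).astype(float)

    # Background statistics estimated from the entire image.
    mu = pixels.mean(axis=0)
    centered = pixels - mu
    cov = centered.T @ centered / (pixels.shape[0] - 1)

    # Pseudo-inverse keeps the sketch usable when the covariance is
    # ill-conditioned (assumption of this sketch, common with many bands).
    cov_inv = np.linalg.pinv(cov)

    # Row-wise quadratic form (x - mu)^T C^{-1} (x - mu).
    scores = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return scores.reshape(h, w)
```

Larger scores indicate pixels that deviate more strongly from the assumed Gaussian background. A local variant would replace the global mean and covariance with statistics estimated from a window around each test pixel.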
Building on advances in compressed sensing theory, representation-based methods can detect anomalies without specific assumptions about the background distribution. The fundamental concept of representation-based methods is that every test pixel can be reconstructed from a specific background dictionary under a given model, where the reconstruction residual represents the abnormal level of the pixel. Representation-based methods include collaborative representation (CR), sparse representation (SR), and low-rank representation (LRR). The CR-based methods assume that collaboration between dictionary atoms is more crucial than competition in small-sample cases. Specifically, a background pixel can be well represented as a linear combination of its surroundings while an anomaly cannot, and the representation residual is taken as the abnormal level of the test pixel [
36,
37]. To improve the robustness of CR and reduce testing time, a non-negative-constrained joint collaborative representation (NJCR) [
38] model has been proposed. In contrast to CR, SR focuses on the competition among atoms: it assumes that the background can be represented by a few atoms from an overcomplete dictionary. In [
39], based on a binary hypothesis, background joint sparse representation was utilized to detect anomalies. The robust background regression-based score estimation algorithm (RBRSE) [
40] exploits a kernel expansion technique to formulate the information as a density feature representation to facilitate robust background estimation. Owing to the high spatial and spectral correlation in HSIs, the background exhibits global low-rank characteristics, while anomalies are sparse because of their low probability of occurrence and limited presence. Therefore, unlike the pixel-by-pixel detection of CR and SR, LRR locates anomalies by characterizing the global structure of the HSIs [
41]. To exploit spatial–spectral information, the adaptive low-rank transformed tensor method [42] imposes a low-rank constraint on the frontal slices of the transformed tensor. Because pixels are often highly mixed, Qu et al. [43,44] built on low-rank decomposition and obtained discriminative vectors through spectral unmixing.
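The collaborative-representation idea described above can be sketched as follows. This is a simplified illustration, not the exact formulation of [36,37]: it uses a plain ridge (l2) penalty without distance-based weighting, and the names `cr_residual` and `cr_detector` as well as the `outer`, `inner`, and `lam` parameters are hypothetical.

```python
import numpy as np

def cr_residual(pixel, atoms, lam=1e-2):
    """Represent `pixel` as an l2-regularized linear combination of the
    rows of `atoms` and return the norm of the representation residual."""
    gram = atoms @ atoms.T
    weights = np.linalg.solve(gram + lam * np.eye(len(atoms)), atoms @ pixel)
    return np.linalg.norm(pixel - weights @ atoms)

def cr_detector(cube, outer=5, inner=1, lam=1e-2):
    """Score each pixel by how poorly its surroundings reconstruct it.
    Atoms come from an outer window with an inner guard window removed."""
    h, w, _ = cube.shape
    r_out, r_in = outer // 2, inner // 2
    scores = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            atoms = []
            for di in range(-r_out, r_out + 1):
                for dj in range(-r_out, r_out + 1):
                    if max(abs(di), abs(dj)) <= r_in:
                        continue  # skip the guard window (includes the test pixel)
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        atoms.append(cube[ii, jj])
            scores[i, j] = cr_residual(cube[i, j], np.asarray(atoms), lam)
    return scores
```

A background pixel that lies close to the span of its neighbors yields a small residual, whereas a pixel whose spectrum cannot be synthesized from the surroundings yields a large one. The nested loops favor readability over speed.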
Since deep-learning-based methods can learn discriminative latent semantic features, they have emerged as a prominent area of research in recent years. To date, deep learning has tackled numerous challenging issues in the field of computer vision [
45,
46]. In a supervised manner, based on the convolutional neural network (CNN) and fully connected layer, Li et al. [
47] explored the performance of transfer learning for HAD. Many unsupervised deep-learning-based methods adopt the autoencoder (AE) as the backbone [
48]. In AE-based methods, the input layer encodes the test pixels $X$ into hidden layers with a lower dimension and sparsity, and then the output layer decodes the features to reconstruct the pixels $\hat{X}$. The residual between $X$ and $\hat{X}$ serves as the detection result. For instance, based on low-rank and sparse matrix decomposition, Zhao et al. [
49] developed a spectral–spatial stacked autoencoder to extract deep features. To reduce the high dimensionality and remove deteriorated spectral channels, an adversarial autoencoder [
50] has been proposed, which optimizes the model with an adaptive weight and the spectral angle distance. Wang et al. [
51] presented an autonomous AE-based HAD framework to reduce manual parameter setting and simplify the processing procedures. In [
52], an AE-based network preserved geometric structure by embedding a supergraph, which improved the detection performance. To suppress the feature representation of anomaly targets, the guided autoencoder (GAED) [53] adopted a multilayer AE network with a guided module.
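The reconstruction-residual principle shared by the AE-based detectors above can be illustrated with a deliberately simple stand-in. The sketch below does not train a neural network: it uses a linear autoencoder with tied weights obtained in closed form from the SVD, which spans the same subspace as PCA. The function name `linear_ae_scores` and the parameter `n_hidden` are hypothetical, and none of the cited methods is reproduced here.

```python
import numpy as np

def linear_ae_scores(pixels, n_hidden):
    """Encode pixels (n_pixels, bands) into `n_hidden` features, decode
    them back, and score each pixel by its squared reconstruction error."""
    mean = pixels.mean(axis=0)
    centered = pixels - mean

    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    w = vt[:n_hidden].T                 # shared encoder/decoder weights

    hidden = centered @ w               # encoding: lower-dimensional features
    recon = hidden @ w.T + mean         # decoding: reconstructed pixels
    return np.sum((pixels - recon) ** 2, axis=1)
```

Because the hidden layer is too narrow to capture everything, it mainly retains the dominant background structure, so rare anomalous spectra tend to be reconstructed poorly and receive large scores. Nonlinear AEs follow the same scoring rule with learned encoders and decoders.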
Although the representations learned by AEs benefit background estimation, AE-based methods still have several problems. First, most AE-style methods omit band selection as a preprocessing step. The strong spectral redundancy of HSIs may degrade performance, and the excessive number of spectral bands leads to a significant computational burden [
54,
55]. Moreover, high-dimensional volumes may create “the curse of dimensionality”, which decreases detection accuracy [
56,
57]. Second, despite the recent advancements in AE-based techniques for HAD, it is important to acknowledge that the information equivalence between input and supervision in reconstruction cannot effectively force the AE to learn the required semantic features [
58]. Besides, the local spatial characteristics of the HSI and the inter-pixel correlation are not explicitly considered when adopting AE to reconstruct pixels, and the lack of prior spectral and spatial information can impact the performance of HAD [
52,
53].
In order to address the abovementioned drawbacks, this study proposes a novel multi-prior graph autoencoder (MPGAE) for hyperspectral anomaly detection. First, inspired by PTA [
59], which utilizes a piecewise-smooth prior to achieve total variation norm regularization, a novel band selection module is designed to simplify the HSIs by removing redundant spectral information. Next, to balance the reconstruction of the background and anomaly targets, a new loss function is presented that combines the global RX detector with a local salient weight derived from the local salient prior. Finally, using the new loss function and the compressed HSIs, the supergraph [52] is introduced into the autoencoder to produce the final detection result.
Compared with other HAD approaches, the major contributions of the proposed MPGAE can be summarized as follows:
(1) The MPGAE is proposed for detecting anomalies in hyperspectral images. Based on the piecewise-smooth prior, the band selection module eliminates unnecessary spectral bands to improve performance.
(2) Based on the combination of a global RX detector and a local salient weight, a new loss function is presented. The loss function improves performance by adjusting background and anomaly feature learning.
(3) The supergraph [52] is introduced into the autoencoder to preserve spatial consistency and information about the local geometric structure, which improves the robustness of the proposed MPGAE.
Experiments on five real datasets captured by various sensors, evaluated with extensive metrics and both quantitative and visual comparisons, demonstrate that the proposed MPGAE is significantly superior to the competing methods in terms of detection accuracy.
The remainder of this article is arranged as follows. Section 2 introduces the details of the proposed MPGAE. Section 3 discusses the experimental results of MPGAE in comparison with other advanced hyperspectral anomaly detectors on five real hyperspectral datasets. Finally, Section 4 summarizes this paper and outlines directions for future research.