#### *2.3. LS-RXD*

The traditional L-RXD exploits only one sliding window to estimate the neighborhood background statistics for the pixel under test. It is difficult for L-RXD to detect a multi-pixel anomaly target if the local distributions of some windows are mostly occupied by anomaly pixels, because the background statistics will be seriously contaminated by anomaly pixels. In order to solve this problem, a local summation RX detector is proposed in [13]. Figure 1 uses multiple local windows of size 3 × 3 to demonstrate the implementation of the local summation strategy.

**Figure 1.** Multiple local window filters [13]: (**a**) Window 1; (**b**) Window 2; and (**c**) Window 9.

As illustrated in Figure 1, if the local window is chosen to be of size 3 × 3, nine local windows are taken for the pixel under test, represented by a yellow pixel. For an *ω* × *ω* local window, the sliding filter contains *ω* × *ω* local windows for summation. The summation detection result for the pixel under test *r* is specified by

$$\delta\_{LS-RXD}(r) = \sum\_{i=1}^{\omega \times \omega} \left( r - \mu\_{\mathbf{W}i} \right)^{T} \Sigma\_{\mathbf{W}i}^{-1} (r - \mu\_{\mathbf{W}i}) \tag{3}$$

where *W<sub>i</sub>* is the dataset of local pixel samples from window *i*, and *μ*<sub>*W<sub>i</sub>*</sub> and Σ<sub>*W<sub>i</sub>*</sub> are the mean vector and covariance matrix of *W<sub>i</sub>*, respectively.

Suppose that the pixel sample dataset in the local window is denoted by *W* = {*r*<sub>*p<sub>ij</sub>*</sub>}, where *i* = 1, 2, …, *ω*, *j* = 1, 2, …, *ω*, and *p<sub>ij</sub>* is the global location of *r*<sub>*p<sub>ij</sub>*</sub> in the whole data set {*r<sub>i</sub>*}<sup>*N*</sup><sub>*i*=1</sub>. As a matter of fact, the LS-RXD specified by (3) can be implemented by recursively updating the detection result of each pixel in *W* as the window slides, that is

$$\delta\_{LS-RXD}^{t+1}(\mathbf{r}\_{p\_{ij}}) = \delta\_{LS-RXD}^{t}(\mathbf{r}\_{p\_{ij}}) + (\mathbf{r}\_{p\_{ij}} - \boldsymbol{\mu}\_{\mathbf{W}})^{T} \boldsymbol{\Sigma}\_{\mathbf{W}}^{-1}(\mathbf{r}\_{p\_{ij}} - \boldsymbol{\mu}\_{\mathbf{W}}) \tag{4}$$

where *μ<sub>W</sub>* and Σ<sub>*W*</sub> are the mean vector and covariance matrix of *W*, and *δ*<sup>*t*</sup><sub>*LS*−*RXD*</sub> and *δ*<sup>*t*+1</sup><sub>*LS*−*RXD*</sub> are the detection results after *t* and *t* + 1 updates, respectively.

In doing so, the only difference between L-RXD and LS-RXD is that, as the local window slides, L-RXD calculates only the detection result of the centered pixel in the local window, while LS-RXD updates the detection results of all *ω* × *ω* pixels in the local window.
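As a concrete illustration, Equation (3) can be sketched in a few lines of NumPy. The function name, the `eps` diagonal loading (added to keep each local covariance invertible), and the way the windows are passed in are illustrative assumptions, not part of [13]:

```python
import numpy as np

def ls_rxd_score(r, windows, eps=1e-6):
    """Eq. (3): sum of Mahalanobis distances of r over all local windows W_i.

    r       -- (L,) spectral vector of the pixel under test
    windows -- list of (n_i, L) arrays, the pixel samples of each window W_i
    eps     -- diagonal loading to keep each covariance invertible (assumption)
    """
    score = 0.0
    for W in windows:
        mu = W.mean(axis=0)
        Sigma = np.cov(W, rowvar=False) + eps * np.eye(W.shape[1])
        d = r - mu
        # (r - mu)^T Sigma^{-1} (r - mu), via a linear solve instead of inv()
        score += d @ np.linalg.solve(Sigma, d)
    return float(score)
```

In the recursive form of Equation (4), this per-window quadratic term is instead accumulated onto each pixel's running score as the window slides.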

It is worth noting that the local summation RX detector in [13] is referred to as LSAD for short. Subspace feature projection is used in LSAD to approximately calculate the Σ<sub>*W<sub>i</sub>*</sub><sup>−1</sup> in Equation (3), endowing LSAD with robust background feature statistics. However, the subspace feature projection makes timely processing difficult in practice. Band selection onboard, before data transmission, is a feasible way to avert the singularity of the inverse of a local covariance matrix. Therefore, we focus only on the recursive processing of L-RXD and LS-RXD in the following.

## **3. Recursive LS-RXD**

In the aforementioned local summation detection algorithms, a new local covariance matrix inverse is recalculated as the local window slides. The key issue in making L-RXD and LS-RXD recursive is how to perform a recursive computation for every independent covariance matrix inversion.

In what follows, we describe how to calculate the covariance matrix inverse of a causal sliding array window recursively.

#### *3.1. The Covariance Matrix Inversion of Causal Sliding Array Window*

Figure 2 shows the causal sliding array window at *r*<sub>*n*−1</sub>, depicted by dotted lines, and the causal sliding array window at *r<sub>n</sub>*, depicted by dashed lines, where *a* is the array window size. Moving from *r*<sub>*n*−1</sub> to *r<sub>n</sub>*, the farthest pixel *r*<sub>*n*−*a*</sub> is removed from the window, while the most recent data sample vector *r<sub>n</sub>* is added to it.

**Figure 2.** Causal sliding array window at *r<sub>n</sub>* with width specified by *a*.

Define *R<sub>a</sub>*(*n*) = (1/*a*) ∑<sub>*r<sub>i</sub>*∈*W*</sub> *r<sub>i</sub>r<sub>i</sub>*<sup>*T*</sup>, where *W* = {*r<sub>i</sub>*}<sup>*n*</sup><sub>*i*=*n*−*a*+1</sub>. *R<sub>a</sub>*(*n*) is called the "causal" sample correlation matrix, formed by the data sample vectors in the causal sliding array window. Then *R<sub>a</sub>*(*n*) can further be expressed as

$$\mathbf{R}\_{a}(n) = \left(\mathbf{R}\_{a}(n-1) - \frac{\mathbf{r}\_{n-a}\mathbf{r}\_{n-a}^T}{a}\right) + \frac{\mathbf{r}\_n \mathbf{r}\_n^T}{a} \tag{5}$$

By applying the following Woodbury matrix identity [20] twice:

$$[A + uv^T]^{-1} = A^{-1} - \frac{[A^{-1}u][v^T A^{-1}]}{1 + v^T A^{-1} u} \tag{6}$$

the inverse of *R<sub>a</sub>*(*n*) can be updated recursively from *R*<sub>*a*</sub><sup>−1</sup>(*n* − 1) by virtue of (7) and (8) [18]

$$\mathbf{R}\_{a}^{-1}(n) = \left(\mathbf{R}\_{a}(n-1) - \frac{\mathbf{r}\_{n-a}\mathbf{r}\_{n-a}^{T}}{a}\right)^{-1} - \frac{\left[\left(\mathbf{R}\_{a}(n-1) - \frac{\mathbf{r}\_{n-a}\mathbf{r}\_{n-a}^{T}}{a}\right)^{-1}\frac{\mathbf{r}\_{n}}{\sqrt{a}}\right]\left[\frac{\mathbf{r}\_{n}^{T}}{\sqrt{a}}\left(\mathbf{R}\_{a}(n-1) - \frac{\mathbf{r}\_{n-a}\mathbf{r}\_{n-a}^{T}}{a}\right)^{-1}\right]}{1 + \frac{\mathbf{r}\_{n}^{T}}{\sqrt{a}}\left(\mathbf{R}\_{a}(n-1) - \frac{\mathbf{r}\_{n-a}\mathbf{r}\_{n-a}^{T}}{a}\right)^{-1}\frac{\mathbf{r}\_{n}}{\sqrt{a}}}\tag{7}$$

$$\left(\mathbf{R}\_a(n-1) - \frac{\mathbf{r}\_{n-a}\mathbf{r}\_{n-a}^T}{a}\right)^{-1} = \mathbf{R}\_a^{-1}(n-1) + \frac{[\mathbf{R}\_a^{-1}(n-1)\frac{\mathbf{r}\_{n-a}}{\sqrt{a}}][\frac{\mathbf{r}\_{n-a}^T}{\sqrt{a}}\mathbf{R}\_a^{-1}(n-1)]}{1 - \frac{\mathbf{r}\_{n-a}^T}{\sqrt{a}}\mathbf{R}\_a^{-1}(n-1)\frac{\mathbf{r}\_{n-a}}{\sqrt{a}}}\tag{8}$$
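A minimal NumPy sketch of the two Woodbury steps (7) and (8); the helper names are illustrative assumptions. Removing the oldest sample and adding the newest one updates the inverse using only rank-one operations:

```python
import numpy as np

def remove_sample(R_inv, r_out, a):
    # Eq. (8): inverse after subtracting r_out r_out^T / a (note the minus sign
    # in A + uv^T flips the denominator to 1 - v^T A^{-1} u)
    u = r_out / np.sqrt(a)
    Ru = R_inv @ u
    return R_inv + np.outer(Ru, u @ R_inv) / (1.0 - u @ Ru)

def add_sample(B_inv, r_in, a):
    # Eq. (7): inverse after adding r_in r_in^T / a
    u = r_in / np.sqrt(a)
    Bu = B_inv @ u
    return B_inv - np.outer(Bu, u @ B_inv) / (1.0 + u @ Bu)
```

Chaining `remove_sample` then `add_sample` reproduces the direct inverse of *R<sub>a</sub>*(*n*) without any O(*L*³) matrix inversion per step.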

The "causal" covariance matrix formed by all the data sample vectors in the sliding array window can be specified by

$$\mathbf{K}\_{a}(n) = \mathbf{R}\_{a}(n) - \mu\_{a}(n)\mu\_{a}^{T}(n) \tag{9}$$

where

$$\mu\_{a}(n) = \mu\_{a}(n-1) + \frac{1}{a}(\mathbf{r}\_{n} - \mathbf{r}\_{n-a}) \tag{10}$$

is the "causal" sample mean of the sliding array window. Using the *Woodbury* matrix identity again, letting *A* = *R<sub>a</sub>*(*n*), *u* = −*μ<sub>a</sub>*(*n*), *v* = *μ<sub>a</sub>*(*n*), *K*<sub>*a*</sub><sup>−1</sup>(*n*) can be further expressed as

$$\mathbf{K}\_{a}^{-1}(n) = \mathbf{R}\_{a}^{-1}(n) + \frac{[\mathbf{R}\_{a}^{-1}(n)\mu\_{a}(n)][\mu\_{a}^{T}(n)\mathbf{R}\_{a}^{-1}(n)]}{1 - \mu\_{a}^{T}(n)\mathbf{R}\_{a}^{-1}(n)\mu\_{a}(n)}\tag{11}$$

By virtue of (7), (8), (10) and (11), *K*<sub>*a*</sub><sup>−1</sup>(*n*) can be updated recursively from *R*<sub>*a*</sub><sup>−1</sup>(*n* − 1) and *μ<sub>a</sub>*(*n* − 1) by deleting the pixel *r*<sub>*n*−*a*</sub> and adding the current pixel *r<sub>n</sub>*.
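The mean update (10) and the final Woodbury step (11) can be sketched as follows; the function names are illustrative assumptions:

```python
import numpy as np

def update_mean(mu_prev, r_new, r_old, a):
    # Eq. (10): causal sample mean of the sliding array window
    return mu_prev + (r_new - r_old) / a

def causal_cov_inverse(R_inv, mu):
    # Eq. (11): K_a^{-1}(n) from R_a^{-1}(n) and mu_a(n), i.e. one more
    # Woodbury step with A = R_a(n), u = -mu, v = mu in Eq. (6)
    Rm = R_inv @ mu
    return R_inv + np.outer(Rm, mu @ R_inv) / (1.0 - mu @ Rm)
```

Note that `causal_cov_inverse` never forms *K<sub>a</sub>*(*n*) itself; it produces its inverse directly from the correlation inverse and the running mean.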

#### *3.2. Recursive Processing of the Covariance Matrix Inversion of Sliding Window*

Figure 3 illustrates two successive sliding windows of size *ω* × *ω*, depicted by black dashed lines and orange dashed lines, respectively, where *r*<sub>*p<sub>ωω</sub>*+1</sub> denotes the most recently received data sample vector. The sample data vectors in the sliding windows can be updated in *ω* steps by removing one pixel and adding one pixel at each step. Suppose that *p<sub>ωω</sub>* = *n* − 1. The inverses of the correlation matrices of the local windows at *r*<sub>*p<sub>ωω</sub>*</sub> and *r*<sub>*p<sub>ωω</sub>*+1</sub> are denoted by *R*<sub>ω²</sub><sup>−1</sup>(*n* − 1) and *R*<sub>ω²</sub><sup>−1</sup>(*n*), respectively, and the inverses of the covariance matrices of the local windows at *r*<sub>*p<sub>ωω</sub>*</sub> and *r*<sub>*p<sub>ωω</sub>*+1</sub> are denoted by *K*<sub>ω²</sub><sup>−1</sup>(*n* − 1) and *K*<sub>ω²</sub><sup>−1</sup>(*n*), respectively. In analogy with (7), (8), (10) and (11), *K*<sub>ω²</sub><sup>−1</sup>(*n*) can be updated recursively as follows.


**Figure 3.** Sliding window with size of *ω* × *ω*.

For *i* = 1 to *ω*, do

$$\left(\mathbf{R}\_{\omega^2}(n-1) - \frac{\mathbf{r}\_{p\_{i1}}\mathbf{r}\_{p\_{i1}}^T}{\omega^2}\right)^{-1} = \mathbf{R}\_{\omega^2}^{-1}(n-1) + \frac{[\mathbf{R}\_{\omega^2}^{-1}(n-1)\frac{\mathbf{r}\_{p\_{i1}}}{\omega}][\frac{\mathbf{r}\_{p\_{i1}}^T}{\omega}\mathbf{R}\_{\omega^2}^{-1}(n-1)]}{1 - \frac{\mathbf{r}\_{p\_{i1}}^T}{\omega}\mathbf{R}\_{\omega^2}^{-1}(n-1)\frac{\mathbf{r}\_{p\_{i1}}}{\omega}}\tag{12}$$

$$\mathbf{R}\_{\omega^2}^{-1}(n) = \left(\mathbf{R}\_{\omega^2}(n-1) - \frac{\mathbf{r}\_{p\_{i1}}\mathbf{r}\_{p\_{i1}}^T}{\omega^2}\right)^{-1} - \frac{\left[\left(\mathbf{R}\_{\omega^2}(n-1) - \frac{\mathbf{r}\_{p\_{i1}}\mathbf{r}\_{p\_{i1}}^T}{\omega^2}\right)^{-1}\frac{\mathbf{r}\_{p\_{i\omega}}}{\omega}\right]\left[\frac{\mathbf{r}\_{p\_{i\omega}}^T}{\omega}\left(\mathbf{R}\_{\omega^2}(n-1) - \frac{\mathbf{r}\_{p\_{i1}}\mathbf{r}\_{p\_{i1}}^T}{\omega^2}\right)^{-1}\right]}{1 + \frac{\mathbf{r}\_{p\_{i\omega}}^T}{\omega}\left(\mathbf{R}\_{\omega^2}(n-1) - \frac{\mathbf{r}\_{p\_{i1}}\mathbf{r}\_{p\_{i1}}^T}{\omega^2}\right)^{-1}\frac{\mathbf{r}\_{p\_{i\omega}}}{\omega}} \tag{13}$$

Meanwhile, set *R*<sub>ω²</sub><sup>−1</sup>(*n* − 1) = *R*<sub>ω²</sub><sup>−1</sup>(*n*) after each iteration. Then

$$\mu\_{\omega^2}(n) = \mu\_{\omega^2}(n-1) + \frac{1}{\omega^2}\sum\_{i=1}^{\omega} \left(\mathbf{r}\_{p\_{i\omega}} - \mathbf{r}\_{p\_{i1}}\right) \tag{14}$$

$$\mathbf{K}\_{\omega^2}^{-1}(n) = \mathbf{R}\_{\omega^2}^{-1}(n) + \frac{[\mathbf{R}\_{\omega^2}^{-1}(n)\mu\_{\omega^2}(n)][\mu\_{\omega^2}^T(n)\mathbf{R}\_{\omega^2}^{-1}(n)]}{1 - \mu\_{\omega^2}^T(n)\mathbf{R}\_{\omega^2}^{-1}(n)\mu\_{\omega^2}(n)}\tag{15}$$
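The *ω*-step loop of Equations (12)–(15) can be sketched in one self-contained NumPy function. The function and helper names are illustrative assumptions; the split of the update into the *ω* pixels leaving the window and the *ω* pixels entering it follows Figure 3:

```python
import numpy as np

def _rank1(R_inv, u, sign):
    # Woodbury rank-one step: inverse of (R + sign * u u^T) from R^{-1}
    Ru = R_inv @ u
    return R_inv - sign * np.outer(Ru, u @ R_inv) / (1.0 + sign * (u @ Ru))

def slide_inverse(R_inv, mu, old_col, new_col, w):
    """One window slide, Eqs. (12)-(15).

    old_col / new_col -- the w pixels leaving / entering the w x w window
    Returns the updated (R_inv, mu, K_inv)."""
    a = w * w
    for r_out, r_in in zip(old_col, new_col):
        R_inv = _rank1(R_inv, r_out / np.sqrt(a), -1.0)  # Eq. (12): remove
        R_inv = _rank1(R_inv, r_in / np.sqrt(a), +1.0)   # Eq. (13): add
    # Eq. (14): mean update over the whole slide
    mu = mu + (np.sum(new_col, axis=0) - np.sum(old_col, axis=0)) / a
    # Eq. (15): covariance inverse from the correlation inverse and the mean
    Rm = R_inv @ mu
    K_inv = R_inv + np.outer(Rm, mu @ R_inv) / (1.0 - mu @ Rm)
    return R_inv, mu, K_inv
```

For the rank-one algebra it does not matter which bookkeeping order the leaving and entering pixels are processed in, as long as each step removes one sample and adds one sample.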

#### *3.3. Recursive Processing of LS-RXD*

Except for the recursive processing of the covariance matrix inversion of the sliding window, some other issues should also be considered.

The first issue is edge expansion. To ensure that no detections are missed at the edge of an image, edge expansion is usually performed as a preprocessing step for a local window detector. Due to the low probability of anomaly targets appearing in hyperspectral images, the expanded layers can be chosen randomly from the whole image [13]. With this consideration in mind, take the window of size 3 × 3 as an example. We design the sliding window strategy depicted in Figure 4, where the yellow, blue and purple grids, respectively, denote the latest pixel received, the processed data and the pixels to be processed. As Figure 4 shows, when the sliding window meets the right border of the hyperspectral image, the next several sliding windows cross the border by moving down one line and adding new data one by one. The last sliding window moves to the bottom-right until the last data sample *r*<sub>*Row*×*Col*</sub> is received. This design makes the recursive processing of LS-RXD more convenient.

**Figure 4.** Sliding window strategy for recursive local summation RXD (R-LS-RXD): (**a**) *No*.(*Col* − 2) window; (**b**) *No*.(*Col* − 1) window; (**c**) *No*.*Col* window; (**d**) *No*.(*Col* + 1) window.

The second issue is how to keep track of which data sample vector should be removed and which should be added as the window moves on. Let *W* = {*r*<sub>*p<sub>ij</sub>*</sub>}<sup>*ω*</sup><sub>*i*,*j*=1</sub> denote the local sliding window in an image of size *Row* × *Col*, where *p<sub>ij</sub>* is the global location of *r*<sub>*p<sub>ij</sub>*</sub> in the whole data set {*r<sub>i</sub>*}<sup>*N*</sup><sub>*i*=1</sub>. For the first local window, *p<sub>ij</sub>* can be expressed as *p<sub>ij</sub>* = (*i* − 1) × *Col* + *j*. Using the strategy of Figure 4, it is very easy to update the global locations of the pixels in the follow-up window successively as *p<sub>ij</sub>* = *p<sub>ij</sub>* + 1.
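This index bookkeeping can be sketched in two lines; the helper name and the 1-based, row-major convention are illustrative assumptions:

```python
def first_window_indices(w, Col):
    # p_ij = (i - 1) * Col + j for the first w x w window (1-based, row-major);
    # each subsequent window shifts every index by +1
    return [[(i - 1) * Col + j for j in range(1, w + 1)] for i in range(1, w + 1)]
```

Each row of the returned table differs from the previous one by the image width `Col`, so a slide of the window is just a uniform increment of all entries.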

After the aforementioned issues are solved, the recursive LS-RXD, referred to as R-LS-RXD, can be obtained by

$$\delta\_{R-LS-RXD}^{t+1}(\mathbf{r}\_{p\_{ij}}) = \delta\_{R-LS-RXD}^{t}(\mathbf{r}\_{p\_{ij}}) + \left(\mathbf{r}\_{p\_{ij}} - \mu\_{\omega^2}(n)\right)^{T} \mathbf{K}\_{\omega^2}^{-1}(n) \left(\mathbf{r}\_{p\_{ij}} - \mu\_{\omega^2}(n)\right) \tag{16}$$

Three comments are worthwhile:


#### *3.4. Background Suppression of Sliding Windows*

This section further discusses background suppression for the sliding windows. It is not appropriate to include the current pixel under test in the local window background along with the other data samples, because doing so reduces the separation between background information and anomaly information when the pixel under test is itself an anomaly [21]. In order to suppress the background information and improve the detection performance, we remove the current pixel under test (*r<sub>k</sub>*) from the recursive update processing.

Assume that *R<sub>k</sub>* is the correlation matrix with *r<sub>k</sub>* removed; *R<sub>k</sub>* is specified by

$$\begin{split} \mathbf{R}\_{k} &= \frac{1}{n-1} \sum\_{i=1, i \neq k}^{n} r\_{i} r\_{i}^{T} = \frac{1}{n-1} (\sum\_{i=1}^{n} r\_{i} r\_{i}^{T} - r\_{k} r\_{k}^{T}) \\ &= \frac{n}{n-1} \frac{1}{n} \sum\_{i=1}^{n} r\_{i} r\_{i}^{T} - \frac{1}{n-1} r\_{k} r\_{k}^{T} = \frac{n}{n-1} \mathbf{R}\_{n} - \frac{1}{n-1} r\_{k} r\_{k}^{T} \end{split} \tag{17}$$

Using the *Woodbury* matrix identity once more, letting *A* = (*n*/(*n* − 1))*R<sub>n</sub>*, *u* = −(1/(*n* − 1))*r<sub>k</sub>*, *v* = *r<sub>k</sub>*, then

$$\begin{split} \mathbf{R}\_{k}^{-1} &= (\frac{n}{n-1}\mathbf{R}\_{n} - \frac{1}{n-1}\mathbf{r}\_{k}\mathbf{r}\_{k}^{T})^{-1} \\ &= \frac{n-1}{n}\mathbf{R}\_{n}^{-1} + \frac{[\frac{n-1}{n}\mathbf{R}\_{n}^{-1}\frac{1}{n-1}\mathbf{r}\_{k}][r\_{k}^{T}\frac{n-1}{n}\mathbf{R}\_{n}^{-1}]}{1 - r\_{k}^{T}\frac{n-1}{n}\mathbf{R}\_{n}^{-1}\frac{1}{n-1}r\_{k}} \end{split} \tag{18}$$

Assume that *μ<sub>k</sub>* is the mean vector of the background sample data with *r<sub>k</sub>* removed; the inverse of the covariance matrix can then be specified by

$$\mathbf{K}\_{\rm BS}^{-1}(n) = \mathbf{R}\_k^{-1}(n) + \frac{[\mathbf{R}\_k^{-1}(n)\mu\_k(n)][\mu\_k^T(n)\mathbf{R}\_k^{-1}(n)]}{1 - \mu\_k^T(n)\mathbf{R}\_k^{-1}(n)\mu\_k(n)}\tag{19}$$

As a result, the background suppression recursive R-BS-LS-RXD can be specified by

$$\delta\_{R-BS-LS-RXD}(r\_k) = (r\_k - \mu\_k(n))^T \mathbf{K}\_{BS}^{-1}(n) (r\_k - \mu\_k(n)) \tag{20}$$
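Equations (17)–(20) amount to one leave-one-out correction per pixel under test, which can be sketched as follows. The function name is an illustrative assumption, and the closed-form mean *μ<sub>k</sub>* = (*n μ* − *r<sub>k</sub>*)/(*n* − 1) is the standard leave-one-out mean:

```python
import numpy as np

def bs_score(r_k, R_inv, mu, n):
    """Eqs. (17)-(20): background-suppressed score for the pixel under test.

    R_inv, mu -- inverse correlation matrix and mean of all n window samples
    """
    # Eq. (18): R_k^{-1} from R_n^{-1} via one Woodbury step,
    # with A = (n/(n-1)) R_n, so A^{-1} = ((n-1)/n) R_n^{-1}
    A_inv = (n - 1) / n * R_inv
    u = r_k / np.sqrt(n - 1)
    Au = A_inv @ u
    Rk_inv = A_inv + np.outer(Au, u @ A_inv) / (1.0 - u @ Au)
    # Leave-one-out mean with r_k removed
    mu_k = (n * mu - r_k) / (n - 1)
    # Eq. (19): covariance inverse with the mean subtracted out
    Rm = Rk_inv @ mu_k
    K_bs_inv = Rk_inv + np.outer(Rm, mu_k @ Rk_inv) / (1.0 - mu_k @ Rm)
    d = r_k - mu_k
    return float(d @ K_bs_inv @ d)           # Eq. (20)
```

Because *r<sub>k</sub>* contributes neither to the mean nor to the correlation matrix here, an anomalous pixel under test no longer contaminates its own background statistics.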

#### *3.5. Computational Complexity Analysis*

This section provides a detailed analysis on the computational complexity of calculating recursive update Equations (12)–(15).

The advantage of using causal sliding windows over local windows is the use of the recursive Equations (12) and (13), in which the *Woodbury* identity is applied twice, instead of recalculating the inverse from scratch whenever a new data sample vector comes in. Table 1 shows the computational complexity of the relevant matrix algebra operations. Based on Table 1, the computational complexity of matrix inversion is higher than that of matrix multiplication.



The usage frequency of the *Woodbury* matrix identity is determined by the size of the sliding window. Local background information is updated by calculating Equations (12) and (13) *ω* times, regardless of the number of pixels in the local background. Such a significant benefit arises from the recursive nature of (12) and (13). Hence, processing a single local window of size *ω* × *ω* requires *ω* rounds of matrix multiplications; in addition, the inverse of the covariance matrix needs to be calculated only once.

Table 2 tabulates the number of floating-point operations (flops) required by LS-RXD and R-LS-RXD, which update *K*<sub>*a*</sub><sup>−1</sup>(*n*) in different ways, where the number of bands is denoted by *L*, the local window size by *a* = *ω* × *ω*, and the number of pixels to be processed by *N*. These parameters determine the number of flops in each algorithm. Figure 5 plots the number of floating-point operations required by each algorithm versus *L*, *a* and *N*. The parameter configurations are shown in Table 3.

**Table 2.** Computational Complexity for local summation RXD (LS-RXD) and recursive local summation RXD (R-LS-RXD).




As shown in Figure 5, the comparison of different anomaly detectors depends on the specific configuration of the parameters. Generally speaking, R-LS-RXD is faster than LS-RXD.

**Figure 5.** Numbers of floating-point operations versus (**a**) number of bands; (**b**) window size *ω*; (**c**) number of processed pixels.

## **4. Results and Discussion**

To demonstrate the performance of anomaly detection using the recursive local summation RXD, experiments were conducted on two real hyperspectral image scenes. The first data set is an Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) image scene of the San Diego airport area in California. A sub-image of size 100 × 100 and its ground truth are shown in Figure 6a,b, respectively. The scene was acquired in 224 spectral bands with spectral coverage from 0.4 to 2.5 μm, a spatial resolution of 3 m and a spectral resolution of 20 nm. After removing low signal-to-noise ratio (SNR) and water absorption bands, a total of 126 spectral bands were used for the experiments.

**Figure 6.** San Diego hyperspectral image: (**a**) 30th band scene; (**b**) ground truth.

The second image data set is the Hyperspectral Digital Imagery Collection Experiment (HYDICE) image scene shown in Figure 7a, which was collected in August 1995 from a flight altitude of 10,000 ft with a ground sampling distance of approximately 1.56 m. This scene has been studied extensively in many reports, such as [2,14]. It has a total of 169 bands used for the experiments, after removing the low-signal/high-noise bands (bands 1–3 and 202–210) and the water vapor absorption bands (bands 101–112 and 137–153). There are 15 panels with three different sizes of 3 m × 3 m, 2 m × 2 m and 1 m × 1 m. Figure 7b shows the precise spatial locations of these 15 panels, where red pixels (R pixels) are the panel center pixels and the pixels in yellow (Y pixels) are panel boundary pixels mixed with the background (BKG). As a result, there are a total of 19 R panel pixels. In particular, R panel pixels are denoted by *p<sub>ij</sub>*, with rows indexed by *i* = 1, ···, 5 and columns indexed by *j* = 1, 2, 3, except that the panels in the 1st column of the 2nd, 3rd, 4th and 5th rows are two-pixel panels, denoted by *p*<sub>211</sub>, *p*<sub>221</sub>, *p*<sub>311</sub>, *p*<sub>312</sub>, *p*<sub>411</sub>, *p*<sub>412</sub>, *p*<sub>511</sub>, *p*<sub>521</sub>. The 1.56 m spatial resolution of the image scene suggests that most of the 15 panels are one pixel in size.

**Figure 7.** (**a**) A Hyperspectral Digital Imagery Collection Experiment (HYDICE) panel scene which contains 15 panels; (**b**) Ground truth map of spatial locations of the 15 panels.

In order to quantitatively evaluate detection performance, receiver operating characteristic (ROC) curves are used to compare the different algorithms. Based on the provided ground truth, we can perform an analysis via ROC curves of the detection ratio (*P<sub>d</sub>*) versus the false alarm ratio (*P<sub>f</sub>*) by taking all possible thresholds (*τ*). We can further calculate the area under the ROC curve (AUC) for a quantitative performance analysis; the algorithm with the larger AUC value is regarded as having better performance.

The traditional ROC curve is a 2D plot of *P<sub>d</sub>* versus *P<sub>f</sub>*. Furthermore, we can plot another 2D ROC curve of *P<sub>f</sub>* versus *τ*, which provides crucial information on progressive background suppression as the threshold *τ* varies. This is especially useful when anomaly detection results must be interpreted by visual inspection with no ground truth available, or when AUC values indicate similar performance.
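The empirical (*P<sub>d</sub>*, *P<sub>f</sub>*) curve and its AUC can be computed by sweeping all thresholds at once, as in the following standard sketch (not the authors' code; the function name is an assumption):

```python
import numpy as np

def roc_curves(scores, truth):
    """(Pf, Pd) over all thresholds tau, plus the AUC Az(Pd, Pf).

    scores -- detector output per pixel; truth -- 1 for anomaly, 0 for background
    """
    order = np.argsort(-scores)            # descending: high score = anomaly
    t = np.asarray(truth)[order]
    pd = np.concatenate([[0.0], np.cumsum(t) / t.sum()])            # detection ratio
    pf = np.concatenate([[0.0], np.cumsum(1 - t) / (1 - t).sum()])  # false alarm ratio
    auc = np.sum(np.diff(pf) * (pd[1:] + pd[:-1]) / 2)              # trapezoid rule
    return pf, pd, auc
```

The (*P<sub>f</sub>*, *τ*) curve is obtained from the same sweep by plotting the false alarm ratio against the sorted score values used as thresholds.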

Three experiments are conducted with the purpose of: (1) evaluating the influence of window size on the detection performance of R-LS-RXD; (2) comparing the detection performance of different algorithms; and (3) comparing computing times of different algorithms, respectively.

#### *4.1. Optimum Size of Sliding Window*

Band selection is very practical in anomaly detection [22,23]; nine bands are selected by signal-to-noise ratio estimation and maximal information (SNRE-MI) [23] in the experiment to obtain better results. To investigate the influence of window size on the detection performance of R-LS-RXD, the two hyperspectral images from different sensors (AVIRIS and HYDICE) described in the previous section are used for experiments; the size of the sliding window varies from 5 × 5 up to 17 × 17 in steps of two pixels in side width. Figures 8a–g and 9a–g show the detection abundance fractional maps, in gray scale, of the AVIRIS and HYDICE hyperspectral images, respectively. According to the experiment, the detection result is poor with a window size of 5 × 5, where background and anomaly are difficult to separate for both sensors. The performance begins to improve as the window size increases. When the window size is greater than or equal to 11 × 11, the detection performances are similar by visual inspection, as shown in Figures 8 and 9. Figures 8h and 9h show the results of the global background K-RXD detector for comparison.

**Figure 8.** Detection abundance fractional maps of AVIRIS by recursive local summation RXD (R-LS-RXD) with different sliding window size: (**a**) *ω* =5; (**b**) *ω* =7; (**c**) *ω* =9; (**d**) *ω* =11; (**e**) *ω* =13; (**f**) *ω* =15; (**g**) *ω* =17; (**h**) global.

**Figure 9.** Detection abundance fractional maps of HYDICE by R-LS-RXD with different sliding window size: (**a**) *ω* =5; (**b**) *ω* =7; (**c**) *ω* =9; (**d**) *ω* =11; (**e**) *ω* =13; (**f**) *ω* =15; (**g**) *ω* =17; (**h**) global.

For a further quantitative evaluation of detection performance with different window sizes, ROC curves are employed. To simplify our study, ROC curves for the HYDICE data are not given, since the results are similar for both data sources. Figure 10 shows the ROC curves for the AVIRIS data with different window sizes, with a traditional (*P<sub>d</sub>*, *P<sub>f</sub>*) ROC analysis in (a) and a (*P<sub>f</sub>*, *τ*) curve analysis in (b). Additionally, the AUC values, denoted by *A<sub>z</sub>*, are calculated for each (*P<sub>d</sub>*, *P<sub>f</sub>*) curve and (*P<sub>f</sub>*, *τ*) curve. In general, the higher the value of *A<sub>z</sub>*(*P<sub>d</sub>*, *P<sub>f</sub>*) and the lower the value of *A<sub>z</sub>*(*P<sub>f</sub>*, *τ*), the better the detection performance. Results are tabulated in Table 4, with the best results highlighted.

**Figure 10.** Receiver operating characteristic (ROC) curve analysis for AVIRIS data with different window sizes: (**a**) curve of (*P<sub>d</sub>*, *P<sub>f</sub>*); (**b**) curve of (*P<sub>f</sub>*, *τ*).

Based on the results of Figure 10 and Table 4, as the window size goes up, the values of *A<sub>z</sub>*(*P<sub>d</sub>*, *P<sub>f</sub>*) increase while the values of *A<sub>z</sub>*(*P<sub>f</sub>*, *τ*) decrease. The detector reaches its best detection power at size 13 × 13, and the best background suppression is achieved at size 17 × 17. However, the decreasing trend of *A<sub>z</sub>*(*P<sub>f</sub>*, *τ*) slows down noticeably when the window size increases from 13 × 13 to 15 × 15. With a global-size background, the value of *A<sub>z</sub>*(*P<sub>d</sub>*, *P<sub>f</sub>*) decreases to an untrustworthy level, and the result is difficult to distinguish by visual inspection in Figures 8h and 9h.

The conclusions of the experiment are as follows. As the window size increases, the sliding-window RXD detector obtains better detection performance; 13 × 13 is the optimum size for the San Diego hyperspectral image. As an alternative interpretation, although a larger window size results in better background suppression, detection performance is much more important in detector evaluation.


**Table 4.** Area under the ROC curve (AUC) values of (*P<sub>d</sub>*, *P<sub>f</sub>*) and (*P<sub>f</sub>*, *τ*) with different window sizes.

#### *4.2. Performance Evaluation for Different Algorithms*

In this section, we compare the detection performance of the LS-RXD, causal sliding array window (CSA-RXD) [18], proposed R-LS-RXD and R-BS-LS-RXD. In order to obtain the best detection results, the sliding window is implemented with size of 13 × 13 for both AVIRIS and HYDICE hyperspectral images.

Detection results of the four detectors using the AVIRIS and HYDICE data are shown in Figures 11 and 12, respectively. The first row shows the gray-scale results with detected abundance fractions, and the second row shows the binary detection maps obtained at an appropriate threshold calculated by the Otsu algorithm [24].
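For reference, the Otsu algorithm [24] used to binarize the abundance maps can be written compactly from its histogram definition (this is a generic sketch of the method, not the authors' implementation):

```python
import numpy as np

def otsu_threshold(x, nbins=256):
    """Otsu's method: threshold maximizing the between-class variance."""
    hist, edges = np.histogram(x, bins=nbins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                        # background-class weight
    m = np.cumsum(p * centers)               # cumulative first moment
    mT = m[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        # between-class variance sigma_b^2 = (mT*w0 - m)^2 / (w0 * (1 - w0))
        sb = (mT * w0 - m) ** 2 / (w0 * (1 - w0))
    sb[~np.isfinite(sb)] = 0.0               # empty classes at the extremes
    return centers[np.argmax(sb)]
```

Applying this threshold to a detector's abundance fraction map yields the binary detection map.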

**Figure 11.** Detection results of AVIRIS data for different algorithms: (**a**) LS-RXD; (**b**) R-LS-RXD; (**c**) Recursive background Suppression local summation RXD (R-BS-LS-RXD); (**d**) Causal sliding array window (CSA-RXD).

**Figure 12.** Detection results of HYDICE data for different algorithms: (**a**) LS-RXD; (**b**) R-LS-RXD; (**c**) R-BS-LS-RXD; (**d**) CSA-RXD.

Both hyperspectral images, acquired by different sensors, lead to the same conclusions, which demonstrates the adaptability of the proposed algorithms to different sensors. It is evident from the detection results that CSA-RXD, which takes only partial advantage of spectral–spatial integration information, misses a number of targets by visual inspection, as shown in Figures 11d and 12d. On the contrary, the other anomaly detectors, which are implemented with spectral–spatial integrated information, acquire excellent detection performance. Most of the ground targets shown in Figures 6b and 7b are detected by LS-RXD, R-LS-RXD and R-BS-LS-RXD by visual inspection in Figures 11a–c and 12a–c. As also shown in the figures, R-BS-LS-RXD achieves better background suppression than R-LS-RXD and LS-RXD. This indicates that R-BS-LS-RXD not only correctly detects anomaly target pixels, as R-LS-RXD does, but also acquires excellent background suppression, as CSA-RXD does.

Similarly, to simplify our study, a quantitative evaluation with both traditional ROC curves and (*P<sub>f</sub>*, *τ*) curves is demonstrated in Figure 13. AUC values are listed in Table 5 only for the AVIRIS San Diego data, since the results are similar for both data sources.

**Figure 13.** ROC analysis with different detectors: (**a**) curve of (*P<sub>d</sub>*, *P<sub>f</sub>*); (**b**) curve of (*P<sub>f</sub>*, *τ*).


**Table 5.** AUC values with different detectors.

It is interesting to note that the ROC curves of LS-RXD, R-LS-RXD and R-BS-LS-RXD overlap completely. This indicates that these algorithms achieve similar detection power in the traditional ROC curve analysis. Meanwhile, R-BS-LS-RXD achieves better background suppression, as the (*P<sub>f</sub>*, *τ*) curves show. It is also clear that anomaly detectors with spectral–spatial integration have better performance: the ROC curves of LS-RXD, R-LS-RXD and R-BS-LS-RXD are much closer to the upper left corner than that of CSA-RXD.

The AUC values tabulated in Table 5 show that the proposed R-LS-RXD and R-BS-LS-RXD achieve detection performance similar to LS-RXD. In addition, *A<sub>z</sub>*(*P<sub>d</sub>*, *P<sub>f</sub>*) of LS-RXD, R-LS-RXD and R-BS-LS-RXD is greater than that of CSA-RXD. By contrast, R-BS-LS-RXD produced the lowest value of *A<sub>z</sub>*(*P<sub>f</sub>*, *τ*). In general, R-BS-LS-RXD can suppress the background information and improve the detection performance.

#### *4.3. Computing Time Comparison for Different Algorithms*

In order to verify the computational effectiveness of the recursive LS-RXD, we designed a comprehensive comparative analysis of the computer processing time (CPT) of R-LS-RXD and LS-RXD. The computer environment used for the experiments is a 64-bit operating system with an Intel i5-4590 central processing unit (CPU) at 3.3 GHz and 8 GB of random access memory (RAM). In order to remove timing jitter caused by the computer itself, the following complexity data are averaged over five runs. Table 6 tabulates the computing times of the algorithms with different window sizes on the San Diego hyperspectral image.


**Table 6.** Computing Time (seconds).

Based on the results in Table 6, the computing time of R-LS-RXD is significantly less than that of LS-RXD for every window size. In addition, the acceleration is particularly noticeable when the window is small. In the experiment, the speedup ratio is up to three when the window size is 7 × 7; as the window size grows, the speedup ratio remains at least two.

To further evaluate computational complexity, Figure 14 plots the computing time versus the number of processed pixels for both R-LS-RXD and LS-RXD on the San Diego hyperspectral image. Each algorithm was executed five times to produce an average computing time. As we can see, R-LS-RXD requires less time than LS-RXD because the former implements a recursive process, while the latter does not. As also shown in the figure, the computing time increases linearly as new pixels are added.

**Figure 14.** Plots of computing time versus number of processed pixels. (**a**) *ω* =7; (**b**) *ω* =9; (**c**) *ω* =11; (**d**) *ω* =13; (**e**) *ω* =15; (**f**) *ω* =17.
