Article

Boundary-Aware Deformable Spiking Neural Network for Hyperspectral Image Classification

1
State Key Laboratory of High-Performance Computing, College of Computer Science, National University of Defense Technology, Changsha 410073, China
2
College of Computer, National University of Defense Technology, Changsha 410073, China
3
College of Advanced Interdisciplinary Studies, National University of Defense Technology, Changsha 410073, China
4
Beijing Institute for Advanced Study, National University of Defense Technology, Beijing 100020, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(20), 5020; https://doi.org/10.3390/rs15205020
Submission received: 12 September 2023 / Revised: 13 October 2023 / Accepted: 16 October 2023 / Published: 19 October 2023
(This article belongs to the Section Remote Sensing Image Processing)

Abstract

A few spiking neural network (SNN)-based classifiers have been proposed for hyperspectral image (HSI) classification to alleviate the high computational energy cost problem. Nevertheless, because they lack the ability to distinguish boundaries, the existing SNN-based HSI classification methods are prone to the Hughes phenomenon, and classifier confusion at class boundaries is particularly pronounced. To remedy these issues, we propose a boundary-aware deformable spiking residual neural network (BDSNN) for HSI classification. A deformable convolutional neural network plays the key role in realizing the boundary awareness of the proposed model. To the best of our knowledge, this is the first attempt to combine the deformable convolutional mechanism with an SNN-based model. Additionally, spike-element-wise ResNet is used as the fundamental framework for going deeper, and a temporal-channel joint attention mechanism is introduced to filter out which channels and timesteps are critical. We evaluate the proposed model on four benchmark hyperspectral data sets: the IP, PU, SV, and HU data sets. The experimental results demonstrate that the proposed model obtains classification accuracy comparable to state-of-the-art methods in terms of overall accuracy (OA), average accuracy (AA), and the statistical kappa (κ) coefficient. The ablation study results prove the effectiveness of introducing the deformable convolutional mechanism for BDSNN's boundary-aware characteristic.

1. Introduction

Hyperspectral images (HSIs) contain rich spectral-spatial information across hundreds of narrow contiguous bands. The value of this abundant information has become increasingly evident in many fields, such as agricultural applications [1], geological exploration and mineralogy [2,3], forestry and environmental management [4,5], water and marine resources management [6], and military and defense applications [7,8]. Since HSI classification is one of the most essential procedures of HSI analysis, advances in HSI classification increasingly drive the development of the fields mentioned above.
The main task of HSI classification is to label every image pixel based on the feature information carried by the training samples. Many pixel-wise HSI classification methods have been proposed based on the intuitive idea that pixels of different categories should carry different spectral information. For example, methods such as the support vector machine (SVM) [9], random forests (RF) [10], and traditional distance-metric-based classifiers [11] treat a single pixel with several bands as a single sample. This view makes them use only spectral information for classification and neglect the rich spatial characteristics. Furthermore, since pixel features vary within the same class and can resemble those of different classes, which leads to the salt-and-pepper noise problem, the aforementioned traditional machine learning algorithms can hardly achieve a desirable accuracy.
In recent decades, deep learning (DL) has shown great potential in natural language processing, computer vision, and object detection. In this context, DL-based classifiers, especially convolutional neural network (CNN)-based approaches, can effectively utilize spatial-spectral information for HSI classification [12,13]. Chen et al. [14] proposed a 2D CNN stacked autoencoder and introduced CNN-based methods to HSI classification for the first time. To achieve more efficient extraction of spatial-spectral features, the structures of DL-based models have become increasingly complicated and their parameter counts increasingly large. Roy et al. [15] constructed a hybrid 2D-3D CNN to classify HSI. Hamida et al. [16] used a 3D CNN model to obtain even better classification results. Zhao et al. [17] proposed a convolutional transformer network, which introduced the promising transformer to HSI classification. To ensure that deeper networks achieve better effectiveness, researchers introduced the residual structure into HSI classification [18,19,20]. As classification models become more complex and deeper, the training and inference time increases, and so does the required computational energy.
In the past few years, we have been rapidly approaching a point where the energy cost of DL may no longer be sustainable, and spiking neural networks are one of the most promising paradigms to overcome this barrier. Unlike artificial neural networks (ANNs), SNNs use spike sequences to represent information and take advantage of spatiotemporal information during training and inference. Inspired by biological neural networks, spiking neurons, the foundational components of an SNN, remain silent outside of a few active states. Because of the inherent asynchrony and sparseness of spike trains, SNNs have the potential to reduce power consumption while maintaining relatively good performance [21]. Due to the discontinuity of spike trains, the selection of a training method is the first issue to consider when building a high-performance SNN. The current mainstream SNN training methods are divided into ANN-to-SNN conversion (ANN2SNN) [22] and backpropagation with a surrogate gradient [23]. The ANN2SNN method first trains an ANN and saves its parameters, then converts it into an SNN by replacing the activation function with spiking neurons. However, to obtain an accuracy that matches the original ANN, a large timestep is needed for the converted SNN, which counteracts the low-latency characteristics of SNNs. The surrogate gradient method achieves error backpropagation by keeping the non-differentiable firing function in the forward pass and substituting it with a continuous and smooth surrogate function in the backward pass. Moreover, the surrogate method enables directly trained SNNs, which can surpass ANNs with similar architectures in an end-to-end manner.
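As a concrete illustration of the surrogate gradient idea, the sketch below keeps the Heaviside firing function in the forward pass and substitutes a smooth sigmoid-shaped derivative in the backward pass; the particular surrogate shape and its steepness alpha are illustrative assumptions, not necessarily the surrogate used in this work.

```python
import torch

class SurrogateSpike(torch.autograd.Function):
    """Heaviside step in the forward pass, smooth sigmoid surrogate in the backward pass."""

    alpha = 4.0  # surrogate steepness; an illustrative choice

    @staticmethod
    def forward(ctx, v_minus_threshold):
        ctx.save_for_backward(v_minus_threshold)
        return (v_minus_threshold >= 0).float()  # emit a spike when the potential crosses the threshold

    @staticmethod
    def backward(ctx, grad_output):
        (v_minus_threshold,) = ctx.saved_tensors
        sig = torch.sigmoid(SurrogateSpike.alpha * v_minus_threshold)
        # gradient of the sigmoid surrogate replaces the (zero almost everywhere) true gradient
        return grad_output * SurrogateSpike.alpha * sig * (1.0 - sig)

spike_fn = SurrogateSpike.apply  # used wherever a neuron fires
```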
On this basis, many mechanisms and model schemes that have proven helpful in ANNs were introduced into directly trained SNNs. Fang et al. [24] used the idea of spike-element-wise (SEW) residuals to introduce ResNet into SNNs, which makes it possible to achieve deeper SNNs. Zhu et al. [25] proposed a temporal-channel joint attention (TCJA) mechanism and designed an SNN that performs weight allocation over joint temporal and channel information. With these developments, SNN models have achieved noteworthy results in many fields, especially image classification. In recent years, a few researchers have also tried to construct SNN models for HSI classification. Datta et al. [26] proposed a quantization-aware gradient descent method to train an SNN generated from iso-architecture CNNs for HSI classification. Liu et al. [27,28] proposed two SNN classifiers based on channel shuffle attention mechanisms with two different derivative algorithms. These SNN models for HSI classification tend to fall into the trap of the Hughes phenomenon with fewer training samples. In particular, although there have been plenty of experiments with these methods, we found that the pixels on the edges between different categories are the most likely to receive wrong labels, which plays a major role in causing the Hughes phenomenon. These issues indicate the limitations of existing SNN methods in boundary discrimination.
To address the above issues, inspired by the deformable convolutional mechanism [29] used in computer vision tasks to distinguish ambiguous boundaries, we propose a boundary-aware deformable spiking neural network (BDSNN). The contributions of this article are as follows:
  • We propose a novel SNN-based model for HSI classification by integrating an attention mechanism and deformable convolution with a spiking ResNet. The spiking ResNet framework we use, named SEW ResNet, can effectively overcome the vanishing/exploding gradient problems. In addition, the temporal-channel joint attention (TCJA) mechanism is introduced for better feature extraction, guiding the model to figure out what is useful and when by filtering the abundant temporal and spectral information.
  • For boundary awareness, we propose the deformable SEW ResNet method by adding the deformable convolutional mechanism into the SEW ResNet block. The deformable convolution provides variable receptive fields for high-level feature extraction and gives our method the boundary awareness needed to mitigate the boundary confusion phenomenon.
The rest of this article is organized as follows. Section 2 introduces the proposed methods used to build an attention-based deformable SNN model and the model architecture in detail. Section 3 presents the results of experiments. Section 4 summarizes this paper.

2. Proposed Method

Represent the original HSI cube as $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \ldots, \mathbf{x}_B]^T \in \mathbb{R}^{(N \times M) \times B}$, with $B$ spectral channels and $N \times M$ samples. Every pixel $\mathbf{x}_i = [x_{i,1}, x_{i,2}, x_{i,3}, \ldots, x_{i,B}]^T \in \mathbf{X}$ belongs to $\mathbf{Y} = \{y_1, y_2, y_3, \ldots, y_C\}$, representing $C$ land-cover classes. The proposed model's framework for HSI classification comprises the spiking encoder, the spike-element-wise ResNet, the temporal-channel joint attention layer, the deformable spike-element-wise ResNet, the max-pooling layer, and the output layer. Figure 1 shows the framework of the proposed network; the methods we use are introduced in detail in the following subsections.

2.1. Leaky Integrate and Fire Model

As the fundamental computing units of an SNN, spiking neuron models play the role that activation functions play in traditional ANNs, and they are one of the main differences between SNNs and ANNs. The distinction between different spiking neuron models lies in the extent to which they model the biological neurons of the human brain. In terms of complexity, the current mainstream spiking neuron models are the Hodgkin–Huxley (HH) model [30], the Izhikevich model [31], the Leaky Integrate and Fire (LIF) model [32], and the Integrate and Fire (IF) model [33]. The HH model has the highest biological precision at an enormous computational cost, and the IF model is quite the opposite. Considering the balance between computing cost and biological plausibility, we choose the LIF model as our spiking neuron model. The neuron is modeled as a parallel resistor-capacitor (RC) [34] circuit, as shown in Figure 2. $I(t)$ is the input current of the postsynaptic neuron at time $t$, which is parametrically related to the spikes emitted by the presynaptic neurons. After being injected, the input current $I(t)$ splits in two directions: one part charges the capacitor $C$ for integration, and the other leaks through the resistor $R$, expressed as
$I(t) = \frac{V(t) - V_{rest}}{R} + C \frac{dV(t)}{dt},$
where $V(t)$ is the membrane potential at time $t$ and $V_{rest}$ is the resting potential. Multiplying (1) by $R$ and using $\tau = RC$ to denote the membrane time constant, we obtain the subthreshold dynamics of the LIF model:
$\tau \frac{dV(t)}{dt} = -(V(t) - V_{rest}) + R I(t).$
When the membrane potential $V(t)$ exceeds a preset threshold $V_{th}$, the neuron immediately emits a spike to its postsynaptic neurons, and $V(t)$ is then reset to the reset value $V_{reset} < V_{th}$. Meanwhile, the membrane potential constantly leaks according to $\tau$ until it returns to the resting value $V_{rest}$.
To enhance the characterization capability of the LIF model, we use a unified model named the parametric leaky integrate and fire (PLIF) model, based on [35]. This model contains a learnable membrane time constant, which makes SNNs based on the PLIF model more robust than those based on the LIF model.
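The following minimal PyTorch sketch shows the discrete-time update implied by the LIF dynamics above, with a learnable membrane time constant as in the PLIF model (initialized to τ = 2.0, as stated in Section 3.2). It is only illustrative: the actual model relies on SpikingJelly's PLIF implementation, and the hard-reset rule and the sigmoid parameterization of 1/τ are assumptions.

```python
import torch
import torch.nn as nn

def heaviside(x):
    # placeholder spike function; in practice the surrogate-gradient spike_fn
    # sketched in the Introduction would be used so that gradients can propagate
    return (x >= 0).float()

class PLIFNeuron(nn.Module):
    """Discrete-time PLIF neuron: LIF dynamics with a learnable membrane time constant."""

    def __init__(self, init_tau=2.0, v_threshold=1.0, v_reset=0.0):
        super().__init__()
        # learn w with 1/tau = sigmoid(w); w = -ln(tau - 1) gives the requested initial tau
        self.w = nn.Parameter(-torch.log(torch.tensor(init_tau - 1.0)))
        self.v_threshold = v_threshold
        self.v_reset = v_reset

    def forward(self, x_seq):
        # x_seq: (T, N, ...) input currents over T timesteps
        v = torch.full_like(x_seq[0], self.v_reset)
        spikes = []
        for x in x_seq:
            v = v + torch.sigmoid(self.w) * (x - (v - self.v_reset))  # leaky integration
            s = heaviside(v - self.v_threshold)                       # fire if V(t) exceeds V_th
            v = s * self.v_reset + (1.0 - s) * v                      # hard reset after a spike
            spikes.append(s)
        return torch.stack(spikes)
```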

2.2. Spiking Encoder

An HSI is a static image: it carries no temporal information, and its data are represented as analog values, so it cannot be processed by SNNs directly. The HSI must first be coded into spike sequences by the spiking encoder. Two broad categories of coding methods have been proposed for this purpose: rate coding [36] and direct coding [37]. In rate coding, an analog value is converted into a spike sequence by a Poisson generator whose firing rate is proportional to the input pixel value. The number of timesteps plays a significant role in the precision of rate coding: the larger the number of timesteps, the better the summation of the spike sequences from the encoder approximates the original pixel. Therefore, rate coding suffers from a lengthy processing period and slow information transmission. Inspired by the efficient and fast response mechanisms of our brains, we instead use direct coding, which has been widely adopted in SNN-based image classification work, for our spiking encoder. Figure 3 shows the structure of the spiking encoder. First, we repeat the original HSI patch T times. Then, the T patches are fed into a learnable layer with spiking neurons to generate the spiking images.
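A hedged sketch of such a direct-coding spiking encoder is given below: the patch is repeated T times, and each copy passes through a learnable convolution, batch normalization, and the PLIF neuron sketched earlier. Layer widths and kernel sizes are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SpikingEncoder(nn.Module):
    """Direct coding: repeat the analog HSI patch T times and pass each copy through
    a learnable convolution + spiking neuron layer to obtain spike trains."""

    def __init__(self, in_bands, out_channels=64, timesteps=12):
        super().__init__()
        self.T = timesteps
        self.conv = nn.Conv2d(in_bands, out_channels, kernel_size=3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.neuron = PLIFNeuron()  # spiking neuron from the sketch in Section 2.1

    def forward(self, patch):
        # patch: (N, B, S, S) analog HSI patch
        x_seq = patch.unsqueeze(0).repeat(self.T, 1, 1, 1, 1)          # repeated T times
        cur_seq = torch.stack([self.bn(self.conv(x)) for x in x_seq])  # learnable layer per timestep
        return self.neuron(cur_seq)                                    # (T, N, C, S, S) spike trains
```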

2.3. Spiking Element-Wise Residual Network

After being converted into spatiotemporal spikes, the spiking HSI cube is fed into the feature extractor. To improve the extractor's effectiveness by deepening the network, we use a spike-element-wise ResNet based on [24] as its fundamental structure. The details of the SEW block and its differences from the standard ResNet block are shown in Figure 4. Unlike previous spiking ResNets, [24] not only changes the activation function of the standard ResNet block proposed in [38], but also adjusts the position of the residual connection and uses an element-wise function g to substitute the original summation. Specifically, [24] provides three different element-wise functions g, namely ADD, AND, and IAND, whose expressions are shown in Table 1. SEW ResNet can easily implement identity mapping and overcome the vanishing/exploding gradient problems, allowing deeper SNNs to achieve higher accuracy.
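The following sketch shows how such a SEW block could look, with the three element-wise functions of Table 1 selectable through g. It is a simplified rendering of the block in [24]; the convention that a is the residual-branch output and b is the shortcut input is an assumption.

```python
import torch
import torch.nn as nn

class SEWBlock(nn.Module):
    """Spike-element-wise residual block: two conv-BN-neuron stages whose output spikes
    are combined with the input spikes by an element-wise function g from Table 1."""

    def __init__(self, channels, g="IAND"):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.sn1 = PLIFNeuron()
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.sn2 = PLIFNeuron()
        self.g = g

    def forward(self, s_seq):
        # s_seq: (T, N, C, H, W) input spikes
        a = self.sn1(torch.stack([self.bn1(self.conv1(x)) for x in s_seq]))
        a = self.sn2(torch.stack([self.bn2(self.conv2(x)) for x in a]))
        if self.g == "ADD":
            return a + s_seq          # ADD: a + b
        if self.g == "AND":
            return a * s_seq          # AND: a · b
        return (1.0 - a) * s_seq      # IAND: (1 - a) · b
```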

2.4. Temporal-Channel Joint Attention Mechanism

To further enhance the robustness of the proposed model, we introduce an attention mechanism named temporal-channel joint attention, based on [25], into our feature extractor. The TCJA layer effectively models the relevance of the spike sequence along both the temporal and channel dimensions. The details of the mechanism are shown in Figure 5. The TCJA layer takes the patched spiking HSI cube $\mathbf{P} \in \mathbb{R}^{(S \times S) \times B \times T}$ as input. First, $\mathbf{P}$ is compressed along the spatial dimension by a pooling layer to obtain the feature vector $\mathbf{U} \in \mathbb{R}^{(1 \times 1) \times B \times T}$, defined as
$\mathbf{U} = F_{exp}(\mathbf{P}) = \frac{1}{S \times S} \sum_{i=1}^{S} \sum_{j=1}^{S} \mathbf{P}(i, j).$
Two 1-D CNNs are used to extract and learn the temporal and channel information of $\mathbf{U}$, respectively; their kernel sizes are hyperparameters. The output feature maps $\mathbf{U}_1$ and $\mathbf{U}_2$ are defined as
$\mathbf{U}_1 = F_{tem}(\mathbf{U}) = \mathbf{U} * \mathbf{w}_{(1 \times 1 \times 3)} + \mathbf{b}, \quad \mathbf{U}_2 = F_{cha}(\mathbf{U}) = \mathbf{U} * \mathbf{w}_{(1 \times 1 \times 9)} + \mathbf{b},$
where $*$ denotes the 1-D convolution, and $\mathbf{w}$ and $\mathbf{b}$ are the weights and biases of the 1-D convolutions. After that, we fuse the two feature vectors generated by the two 1-D CNN extractors into the attention vector $\mathbf{V} \in \mathbb{R}^{(1 \times 1) \times B \times T}$, defined as
$\mathbf{V} = F_{fus}(\mathbf{U}_1, \mathbf{U}_2) = \sigma(\mathbf{U}_1 \odot \mathbf{U}_2),$
where $\sigma$ denotes the sigmoid activation function and $\odot$ is the element-wise multiplication. According to $\mathbf{V}$, we reweight the features of the original spiking HSI cube $\mathbf{P}$ to obtain the output $\mathbf{Q}$, which carries more discriminative characterization.
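A possible implementation of the TCJA layer, following the pooling, the two 1-D convolutions, and the sigmoid fusion described above, is sketched below. The exact convolution grouping of the original TCJA implementation may differ, so treat the tensor layout and kernel handling as assumptions.

```python
import torch
import torch.nn as nn

class TCJALayer(nn.Module):
    """Temporal-channel joint attention: spatially pool the spike cube, run one 1-D conv
    along the temporal axis and one along the spectral/channel axis, fuse with a sigmoid,
    and rescale the input spike cube."""

    def __init__(self, channels, timesteps, k_t=3, k_c=9):
        super().__init__()
        self.conv_t = nn.Conv1d(channels, channels, k_t, padding=k_t // 2)    # along T
        self.conv_c = nn.Conv1d(timesteps, timesteps, k_c, padding=k_c // 2)  # along B
        self.sigmoid = nn.Sigmoid()

    def forward(self, p_seq):
        # p_seq: (T, N, B, S, S) spiking HSI cube
        u = p_seq.mean(dim=(-2, -1))                              # spatial pooling: (T, N, B)
        u = u.permute(1, 2, 0)                                    # (N, B, T)
        u1 = self.conv_t(u)                                       # temporal attention, kernel k_t
        u2 = self.conv_c(u.transpose(1, 2)).transpose(1, 2)       # channel attention, kernel k_c
        v = self.sigmoid(u1 * u2)                                 # joint attention map (N, B, T)
        v = v.permute(2, 0, 1)[..., None, None]                   # back to (T, N, B, 1, 1)
        return p_seq * v                                          # recalibrated spike cube Q
```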

2.5. Spiking Deformable Convolution Neural Networks

To utilize spatial features, we select a patched image cube as one sample, in which only the center pixel is guaranteed to carry the correct label. The other pixels around the center, especially those belonging to different classes, may disturb the judgment of the classifier. As shown in Figure 6 (the classification map of SEW ResNet [24] on Indian Pines), the pixels on the boundary of a contiguous region between different classes are more likely to receive wrong labels, which is called the salt-and-pepper phenomenon. We add a spiking deformable CNN [39] to the feature extractor scheme to alleviate this phenomenon. As shown in Figure 7, the deformable CNN can transform the shape of its receptive field to avoid the pixels whose class labels differ from that of the target pixel.
The receptive field of a regular convolution is an immutable grid over the input $x$. For example, a $3 \times 3$ kernel with dilation 1 is expressed as
$\mathcal{D} = \{(-1, -1), (-1, 0), \ldots, (0, 1), (1, 1)\}.$
The output $y$ at each location $\mathbf{p}_0$ is accumulated from the locations around it within the grid $\mathcal{D}$ according to the weights $w$, expressed as
$y(\mathbf{p}_0) = \sum_{\mathbf{p}_n \in \mathcal{D}} w(\mathbf{p}_n) \cdot x(\mathbf{p}_0 + \mathbf{p}_n),$
where $\mathbf{p}_n$ enumerates the locations in $\mathcal{D}$.
In deformable convolution, a step named offset field generation, which produces the offsets $\{\Delta\mathbf{p}_n \mid n = 1, \ldots, N\}$ with $N = |\mathcal{D}|$, is added before calculating the output feature map. The output $y$ of the deformable convolution is expressed as
$y(\mathbf{p}_0) = \sum_{\mathbf{p}_n \in \mathcal{D}} w(\mathbf{p}_n) \cdot x(\mathbf{p}_0 + \mathbf{p}_n + \Delta\mathbf{p}_n).$
As shown in Figure 8, the offsets are generated by convolutions over the input $x$. Due to the nature of the convolutional calculation, the offset $\Delta\mathbf{p}_n$ is typically fractional. To compute the value at such a fractional location, a bilinear interpolation kernel $F$ is used to implement $x(\mathbf{p})$ as
$x(\mathbf{p}) = \sum_{\mathbf{q}} F(\mathbf{q}, \mathbf{p}) \cdot x(\mathbf{q}),$
where $\mathbf{p}$ ($\mathbf{p} = \mathbf{p}_0 + \mathbf{p}_n + \Delta\mathbf{p}_n$) is the fractional location, $\mathbf{q}$ enumerates all integral locations of the input $x$, and $F$ is expressed as
$F(\mathbf{q}, \mathbf{p}) = f(q_x, p_x) \cdot f(q_y, p_y),$
where $f(a, b) = \max(0, 1 - |a - b|)$.
Furthermore, to avoid introducing additional gradient problems, we propose a deformable spike-element-wise block to incorporate the deformable CNN. Specifically, the deformable convolution modules are plugged into the SEW block to substitute for the original CNN module. The details of the deformable SEW block are shown in Figure 9.
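The sketch below illustrates how a deformable convolution could be applied frame by frame inside such a block: a plain convolution predicts two offsets per kernel location, and torchvision's deformable convolution samples the input at the shifted, bilinearly interpolated positions, as in the formulas above. The layer sizes and the zero-initialized offset branch are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class SpikingDeformConv(nn.Module):
    """Deformable convolution applied frame-by-frame to a spike sequence: an offset
    field is generated by a regular conv, then used to deform the sampling grid."""

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # offset field generation: 2 * k * k offsets (x and y) per output location
        self.offset_conv = nn.Conv2d(channels, 2 * kernel_size * kernel_size,
                                     kernel_size, padding=pad)
        nn.init.zeros_(self.offset_conv.weight)   # start from the regular sampling grid
        nn.init.zeros_(self.offset_conv.bias)
        self.deform_conv = DeformConv2d(channels, channels, kernel_size, padding=pad)
        self.bn = nn.BatchNorm2d(channels)
        self.sn = PLIFNeuron()                    # spiking neuron from the earlier sketch

    def forward(self, s_seq):
        # s_seq: (T, N, C, H, W) spike sequence
        out = []
        for x in s_seq:
            offset = self.offset_conv(x)          # fractional offsets, resolved by bilinear sampling
            out.append(self.bn(self.deform_conv(x, offset)))
        return self.sn(torch.stack(out))
```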

3. Experimental Results

In this section, we choose four CNN-based methods (ResNet [38], DPyResNet [18], SSRN [19], and A2S2KResNet [20]), one deformable-CNN-based model (DHCNet [29]), and one SNN-based model (HSI-SNN [28]) for comparison. To fully prove the effectiveness of the proposed method, the experiments were performed on five benchmark data sets: Indian Pines (IP), Kennedy Space Center (KSC), Houston University (HU), Pavia University (PU), and Salinas (SV). All experiments are conducted under an experimental environment of Ubuntu 16, Titan RTX GPUs, and 125 GB of memory. We train and test the proposed model with the SpikingJelly [40] framework based on PyTorch [41]. Overall accuracy (OA), average accuracy (AA), and the statistical kappa (κ) coefficient are used to evaluate the performance of the models.
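For reference, the three metrics can be computed from a confusion matrix as in the small helper below; it is a generic sketch, not code from the paper.

```python
import numpy as np

def classification_metrics(y_true, y_pred, num_classes):
    """Overall accuracy, average (per-class) accuracy, and the kappa coefficient
    computed from a confusion matrix."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    total = cm.sum()
    oa = np.trace(cm) / total
    aa = np.mean(np.diag(cm) / np.maximum(cm.sum(axis=1), 1))   # mean of per-class accuracies
    pe = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / total ** 2   # chance agreement
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa
```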

3.1. Data Sets

The Indian Pines data set was acquired by AVIRIS [42] over the Indian Pines test site in Indiana, United States. It contains 200 spectral bands (after excluding 20 water absorption bands) and has a size of 145 × 145, comprising 21,025 pixels. There are 10,776 background pixels, and the remaining 10,249 object pixels are available for training and testing.
The Pavia University data set [43] was acquired by the ROSIS-03 sensor over the city of Pavia, Italy. It contains 103 available bands (after removing 12 noise-affected bands) and has a size of 610 × 340, including 207,400 pixels, of which 42,776 are object pixels.
The Salinas data set was acquired by the AVIRIS [42] sensor over Salinas Valley, California. Its spatial resolution is 3.7 m, and its size is 512 × 217. The original data contain 224 bands; after removing the bands with severe water vapor absorption, 204 bands remain. The data include 16 crop categories.
The Houston University data set was acquired over the University of Houston campus by the ITRES CASI-1500 sensor and provided by the 2013 IEEE GRSS Data Fusion Contest. It has 144 bands and a size of 349 × 1905, of which 15,029 pixels are object pixels.
The KSC data set was acquired by the AVIRIS [42] sensor over the Kennedy Space Center, Florida, on 23 March 1996. The data contain 224 bands, of which 176 remain after removing water vapor noise bands, with a spatial resolution of 18 m and a total of 13 categories.
Table 2 shows the main characteristics of the five data sets and the details of our sample split strategy. Considering that their total sample sizes are close to or below 10,000, for the IP and KSC data sets we randomly select 10% of the samples for training and use 90% for testing. For the other three data sets (PU, SV, and HU), whose total sample sizes are large enough, an extremely limited 1% of the samples is randomly selected for training and 99% for testing.
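A minimal sketch of such a per-class random split is shown below; the function name and the handling of the background class (label 0) are assumptions for illustration.

```python
import numpy as np

def stratified_split(labels, train_ratio, seed=0):
    """Per-class random split of the labelled pixels (10%/90% for IP and KSC,
    1%/99% for PU, SV, and HU). `labels` is the flattened ground-truth map;
    class 0 is assumed to be background and is skipped."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        if c == 0:
            continue  # skip background pixels
        idx = rng.permutation(np.flatnonzero(labels == c))
        n_train = max(1, int(round(train_ratio * idx.size)))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)
```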

3.2. Experimental Setup and Parameter Evaluation

We train the proposed model using the stochastic gradient descent (SGD) optimizer with the cross-entropy loss function. The batch size is set to 32, and the learning rate is set to 0.1. The hyperparameters, namely the initial τ of PLIF, the kernel size k_c of the channel attention 1D-CNN, and the kernel size k_t of the temporal attention 1D-CNN in the TCJA layer, are uniformly set to 2.0, 9, and 3, respectively. Each experiment is repeated five times, using 200 epochs each time. The model with the highest accuracy during validation is selected for evaluation on the test samples.
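A training loop matching this setup could look like the following sketch; it assumes the model maps an input patch to class logits (for example, output spikes averaged over the T timesteps), which is an assumption about the readout rather than a statement of the paper's exact code.

```python
import torch
import torch.nn as nn

def train(model, train_loader, device, epochs=200, lr=0.1):
    """Training with SGD and cross-entropy loss (batch size 32, lr 0.1, 200 epochs)."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    model.to(device)
    for _ in range(epochs):
        model.train()
        for patches, labels in train_loader:
            patches, labels = patches.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(patches), labels)
            loss.backward()   # gradients flow through the surrogate spike function
            optimizer.step()
```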
First, an experiment on the KSC data set is conducted to evaluate the element-wise function g of SEW ResNet. Table 3 shows the accuracy of the proposed model using different element-wise functions. The model with the IAND function obtains the best results, while the other two functions are more likely to fall into the vanishing/exploding gradient problems. Thus, we set the element-wise function g to IAND for the following experiments.
The patch size of the input samples is an essential parameter for the extraction of spatial information, and it also influences the extent of the disturbance from pixels of different classes. A smaller patch size limits the model's spatial feature extraction and decreases classification accuracy. In contrast, a bigger patch size aggravates the salt-and-pepper phenomenon. During these experiments, the timestep is set to 8. Table 4 shows the classification accuracy on the five data sets with four different patch sizes (7 × 7, 9 × 9, 11 × 11, and 13 × 13). The best results on the five data sets are achieved using patch sizes 11 × 11 and 13 × 13. Noting that a bigger patch size makes the other models achieve worse results, we fix 11 × 11 as the patch size of the proposed method on all data sets for fair comparison.
The timestep is another critical parameter for an SNN-based model. A smaller timestep limits the model's ability to extract features from HSIs, while a longer timestep increases computational energy consumption. Table 5 shows the evaluation results for different timesteps (4, 8, 12, and 16) on the five data sets. The best OA, AA, and κ are achieved using timesteps 12 and 16. Considering the computational cost, we set the timestep of the proposed model to 12.

3.3. Classification Results

Table 6, Table 7, Table 8 and Table 9 show the average OAs, AAs, κ values, and per-class accuracies over the five repeated runs on the four data sets (IP, PU, SV, and HU). Our proposed method obtains competitive results on all data sets compared with the other ResNet-based, deformable-CNN-based, and SNN-based methods.
The results for the IP data set are shown in Table 6, and Figure 10 shows the classification maps of our model and the others for comparison. Our model achieves the best OA (99.16 ± 0.003%), which is 0.34% higher than the best CNN-based model (A2S2KResNet) and 4.58% higher than HSI-SNN. In addition, the proposed model achieves the highest κ. However, the AA of the proposed model is 1.18% lower than that of A2S2KResNet, reflecting the weaker robustness of the SNN-based model on the IP data set, which has an uneven number of samples per category.
For the PU data set, the results are shown in Table 7. The proposed model achieves the best OA (96.51 ± 0.006%), which is 4.24% higher than the best result of the other models. The proposed model also obtains a higher AA (93.96 ± 0.015%) and κ (0.9537 ± 0.008) than the others. The classification maps are shown in Figure 11.
Table 9 shows the results for the SV data set. The proposed model achieves the best OA (99.03 ± 0.002%), which is 1.75% higher than the best result of the other models. The proposed model also obtains a higher AA (99.28 ± 0.001%) and κ (0.9892 ± 0.002) than the others. The classification maps are shown in Figure 12.
Table 8 shows the results for the HU data set. The proposed model achieves the best OA (86.29 ± 0.017%), which is 5.45% higher than the best result of the other models. The proposed model also obtains a higher AA (86.63 ± 0.014%) and κ (0.8517 ± 0.018) than the others. The classification maps are shown in Figure 13.

3.4. Ablation Study

To further validate the methods used in our proposed model, we evaluate on the HU data set the generalization performance of the proposed model and of three ablated variants without the specific methods we used. The details of the models are as follows:
  • Denoted as SEW + TCJA, the deformable CNN is removed from the proposed framework.
  • Denoted as SEW + DEF, the TCJA layer is removed from the proposed framework.
  • Denoted as SEW, the deformable CNN and TCJA layers are both removed from the proposed framework.
For the ablation experiments, we change the patch size to 13 × 13 to better expose the boundary effect and keep the other experimental settings unchanged. Table 10 shows the classification results of the ablation experiments on the HU data set. Compared with SEW, the SEW + DEF, SEW + TCJA, and proposed models produce notable improvements in all three metrics (OA, AA, and κ). Regarding OA, the TCJA layer used in SEW + TCJA and the proposed model brings improvements of 10.24% and 10.41% over SEW and SEW + DEF, respectively. Furthermore, the deformable CNN used in SEW + DEF and the proposed model yields advances of 0.63% and 0.8% compared with SEW and SEW + TCJA, respectively. The classification maps are shown in Figure 14. We can observe that the deformable CNN method mitigates the boundary confusion phenomenon.

3.5. Comparison of Running Times

In this section, the training and test times of three representative CNN-based methods and our proposed BDSNN on four data sets are shown in Table 11. Due to the limitations of the computing platform, we can only measure the time on non-neuromorphic computers. As a result, all of the traditional deep learning methods have an advantage in training and test time compared with our proposed BDSNN; the advantages of SNNs in energy saving and faster computing can only be demonstrated when deployed on neuromorphic computers [44]. Regarding the SNN-based HSI-SNN, the times and OAs are shown in Table 12. The training time of our proposed BDSNN is about 1.82–3.24 times that of HSI-SNN, and its test time is about 3.61–7.73 times that of HSI-SNN, with an improvement of about 2.33–8.56% in OA. Due to its more complex structure, the proposed BDSNN is at a disadvantage in running time. The introduction of TCJA and deformable convolution adds a computational burden, as they involve many non-spiking computations, such as attention vector generation and offset generation. Reducing their impact on computational efficiency will be one of our future research directions.

4. Conclusions

In this article, we proposed a boundary-aware deformable spiking neural network (BDSNN) for HSI classification. The proposed SNN is built from PLIF neurons, and its spiking encoder follows a direct coding scheme. The spike-element-wise ResNet is used in the proposed model to overcome the vanishing/exploding gradient problems. Moreover, the temporal-channel joint attention layer is introduced for effective temporal-spectral feature extraction. Furthermore, to mitigate boundary confusion, we introduced the deformable CNN into an SNN for the first time. Experiments on four hyperspectral data sets prove that the proposed model outperforms other CNN-based models and an SNN-based model with limited training samples. The proposed BDSNN provides a promising way to improve SNN-based methods for HSI classification. However, the running-time comparison experiments show a limitation of BDSNN: while improving the feature extraction ability, its complex structure also increases the computational overhead. Therefore, in future work we will focus on converting the non-spiking computation processes into spiking versions.

Author Contributions

Methodology, S.W.; Software, S.W.; Investigation, Y.P.; Writing—original draft, S.W.; Writing—review & editing, Y.P. and T.L.; Supervision, Y.P., L.W. and T.L.; Project administration, L.W. and T.L.; Funding acquisition, L.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by the National Key R&D Program of China (2021ZD0140301), the National Natural Science Foundation of China: 91948303-1; the National Natural Science Foundation of China: No. 61803375, No. 12002380, No. 62106278, No. 62101575, No. 61906210; the Postgraduate Scientific Research Innovation Project of Hunan Province: QL20210018.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

All authors disclosed no relevant relationships.

References

  1. Teke, M.; Deveci, H.S.; Haliloğlu, O.; Gürbüz, S.Z.; Sakarya, U. A short survey of hyperspectral remote sensing applications in agriculture. In Proceedings of the 2013 6th International Conference on Recent Advances in Space Technologies (RAST), IEEE, Istanbul, Turkey, 12–14 June 2013; pp. 171–176. [Google Scholar]
  2. Resmini, R.; Kappus, M.; Aldrich, W.; Harsanyi, J.; Anderson, M. Mineral mapping with hyperspectral digital imagery collection experiment (HYDICE) sensor data at Cuprite, Nevada, USA. Int. J. Remote Sens. 1997, 18, 1553–1570. [Google Scholar] [CrossRef]
  3. Acosta, I.C.C.; Khodadadzadeh, M.; Tusa, L.; Ghamisi, P.; Gloaguen, R. A machine learning framework for drill-core mineral mapping using hyperspectral and high-resolution mineralogical data fusion. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 4829–4842. [Google Scholar] [CrossRef]
  4. Coops, N.C.; Smith, M.L.; Martin, M.E.; Ollinger, S.V. Prediction of eucalypt foliage nitrogen content from satellite-derived hyperspectral data. IEEE Trans. Geosci. Remote Sens. 2003, 41, 1338–1346. [Google Scholar] [CrossRef]
  5. Große-Stoltenberg, A.; Hellmann, C.; Werner, C.; Oldeland, J.; Thiele, J. Evaluation of continuous VNIR-SWIR spectra versus narrowband hyperspectral indices to discriminate the invasive Acacia longifolia within a Mediterranean dune ecosystem. Remote Sens. 2016, 8, 334. [Google Scholar] [CrossRef]
  6. Younos, T.; Parece, T.E. Advances in Watershed Science and Assessment; Springer: Berlin/Heidelberg, Germany, 2015. [Google Scholar]
  7. Richter, R. Hyperspectral Sensors for Military Applications; Technical Report; German Aerospace Center Wessling (DLR): Wessling, Germany, 2005. [Google Scholar]
  8. El-Sharkawy, Y.H.; Elbasuney, S. Hyperspectral imaging: A new prospective for remote recognition of explosive materials. Remote Sens. Appl. Soc. Environ. 2019, 13, 31–38. [Google Scholar] [CrossRef]
  9. Melgani, F.; Bruzzone, L. Classification of hyperspectral remote sensing images with support vector machines. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1778–1790. [Google Scholar] [CrossRef]
  10. Ham, J.; Chen, Y.; Crawford, M.; Ghosh, J. Investigation of the random forest framework for classification of hyperspectral data. IEEE Trans. Geosci. Remote Sens. 2005, 43, 492–501. [Google Scholar] [CrossRef]
  11. Du, Q.; Chang, C.I. A linear constrained distance-based discriminant analysis for hyperspectral image classification. Pattern Recognit. 2001, 34, 361–373. [Google Scholar] [CrossRef]
  12. Petersson, H.; Gustafsson, D.; Bergstrom, D. Hyperspectral image analysis using deep learning—A review. In Proceedings of the 2016 Sixth International Conference on Image Processing Theory, Tools and Applications (IPTA), IEEE, Oulu, Finland, 12–15 December 2016; pp. 1–6. [Google Scholar]
  13. Vali, A.; Comai, S.; Matteucci, M. Deep Learning for Land Use and Land Cover Classification Based on Hyperspectral and Multispectral Earth Observation Data: A Review. Remote Sens. 2020, 12, 2495. [Google Scholar]
  14. Chen, Y.; Lin, Z.; Zhao, X.; Wang, G.; Gu, Y. Deep learning-based classification of hyperspectral data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 2094–2107. [Google Scholar] [CrossRef]
  15. Roy, S.K.; Krishna, G.; Dubey, S.R.; Chaudhuri, B.B. HybridSN: Exploring 3-D–2-D CNN feature hierarchy for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2019, 17, 277–281. [Google Scholar] [CrossRef]
  16. Hamida, A.B.; Benoit, A.; Lambert, P.; Amar, C.B. 3-D deep learning approach for remote sensing image classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4420–4434. [Google Scholar] [CrossRef]
  17. Zhao, Z.; Hu, D.; Wang, H.; Yu, X. Convolutional Transformer Network for Hyperspectral Image Classification. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
  18. Paoletti, M.E.; Haut, J.M.; Fernandez-Beltran, R.; Plaza, J.; Plaza, A.J.; Pla, F. Deep pyramidal residual networks for spectral–spatial hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 57, 740–754. [Google Scholar] [CrossRef]
  19. Zhong, Z.; Li, J.; Luo, Z.; Chapman, M. Spectral–spatial residual network for hyperspectral image classification: A 3-D deep learning framework. IEEE Trans. Geosci. Remote Sens. 2017, 56, 847–858. [Google Scholar] [CrossRef]
  20. Roy, S.K.; Manna, S.; Song, T.; Bruzzone, L. Attention-based adaptive spectral–spatial kernel ResNet for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2020, 59, 7831–7843. [Google Scholar] [CrossRef]
  21. Nunes, J.D.; Carvalho, M.; Carneiro, D.; Cardoso, J.S. Spiking neural networks: A survey. IEEE Access 2022, 10, 60738–60764. [Google Scholar] [CrossRef]
  22. Hunsberger, E.; Eliasmith, C. Spiking deep networks with LIF neurons. arXiv 2015, arXiv:1510.08829. [Google Scholar]
  23. Neftci, E.O.; Mostafa, H.; Zenke, F. Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based optimization to spiking neural networks. IEEE Signal Process. Mag. 2019, 36, 51–63. [Google Scholar] [CrossRef]
  24. Fang, W.; Yu, Z.; Chen, Y.; Huang, T.; Masquelier, T.; Tian, Y. Deep residual learning in spiking neural networks. Adv. Neural Inf. Process. Syst. 2021, 34, 21056–21069. [Google Scholar]
  25. Zhu, R.J.; Zhao, Q.; Zhang, T.; Deng, H.; Duan, Y.; Zhang, M.; Deng, L.J. TCJA-SNN: Temporal-Channel Joint Attention for Spiking Neural Networks. arXiv 2022, arXiv:2206.10177. [Google Scholar]
  26. Datta, G.; Kundu, S.; Jaiswal, A.R.; Beerel, P.A. HYPER-SNN: Towards energy-efficient quantized deep spiking neural networks for hyperspectral image classification. arXiv 2021, arXiv:2107.11979. [Google Scholar]
  27. Liu, Y.; Cao, K.; Wang, R.; Tian, M.; Xie, Y. Hyperspectral image classification of brain-inspired spiking neural network based on attention mechanism. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
  28. Liu, Y.; Cao, K.; Li, R.; Zhang, H.; Zhou, L. Hyperspectral Image Classification of Brain-Inspired Spiking Neural Network Based on Approximate Derivative Algorithm. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–16. [Google Scholar] [CrossRef]
  29. Zhu, J.; Fang, L.; Ghamisi, P. Deformable convolutional neural networks for hyperspectral image classification. IEEE Geosci. Remote Sens. Lett. 2018, 15, 1254–1258. [Google Scholar] [CrossRef]
  30. Hodgkin, A.L.; Huxley, A.F. A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 1952, 117, 500. [Google Scholar] [CrossRef] [PubMed]
  31. Izhikevich, E.M. Simple model of spiking neurons. IEEE Trans. Neural Netw. 2003, 14, 1569–1572. [Google Scholar] [CrossRef]
  32. Lapicque, L. Recherches quantitatives sur l’excitation electrique des nerfs traitee comme une polarization. J. Physiol. Pathol. Générale 1907, 9, 620–635. [Google Scholar]
  33. Lu, S.; Sengupta, A. Exploring the connection between binary and spiking neural networks. Front. Neurosci. 2020, 14, 535. [Google Scholar] [CrossRef] [PubMed]
  34. Dutta, S.; Kumar, V.; Shukla, A.; Mohapatra, N.R.; Ganguly, U. Leaky integrate and fire neuron by charge-discharge dynamics in floating-body MOSFET. Sci. Rep. 2017, 7, 8257. [Google Scholar] [CrossRef]
  35. Fang, W.; Yu, Z.; Chen, Y.; Masquelier, T.; Huang, T.; Tian, Y. Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 2661–2671. [Google Scholar]
  36. Diehl, P.U.; Zarrella, G.; Cassidy, A.; Pedroni, B.U.; Neftci, E. Conversion of artificial recurrent neural networks to spiking neural networks for low-power neuromorphic hardware. In Proceedings of the 2016 IEEE International Conference on Rebooting Computing (ICRC), IEEE, San Diego, CA, USA, 17–19 October 2016; pp. 1–8. [Google Scholar]
  37. Rathi, N.; Roy, K. Diet-snn: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv 2020, arXiv:2008.03658. [Google Scholar]
  38. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  39. Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 764–773. [Google Scholar]
  40. Fang, W.; Chen, Y.; Ding, J.; Chen, D.; Yu, Z.; Zhou, H.; Tian, Y. Spikingjelly. 2020. Available online: https://github.com/fangwei123456/spikingjelly (accessed on 11 September 2023).
  41. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. Pytorch: An Imperative Style, High-Performance Deep Learning Library. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Volume 32. [Google Scholar]
  42. Green, R.O.; Eastwood, M.L.; Sarture, C.M.; Chrien, T.G.; Aronsson, M.; Chippendale, B.J.; Faust, J.A.; Pavri, B.E.; Chovit, C.J.; Solis, M.; et al. Imaging spectroscopy and the airborne visible/infrared imaging spectrometer (AVIRIS). Remote Sens. Environ. 1998, 65, 227–248. [Google Scholar] [CrossRef]
  43. Kunkel, B.; Blechinger, F.; Lutz, R.; Doerffer, R.; Van der Piepen, H.; Schroder, M. ROSIS (Reflective Optics System Imaging Spectrometer)—A candidate instrument for polar platform missions. In Optoelectronic Technologies for Remote Sensing from Space; SPIE: Bellingham, WA, USA, 1988; Volume 868, pp. 134–141. [Google Scholar]
  44. Ma, S.; Pei, J.; Zhang, W.; Wang, G.; Feng, D.; Yu, F.; Song, C.; Qu, H.; Ma, C.; Lu, M.; et al. Neuromorphic computing chip with spatiotemporal elasticity for multi-intelligent-tasking robots. Sci. Robot. 2022, 7, eabk2948. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Framework of the proposed SNN-based model for HSI classification. SEW block denotes the spiking element wise ResNet block as shown in Figure 4b; TCJA Layer denotes the temporal channel joint attention layer as shown in Figure 5; Deformable block denotes the deformable spiking element wise ResNet block as shown in Figures 8 and 9.
Figure 2. RC circuit of LIF model.
Figure 3. Illustration of the structure of spiking encoder.
Figure 4. Illustration of (a) ResNet block and (b) Spike-element-wise block.
Figure 5. Architecture of temporal-channel joint attention layer.
Figure 6. The classification map on Indian Pines of SEWResNet.
Figure 7. Illustration of the difference between the convolution kernel of regular CNN and deformable CNN.
Figure 8. Illustration of deformable convolution.
Figure 9. Illustration of deformable spike-element-wise block.
Figure 10. Classification results of the Indian Pines data set. (a) False-color composite image. (b) Ground truth. (c) ResNet. (d) DPyResNet. (e) SSRN. (f) A2S2KResNet. (g) DHCNet. (h) HSI-SNN. (i) Proposed BDSNN. (j) Color labels.
Figure 11. Classification results of the Pavia University data set. (a) False-color composite image. (b) Ground truth. (c) ResNet. (d) DPyResNet. (e) SSRN. (f) A2S2K. (g) DHCNet. (h) HSI-SNN. (i) Proposed BDSNN. (j) Color labels.
Figure 12. Classification results of the Salinas data set. (a) False-color composite image. (b) Ground truth. (c) ResNet. (d) DPyResNet. (e) SSRN. (f) A2S2KResNet. (g) DHCNet. (h) HSI-SNN. (i) Proposed BDSNN. (j) Color labels.
Figure 13. Classification result of the Houston13 data set. (a) False-color composite image. (b) Ground truth. (c) ResNet. (d) DPyResNet. (e) SSRN. (f) A2S2K. (g) DHCNet. (h) HSI-SNN. (i) Proposed BDSNN. (j) Color labels.
Figure 14. Classification results of the ablation experiments. (a) Ground truth. (b) SEW. (c) SEW + DEF. (d) SEW + TCJA. (e) Proposed BDSNN.
Table 1. Expression for element-wise functions.
Element-Wise Function | Expression of g(a, b)
ADD | a + b
AND | a ∧ b = a · b
IAND | (¬a) ∧ b = (1 − a) · b
Table 2. The summary of IP, PU, SV, HU, and KSC data sets.
Characteristics | IP | PU | SV | HU | KSC
Sensor | AVIRIS | ROSIS | AVIRIS | ITRES | AVIRIS
Spectral Bands | 200 | 103 | 204 | 144 | 176
Spatial Size | 145 × 145 | 610 × 340 | 512 × 217 | 349 × 1905 | 512 × 614
Classes | 16 | 10 | 16 | 15 | 13
Total Samples | 10,249 | 42,776 | 54,129 | 15,029 | 4211
Train Samples | 1024 | 427 | 541 | 150 | 421
Test Samples | 9225 | 42,349 | 53,588 | 14,879 | 3790
Table 3. Classification accuracy for different element-wise functions on the KSC data set.
g | OA (%) | AA (%) | κ
ADD | 97.65 ± 0.007 | 95.66 ± 0.013 | 0.9738 ± 0.008
AND | 82.91 ± 0.121 | 75.91 ± 0.137 | 0.8075 ± 0.137
IAND | 99.13 ± 0.002 | 96.13 ± 0.020 | 0.9901 ± 0.003
Table 4. Classification accuracy for different patch sizes on the five data sets.
Data Set | Patch Size | OA (%) | AA (%) | κ
IP | 7 × 7 | 98.71 ± 0.002 | 95.37 ± 0.021 | 0.9852 ± 0.002
IP | 9 × 9 | 99.05 ± 0.002 | 95.24 ± 0.011 | 0.9891 ± 0.002
IP | 11 × 11 | 99.15 ± 0.003 | 95.30 ± 0.018 | 0.9904 ± 0.003
IP | 13 × 13 | 99.13 ± 0.002 | 96.13 ± 0.020 | 0.9901 ± 0.003
PU | 7 × 7 | 95.41 ± 0.005 | 93.41 ± 0.008 | 0.9389 ± 0.007
PU | 9 × 9 | 95.89 ± 0.006 | 93.51 ± 0.012 | 0.9453 ± 0.009
PU | 11 × 11 | 96.55 ± 0.005 | 93.58 ± 0.019 | 0.9542 ± 0.007
PU | 13 × 13 | 96.58 ± 0.004 | 93.16 ± 0.019 | 0.9545 ± 0.006
SV | 7 × 7 | 97.37 ± 0.002 | 98.71 ± 0.002 | 0.9707 ± 0.003
SV | 9 × 9 | 98.51 ± 0.003 | 99.16 ± 0.002 | 0.9834 ± 0.003
SV | 11 × 11 | 99.00 ± 0.003 | 99.27 ± 0.001 | 0.9888 ± 0.003
SV | 13 × 13 | 99.14 ± 0.002 | 98.93 ± 0.005 | 0.9905 ± 0.003
HU | 7 × 7 | 83.59 ± 0.016 | 84.31 ± 0.013 | 0.8224 ± 0.017
HU | 9 × 9 | 85.65 ± 0.013 | 86.28 ± 0.011 | 0.8448 ± 0.014
HU | 11 × 11 | 85.78 ± 0.008 | 86.07 ± 0.006 | 0.8462 ± 0.008
HU | 13 × 13 | 85.86 ± 0.012 | 86.02 ± 0.011 | 0.8471 ± 0.013
KSC | 7 × 7 | 99.01 ± 0.012 | 97.81 ± 0.033 | 0.9890 ± 0.013
KSC | 9 × 9 | 99.30 ± 0.005 | 98.83 ± 0.009 | 0.9922 ± 0.006
KSC | 11 × 11 | 99.53 ± 0.002 | 99.15 ± 0.004 | 0.9947 ± 0.002
KSC | 13 × 13 | 99.52 ± 0.003 | 99.11 ± 0.006 | 0.9946 ± 0.004
Table 5. Classification accuracy for different timesteps on the five data sets.
Data Set | Timestep | OA (%) | AA (%) | κ
IP | 4 | 99.22 ± 0.002 | 96.33 ± 0.015 | 0.9912 ± 0.003
IP | 8 | 99.16 ± 0.003 | 95.30 ± 0.018 | 0.9904 ± 0.003
IP | 12 | 99.16 ± 0.003 | 95.84 ± 0.011 | 0.9905 ± 0.003
IP | 16 | 99.18 ± 0.003 | 96.36 ± 0.015 | 0.9907 ± 0.003
PU | 4 | 95.24 ± 0.009 | 90.72 ± 0.032 | 0.9366 ± 0.012
PU | 8 | 96.55 ± 0.005 | 93.58 ± 0.019 | 0.9542 ± 0.007
PU | 12 | 96.51 ± 0.006 | 93.96 ± 0.015 | 0.9537 ± 0.008
PU | 16 | 96.79 ± 0.004 | 94.22 ± 0.008 | 0.9574 ± 0.006
SV | 4 | 99.00 ± 0.002 | 99.36 ± 0.001 | 0.9888 ± 0.002
SV | 8 | 99.00 ± 0.003 | 99.27 ± 0.001 | 0.9888 ± 0.003
SV | 12 | 99.03 ± 0.002 | 99.28 ± 0.001 | 0.9892 ± 0.002
SV | 16 | 99.10 ± 0.002 | 99.16 ± 0.002 | 0.9900 ± 0.002
HU | 4 | 85.59 ± 0.015 | 85.80 ± 0.014 | 0.8441 ± 0.016
HU | 8 | 85.78 ± 0.008 | 86.07 ± 0.006 | 0.8462 ± 0.008
HU | 12 | 86.29 ± 0.017 | 86.63 ± 0.014 | 0.8517 ± 0.018
HU | 16 | 86.03 ± 0.014 | 86.63 ± 0.012 | 0.8489 ± 0.016
KSC | 4 | 99.54 ± 0.001 | 99.15 ± 0.003 | 0.9949 ± 0.001
KSC | 8 | 99.53 ± 0.002 | 99.15 ± 0.004 | 0.9947 ± 0.002
KSC | 12 | 99.53 ± 0.003 | 99.22 ± 0.004 | 0.9948 ± 0.003
KSC | 16 | 99.60 ± 0.003 | 99.25 ± 0.008 | 0.9956 ± 0.004
Table 6. Classification accuracy obtained by different methods for the Indian Pines data set.
Classes | ResNet | DpyResNet | SSRN | A2S2KResNet | DHCNet | HSI-SNN | BDSNN
1 | 97.01 ± 0.039 | 99.29 ± 0.014 | 78.12 ± 0.392 | 98.99 ± 0.012 | 92.49 ± 0.122 | 37.62 ± 17.651 | 85.65 ± 0.045
2 | 91.54 ± 0.039 | 91.00 ± 0.025 | 98.42 ± 0.006 | 98.65 ± 0.003 | 97.91 ± 0.017 | 93.20 ± 2.047 | 99.28 ± 0.005
3 | 89.17 ± 0.075 | 88.62 ± 0.045 | 97.03 ± 0.010 | 97.78 ± 0.009 | 98.41 ± 0.011 | 93.36 ± 1.532 | 99.18 ± 0.006
4 | 96.39 ± 0.043 | 97.60 ± 0.041 | 98.11 ± 0.021 | 95.05 ± 0.075 | 95.17 ± 0.046 | 92.99 ± 6.150 | 99.26 ± 0.007
5 | 96.82 ± 0.030 | 95.07 ± 0.050 | 96.39 ± 0.012 | 98.30 ± 0.030 | 96.57 ± 0.030 | 94.02 ± 1.199 | 98.86 ± 0.021
6 | 96.94 ± 0.023 | 96.15 ± 0.016 | 98.14 ± 0.010 | 99.15 ± 0.011 | 97.83 ± 0.015 | 97.38 ± 1.865 | 99.01 ± 0.007
7 | 84.88 ± 0.222 | 93.60 ± 0.128 | 57.89 ± 0.474 | 92.64 ± 0.106 | 65.39 ± 0.135 | 39.23 ± 19.521 | 88.57 ± 0.206
8 | 96.87 ± 0.027 | 96.68 ± 0.017 | 99.18 ± 0.015 | 100.0 ± 0.000 | 99.69 ± 0.003 | 99.58 ± 0.612 | 100.0 ± 0.000
9 | 50.29 ± 0.421 | 59.78 ± 0.350 | 60.00 ± 0.490 | 80.71 ± 0.215 | 57.06 ± 0.211 | 75.56 ± 13.426 | 76.12 ± 0.181
10 | 94.53 ± 0.025 | 93.60 ± 0.025 | 97.73 ± 0.006 | 98.57 ± 0.012 | 98.14 ± 0.011 | 92.00 ± 3.324 | 98.83 ± 0.008
11 | 93.16 ± 0.027 | 94.23 ± 0.025 | 99.31 ± 0.004 | 99.35 ± 0.002 | 98.49 ± 0.003 | 96.87 ± 0.835 | 99.55 ± 0.001
12 | 92.81 ± 0.036 | 89.68 ± 0.039 | 96.37 ± 0.024 | 97.67 ± 0.002 | 92.18 ± 0.051 | 91.09 ± 2.713 | 98.85 ± 0.015
13 | 96.16 ± 0.029 | 93.38 ± 0.063 | 99.00 ± 0.005 | 99.13 ± 0.011 | 97.97 ± 0.025 | 95.35 ± 3.692 | 99.63 ± 0.008
14 | 97.32 ± 0.027 | 95.27 ± 0.052 | 99.30 ± 0.004 | 99.96 ± 0.000 | 98.89 ± 0.006 | 97.72 ± 0.937 | 99.96 ± 0.000
15 | 91.40 ± 0.076 | 95.85 ± 0.038 | 98.86 ± 0.009 | 99.56 ± 0.006 | 96.26 ± 0.016 | 93.39 ± 4.529 | 99.81 ± 0.003
16 | 94.85 ± 0.031 | 93.39 ± 0.046 | 86.60 ± 0.129 | 96.76 ± 0.041 | 84.06 ± 0.191 | 85.48 ± 8.532 | 90.92 ± 0.067
OA(%) | 93.59 ± 0.006 | 93.30 ± 0.013 | 98.22 ± 0.003 | 98.82 ± 0.003 | 97.18 ± 0.010 | 94.58 ± 0.425 | 99.16 ± 0.003
AA(%) | 91.26 ± 0.020 | 92.07 ± 0.019 | 91.28 ± 0.072 | 97.02 ± 0.017 | 91.66 ± 0.037 | 85.93 ± 1.595 | 95.84 ± 0.011
κ × 100 | 92.68 ± 0.007 | 92.34 ± 0.015 | 97.98 ± 0.003 | 98.66 ± 0.003 | 96.79 ± 0.011 | 93.83 ± 0.483 | 99.05 ± 0.003
Table 7. Classification accuracy obtained by different methods for the Pavia University data set.
Classes | ResNet | DpyResNet | SSRN | A2S2KResNet | DHCNet | HSI-SNN | BDSNN
1 | 67.90 ± 0.046 | 64.40 ± 0.045 | 84.48 ± 0.115 | 91.01 ± 0.061 | 82.50 ± 0.072 | 94.14 ± 4.848 | 96.76 ± 0.011
2 | 86.93 ± 0.033 | 85.87 ± 0.062 | 97.45 ± 0.009 | 99.65 ± 0.004 | 93.29 ± 0.028 | 99.43 ± 0.194 | 99.95 ± 0.001
3 | 72.96 ± 0.192 | 70.01 ± 0.076 | 78.35 ± 0.091 | 83.68 ± 0.090 | 71.33 ± 0.179 | 70.09 ± 25.499 | 87.03 ± 0.015
4 | 98.32 ± 0.025 | 98.03 ± 0.016 | 98.16 ± 0.015 | 88.01 ± 0.058 | 96.55 ± 0.026 | 85.92 ± 3.552 | 94.64 ± 0.018
5 | 96.97 ± 0.014 | 99.06 ± 0.012 | 99.76 ± 0.003 | 99.73 ± 0.002 | 98.52 ± 0.013 | 79.28 ± 31.758 | 99.51 ± 0.003
6 | 90.36 ± 0.098 | 95.74 ± 0.020 | 94.49 ± 0.028 | 89.49 ± 0.019 | 89.08 ± 0.149 | 97.42 ± 1.977 | 97.70 ± 0.024
7 | 89.46 ± 0.086 | 82.18 ± 0.095 | 85.19 ± 0.110 | 77.49 ± 0.129 | 81.49 ± 0.139 | 65.18 ± 34.707 | 98.65 ± 0.011
8 | 75.17 ± 0.074 | 73.64 ± 0.118 | 74.25 ± 0.073 | 70.62 ± 0.304 | 75.89 ± 0.089 | 88.02 ± 5.959 | 84.65 ± 0.031
9 | 89.87 ± 0.091 | 95.51 ± 0.037 | 85.87 ± 0.188 | 90.41 ± 0.039 | 79.11 ± 0.092 | 53.67 ± 16.771 | 86.74 ± 0.108
OA(%) | 82.29 ± 0.015 | 80.93 ± 0.031 | 90.87 ± 0.028 | 92.11 ± 0.036 | 87.05 ± 0.030 | 92.27 ± 3.011 | 96.51 ± 0.006
AA(%) | 85.33 ± 0.029 | 84.94 ± 0.021 | 88.67 ± 0.033 | 87.79 ± 0.036 | 85.31 ± 0.024 | 81.46 ± 9.872 | 93.96 ± 0.015
κ × 100 | 75.65 ± 0.022 | 73.54 ± 0.049 | 87.85 ± 0.038 | 89.48 ± 0.048 | 82.54 ± 0.042 | 89.68 ± 4.065 | 95.37 ± 0.008
Table 8. Classification accuracy obtained by different methods for the Houston13 data set.
Classes | ResNet | DpyResNet | SSRN | A2S2KResNet | DHCNet | HSI-SNN | BDSNN
1 | 73.06 ± 0.136 | 65.05 ± 0.065 | 86.57 ± 0.034 | 88.47 ± 0.036 | 85.82 ± 0.069 | 88.99 ± 1.902 | 96.26 ± 0.032
2 | 82.99 ± 0.194 | 84.57 ± 0.040 | 87.17 ± 0.083 | 89.89 ± 0.044 | 80.94 ± 0.111 | 84.57 ± 10.127 | 90.18 ± 0.032
3 | 99.82 ± 0.004 | 98.71 ± 0.016 | 97.95 ± 0.020 | 97.90 ± 0.015 | 98.36 ± 0.169 | 93.72 ± 2.236 | 98.01 ± 0.007
4 | 54.85 ± 0.254 | 76.71 ± 0.051 | 89.70 ± 0.050 | 82.35 ± 0.135 | 86.53 ± 0.110 | 82.58 ± 6.485 | 93.56 ± 0.047
5 | 90.54 ± 0.036 | 92.21 ± 0.030 | 95.30 ± 0.032 | 94.91 ± 0.063 | 92.01 ± 0.093 | 99.97 ± 0.040 | 100.0 ± 0.000
6 | 97.86 ± 0.019 | 95.01 ± 0.048 | 83.21 ± 0.209 | 85.64 ± 0.056 | 84.50 ± 0.114 | 44.60 ± 25.611 | 78.25 ± 0.043
7 | 45.28 ± 0.096 | 58.80 ± 0.064 | 68.64 ± 0.084 | 74.75 ± 0.101 | 62.67 ± 0.173 | 62.53 ± 6.847 | 84.11 ± 0.038
8 | 77.63 ± 0.234 | 83.99 ± 0.195 | 88.14 ± 0.020 | 64.38 ± 0.073 | 74.31 ± 0.214 | 50.70 ± 4.932 | 65.31 ± 0.026
9 | 31.12 ± 0.124 | 42.40 ± 0.091 | 74.69 ± 0.114 | 59.23 ± 0.149 | 55.99 ± 0.171 | 60.57 ± 8.536 | 70.98 ± 0.093
10 | 54.67 ± 0.172 | 35.89 ± 0.104 | 67.66 ± 0.040 | 75.08 ± 0.123 | 53.10 ± 0.136 | 78.37 ± 7.400 | 85.29 ± 0.083
11 | 34.63 ± 0.141 | 29.99 ± 0.067 | 72.86 ± 0.142 | 78.80 ± 0.086 | 52.89 ± 0.203 | 87.21 ± 7.149 | 84.67 ± 0.087
12 | 46.66 ± 0.168 | 55.95 ± 0.063 | 74.40 ± 0.104 | 71.28 ± 0.153 | 68.50 ± 0.076 | 71.37 ± 18.924 | 81.82 ± 0.066
13 | 45.52 ± 0.306 | 49.50 ± 0.212 | 82.63 ± 0.087 | 62.21 ± 0.244 | 42.15 ± 0.325 | 60.00 ± 13.145 | 71.59 ± 0.047
14 | 97.79 ± 0.018 | 98.02 ± 0.017 | 95.94 ± 0.020 | 98.86 ± 0.011 | 95.71 ± 0.049 | 96.32 ± 4.789 | 99.38 ± 0.047
15 | 95.73 ± 0.022 | 95.88 ± 0.025 | 95.50 ± 0.019 | 99.44 ± 0.010 | 91.10 ± 0.055 | 97.92 ± 2.089 | 100.0 ± 0.000
OA(%) | 57.53 ± 0.062 | 60.23 ± 0.028 | 80.84 ± 0.017 | 80.07 ± 0.023 | 69.20 ± 0.029 | 77.73 ± 0.557 | 86.29 ± 0.017
AA(%) | 68.54 ± 0.045 | 70.84 ± 0.012 | 84.02 ± 0.010 | 81.55 ± 0.025 | 74.97 ± 0.048 | 77.29 ± 0.843 | 86.63 ± 0.014
κ × 100 | 54.00 ± 0.067 | 56.93 ± 0.030 | 79.29 ± 0.019 | 78.44 ± 0.025 | 66.65 ± 0.031 | 75.92 ± 0.595 | 85.17 ± 0.018
Table 9. Classification accuracy obtained by different methods for the Salinas data set.
Classes | ResNet | DpyResNet | SSRN | A2S2KResNet | DHCNet | HSI-SNN | BDSNN
1 | 99.12 ± 0.017 | 99.33 ± 0.006 | 99.86 ± 0.003 | 98.17 ± 0.036 | 99.95 ± 0.001 | 98.53 ± 2.294 | 99.40 ± 0.008
2 | 97.34 ± 0.048 | 98.57 ± 0.010 | 100.0 ± 0.000 | 100.0 ± 0.000 | 99.35 ± 0.012 | 99.96 ± 0.076 | 100.0 ± 0.000
3 | 97.72 ± 0.025 | 96.72 ± 0.023 | 97.47 ± 0.049 | 99.94 ± 0.001 | 99.48 ± 0.003 | 99.88 ± 0.083 | 100.0 ± 0.000
4 | 96.95 ± 0.023 | 95.92 ± 0.024 | 96.17 ± 0.027 | 99.88 ± 0.002 | 97.97 ± 0.015 | 96.25 ± 3.665 | 98.95 ± 0.017
5 | 93.39 ± 0.114 | 93.75 ± 0.082 | 99.11 ± 0.012 | 98.07 ± 0.009 | 99.15 ± 0.010 | 95.97 ± 5.669 | 99.41 ± 0.007
6 | 99.38 ± 0.006 | 99.57 ± 0.005 | 99.94 ± 0.001 | 100.0 ± 0.000 | 99.54 ± 0.004 | 99.64 ± 0.714 | 100.0 ± 0.000
7 | 99.61 ± 0.006 | 99.74 ± 0.003 | 99.97 ± 0.000 | 99.94 ± 0.001 | 99.86 ± 0.002 | 99.91 ± 0.077 | 99.99 ± 0.000
8 | 88.04 ± 0.046 | 88.39 ± 0.061 | 88.49 ± 0.050 | 95.57 ± 0.017 | 89.47 ± 0.045 | 92.93 ± 4.579 | 97.25 ± 0.008
9 | 98.76 ± 0.011 | 98.96 ± 0.008 | 99.94 ± 0.001 | 99.98 ± 0.000 | 99.77 ± 0.002 | 99.99 ± 0.013 | 100.0 ± 0.000
10 | 98.94 ± 0.011 | 97.74 ± 0.013 | 98.67 ± 0.013 | 98.33 ± 0.010 | 99.18 ± 0.007 | 96.60 ± 1.553 | 99.37 ± 0.010
11 | 97.22 ± 0.043 | 95.86 ± 0.071 | 97.95 ± 0.027 | 99.77 ± 0.004 | 99.55 ± 0.006 | 98.00 ± 1.572 | 99.83 ± 0.003
12 | 95.94 ± 0.051 | 89.02 ± 0.140 | 99.34 ± 0.012 | 98.79 ± 0.009 | 96.06 ± 0.053 | 99.38 ± 0.784 | 99.85 ± 0.002
13 | 86.31 ± 0.119 | 85.36 ± 0.097 | 92.80 ± 0.035 | 98.13 ± 0.009 | 85.77 ± 0.076 | 85.54 ± 1.361 | 99.51 ± 0.005
14 | 88.31 ± 0.150 | 91.78 ± 0.054 | 98.18 ± 0.003 | 98.07 ± 0.005 | 94.52 ± 0.043 | 93.40 ± 7.725 | 97.06 ± 0.016
15 | 80.85 ± 0.061 | 80.03 ± 0.051 | 92.04 ± 0.009 | 89.73 ± 0.028 | 82.17 ± 0.036 | 94.57 ± 3.371 | 98.62 ± 0.012
16 | 99.94 ± 0.001 | 99.69 ± 0.004 | 99.65 ± 0.007 | 99.30 ± 0.007 | 99.46 ± 0.009 | 98.10 ± 1.100 | 99.32 ± 0.010
OA(%) | 92.69 ± 0.012 | 92.41 ± 0.011 | 95.79 ± 0.013 | 97.28 ± 0.003 | 94.49 ± 0.012 | 96.70 ± 1.364 | 99.03 ± 0.002
AA(%) | 94.86 ± 0.015 | 94.40 ± 0.006 | 97.47 ± 0.006 | 98.35 ± 0.004 | 96.33 ± 0.010 | 96.79 ± 1.310 | 99.28 ± 0.001
κ × 100 | 91.85 ± 0.013 | 91.55 ± 0.013 | 95.31 ± 0.014 | 96.97 ± 0.003 | 93.86 ± 0.014 | 96.33 ± 1.515 | 98.92 ± 0.002
Table 10. The results of the ablation experiments.
Metrics | SEW | SEW + DEF | SEW + TCJA | Proposed
OA(%) | 75.55 ± 0.017 | 76.18 ± 0.014 | 85.79 ± 0.014 | 86.59 ± 0.016
AA(%) | 75.95 ± 0.015 | 76.74 ± 0.019 | 86.05 ± 0.014 | 86.79 ± 0.015
κ × 100 | 73.54 ± 0.018 | 74.22 ± 0.015 | 84.63 ± 0.015 | 85.49 ± 0.018
Table 11. Training time and test time of DpyResNet, SSRN, A2S2KResNet, and BDSNN for the four data sets.
Methods | Time (s) | IP | PU | SV | HU
DpyResNet | Training | 923.02 | 302.65 | 453.48 | 298.28
DpyResNet | Test | 59.20 | 256.47 | 360.98 | 202.12
SSRN | Training | 1401.74 | 206.94 | 1056.48 | 98.63
SSRN | Test | 20.60 | 58.24 | 134.52 | 30.68
A2S2KResNet | Training | 1682.08 | 576.26 | 1055.07 | 301.42
A2S2KResNet | Test | 26.23 | 151.84 | 238.55 | 94.64
BDSNN | Training | 2352.68 | 955.68 | 1126.49 | 340.63
BDSNN | Test | 83.14 | 432.55 | 537.22 | 146.97
Table 12. Training time, test time, and OA of HSI-SNN and BDSNN for the four data sets.
Methods | Time (s) & OA (%) | IP | PU | SV | HU
HSI-SNN | Training | 1155.07 | 415.47 | 618.7 | 105.23
HSI-SNN | Test | 23.04 | 103.26 | 72.66 | 19.01
HSI-SNN | OA | 94.58 | 92.27 | 96.70 | 77.73
BDSNN | Training | 2352.68 | 955.68 | 1126.49 | 340.63
BDSNN | Test | 83.14 | 432.55 | 537.22 | 146.97
BDSNN | OA | 99.16 | 96.51 | 99.03 | 86.29
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, S.; Peng, Y.; Wang, L.; Li, T. Boundary-Aware Deformable Spiking Neural Network for Hyperspectral Image Classification. Remote Sens. 2023, 15, 5020. https://doi.org/10.3390/rs15205020
