Article

ODPA-CNN: One Dimensional Parallel Atrous Convolution Neural Network for Band-Selective Hyperspectral Image Classification

Byungjin Kang, Inho Park, Changmin Ok and Sungho Kim
1 Advanced Visual Intelligence Laboratory, Department of Electronic Engineering, Yeungnam University, 280 Daehak-ro, Gyeongsan 38541, Gyeongbuk-do, Korea
2 LIG Nex1, 207 Mabuk-ro, Giheung-gu, Yongin-si 16911, Gyeonggi-do, Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(1), 174; https://doi.org/10.3390/app12010174
Submission received: 24 November 2021 / Revised: 14 December 2021 / Accepted: 16 December 2021 / Published: 24 December 2021
(This article belongs to the Special Issue Advances in Small Infrared Target Detection Using Deep Learning)

Abstract

Recently, hyperspectral image (HSI) classification using deep learning has been actively studied with 2D and 3D convolutional neural networks (CNNs). However, these networks learn spatial information together with spectral information. Such methods can increase classification accuracy, but they do not focus solely on the spectral information, which is the major advantage of HSI. In addition, the 1D-CNN, which learns only pure spectral information, is limited because it uses only adjacent spectral bands. In this paper, we propose a One Dimensional Parallel Atrous Convolution Neural Network (ODPA-CNN) that learns not only adjacent spectral information for HSI classification, but also spectral information at a certain distance. It extracts features in parallel to account for bands at varying distances. The proposed method excludes spatial information, such as the shape of an object, and performs HSI classification only with spectral information about the material of the object. Atrous convolution is not a convolution of adjacent spectral bands, but a convolution between spectral bands separated by a certain distance. We compare the proposed model with other models on various datasets, including data we captured ourselves. Experimental results show higher performance than several 3D-CNN models and other 1D-CNN methods. In addition, using datasets to which random spatial information is applied, the vulnerabilities of the 3D-CNN are identified, and the proposed model is shown to be robust on datasets with little spatial information.

1. Introduction

A hyperspectral image (HSI) adds spectral information to spatial information: two-dimensional images are composed for each spectral band of the electromagnetic spectrum in the form of a hyperspectral cube, from which the state, composition, characteristics, and variation of an object can be derived [1]. The hyperspectral cube has more than 100 spectral bands, and imagery is classified into multispectral, hyperspectral, and ultraspectral according to the number of spectral bands. The human eye or a color camera can also be regarded as a form of spectral imaging, because it recognizes the color or state of an object by acquiring spectral information in red, green, and blue; however, a spectral image usually implies a much larger number of spectral bands. Hyperspectral images do not measure widely separated spectral bands, but rather contiguous spectral bands [2]. The larger the number of spectral bands, the higher the spectral resolution. Ultraspectral images classify objects according to the chemical composition ratio of solids or liquids, and hyperspectral images can even analyze the chemical composition ratio of gases. These characteristics are exploited in various fields such as defense, geology, environment, and medical care [3,4,5].
There are two main methods for obtaining hyperspectral images: reflection spectroscopy, which uses light reflected from an object, and radiation spectroscopy, which uses radiant heat information. Depending on the spectral band to be detected, the appropriate method should be used. In general, reflection spectroscopy detects the VNIR–SWIR band (0.4–2.5 μm), and radiation spectroscopy detects the MWIR (3–5 μm) and LWIR (8–14 μm) bands. Since reflection spectroscopy measures light reflected from an object, it is strongly affected by the surrounding environment, such as the angle of reflection and the intensity of light. In contrast, radiation spectroscopy is less susceptible to atmospheric moisture. Reflection spectroscopy data are relatively cheaper to acquire than radiation spectroscopy data.
Recently, deep learning has become one of the most successful techniques and has produced spectacular results in the field of computer vision [6]. Motivated by this success, deep learning has been used to classify HSI in the field of remote sensing [7,8,9]. Compared with the existing manual classification process, HSI data composed of complex spectral bands can be classified automatically and effectively by learning high-level functions. This makes it possible to cope with the large variability of signature spectra. However, the types of features extracted by a deep network differ: there is spectral feature extraction, spatial feature extraction, and spectral-spatial feature extraction. Spectral information is the most important characteristic of HSI and an essential factor in classification. Traditional spectral feature extraction methods (e.g., PCA [10,11], ICA [12], and LDA [13]) still perform well, but these linear models struggle to process the complex spectral information present in HSI. The 1D-CNN [14,15,16,17] is a representative deep learning network using spectral feature extraction for HSI classification. Previous studies proved that performance can be further improved by adding spatial features to the classifier in HSI classification [18,19]; here, spatial information is fused subsequently with the other extracted features. In [15,16,20,21,22], PCA is first applied to the entire HSI to reduce the dimensionality of the original space, and the spatial information of pixels surrounding the input spectrum is exploited by a 2D-CNN. These methods successfully combined CNN and PCA, incorporated spatial feature extraction, and reduced the computational cost. Besides the subsequent fusion of spatial information, networks that extract spatial and spectral features simultaneously have also become popular. These deep networks can be divided into three categories: feature fusion after shallowly extracting the two kinds of features [23,24], feature extraction at once using 3D convolution [15,16,17,25,26,27], and deep feature extraction followed by fusion of the two kinds of information [28].
The method that subsequently fuses spatial information and the spectral-spatial feature extraction method perform significantly better than pure spectral feature extraction [9,15,29]. This is because both methods fuse the extraction of other information with the spectral information.
However, fusing other information with the spectral information to classify a hyperspectral image greatly diminishes the significance of the hyperspectral data. This is a serious drawback in some fields. For example, in the military field, one of the application areas of hyperspectral imaging, the detection and classification performance for camouflaged objects in remote sensing is inevitably poor, because a camouflaged object provides little spatial information. In contrast, pure spectral feature extraction is very robust against such problems because it does not classify based on spatial information. There is also a risk of overfitting due to the limited public datasets; we verify this using hyperspectral images to which random spatial information is applied. In addition, in a 3D-CNN, which also uses spatial information, the feature extraction of spectral information is diluted. We demonstrate this problem of the 3D-CNN, a network using the spectral-spatial feature extraction mentioned above, through an experiment in which the spatial information is randomized. Therefore, a deep learning network using only spectral information is the most useful approach for HSI data. Among such approaches, the 1D-CNN is one of the most actively studied for spectral feature extraction with deep learning. This network excels at spectral feature extraction using 1D convolution and does not undermine the original purpose of the HSI data mentioned above. A 1D convolution extracts features from adjacent spectral bands according to the kernel size, and this has produced significant results [14,15,16,17], which shows that 1D convolution is a good spectral feature extractor. However, 1D convolution is limited because it extracts features only from adjacent spectral bands; spectral feature extraction should also be able to consider bands that are far apart. In addition, by parallelizing feature extraction according to the spectral distance, a band-selective factor is added to deep learning. In this paper, we propose a network capable of extracting features even from spectral bands that are separated from each other.
Atrous convolution is also called dilated convolution. The concept was first introduced in [30], and [31] showed that it greatly improves deep learning performance. As shown in Figure 1, unlike conventional convolution, atrous convolution can relate spectral bands that are separated from each other. This is very beneficial for deep networks that learn spectral information. Classifying the spectral signature of a material using only adjacent bands is meaningful, but from the viewpoint of HSI data it is more meaningful to extract features by looking at the relationship not only between adjacent bands but also between distant bands. This can also be regarded as an element of band selection. In a general 1D convolution, spectral resolution is impaired in the process of extracting features, which damages the spectral information considerably. For these reasons, atrous convolution is very well suited to spectral feature extraction networks.
The three contributions in this paper are summarized as follows:
(1) Networks based on spectral-spatial feature extraction dilute spectral information and are therefore less suitable for HSI classification than networks based on pure spectral feature extraction. We train and compare, on our proposed network and on 3D-CNNs [15,25,26,27], hyperspectral data containing disguised objects and public hyperspectral datasets with randomized spatial information;
(2) The existing 1D-CNN extracts features only from adjacent spectral bands. This is a limitation of the 1D-CNN and a weak point of spectral feature extraction networks. We therefore propose a spectral feature extraction network using atrous convolution;
(3) With atrous convolution layers, spectral features at various band distances can be extracted through parallel processing. We therefore propose a parallel-layer model, the One Dimensional Parallel Atrous Convolution Neural Network.
This paper is organized as follows. Atrous convolution is described in Section 2. In Section 3, we describe our proposed network, which is named ODPA-CNN. The experiment with HSI data is described in Section 4 and compared with other techniques. We conclude the paper in Section 5.

2. Related Work

2.1. Applying CNN to HSI Classification

The deep CNN was first devised in [32] and achieved breakthrough results in [33], making it the most efficient and successful architecture for learning visual representations in image processing. Learning and classifying these visual representations amounts to finding differences in visual appearance between classes. Hyperspectral data with hundreds of spectral bands can be visualized as in Figure 2: some classes cannot be distinguished by the human eye, yet they have relatively different visual shapes. CNNs have proven many times that they achieve competitive, higher performance using elements that cannot be perceived by the human eye [34,35,36,37]. Therefore, it is very appropriate to apply a CNN to HSI classification.

2.2. Atrous Convolution

Atrous convolution was used in DeepLab [31], developed by Google, and effectively improved performance on segmentation problems in computer vision. In an image, a general convolution is computed over adjacent pixels, whereas an atrous convolution computes over pixels that are a certain distance apart, determined by the rate value, and extracts features accordingly (Figure 3). This convolution has mainly been applied to segmentation in computer vision networks.
The key property of this convolution is that features are extracted by computing over pixel values separated according to the rate. This is a great advantage in HSI classification using spectral information. The convolution in existing spectral feature extraction networks computes only over adjacent spectral bands and therefore does not learn deeply about spectral information; in other words, these networks do not select non-adjacent bands when extracting HSI spectral features [14,15,16,17].
The 1D atrous convolution can be expressed by Equation (1): the output y is obtained by applying the filter w to the input spectrum x at each position i, where R denotes the spectral distance (rate). As R increases, a wider spectral distance is used for feature extraction; if R = 1, it is an ordinary convolution.
$y[i] = \sum_{k} x[i + R \cdot k]\, w[k]$  (1)
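To make Equation (1) concrete, the following minimal sketch (ours, not the authors' code) shows that the sum over k with step R is exactly what PyTorch's conv1d computes when its dilation argument is set to R; the tensor sizes and variable names are illustrative only.

```python
import torch
import torch.nn.functional as F

# Reference implementation of Equation (1): y[i] = sum_k x[i + R*k] * w[k].
def atrous_conv1d_reference(x, w, R):
    K = w.shape[0]
    out_len = x.shape[0] - R * (K - 1)
    return torch.stack([(x[i:i + R * K:R] * w).sum() for i in range(out_len)])

x = torch.randn(200)   # a single spectrum with 200 bands
w = torch.randn(3)     # kernel of size 3
R = 6                  # rate: bands 6 positions apart are combined

y_ref = atrous_conv1d_reference(x, w, R)
# The built-in 1D convolution with dilation=R computes the same cross-correlation.
y_lib = F.conv1d(x.view(1, 1, -1), w.view(1, 1, -1), dilation=R).view(-1)
assert torch.allclose(y_ref, y_lib, atol=1e-5)
```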

3. Proposed One Dimensional Atrous Convolution Neural Network

3.1. ODPA-CNN

We introduce the One Dimensional Parallel Atrous Convolution Neural Network (ODPA-CNN), a CNN for hyperspectral classification. The proposed model is outlined in Figure 4. The input spectral data pass through the first convolution layer and then the atrous convolution layers, where parallel computations are performed with atrous convolutions of various rates. When an atrous convolution with a large rate is used, more padding values are included, which can corrupt the spectral information. The computed features are then concatenated and passed through the remaining convolution layers, and the output is produced through fully connected layers. The parameters of each layer are listed in Table 1. The rates of the parallel 1D atrous convolution layers are 1, 6, 12, and 18, for two reasons: first, because the branches are processed in parallel, the benefit of parallel processing is small if the difference between the rates is small; second, if the difference between the rates is too large, the largest receptive field may exceed the length of the spectrum, distorting the spectral information.
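To clarify the architecture described above, the following PyTorch sketch shows one way the parallel atrous block and the subsequent layers could be assembled. It follows our reading of Figure 4 and Table 1 (rates 1, 6, 12, and 18, concatenation, then Conv3 to Conv5 and three fully connected layers), but details such as the placement of activations and the treatment of the stem layer are assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class ODPABlock(nn.Module):
    """Parallel 1D atrous convolutions at rates 1, 6, 12 and 18 (cf. Figure 4 / Table 1)."""
    def __init__(self, in_ch=1, branch_ch=32):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(in_ch, branch_ch, kernel_size=1, dilation=1, padding=0),
            nn.Conv1d(in_ch, branch_ch, kernel_size=3, dilation=6, padding=6),
            nn.Conv1d(in_ch, branch_ch, kernel_size=3, dilation=12, padding=12),
            nn.Conv1d(in_ch, branch_ch, kernel_size=3, dilation=18, padding=18)])
        self.act = nn.Hardswish()

    def forward(self, x):
        # Each branch sees the same spectrum but relates bands at a different distance.
        return self.act(torch.cat([b(x) for b in self.branches], dim=1))

class ODPACNN(nn.Module):
    def __init__(self, n_bands, n_classes):
        super().__init__()
        self.stem = nn.Conv1d(1, 1, kernel_size=1)        # Conv1 in Table 1
        self.parallel = ODPABlock(in_ch=1, branch_ch=32)  # Conv2_1 ... Conv2_4
        self.tail = nn.Sequential(                        # Conv3-Conv5, length-preserving
            nn.Conv1d(4 * 32, 64, kernel_size=1), nn.Hardswish(),
            nn.Conv1d(64, 32, kernel_size=3, padding=1), nn.Hardswish(),
            nn.Conv1d(32, 32, kernel_size=3, padding=1), nn.Hardswish())
        feat = 32 * n_bands
        self.fc = nn.Sequential(nn.Linear(feat, feat // 2), nn.Hardswish(),
                                nn.Linear(feat // 2, 128), nn.Hardswish(),
                                nn.Linear(128, n_classes))

    def forward(self, x):  # x: (batch, 1, n_bands), i.e., one spectrum per sample
        h = self.tail(self.parallel(self.stem(x)))
        return self.fc(h.flatten(1))
```

For example, ODPACNN(n_bands=200, n_classes=16) would accept batches of Indian Pines spectra of shape (batch, 1, 200) and output one class score per class.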

3.2. Atrous Convolution Layer

In a CNN, the convolution layer performs feature extraction; in hyperspectral image classification with a 1D-CNN, it is used to extract spectral features. As mentioned earlier, the convolution layer in the existing 1D-CNN extracts features only from adjacent spectral bands, which can be confirmed from Equation (2). On the other hand, the atrous convolution layer can extract features from distant spectral bands according to the rate R, as shown in Equation (3). Figure 5 illustrates a 1D atrous convolution with a kernel size of 3 and a rate of 3.
$y[n] = (x * h)[n] = \sum_{k} x[k]\, h[n - k]$  (2)
$y[n] = (x *_{R} h)[n] = \sum_{k} x[k]\, h[n - R \cdot k]$  (3)

3.3. Activation Function

The features produced by the convolution layers above are linear combinations of their inputs; the activation function is what introduces nonlinearity. The Rectified Linear Unit (ReLU) is one of the most popular activation functions. ReLU6 outputs 0 for inputs below 0, the input itself for inputs between 0 and 6, and 6 for inputs of 6 or more (Equation (4)). In our proposed CNN, we use the activation function Hard-swish (Equation (5)), which has shown better performance than other activation functions [38]. Unlike the ReLU family, Hard-swish [38] dips slightly below zero on the negative side, so it is smoother than ReLU. In other words, the gradient of ReLU is zero for negative inputs, so the corresponding weights stop updating, whereas Hard-swish does not have this problem. Figure 6 shows the graphs of ReLU6 and Hard-swish.
$\mathrm{ReLU6}(x) = \begin{cases} 0 & \text{if } x < 0 \\ x & \text{if } 0 \le x < 6 \\ 6 & \text{if } x \ge 6 \end{cases}$  (4)
$\text{Hard-swish}(x) = x \cdot \dfrac{\mathrm{ReLU6}(x+3)}{6}$  (5)
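Both activation functions are available in PyTorch as nn.ReLU6 and nn.Hardswish; the short sketch below simply transcribes Equations (4) and (5) so the dented negative region of Hard-swish is easy to inspect.

```python
import torch

def relu6(x):
    # Equation (4): clamp the input to the range [0, 6].
    return torch.clamp(x, min=0.0, max=6.0)

def hard_swish(x):
    # Equation (5): x * ReLU6(x + 3) / 6.
    return x * relu6(x + 3.0) / 6.0

x = torch.linspace(-6.0, 6.0, 7)
print(relu6(x))       # tensor([0., 0., 0., 0., 2., 4., 6.])
print(hard_swish(x))  # inputs in (-3, 0) give slightly negative outputs, unlike ReLU6
```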

3.4. Optimizer and Loss Function

The loss function represents the gap between the actual correct answer and the predicted value: the higher the loss, the larger the gap, and the network learns in the direction of reducing it.
The proposed model uses cross entropy (CE), defined in Equation (6), where t_i is the ground truth (the correct answer) and s_i is the i-th element of the score vector, i.e., the output of the last layer of the CNN for class i.
$CE = -\sum_{i=1}^{C} t_i \log(s_i)$  (6)
The optimizer finds the parameters of the network (the weights and biases) that reduce the value of the loss function as much as possible. Optimizer functions include Batch Gradient Descent (BGD) [39] and Stochastic Gradient Descent (SGD) [40]. Currently, the most widely used optimizer is Adaptive Moment Estimation (Adam) [41], and our proposed model also uses the Adam optimizer.
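In PyTorch, the cross-entropy loss and the Adam optimizer are wired up as shown below. This is a generic sketch rather than the authors' training script: the learning rate is not reported in the paper, so Adam's common default of 1e-3 is assumed, and a tiny stand-in classifier is used for brevity.

```python
import torch
import torch.nn as nn

# Any classifier producing one score per class works here; the ODPA-CNN sketch from
# Section 3.1 could be swapped in the same way.
model = nn.Sequential(nn.Flatten(), nn.Linear(200, 16))
criterion = nn.CrossEntropyLoss()                          # Equation (6) on the softmax of the scores
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr assumed; not stated in the paper

spectra = torch.randn(16, 1, 200)                          # a batch of 16 spectra, 200 bands each
labels = torch.randint(0, 16, (16,))                       # ground-truth class indices

optimizer.zero_grad()
loss = criterion(model(spectra), labels)                   # scalar loss
loss.backward()                                            # gradients w.r.t. weights and biases
optimizer.step()                                           # one Adam update
```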

4. Experimental Results

4.1. The Datasets

In our study, widely used hyperspectral datasets are employed: three public datasets (Indian Pines, Salinas, Pavia University) and our own dataset, named YU Paint data.
The Indian Pines hyperspectral data were collected with the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor over northwest Indiana, USA. They provide a spatial resolution of 20 m and 200 spectral channels in the 0.4–2.45 μm region of the visible and infrared spectrum, and contain a total of 16 classes. Figure 7 shows the classes of the dataset, the RGB image, and the ground truth.
The Salinas hyperspectral data were also collected with the AVIRIS sensor and provide 220 spectral channels with a spatial resolution of 3.7 m. The scene, an area of the Salinas Valley, CA, USA, consists of 16 classes. Figure 8 shows the classes of the dataset, the RGB image, and the ground truth.
The Pavia University hyperspectral data were collected with the Reflective Optics System Imaging Spectrometer (ROSIS) sensor. The spectral range is 0.43–0.86 μm, the spatial resolution is 1.3 m, and there are nine classes. Figure 9 shows the classes of the dataset, the RGB image, and the ground truth.
As shown in Figure 10, the hyperspectral image acquisition system used in the experiment consists of a SPECIM hyperspectral camera, a rotating stage, and the test objects, which are aluminum plates coated with paint. The spectral resolution of the hyperspectral camera is 2.8 nm, and the CCD sensor stores data as 1392 samples and 1040 bands; however, only a quarter of the bands were used for algorithm speed. The spectral acquisition range is 400 nm to 1000 nm. Figure 11 shows the classes of the dataset, the RGB image, and the ground truth.
We also experimented with versions of the public datasets in which the spatial information is randomized. Figure 12 presents images of each dataset after random spatial information is applied. With randomized spatial information, we can check whether a model's feature extraction concentrates on spatial information or on spectral information. Such data can also be regarded as an HSI of objects whose spatial information is hidden.
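The paper does not spell out exactly how the random-spatial versions are generated; one plausible construction, assumed here, is to randomly permute the pixel positions of the cube and the ground truth together, so every spectrum keeps its label while all shapes and textures are destroyed.

```python
import numpy as np

def randomize_spatial(cube, gt, seed=0):
    """Shuffle pixel locations of an HSI cube (H x W x B) and its label map (H x W) together."""
    h, w, bands = cube.shape
    perm = np.random.default_rng(seed).permutation(h * w)
    cube_rs = cube.reshape(-1, bands)[perm].reshape(h, w, bands)
    gt_rs = gt.reshape(-1)[perm].reshape(h, w)
    return cube_rs, gt_rs

# Toy example with a cube the size of Indian Pines (145 x 145 pixels, 200 bands).
cube = np.random.rand(145, 145, 200).astype(np.float32)
gt = np.random.randint(0, 17, size=(145, 145))
cube_rs, gt_rs = randomize_spatial(cube, gt)
```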

4.2. Experiment Setup

For the public datasets, 10% of the samples of each label were used for the trainset and 90% for the testset; the trainset and testset were randomly selected as disjoint data. For the hyperspectral data we acquired (hereinafter, YU Paint data), only 50 training samples were selected from each label. The specific numbers of train and test samples are given in Table 2, Table 3, Table 4 and Table 5. For each model and dataset, the batch size was set to 16–64 and the number of epochs to 100–800, and the best performance was used for comparison.
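The per-class sampling described above can be sketched as follows (our illustration, with a hypothetical function name): 10% of each labelled class is drawn for training on the public datasets, or a fixed 50 samples per class for YU Paint, and the remainder is used for testing.

```python
import numpy as np

def split_per_class(gt, train_ratio=0.1, fixed_per_class=None, seed=0):
    """Return flat pixel indices for the trainset and testset, sampled per class."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(gt[gt > 0]):                       # label 0 = unlabelled background
        idx = rng.permutation(np.flatnonzero(gt.ravel() == c))
        n_train = fixed_per_class if fixed_per_class else max(1, int(round(len(idx) * train_ratio)))
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)

# Public datasets: 10% per class.    YU Paint: 50 samples per class.
# tr, te = split_per_class(gt, train_ratio=0.1)
# tr, te = split_per_class(gt, fixed_per_class=50)
```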
Our proposed model is implemented in Python with PyTorch [42], a Python library for implementing deep learning models in which various convolution layers, activation functions, loss functions, optimizers, and so forth are defined. The results were generated on a PC equipped with a 4 GHz AMD Ryzen Threadripper 1920X CPU and an Nvidia GeForce GTX 1080 Ti graphics card.
We compared the proposed model with the representative 1D-CNN Hu model [14] and with 3D-CNN models (the Luo model [27], Li model [26], Hamida model [25], and Chen model [15]). We use the YU dataset to show that the proposed model outperforms the 3D-CNN models on HSI data with little spatial information. The datasets with random spatial information were used in two ways: first, the models were trained and tested on a dataset with random spatial information; second, the models were trained on a normal dataset and tested on a dataset with random spatial information. That is, we conducted the experiments in three ways: (1) training and testing with a normal dataset; (2) training and testing with a dataset with random spatial information; (3) training with a normal dataset and testing with a dataset with random spatial information.

4.3. Result and Comparison

The performance indicators are the F1-score (Equation (9)), the accuracy of each class, the Overall Accuracy (OA) (Equation (10)), the Average Accuracy (AA), and the Kappa coefficient over all classes. The F1-score is the harmonic mean of Precision (Equation (7)) and Recall (Equation (8)). Precision is the proportion of samples the model classifies as positive that are actually positive; Recall is the proportion of actually positive samples that the model predicts as positive. OA is the proportion of all samples that are predicted correctly, and AA is the average of the per-class accuracies. Kappa is a statistical measure of the agreement between raters when categorizing or classifying items; here it is computed by chance-correcting the measured accuracy. This last measure takes both the diagonal and off-diagonal entries of the confusion matrix into account and is a strong indicator of the degree of agreement.
$\text{Precision} = \dfrac{TP}{TP + FP}$  (7)
$\text{Recall} = \dfrac{TP}{TP + FN}$  (8)
$F1\text{-}score = \dfrac{2}{1/\text{Precision} + 1/\text{Recall}} = \dfrac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$  (9)
$\text{OA (Overall Accuracy)} = \dfrac{TP + TN}{TP + FN + FP + TN}$  (10)
TP is True Positive, FP is False Positive, FN is False Negative, and TN is True Negative.
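For reference, Equations (7)–(10) can be computed directly from a multi-class confusion matrix as in the sketch below (our code, treating each class one-vs-rest and using the usual chance-corrected definition of Kappa); it is not the exact evaluation script of the paper.

```python
import numpy as np

def metrics_from_confusion(cm):
    """cm[i, j] = number of samples of true class i predicted as class j."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp                      # predicted as the class but wrong
    fn = cm.sum(axis=1) - tp                      # missed samples of the class
    precision = tp / np.maximum(tp + fp, 1e-12)
    recall = tp / np.maximum(tp + fn, 1e-12)      # per-class accuracy
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    oa = tp.sum() / cm.sum()                      # Overall Accuracy
    aa = recall.mean()                            # Average Accuracy
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / cm.sum() ** 2
    kappa = (oa - pe) / (1 - pe)                  # Cohen's kappa
    return f1, oa, aa, kappa

cm = np.array([[50, 2, 0], [3, 45, 2], [1, 0, 49]])   # toy 3-class confusion matrix
f1, oa, aa, kappa = metrics_from_confusion(cm)
```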

4.3.1. Train and Test with General datasets

Table 6 shows the classification results of the proposed model, the existing 1D-CNN Hu model, and the 3D-CNNs trained and tested on the Indian Pines dataset. For Indian Pines, performance suffers severely from class imbalance. However, the Hamida model and the proposed model show high performance compared with the other models, with an OA of about 82%, which indicates that the proposed model is robust against class imbalance. Figure 13 shows the resulting images.
Table 7 shows the classification results of the proposed model, the existing 1D-CNN Hu model, and the 3D-CNNs trained and tested on the Salinas dataset. The Salinas dataset does not suffer from severe class imbalance compared with Indian Pines. The proposed model shows higher overall performance than the other models except the Hamida model; the lower performance of the other models is attributable to the training data being only 10% of the total data. The Hamida model and the proposed model showed high performance with OAs of 95.27% and 92.94%, respectively. Figure 14 shows the resulting images.
Table 8 shows the classification results of the proposed model, the existing 1D-CNN Hu model, and the 3D-CNNs trained and tested on the Pavia University dataset. This dataset is learned quickly and well because it has few classes, so high performance was obtained except for models that require a large amount of training data. Among them, the Li model, the Hamida model, and the proposed model achieve an OA of more than 90%. Figure 15 shows the resulting images.
From the above results, it can be confirmed that our model is superior to the existing 1D-CNN on the public HSI data, and that the proposed model is robust to class imbalance and to limited training data.
Table 9 compares several models on the YU Paint data. The Luo model, Li model, and Chen model showed low performance; all of their performance indicators are very low. This stems from a problem of the 3D-CNN: in a situation with very little spatial information, the 3D convolution, which performs spatial feature extraction in addition to spectral feature extraction, cannot extract features properly. However, among the 3D-CNN-based models, the Hamida model showed very high performance, because it contains 1D convolutions in the middle and therefore extracts spectral features to some extent. The Hu model is a 1D-CNN but shows low performance, which appears to be due to the small amount of training data. In contrast, the proposed model shows very high performance even with few data. Figure 16 shows the resulting images.
The YU Paint data confirm that the spatial-spectral feature extraction of the 3D convolution has little effect on data with very little spatial information, and that the proposed ODPA-CNN performs very well on very limited data and on new data.

4.3.2. Train and Test with Random Spatial datasets

In the previous experiments, the performance of each model was investigated on the public datasets and the YU dataset. To clarify the weakness of the 3D-CNN models, we next test with random spatial information.
The results of training on the Indian Pines dataset with random spatial information are given in Table 10. All models except the proposed model deteriorated, and the 3D-CNN models in particular were severely degraded.
Table 11 shows the results on the Salinas dataset with random spatial information. The degradation is not as severe as for the random-spatial Indian Pines dataset, but performance still decreases. The Hu model and the proposed model, which are 1D-CNNs, show no performance degradation.
For the Pavia University dataset with random spatial information, as with the Salinas dataset, the performance of the 3D-CNN models dropped, while the 1D-CNN models did not drop significantly. Table 12 shows the results.

4.3.3. Trained with General datasets and Tested with Random Spatial datasets

In this experiment, the models are trained on data without random spatial information and tested on the random-spatial data, to determine how vulnerable a trained 3D-CNN is in a situation with little spatial information.
The results for Indian Pines are presented in Table 13. The performance of the 3D-CNN models decreased significantly; only the 1D-CNN models maintain their performance.
As with the Indian Pines dataset, for the Salinas dataset the 3D-CNN models perform poorly while the 1D-CNN models maintain their performance. The results for Salinas are presented in Table 14.
The Pavia University dataset behaves the same as the other datasets: the 3D-CNN models perform poorly and the 1D-CNN models maintain their performance. The results for Pavia University are presented in Table 15.
The above experiments show that the 3D-CNN models rely heavily on spatial information and may overfit through excessive feature extraction. Therefore, a 1D-CNN, which classifies using only the spectral information of an object, is more advantageous than a 3D-CNN in situations with little spatial information.

4.4. Limitation

Even though the model is processed in parallel and the rate of the atrous convolution is varied, feature extraction according to the band distance inevitably makes the model heavy. In addition, if more rate values are used, the model becomes even heavier and there is a risk of overfitting.

5. Conclusions and Future Work

We propose a new HSI classification model, ODPA-CNN. This model is a CNN with band-selective elements, spectral feature extraction at various band distances, and parallel feature extraction. Experiments on the data we collected, in addition to the public data, show that the proposed model is robust even on new data. The experiments confirmed that ODPA-CNN performs excellently on the public data, and also in experiments using new data. Through the experiments on datasets with random spatial information, we showed that the 3D-CNN is weak when spatial information is lacking. On data with little spatial information, the 1D-CNN dominates, and among such models our proposed ODPA-CNN has excellent performance. As future work, we plan to study band selection in this network, which can extract features at various band distances. Feature extraction at various band distances is expected to be very beneficial for band selection.

Author Contributions

The contributions were distributed between authors as follows: B.K. wrote the text of the manuscript and programmed the ODPA-CNN. S.K. performed the in-depth discussion of the related literature, and confirmed the accuracy experiments that are exclusive to this paper. I.P. helped collect these new data and played a major role in making the ground truth. C.O. analyzed the experimental results. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by LIG Nex1 (grant number: LIGNEX1-2020-0890(02)). This research was supported by the 2021 Yeungnam University Research Grants.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Indian Pines dataset, Pavia University dataset, Salinas dataset (http://www.ehu.eus/ccwintco/index.php/Hyperspectral_Remote_Sensing_Scenes, accessed on 23 November 2021).

Acknowledgments

This work was supported by LIG Nex1 (contract no. LIGNEX1-2020-0890(02)). This work was also supported by the 2021 Yeungnam University Research Grants.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chang, C.I. Hyperspectral Imaging: Techniques for Spectral Detection and Classification; Springer Science & Business Media: New York, NY, USA, 2003; Volume 1. [Google Scholar]
  2. Hagen, N.A.; Kudenov, M.W. Review of snapshot spectral imaging technologies. Opt. Eng. 2013, 52, 090901. [Google Scholar] [CrossRef] [Green Version]
  3. Lacar, F.; Lewis, M.; Grierson, I. Use of hyperspectral imagery for mapping grape varieties in the Barossa Valley, South Australia. In IGARSS 2001, Scanning the Present and Resolving the Future, Proceedings of the IEEE 2001 International Geoscience and Remote Sensing Symposium (Cat. No. 01CH37217), Sydney, NSW, Australia, 9–13 July 2001; IEEE: New York, NY, USA, 2001; Volume 6, pp. 2875–2877. [Google Scholar]
  4. Bioucas-Dias, J.M.; Plaza, A.; Camps-Valls, G.; Scheunders, P.; Nasrabadi, N.; Chanussot, J. Hyperspectral remote sensing data analysis and future challenges. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–36. [Google Scholar] [CrossRef] [Green Version]
  5. Calin, M.A.; Parasca, S.V.; Savastru, D.; Manea, D. Hyperspectral imaging in the medical field: Present and future. Appl. Spectrosc. Rev. 2014, 49, 435–447. [Google Scholar] [CrossRef]
  6. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  7. Zhang, L.; Zhang, L.; Du, B. Deep learning for remote sensing data: A technical tutorial on the state of the art. IEEE Geosci. Remote Sens. Mag. 2016, 4, 22–40. [Google Scholar] [CrossRef]
  8. Zhu, X.X.; Tuia, D.; Mou, L.; Xia, G.S.; Zhang, L.; Xu, F.; Fraundorfer, F. Deep learning in remote sensing: A comprehensive review and list of resources. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–36. [Google Scholar] [CrossRef] [Green Version]
  9. Li, S.; Song, W.; Fang, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. Deep learning for hyperspectral image classification: An overview. IEEE Trans. Geosci. Remote Sens. 2019, 57, 6690–6709. [Google Scholar] [CrossRef] [Green Version]
  10. Licciardi, G.; Marpu, P.R.; Chanussot, J.; Benediktsson, J.A. Linear versus nonlinear PCA for the classification of hyperspectral data based on the extended morphological profiles. IEEE Geosci. Remote Sens. Lett. 2011, 9, 447–451. [Google Scholar] [CrossRef] [Green Version]
  11. Prasad, S.; Bruce, L.M. Limitations of principal components analysis for hyperspectral target recognition. IEEE Geosci. Remote Sens. Lett. 2008, 5, 625–629. [Google Scholar] [CrossRef]
  12. Villa, A.; Benediktsson, J.A.; Chanussot, J.; Jutten, C. Hyperspectral image classification with independent component discriminant analysis. IEEE Trans. Geosci. Remote Sens. 2011, 49, 4865–4876. [Google Scholar] [CrossRef] [Green Version]
  13. Bandos, T.V.; Bruzzone, L.; Camps-Valls, G. Classification of hyperspectral images with regularized linear discriminant analysis. IEEE Trans. Geosci. Remote Sens. 2009, 47, 862–873. [Google Scholar] [CrossRef]
  14. Hu, W.; Huang, Y.; Wei, L.; Zhang, F.; Li, H. Deep convolutional neural networks for hyperspectral image classification. J. Sens. 2015, 2015, 258619. [Google Scholar] [CrossRef] [Green Version]
  15. Chen, Y.; Jiang, H.; Li, C.; Jia, X.; Ghamisi, P. Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6232–6251. [Google Scholar] [CrossRef] [Green Version]
  16. Haut, J.M.; Paoletti, M.E.; Plaza, J.; Li, J.; Plaza, A. Active learning with convolutional neural networks for hyperspectral image classification using a new bayesian approach. IEEE Trans. Geosci. Remote Sens. 2018, 56, 6440–6461. [Google Scholar] [CrossRef]
  17. Yang, X.; Ye, Y.; Li, X.; Lau, R.Y.; Zhang, X.; Huang, X. Hyperspectral image classification with deep learning models. IEEE Trans. Geosci. Remote Sens. 2018, 56, 5408–5423. [Google Scholar] [CrossRef]
  18. Ghamisi, P.; Maggiori, E.; Li, S.; Souza, R.; Tarablaka, Y.; Moser, G.; De Giorgi, A.; Fang, L.; Chen, Y.; Chi, M.; et al. New frontiers in spectral-spatial hyperspectral image classification: The latest advances based on mathematical morphology, Markov random fields, segmentation, sparse representation, and deep learning. IEEE Geosci. Remote Sens. Mag. 2018, 6, 10–43. [Google Scholar] [CrossRef]
  19. Fang, L.; Li, S.; Kang, X.; Benediktsson, J.A. Spectral–spatial classification of hyperspectral images with a superpixel-based discriminative sparse model. IEEE Trans. Geosci. Remote Sens. 2015, 53, 4186–4201. [Google Scholar] [CrossRef]
  20. Yue, J.; Zhao, W.; Mao, S.; Liu, H. Spectral–spatial classification of hyperspectral images using deep convolutional neural networks. Remote Sens. Lett. 2015, 6, 468–477. [Google Scholar] [CrossRef]
  21. Makantasis, K.; Karantzalos, K.; Doulamis, A.; Doulamis, N. Deep supervised learning for hyperspectral data classification through convolutional neural networks. In Proceedings of the 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy, 26–31 July 2015; pp. 4959–4962. [Google Scholar]
  22. Fang, L.; Liu, Z.; Song, W. Deep hashing neural networks for hyperspectral image feature extraction. IEEE Geosci. Remote Sens. Lett. 2019, 16, 1412–1416. [Google Scholar] [CrossRef]
  23. Mei, S.; Ji, J.; Hou, J.; Li, X.; Du, Q. Learning sensor-specific spatial-spectral features of hyperspectral images via convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4520–4533. [Google Scholar] [CrossRef]
  24. Kang, X.; Li, C.; Li, S.; Lin, H. Classification of hyperspectral images by Gabor filtering based deep network. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 11, 1166–1178. [Google Scholar] [CrossRef]
  25. Ben Hamida, A.; Benoit, A.; Lambert, P.; Ben Amar, C. 3-D Deep Learning Approach for Remote Sensing Image Classification. IEEE Trans. Geosci. Remote Sens. 2018, 56, 4420–4434. [Google Scholar] [CrossRef] [Green Version]
  26. Li, Y.; Zhang, H.; Shen, Q. Spectral—Spatial Classification of Hyperspectral Imagery with 3D Convolutional Neural Network. Remote Sens. 2017, 9, 67. [Google Scholar] [CrossRef] [Green Version]
  27. Luo, Y.; Zou, J.; Yao, C.; Zhao, X.; Li, T.; Bai, G. HSI-CNN: A novel convolution neural network for hyperspectral image. In Proceedings of the 2018 International Conference on Audio, Language and Image Processing (ICALIP), Shanghai, China, 16–17 July 2018; pp. 464–469. [Google Scholar]
  28. Ghamisi, P.; Plaza, J.; Chen, Y.; Li, J.; Plaza, A.J. Advanced spectral classifiers for hyperspectral images: A review. IEEE Geosci. Remote Sens. Mag. 2017, 5, 8–32. [Google Scholar] [CrossRef] [Green Version]
  29. Imani, M.; Ghassemian, H. An overview on spectral and spatial information fusion for hyperspectral image classification: Current trends and challenges. Inf. Fusion 2020, 59, 59–83. [Google Scholar] [CrossRef]
  30. Yu, F.; Koltun, V.; Funkhouser, T. Dilated residual networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 472–480. [Google Scholar]
  31. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848. [Google Scholar] [CrossRef]
  32. LeCun, Y.; Boser, B.; Denker, J.S.; Henderson, D.; Howard, R.E.; Hubbard, W.; Jackel, L.D. Backpropagation applied to handwritten zip code recognition. Neural Comput. 1989, 1, 541–551. [Google Scholar] [CrossRef]
  33. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
  34. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems; Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q., Eds.; Curran Associates, Inc.: London, UK, 2012; Volume 25. [Google Scholar]
  35. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556. [Google Scholar]
  36. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper With Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015. [Google Scholar]
  37. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016. [Google Scholar]
  38. Howard, A.; Sandler, M.; Chu, G.; Chen, L.C.; Chen, B.; Tan, M.; Wang, W.; Zhu, Y.; Pang, R.; Vasudevan, V.; et al. Searching for mobilenetv3. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27–28 October 2019; pp. 1314–1324. [Google Scholar]
  39. Ruder, S. An overview of gradient descent optimization algorithms. arXiv 2016, arXiv:1609.04747. [Google Scholar]
  40. Bottou, L. Stochastic gradient descent tricks. In Neural Networks: Tricks of the Trade; Springer: Berlin/Heidelberg, Germany, 2012; pp. 421–436. [Google Scholar]
  41. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2017, arXiv:1412.6980. [Google Scholar]
  42. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: London, UK, 2019; pp. 8024–8035. [Google Scholar]
Figure 1. Differences of 1D-Convolution and 1D-Atrous Convolution. (a) 1D-Convolution. (b) 1D-Atrous Convolution. The gray regions are not included in the calculation when calculating one kernel in the convolution. In general, the 1D-Convolution calculates only limited neighbors, but the 1D-Atrous Convolution calculates far distances as well.
Figure 2. Examples of paint spectral of YU dataset acquired in an outdoor environment. This shows how difficult it is to classify spectral information.
Figure 3. Basic concept of Atrous convolution in 2D image processing.
Figure 4. Proposed CNN Model, ODPA-CNN: AtConv is 1D-Atrous Convolution. Conv is normal 1D-Convolution. Table 1 has detailed parameters of each convolution. In this model, AtConv is processed in parallel to extract features for bands separated by various distances.
Figure 5. Concept of spectral feature extraction using 1D-Atrous Convolution: Example has 3 kernel size and 3 rate size.
Figure 6. Graph of ReLU6 and Hard-swish: (a) ReLU6 (b) Hard-swish.
Figure 7. Indian Pines dataset: Left is the names of classes, Center is ground truth, Right is the RGB image.
Figure 8. Salinas dataset: Left is the names of classes, Center is ground truth, Right is the RGB image.
Figure 9. Pavia University dataset: Left is the names of classes, Center is ground truth, Right is the RGB image.
Figure 10. Acquisition of the YU paint data.
Figure 11. YU Paint dataset: Left is the names of classes, Top is ground truth, Bottom is the RGB image.
Figure 12. HSI datasets applied random spatial information: (a) Indian Pines, (b) Pavia University, (c) Salinas.
Figure 13. Classification results of Indian Pines: (a) ground truth, (b) Luo model, (c) Li model, (d) Hamida model, (e) Chen model, (f) Hu model, (g) proposed ODPA-CNN.
Figure 14. Classification results of Salinas: (a) ground truth, (b) Luo model, (c) Li model, (d) Hamida model, (e) Chen model, (f) Hu model, (g) proposed ODPA-CNN.
Figure 15. Classification results of Pavia University: (a) ground truth, (b) Luo model, (c) Li model, (d) Hamida model, (e) Chen model, (f) Hu model, (g) proposed ODPA-CNN.
Figure 16. Classification results of YU Paint and Comparison of 3D-CNNs and the proposed ODPA-CNN: (a) ground truth, (b) Luo model, (c) Li model, (d) Hamida model, (e) Chen model, (f) Hu model, (g) proposed ODPA-CNN.
Table 1. The parameters of ODPA-CNN.

Layers  | Kernel Size | Feature Maps | Dilated Rate | Stride | Padding
Conv1   | 1 | 1  | 1  | 1 | 1
Conv2_1 | 1 | 32 | 1  | 1 | 0
Conv2_2 | 3 | 32 | 6  | 1 | 6
Conv2_3 | 3 | 32 | 12 | 1 | 12
Conv2_4 | 3 | 32 | 18 | 1 | 18
Conv3   | 1 | 64 | 1  | 1 | same
Conv4   | 3 | 32 | 1  | 1 | same
Conv5   | 3 | 32 | 1  | 1 | same
FC1     | Input: feature size; Output: feature size/2
FC2     | Output: 128
FC3     | Output: number of classes
Table 2. Number of train and test samples in the Indian Pines dataset.

#  | Class Name                   | Total | Train | Test
1  | Alfalfa                      | 46    | 4     | 42
2  | Corn-notill                  | 1428  | 142   | 1286
3  | Corn-mintill                 | 830   | 83    | 747
4  | Corn                         | 237   | 23    | 214
5  | Grass-pasture                | 483   | 48    | 435
6  | Grass-trees                  | 730   | 73    | 657
7  | Grass-pasture-mowed          | 28    | 2     | 26
8  | Hay-windrowed                | 478   | 47    | 431
9  | Oats                         | 20    | 2     | 18
10 | Soybean-notill               | 972   | 97    | 875
11 | Soybean-mintill              | 2455  | 245   | 2210
12 | Soybean-clean                | 593   | 59    | 534
13 | Wheat                        | 205   | 20    | 185
14 | Woods                        | 1265  | 126   | 1139
15 | Buildings-Grass-Trees-Drives | 386   | 38    | 348
16 | Stone-Steel-Towers           | 93    | 9     | 84
Table 3. Number of train and test samples in the Salinas dataset.

#  | Class Name                 | Total  | Train | Test
1  | Brocoli_green_weeds_1      | 2009   | 200   | 1809
2  | Brocoli_green_weeds_2      | 3726   | 372   | 3354
3  | Fallow                     | 1976   | 197   | 1779
4  | Fallow_rough_plow          | 1394   | 139   | 1255
5  | Fallow_smooth              | 2678   | 267   | 2411
6  | Stubble                    | 3959   | 395   | 3564
7  | Celery                     | 3579   | 357   | 3222
8  | Grapes_untrained           | 11,271 | 1127  | 10,144
9  | Soil_vinyard_develop       | 6203   | 620   | 5583
10 | Corn_senesced_green_weeds  | 3278   | 327   | 2951
11 | Lettuce_romaine_4wk        | 1068   | 106   | 962
12 | Lettuce_romaine_5wk        | 1927   | 192   | 1735
13 | Lettuce_romaine_6wk        | 916    | 91    | 825
14 | Lettuce_romaine_7wk        | 1070   | 107   | 963
15 | Vinyard_untrained          | 7268   | 726   | 6542
16 | Vinyard_vertical_trellis   | 1807   | 180   | 1627
Table 4. Number of train and test samples in the Pavia University dataset.

# | Class Name           | Total  | Train | Test
1 | Asphalt              | 6631   | 663   | 5968
2 | Meadows              | 18,649 | 1864  | 16,785
3 | Gravel               | 2099   | 209   | 1890
4 | Trees                | 3064   | 306   | 2758
5 | Painted metal sheets | 1345   | 134   | 1211
6 | Bare Soil            | 5029   | 502   | 4527
7 | Bitumen              | 1330   | 133   | 1197
8 | Self-Blocking Bricks | 3682   | 368   | 3314
9 | Shadows              | 947    | 94    | 853
Table 5. Number of train and test samples in the YU Paint dataset.

# | Class Name    | Total  | Train | Test
1 | bright paint  | 26,734 | 50    | 26,684
2 | dark paint    | 28,737 | 50    | 28,687
3 | aluminum      | 10,977 | 50    | 10,927
4 | grass         | 3516   | 50    | 3466
5 | fallen leaves | 1326   | 50    | 1276
6 | shadow        | 2616   | 50    | 2566
Table 6. Classification results of the Indian Pines dataset to which random spatial information is not applied.
Luo Model | Li Model | Hamida Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.0000.00%0.18912.20%0.73560.98%
20.0020.08%0.62361.79%0.74476.34%
30.0000.00%0.53950.47%0.75971.89%
40.0000.00%0.44841.78%0.80383.10%
50.0000.00%0.87881.84%0.91889.89%
60.51679.76%0.88288.58%0.97297.11%
70.0000.00%0.78372.00%0.85180.00%
80.36432.56%0.93998.60%0.97899.07%
90.0000.00%0.38938.89%0.88283.33%
100.0000.00%0.61062.86%0.76675.77%
110.53696.97%0.72573.21%0.81982.17%
120.0000.00%0.56357.30%0.68364.04%
130.0000.00%0.94596.76%0.99298.38%
140.81699.21%0.95194.64%0.97097.37%
150.0000.00%0.55150.43%0.76570.03%
160.0000.00%0.85175.00%0.95792.86%
OA(%)42.688%61.34%82.851%
AA(%)19.29%66.02%82.64%
Kappa0.2970.70100.804
Chen Model | Hu Model | Proposed Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.96395.12%0.0000.00%0.71459.52%
20.91189.73%0.48236.50%0.78379.32%
30.51535.48%0.50645.78%0.73171.08%
40.42728.64%0.41740.38%0.63968.22%
50.69956.09%0.1408.28%0.91289.89%
60.95996.04%0.81997.11%0.93693.15%
70.90292.00%0.31220.00%0.83376.92%
80.73959.07%0.93398.37%0.96499.54%
90.63677.78%0.0000.00%0.70372.22%
100.85680.46%0.52346.06%0.76176.46%
110.90886.15%0.68586.97%0.82381.76%
120.74863.48%0.45035.96%0.76074.91%
130.96699.46%0.86292.97%0.92394.05%
140.84274.28%0.84096.49%0.93192.27%
150.33219.88%0.33222.77%0.64166.09%
160.64251.19%0.97695.24%0.89798.81%
OA(%)73.420%64.455%82.342%
AA(%)69.05%51.43%80.89%
Kappa0.7040.5850.799
Table 7. Classification results of the Salinas dataset to which random spatial information is not applied.
Luo Model | Li Model | Hamida Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.00011.06%0.90489.11%0.98396.74%
20.0020.18%0.94794.28%1100.00%
30.00039.24%0.84282.41%0.99399.72%
40.00099.84%0.95894.34%0.99399.92%
50.00094.73%0.89390.17%0.99599.50%
60.51698.15%0.99098.74%0.99699.27%
70.00099.29%0.98497.30%0.99599.22%
80.36498.58%0.81880.58%0.9293.66%
90.00097.21%0.95395.24%0.99899.77%
100.00055.66%0.87184.85%0.97896.07%
110.5360.00%0.76271.00%0.97496.36%
120.00032.76%0.87686.57%0.98998.04%
130.0000.00%0.86784.48%0.9998.42%
140.8167.17%0.86483.28%0.98797.92%
150.0000.00%0.74072.67%0.86786.37%
160.00012.18%0.94090.29%0.94790.77%
OA(%)59.605%86.512%95.269%
AA(%)46.63%87.21%96.99%
Kappa0.5390.8500.947
Chen Model | Hu Model | Proposed Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.96380.20%0.985100%0.96793.68%
20.91198.87%0.98997.95%0.98099.69%
30.51594.83%0.93095.88%0.98998.99%
40.42792.83%0.98898.73%0.98897.66%
50.69993.40%0.95993.56%0.98398.66%
60.95991.05%0.99999.94%0.99899.94%
70.90291.24%0.99399.13%0.999100%
80.73984.39%0.76178.94%0.86280.6%
90.63696.53%0.99299.09%0.99599.04%
100.85688.85%0.89085.64%0.96797.52%
110.90887.72%0.90892.54%0.96698.38%
120.74886.97%0.98397.34%0.98697.31%
130.96689.21%0.96496.25%0.97395.34%
140.84286.81%0.95495.81%0.96497.15%
150.33299.29%0.66764.73%0.74485.93%
160.64239.54%0.97997.68%0.99499.81%
OA(%)87.005%88.66%92.94%
AA(%)87.61%93.32%96.23%
Kappa0.8570.8740.9212
Table 8. Classification results of the Pavia University dataset to which random spatial information is not applied.
Luo Model | Li Model | Hamida Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.83795.16%0.96597.69%0.98297.45%
20.84695.04%0.97394.77%0.99895.84%
30.0000.00%0.92691.05%0.97993.17%
40.66052.14%0.97998.15%0.98996.63%
50.99599.26%0.9999.83%0.98899.83%
60.21313.19%0.97999.87%0.99598.12%
70.0000.00%0.93889.06%0.99697.83%
80.72689.38%0.96294.33%0.88996.29%
90.98597.07%0.99799.53%0.99699.18%
OA(%)74.433%95.932%96.584%
AA(%)60.14%96.03%97.15%
Kappa0.6420.9470.955
Chen Model | Hu Model | Proposed Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.96378.74%0.80476.35%0.94294.44%
20.91176.38%0.86779.48%0.95894.52%
30.51574.01%0.05825.41%0.80575.89%
40.42788.32%0.78778.32%0.93097.34%
50.699100.00%0.97696.15%0.99699.59%
60.95992.97%0.35581.65%0.87689.77%
70.90299.58%0.05511.78%0.86492.64%
80.73998.61%0.74762.28%0.85486.62%
90.63695.31%0.99699.3%0.99899.77%
OA(%)83.233%76.71%92.69%
AA(%)89.32%67.86%92.29
Kappa0.7910.66780.9026
Table 9. Classification results of the YU Paint dataset to which random spatial information is not applied.
Luo Model | Li Model | Hamida Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.0000%0.94299.33%0.99799.89%
20.0000%0.69783.69%0.98599.61%
30.0000%0.93699.20%0.99799.91%
40.0904.71%0.85883.27%0.95597.71%
50.0000%0.88797.35%0.98398.73%
60.0000%0.29317.23%0.88881.79%
OA(%)4.694%68.58%98.25%
AA(%)0.79%80.01%96.27%
Kappa0.0000.70100.9748
Chen Model | Hu Model | Proposed Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.51437.36%0.05295.34%0.99699.44%
20.0000%0.79999.29%0.99899.90%
30.0000%0.45129.23%0.99099.63%
40.0900%0.81599.87%0.98997.46%
50.0000%0.80667.53%0.99098.08%
60.0000%0.34320.69%0.98081.49%
OA(%)35.54%50.136%99.47%
AA(%)6.23%68.66%98.67%
Kappa0.0160.3860.992
Table 10. Classification results of Indian Pines dataset to which random spatial information is applied.
Luo Model | Li Model | Hamida Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.0000.00%0.31926.83%0.52758.54%
20.0000.00%0.47748.79%0.57555.49%
30.0000.00%0.32528.11%0.40138.02%
40.0000.00%0.33032.86%0.24122.54%
50.0000.00%0.44839.31%0.61652.64%
60.0000.00%0.74674.28%0.85686.15%
70.0000.00%0.1058.00%0.47436.00%
80.0000.00%0.80875.35%0.82977.21%
90.0000.00%0.0000.00%0.43527.78%
100.0000.00%0.40339.20%0.50550.17%
110.000100.00%0.60061.90%0.67871.45%
120.3970.00%0.30026.40%0.43837.08%
130.0000.00%0.70664.32%0.86689.19%
140.0000.00%0.78580.68%0.83483.76%
150.0000.00%0.54651.30%0.55453.31%
160.0000.00%0.80967.86%0.86378.57%
OA(%)0.00054.504%62.829%
AA(%)6.25%45.33%57.37%
Kappa0.0100.4810.575
Chen Model | Hu Model | Proposed Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.0000.00%00.00%0.79182.93%
20.4240.00%0.42434.63%0.81082.57%
30.1260.00%0.12611.51%0.78876.97%
40.0000.00%00.00%0.67368.08%
50.0000.00%00.00%0.88687.59%
60.6630.00%0.66397.87%0.93596.65%
70.0000.00%00.00%0.68356.00%
80.8730.00%0.87398.60%0.96598.37%
90.0000.00%00.00%0.60655.56%
100.0710.00%0.0714.80%0.80576.69%
110.58364.89%0.58382.94%0.85485.20%
120.0000.00%00.00%0.80385.77%
130.0000.00%00.00%0.93394.05%
140.8360.00%0.83698.60%0.93895.96%
150.0000.00%00.00%0.60251.59%
160.0000.00%00.00%0.95894.05%
OA(%)49.821%49.821%84.715%
AA(%)26.81%26.81%80.50%
Kappa0.0090.4020.826
Table 11. Classification results of the Salinas dataset to which random spatial information is applied.
Luo Model | Li Model | Hamida Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.0000.00%0.93591.54%0.94893.69%
20.73794.04%0.95695.35%0.93090.10%
30.0000.00%0.86886.40%0.82579.65%
40.96196.02%0.93890.92%0.95592.99%
50.68996.14%0.89287.51%0.90389.42%
60.97895.87%0.97995.93%0.97495.17%
70.91196.49%0.98497.33%0.96193.11%
80.67793.36%0.80979.11%0.81380.80%
90.72398.46%0.94994.68%0.96597.49%
100.0856.20%0.89388.03%0.89790.81%
110.0000.00%0.75468.68%0.82777.00%
120.0000.00%0.91292.10%0.90090.02%
130.0000.00%0.90488.61%0.91090.79%
140.58141.12%0.89387.23%0.90590.65%
150.0683.93%0.73171.84%0.74073.05%
160.0000.00%0.95693.23%0.97797.23%
OA(%)59.534%86.508%87.126%
AA(%)45.10%88.03%88.87%
Kappa0.5370.8500.857
Chen Model | Hu Model | Proposed Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.63264.16%0.97695.24%0.99799.45%
20.67362.34%0.98599.88%0.99899.82%
30.37529.45%0.89791.74%0.99299.61%
40.14912.03%0.98599.60%0.99399.52%
50.25820.54%0.94192.49%0.98598.46%
60.58252.43%0.99899.55%0.99999.97%
70.51345.89%0.98199.22%0.99899.81%
80.65576.47%0.79191.01%0.86888.64%
90.73669.59%0.9796.76%0.99799.98%
100.47948.34%0.86587.59%0.97196.71%
110.27123.52%0.80885.85%0.98599.69%
120.28923.36%0.96799.65%0.998100.00%
130.14114.18%0.94797.58%0.99899.64%
140.19917.34%0.93192.32%0.98799.17%
150.60264.44%0.53539.93%0.78876.00%
160.56943.36%0.95592.13%0.98397.60%
OA(%)51.438%87.241%93.953%
AA(%)41.71%91.28%97.13%
Kappa0.4650.8570.933
Table 12. Classification results of the Pavia University dataset to which random spatial information is applied.
Luo Model | Li Model | Hamida Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.64797.17%0.79679.74%0.87287.08%
20.83495.17%0.88189.54%0.91290.97%
30.0000.00%0.47745.61%0.69868.71%
40.48738.76%0.8379.94%0.87784.34%
50.83972.50%0.96895.46%0.98297.03%
60.0261.33%0.63158.45%0.78075.32%
70.0000.00%0.47843.36%0.67162.99%
80.19813.88%0.66168.23%0.77376.10%
90.88980.05%0.95392.97%0.98697.42%
OA(%)64.734%78.511%85.142%
AA(%)44.32%72.59%82.22%
Kappa0.4970.7140.804
Chen Model | Hu Model | Proposed Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.24125.49%0.87188.79%0.95995.53%
20.65964.23%0.91398.47%0.97897.96%
30.14311.22%0.60758.71%0.84382.42%
40.14315.08%0.87580.78%0.95896.01%
50.37828.32%0.99398.76%0.99599.75%
60.31126.14%0.62948.06%0.93893.28%
70.1169.44%0.69361.65%0.91491.40%
80.16115.12%0.82783.61%0.88690.37%
90.0736.46%0.99999.88%0.99899.65%
OA(%)39.284%85.444%95.369%
AA(%)22.39%79.86%94.04%
Kappa0.2100.8010.939
Table 13. Classification results of the Indian Pines dataset to which random spatial information is not applied for training and random spatial information to test.
Luo Model | Li Model | Hamida Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.0000.00%0.0000.00%0.0000.00%
20.0704.41%0.0714.20%0.0794.90%
30.0000.00%0.14133.49%0.0886.63%
40.0000.00%0.04616.03%0.07064.14%
50.0000.00%0.0948.49%0.18221.12%
60.15856.03%0.0796.44%0.1237.67%
70.0000.00%0.0187.14%0.03610.71%
80.08425.10%0.12010.88%0.1186.69%
90.0000.00%0.0000.00%0.0000.00%
100.0000.00%0.0493.09%0.0291.54%
110.36136.25%0.1268.07%0.22915.32%
120.0000.00%0.0796.41%0.0624.55%
130.0000.00%0.0432.93%0.27025.85%
140.27917.55%0.19211.94%0.55860.63%
150.0000.00%0.12331.09%0.06310.88%
160.0000.00%0.0000.00%0.0000.00%
OA(%)16.626%10.352%17.074%
AA(%)8.71%9.39%15.04%
Kappa0.064780.03760.113
Chen Model | Hu Model | Proposed Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.0000.00%0.0000.00%0.74463.04%
20.0000.00%0.48336.41%0.80281.23%
30.0060.30%0.51246.27%0.75773.86%
40.0010.07%0.43241.77%0.67171.31%
50.0000.00%0.1508.90%0.92190.89%
60.0000.00%0.82197.40%0.94293.84%
70.0000.00%0.28617.86%0.84678.57%
80.24231.17%0.93398.33%0.96799.58%
90.0000.00%0.0000.00%0.71475.00%
100.10122.88%0.52145.68%0.78378.60%
110.0000.00%0.68687.54%0.84083.42%
120.0301.92%0.46737.10%0.78277.23%
130.02925.22%0.86293.17%0.93094.63%
140.0110.84%0.84196.60%0.93792.96%
150.0010.03%0.33122.54%0.67168.91%
160.0000.00%0.97294.62%0.90698.92%
OA(%)8.403%64.719%83.969%
AA(%)5.15%51.51%82.63%
Kappa−0.00280.58780.817
Table 14. Classification results of the Salinas dataset to which random spatial information is not applied for training and random spatial information to test.
Luo Model | Li Model | Hamida Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.0000.00%0.0000.00%0.0000.00%
20.0291.48%0.0000.00%0.0895.07%
30.0678.76%0.0150.76%0.0180.96%
40.0482.94%0.0080.43%0.0030.14%
50.0825.97%0.0533.25%0.0392.65%
60.0120.63%0.0522.80%0.21914.85%
70.0634.36%0.0000.00%0.0724.47%
80.37073.33%0.33654.64%0.36854.68%
90.1025.59%0.0603.77%0.19214.19%
100.13322.94%0.17324.86%0.14210.37%
110.0000.00%0.0140.84%0.0302.25%
120.0404.88%0.0757.58%0.08310.59%
130.0000.00%0.04920.63%0.04527.07%
140.0213.46%0.0246.92%0.0339.35%
150.0000.00%0.16313.11%0.1329.42%
160.0000.00%0.0221.16%0.0613.76%
OA(%)18.668%16.292%17.996%
AA(%)8.40%8.80%10.61%
Kappa0.04690.04320.0753
Chen Model | Hu Model | Proposed Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.0000.00%0.97197.76%0.998899.75%
20.0513.08%0.98097.45%0.997399.95%
30.0282.05%0.87180.97%0.983297.82%
40.0000.00%0.97998.21%0.993299.00%
50.0565.59%0.92396.86%0.986099.70%
60.11026.58%0.99799.67%0.999099.80%
70.0000.00%0.97598.71%0.997699.58%
80.09022.38%0.77494.28%0.859885.36%
90.0000.00%0.96397.37%0.994499.53%
100.0020.10%0.85481.73%0.955594.69%
110.0945.42%0.75172.94%0.965398.97%
120.0000.00%0.94699.90%0.997999.95%
130.0382.93%0.93898.91%0.985197.60%
140.0351.82%0.92790.37%0.971898.13%
150.0000.00%0.40126.50%0.793380.09%
160.0000.00%0.95092.81%0.992298.84%
OA(%)5.385%85.281%93.589%
AA(%)4.37%89.03%96.80%
Kappa0.00670.83480.9286
Table 15. Classification results of the Pavia University dataset to which random spatial information is not applied for training and random spatial information to test.
Luo Model | Li Model | Hamida Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.21217.78%0.20217.24%0.0563.17%
20.64086.61%0.0693.64%0.47141.79%
30.0000.00%0.0030.14%0.0060.29%
40.0010.07%0.17221.54%0.0734.47%
50.0040.22%0.0190.97%0.0000.00%
60.15113.78%0.24473.35%0.26263.65%
70.0000.00%0.0030.15%0.0000.00%
80.0816.06%0.22130.53%0.19026.67%
90.0020.11%0.0020.11%0.0000.00%
OA(%)42.67%17.098%28.824%
AA(%)13.85%16.41%15.56%
Kappa0.11000.0610.0897
Chen Model | Hu Model | Proposed Model
# | F1-Score | Accuracy | F1-Score | Accuracy | F1-Score | Accuracy
10.15712.74%0.86787.92%0.95094.60%
20.1106.11%0.91297.30%0.97998.48%
30.0000.00%0.34127.11%0.83383.18%
40.0978.78%0.88185.15%0.96595.56%
50.11311.23%0.99198.88%0.99499.85%
60.20755.44%0.63950.07%0.93792.54%
70.0000.00%0.66157.22%0.90491.58%
80.14518.31%0.78187.86%0.88187.81%
90.0020.11%0.99799.68%0.99699.26%
OA(%)13.72%84.02%95.149%
AA(%)12.52%76.80%93.65%
Kappa0.02740.78240.9355
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
