1. Introduction
With the development of sensor technologies, diverse information about materials on the ground can be collected from multiple image sensor sources, e.g., hyperspectral images (HSI) with spectral characteristics, light detection and ranging (LiDAR) images with shape and elevation information, thermal images with temperature information, and synthetic aperture radar (SAR) images with terrain texture information. These multi-source images make it possible to exploit the complementary and redundant information of each source, which can provide reliable image interpretation and improve the performance of intelligent processing algorithms. Although these datasets provide a wealth of information, the automatic interpretation of remote sensing data remains very challenging [1].
Hyperspectral images (HSI) have hundreds of narrow spectral bands throughout the visible and infrared portions of the electromagnetic spectrum. LiDAR is an active remote sensing method that uses pulsed laser light to measure distances (ranges). Similar to radar, coherent light pulses are transmitted, reflected by a target, and detected by a receiver. LiDAR data products usually include the LiDAR point cloud, the digital terrain model (DTM), the canopy height model (CHM), and the digital surface model (DSM). The LiDAR point cloud contains 3D coordinates, classified ground returns, above-ground-level (AGL) heights, and apparent reflectance. The CHM represents the actual height of buildings, trees, etc., without the influence of ground elevation. The DTM is the bare-earth elevation. The DSM describes the height of all objects on the surface of the earth and is the sum of the DTM and the CHM. The passive sensing of hyperspectral systems describes the spectral characteristics of the observed scenes, whereas the active sensing of LiDAR systems offers the height and shape information of the scenes. LiDAR also provides high accuracy and flexibility, since it can be operated at any time, is less sensitive to weather conditions, and has adjustable system parameters, for example, flying speed/height, scan angle, pulse rate, and scan rate, amongst others.
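As a concrete illustration of the relationship among these raster products, the following is a minimal sketch on hypothetical co-registered grids (the array values are illustrative only):

```python
import numpy as np

# Hypothetical co-registered rasters on the same grid:
# dtm holds bare-earth elevation (m); chm holds above-ground object height (m).
dtm = np.array([[10.0, 10.5],
                [11.0, 11.2]])
chm = np.array([[0.0, 3.2],
                [8.5, 0.1]])

dsm = dtm + chm       # DSM: ground elevation plus the objects standing on it
chm_back = dsm - dtm  # conversely, a CHM can be recovered as DSM minus DTM
```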
Currently, various algorithms are designed only for HSI or only for LiDAR. Several algorithms for classification, feature extraction, and segmentation have been proposed for HSI [2,3,4,5,6,7,8,9,10], while many feature extraction and detection algorithms are designed only for LiDAR [11,12,13,14,15,16]. However, it is evident that no single type of sensor is always adequate for reliable image interpretation. For instance, HSI cannot be utilized to distinguish objects composed of the same material, such as roofs and roads with the same pavement material. On the other hand, LiDAR data alone cannot be adopted to differentiate objects with the same elevation, such as roofs of the same height made of either concrete or tile [1].
In many applications, HSI and LiDAR have been used successfully in combination, such as biomass estimation [17], micro-climate modeling [18], mapping plant richness [19], and fuel type mapping [20]. Furthermore, the combined use of HSI and LiDAR results in higher classification accuracies than using either source separately. For example, the elevation and shape information in LiDAR data and the spectral information acquired by HSI have been jointly investigated [1,21,22,23,24,25,26,27,28]. These works show that LiDAR and HSI complement each other well: by adequately integrating the two data sets, the advantages of both can be fully utilized while the shortcomings of each are compensated. Their results show that the combined use of LiDAR and optical data improves classification accuracies in forests and urban areas. This line of research on the combined use of LiDAR and HSI led to the 2013 and 2018 data fusion contests organized by the IEEE Geoscience and Remote Sensing Society (GRSS).
Hyperspectral and LiDAR fusion classification methods operate at either the feature level or the decision level. In feature-level fusion, features extracted from the HSI and LiDAR data are stacked for further processing [21]. However, the stacked features may contain redundant information. With the limited number of labeled samples available in many real applications, the stacked features may suffer from the curse of dimensionality and therefore risk overfitting the training data. Therefore, many works in the literature adopt a fusion framework of stacking followed by dimension reduction [22,23,26,27,28]. W. Liao et al., Y. Gu et al., and P. Ghamisi et al. use graph-based methods [22,23,28], where constructing the graph requires considerable computation and memory. B. Rasti et al. use a sparse representation based method [26] and a total variation component analysis based method [27], in which solving the optimization problem iteratively requires much computation. In decision-level fusion, the classification results of different classifiers are merged with a majority voting strategy [24]. However, voting can lead to coarse results. In this paper, we propose an effective fusion method with low computational complexity that is suitable for real-time applications.
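For reference, a minimal sketch of the feature-level stacking step is given below, assuming per-pixel feature matrices have already been extracted from each source (the function and variable names are hypothetical):

```python
import numpy as np

def stack_features(hsi_features, lidar_features):
    """Feature-level fusion by stacking.

    hsi_features:   (n_pixels, d_hsi) array of HSI-derived features.
    lidar_features: (n_pixels, d_lidar) array of LiDAR-derived features.
    """
    def zscore(x):
        # Standardize each feature so neither source dominates the stack.
        return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)

    # The stacked (n_pixels, d_hsi + d_lidar) features typically undergo
    # dimension reduction next, to limit redundancy and overfitting.
    return np.concatenate([zscore(hsi_features), zscore(lidar_features)], axis=1)
```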
In our fusion framework, we need to extract the spatial features of the HSI and LiDAR images. M. Dalla Mura et al. introduced the concept of the attribute profile (AP) [29] as a generalization of the morphological profile [30]. The AP extracts multi-level representations of an image through a sequential application of morphological attribute filters. To further improve the conceptual capability of the AP and the corresponding classification accuracies, extinction profiles (EPs) were proposed [31,32]. The EP has the following advantages. First, it can simultaneously remove insignificant details and preserve the geometrical characteristics of the input image. Second, it delivers better recognition performance than the traditional spatial-spectral AP feature extraction method.
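To make the profile idea concrete, the sketch below builds the simpler morphological profile that APs and EPs generalize, using openings and closings by reconstruction at increasing scales (scikit-image is an assumed dependency, and the radii are illustrative):

```python
import numpy as np
from skimage.morphology import disk, erosion, dilation, reconstruction

def opening_by_reconstruction(img, radius):
    # Erode, then geodesically reconstruct under the original image.
    seed = erosion(img, disk(radius))
    return reconstruction(seed, img, method='dilation')

def closing_by_reconstruction(img, radius):
    # Dilate, then geodesically reconstruct above the original image.
    seed = dilation(img, disk(radius))
    return reconstruction(seed, img, method='erosion')

def morphological_profile(img, radii=(1, 2, 4)):
    # Multi-level representation: openings, the image itself, and closings.
    layers = [opening_by_reconstruction(img, r) for r in radii]
    layers.append(img)
    layers += [closing_by_reconstruction(img, r) for r in radii]
    return np.stack(layers, axis=-1)  # (H, W, 2 * len(radii) + 1)
```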
For fusion, the spatial features of the HSI and LiDAR images as well as the spectral features of the HSI are stacked first. The stacked features carry the complete information. Then, to avoid high dimensionality and the risk of overfitting the training data, we need a fast and efficient method to remove redundant information from the HSI and LiDAR data. Traditional dimension reduction (DR) methods, for example, linear discriminant analysis (LDA) [33,34], local Fisher discriminant analysis (LFDA) [35], local discriminant embedding (LDE) [36], and nonparametric weighted feature extraction (NWFE) [37], are spectral-based DR methods. However, for the stacked features, vector-domain similarity is not sufficient to reveal the intrinsic relationships among samples: two samples with a small vector distance may lie at a large spatial pixel distance, and a projection based on the vector similarity metric alone may yield misleading features. A. Plaza et al. have shown that spatial contextual information is useful for increasing the classification accuracy of HSI [38]. Therefore, a spatial similarity-based method, local pixel neighborhood preserving embedding (LPNPE), is utilized [39]. LPNPE learns discriminant projections from the local spatial pixel neighborhood. However, it has the converse problem: two samples in the same spatial pixel neighborhood may have a sizeable vector distance. To solve this problem, the entropy rate superpixel (ERS) segmentation method is utilized [40]. A superpixel created by ERS contains pixels with small vector distances. Therefore, we intersect each superpixel with the defined local spatial pixel neighborhood to remove the pixels with large vector distances and obtain an ideal spatial neighborhood. Finally, the discriminant projections are learned from these ideal spatial neighborhoods. Recently, new clustering algorithms have been proposed [41,42]; they open the possibility of designing better fusion algorithms and will be discussed in future work.
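As an illustration of the neighborhood-intersection step, the sketch below assumes an ERS label map has already been computed (the window size and function name are hypothetical):

```python
import numpy as np

def ideal_neighborhood(superpixel_labels, row, col, half_window=2):
    """Intersect the local spatial window centered at (row, col) with the
    ERS superpixel containing that pixel, keeping only pixels that are
    both spatially close and spectrally consistent."""
    h, w = superpixel_labels.shape
    r0, r1 = max(0, row - half_window), min(h, row + half_window + 1)
    c0, c1 = max(0, col - half_window), min(w, col + half_window + 1)
    rows, cols = np.mgrid[r0:r1, c0:c1]
    keep = superpixel_labels[r0:r1, c0:c1] == superpixel_labels[row, col]
    return np.stack([rows[keep], cols[keep]], axis=1)  # (k, 2) pixel coords
```

The discriminant projections are then learned from these pruned neighborhoods rather than from the full spatial window.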
Our proposed fusion framework can be successfully applied to G-LiHT data. The G-LiHT airborne imager, created by the US National Aeronautics and Space Administration (NASA), is an airborne system that simultaneously collects LiDAR, hyperspectral, and thermal data [43]. The G-LiHT data contain a CHM, a DTM, a LiDAR point cloud, and a hyperspectral reflectance image of the same area at 1 m spatial resolution, providing new opportunities for the design of new hyperspectral and LiDAR fusion classification algorithms. C. Zhang et al. evaluated the G-LiHT data for mapping urban land-cover types [24]; however, they merely took advantage of the LiDAR CHM and the HSI and did not fully exploit the potential of the G-LiHT data.
In this paper, an innovative superpixel segmentation based local pixel neighborhood preserving embedding (SSLPNPE) method is proposed for the fusion of HSI and LiDAR data. In particular, the main contributions of this paper are as follows.
- (1)
This paper presents a novel fusion method, SSLPNPE. The proposed method has low computational complexity and significantly improves the classification accuracy of LiDAR and HSI fusion.
- (2)
A new workflow is proposed to calibrate the G-LiHT data. With this workflow, the proposed method can be applied in practical applications. Experimental results show that, for the G-LiHT data, the proposed SSLPNPE method is fast and effective in hyperspectral and LiDAR data fusion. The proposed workflow can be generalized to any fusion classification method and to G-LiHT data of any scene.
- (3)
This paper shows that processing the CHM and the DTM separately achieves higher classification accuracy than using the DSM. To the best of our knowledge, this is the first time in the remote sensing community that the CHM and the DTM are used separately instead of the DSM.
The structure of the paper is as follows.
Section 2 introduces the methodology.
Section 3 presents the data, the experimental setup, and the experimental results.
Section 4 provides the main concluding remarks.