Article

Retrieval of Forest Vertical Structure from PolInSAR Data by Machine Learning Using LIDAR-Derived Features

1 DTIS, ONERA, Université Paris Saclay, 91123 Palaiseau, France
2 Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109, USA
* Author to whom correspondence should be addressed.
Remote Sens. 2019, 11(4), 381; https://doi.org/10.3390/rs11040381
Submission received: 22 December 2018 / Revised: 30 January 2019 / Accepted: 7 February 2019 / Published: 13 February 2019
(This article belongs to the Section Forest Remote Sensing)

Abstract

This paper presents a machine learning based method to predict forest structure parameters from L-band polarimetric and interferometric synthetic aperture radar (PolInSAR) data acquired by the airborne UAVSAR system over the Réserve Faunique des Laurentides in Québec, Canada. The main objective of this paper is to show that relevant parameters of the PolInSAR coherence region can be used to invert forest structure indicators computed from the airborne LIDAR sensor Laser Vegetation and Ice Sensor (LVIS). The method relies on the shape of the observed generalized PolInSAR coherence region, which is related to the three-dimensional structure of the scene. In addition to parameters describing the coherence shape, we consider the impact of acquisition parameters such as the interferometric baseline, ground elevation and local surface slope. We use these parameters as input to a multilayer perceptron model to infer canopy features estimated from the LIDAR waveforms. The output features are canopy height, canopy cover and vertical profile class. Canopy height and canopy cover are estimated with normalized RMSEs of 13% and 15%, respectively. The vertical profile was divided into three distinct classes, predicted with 66% accuracy.

1. Introduction

Forests play a significant role in the global carbon cycle. Forest loss may be responsible for 20% of global carbon emissions to the atmosphere [1], and new methods to estimate forest volume at large scale are needed to improve carbon stock inventories. There have been significant advances in the use of airborne and space-borne LIDAR to map forest canopy structure at large scale and improve estimates of Above-Ground Biomass (AGB) [2,3,4,5,6]. Moreover, full waveform LIDAR as well as PolInSAR instruments were shown to be sensitive to vegetation cover [7]. The use of PolInSAR [8] to map forest canopy height has progressed rapidly over the last decades, in particular due to the Random Volume over Ground model (RVoG) [9], which assumes a two-layer forest with a homogeneous volume of fixed microwave extinction over flat ground. However, the physical information provided by LIDAR and PolInSAR instruments differs according to a number of key points:
  • Viewing geometry: the LIDAR has a near vertical illumination around nadir while the radar has a slanted side-view geometry.
  • Wavelength: the radar operates at wavelengths on the order of several centimeters and is sensitive to larger canopy components, with variations in moisture changing the dielectric constant of plants. The UAVSAR operates at a 23.8 cm wavelength, penetrating the forest canopy. The LVIS LIDAR operates at 1064 nm and is reflected by all intercepted surfaces, including leaves and branches.
Each instrument has its own limitations [10]. LIDAR datasets are generally sparse, requiring spatial interpolation to generate spatially explicit maps of forest structure and carbon stocks [11]. This interpolation can be supported by model relationships between forest structure, environmental variables and ancillary remote sensing observations [3,12] (e.g., MODIS, Landsat) to provide a realistic model of forest structure. However, these models may not account for disturbances and small scale variations in forest structure. Instead, direct and spatially continuous measurements of canopy structure are desirable. The PolInSAR technique offers a complementary measurement with spatially explicit estimates of canopy height [13,14,15].
In this paper, we use machine learning to re-synthesize LIDAR-profile features from the observed PolInSAR coherence shape. A large collection of machine learning tools enables discovery of non-linear relationships between observables and ancillary parameters, and such tools have already been used successfully to map forest structure, in particular height and above-ground biomass (AGB) [16]. The proposed methodology is timely, as several space-borne remote sensing instruments with the objective of mapping forest structure will be launched in the next few years: GEDI, a new spaceborne LIDAR; BIOMASS [17], a fully polarimetric radar mission operating at P-band; and NISAR, an L-band polarimetric radar [18] planned for 2021. Repeat-pass interferometric acquisitions by the BIOMASS mission will allow for PolInSAR observations. Another mission, Tandem-L, proposes an innovative interferometric and polarimetric radar mission at L-band for global measurement of forest biomass. We first introduce the PolInSAR observations and approach in Section 2 and describe the LIDAR feature extraction in Section 3. In Section 4, the machine learning algorithms are discussed. The remote sensing data sets collected over the Réserve Faunique des Laurentides (hereafter Laurentides) are introduced and the results presented in Section 5. We conclude in Section 6.

2. The PolInSAR Information

2.1. PolInSAR Parameters

A Synthetic Aperture Radar (SAR) image is obtained by a coherent measurement of all the received echoes, and the radar signal in each image pixel is represented as a complex value describing the amplitude and phase of the signal. The interferometric coherence γ expresses the similarity of two radar images observed from a similar geometry. It is obtained by the complex correlation of the signals $s_1$ and $s_2$:

$$\gamma = A e^{i\phi} = \frac{\langle s_1 s_2^* \rangle}{\sqrt{\langle s_1 s_1^* \rangle \langle s_2 s_2^* \rangle}},$$

with $\langle \cdot \rangle$ a spatial averaging function, and A and $\phi$ the coherence magnitude and phase difference, respectively.
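The estimation above is straightforward to implement. Below is a minimal sketch, assuming two co-registered single-look complex images `s1` and `s2` as numpy arrays (hypothetical inputs); the 21 × 21 window matches the averaging window used later in Section 2.3.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def coherence(s1, s2, win=21):
    """Complex interferometric coherence gamma = A * exp(i*phi)."""
    # Spatial averaging of a complex field, real and imaginary parts separately.
    avg = lambda x: uniform_filter(x.real, win) + 1j * uniform_filter(x.imag, win)
    num = avg(s1 * np.conj(s2))
    den = np.sqrt(avg(s1 * np.conj(s1)).real * avg(s2 * np.conj(s2)).real)
    return num / np.maximum(den, 1e-12)

# A = np.abs(gamma) gives the magnitude, phi = np.angle(gamma) the phase.
```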
Several factors have a significant impact on the observed coherence. While the interferometric coherence γ is related to forest height, the observation frequency and polarization determine its sensitivity to the various forest components ([9,13,14]). In addition, when performing repeat-pass interferometry (i.e., images acquired at different times), small changes in forest configuration due to wind or variations in moisture cause temporal decorrelation ([19,20,21]). The observed coherence $\gamma_{obs}$ can be expressed as the product of the volume $\gamma_v$, temporal $\gamma_t$, noise $\gamma_{SNR}$ and geometric $\gamma_g$ coherences:

$$\gamma_{obs} = \gamma_v \, \gamma_t \, \gamma_{SNR} \, \gamma_g.$$
The observed interferometric phase (i.e., phase center) represents the mean elevation of the microwave scatterers within the forest canopy. At high frequencies (e.g., for wavelengths under 5 cm), branches and leaves cause early scattering, elevating the phase center toward the tree top. At lower frequencies (e.g., P-band, with a 60 cm wavelength), the phase center is lower, as microwaves penetrate through leaves and small branches and scatter off large branches, trunks and the ground. The phase center elevation also depends on the relative contributions of the scattering mechanisms, which change with microwave polarization. Radar measurements in forests are generally modeled with three scattering mechanisms (Figure 1). The double-bounce mechanism occurs between the ground and trunks/branches; its phase center is closest to the ground, and it is enhanced in Hh − Vv observations. The single-bounce mechanism arises from direct microwave returns from the canopy components and the ground, is emphasized in Hh + Vv observations, and has the second-closest phase center to the ground. Finally, volume scattering results from signal depolarization after multiple bounces within the oriented components of the canopy volume and is often associated with the Hv polarization.
While radar polarimetry provides information about the dominant scattering mechanisms, radar interferometry measures the similarity between a pair of images and determines the phase center elevation. The latter strongly depends on the observed scattering mechanism. PolInSAR observations combine both types of measurements and can be represented as coherence magnitude and phase populating a region within the complex plane. In this study, we use this graphical representation with machine learning to model LIDAR profile features.

2.2. The Coherence Region

The generalized coherence $\gamma(\omega)$ was introduced by [8] and is expressed as:

$$\gamma(\omega) = \frac{\omega^\dagger \Omega_{12} \, \omega}{\sqrt{(\omega^\dagger T_{11} \, \omega)(\omega^\dagger T_{22} \, \omega)}},$$

where † denotes the complex conjugate transpose, and $\Omega_{12}$, $T_{11}$ and $T_{22}$ are the coherence matrices calculated from the polarimetric scattering vectors $k_1$ and $k_2$ associated with the images of the interferometric pair: $\Omega_{12} = \langle k_1 k_2^\dagger \rangle$, $T_{11} = \langle k_1 k_1^\dagger \rangle$ and $T_{22} = \langle k_2 k_2^\dagger \rangle$.
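As a hedged sketch, the generalized coherence defined above can be evaluated for any unit scattering mechanism w (a complex 3-vector) given the three matrices, here assumed to be precomputed numpy arrays:

```python
import numpy as np

def generalized_coherence(w, Omega12, T11, T22):
    """gamma(w) = (w^H Omega12 w) / sqrt((w^H T11 w)(w^H T22 w))."""
    num = np.conj(w) @ Omega12 @ w
    den = np.sqrt((np.conj(w) @ T11 @ w).real * (np.conj(w) @ T22 @ w).real)
    return num / den
```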
The argument of $\gamma(\omega)$ represents the interferometric phase for a specific mechanism ω, and the modulus represents the corresponding coherence level. The coherence region, defined by $\{\gamma(\omega), \; \omega \in \mathbb{C}^3 / \omega^\dagger \omega = 1\}$, represents the interferometric response of scatterers to polarization, providing information on the relative contribution of scattering mechanisms. Describing the shape of the coherence region with a few parameters is not possible with this definition of $\gamma(\omega)$. In practice, it is easier to derive mathematical properties of a simplified set defined by the following:
$$\tilde{\gamma} = \frac{\omega^\dagger \Omega_{12} \, \omega}{\omega^\dagger T \, \omega},$$
where the matrix T is defined as $T = (T_{11} + T_{22})/2$. This simplification is justified assuming that the matrices $T_{11}$ and $T_{22}$ are similar, representing the same target viewed under a similar geometry. Provided this assumption is valid, the arithmetic mean in the denominator is close to the geometric mean.
Since the arithmetic mean of non-negative real numbers is greater than or equal to their geometric mean, the modified coherence $|\tilde{\gamma}|$ is lower than the generalized coherence $|\gamma|$, and thus always lies between 0 and 1. Moreover, the argument is not modified by this change of definition: $|\tilde{\gamma}| \leq |\gamma|$ and $\arg\gamma = \arg\tilde{\gamma}$. With this approximation, the coherence region becomes the numerical range, or field of values, of the 3 × 3 matrix $A = T^{-1/2} \Omega_{12} T^{-1/2}$. This shape is complex but has many mathematical properties: it can be approximated by an ovoid shape, with a maximum of three local maxima [22]. The field of values of a 3 × 3 matrix corresponds to one of the following [23]:
  • a single point, or a line segment,
  • an ellipse,
  • a triangle, the convex hull of the three eigenvalues,
  • an ovular shape, as represented in Figure 2 on the left,
  • the union of a point and an ellipse; the point and the foci of the ellipse are eigenvalues of A (center panel of Figure 2),
  • an ovular shape with a flat portion parallel to the imaginary axis (right panel of Figure 2).
Generally, PolInSAR models the coherence region as an ellipse or segment, but in reality, we find the ovular shape to be most common.

2.3. Factors That Impact the Shape of the Coherence Region

The Random Volume over Ground model (RVoG) ([8,24]) predicts that, in the absence of noise and estimation errors, the coherence region is a straight line segment whose location and orientation depend on the canopy height, attenuation and ground elevation. However, in practice, this experimental shape expands into an ellipse or an ovular shape. In fact, the shape of the coherence region changes significantly with forest structural parameters and the relative contribution of the scattering mechanisms (Figure 1), and it has been successfully used for land cover classification [25]. We assume that this difference between the experimental coherence region and the line segment predicted by the RVoG model comes from assumptions made in the model, in particular from the simplified description of the forest. Thus, in Section 2.4, we propose to quantify the deviation of the coherence region from theoretical shapes to characterize the variations in canopy vertical profile.
At P-band, we have also pointed out in [26] that the coherence shape changes with tree species, along with other parameters such as tree height and height of ambiguity (Figure 3). In this figure, two examples of coherence shapes are given for similar acquisition geometries but different tree species: 9-m pine trees and 12-m spruce trees.

Geometry of Acquisition

The imaging geometry also impacts the coherence shape. First, the presence of a local slope changes the ratio between the path traveled by the microwave within the canopy and the path traveled to the ground, as illustrated in Figure 4. The slope can thus induce significant changes in the coherence shape and bias the canopy height estimation, especially in landscapes with hills, mountains and canyons.
Another geometrical aspect that impacts the coherence shape is the ambiguity height $h_a$, which corresponds to the maximum canopy height that can be inverted. As the interferometric phase is known modulo 2π, the computed height is known modulo $h_a$. Usually, the phase center is converted to an elevation by multiplication with $h_a/2\pi$. The ambiguity height can be written in terms of the perpendicular baseline $B_n$, which depends on the distance separating the two sensors, the incidence angle $\theta_{inc}$, the wavelength λ and the distance to the ground R:

$$h_a = \frac{\lambda \, R \, \sin(\theta_{inc})}{2 B_n}.$$

However, this expression is based on the assumption that the ground is flat. Ref. [27] proposes a more sophisticated formula for $h_a$ that takes into account $B_n$ and the local slope $\theta_{slope}$:

$$h_a = \frac{\lambda \, R}{2 B_n} \cdot \frac{\sin(\theta_{inc} - \theta_{slope})}{\cos(\theta_{slope})}.$$
This expression is used in the model to compensate for the imaging geometry, which varies across the landscape.
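The two expressions above reduce to one function, sketched here with all arguments in SI units and radians; setting theta_slope to zero recovers the flat-ground formula (the example values are hypothetical, not taken from the data set):

```python
import numpy as np

def ambiguity_height(wavelength, slant_range, b_perp, theta_inc, theta_slope=0.0):
    """Ambiguity height h_a, with an optional local slope correction."""
    return (wavelength * slant_range / (2.0 * b_perp)
            * np.sin(theta_inc - theta_slope) / np.cos(theta_slope))

# Example: L-band, 13 km slant range, 30 m perpendicular baseline, 45 deg incidence.
# ambiguity_height(0.238, 13000.0, 30.0, np.radians(45))  # ~36 m
```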
To estimate coherence from our PolInSAR data, we use a constant window size similar to the size of the LIDAR footprints, i.e., 21 × 21 UAVSAR pixels. While the size of the window could be increased to further reduce the impact of noise on observed coherence, we found this size to sufficiently reduce noise and provide a useful spatial resolution for heterogeneous landscapes.

2.4. Coherence Region Parameters

The PolInSAR features were selected to characterize the geometrical shape of the coherence region and its location within the unit circle of the coherence plane. These parameters should be as independent as possible from the imaging parameters. They are summarized in Table 1, depicted in Figure 5, and explained below.
First of all, we estimate the center of the shape by averaging 1000 random realizations of $\gamma(\omega)$ from Equation (4). The scattering mechanism vectors are generated following the parametrization:

$$\omega = \left[\cos\alpha, \;\; \sin\alpha \cos\beta \, e^{i\delta}, \;\; \sin\alpha \sin\beta \, e^{i\gamma}\right]^T,$$

where δ and γ are uniformly distributed random variables in [0, 2π], and α and β are uniformly distributed random variables in [0, π/2]. The center of the shape is found by averaging the 1000 random realizations of $\gamma(\omega)$; its modulus gives the first parameter $\rho_{mean}$.
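A minimal sketch of this Monte Carlo sampling follows, assuming $\Omega_{12}$ and $T = (T_{11}+T_{22})/2$ are given as numpy arrays; the mean of the sampled points is the shape center and its modulus is $\rho_{mean}$:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_coherence_region(Omega12, T, n=1000):
    # Draw n random scattering mechanisms following the parametrization above.
    alpha = rng.uniform(0.0, np.pi / 2, n)
    beta = rng.uniform(0.0, np.pi / 2, n)
    delta = rng.uniform(0.0, 2 * np.pi, n)
    gam = rng.uniform(0.0, 2 * np.pi, n)
    w = np.stack([np.cos(alpha),
                  np.sin(alpha) * np.cos(beta) * np.exp(1j * delta),
                  np.sin(alpha) * np.sin(beta) * np.exp(1j * gam)], axis=1)
    # gamma~(w) = (w^H Omega12 w) / (w^H T w) for each sampled mechanism.
    num = np.einsum('ni,ij,nj->n', np.conj(w), Omega12, w)
    den = np.einsum('ni,ij,nj->n', np.conj(w), T, w).real
    return num / den

# samples = sample_coherence_region(Omega12, T)
# rho_mean = np.abs(samples.mean())
```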
To estimate the angular position of this center, we assume the coherence shape rotates around the center of the unit circle with the ground interferometric phase. In order to obtain a parameter that is independent of ground elevation, the angular position of the shape center, $\theta_{mean}$, should be determined relative to the ground. The ground interferometric phase $\theta_{ground}$ is estimated as the intersection of the unit circle with a theoretical line segment (red dotted line in Figure 5) ([13]). The orientation of the line follows the main axis of the shape, computed through the eigenvector associated with the largest eigenvalue of the covariance matrix of all experimental coherence point coordinates in the Cartesian coordinate system. Once this line has been estimated, it is possible to find its intersection with the unit circle, $e^{i\theta_{ground}}$. Furthermore, $\theta_{mean}$ is converted into an elevation by $H = \theta_{mean} h_a / 2\pi$ to account for the observation geometry.
The third parameter is α, which is linked to the angular extent of the theoretical line coherence shape. α is found from the other intersection of the coherence shape's major axis with the unit circle. It is also converted into an elevation by $H = \alpha \, h_a / 2\pi$.
As most scattering models consider the coherence shape to be an ellipse, we are then interested in describing the ellipse closest to the experimental coherence values by its two principal axes. To achieve this, we compute the 2 × 2 covariance matrix between the x and y components of our N point samples, as in [25]. The directions of the minor and major axes correspond to the eigenvectors of this covariance matrix. Once these directions are found, and assuming the center of the ellipse corresponds to the barycenter, it is possible to find the intersections of the axes with the numerical range and to estimate the ellipse axis lengths as the average distance between these points and the center. $\lambda_{max}$ and $\lambda_{min}$ correspond to the major and minor axis lengths of the closest ellipse.
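As a sketch of the ellipse fit, the principal directions come from the 2 × 2 covariance of the sampled points; here the eigenvalue spreads stand in for the axis lengths that the text obtains geometrically from the intersections with the numerical range:

```python
import numpy as np

def ellipse_axes(samples):
    """Approximate minor/major axis spreads of the sampled coherence region."""
    pts = np.column_stack([samples.real, samples.imag])
    evals, evecs = np.linalg.eigh(np.cov(pts, rowvar=False))  # eigh sorts ascending
    lam_min, lam_max = np.sqrt(evals[0]), np.sqrt(evals[1])
    return lam_min, lam_max, evecs  # columns of evecs: minor, major directions
```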
Finally, we quantify the deviation of the experimental coherence shape from the closest theoretical ellipse through $R_p$ and $R_a$: respectively, the ratio between the perimeter of the coherence shape and the perimeter of the approximated ellipse, and the ratio between their areas.
In summary, we retain:
  • three parameters describing the position and orientation of the coherence shape: $\rho_{mean}$, $\theta_{mean} h_a / 2\pi$, and $\alpha \, h_a / 2\pi$,
  • two parameters describing the dimensions of the closest ellipse: $\lambda_{min}$ and $\lambda_{max}$,
  • two parameters, $R_p$ and $R_a$, describing the similarity between the observed coherence shape and the theoretical ellipse.
Now that we have selected parameters from the PolInSAR coherence region that are related to the forest structure, we need to do the same for the LIDAR data in order to propose the fusion scheme.

3. LIDAR Processing

The LIDAR is a laser altimeter measuring the distance from the instrument to a target. A laser pulse is transmitted, and the intensity and travel time of the reflected light are recorded to generate the waveform. The waveform depicts the vertical distribution of all the forest components intercepted within the LIDAR footprint, as shown in Figure 6, and is thus closely related to the vertical canopy height profile. The lowest notable peak (last reflected signal) is the ground return, with a shape similar to a Gaussian function over flat terrain that becomes wider in the presence of terrain topography and slopes. The upper part of the waveform (earliest reflected signal) with signal above the system noise is the canopy contribution. By separating the ground and canopy contributions to the recorded LIDAR waveform, we can interpret important components of the forest structure: canopy height, canopy cover and the complexity of the vertical distribution, as represented in Figure 7.

3.1. Tree Height

We implemented an algorithm to detect the ground and approximate the LIDAR waveform following the methods described in [28]. First, the ground peak is assumed to be a Gaussian function. Then, the remainder of the waveform f, defined over the profile height z, is modeled as a sum of Gaussians. This step is equivalent to finding the parameters $A_i$, $m_i$ and $\sigma_i$, which are respectively the amplitude, position and standard deviation of each peak found in the LIDAR waveform:

$$\left\{ A_i, m_i, \sigma_i \in \mathbb{R}^3 \;:\; \left\| f(z) - \sum_{i=1}^{n} A_i \, f_{m_i,\sigma_i}(z) \right\| < \epsilon \right\},$$

with

$$f_{m,\sigma}(z) = \exp\!\left( -\frac{(z-m)^2}{2\sigma^2} \right)$$
and ϵ a threshold below which the decomposition ends. To solve this problem, we used the robust and fast method for decomposing the LIDAR signal proposed by [29]. The method is recursive and described in Figure 8. The main principle is first to look for local maxima of the filtered signal, then to find Gaussian functions centered on these maxima that best fit the waveform. The iterative part then looks for local maxima of the difference between the actual waveform and the sum of the previously computed Gaussian functions. If the maximum is greater than a given value, a new Gaussian function is added.
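A compact sketch of this recursive decomposition follows; the function names, initial width and stopping fraction are our own assumptions, not values from [29]:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.signal import find_peaks

def gaussian(z, a, m, s):
    return a * np.exp(-((z - m) ** 2) / (2.0 * s ** 2))

def decompose_waveform(z, waveform, stop_frac=0.05, max_components=8):
    """Fit Gaussians to the largest residual maxima until they fall below a threshold."""
    components, model = [], np.zeros_like(waveform)
    for _ in range(max_components):
        residual = waveform - model
        peaks, _ = find_peaks(residual)
        if len(peaks) == 0 or residual[peaks].max() < stop_frac * waveform.max():
            break  # remaining maxima below the stopping value: decomposition ends
        i = peaks[np.argmax(residual[peaks])]
        p0 = [residual[i], z[i], 1.0]  # initial amplitude, center, width
        (a, m, s), _ = curve_fit(gaussian, z, residual, p0=p0, maxfev=2000)
        components.append((a, m, abs(s)))
        model = model + gaussian(z, a, m, s)
    return components  # list of (A_i, m_i, sigma_i)
```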
This decomposition has been used for land classification in the literature ([29,30,31]). The distance between the Gaussian centers is an indicator of the thickness of the various vegetation layers, the amplitude of each Gaussian gives information about the canopy cover of each layer, and the σ values represent how wide the layers are along the vertical axis. In our case, however, we only used the Gaussian decomposition to separate the ground from the canopy. Canopy height is computed by finding the height above the Gaussian-detected ground below which 99% of the total waveform energy was detected. We found this method to be far more robust than identifying the furthest local maximum as the ground. For instance, in Figure 9, it is difficult to identify the ground and the canopy contributions based on the waveform signal (red), whereas the Gaussian decomposition of the waveform distinguishes the two peaks (black). In this example, the first method results in an estimated height of 9 m, which is 6 m below the Gaussian decomposition estimate.
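The height retrieval itself is a one-liner once the ground is detected; a minimal sketch, assuming z is an ascending height axis and z_ground comes from the Gaussian-detected ground peak:

```python
import numpy as np

def canopy_height_rh99(z, waveform, z_ground):
    """Height above ground below which 99% of the waveform energy lies."""
    cum_energy = np.cumsum(waveform) / np.sum(waveform)
    z99 = z[np.searchsorted(cum_energy, 0.99)]
    return z99 - z_ground
```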

3.2. Canopy Cover

Canopy cover is an important characteristic of forest canopies and can be estimated as the proportion of ground obscured by trees. However, from the LIDAR perspective, the lowest layers of vegetation and the ground receive less laser energy, as they are partially obscured by the top layers. Given our interest in using PolInSAR, a more representative description of the true canopy height profile is desirable, as microwaves penetrate deeper into the canopy. Thus, a “Canopy Height Profile” (CHP) is derived that takes into account the occultation of light by upper layers ([32,33,34]). The mathematical steps to obtain a profile independent of canopy absorption are:

$$\mathrm{TransHP}(z) = \frac{\sum_{h=z_0}^{z} C(h)}{\sum_{h=z_0}^{z_{max}} C(h) + 2G(z)}$$

$$\mathrm{cCHP}(z) = \ln\left(1 - \mathrm{TransHP}(z)\right)$$

$$\mathrm{CHP}(z) = \frac{d\left(\mathrm{cCHP}(z)\right)}{dz},$$
with C and G the canopy and ground return signals. An example is shown in Figure 10.
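A minimal sketch of the three steps above on a discrete ascending height axis, assuming C is the canopy return per height bin and G the ground return energy (taken here as a scalar, our simplification):

```python
import numpy as np

def canopy_height_profile(z, C, G):
    trans_hp = np.cumsum(C) / (np.sum(C) + 2.0 * G)     # TransHP(z)
    cchp = np.log(1.0 - np.clip(trans_hp, 0.0, 0.999))  # cCHP(z), clipped for safety
    return np.gradient(cchp, z)                         # CHP(z) = d(cCHP)/dz
```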

3.3. Vertical Distribution Complexity

The CHP allows characterization of the different vegetation layers, whose relative positions (i.e., for a given top canopy height) can be significantly different for forest stands of different ages and species. We used a spectral clustering algorithm to define classes of vertical profiles and associate each LIDAR waveform with a class. Clustering was performed on the normalized CHP, which is independent of height and total power. In the case of the Laurentides forest, LIDAR profiles were found to be simple; we therefore asked the algorithm to classify CHPs into the three classes shown in Figure 11, which suffice to determine whether the canopy is concentrated toward the top, center or bottom. The X-axis is the normalized elevation axis, where 0 and 1 correspond respectively to the bottom and the top of the canopy layer. The Y-axis is the proportion of transmitted photons, normalized so that the total power of the signal is 1.
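A sketch of this clustering step, assuming `chps` is an (n_waveforms, n_bins) array of CHPs resampled on the normalized [0, 1] elevation axis and scaled to unit power; the use of scikit-learn and the affinity choice are our assumptions:

```python
from sklearn.cluster import SpectralClustering

clustering = SpectralClustering(n_clusters=3, affinity='nearest_neighbors',
                                random_state=0)
# profile_class = clustering.fit_predict(chps)  # labels in {0, 1, 2}
```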

4. Fusion with Machine Learning

Machine learning is a subfield of computer science that explores the study and construction of algorithms that can tune some or all of their parameters based on experience, i.e., on training data. Supervised learning algorithms make predictions based on a set of examples for which the target is provided; in other words, they infer a function from labeled training data. In the framework of our study, a large number of PolInSAR images are available that contiguously cover the region, while the current coverage of the full waveform LIDAR is limited. Our goal is to predict LIDAR labels (i.e., canopy height and cover, and CHP class) from PolInSAR descriptors using supervised machine learning. Supervised machine learning algorithms operate in two different modes: training and prediction. Training generates a model based on discovered relationships between LIDAR-derived labels and collocated PolInSAR input parameters. The trained model is then applied with PolInSAR inputs to predict labels. In this study, we use a multilayer perceptron for regression of the continuous labels (i.e., canopy height and cover) and random forests for classification of the CHP profiles.

4.1. Neural Networks

A neural network is an interconnected group of nodes, where the state of each node depends on the states of the nodes in the previous layer. The reasoning is that multiple nodes can collectively gain insight into a classification problem that an individual node cannot. Neural networks consist of multiple hidden layers of neurons, as shown in Figure 12. In this figure, each circular node represents an artificial neuron, with an arrow representing a connection from the output of one neuron to the input of another. The neurons i are the inputs and the neurons o are the outputs. There can be numerous hidden layers (neurons h), allowing the creation of a more complex model fitted to a challenging problem. The drawback of adding hidden layers is the increase in computational cost; it can also generate a model that overfits the training set and fails to generalize.
The relationship between the value of one neuron and those of the previous layer can be expressed as $h_1 = f(i_1 w_1 + i_2 w_2 + b_1)$, with f an activation function such as softmax, sigmoid or the Rectified Linear Unit (ReLU). The goal of the training step is to find the values of the weights $w_i$ and biases $b_i$ that make the model fit the training set. To approximate a solution to this complex problem, a cost function (error function) is defined and minimized through a back-propagation algorithm. When the solution for the weights has converged, their values are fixed and used for testing.
Figure 13 gives a schematic representation of a multilayer perceptron where the input is a 7-dimensional vector defined from the PolInSAR coherence region, and the outputs are the LIDAR parameters (the number of outputs depends on the target dimension).
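A minimal sketch of this regression set-up: seven coherence-shape features in, one LIDAR label out. The two 50-neuron hidden layers follow Section 5.1; the use of scikit-learn and the input scaling step are our assumptions:

```python
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(50, 50),
                                   activation='relu', max_iter=2000,
                                   random_state=0))
# model.fit(X_train, y_train)   # X_train: (n_samples, 7) PolInSAR features
# rh100_pred = model.predict(X_test)
```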
We chose a perceptron algorithm for regression for the following reasons:
  • Perceptrons are easy to implement, run fast with only a few tuning parameters and also provide a classification score.
  • The mapping of PolInSAR parameters to LIDAR labels cannot be represented as a simple function. Although SVMs have been used in similar approaches with encouraging results [35], we obtained higher accuracy with neural networks.
  • A significant practical advantage of a perceptron over SVMs is that perceptrons can be trained on-line, with their weights updated as new examples are provided. Thus, perceptrons are well adapted in the context of constant renewal of remote sensing data.

4.2. Random Forests

A Random Forest classifier is an ensemble learning method that fits a number of decision tree classifiers trained on various subsets of the full training set. A decision tree is a simple representation for classifying samples, formed of nodes, branches and leaves, as shown in the example in Figure 14. Each node tests a condition on one or several attributes; the binary decision leads to two new branches. The successive nodes (i.e., decisions) lead to leaves, where class labels are assigned.
A decision tree is trained recursively, splitting branches until the accuracy of the prediction no longer increases significantly. Although very simple to understand and interpret, the decision tree is limited in overall accuracy: when the number of nodes is not limited, the classifier tends to overfit the training dataset. To overcome these disadvantages, Random Forests average the results of numerous decision trees trained on different subsamples of the original training set. Additionally, a random subset of features is selected for each tree so that strong predictors do not correlate the individual trees with each other [36]. Random Forests are robust to outliers and overfitting, and generally rely on two tuning parameters: the number of trees and the number of features to be considered at each node. Both parameters are optimized in the estimation process to improve performance, as in the sketch below.
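A hedged sketch of the classifier with its two tuning parameters optimized by cross-validated grid search; the search ranges and the use of scikit-learn are our assumptions:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid={'n_estimators': [100, 300, 500],
                                  'max_features': [1, 2, 3]},
                      cv=5)
# search.fit(X_train, chp_class)  # X_train: (n_samples, 7) PolInSAR features
# best_rf = search.best_estimator_
```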

4.3. The Data Set

We use fully polarimetric airborne L-band radar data from the UAVSAR system, with interferometry acquired in repeat-pass mode. The selected test site is located around geographic coordinates 47.251°N, 71.354°W. The UAVSAR slant range images have a resolution of about 0.6 m in azimuth and 1.6 m in range. The LIDAR waveforms were collected with about one point every 20 m, and height estimates are interpolated to fit the radar image, as shown in Figure 15.
To evaluate the performance of the method, we selected a site with both LIDAR and radar data, as well as field measurements. The site exhibits homogeneously distributed heights, without lakes or clear cuts. The selected area is highlighted in Figure 15.

4.4. The Training Set

The training set is composed of 1000 randomly selected points within the test image of 4000 × 1400 pixels. The seven PolInSAR parameters were extracted along with the LIDAR-derived labels (tree height, canopy cover and CHP class). To assess our selection of PolInSAR coherence shape parameters, we compare the results using the seven parameters to those using simplified decompositions of the PolInSAR images. To this aim, we analyze four new training sets, summarized in Table 2. The first two are the coherences expressed either in polar ($\rho$ and $\phi$) or Cartesian ($\gamma_x$ and $\gamma_y$) coordinates, in the lexicographic basis (Hh, Hv, Vv); the vertical wavenumber $k_z = 2\pi / h_a$ is added as a seventh parameter to take into account the ambiguity height. The last two feature sets are the eigenvalues of the matrix $A = T^{-1/2} \Omega_{12} T^{-1/2}$ ($\lambda_1$, $\lambda_2$, $\lambda_3$, in polar and Cartesian coordinates), again with $k_z$ as the seventh parameter. These eigenvalues are represented in Figure 2.

5. Results

In this section, the results of the prediction of canopy height, cover and CHP class are presented.

5.1. Tree Height

The prediction of the canopy top height was done with a neural network with two hidden layers of 50 neurons each. Many alternative network structures, of different sizes and depths, were tested; training a deeper neural network did not improve results while the computational time increased significantly, so we kept these parameters as the best compromise. The resulting canopy height map is given in Figure 16. The root mean square error (RMSE) of the predicted canopy height is 3.2 m.
The influence of the descriptor sets on the learning scheme was tested; the RMSE achieved is shown in Table 3. The alternative descriptor sets almost always give a larger RMSE. Among them, only the set of eigenvalues of A in polar coordinates is promising, with an RMSE of 3.5 m. This can be explained by the fact that these three eigenvalues partially describe the coherence set and have a physical meaning in polar coordinates: the phase can be associated with height, and the absolute value can be interpreted as coherence. The eigenvalues are also better distributed over the coherence set than the coherences in the lexicographic polarization basis. This result suggests that the network is more efficient when it learns on parameters that are meaningful.
The histograms of the estimated height distributions in Figure 17 support the RMSE results: the one calculated on our set of seven parameters is by far the closest to the reference histogram.
The distribution of our estimates, shown in dark green, shows that the range of predicted values is smaller than the reference range. This is a known consequence of automatic regression methods, which tend to pull predicted values toward the mean to minimize the error score.
We compare our results to those obtained with a conventional PolInSAR inversion based on the RVoG model and taking into account temporal decorrelation [37]. In Figure 18, a scatter plot shows the canopy height estimated with the coherence shape against the LIDAR-derived canopy height, taken as the ground truth. The plot highlights a slight underestimation of the largest heights.
In Figure 19, a similar scatter plot is shown for the PolInSAR inversion [37] versus the LIDAR-derived height. The PolInSAR RMSE over our test site is about 4.9 m, higher than that of our coherence shape method. However, the RMSE of the PolInSAR method estimated over the entire data set lies between 2.3 and 3.4 m. Results are slightly overestimated for small trees and underestimated over taller forests.
Finally, Figure 20 shows the scatter plot of canopy height estimated by the PolInSAR and coherence shape methods; the results are similar.

5.2. Canopy Cover

A perceptron with the same structure as for RH100 was used to learn the canopy cover. The normalized RMSE between the predicted cover and the one calculated by LVIS is 13%. Note that in this case, the main difficulty lies in the differences in resolution between the learning set (radar) and the predicted output (LIDAR). Once again, the results in Table 4 show that the two sets of parameters corresponding to the eigenvalues in polar coordinates and our optimized set of seven geometrical parameters lead to the best scores.
To our knowledge, there is no similar work in the literature attempting to estimate canopy cover using PolInSAR data, so it is difficult to compare with existing results. To get an idea of our performance, we converted these regression results into a binary classification between bare ground and ground covered with vegetation, by simple thresholding of the canopy cover: only values greater than 10% are retained as forest. Our prediction score between the LIDAR and PolInSAR maps, shown in Figure 21, is 83%, while scores can reach 90% in the literature [38].
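This binary comparison reduces to a threshold and an agreement count; a minimal sketch with hypothetical array names:

```python
import numpy as np

def forest_agreement(cover_pred, cover_ref, threshold=0.10):
    """Fraction of pixels where both cover maps agree on forest / non-forest."""
    return np.mean((cover_pred > threshold) == (cover_ref > threshold))

# forest_agreement(cover_polinsar, cover_lvis)  # ~0.83 reported above
```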
The predicted canopy cover and its comparison with the reference data are represented in Figure 22.

5.3. CHP Class

The Random Forest classifier was used to classify the image pixels into the three vertical profiles represented in Figure 11. The classification results are illustrated in Figure 23. The classification scores are shown in Table 5 and should be read as follows: for instance, the value 0.085 at coordinate (1,2) means that, 8.5 times out of 100, the classifier predicted class 2 while the pixel truly belongs to class 1. Likewise, 18.5 times out of 100, our algorithm predicts class 3 when the true class is 2.
The global F1 score of our classification is 0.66. The F1 score measures the accuracy of a test, taking into account the precision, i.e., the fraction of retrieved instances that are relevant (positive predictive value), and the recall, i.e., the fraction of relevant instances that are retrieved. The results of the classification based on the different training sets are shown in Table 6, with the best score achieved using our set of seven parameters.
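For reference, the F1 score combines the two quantities as their harmonic mean:

$$F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}},$$

computed here per class and averaged over the three classes (the averaging scheme is our assumption, as it is not stated above).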

5.4. Difficulties and Limits

5.4.1. Concerning RH100

The analysis of the results obtained on the Laurentides shows that the main sources of failure are related to the presence of strong relief under the canopy, which has significant consequences on both the LIDAR and SAR signals. On the SAR signal, it strongly modifies the polarimetric responses, which are very sensitive to orientation phenomena and in particular to slopes; the double-bounce mechanism, in particular, may be considerably weaker in the presence of relief. On the LIDAR signal, the slope mixes together the responses of the ground and the vegetation and also widens the profile in height, as depicted in Figure 24. As a result, the extracted canopy height is often wrong, and if learning is achieved on erroneous labels, the performance of the algorithm is reduced. So far, we have not found methods to overcome this effect, which we believe is the most important limitation of our learning method.
Finally, there is a lack of diversity in some areas, and the training is therefore biased. For instance, if we select an image where most trees are 20-m-high pines, the algorithm, in its search to minimize its error rate, will tend to predict a 20-m-high tree for whatever input it receives.

5.4.2. Concerning Canopy Cover

Canopy cover is one of the parameters whose correspondence between LIDAR and radar data is the least obvious. As mentioned above, the difference in illumination geometry is the main issue, along with the differences in spatial resolution. Nevertheless, there is a significant correlation between the two datasets.

5.4.3. Concerning Vegetation Class

The difficulty in the classification of the profiles comes mainly from the fact that the diversity of profiles present in this forest is limited. The method should be strengthened in the future on forests with more complex structures, such as tropical forests.

5.5. Upscaling

The ultimate goal of the upscaling method is to extend the LIDAR-derived estimates over the full radar coverage. In our test cases, the learning data points were selected across the entire test area. In this section, we restrict the learning samples to smaller portions of the image and export the RH100 prediction to larger and larger areas. The training is always done with 1000 points, selected randomly within the training portion of the image. For our test, we evaluate the predicted RH100 when learning is done on 100%, 50%, 25% and 10% of the area. These subfields are represented in Figure 25.
The different RMSE values are grouped in Table 7. Even when restricting the areas from which learning samples can be drawn, the scores remain stable, provided the area contains sufficiently varied samples.

6. Summary and Conclusions

This paper aimed to predict vegetation parameters from PolInSAR by learning the inference relation between L-band PolInSAR data and LIDAR-derived labels describing these parameters. As such, this method could be used to map forests at large scale once PolInSAR data become available. Seven parameters describing the observed coherence region were proposed to characterize the PolInSAR information. We focused on three canopy descriptors: canopy height, canopy cover and CHP class. We used perceptrons to associate the seven PolInSAR features with canopy height and cover, and random forests to associate them with a vertical distribution class. Our choice of parameters was further supported by comparison with models trained on more classical parameters, which led to less effective results. The coherence shape inversion produced uncertainty similar to that of the PolInSAR RVoG inversion. We believe the main sources of error were related to the presence of high relief under the vegetation. The height estimation error was 3.2 m RMSE for forest stands as tall as 35 m, canopy cover was estimated with a normalized RMSE of 14%, and the classification score between the three CHP classes is 66%.
We believe the additional parameters describing forest structure that were predicted in this paper may improve estimates of AGB stocks. However, it would be useful to investigate the performance of the coherence shape method in different forest types.

Author Contributions

Conceptualization, G.B., M.S. and E.C.-K.; methodology, G.B. and M.S.; software, G.B.; formal analysis on machine learning, A.B. and G.B.; validation, G.B.; writing—original draft preparation, G.B.; writing—review and editing, E.C.-K. and M.S.; supervision, E.C.-K.

Funding

The first author was supported by a PhD grant co-funded by Onera and TOTAL. This work was partly performed at the Jet Propulsion Laboratory, California Institute of Technology. The APC was funded by Onera, as part of the MEDUSA Research project.

Acknowledgments

Guillaume Brigot gratefully acknowledges JPL for the internship opportunity, as well as for providing UAVSAR and LVIS data. We would also like to thank the personnel of the Foret Montmorency. Finally, we are very grateful for the contribution of Michael Denbina for the comparison with PolInSAR results.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Van der Werf, G.; Morton, D.; DeFries, R.; Olivier, J.; Kasibhatla, P.; Jackson, R.; Collatz, G.; Randerson, J. CO2 emissions from forest loss. Nat. Geosci. 2009, 2, 737–738.
  2. Lefsky, M.A.; Cohen, W.B.; Parker, G.G.; Harding, D.J. LIDAR remote sensing for ecosystem studies: LIDAR, an emerging remote sensing technology that directly measures the three-dimensional distribution of plant canopies, can accurately estimate vegetation structural attributes and should be of particular interest to forest, landscape, and global ecologists. BioScience 2002, 52, 19–30.
  3. Simard, M.; Pinto, N.; Fisher, J.B.; Baccini, A. Mapping forest canopy height globally with spaceborne LIDAR. J. Geophys. Res. Biogeosci. 2011, 116.
  4. Hyde, P.; Dubayah, R.; Walker, W.; Blair, J.B.; Hofton, M.; Hunsaker, C. Mapping forest structure for wildlife habitat analysis using multi-sensor (LIDAR, SAR/InSAR, ETM+, Quickbird) synergy. Remote Sens. Environ. 2006, 102, 63–73.
  5. Saatchi, S.S.; Harris, N.L.; Brown, S.; Lefsky, M.; Mitchard, E.T.; Salas, W.; Zutta, B.R.; Buermann, W.; Lewis, S.L.; Hagen, S.; et al. Benchmark map of forest carbon stocks in tropical regions across three continents. Proc. Natl. Acad. Sci. USA 2011, 108, 9899–9904.
  6. Baccini, A.; Laporte, N.; Goetz, S.; Sun, M.; Dong, H. A first map of tropical Africa’s above-ground biomass derived from satellite imagery. Environ. Res. Lett. 2008, 3, 045011.
  7. Sun, G.; Ranson, K.J.; Guo, Z.; Zhang, Z.; Montesano, P.; Kimes, D. Forest biomass mapping from LIDAR and radar synergies. Remote Sens. Environ. 2011, 115, 2906–2916.
  8. Cloude, S.R.; Papathanassiou, K.P. Polarimetric SAR interferometry. IEEE Trans. Geosci. Remote Sens. 1998, 36, 1551–1565.
  9. Neumann, M.; Ferro-Famil, L.; Reigber, A. Estimation of forest structure, ground, and canopy layer characteristics from multibaseline polarimetric interferometric SAR data. IEEE Trans. Geosci. Remote Sens. 2010, 48, 1086–1104.
  10. Montesano, P.; Nelson, R.; Dubayah, R.; Sun, G.; Cook, B.D.; Ranson, K.; Næsset, E.; Kharuk, V. The uncertainty of biomass estimates from LIDAR and SAR across a boreal forest structure gradient. Remote Sens. Environ. 2014, 154, 398–407.
  11. Wulder, M.A.; White, J.C.; Nelson, R.F.; Næsset, E.; Ørka, H.O.; Coops, N.C.; Hilker, T.; Bater, C.W.; Gobakken, T. LIDAR sampling for large-area forest characterization: A review. Remote Sens. Environ. 2012, 121, 196–209.
  12. Lim, K.; Treitz, P.; Wulder, M.; St-Onge, B.; Flood, M. LIDAR remote sensing of forest structure. Prog. Phys. Geogr. 2003, 27, 88–106.
  13. Cloude, S.; Papathanassiou, K. Three-stage inversion process for polarimetric SAR interferometry. IEE Proc. Radar Sonar Navig. 2003, 150, 125–134.
  14. Garestier, F.; Dubois-Fernandez, P.C.; Papathanassiou, K.P. Pine forest height inversion using single-pass X-band PolInSAR data. IEEE Trans. Geosci. Remote Sens. 2008, 46, 59–68.
  15. Neumann, M.; Saatchi, S.; Ulander, L.; Fransson, J. Assessing performance of L- and P-band polarimetric interferometric SAR data in estimating boreal forest above-ground biomass. IEEE Trans. Geosci. Remote Sens. 2012, 50, 714–726.
  16. Simard, M. Remote Sensing on Land Surfaces. Available online: http://LIDARradar.jpl.nasa.gov/ (accessed on 30 January 2019).
  17. Le Toan, T.; Quegan, S.; Davidson, M.; Balzter, H.; Paillou, P.; Papathanassiou, K.; Plummer, S.; Rocca, F.; Saatchi, S.; Shugart, H.; et al. The BIOMASS mission: Mapping global forest biomass to better understand the terrestrial carbon cycle. Remote Sens. Environ. 2011, 115, 2850–2860.
  18. Alvarez-Salazar, O.; Hatch, S.; Rocca, J.; Rosen, P.; Shaffer, S.; Shen, Y.; Sweetser, T.; Xaypraseuth, P. Mission design for NISAR repeat-pass Interferometric SAR. In Proceedings of the SPIE 9241, Sensors, Systems, and Next-Generation Satellites XVIII, Amsterdam, The Netherlands, 11 November 2014; p. 92410C.
  19. Simard, M.; Hensley, S.; Lavalle, M.; Dubayah, R.; Pinto, N.; Hofton, M. An empirical assessment of temporal decorrelation using the uninhabited aerial vehicle synthetic aperture radar over forested landscapes. Remote Sens. 2012, 4, 975–986.
  20. Lavalle, M.; Simard, M.; Hensley, S. A temporal decorrelation model for polarimetric radar interferometers. IEEE Trans. Geosci. Remote Sens. 2012, 50, 2880–2888.
  21. Lavalle, M.; Simard, M.; Pottier, E.; Solimini, D. PolInSAR forestry applications improved by modeling height-dependent temporal decorrelation. In Proceedings of the 2010 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Honolulu, HI, USA, 25–30 July 2010; pp. 4772–4775.
  22. Colin Koeniguer, E. Polarimetric Radar Images; Habilitation à diriger des recherches; Université Paris Sud: Orsay, France, 2014.
  23. Keeler, D.S.; Rodman, L.; Spitkovsky, I.M. The numerical range of 3 × 3 matrices. Linear Algebra Its Appl. 1997, 252, 115–139.
  24. Treuhaft, R.N.; Siqueira, P.R. Vertical structure of vegetated land surfaces from interferometric and polarimetric radar. Radio Sci. 2000, 35, 141–177.
  25. Neumann, M.; Reigber, A.; Ferro-Famil, L. Data classification based on PolInSAR coherence shapes. In Proceedings of the International Geoscience and Remote Sensing Symposium, Seoul, Korea, 25–29 July 2005; Volume 7, p. 4852.
  26. Brigot, G.; Koeniguer, E.; Simard, M.; Dupuis, X. Fusion of LIDAR and PolInSAR images for forest vertical structure retrieval. In Proceedings of the EUSAR 2016: 11th European Conference on Synthetic Aperture Radar, Hamburg, Germany, 6–9 June 2016; pp. 1–5.
  27. Denbina, M.; Simard, M. The effects of temporal decorrelation and topographic slope on forest height retrieval using airborne repeat-pass L-band polarimetric SAR interferometry. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 1745–1748.
  28. Chauve, A.; Vega, C.; Durrieu, S.; Bretar, F.; Allouis, T.; Pierrot Deseilligny, M.; Puech, W. Advanced full-waveform LIDAR data echo detection: Assessing quality of derived terrain and tree height models in an alpine coniferous forest. Int. J. Remote Sens. 2009, 30, 5211–5228.
  29. Mallet, C.; Bretar, F. Full-waveform topographic LIDAR: State-of-the-art. ISPRS J. Photogramm. Remote Sens. 2009, 64, 1–16.
  30. Reitberger, J.; Krzystek, P.; Stilla, U. Analysis of full waveform LIDAR data for the classification of deciduous and coniferous trees. Int. J. Remote Sens. 2008, 29, 1407–1431.
  31. Chehata, N.; Guo, L.; Mallet, C. Airborne LIDAR feature selection for urban classification using random forests. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2009, 39, W8.
  32. Lefsky, M.A.; Cohen, W.; Acker, S.; Parker, G.G.; Spies, T.; Harding, D. LIDAR remote sensing of the canopy structure and biophysical properties of Douglas-fir western hemlock forests. Remote Sens. Environ. 1999, 70, 339–361.
  33. Ni-Meister, W.; Jupp, D.L.; Dubayah, R. Modeling LIDAR waveforms in heterogeneous and discrete canopies. IEEE Trans. Geosci. Remote Sens. 2001, 39, 1943–1958.
  34. Brolly, M.; Simard, M.; Tang, H.; Dubayah, R.O.; Fisk, J.P. A LIDAR-Radar Framework to Assess the Impact of Vertical Forest Structure on Interferometric Coherence. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 5830–5841.
  35. Brigot, G.; Simard, M.; Koeniguer, E.; Taillandier, C. Prediction of forest canopy structure from PolInSAR dataset. In Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, 23–28 July 2017; pp. 4306–4309.
  36. Ho, T.K. A data complexity analysis of comparative advantages of decision forest constructors. Pattern Anal. Appl. 2002, 5, 102–112.
  37. Simard, M.; Denbina, M. An Assessment of Temporal Decorrelation Compensation Methods for Forest Canopy Height Estimation Using Airborne L-Band Same-Day Repeat-Pass Polarimetric SAR Interferometry. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 95–111.
  38. Schlund, M.; Scipal, K.; Davidson, M.W. Forest classification and impact of BIOMASS resolution on forest area and aboveground biomass estimation. Int. J. Appl. Earth Obs. Geoinf. 2017, 56, 65–76.
Figure 1. The three dominant microwave scattering mechanisms in forest canopies.
Figure 2. Examples of three types of coherence shape: ovular shape, union of an ellipse and one point, ovular shape with a flat portion.
Figure 3. Example of coherence regions associated with different vertical forest profiles: 9-m pine trees (top), 12-m spruce trees (bottom).
Figure 4. The influence of the slope on the height inversion.
Figure 5. Visualization of the main coherence shape parameters. $\lambda_1$, $\lambda_2$, $\lambda_3$ are the eigenvalues of matrix A (see Figure 2).
Figure 6. Observed LIDAR waveform versus forest vertical profile.
Figure 7. Scheme of the decomposition of the LIDAR waveform into three parameters.
Figure 8. Principle of the Gaussian decomposition algorithm (scheme by [29]).
Figure 9. Case where the Gaussian decomposition helps the height retrieval.
Figure 10. Canopy height profile decomposition.
Figure 11. Three classes of vertical distributions in the Laurentides forest obtained by spectral clustering.
Figure 12. Weights between nodes of a neural network.
Figure 13. Principle of the fusion with a perceptron algorithm.
Figure 14. Example of a decision tree in a flowchart-like structure.
Figure 15. Left: overview of the UAVSAR and LVIS datasets over the Laurentides. Right: study zone.
Figure 16. Top: RH100 (m) from LIDAR; Bottom: RH100 (m) computed with the perceptron from PolInSAR.
Figure 17. Histograms of estimated RH100 with different datasets.
Figure 18. Density plot of LIDAR RH100 vs. RH100 estimated by the neural network.
Figure 19. Density plot of LIDAR RH100 vs. RH100 estimated by an RVoG-based inversion method.
Figure 20. Density plot of RH100 estimated by an RVoG-based inversion method vs. our seven-parameter machine learning based method.
Figure 21. In green/brown: forest/non-forest zones. Top: reference; Bottom: classification done from the first computed Legendre coefficient with a perceptron.
Figure 22. Estimation of canopy cover. Top: classification by LIDAR; Bottom: classification achieved from SAR after learning.
Figure 23. Classification map of vertical profile classes. Top left: ground truth; Top right: prediction; Bottom: legend with examples of profiles corresponding to each class.
Figure 24. Effect of strong slope on LIDAR profiles.
Figure 25. Extent of the areas used to select the learning sets.
Table 1. Description of the selected geometrical parameters characterizing the PolInSAR coherence region.

| Feature | Description |
| $\rho_{mean}$ | Mean absolute coherence |
| $\theta_{mean} h_a / 2\pi$ | Height of the shape center |
| $\alpha \, h_a / 2\pi$ | Equivalent height for α |
| $\lambda_{min}$ | Spread of the shape along the minor axis |
| $\lambda_{max}$ | Spread of the shape along the major axis |
| $R_a$ | Area ratio between the shape and the closest ellipse (CE) |
| $R_p$ | Perimeter ratio between the shape and the CE |
Table 2. The different training sets.

| Lexico Set (Cartesian) | Lexico Set (Polar) | Eigenvalues Set (Cartesian) | Eigenvalues Set (Polar) | Our 7 Features Set |
| $\gamma_x$ Hh | $\rho$ Hh | $\lambda_{1X}$ | $|\lambda_1|$ | $\rho_{mean}$ |
| $\gamma_y$ Hh | $\phi$ Hh | $\lambda_{1Y}$ | $\mathrm{pha}(\lambda_1)$ | $\theta_{mean}/k_z$ |
| $\gamma_x$ Hv | $\rho$ Hv | $\lambda_{2X}$ | $|\lambda_2|$ | $\alpha/k_z$ |
| $\gamma_y$ Hv | $\phi$ Hv | $\lambda_{2Y}$ | $\mathrm{pha}(\lambda_2)$ | $\lambda_{max}$ |
| $\gamma_x$ Vv | $\rho$ Vv | $\lambda_{3X}$ | $|\lambda_3|$ | $\lambda_{min}$ |
| $\gamma_y$ Vv | $\phi$ Vv | $\lambda_{3Y}$ | $\mathrm{pha}(\lambda_3)$ | $R_p$ |
| $k_z$ | $k_z$ | $k_z$ | $k_z$ | $R_a$ |
Table 3. RMSE (m) for tree height estimation with different feature sets.

| Pauli Set (Cartesian) | Pauli Set (Polar) | Eigenvalues Set (Cartesian) | Eigenvalues Set (Polar) | Our 7 Features Set |
| 4.3 | 4.15 | 5.0 | 3.5 | 3.2 |
Table 4. Normalized RMSE (%) for canopy cover estimation with different feature sets.

| Pauli Set (Cartesian) | Pauli Set (Polar) | Eigenvalues Set (Cartesian) | Eigenvalues Set (Polar) | Our 7 Features Set |
| 51% | 42% | 25% | 13% | 14% |
Table 5. Confusion matrix of the vertical profile classes (counts, with the corresponding fractions in parentheses).

| Actual \ Predicted | Class 1 | Class 2 | Class 3 |
| Class 1 | 30 (0.725) | 5 (0.085) | 10 (0.190) |
| Class 2 | 15 (0.269) | 20 (0.545) | 10 (0.185) |
| Class 3 | 15 (0.281) | 5 (0.082) | 25 (0.637) |
Table 6. F1 score for the vertical distribution classification with different feature sets.

| Pauli Set (Cartesian) | Pauli Set (Polar) | Eigenvalues Set (Cartesian) | Eigenvalues Set (Polar) | Our 7 Features Set |
| 0.35 | 0.35 | 0.49 | 0.55 | 0.66 |
Table 7. RMSE (m) of the predicted RH100 as a function of the area extent used for learning data selection.

| Training Extent | 100% | 50% | 25% | 10% |
| RMSE RH100 (m) | 3.2 | 3.2 | 3.5 | 3.7 |
