Article

Comparative Analysis of Digital Elevation Model Generation Methods Based on Sparse Modeling

Takashi Fuse and Kazuki Imose

1 Department of Civil Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
2 Central Japan Railway Company, JR-Tokai Shinagawa Bldg.A, 2-1-85 Konan, Minato-ku, Tokyo 108-8204, Japan
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(11), 2714; https://doi.org/10.3390/rs15112714
Submission received: 27 April 2023 / Revised: 17 May 2023 / Accepted: 22 May 2023 / Published: 23 May 2023

Abstract

With the spread of aerial laser bathymetry (ALB), seafloor topographies are being measured more frequently. Nevertheless, data deficiencies occur owing to seawater conditions and other factors, and conventional interpolation methods often fail to produce digital elevation models (DEMs) with sufficient accuracy. If topographic features are considered as a basis, the DEM should be reproducible from a combination of such features. The purpose of this study is to develop new DEM generation methods based on sparse modeling. After reviewing the definitions of sparsity, we developed DEM generation methods based on a discrete cosine transform (DCT), DCT with elastic net, K-singular value decomposition (K-SVD), Fourier regularization, wavelet regularization, and total variation (TV) minimization, and conducted a comparative analysis. The developed methods were applied to artificially deficient DEM and ALB data, and their accuracy was evaluated. We confirmed that the K-SVD method is appropriate when the percentage of deficiencies is low, and that the TV minimization method is appropriate when the percentage of deficiencies is high. Based on these results, we also developed a method integrating both methods and achieved an RMSE of 0.128 m.

1. Introduction

Digital elevation models (DEMs) comprise the primary data structure for representing detailed topographic information and are used in various ways [1]. Aerial laser surveying and aerial laser bathymetry (ALB) have been widely used to create DEMs. For example, ALB data can be used for various purposes [2], such as sediment volume surveys of dams and rivers and seafloor surveys for offshore construction. However, in these measurements, cases occur where data are missing owing to topographic surface conditions or obstructions [3]. In ALB, bathymetric performance can be degraded by various factors such as water quality and surface and bottom conditions. Moreover, the density of the laser measurement data in a water area is approximately 1/10 of that in a land area. Generally, interpolation is used to generate a DEM from deficient data. Currently, interpolation methods such as the nearest neighbor method, Kriging, and inverse distance weighting are used to generate DEMs [4]. However, their accuracy can degrade in regions with sparse data. When interpolation with guaranteed accuracy cannot be performed, re-measurement may become necessary owing to the low quality of the interpolation, which is a severe problem. Therefore, a high-accuracy method for interpolating DEMs is required.
It is worthwhile to consider the characteristics of the physical shape of the topography. Topographic quantities such as slopes, curvatures, and ground clearances have been commonly used to describe topographic features. These topographic features have been used to analyze the geomorphology of a target area by introducing knowledge of topographic formation mechanisms. In addition, many topographic classifications and representations have been created based on topographic features [5]. These results indicate that even limited topographic features can effectively represent the topography. If topographic features are used as a basis, a combination of such features should make it possible to reproduce the topography. However, in the context of topographic reconstruction based on limited features, it remains unclear which bases are generally effective. In addition, from the perspective of generalizability, it is desirable to use as few bases as possible. Sparse modeling is a method that satisfies these requirements: it can extract important bases by assuming sparsity even when the amount of data is limited. Owing to these characteristics, sparse modeling has achieved significant results in image restoration. For example, the K-singular value decomposition (K-SVD) method has attracted attention owing to its assumption of sparsity in the image space [6]. K-SVD uses a set of many bases (referred to as a dictionary) to represent an image and assumes that only a small number of bases from this dictionary are needed to represent the image. The details of K-SVD are described in Section 3.2.2. Lustig et al. used an interpolation method that assumes sparsity in the frequency space of the image [7]; here, the assumption is that an image can be represented using a small number of frequency patterns.
As described above, various definitions of sparsity exist in image restoration. Similarly, various types of sparsity can be assumed for topography, but it remains unclear which type is effective. Therefore, it is necessary to construct several generation methods based on sparsity for topography and to conduct a comparative analysis. Moreover, the assumption of sparsity in topography provides a perspective not found in conventional interpolation methods, as it interpolates by considering the spatial correlations of topographic features.
This study aims to develop such new DEM generation methods, which should enable efficient DEM generation and provide high-resolution information. To achieve this aim, this study constructs multiple DEM generation methods by considering various types of sparsity in sparse modeling and performs a comparative analysis. Specifically, the definitions of sparsity in topography are compared, and multiple topography interpolation methods corresponding to each sparsity definition are considered. Each method is then applied to artificially deficient data from an existing DEM to determine the topography interpolation performance as a function of the percentage of deficient data. Next, the applicability of each method is verified on actual data obtained from ALB measurements. Finally, we discuss the results and summarize the characteristics of each method.

2. Related Work

DEM generation can be classified into methods for generating DEMs from nodal data arranged in a grid of points and methods for generating DEMs from randomly arranged point data. In the former, many DEM generation methods assume self-similarity, namely, that the topography at a given point resembles that of the surrounding area. One popular type uses filters (e.g., median or nonlinear filters) while considering the spatial similarity [8]. However, the applicability of these methods has only been verified in cases where approximately 2% of the total data, at most, are missing. Other methods include spline interpolation [9], inverse filtering [10], and singular value decomposition [11]. Additionally, the "absolutely minimizing Lipschitz extension" has been proposed; this is an interpolation method for addressing missing regions or elevation discontinuity boundaries caused by sea levels or inland water surfaces in the DEM [12]. However, none of these methods have been validated for cases where the missing data are distributed over the entire area of interest.
The methods for generating DEMs from randomly arranged point data include weighted average and weighted first-order interpolation methods, in which the values around each point are approximated by a first-order plane [13]. Another generation method determines the surface by minimizing the second-order variance representing the smoothness of the surface [14]. Nevertheless, when data deficiencies exist, these approaches have limited applicability to cases with differences in point density.
As mentioned above, this study focuses on sparse modeling [15]. Sparse modeling was separately developed for sparse coding [16] and compressed sensing [17]. Sparse coding was developed mainly for signal processing, such as for images and audio, and is used for various applications such as noise reduction, interpolation, and classification [18]. The concept of sparse coding is that a natural signal is approximated by using a large set of bases (a dictionary) and combining only a small number of bases from this dictionary. In other words, sparse coding aims to find a sparse representation of the observed data using a given dictionary. Various algorithms have been proposed for this problem. One example is orthogonal matching pursuit (OMP) [19,20]. This algorithm is a greedy method that adds the best basis candidates one at a time rather than selecting the best basis from among all combinations [21]. The details of OMP are described in Section 3.2.1. "Basis Pursuit" is another example [22]; this method solves a relaxed problem in which the L1 norm is minimized instead of the L0 norm in the sparse representation. The "Least Absolute Shrinkage and Selection Operator" adds an L1-norm regularization term to the error function of the ordinary least squares method [23,24,25]. Overall, however, the accuracy of such sparse coding methods depends on how well a given dictionary fits the data.
Two methods are mainly used for determining dictionaries. The first method uses an existing dictionary. Examples include the discrete cosine transform (DCT), wavelet transform, curvelet transform, and bandelet transform. Starck et al. and Rubinstein et al. described the above dictionaries in detail, and we refer the reader to those studies for the details [26,27]. The second method involves learning a dictionary. In one example, two steps are alternated to perform the dictionary learning. First, sparse coding uses a given dictionary to identify sparse representations of the observed data. Second, the dictionary is updated by fixing the values of the sparse representation coefficients and positions of nonzero elements. Olshausen et al. proposed an algorithm using these two steps for dictionary learning [28,29]. Engan et al. improved on the high computational cost of that algorithm and proposed a more efficient algorithm called the “Method of Optimal Directions” [30]. This method requires the computation of inverse matrices to update the dictionary, making it relatively computationally expensive. Aharon et al. proposed the K-SVD method [6]. In this method, bases in the dictionary are updated one by one instead of updating the dictionary all at once [31]. The update is approximated by using SVD.
Compressed sensing, conversely, is a framework for reconstructing unknown data from a small number of incomplete observations [32,33]. It assumes that the data are sparse in the original space or in a space after some transformation. Compressed sensing uses this sparsity to enable reconstruction from a small number of samples (i.e., fewer than the sampling theorem would normally require). Compressed sensing is also based on random sampling. For example, consider the reconstruction of magnetic resonance images in k-space. If the observed data are sampled at equal intervals in k-space, the reconstruction results in a folded (aliased) image called a fold-over artifact [34]. As the original signal and its duplicate are indistinguishable, it is impossible to reconstruct the original image from the observation data. When random sampling is used, the resulting artifacts instead appear as Gaussian noise, and the incoherence of this noise makes compressed sensing possible. Compressed sensing was first introduced by Donoho et al. [35]. Since then, compressed sensing has been utilized in various fields, such as medicine and astronomy [7,36].

3. Methods

This section first summarizes DEM generation approaches using sparse modeling. Typical sparsity approaches include sparse coding, compressed sensing, and total variation (TV). We developed DEM generation methods according to the definitions of sparsity.

3.1. Definitions of Sparsity and Comparative Analysis Methods

Sparse coding uses a set of many bases (a dictionary) to represent the topography, under the assumption that only a few bases from this dictionary are needed. Here, the approach focuses on the local topography: when the coefficients are computed from the observed topography and the dictionary (i.e., sparse coding), the coefficients are assumed to be sparse. Sparsity is assumed such that any localized 2D plane cut from the topography can be represented by a linear sum of a small number of the many elements/bases comprising the topography (Figure 1). This approach focuses on spatial patterns and generates the topography from a small number of spatial patterns.
Compressed sensing assumes that the topography can be represented using a small number of patterns among various frequency patterns. It assumes sparsity in the frequency space of the topography, that is, only some coefficients are significant, while others are zero or very small. It assumes that a section of the topography can be considered as a wave and expressed as a linear sum of a small number of patterns among the many elements and underlying waves (frequency patterns) comprising the topographic section (Figure 2). The topography is generated from a small number of frequency patterns after the entire terrain is transformed into the frequency space.
The TV approach considers that the topography is smooth and that the differences between adjacent points are often tiny, with few areas of significant change. Sparsity is assumed for such topographic variations. This approach is based on the sparsity of the few sizeable topographic change points.
Based on Section 2 and the above definitions, the following six methods were comparatively analyzed in this study.
  • DCT method: This is a sparse coding framework that uses a redundant DCT dictionary (an existing general-purpose dictionary) instead of learning a dictionary. The DCT transforms a discrete sequence of signals into a combination of cosines with various frequencies and amplitudes, a technique derived from the discrete Fourier transform. The sparse coding is performed by using the dictionary to reproduce the topography.
  • DCT with elastic net method: This is another sparse coding framework. It also uses a redundant DCT dictionary (an existing general-purpose dictionary) instead of learning a dictionary. The sparse coding is performed using the dictionary to reproduce the topography. The sparsity assumption is relaxed during the sparse coding, and elastic net regression is used. This allows for expanding the range of the local topography.
  • K-SVD method: This is another sparse coding framework. The K-SVD method learns a dictionary from an observed deficient topography and performs sparse coding using the dictionary to reproduce the topography.
  • Fourier regularization method: This compressed sensing framework assumes the sparsity of the topography in the Fourier space and reconstructs the topography.
  • Wavelet regularization method: This compressed sensing framework assumes the sparsity of the topography in the wavelet space and reconstructs the topography.
  • TV minimization method: This method reproduces the topography by minimizing the sum of the differences (gradients) between adjacent points (i.e., the TV).

3.2. Digital Elevation Model Generation Method Using Sparse Coding

As noted above, two types of methods are based on the sparse coding framework: those that use existing dictionaries and those that learn dictionaries. In both types, the DEM is represented as a linear sum of a dictionary A and its sparse coefficients x.

3.2.1. Using Existing Dictionaries

First, the DEM data with defects are cut into n × n patches to obtain N patches as patch data. This is based on the assumption that, in each n × n patch, the observation y_i can be expressed as a linear sum of a dictionary A and its sparse coefficients x_i. The coefficients x_i are obtained from the known y_i and A by solving the optimization problem in Equation (1) (Figure 3).
$\hat{x}_i = \arg\min_{x_i} \frac{1}{2}\left\| y_i - MAx_i \right\|_2^2, \quad \text{s.t. } x_i \text{ is sparse}$ (1)
In the above, M is a mask matrix. The optimization problem is solved using OMP or elastic net regression, as described below. As there are deficiencies in y_i, the computation is performed only on the portions without deficiencies; specifically, the mask matrix M is introduced to align the number of rows in each matrix. The dictionary is a redundant DCT dictionary, that is, a standard dictionary used in image compression such as JPEG (Figure 4). From the obtained sparse coefficients x_i and the dictionary A, the topography of the corresponding patch can be reproduced and interpolated as ŷ_i using Equation (2).
$\hat{y}_i = A\hat{x}_i$ (2)
The value of each mesh in the DEM is obtained by averaging the values of the corresponding cells in all patches containing that mesh.
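As an illustrative sketch (not the authors' implementation), the following Python code shows how a redundant 2D DCT dictionary and the patch extraction and averaging steps could be realized; the patch size n = 8 and the dictionary width m = 11 are assumptions chosen for the example.

```python
import numpy as np

def redundant_dct_dictionary(n=8, m=11):
    """Build an overcomplete 2D DCT dictionary with m*m atoms for n*n patches (m > n)."""
    D1 = np.zeros((n, m))
    for k in range(m):
        atom = np.cos(np.arange(n) * k * np.pi / m)
        if k > 0:
            atom -= atom.mean()               # remove the DC component from non-constant atoms
        D1[:, k] = atom / np.linalg.norm(atom)
    return np.kron(D1, D1)                    # (n*n) x (m*m) dictionary A

def extract_patches(dem, n=8):
    """Slide an n x n window over the DEM and return flattened patches with their positions."""
    patches, positions = [], []
    for i in range(dem.shape[0] - n + 1):
        for j in range(dem.shape[1] - n + 1):
            patches.append(dem[i:i + n, j:j + n].ravel())
            positions.append((i, j))
    return np.array(patches), positions

def average_patches(reconstructed, positions, shape, n=8):
    """Average the per-patch estimates (Equation (2)) over all patches covering each mesh."""
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    for patch, (i, j) in zip(reconstructed, positions):
        acc[i:i + n, j:j + n] += patch.reshape(n, n)
        cnt[i:i + n, j:j + n] += 1
    return acc / np.maximum(cnt, 1)
```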
In OMP, Equation (1) is regarded as the following optimization problem; the solution is obtained using a greedy method.
$\hat{x}_i = \arg\min_{x_i} \frac{1}{2}\left\| y_i - MAx_i \right\|_2^2, \quad \text{s.t. } \left\| x_i \right\|_0 \le k_0$ (3)
where k_0 is a constant that defines the sparsity. First, x^0 = 0 and k = 0 are set as initial values, together with the initial residual r^0 = y_i, the sparsity k_0, and the initial solution support S^0 = ∅. Next, k is incremented by one, and the basis vector a_j from the dictionary A that maximizes the inner product a_j · r^k with the residual r^k is added to the support S^k. The tentative solution x^k is then updated by solving Equation (4), and the residual r^k is updated based on Equation (5). The above process is repeated k_0 times.
$x^k = \arg\min_{x_i} \frac{1}{2}\left\| y_i - Ax_i \right\|_2^2, \quad \text{s.t. } \mathrm{Support}(x_i) = S^k$ (4)
$r^k = y_i - Ax^k$ (5)
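A minimal OMP routine following Equations (3)–(5) is sketched below for illustration; it assumes unit-norm dictionary atoms, and the mask M is handled by passing only the rows of y_i and A that correspond to observed cells.

```python
import numpy as np

def omp(A, y, k0):
    """Greedy solution of min ||y - A x||_2^2 s.t. ||x||_0 <= k0 (Equations (3)-(5)).
    For a deficient patch, pass only the observed rows of y and A (the mask M)."""
    x = np.zeros(A.shape[1])
    residual = y.copy()
    support = []
    for _ in range(k0):
        correlations = np.abs(A.T @ residual)      # match each atom against the residual
        correlations[support] = -np.inf            # do not reselect chosen atoms
        support.append(int(np.argmax(correlations)))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)   # Equation (4)
        residual = y - A[:, support] @ coef                        # Equation (5)
    x[support] = coef
    return x
```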
Elastic net regression, in contrast, is considered as a linear regression problem that uses the L1 and L2 norms as regularization terms for Equation (1) [37,38].
$\hat{x}_i = \arg\min_{x_i} \frac{1}{2}\left\| y_i - MAx_i \right\|_2^2 + \lambda_1 \left\| x_i \right\|_1 + \lambda_2 \left\| x_i \right\|_2^2$ (6)
where λ1 and λ2 are the weights of the L1 and L2 regularization terms, respectively. As with L1 regularization, elastic net regression readily yields sparse solutions. With L1 regularization alone, however, when two correlated variables are present, only one of them tends to be selected. Elastic net regression, in contrast, makes it easy to select both variables simultaneously owing to the effect of the L2 regularization term; in other words, the sparsity constraint is slightly relaxed. As discussed in more detail below, the patch size must be increased when considerable deficiencies exist, and if the sparsity restrictions are excessively strict, the estimation may fail. Therefore, in this case, sparse coding is performed using elastic net regression as in Equation (6).
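For the relaxed problem in Equation (6), scikit-learn's ElasticNet could be used as sketched below; note that its (alpha, l1_ratio) parameterization differs from (λ1, λ2) by a scaling, and the hyperparameter values shown are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def elastic_net_code(A, y_with_nan, alpha=0.01, l1_ratio=0.9):
    """Sparse-code one patch under Equation (6); NaN entries of the patch act as the mask M."""
    observed = ~np.isnan(y_with_nan)
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio,
                       fit_intercept=False, max_iter=5000)
    model.fit(A[observed, :], y_with_nan[observed])   # fit only on observed cells
    return A @ model.coef_                            # reproduce the full patch (Equation (2))
```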

3.2.2. Using Dictionary Learning

In this study, we focus on the K-SVD method [6] as a dictionary-learning method. As in the previous section, we describe the K-SVD method in the context of DEM data with deficiencies.
First, the defective DEM data are cut into n × n patches to obtain N patches of data. The initial value k = 0 and the sparsity k_0 are set, and a redundant DCT dictionary is set as the initial dictionary A^0. For the dictionary learning, patches are randomly sampled from the N defective patches to create y^k. The sparse representations x^k are estimated from the data y^k and the dictionary A^k by solving Equation (7) using the OMP described above. The sum of the products of each basis and its corresponding sparse representation is expressed as shown in Equation (8); Figure 5 illustrates this sum of products. The bases of the dictionary are then updated one by one. As an example, let a_i be the basis to be updated. The product of this basis and its corresponding sparse representation can be expressed as shown in Equation (9) using the data y^k, the bases not being updated a_{j≠i}, and the corresponding sparse representations x_{j≠i}^T. The updated a_i x_i^T is obtained by a rank-1 (low-rank) approximation of the right-hand side of Equation (9) using singular value decomposition. Sequentially updating all of the bases in this way updates the dictionary.
$\hat{x}^k = \arg\min_{x^k} \frac{1}{2}\left\| y^k - MA^k x^k \right\|_2^2, \quad \text{s.t. } \left\| x^k \right\|_0 \le k_0$ (7)
$A^k x^k = \sum_j a_j x_j^T$ (8)
$a_i x_i^T = y^k - \sum_{j \ne i} a_j x_j^T$ (9)
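The dictionary update can be written compactly as an SVD-based rank-1 approximation of the residual in Equation (9). The sketch below is illustrative only: it omits the mask M (i.e., it assumes complete training patches).

```python
import numpy as np

def ksvd_dictionary_update(A, Y, X):
    """One K-SVD update sweep (Equations (8)-(9)). A: dictionary (d x m),
    Y: training patches (d x N), X: sparse codes (m x N)."""
    for i in range(A.shape[1]):
        users = np.nonzero(X[i, :])[0]          # patches that actually use atom a_i
        if users.size == 0:
            continue
        # residual of the used patches without the contribution of atom a_i
        E = Y[:, users] - A @ X[:, users] + np.outer(A[:, i], X[i, users])
        # rank-1 approximation of E updates both a_i and the row x_i^T
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        A[:, i] = U[:, 0]
        X[i, users] = s[0] * Vt[0, :]
    return A, X
```

In the full method, this update sweep alternates with the sparse coding step of Equation (7) until the dictionary converges or a fixed number of iterations is reached.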

3.3. Digital Elevation Model Generation Method Using Compressed Sensing

The Fourier and wavelet transforms [39] map the original signal to a frequency space. In compressed sensing, the data’s Fourier transform coefficients or wavelet transform coefficients are used as regularization terms to set up the optimization problem, as shown in Equation (10).
$\hat{x} = \arg\min_x \frac{1}{2}\left\| y - Mx \right\|_2^2 + \lambda \left\| \Psi x \right\|_1$ (10)
where Ψ represents the sparsifying transform of each method, and λ is the weight of the regularization term. To solve this optimization problem, we consider the more general optimization problem for non-smooth convex functions shown in Equation (11).
$\min_x F(x) = f(x) + g(x)$ (11)
where f(x) is a continuous and differentiable convex function, and g(x) is a continuous convex function that is not necessarily smooth. In this case, the solution x is obtained based on the proximity gradient method, as follows.
First, the initial value of the solution x^0 and the step width ρ are set. A gradient step that uses only the function f(x) is taken according to Equation (12). Next, the solution is updated using the proximity map prox_{λg}(r) according to Equation (13); this proximity map is given by Equation (14). The solution x̂ is obtained by iteratively applying Equations (12) and (13) until convergence. This scheme, which exploits the structure of the function g(x), is called the "Iterative Shrinkage Thresholding Algorithm" and is especially useful for L1 regularization.
$r^{k+1} = x^k - \rho \nabla f(x^k)$ (12)
$x^{k+1} = \mathrm{prox}_{\lambda g}(r^{k+1})$ (13)
$\mathrm{prox}_{\lambda g}(r) = \arg\min_x \frac{1}{2}\left\| x - r \right\|_2^2 + \lambda g(x)$ (14)
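The generic iteration in Equations (12)–(14) can be expressed in a few lines; the following sketch is an illustration in which grad_f and prox_g are supplied by the specific method (e.g., the soft threshold of Equation (18) for L1 regularization, which yields ISTA).

```python
import numpy as np

def proximal_gradient(grad_f, prox_g, x0, rho, n_iter=200):
    """Proximity gradient iteration: alternate a gradient step on f (Equation (12))
    with the proximity map of g (Equation (13)) for a fixed number of iterations."""
    x = x0.copy()
    for _ in range(n_iter):
        r = x - rho * grad_f(x)   # Equation (12)
        x = prox_g(r)             # Equation (13)
    return x
```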

3.3.1. Compressed Sensing Using Fourier Transform

Assuming sparsity in the Fourier space in Equation (11) results in Equations (15) and (16).
$f(x) = \frac{1}{2}\left\| y - Mx \right\|_2^2$ (15)
$g(x) = \lambda \left\| Fx \right\|_1$ (16)
where F is the operator performing the Fourier transform. When the regularization term is expressed as shown in Equation (16), the proximity map becomes Equation (17).
$\mathrm{prox}_{\lambda g}(r) = F^T \mathrm{soft}(Fr, \lambda)$ (17)
where the function soft(y, λ) is called the "soft threshold function" and is defined according to Equation (18).
$\mathrm{soft}(y, \lambda) = \begin{cases} y - \lambda & \lambda < y \\ 0 & -\lambda \le y \le \lambda \\ y + \lambda & y < -\lambda \end{cases}$ (18)
The solution can be obtained using the proximity gradient method as follows. First, the initial value of the solution x^0, the step width ρ, and the threshold value λ are set. The solution is updated with the gradient step in Equation (19) and then by proximity mapping using Equation (20). Finally, the solution is updated only for the deficient meshes using Equation (21), where j refers to the j-th mesh and y represents the defective DEM data. These steps are repeated until convergence.
$r^{k+1} = x^k + \rho M^T (y - Mx^k)$ (19)
$z^{k+1} = F^T \mathrm{soft}(Fr^{k+1}, \lambda)$ (20)
$x^{k+1}[j] = \begin{cases} z^{k+1}[j] & y[j] = \mathrm{NaN} \\ y[j] & y[j] \ne \mathrm{NaN} \end{cases}$ (21)
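A minimal realization of Equations (19)–(21) is sketched below; thresholding the complex Fourier coefficients by magnitude is used as the natural extension of the real-valued soft threshold in Equation (18), and the values of lam, rho, and n_iter are illustrative assumptions.

```python
import numpy as np

def soft_threshold(c, lam):
    """Soft threshold applied to (possibly complex) coefficient magnitudes."""
    mag = np.abs(c)
    return np.maximum(mag - lam, 0.0) / np.maximum(mag, 1e-12) * c

def fourier_cs_fill(dem_with_nan, lam=0.05, rho=1.0, n_iter=300):
    """DEM completion with Fourier-domain sparsity (Equations (19)-(21))."""
    observed = ~np.isnan(dem_with_nan)
    y = np.where(observed, dem_with_nan, 0.0)
    x = y.copy()
    for _ in range(n_iter):
        r = x + rho * observed * (y - x)                                 # Equation (19)
        z = np.real(np.fft.ifft2(soft_threshold(np.fft.fft2(r), lam)))  # Equation (20)
        x = np.where(observed, y, z)            # keep observed meshes, Equation (21)
    return x
```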

3.3.2. Compressed Sensing Using Wavelet Transform

When assuming sparsity in the wavelet space, Equations (16) and (20) can be replaced by Equations (22) and (23), respectively. A similar algorithm can be used to obtain the solution.
$g(x) = \lambda \left\| Wx \right\|_1$ (22)
$z^{k+1} = W^T \mathrm{soft}(Wr^{k+1}, \lambda)$ (23)
where W is the operator performing the wavelet transform.
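With PyWavelets, the proximity step of Equation (23) could be written as follows; the wavelet family ('db4'), the decomposition level, and the thresholding of all subbands (including the approximation band) are simplifying assumptions for this sketch.

```python
import pywt

def wavelet_prox(r, lam, wavelet="db4", level=3):
    """Soft thresholding in the wavelet domain (Equation (23))."""
    coeffs = pywt.wavedec2(r, wavelet, level=level)
    arr, slices = pywt.coeffs_to_array(coeffs)
    arr = pywt.threshold(arr, lam, mode="soft")      # Equation (18) applied to all coefficients
    coeffs_t = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
    return pywt.waverec2(coeffs_t, wavelet)
```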

3.4. DEM Generation Method Using Total Variation Minimization

In this section, we first define the TV and explain the derivatives required for optimization. Then, a DEM generation method is presented based on the TV minimization criterion.

3.4.1. Total Variation

The TV norm is defined as the L1 norm of the magnitude of the gradient, as in Equation (24) [40]. For discrete data x such as a DEM, the TV norm can be defined as shown in Equation (25). Thus, the TV norm can be calculated by differentiating the DEM in the x and y directions, finding the gradient magnitudes, and summing them.
$TV = \iint \left| \nabla f(x, y) \right| dx\, dy = \iint \sqrt{\left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2}\, dx\, dy$ (24)
$\left\| x \right\|_{TV1} = \sum_i \sum_j \left| \nabla x_{i,j} \right| = \sum_i \sum_j \sqrt{\left( x_{i,j} - x_{i-1,j} \right)^2 + \left( x_{i,j} - x_{i,j-1} \right)^2}$ (25)
The derivative of the TV norm is expressed as shown in Equation (26). This derivative is undefined where the denominator is zero, but when a small quantity ε is introduced to approximate |x| ≈ √(x^T x + ε), the derivative can be calculated as shown in Equation (27). For the actual calculations, the approximation in Equation (28) has been proposed [41]. Here, ε is a small value of approximately 10^−8.
$\nabla \left\| x \right\|_{TV1} = \frac{\partial \left\| x \right\|_{TV1}}{\partial x_{i,j}}$ (26)
$\frac{\partial \left| x \right|}{\partial x} \approx \frac{x}{\sqrt{x^T x + \varepsilon}}$ (27)
$\frac{\partial \left\| x \right\|_{TV1}}{\partial x_{i,j}} \approx \frac{x_{i,j} - x_{i-1,j}}{\sqrt{\left( x_{i,j} - x_{i-1,j} \right)^2 + \left( x_{i-1,j+1} - x_{i-1,j} \right)^2 + \varepsilon}} + \frac{x_{i,j} - x_{i,j-1}}{\sqrt{\left( x_{i+1,j-1} - x_{i,j-1} \right)^2 + \left( x_{i,j} - x_{i,j-1} \right)^2 + \varepsilon}} - \frac{x_{i+1,j} + x_{i,j+1} - 2x_{i,j}}{\sqrt{\left( x_{i+1,j} - x_{i,j} \right)^2 + \left( x_{i,j+1} - x_{i,j} \right)^2 + \varepsilon}}$ (28)

3.4.2. Digital Elevation Model Generation Method Using Total Variation Minimization

Using the TV norm ‖x‖_TV1 from the previous section as the regularization term, we set up the optimization problem shown in Equation (29).
$\hat{x} = \arg\min_x \frac{1}{2}\left\| y - Mx \right\|_2^2 + \lambda \left\| x \right\|_{TV1}$ (29)
For this optimization, we use the gradient method. First, the step width α is set. The derivative of the TV norm, ∂‖x‖_TV1/∂x_{i,j}, is obtained using Equation (28). Only the solution for the deficient meshes is then updated, according to Equations (30) and (31), where j refers to the j-th mesh and y represents the defective DEM data. These steps are repeated until convergence to obtain the solution.
$z^{k+1} = x^k - \alpha \frac{\partial \left\| x \right\|_{TV1}}{\partial x_{i,j}}$ (30)
$x^{k+1}[j] = \begin{cases} z^{k+1}[j] & y[j] = \mathrm{NaN} \\ y[j] & y[j] \ne \mathrm{NaN} \end{cases}$ (31)
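For illustration, the sketch below implements the smoothed TV gradient of Equation (28) with wrap-around (np.roll) boundaries and the masked gradient descent of Equations (30) and (31); the step width, iteration count, and initialization are assumptions for the example.

```python
import numpy as np

def tv_gradient(x, eps=1e-8):
    """Approximate derivative of the TV norm (Equation (28)); borders wrap around."""
    xm0 = np.roll(x, 1, axis=0)                 # x[i-1, j]
    xp0 = np.roll(x, -1, axis=0)                # x[i+1, j]
    xm1 = np.roll(x, 1, axis=1)                 # x[i, j-1]
    xp1 = np.roll(x, -1, axis=1)                # x[i, j+1]
    xm0p1 = np.roll(x, (1, -1), axis=(0, 1))    # x[i-1, j+1]
    xp0m1 = np.roll(x, (-1, 1), axis=(0, 1))    # x[i+1, j-1]
    t1 = (x - xm0) / np.sqrt((x - xm0) ** 2 + (xm0p1 - xm0) ** 2 + eps)
    t2 = (x - xm1) / np.sqrt((xp0m1 - xm1) ** 2 + (x - xm1) ** 2 + eps)
    t3 = (xp0 + xp1 - 2 * x) / np.sqrt((xp0 - x) ** 2 + (xp1 - x) ** 2 + eps)
    return t1 + t2 - t3

def tv_minimize(dem_with_nan, alpha=0.1, n_iter=500):
    """DEM completion by TV minimization (Equations (30)-(31))."""
    observed = ~np.isnan(dem_with_nan)
    y = np.where(observed, dem_with_nan, 0.0)
    x = np.where(observed, y, np.nanmean(dem_with_nan))   # fill gaps with the mean to start
    for _ in range(n_iter):
        z = x - alpha * tv_gradient(x)          # Equation (30)
        x = np.where(observed, y, z)            # Equation (31): observed meshes stay fixed
    return x
```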

4. Results

As described in this section, we verified the performance of each method for different deficiencies. Then, the methods were applied to actual data from ALB measurements.

4.1. Application to Different Deficiency Ratios

The methods were applied to DEM data with artificial deficiencies. The original DEM (Mt. Unzen, Nagasaki, Japan) comprised 1 m mesh data (512 × 512 m) obtained from an aerial laser survey (Figure 6). Two areas with different topographic shapes were selected. Area 1 (upper area) has a constant slope throughout (elevation 560 to 470 m) and several valley lines; the elevation of the valley floor is approximately 450 m. Area 2 (lower area) is a cliff section, with elevations above and below the cliff of 660 and 510 m, respectively. We prepared two methods for selecting deficient meshes: one selected meshes randomly, and the other selected meshes by considering spatial correlations. For the random-deficiency cases, the deficiency ratios were set to 5%, 10%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, and 90%. Figure 7 shows an example of the DEM data with deficiencies; the black areas in the figures indicate the deficiencies. The root mean square error (RMSE) was used to evaluate the accuracy of the DEM generation.
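For reproducibility of this setup, a simple way to generate random deficiencies and evaluate the RMSE is sketched below; evaluating the RMSE only over the artificially removed meshes is our assumption about the evaluation protocol.

```python
import numpy as np

def make_random_deficiencies(dem, ratio, seed=0):
    """Set a given fraction of meshes to NaN to create artificial deficiencies."""
    rng = np.random.default_rng(seed)
    removed = rng.random(dem.shape) < ratio
    corrupted = dem.copy()
    corrupted[removed] = np.nan
    return corrupted, removed

def rmse(dem_true, dem_estimated, removed):
    """RMSE over the artificially removed meshes."""
    diff = dem_true[removed] - dem_estimated[removed]
    return float(np.sqrt(np.mean(diff ** 2)))
```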
Figure 8 and Figure 9 show the application results. In the case of random deficiencies, the K-SVD method was effective when the percentage of deficiencies was under 50%, the DCT method was effective when the percentage was 50–75%, and the TV minimization method was effective when the percentage was 80% or more. For the correlated deficiencies, the K-SVD and TV minimization methods were effective when the percentage of deficiencies was 75% or less, whereas the TV minimization method was effective when the percentage was 80% or more. The DCT and K-SVD methods had good accuracy when the percentage of deficiencies was low. Because both methods use local features in the DEM, their accuracy degraded significantly when the percentage of deficiencies was high (80% and 60%, respectively). On the other hand, the DCT with elastic net method did not show this significant accuracy degradation because it relaxes the constraint condition by using L2 regularization. Compressed sensing, which deals with the frequency space of the entire DEM, did not show outstanding accuracy compared with the other methods. Comparison with the sparse coding results indicates that local features, rather than the entire DEM, are significant for terrain restoration. The TV method, on the other hand, considers local DEM gradients over the entire DEM, resulting in a higher overall accuracy. The effectiveness of each method did not differ depending on the topographic shape.

4.2. Application to Actual Aerial Laser Bathymetry Data

Next, each method was applied to actual ALB data. The data were measured in the shallow water area of a harbor (Port of Oniike, Kumamoto, Japan). As in the previous section, a 512 × 512 m section of the 1 m mesh data was cut out (Figure 10). The deficiencies were caused by the conditions of the seawater. The accuracy verification used bathymetric data acquired at almost the same time. Cross-section 1 in the figure is a gentle valley with water depths ranging from 3 to 8 m; significant deficiencies are observed, especially at the bottom of the valley. Cross-section 2 is a cliff section with water depths ranging from 4 to 9 m. ALB is capable of measuring at shallow depths, and this measurement confirms that there are many deficiencies, especially at water depths greater than 7 m. The accuracy was verified by applying the DEM generation methods and calculating the RMSEs against the bathymetric data along cross-sections 1 and 2 in the figure. The nearest neighbor and Kriging methods were also compared as conventional methods.
Table 1 shows the application results. The effectiveness of each method did not differ depending on the topographic shape. This result appeared because both cross-sections had more deficiencies at their relatively flat bottoms.

5. Discussion

First, we discuss the results of Section 4.1 in the context of different deficiency ratios. The sparse coding methods without dictionary learning lead to ill-posed problems when the deficiency ratio is high; thus, their results can be unstable. The elastic net alleviates this instability because of its regularization effect. With dictionary learning, the K-SVD method is effective when the deficiency ratio is low, which means that it can learn detailed topographies. In contrast, when the deficiency ratio is high, the dictionary cannot be learned appropriately owing to over-fitting, and the accuracy decreases. We also confirmed that the accuracy improves when the dictionary is learned in a region with a small percentage of deficiencies and is then shared across the entire range.
The differences between the sparse coding and compressed sensing methods are as follows. The sparse coding methods are applied to the local topography, whereas the compressed sensing methods are applied to the entire DEM in the frequency domain. Therefore, the compressed sensing methods can generate a stable DEM even if localized areas have high deficiencies. The Fourier regularization method is more accurate than the wavelet regularization method for all deficiency ratios. A comparison of the coefficients for the two regularization methods shows that the coefficients in the Fourier space are often smaller than those in the wavelet space. The Fourier regularization uses low, medium, and high frequencies equally; accordingly, the Fourier coefficients are small, with coefficients below 0.001 accounting for 70%. The wavelet regularization, on the other hand, relies mainly on low-frequency components, with small wavelet coefficients accounting for only 40%. This is likely because the Fourier regularization represents both global and local features. In addition, the TV minimization method was the most accurate and effective for all patterns, especially when the deficiencies were high.
Overall, the DEM generation results do not differ significantly depending on whether the defects are random or correlated. The accuracy of each method was also compared depending on the topography. The slope, Laplacian, and ground opening were used as quantitative indicators of the topography. No differences from the above trends were observed from any of these perspectives.
Next, we discuss the results from applying the methods to the ALB data as described in Section 4.2. The TV minimization method and the DCT with elastic net method, which allow the patch size to be larger than the deficient areas, are effective in areas where the percentage of deficiencies is relatively high (cross-section 1). The spatial extents of the deficiencies must also be considered: in the case of small-scale deficiencies (cross-section 2), the DCT with elastic net method is effective. This result arises from the effect of considering smoothness through the L2 regularization.
The results for the Fourier and wavelet regularization methods show relatively high RMSEs. All the data were summarized in the frequency space, which did not allow for capturing the local topography.
Based on the above discussion, we also developed a method integrating the K-SVD and TV minimization methods (TV-SVD method). The ALB data include areas with high and low percentages of deficiencies (low-density deficiencies). The practical application of the K-SVD method to the airborne laser DEM showed that the K-SVD method is effective for low deficiencies. In contrast, the TV minimization method is effective for high deficiencies.
The K-SVD results alone were not good because the estimation included areas with high deficiencies. When the data along cross-section 1 were divided into ranges with high and low deficiencies and K-SVD was applied only to the range with low deficiencies, a high accuracy was obtained (RMSE: 0.167 m). This suggests that dictionary learning works well only in areas with low deficiencies. It was also confirmed that a DEM could be generated accurately even with dictionaries shared between different areas. This result shows the scalability of the dictionary and its ability to deal with many low-deficiency areas. Rather than learning dictionaries from data with high deficiencies, the DEM generation accuracy is higher when dictionaries are learned in different locations but in areas with low deficiencies.
In the case of large-scale deficiencies, the TV minimization method is effective because the dictionary learning does not work correctly. These results are consistent with previous results concerning the different deficiencies.
Accordingly, we applied the TV minimization method to the large-scale deficient parts and the K-SVD method to the small-scale deficient parts and integrated the two results. To apply each method according to the size of the deficient areas, the sizes must first be distinguished. The large-scale deficient parts can be extracted using morphological operations [42]. Morphological operations include the closing operation, in which dilation is repeated a certain number of times and erosion is then repeated the same number of times. The closing operation is applied to a binary image in which deficient areas are set to 1 and other areas to 0 in order to extract regions. A characteristic of the closing operation is that it extracts large regions while eliminating sparse, scattered ones, which makes it suitable for extracting large areas of deficiencies and enables automatic selection of the method to apply, as sketched below.
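A possible implementation of this step with SciPy's morphological operations is sketched below; the structuring element, the number of closing iterations, and the minimum-area threshold are assumptions for illustration.

```python
import numpy as np
from scipy import ndimage

def extract_large_deficiencies(dem_with_nan, iterations=5, min_area=500):
    """Identify large-scale deficient areas by a morphological closing
    (dilation repeated `iterations` times, then erosion the same number of times)."""
    deficient = np.isnan(dem_with_nan)                   # deficient meshes set to 1
    closed = ndimage.binary_closing(deficient,
                                    structure=np.ones((3, 3), dtype=bool),
                                    iterations=iterations)
    labels, _ = ndimage.label(closed)                    # connected regions of the closed map
    areas = np.bincount(labels.ravel())
    large_labels = np.flatnonzero(areas >= min_area)
    large_labels = large_labels[large_labels != 0]       # drop the background label
    large = np.isin(labels, large_labels)
    return large & deficient   # TV minimization here; K-SVD on the remaining deficiencies
```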
The proposed method was applied to the area around cross-section 1 in Figure 10. The morphological operation correctly identified the high-deficiency areas, so the K-SVD and TV minimization methods could be applied appropriately; the dictionary was learned in the same area. The RMSE of the result was 0.128 m, which confirms an improvement in accuracy over the results in Section 4.2 and the effectiveness of the proposed combination method. One of the results is shown in Figure 11; the uncolored areas in Figure 11a correspond to deficient data, which is why part of the surface appears to be missing.

6. Conclusions

In this study, we compared and organized various types of sparsity in sparse modeling and constructed multiple DEM generation methods. In organizing the sparsity types, we focused on sparsity related to the basis of spatial topographic components, sparsity related to frequency patterns, and sparsity with few topographic change points. Based on these types, we developed DEM generation methods using the DCT, DCT with elastic net, K-SVD, Fourier regularization, wavelet regularization, and TV minimization. The developed methods were applied to artificially created DEMs with various deficiencies and to actual ALB data to verify the applicability of each method. In particular, we confirmed the effectiveness of the sparsity approaches based on spatial patterns and on topographic changes. Specifically, we confirmed that the K-SVD method is appropriate when the percentage of deficiencies is low, and that the TV minimization method is appropriate when the percentage of deficiencies is high. Based on these results, we also developed a method integrating both methods and achieved an RMSE of 0.128 m. Furthermore, we clarified effective sparsity approaches for topography and effective DEM generation methods based on a discussion of the application results.
Future work includes extending the approach to an integrated method that simultaneously considers multiple types of sparsity. In addition, the results from the multiple generation methods could be combined into an ensemble. Such extended methods would be expected to stabilize the results and suppress the influence of noise.

Author Contributions

Conceptualization, T.F.; methodology, T.F.; software, K.I.; validation, T.F. and K.I.; formal analysis, K.I.; investigation, K.I.; resources, T.F.; data curation, K.I.; writing—original draft preparation, T.F.; writing—review and editing, T.F. and K.I.; visualization, K.I.; supervision, T.F.; project administration, T.F.; funding acquisition, T.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by JSPS KAKENHI Grant Number JP18H01554.

Data Availability Statement

Restrictions apply to the availability of these data, which were obtained from Asia Air Survey Co., Ltd. and are available with their permission.

Acknowledgments

The DEM of airborne laser scanner data, ALB data, and bathymetric data used in this study were provided by the Asia Air Survey Co., Ltd. (Tokyo, Japan). We would like to express our gratitude for their cooperation.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Zhang, L.; Gruen, A. Multi-image matching for DSM generation from IKONOS imagery. ISPRS J. Photogramm. Remote Sens. 2011, 60, 195–211.
  2. Szafarczyk, A.; Toś, C. The use of green laser in LiDAR bathymetry: State of the art and recent advancements. Sensors 2023, 23, 292.
  3. Guo, K.; Li, Q.; Wang, C.; Mao, Q.; Liu, Y.; Zhu, J.; Wu, A. Development of a single-wavelength airborne bathymetric LiDAR: System design and data processing. ISPRS J. Photogramm. Remote Sens. 2022, 185, 62–84.
  4. Chen, C.; Bei, Y.; Li, Y.; Zhou, W. Effect of interpolation methods on quantifying terrain surface roughness under different data densities. Geomorphology 2022, 417, 108448.
  5. Dragut, L.; Blaschke, T. Automated classification of landform elements using object-based image analysis. Geomorphology 2006, 81, 330–344.
  6. Aharon, M.; Elad, M.; Bruckstein, A. K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process. 2006, 54, 4311–4322.
  7. Lustig, M.; Donoho, D.; Pauly, J.M. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magn. Reson. Med. 2007, 58, 1182–1195.
  8. Bopche, L.; Rege, P.P. Use of noise reduction filters on stereo images for improving the accuracy and quality of the digital elevation model. J. Appl. Remote Sens. 2021, 15, 014508.
  9. Wise, S. Assessing the quality for hydrological applications of digital elevation models derived from contours. Hydrol. Process. 2000, 14, 1909–1929.
  10. Kalbermatten, M.; Ville, D.V.D.; Turberg, P.; Tuia, D.; Joost, S. Multiscale analysis of geomorphological and geological features in high resolution digital elevation models using the wavelet transform. Geomorphology 2012, 138, 352–363.
  11. Mora, O.; Mallorqui, J.J.; Broquetas, A. Linear and nonlinear terrain deformation maps from a reduced set of interferometric SAR images. IEEE Trans. Geosci. Remote Sens. 2003, 41, 2243–2253.
  12. Almansa, A.; Cao, F.; Gousseau, Y.; Rougé, B. Interpolation of digital elevation models using AMLE and related methods. IEEE Trans. Geosci. Remote Sens. 2002, 40, 314–325.
  13. Shi, W.; Wang, B.; Tian, Y. Accuracy analysis of digital elevation model relating to spatial resolution and terrain slope by bilinear interpolation. Math. Geosci. 2014, 46, 445–481.
  14. Grimson, W.E.L. An implementation of a computational theory of visual surface interpolation. Comput. Vis. Graph Image Process. 1983, 22, 36–69.
  15. Elad, M. Sparse and Redundant Representations; Springer: New York, NY, USA, 2010.
  16. Jiao, L.; Yang, Y.; Liu, F.; Yang, S.; Hou, B. The new generation brain-inspired sparse learning: A comprehensive survey. IEEE Trans. Artif. Intell. 2022, 3, 887–907.
  17. Mishra, I.; Jain, S. Soft computing based compressive sensing techniques in signal processing: A comprehensive review. J. Intell. Syst. 2020, 30, 312–326.
  18. Baraniuk, R.G.; Candes, E.; Elad, M.; Ma, Y. Applications of sparse representation and compressive sensing. Proc. IEEE 2010, 98, 906–909.
  19. Pati, Y.C.; Rezaiifar, R.; Krishnaprasad, P.S. Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In Proceedings of the 27th Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, 1–3 November 1993.
  20. Tropp, J.A.; Gilbert, A.C. Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. Inf. Theory 2007, 53, 4655–4666.
  21. Rish, I.; Granbarnik, G.Y. Sparse Modeling: Theory, Algorithms, and Applications; Taylor & Francis: Boca Raton, FL, USA, 2015; pp. 71–93.
  22. Chen, S.S.; Donoho, D.L.; Saunders, M.A. Atomic decomposition by basis pursuit. SIAM Rev. 2001, 43, 129–159.
  23. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Methodol. 1996, 58, 267–288.
  24. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning, 2nd ed.; Springer: New York, NY, USA, 2009; pp. 43–94.
  25. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning; Springer: New York, NY, USA, 2013; pp. 203–259.
  26. Starck, J.-L.; Elad, M.; Donoho, D.L. Image decomposition via the combination of sparse representations and a variational approach. IEEE Trans. Image Process. 2005, 14, 1570–1582.
  27. Rubinstein, R.; Bruckstein, A.M.; Elad, M. Dictionaries for sparse representation modeling. Proc. IEEE 2010, 98, 1045–1057.
  28. Olshausen, B.A.; Field, D.J. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 1996, 381, 607.
  29. Olshausen, B.A.; Field, D.J. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vis. Res. 1997, 37, 3311–3325.
  30. Engan, K.; Aase, S.O.; Husoy, J.H. Method of optimal directions for frame design. In Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, Phoenix, AZ, USA, 15–19 March 1999.
  31. Cheng, H. Sparse Representation, Modeling and Learning in Visual Recognition; Springer: London, UK, 2015; pp. 155–178.
  32. Carmi, A.Y.; Mihaylova, L.S.; Godsill, S.J. Introduction to Compressed Sensing and Sparse Filtering. In Compressed Sensing & Sparse Filtering; Carmi, A.Y., Mihaylova, L.S., Godsill, S.J., Eds.; Springer: Berlin/Heidelberg, Germany, 2014; pp. 1–23.
  33. Strother, S.C.; Rasmussen, P.M.; Churchill, N.W.; Hansen, L.K. Stability and Reproducibility in fMRI Analysis. In Practical Applications of Sparse Modeling; Rish, I., Cecchi, G.A., Lozano, A., Niculescu-Mizil, A., Eds.; The MIT Press: Cambridge, MA, USA, 2014; pp. 99–121.
  34. Sartoretti, T.; Reischauer, C.; Sartoretti, E.; Binkert, C.; Najafi, A.; Sartoretti-Schefer, S. Common artefacts encountered on images acquired with combined compressed sensing and SENSE. Insights Imaging 2018, 9, 1107–1115.
  35. Donoho, D.L. Compressed sensing. IEEE Trans. Inf. Theory 2006, 52, 1289–1306.
  36. Bobin, J.; Starck, J.-L.; Ottensamer, R. Compressed sensing in astronomy. IEEE J. Sel. Top. Signal Process. 2008, 2, 718–726.
  37. Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Methodol. 2005, 67, 301–320.
  38. Hastie, T.; Tibshirani, R.; Wainwright, M. Statistical Learning with Sparsity; Taylor & Francis: Boca Raton, FL, USA, 2015.
  39. Daubechies, I. Orthonormal bases of compactly supported wavelets. Commun. Pure Appl. Math. 1988, 41, 909–996.
  40. Rudin, L.I.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Phys. D Nonlinear Phenom. 1992, 60, 259–268.
  41. Sidky, E.Y.; Pan, X. Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization. Phys. Med. Biol. 2008, 53, 4777.
  42. Haralick, R.M.; Sternberg, S.R.; Zhuang, X. Image analysis using mathematical morphology. IEEE Trans. Pattern Anal. Mach. Intell. 1987, 9, 532–550.
Figure 1. Linear sum of spatial patterns.
Figure 2. Linear sum of frequency patterns.
Figure 3. Relationships between observed patches, dictionary, and sparsity coefficients. Black areas indicate deficient data.
Figure 4. Redundant discrete cosine transform (DCT) dictionary.
Figure 5. Product set of each basis and corresponding sparse representation.
Figure 6. Original DEM data to be applied. The red frame indicates the target area.
Figure 7. DEM data with deficiencies.
Figure 8. Root mean square errors (RMSEs) of applications to random deficiencies.
Figure 9. RMSEs of applications to correlated deficiencies.
Figure 10. Original ALB data to be applied. The red frame indicates the target area.
Figure 11. Combination of the K-singular value decomposition (K-SVD) and total variation (TV) minimization methods.
Table 1. Root mean square errors (RMSEs) of applications to aerial laser bathymetry (ALB) data.

Methods | RMSE of Cross-Section 1 (m) | RMSE of Cross-Section 2 (m)
Discrete cosine transform (DCT) | 0.463 | 0.257
DCT with elastic net | 0.226 | 0.135
K-singular value decomposition (K-SVD) | 10.837 | 0.296
Fourier regularization | 0.353 | 0.292
Wavelet regularization | 0.288 | 0.273
Total variation (TV) minimization | 0.206 | 0.289
Nearest neighbor | 0.526 | 0.296
Kriging | 0.346 | 0.291