1. Introduction
Radars have been widely used for terrain surveillance under different weather conditions [1], which is crucial for environmental protection and natural disaster evaluation [2]. Synthetic aperture radars (SARs), including ALOS (L-band, 2006–2011) [3], Sentinel-1 (C-band, 2014–) [3], and UAVSAR (airborne, L-band and P-band, 2008–) [3], have been used for monitoring glaciers, volcanoes, earthquakes, and so on. TerraSAR-X, operating at X-band with 300 MHz bandwidth, offers spatial resolution of 0.6 m × 1.1 m (slant range × azimuth) in spotlight mode, 0.6 m × 0.24 m in staring spotlight mode, and 1.2 m × 3.3 m in stripmap mode [4,5].
The InSAR technique has been used for measuring surface topography and altimetry profiles [6], mapping three-dimensional building shapes [7], and detecting building edges [8]. InSAR and TomoSAR imaging techniques demand precise coregistration between the master image and the slave images [9,10]. In [9], a two-step, scale-invariant feature transform (SIFT) registration method was proposed. In [11], an outlier-detecting total least-squares (OD-TLS) algorithm was proposed to enhance the precision and robustness of 3D point-set registration. In [12], a sinc interpolation method was used to implement subpixel-to-subpixel matching.
Faithful reconstruction of a terrain profile relies on accurate acquisition of the interferometric phase. Numerous filtering methods for the interferometric phase have been developed in the past few decades [13], including transform domain methods [14], nonlocal methods [15], and spatial domain methods [16]. With transform domain methods, the trade-off between noise reduction and preservation of the terrain-related signal is typically adjusted via a threshold [14].
In [17], a 3D space-time nonlocal mean filter (NLMF) was applied to detect terrain changes by extracting nonlocal information from pixels in SAR images acquired in different time windows. In [18], a nonlocal mean filter was applied to a few persistent scattering points in a search area to improve the accuracy of the 3D deformation profile. The nonlocal filters performed well in preserving details of complex structures, but were less effective in removing speckle noise [15].
A spatial-domain Gaussian filter was used to reduce high-frequency noise while preserving deformation information [19]. It could reduce impulse noise and preserve edges by replacing each pixel with the mean value of its neighboring pixels [20], but the edges might become blurred due to loss of fine details. On the other hand, nonlocal filters preserve intricate details and adapt to local structures by considering pixel patch similarities, with the downside of computational complexity and sensitivity to parameters.
Phase unwrapping (PU) is a critical step to derive a faithful terrain profile from the interferometric phase of the acquired InSAR image, and the results are affected by the number of baselines used in probing the target area [21]. A phase unwrapping problem could be formulated as a wrap count classification task to invoke deep learning methods [22], as used in processing optical images [23,24]. In [25], a quality-guided algorithm was developed by unwrapping the phases along an optimal path in the interferometric phase image, based on a quality map of all edges in the image. Although the result is insensitive to noise, its performance relies on the quality map, and errors may propagate along the path.
A least-squares (LS) phase unwrapping method was formulated as a global optimization task [26], which may be sensitive to outliers and takes a long computational time to process a large image. In [27], a phase unwrapping method was proposed by minimizing the difference between the discrete partial derivative of the wrapped phase function and that of the unwrapped phase function. The unwrapped phases were obtained by solving a Hunt’s matrix and a discrete Poisson’s equation, accelerated by using FFT, and the result was comparable to other methods.
InSAR imaging tasks have been operated on spaceborne [28], airborne [29], ground-based [30], and UAV-borne platforms [31]. Spaceborne platforms are typically used to survey wide areas or large-scale phenomena [4], airborne platforms are more flexible in path planning [32], and ground-based platforms are used to monitor the local environment [33].
UAV-borne platforms [34,35,36] are expedient for monitoring a local area of contingency and can achieve spatial resolution of 10 cm [37] in P and L bands [31]. For example, the Antarctic ice sheet (AIS) is covered with uncharted rifts and crevasses, endangering the exploration personnel [38,39]. Satellite-borne sensors cannot provide updated images and information for on-site tasks [38,40], but can be complemented with the InSAR images acquired with UAVs. Typical satellite-borne platforms take days to revisit the same area, with a baseline of a few hundred meters, while UAV-borne platforms can revisit the same area immediately after the previous flight, with a baseline of a few meters.
The radar signals can be acquired in two separate flights with a single-channel SAR or in a single flight with a dual-channel SAR [41]. The typical position accuracy of UAVs derived from GPS lies between 0.5 and 2 m [42], which can be enhanced to the centimeter level by using the differential GPS (DGPS) technique [43] or real-time kinematic GPS [44]. The downside of deploying UAVs is the impact of airflow disturbance and platform perturbation [42], which can be mitigated by applying motion compensation and autofocusing techniques [45,46,47].
In this work, an on-site InSAR imaging method is proposed to reconstruct a high-resolution local terrain profile with UAV-borne SARs in L-band. A mean filter is used to reduce speckle artifacts, and a least-squares phase unwrapping method is used to acquire the 2D interferometric phase in almost real time. Three high-quality digital elevation models (DEMs), featuring a volcano, a glacier, and a landslide, are retrieved from the US Geological Survey (USGS) 3D Elevation Program (3DEP) [48] to validate the efficacy of the proposed method. The performance is further evaluated by comparing the acquired InSAR images with their counterparts acquired using other state-of-the-art techniques under the effects of noise.
The rest of this paper is organized as follows: the proposed InSAR method is presented in Section 2, the simulation results are discussed in Section 3, and some conclusions are drawn in Section 4.
2. Proposed InSAR Method
Figure 1 shows the schematic of InSAR operation with two parallel flight paths, where the
x,
y, and
z axes are aligned in the ground-range direction, azimuth direction, and height direction, respectively.
denotes a point target, and the platform flies at height
H above ground, with the side-looking angle
to
.
The coordinates of radar
along the master track and radar
along the slave track are given by
The slant ranges from
and
to
are
and
, respectively, with
2.1. Backscattered Signals
Figure 2 shows the flow-chart of the range-Doppler algorithm (RDA) used in this work [
49]. The signal backscattered from the point target
and received at
(
) is demodulated to the baseband as
where
is the amplitude,
is the carrier frequency,
is the chirp rate of the linear frequency modulation (LFM) pulse,
is the range (fast) time,
is the azimuth (slow) time, and
is a window function, which is equal to one when
and zero otherwise.
By taking the Fourier transform of
with respect to
and
sequentially, we have
where
A is a constant of integration,
, and
with
.
2.2. Range Compression
Let us define a range-compression filter
, a coupling-compensation filter
, and a range cell migration correction (RCMC) filter
as
where
is the Doppler centroid and
Then, multiply these three filters with
to have
where
By taking the inverse Fourier transform of
in the range, we obtain the range-compressed signal
where
.
2.3. Azimuth Compression
Let us define an azimuth compression filter
which is multiplied with
to have
By taking the inverse Fourier transform of
in azimuth, we obtain the azimuth-compressed signal
which is the SAR image stored in a matrix
of size
.
2.4. Coregistration
Figure 3 shows the flow-chart of InSAR imaging. In the master image, the
-axis is sampled at
, with
and
. These sampling values of
are stored in a vector
The slant ranges associated with all the range cells in the master image are
, and the side-looking angle of the
vth range cell is
, with
.
Figure 4a shows that the point target
appears at
in the master image and
in the slave image. If the platforms fly high enough, the range difference between the two tracks in
Figure 4a can be approximated as that in
Figure 4b, namely,
. By the law of cosines,
can be represented as
. The range difference
is normalized with respect to
to have
.
Next, apply both sinc interpolation [
12] and subpixel-to-subpixel match to coregister the slave image to the master image. The original slave image
of size
is interpolated in the range direction by a factor of 16 to obtain a finer slave image
of size
, which is resampled to derive a coregistered slave image
of size
.
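The interpolate-then-resample step described above can be sketched as follows. This is a minimal illustration, assuming that the sinc interpolation is realized by zero-padding the range spectrum and that the per-cell range shift is already known in units of original range cells; the function names `upsample_range` and `coregister` and the argument `shift_cells` are hypothetical and do not come from the paper.

```python
import numpy as np

def upsample_range(img, factor=16):
    """Sinc-interpolate a complex image along the range axis (axis 1)
    by zero-padding its range spectrum (assumed realization)."""
    n_az, n_rg = img.shape
    spec = np.fft.fft(img, axis=1)
    half = (n_rg + 1) // 2                       # number of non-negative frequencies
    n_fine = factor * n_rg
    padded = np.zeros((n_az, n_fine), dtype=complex)
    padded[:, :half] = spec[:, :half]            # copy non-negative frequencies
    padded[:, n_fine - (n_rg - half):] = spec[:, half:]  # copy negative frequencies
    return factor * np.fft.ifft(padded, axis=1)  # rescale for the longer IFFT

def coregister(slave, shift_cells, factor=16):
    """Read range cell v of the output from position v + shift_cells[v]
    (in original range cells) of the upsampled slave image."""
    n_rg = slave.shape[1]
    fine = upsample_range(slave, factor)
    idx = np.rint((np.arange(n_rg) + shift_cells) * factor).astype(int)
    idx = np.clip(idx, 0, fine.shape[1] - 1)     # stay inside the fine grid
    return fine[:, idx]
```

With a factor of 16, the residual misregistration of this sketch is bounded by 1/32 of a range cell, which is the rounding error of the nearest fine-grid sample.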
2.5. Interferometry and Flat-Earth Phase Removal
An interferogram is formed from the master image
and the coregistered slave image
as
where
is the interferometric phase.
The interferometric phase attributed to the flat-earth reference plane is
[
50], which is subtracted from the phase of
in (
17) to obtain
where
.
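As an illustration of this subsection, the sketch below forms an interferogram and removes a given flat-earth phase. It assumes the common convention that the interferogram is the master image multiplied by the complex conjugate of the coregistered slave image; the function names and the sign convention of the flat-earth term are assumptions, not taken from the paper.

```python
import numpy as np

def form_interferogram(master, slave_coreg):
    """Pixelwise product of the master image and the conjugated slave image."""
    return master * np.conj(slave_coreg)

def flatten_phase(igram, flat_earth_phase):
    """Subtract the flat-earth phase and return the wrapped residual phase."""
    # Multiplying by exp(-j * phase) subtracts the phase and rewraps it in one step.
    return np.angle(igram * np.exp(-1j * flat_earth_phase))
```

Performing the subtraction in the complex domain avoids a separate rewrapping step after the flat-earth phase is removed.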
2.6. Mean Filter
Since the master image and the slave image are not perfectly coregistered, the interferometric phase manifests some random noise, inflicting errors in the subsequent phase unwrapping process. A mean filter is applied before phase unwrapping to reduce such phase noise.
Consider a target area of
azimuth cells by
range cells, centered at
. The interferometric phase in the target area is mapped from
as
Next, apply a searching window of size
and centered at
on
to have
An intermediate phase,
, is derived from
as [
51]
The interferometric phase after mean filtering is computed as [
20]
2.7. Poisson’s Equation of Unwrapped Phase
Let us define a wrapping operator as [
25]
which returns the principal value of
in
. The residue of
is determined as [
25]
with possible outcomes of
, or
.
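The wrapping operator and the residue test can be sketched as follows, assuming that the principal interval is half-open, [−π, π), and that the residue is evaluated around each elementary 2 × 2 pixel loop. The function names are hypothetical.

```python
import numpy as np

def wrap(phi):
    """Principal value of phi in [-pi, pi) (assumed interval convention)."""
    return (phi + np.pi) % (2 * np.pi) - np.pi

def residues(psi):
    """Sum of wrapped phase differences around each 2 x 2 loop, in cycles.
    Each entry is 0 (no residue), +1, or -1."""
    d1 = wrap(psi[:-1, 1:] - psi[:-1, :-1])   # (u, v)     -> (u, v+1)
    d2 = wrap(psi[1:, 1:] - psi[:-1, 1:])     # (u, v+1)   -> (u+1, v+1)
    d3 = wrap(psi[1:, :-1] - psi[1:, 1:])     # (u+1, v+1) -> (u+1, v)
    d4 = wrap(psi[:-1, :-1] - psi[1:, :-1])   # (u+1, v)   -> (u, v)
    return np.rint((d1 + d2 + d3 + d4) / (2 * np.pi)).astype(int)
```

A residue-free wrapped phase can be unwrapped consistently along any path, whereas nonzero residues mark locations where path-following methods may propagate errors.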
Next, take the mirror reflections of the wrapped phase function to obtain an even periodic function, which is continuous at the junction between two adjacent periods. Let
and
, an expanded phase function is defined in terms of
and its three versions of mirror reflection as
where
The wrapped phase differences
fall in
.
Given the wrapped phase
, its unwrapped counterpart,
satisfies
The least-squares solution of (
25) and (
26) can be obtained by minimizing the cost function [
27,
52]
with the Hunt’s method to have [
52]
which is rearranged into a Poisson’s difference equation on a
grid as
where
2.8. Solving Poisson’s Difference Equation with FFT
Define the 2D discrete Fourier transform (DFT) of
and its inverse as [
52]
By substituting (
31) into the left-hand-side of (
28), we obtain
where
and
. The right-hand side of (
28) can be represented as
By equating (
32) and (
33), we obtain
The phase unwrapping procedure is summarized as follows:
Step 1: Take the mirror reflections of
to obtain
, as in (
22);
Step 2: Compute
in (
29), with
and
;
Step 3: Take 2D DFT of
to obtain
, as in (
33);
Step 4: Compute
by using (
34), with
,
, and
;
Step 5: Take 2D IDFT of to obtain the solution, ;
Step 6: Retrieve the unwrapped interferometric phases in the target area as
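The six steps above can be sketched in a few lines. The sketch below assumes a half-sample mirror reflection, periodic differences on the extended grid, and a wrapping interval of [−π, π); the function names are hypothetical, and the constant offset of the solution is fixed by setting the zero-frequency term to zero.

```python
import numpy as np

def wrap(phi):
    """Principal value of phi in [-pi, pi) (assumed interval convention)."""
    return (phi + np.pi) % (2 * np.pi) - np.pi

def unwrap_least_squares(psi):
    """Least-squares phase unwrapping via a Poisson equation solved with FFT."""
    M, N = psi.shape
    # Step 1: mirror reflections give an even, periodic 2M x 2N phase function.
    ext = np.block([[psi, psi[:, ::-1]],
                    [psi[::-1, :], psi[::-1, ::-1]]])
    # Step 2: wrapped differences and their divergence (right-hand side).
    du = wrap(np.roll(ext, -1, axis=0) - ext)
    dv = wrap(np.roll(ext, -1, axis=1) - ext)
    rho = (du - np.roll(du, 1, axis=0)) + (dv - np.roll(dv, 1, axis=1))
    # Step 3: 2D DFT of the right-hand side.
    P = np.fft.fft2(rho)
    # Step 4: divide by the eigenvalues of the discrete Laplacian.
    m = np.arange(2 * M)[:, None]
    n = np.arange(2 * N)[None, :]
    denom = 2 * np.cos(np.pi * m / M) + 2 * np.cos(np.pi * n / N) - 4
    denom[0, 0] = 1.0                  # avoid division by zero at DC
    Phi = P / denom
    Phi[0, 0] = 0.0                    # the mean of the solution is arbitrary
    # Step 5: 2D inverse DFT.
    phi = np.real(np.fft.ifft2(Phi))
    # Step 6: keep the original M x N target area.
    return phi[:M, :N]
```

The solution is determined up to an additive constant, which is why a reference cell is needed later when the absolute phase is calibrated. The cost is dominated by two FFTs of size 2M × 2N.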
2.9. Nonlocal Filter
A nonlocal filter can be applied to either the interferometric phase
in (
18) before phase unwrapping or
in (
35) after phase unwrapping. The output of the nonlocal filter to
is computed as [
15,
53]
where
is a search window and
is a weighting coefficient that is determined by the difference of pixels between two similarity windows centered at
and
. The weighting coefficient is large if the pixels in these two similarity windows match closely, and vice versa. The sum of all weighting coefficients over
is set to one.
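The weighting scheme described above can be illustrated with the sketch below. It assumes Gaussian weights computed from the mean squared difference between two similarity windows and applies them to the unit phasors of the wrapped phase; the window sizes, the smoothing parameter `h`, and the function name are hypothetical and are not the settings of [15,53].

```python
import numpy as np

def nonlocal_phase_filter(phase, search=5, patch=3, h=0.5):
    """Nonlocal mean filter on exp(j*phase); weights are normalized to sum to one."""
    z = np.exp(1j * phase)
    ps, ss = patch // 2, search // 2
    zp = np.pad(z, ps + ss, mode="reflect")
    out = np.zeros_like(z)
    U, V = phase.shape
    for u in range(U):
        for v in range(V):
            cu, cv = u + ps + ss, v + ps + ss
            ref = zp[cu - ps:cu + ps + 1, cv - ps:cv + ps + 1]
            num, wsum = 0.0, 0.0
            for du in range(-ss, ss + 1):          # scan the search window
                for dv in range(-ss, ss + 1):
                    cand = zp[cu + du - ps:cu + du + ps + 1,
                              cv + dv - ps:cv + dv + ps + 1]
                    d2 = np.mean(np.abs(ref - cand) ** 2)
                    w = np.exp(-d2 / h ** 2)       # large when patches match
                    num += w * zp[cu + du, cv + dv]
                    wsum += w
            out[u, v] = num / wsum                 # normalization: weights sum to one
    return np.angle(out)
```

The nested loops make the cost proportional to the product of the image size, the search window area, and the similarity window area, which is consistent with the longer CPU time of the nonlocal filter relative to the mean filter.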
In the literature, a nonlocal filter is applied before phase unwrapping to reduce noise, speckle, or other artifacts embedded in the wrapped flattened phase, aiming to acquire a more accurate unwrapped phase. A nonlocal filter applied after phase unwrapping aims to smooth the unwrapped phase, at the risk of inducing artifacts or errors to the latter. The simulation results in this work show that smoother interferometric phase distribution is acquired by applying a nonlocal filter before phase unwrapping than after it.
2.10. Quality-Guided Phase Unwrapping
A quality-guided phase unwrapping process is also used in this work for comparison. A quality map is defined over a window
centered at
as [
25]
where
and
are the partial derivatives of the wrapped phase in the
u and
v directions, respectively, and their mean values over the window
are denoted as
and
, respectively.
After computing the quality map over an image area of interest, the pixel with the highest quality-map value is denoted as . The phase unwrapping process begins with its four surrounding pixels, and , followed by the pixels surrounding them. The process is repeated until all the pixels in the image area are exhausted.
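The region-growing procedure described above can be sketched with a priority queue. The sketch assumes that the quality map is supplied as an array in which a higher value means a more reliable pixel, and that the wrapping interval is [−π, π); the function names are hypothetical, and the computation of the quality map itself is omitted.

```python
import heapq
import numpy as np

def wrap(phi):
    """Principal value of phi in [-pi, pi) (assumed interval convention)."""
    return (phi + np.pi) % (2 * np.pi) - np.pi

def unwrap_quality_guided(psi, quality):
    """Unwrap psi by growing from the best pixel, highest quality first."""
    U, V = psi.shape
    phi = np.zeros((U, V))
    done = np.zeros((U, V), dtype=bool)
    u0, v0 = (int(i) for i in np.unravel_index(np.argmax(quality), quality.shape))
    phi[u0, v0] = psi[u0, v0]
    done[u0, v0] = True
    heap = []

    def push_neighbors(u, v):
        # Queue the four neighbors, remembering the pixel they are unwrapped from.
        for du, dv in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = u + du, v + dv
            if 0 <= a < U and 0 <= b < V and not done[a, b]:
                heapq.heappush(heap, (-float(quality[a, b]), a, b, u, v))

    push_neighbors(u0, v0)
    while heap:
        _, a, b, u, v = heapq.heappop(heap)
        if done[a, b]:
            continue
        phi[a, b] = phi[u, v] + wrap(psi[a, b] - psi[u, v])
        done[a, b] = True
        push_neighbors(a, b)
    return phi
```

Because every pixel is unwrapped from a single neighbor, an error made at one pixel is carried to all pixels reached through it, which is the error propagation mentioned in the Introduction. The pixel-by-pixel traversal also explains the longer CPU time compared with the FFT-based method.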
2.11. Target Height Estimation
By adding the flat-earth phases in the target area,
back to the unwrapped phase,
, we have
Without loss of generality, choose cell
as the reference cell, with a reference phase
. The phase difference between the master image and the slave image is calibrated as
Figure 5 shows the geometry for target-height estimation. The difference between
and
is estimated as
The side-looking angle from the master track toward the point target
A is calculated by using the law of cosines as
Finally, the height of point target
A is estimated as
3. Simulations and Discussions
In this section, three scenarios are simulated by using the DEMs extracted from the US Geological Survey (USGS) 3D Elevation Program (3DEP) dataset [48], including Mount St. Helens, Columbia Glacier, and Santa Cruz landslide. Without loss of effectiveness, each DEM is scaled down by a common factor in all three dimensions to reduce the computational time.
Table 1 lists the default InSAR parameters used in the simulations, from which the height of ambiguity is determined as [
54]
Aside from the mean filter (MF) and the least-squares phase unwrapping (LSPU) method, the nonlocal filter (NF) and the quality-guided phase unwrapping (QGPU) method are also used for comparison. The effects of noise are studied by comparing the acquired images without noise with their counterparts at SNR dB, dB, and dB.
3.1. Mount St. Helens
Figure 6 shows the intermediate images of Mount St. Helens, scaled down tenfold to reduce the computational time. Mount St. Helens is an active volcano located at (46.2° N, 122.18° W), Skamania County, Washington, USA. Its elevation is 2549 m and its prominence is 1404 m. The DEM is extracted from the USGS 3DEP dataset [
48], with spatial resolution of 1 m × 1 m.
Figure 6a,b shows the master SAR images without noise and at SNR
dB, respectively. The latter manifests speckles over the whole image.
Figure 6c,d shows the interferometric phase without noise and at SNR
dB, respectively. The latter is severely smeared by noise and covered with speckles.
Figure 6e,f shows the wrapped flattened phase without noise and at SNR
dB, respectively. Similar features as in the interferograms are observed.
Figure 6g,h shows the coherence maps without noise and at SNR
dB, respectively. The coherence between the master SAR image
and the coregistered slave image
is defined as [
54]
which is equal to one if the coregistration is perfect. It is observed that the coherence map without noise is close to one, and that, at SNR
dB, it is slightly reduced to about 0.8.
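A coherence map of this kind is commonly estimated over a sliding window. The sketch below assumes the standard sample coherence, namely the magnitude of the windowed cross-correlation normalized by the windowed powers of the two images; the window size and the function names are hypothetical.

```python
import numpy as np

def box_sum(x, win):
    """Sum of x over a win x win window centered at each pixel."""
    pad = win // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x)
    for du in range(win):
        for dv in range(win):
            out += xp[du:du + x.shape[0], dv:dv + x.shape[1]]
    return out

def coherence(master, slave, win=5):
    """Sample coherence magnitude between two complex images."""
    cross = box_sum(master * np.conj(slave), win)
    p_m = box_sum(np.abs(master) ** 2, win)
    p_s = box_sum(np.abs(slave) ** 2, win)
    return np.abs(cross) / np.sqrt(p_m * p_s + 1e-30)  # small term avoids 0/0
```

Note that the sample coherence of two statistically independent images is biased above zero when the window contains few pixels, so low coherence values should be interpreted with the window size in mind.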
Figure 7 shows the reconstructed images of Mount St. Helens with the proposed method and the effects of noise. The comparisons between the mean filter (MF) and the nonlocal filter (NF), as well as between the least-squares phase unwrapping (LSPU) and quality-guided phase unwrapping (QGPU) methods, under the noise-free condition are also demonstrated.
Figure 7a shows the true DEM of Mount St. Helens extracted from the dataset,
Figure 7b shows the tenfold scale-down model of that in
Figure 7a, and
Figure 7c shows the reconstructed DEM with the proposed method.
The fidelity of the acquired InSAR image
a against the true image
b is evaluated with a structural similarity (SSIM) index defined as [
55,
56]
where
and
are the mean and standard deviation, respectively, of image
p, with
;
is the covariance between images
a and
b; and
and
are stability constants. The SSIM index lies in
, with higher index indicating higher similarity. Each image pixel is stored in 8 bits, implying the dynamic range of
. The stability constants are chosen as
and
. The SSIM index between the images in
Figure 7b,c is 0.90.
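The SSIM index can be sketched as follows, assuming that the means, variances, and covariance are computed over the whole image (rather than over local windows) and that the stability constants take the widely used default values (0.01 L)² and (0.03 L)², where L = 255 is the dynamic range of an 8-bit image. These defaults and the function name are assumptions and may differ from the values used in this work.

```python
import numpy as np

def ssim_global(a, b, dynamic_range=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM index between two images of equal size."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    c1 = (k1 * dynamic_range) ** 2              # assumed stability constants
    c2 = (k2 * dynamic_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov_ab = np.mean((a - mu_a) * (b - mu_b))
    num = (2 * mu_a * mu_b + c1) * (2 * cov_ab + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return num / den
```

The index equals one only when the two images are identical, and it decreases when the mean level, the contrast, or the spatial structure of the two images differ.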
The fidelity of the acquired InSAR image a against the true image b is also evaluated with a root-mean-square error (RMSE) defined as [57]
$$\mathrm{RMSE} = \sqrt{\frac{1}{P}\sum_{p=1}^{P}\left(a_p - b_p\right)^2}$$
where $a_p$ and $b_p$ are the values of the pth pixels in images a and b, respectively, and P is the number of pixels in one image. The RMSE between the images in
Figure 7b,c is 5.79 m.
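For completeness, the RMSE between two height maps can be computed as in the short sketch below; the function name is hypothetical, and both images are assumed to have the same size and the same unit (meters).

```python
import numpy as np

def rmse(a, b):
    """Root-mean-square error between two images of equal size."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.sqrt(np.mean((a - b) ** 2))
```

Unlike the SSIM index, the RMSE carries the unit of the height and is dominated by the pixels with the largest errors, such as those in layover regions.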
Figure 7d shows the reconstructed DEM, with the NF replacing the mean filter; its SSIM index and RMSE against the image in
Figure 7b are 0.89 and 6.14 m, respectively, i.e., slightly worse than the proposed method.
A closer inspection of the images in
Figure 7c,d reveals that the NF preserves sharper edges while the MF smears image features. The SSIM indices and RMSE values of these two images are similar, implying that the MF and the NF have comparable performance.
Figure 7e shows the reconstructed DEM with MF and QGPU; its SSIM index and RMSE against the image in
Figure 7b are 0.90 and 5.79 m, respectively, which are identical to those in
Figure 7c, indicating that LSPU and QGPU methods have comparable performance in this case. Note that the QGPU method has longer computation time than the LSPU method.
Table 2 lists the CPU time of the LSPU, QGPU, mean filter, and nonlocal filter, run with MATLAB R2019a on a PC with a 3.00 GHz i7 CPU and 32 GB of memory. The CPU time of the mean filter is about half that of the nonlocal filter. The CPU time of the LSPU is much shorter than that of the QGPU because the former is implemented with FFT on the whole image, while the latter is executed pixel by pixel. The breakdown of CPU time in the LSPU, QGPU, mean filter, and nonlocal filter and their algorithms are detailed in
Appendix A.
Figure 7f–h shows the InSAR images acquired with the proposed method at SNR
dB,
dB, and
dB, respectively. Their SSIM indices against
Figure 7b are 0.89, 0.89, and 0.74, respectively, and their RMSE values against
Figure 7b are 10.8 m, 9.22 m, and 22.38 m, respectively. The main features in the image are almost unaffected at SNR
dB and become slightly distorted at SNR
dB. In short, the DEM of Mount St. Helens is reconstructed with high fidelity by visual inspection, as well as in terms of SSIM and RMSE, even at SNR
dB.
Figure 8 shows the differences between the reconstructed DEMs in
Figure 7c,f,g,h and the true DEM in
Figure 7b. The difference is calculated as
, where
and
are the values of the
pth pixel in images
a and
b, respectively.
Figure 8 shows that the difference is negligible at SNR
dB and becomes significant at SNR
dB.
Figure 9 shows the reconstructed images of Mount St. Helens, with the nonlocal filter (NF) applied before and after the LSPU, under the noise-free condition. If the nonlocal filter is applied after phase unwrapping, the computational noise distorts some terrain features and introduces speckles in the reconstructed image.
3.2. Columbia Glacier
Figure 10 shows the images of the Columbia Glacier, located at (61.14° N, 147.08° W) on the south coast of Alaska, USA. The DEM is extracted from the USGS 3DEP dataset [
48], with spatial resolution of 5 m × 5 m.
Figure 10a shows the true DEM of the Columbia Glacier extracted from the dataset.
Figure 10b shows the fivefold scale-down model of that in
Figure 10a.
Figure 10c shows the reconstructed DEM with the proposed method and the simulation parameters listed in
Table 1. The reconstructed DEM closely matches the true DEM; its SSIM index and RMSE against the image in
Figure 10b are 0.88 and 28.4 m, respectively.
The backscattered signals from multiple resolution cells near the steep mountain slope region surrounding the glacier, enclosed by white dashed curves in
Figure 10b, are mapped to the same resolution cell in the acquired image, causing a layover effect. The high RMSE value is attributed to such layover regions, which is confirmed later in
Figure 11.
Figure 10d shows the reconstructed DEM with NF replacing the mean filter; its SSIM index and RMSE against the image in
Figure 10b are 0.87 and 28.24 m, respectively.
Figure 10e shows the reconstructed DEM with QGPU replacing LSPU; its SSIM index and RMSE against the image in
Figure 10b are 0.88 and 24.91 m, respectively, slightly better than their counterparts in
Figure 10c. The glacier in this scenario manifests a steeper slope than that of the volcano in the previous scenario. The use of a mean filter may blur some fine features in the DEM; hence, it should be used with caution if the terrain profile changes drastically.
Figure 10f–h shows the InSAR images acquired with the proposed method at SNR
dB,
dB, and
dB, respectively. Their SSIM indices against
Figure 10b are 0.87, 0.86, and 0.78, respectively, and their RMSE values against
Figure 10b are 31.93 m, 30.24 m, and 33.24 m, respectively. The acquired InSAR images at SNR
dB and SNR
dB have similar SSIM indices, and the RMSE at SNR
dB is slightly lower than those of the other two images.
Figure 11 shows the difference between the reconstructed DEM in
Figure 10c,f–h, and the true DEM in
Figure 10b. As SNR is decreased from 0 dB to
dB, more pixels in the layover regions manifest significant difference.
3.3. Santa Cruz Landslide
Figure 12 shows the images of an area with potential landslide hazards near Santa Cruz (37.03° N, 122.12° W), California, USA, on 17 March 2020, which are extracted from the USGS 3DEP dataset [
48], with spatial resolution of 3 m × 3 m.
Figure 12a shows the true DEM of the target area, and
Figure 12b shows the tenfold scale-down model of that in
Figure 12a.
Figure 12c shows the reconstructed InSAR image with the proposed method. The reconstructed DEM closely matches the true DEM; its SSIM index and RMSE against the image in
Figure 12b are 0.90 and 2.32 m, respectively.
Figure 12d shows the reconstructed InSAR image, with the nonlocal filter replacing the mean filter. Its SSIM index and RMSE against the image in
Figure 12b are 0.89 and 2.46 m, respectively.
Figure 12e shows the reconstructed DEM, with QGPU replacing LSPU. Its SSIM index and RMSE against the DEM in
Figure 12b are 0.90 and 2.32 m, respectively, the same as those for the proposed method.
Figure 12f–h shows the InSAR images acquired with the proposed method at SNR
dB,
dB, and
dB, respectively. Their SSIM indices against
Figure 12b are 0.90, 0.72, and 0.66, respectively, and their RMSE values against
Figure 12b are 2.32 m, 7.74 m, and 9.09 m, respectively.
Figure 13 shows the differences between the reconstructed DEM in
Figure 12c,f–h and the true DEM in
Figure 12b. As SNR is decreased, more pixels manifest significant difference.
Table 3 summarizes the RMSE and SSIM indices of images in
Figure 7,
Figure 10, and
Figure 12, with different combinations of the filter and phase unwrapping methods under the noise-free condition. The best indices among the three methods are marked in boldface, and the differences among these combinations are not significant.
Table 4 summarizes the RMSE and SSIM indices of images in
Figure 7,
Figure 10, and
Figure 12, by using the proposed method under different SNRs. In general, the best indices occur at SNR
dB, but some indices at SNR
dB turn out to be slightly better.
Next, we reconstruct two DEMs over the same area, dated 17 March 2020 and 10 August 2022, and show their height difference in
Figure 14 to detect possible landslide hazards.
Figure 14a shows the height difference between the two true DEMs extracted from the dataset on the two dates just mentioned [
48].
Figure 14b shows the height difference between the two InSAR images reconstructed with the proposed method, and its SSIM index against the image in
Figure 14a is 0.29. Both images show similar patterns, but some fine features in
Figure 14a are smeared out in
Figure 14b.
Figure 14c shows the height difference between the two images reconstructed with the nonlocal filter replacing the mean filter. The image shows a similar pattern as in
Figure 14a, with more fragmented features than the latter. The SSIM index between these two images is 0.30.
Figure 14d shows the reconstructed image, with the QGPU replacing the LSPU. It resembles
Figure 14b more closely than
Figure 14c, and its SSIM index against the image in
Figure 14a is 0.29. By comparing
Figure 14a–d, the combination of the NF and LSPU methods seems to manifest more terrain details in the true DEM.
Figure 14e–g shows the height differences acquired with the NF and LSPU at SNR
dB,
dB, and
dB, respectively. Their SSIM indices against
Figure 14a are 0.45, 0.20, and 0.13, respectively, and their RMSE values against
Figure 14a are 4.31 m, 4.63 m, and 9.54 m, respectively. The images in
Figure 14e,f still retain some useful information about terrain profile change, but that in
Figure 14g provides no useful clue.
Figure 15 shows the density maps of high-risk landslide areas acquired with the three methods compared in this section. The areas with height difference greater than
m are highlighted with red marks (
m) and blue marks (
m).
The density maps in
Figure 15b,d appear similar, consistent with the performance indices of these two methods. On the other hand,
Figure 15c manifests an excessive number of high-risk marks.
3.4. Comparison with State-of-the-Art Techniques
In [
58], a satellite-based InSAR method utilizing a Kalman filter (KF) and sequential least squares (SLS) was introduced to implement near-real-time applications. The SLS was designed to reduce the CPU time of conventional LS methods by sequentially processing the whole image. For comparison, the results in
Figure 7,
Figure 10,
Figure 12,
Figure 14 and
Figure 15 demonstrate the efficacy of the LSPU method, which incorporates 2D FFT in the LS method to reduce the CPU time even more significantly.
In [
59], a deep learning-based LSPU method utilizing encoder–decoder architecture (PGENet) was proposed to reconstruct the wrapped phase data embedding noise. Similarly, a deep learning-based QGPU via global attention U-Net was introduced in [
60]. The efficacy of LSPU and QGPU can be enhanced by utilizing a deep learning approach. Furthermore, the results in [
59] demonstrated that LSPU outperformed QGPU, producing lower RMSE and shorter computational time, especially in low-coherence areas. The results in
Figure 7c,e and
Figure 12c,e show that the LSPU has nearly the same performance as the QGPU, not to mention that the LSPU has high computational efficiency, as listed in
Table 2.
In [
61], a weighted least-squares (WLS) technique was proposed to improve the effectiveness of phase unwrapping within a small baseline InSAR framework. Choosing a small baseline in a satellite-based InSAR approach can reduce the computational cost. The proposed UAV-based InSAR approach has relatively smaller (temporal and spatial) baselines compared to the satellite-based counterpart in [
61]. In addition, the UAV-based platform offers more flexibility in achieving specific baseline and revisit time.
In [
15], the low-coherence area and high-coherence area were filtered by a local fringe frequency compensation nonlocal filter and Goldstein filter, respectively. The Goldstein filter, considered an old-fashioned method, was used for its computational efficiency [
15]. For the same reason, the mean filter adopted in our work is suitable for relatively smooth and high-coherence areas. In our approach, the data can be acquired with two UAVs (sensors) in a single flight or with one UAV (sensor) in two separate flights that are staggered by a short revisit time. The coherence in the UAV-based InSAR image pair is higher than that in the satellite-based counterpart, which has typical revisit time of 12 days or longer.
In [
53], various filters were simulated upon ramp and square noisy images. The results indicated that the nonlocal filter outperformed both the Lee filter and the Goldstein filter (considered old-fashioned filters) on square noisy images, but underperformed the latter on ramp noisy images [
53]. Such outcomes are consistent with the simulation findings presented in
Section 3.1,
Section 3.2 and
Section 3.3. The scenarios simulated in
Section 3.1 and
Section 3.3 manifest relatively smooth height profiles, resembling ramp noisy images.
Figure 7c,d and
Figure 12c,d show that the mean filter achieves lower RMSE and higher SSIM value in these two scenarios. On the other hand, the scenario simulated in
Section 3.2 manifests a steep mountain terrain, resembling square noisy images.
Figure 10c,d shows that the nonlocal filter achieves lower RMSE in this scenario.
In the presence of additive Gaussian noise, the pivoting mean filter emerges as statistically optimal from the perspective of maximum likelihood estimation [
20]. As for the scenarios with relatively smooth profile discussed in
Section 3.1 and
Section 3.3, reconstruction with mean filter (MF) results in slightly higher SSIM value and lower RMSE value compared with the nonlocal filter (NF). However, the mean filter may oversmooth the phase details in areas with drastic topographical variations. As discussed in
Section 3.2, the scenario containing some steep areas may not be well reconstructed by using the mean filter, and the nonlocal filter achieves a lower RMSE on the reconstructed DEM.
In [
62], a coherence-guided InSAR phase unwrapping method was proposed in conjunction with cycle-consistent adversarial networks. The coherence-guided phase unwrapping method typically employs a cost function in terms of phase gradients and coherence values to penalize phase discontinuities in low-coherence regions and promote smooth phase paths in high-coherence areas. The method could achieve accurate phase unwrapping with low RMS value. However, the generative adversarial networks entail high computational cost and require extensive training data.
In [
63], a median filter was cascaded with a mean filter based on stationary wavelet transform for phase filtering. The median filter excelled in preserving phase fringes, while the mean filter demonstrated superior noise reduction capabilities.
Lightweight UAVs are typically more susceptible to wind disturbances than airborne platforms in conducting SAR or InSAR imaging tasks. Both types of platform may tilt or dip under headwinds and deviate from planned flight path under crosswinds [
64]. Take a real-world example of dispatching a small UAV for InSAR imaging. It can carry a payload up to 7.5 kg and stay in the air for an hour while equipped with GPS navigation gear. Its attitude response to the wind interference can be ignored if the wind speed is less than 5 mph, and its trajectory deviation can be compensated with servo mechanisms and algorithms.
3.5. Discussions on Contributions and Constraints
The contributions of this work are summarized as follows:
(1) An on-site InSAR imaging method is proposed for monitoring environmental changes. The imaging task is carried out with UAVs, which can be swiftly deployed on site with small decorrelation between master and slave images;
(2) High-resolution DEMs are reconstructed and enhanced with a mean filter to mitigate artifacts on InSAR images, which are attributed to imperfect coregistration between master and slave images. A least-squares phase unwrapping method with extremely low computational cost is applied to run the imaging task in near real time;
(3) Three scenarios of DEM reconstruction are simulated to validate the efficacy of the proposed approach, considering the effect of noise. The fidelity of acquired InSAR images is evaluated in terms of SSIM index and RMSE. The merits of using the mean filter and the least-squares phase unwrapping method are compared with two popular counterparts.
We propose a feasible scheme of deploying UAVs for on-site InSAR imaging of small areas, which cannot be achieved with satellite-borne InSAR platforms. Potential applications include monitoring natural disasters such as landslides, wildfires, and volcanic eruptions. In these scenarios, the satellite-borne InSAR imaging technique is limited by the long revisit time of days, which is impractical for real-time monitoring of disaster evolution. Among many state-of-the-art algorithms, choosing the mean filter and the least-squares phase unwrapping method via Poisson’s difference equation and FFT can practically accomplish real-time imaging tasks in terms of robustness and computational efficiency.
The attitude of an airborne platform can be disturbed by complicated airflow disturbance and platform mechanical oscillation. Their effects on SAR imaging have been compensated with a compressive-sensing technique [
46].