1. Introduction
The estimation of forest height is a timely and important research topic for improving forest management activities and studying the role of forests as a sink or a source of carbon in the global carbon cycle [
1]. Synthetic aperture radar (SAR) and light detection and ranging (LiDAR), which have better penetration ability in forests, are important remote sensing technologies for forest height estimation. Usually, LiDAR, especially airborne LiDAR, can achieve high accuracy in the retrieval of forest height [
2]. However, the technique is mainly applied over small areas because of its low sampling efficiency [
3]. In contrast to LiDAR, SAR can obtain wall-to-wall data over large areas, especially with spaceborne systems such as the SRTM and TanDEM-X missions, which can provide global-scale elevation products. For SAR, the estimation of forest height mainly relies on interferometric SAR (InSAR) technology, including current research hotspots such as tomographic SAR (TomoSAR) and polarimetric SAR interferometry (PolInSAR) techniques. Among them, TomoSAR technology enhances the resolution of SAR in the vertical direction using multi-baseline InSAR joint observations [
4]. However, the high cost (time and money) of data acquisition and complex 3D imaging algorithms discourage the use of TomoSAR for forest height estimation in large areas. PolInSAR technology mainly estimates forest height by using the penetration differences of different polarization channels [
5]. However, this difference in penetration between polarization channels is limited by specific wavelengths, making it difficult to apply to diverse forest-type covers. In view of this, a more robust option may be to make full use of the penetration differences of different SAR wavelength technologies to estimate forest height.
Since forest height is normally defined as the difference between the forest surface height and the underlying surface [
6], the basic idea of estimating forest height using SAR with different wavelengths is clear, as shown in
Figure 1. First, long-wavelength (e.g., P-band, L-band) InSAR, which penetrates the forest better, is used to extract the underlying topography, that is, the digital terrain model (DTM). Second, short-wavelength (e.g., X-band, Ka-band) InSAR, which has limited ability to penetrate the forest, is used to extract the forest surface height, that is, the digital surface model (DSM). The difference between the DSM and DTM is called the canopy height model (CHM), which directly reflects the forest height information in forested areas.
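The differencing idea can be sketched in a few lines. The following is a minimal illustration, assuming the DSM and DTM are already co-registered on the same grid; the function name, the clipping of negative differences and the elevation values are hypothetical and are not taken from the study.

```python
import numpy as np

def canopy_height_model(dsm, dtm):
    """Return CHM = DSM - DTM, clipping negative heights to zero."""
    dsm = np.asarray(dsm, dtype=float)
    dtm = np.asarray(dtm, dtype=float)
    if dsm.shape != dtm.shape:
        raise ValueError("DSM and DTM must share the same grid")
    # Clipping negative differences is an illustrative choice only;
    # it is an assumption of this sketch, not a step described in the paper.
    return np.clip(dsm - dtm, 0.0, None)

# Hypothetical elevations in metres
dsm = np.array([[1510.0, 1522.5], [1498.0, 1505.0]])
dtm = np.array([[1500.0, 1501.5], [1498.5, 1490.0]])
chm = canopy_height_model(dsm, dtm)
```

Any residual offset between the InSAR phase centers and the true DSM or DTM propagates directly into this difference, which is why the bias of each surface matters.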
Several studies have verified the feasibility of estimating forest height based on dual-frequency InSAR data from two different covers. Among them, the most used dual-frequency InSAR combination is P-band and X-band [
2,
7,
8,
9], followed by L-band and X-band [
10,
11]. These studies have shown a stable correlation between forest height and the height difference obtained from long- and short-wavelength SAR interferometry, but the forest height is usually underestimated. The main reason is that the phase center of short-wavelength InSAR is lower than the real DSM because of its limited, but not negligible, penetration ability. For instance, Kugler et al. [
12] found a penetration of up to 12 m in the boreal forest with TanDEM-X data (X-band). On the other hand, the phase center of long-wavelength InSAR is higher than the real DTM owing to the influence of forest scatterers [
13]. For example, one study [
14] observed an overestimation bias of about 5 m in understory terrain inversion based on L-band InSAR data. To solve this problem, empirical models were established based on LiDAR or ground-measured forest height data to correct the underestimation bias [
2,
7]. The common disadvantage of empirical models is that external data are necessary, and, in most cases, the established models are applicable only for the study area where the external data are collected [
2,
7]. This empirical approach limits the applicability and robustness of forest height estimation methods. Therefore, how to compensate for the deviation between the phase center of long/short-wavelength InSAR and the DTM/DSM is the key to improving the accuracy of forest height estimation based on dual-frequency InSAR. Fortunately, in recent years, several methods have been developed for accurate extraction of the DTM and DSM based on InSAR data of suitable wavelengths.
For accurate extraction of the DTM based on long-wavelength InSAR data, the methods based on random volume over ground (RVoG) and time-frequency (TF) analysis are the two most used approaches [
13]. The former mainly relies on the RVoG model and PolInSAR data, and the DTM is usually a by-product of the forest height inversion process [
5,
15,
16]. This type of method has already been used in the forest height estimation combining dual-frequency InSAR data [
9]. However, the main problem is that the RVoG model assumes that the forest is a volume of uniformly distributed scatterers, and this assumption does not hold for InSAR with long wavelengths, such as the P-band. For P-band SAR, the main scatterers of the forest are trunks and thick branches, which are clearly heterogeneous. In view of this, some scholars have proposed a TF analysis method based on subaperture decomposition [
17,
18]. This type of method reduces the influence of forest scatterers on InSAR signals by extracting subaperture images for interferometric processing. With this method, the phase center of the underlying topography can be directly separated from the total InSAR signal without relying on any physical model assumption. This feature improves the applicability of the DTM extraction method for diverse forests.
For the extraction of the DSM based on short-wavelength InSAR, there is a certain consensus that the penetration ability of short-wavelength InSAR will still cause deviations that cannot be ignored [
19,
20,
21], and the penetration bias can be affected by forest structures [
22,
23]. At present, two types of compensation methods of penetration bias have been proposed: empirical methods and physical model methods. The empirical methods construct a statistical model to correct the bias of the InSAR DSM based on prior knowledge of forest height, leaf area index and forest canopy closure obtained from ground survey or LiDAR sample data [
19,
20,
24]. In contrast, the compensation methods based on the physical model do not rely on (or rely on relatively little) prior knowledge. Dall was the first to propose a compensation method based on a physical model for the elevation bias of InSAR measurements [
25]. In Dall’s approach, forests were assumed to be infinitely deep uniform volumes (IDUV). On this basis, the relationship between InSAR coherence and elevation measurement bias was established. Subsequently, Schlund et al. applied the IDUV model to the bias compensation of the DSM measured by X-band InSAR and obtained a good compensation effect [
21]. However, the physical assumptions of the IDUV model about forests do not actually apply to short-wavelength InSAR. For short-wavelength SAR, such as the X-band (approximately 3 cm), the dielectric penetration distance of the SAR signal in the forest canopy is very short, and the SAR signal is more likely to penetrate the forest stand through canopy gaps [
26]. Soja et al. considered this characteristic of the X-band and proposed a two-level model (TLM) [
27]. Moreover, Zhao et al. proposed a more generalized multi-level model (MLM) [
28]. The TLM and MLM models have been used for forest height retrieval or forest biomass estimation based on X-band InSAR [
27,
28,
29,
30]. The related research results indicated that the gap penetration model represented by the TLM and MLM conforms to the scattering mechanism characteristics of X-band InSAR. However, such gap penetration models have not been applied to the bias compensation of the DSM extraction based on short-wavelength InSAR.
In summary, based on the studies reviewed above, the combination of P-band and X-band InSAR is an important means to achieve a robust estimation of forest height, and the key to improving the accuracy of forest height estimation is to achieve unbiased extraction of the DTM and DSM. In this regard, the existing methods are mainly based on the dielectric penetration model represented by RVoG and IDUV to achieve accurate extraction of the DTM or DSM. However, the homogeneous and isotropic assumptions of the dielectric penetration model for forests do not apply to very long (e.g., P-band) and very short (e.g., X-band) wavelengths. To reduce the uncertainty caused by the inapplicability of the model, the TF analysis method based on subaperture decomposition should be a better choice for extracting the DTM from P-band InSAR, and a new bias compensation algorithm based on the gap penetration model (e.g., MLM) needs to be developed for extracting the DSM from X-band InSAR. In fact, the DTM extraction based on the TF method and the DSM compensation based on the gap penetration model are similar in that both utilize the gap penetration capability of SAR. For a long time, the gap penetration ability of SAR has been neglected, and scholars have generally focused on the dielectric penetration mechanism with the extinction process of SAR. However, gap penetration is actually a fundamental capability of SAR that cannot be ignored. In recent years, the number of studies inverting forest structure parameters based on the gap penetration scattering mechanism has gradually increased [
18,
26,
28,
29,
31].
In this study, we provide a new approach to forest height estimation combining P-band and X-band InSAR data. First, an improved TF analysis method was used for the DTM extraction of P-band InSAR data. Second, a novel bias compensation algorithm was developed for the DSM extraction of X-band InSAR based on the MLM. The article is structured as follows. The methodology of the DTM extraction method based on TF analysis and the DSM extraction and compensation method is given in
Section 2. The study area and experimental data are presented in
Section 3. In
Section 4, the details of the data processing are presented.
Section 5 shows the extracted InSAR elevations and the estimation results of forest height. Finally, the discussion and conclusions are presented in
Section 6 and
Section 7.
4. Data Processing
4.1. DTM Extraction Process
The DTM extraction process began with SLC images of P-band InSAR data, and as mentioned before, mainly included three steps: subaperture decomposition, RME removal and DTM generation.
For the P-band polarimetric InSAR data used in this paper, we first needed to select the polarization channel with the strongest penetration ability into the forest, and previous studies have shown that HH polarization is the better choice [17]. Then, based on the HH-polarized P-band InSAR data, five subaperture images of equivalent resolution with 50% overlap (each subaperture was a third of the full aperture) were generated for both primary and secondary images. Subsequently, the primary and secondary images of the same subaperture were co-registered with an accuracy of 0.02 to 0.03 pixels in range and 0.05 to 0.06 pixels in azimuth. Then, the interferograms of the subaperture InSAR pairs were generated with a 3 × 3 multi-look window.
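The subaperture geometry described above (five subapertures, each one third of the full aperture, with 50% overlap) exactly tiles the full aperture. The sketch below illustrates this bookkeeping only, assuming the decomposition is carried out on bins of the azimuth spectrum; the function name and the bin count of 600 are hypothetical, and a real implementation would also need spectral weighting removal and an inverse transform per subaperture.

```python
def subaperture_windows(n_bins, n_sub=5, fraction=1 / 3, overlap=0.5):
    """Return (start, stop) azimuth-spectrum bin ranges for each subaperture."""
    width = int(round(n_bins * fraction))       # bins per subaperture
    step = int(round(width * (1.0 - overlap)))  # shift between neighbours
    windows = []
    for i in range(n_sub):
        start = i * step
        stop = start + width
        if stop > n_bins:
            raise ValueError("subapertures exceed the full aperture")
        windows.append((start, stop))
    return windows

# Hypothetical full aperture of 600 spectral bins
windows = subaperture_windows(600)
```

With these settings, each subaperture spans 200 bins, neighbouring subapertures share 100 bins, and the fifth one ends at the last bin of the full aperture.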
In the RME removal, φflat and φtopo were generated and removed from the interferometric phase of each subaperture based on the image parameters (interferometric baseline, slant range and incidence angle) of the InSAR system and SRTM DEM data [36]. Then, a level-1 wavelet decomposition with the Haar wavelet basis was performed to remove Δφtopo, φforest and φnoise. Subsequently, φRME was fitted with a third-order (n = 3) polynomial based on Equation (5). Finally, the effect of RME was removed by multiplying the original subaperture interferograms by the complex conjugate of the fitted φRME term.
For the DTM generation, the subaperture whose interferometric phase differed most from the interferometric phase of the HV polarization in the full-aperture data was selected as the optimal ground phase for every pixel. Then, to reduce the phase noise, the optimal ground phase was filtered with a modified Goldstein filter with a 3 × 3 window size [40]. Next, the filtered phase was unwrapped using the minimum cost flow (MCF) method [41]. Finally, the P-band DTM (DTMP-band) was obtained by a linear phase-to-height conversion [36].
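The per-pixel selection of the optimal ground phase can be sketched as follows. This is a minimal illustration, assuming that the "largest difference" is measured as the absolute phase difference wrapped to [0, π]; that wrapping convention, the function names and the sample phases are assumptions of this sketch, not details confirmed by the text.

```python
import numpy as np

def wrapped_difference(phase_a, phase_b):
    """Absolute phase difference wrapped to [0, pi]."""
    return np.abs(np.angle(np.exp(1j * (phase_a - phase_b))))

def select_ground_phase(sub_phases, hv_phase):
    """Pick, per pixel, the subaperture phase farthest from the HV phase.

    sub_phases: array (n_sub, rows, cols) of subaperture phases in radians.
    hv_phase:   array (rows, cols) of full-aperture HV phase in radians.
    """
    diff = wrapped_difference(sub_phases, hv_phase[np.newaxis, ...])
    best = np.argmax(diff, axis=0)  # index of the selected subaperture
    ground = np.take_along_axis(sub_phases, best[np.newaxis, ...], axis=0)[0]
    return ground, best

# Hypothetical example: 3 subapertures on a 1 x 2 grid
sub_phases = np.array([[[0.1, 3.0]], [[0.3, 2.0]], [[0.8, -3.0]]])
hv_phase = np.array([[0.0, 3.1]])
ground, best = select_ground_phase(sub_phases, hv_phase)
```

In the second pixel, the phase of -3.0 rad is only about 0.18 rad away from 3.1 rad once wrapping is taken into account, so the subaperture with 2.0 rad is selected instead.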
In terms of the noise level of the P-band SAR data used, the equivalent number of looks (ENL) for the SLC level was 0.90, and the ENL after multi-looking and filtering was 13.51.
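The text does not state how the ENL values were estimated. A common moment-based estimator over a homogeneous area is the squared mean of the intensity divided by its variance; the sketch below applies it to simulated single-look speckle and to a 3 × 3 block average. The simulated data, the function name and the block-averaging step are illustrative assumptions and do not reproduce the processing chain of the paper.

```python
import numpy as np

def estimate_enl(intensity):
    """Moment-based ENL estimate: squared mean over variance of intensity."""
    intensity = np.asarray(intensity, dtype=float)
    return intensity.mean() ** 2 / intensity.var()

rng = np.random.default_rng(0)
# Simulated single-look intensity of a homogeneous area (exponential speckle)
single_look = rng.exponential(scale=1.0, size=(300, 300))
# 3 x 3 multi-looking by block averaging
multi_look = single_look.reshape(100, 3, 100, 3).mean(axis=(1, 3))

enl_single = estimate_enl(single_look)  # close to 1 for ideal speckle
enl_multi = estimate_enl(multi_look)    # close to 9 for independent looks
```

Real SLC data usually yield values below these ideals because neighbouring pixels are correlated, while additional filtering raises the ENL beyond the multi-look factor.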
4.2. DSM Extraction and Compensation Process
The DSM extraction and compensation process consisted of two main steps: the extraction of the initial version of the DSM (DSMX-InSAR) and the estimation of DSMbias.
For the extraction of DSMX-InSAR, the first step would normally be co-registration. Since the InSAR data of the TanDEM-X system are co-registered before they are provided to users, this step was skipped, and the extraction began with interferogram generation using the SLC images of the TanDEM-X InSAR data. The interferogram was generated with a 3 × 3 multi-look window. Then, the flat-earth phase was calculated and removed based on the image parameters of the InSAR system [36]. Subsequently, the flattened InSAR phase was filtered with the Goldstein filter with a 3 × 3 window size to reduce the phase noise [40]. Next, the MCF algorithm was used for phase unwrapping [41]. Finally, DSMX-InSAR was extracted from the unwrapped phase by a linear phase-to-height conversion [36].
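The linear phase-to-height conversion can be illustrated with the height of ambiguity (HoA), the height change that corresponds to one full 2π phase cycle. The sketch below assumes a single constant HoA and a known reference height; in practice the HoA varies with range and incidence angle, and the value of 50 m is hypothetical.

```python
import numpy as np

def phase_to_height(unwrapped_phase, height_of_ambiguity, reference_height=0.0):
    """Linear phase-to-height conversion: one 2*pi cycle equals one HoA."""
    phase = np.asarray(unwrapped_phase, dtype=float)
    return reference_height + phase * height_of_ambiguity / (2.0 * np.pi)

# Hypothetical flattened, unwrapped phases in radians and a 50 m HoA
heights = phase_to_height([0.0, np.pi, 2.0 * np.pi], height_of_ambiguity=50.0)
```

Because the conversion is linear, any phase-center offset caused by canopy penetration maps proportionally into a height offset, which is the bias that the compensation step addresses.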
For the estimation of DSMbias, the first step was the calculation of coherence. Before the calculation, the combined flat-earth and topographic phase was calculated and removed according to the image parameters of the InSAR system and the previously extracted DTMP-band [36]. Then, γIDUV and γMLM were estimated according to Equation (12) and Equation (21), respectively. The estimation window size was 3 × 3 in both cases. Since the main decorrelation factors of the TanDEM-X InSAR data were the signal-to-noise ratio (SNR) and volume scattering [12,42], the influence of SNR was removed by dividing γIDUV by the SNR decorrelation, which was computed based on the backscattering intensity and noise equivalent sigma zero (NESZ) provided as the image parameters for every TanDEM-X acquisition [42].
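The SNR correction can be sketched with the standard relation between SNR and coherence. The example assumes that both acquisitions have the same SNR, so that the SNR decorrelation reduces to 1 / (1 + 1/SNR), and that the backscatter value is already noise-free; the function names and the input values in dB are hypothetical.

```python
import numpy as np

def snr_decorrelation(sigma0_db, nesz_db):
    """SNR coherence for two channels with equal SNR: 1 / (1 + 1/SNR)."""
    sigma0 = 10.0 ** (np.asarray(sigma0_db, dtype=float) / 10.0)
    nesz = 10.0 ** (np.asarray(nesz_db, dtype=float) / 10.0)
    snr = sigma0 / nesz  # assumes sigma0 is the noise-free backscatter
    return 1.0 / (1.0 + 1.0 / snr)

def remove_snr_effect(coherence, sigma0_db, nesz_db):
    """Divide the observed coherence by the SNR decorrelation, capped at 1."""
    gamma_snr = snr_decorrelation(sigma0_db, nesz_db)
    corrected = np.asarray(coherence, dtype=float) / gamma_snr
    return np.clip(corrected, 0.0, 1.0)

gamma_snr = snr_decorrelation(-10.0, -20.0)        # SNR of 10 dB
gamma_vol = remove_snr_effect(0.80, -10.0, -20.0)  # remaining (volume) part
```

With unequal SNRs in the two channels, the general form is the inverse square root of the product of (1 + 1/SNR) over both channels.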
For the IDUV model, DSMbias was estimated according to Equation (11) based on γIDUV, and the compensated DSM based on IDUV (DSMX-IDUV) was obtained by adding DSMbias to DSMX-InSAR. Similarly, for the MLM, the compensated DSM based on the MLM (DSMX-MLM) was formed by adding the DSMbias of the MLM to DSMX-InSAR. The DSMbias of the MLM in the special uniform distribution case was estimated according to Equation (20), based on γMLM.
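To make the coherence-to-bias step concrete, the sketch below derives the phase-center depth for an infinitely deep uniform volume from first principles. It assumes an exponential profile exp(z/d) for z ≤ 0 and a vertical wavenumber kz = 2π/HoA, which gives a complex coherence of 1 / (1 + j·kz·d). The result may differ in form from Equation (11), which is not reproduced here, and the MLM relation of Equation (20) is not covered; the function names and numbers are hypothetical.

```python
import numpy as np

def iduv_bias(coherence, height_of_ambiguity):
    """Phase-center depth below the canopy top for an IDUV-type volume.

    From |gamma| = 1 / sqrt(1 + (kz*d)**2) it follows that
    kz*d = sqrt(1/|gamma|**2 - 1), and the phase center lies
    arctan(kz*d) / kz below the top of the volume.
    """
    gamma = np.clip(np.asarray(coherence, dtype=float), 1e-6, 1.0)
    kz = 2.0 * np.pi / height_of_ambiguity
    return np.arctan(np.sqrt(1.0 / gamma**2 - 1.0)) / kz

def compensate_dsm(dsm_insar, bias):
    """Add the estimated penetration bias back onto the InSAR DSM."""
    return np.asarray(dsm_insar, dtype=float) + bias

# Hypothetical volume coherences and a 40 m height of ambiguity
bias = iduv_bias([1.0, 1.0 / np.sqrt(2.0)], height_of_ambiguity=40.0)
dsm_compensated = compensate_dsm([1500.0, 1500.0], bias)
```

Under these assumptions the arctangent saturates at π/2, so the bias can never exceed a quarter of the height of ambiguity, however low the coherence is.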
In terms of the noise level of the X-band SAR data used, the ENL for the SLC level was 0.84, and the ENL after multi-looking and filtering was 19.64.
4.3. Forest Height Estimation and Accuracy Validation
The final forest height estimates were obtained by subtracting DTMP-band from the different X-band InSAR DSMs. The pixel size of the P-band and X-band InSAR products was 6 m × 6 m (coordinate system: UTM zone 50N), which was determined by the pixel size of the InSAR SLC data (about 2 m) and the multi-looking window (3 × 3).
The accuracy verification was based on the InSAR product with a 6 m × 6 m pixel size and the LiDAR product with a 2 m × 2 m pixel size. For each pixel of the InSAR product to be evaluated, we first found the LiDAR pixel located at its center. Then, taking this LiDAR pixel as the center, all LiDAR pixels within the ground area corresponding to the InSAR pixel were used for verification. Since the InSAR procedures involved 3 × 3 multi-looking and 3 × 3 filtering, each pixel of the InSAR products contained information from an area of about 18 m × 18 m. Therefore, although the pixel size of the InSAR product was 6 m × 6 m, the LiDAR-based verification value was calculated from all LiDAR pixels within 18 m × 18 m. For the validation of the DTM, the validation value was the mean of all LiDAR DTM pixels within the 18 m × 18 m window. For the validation of forest height, the validation value was the maximum of all LiDAR CHM pixels within the 18 m × 18 m window, as displayed by
Figure 7d in the previous section. For the validation of the DSM, the verification value was the sum of the above two values.
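The aggregation of LiDAR pixels into one reference value per InSAR pixel can be sketched as follows. With 2 m LiDAR pixels, an 18 m × 18 m area corresponds to a 9 × 9 window around the central LiDAR pixel. The function name, the edge handling and the synthetic rasters are assumptions made for illustration.

```python
import numpy as np

def lidar_reference(lidar_dtm, lidar_chm, row, col, half_width=4):
    """Reference DTM, forest height and DSM from a LiDAR window.

    (row, col) is the LiDAR pixel at the centre of the InSAR pixel; with
    2 m LiDAR pixels, half_width=4 gives a 9 x 9 window (18 m x 18 m).
    """
    r0, r1 = max(row - half_width, 0), row + half_width + 1
    c0, c1 = max(col - half_width, 0), col + half_width + 1
    dtm_ref = float(np.mean(lidar_dtm[r0:r1, c0:c1]))    # mean terrain height
    height_ref = float(np.max(lidar_chm[r0:r1, c0:c1]))  # maximum canopy height
    return dtm_ref, height_ref, dtm_ref + height_ref     # DSM is the sum of both

# Hypothetical LiDAR rasters: flat terrain at 1500 m with a single 17 m tree
lidar_dtm = np.full((9, 9), 1500.0)
lidar_chm = np.zeros((9, 9))
lidar_chm[2, 6] = 17.0
dtm_ref, height_ref, dsm_ref = lidar_reference(lidar_dtm, lidar_chm, 4, 4)
```

Taking the maximum of the CHM means that a single tall tree anywhere in the window sets the reference forest height, whereas the terrain reference is smoothed by averaging.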
For the validation of the forest height estimation results, samples were extracted systematically at 42 m intervals. Of the 2754 samples obtained in this way, 123 samples from non-vegetated areas (H100 < 0.5 m) were removed, and the remaining 2631 samples were evenly distributed in the study area.
The above steps were completed using GAMMA software (version: 2012; GAMMA Remote Sensing corporation, Gumligen, Switzerland;
https://www.gamma-rs.com/, accessed on 20 June 2022), PolSARpro software and our own program. The software and program were run on a computer with AMD Ryzen 7 5800H CPU and NVIDIA GeForce RTX 3060 GPU.
6. Discussion
In this paper, we proposed a method for forest height estimation using the penetration difference between P-band and X-band InSAR. Although the experimental results verified the effectiveness of the proposed method, some key issues and future developments of this study should be further described.
For DTM extraction based on P-band InSAR, we adopted the improved TF analysis method proposed by Fu et al. [
18], which used a subaperture decomposition technique to enhance the ability of P-band SAR to “see” the pure ground surface through forest gaps. In this paper, we again verified the effectiveness of the TF analysis method. However, the experimental verification of the TF method in this study was based on airborne high-resolution P-band InSAR data [
17,
18]. This method is only suitable for high-resolution long-wavelength InSAR data, a requirement that is difficult to meet for spaceborne long-wavelength SAR. For example, the spaceborne P-band InSAR mission BIOMASS will have a spatial resolution of about 12.5 m × 25 m [
43], which is much coarser than the spatial resolution of the airborne InSAR data used in this study. Low spatial resolution will lead to a narrow observation angle. In this situation, it is also difficult to obtain pure ground surface SLC pixels with methods based on TF analysis. In other words, the size of the SLC pixel is likely to be larger than the forest gap that can be penetrated, and mixing with forest scatterers is unavoidable. In this case, it may be useful to consider the dielectric penetration scattering mechanism within the SLC pixel based on the TF analysis method. In addition to algorithmically adapting to low resolutions as much as possible, we strongly recommend launching P-band SAR satellites with a spatial resolution better than 3 m to improve the gap penetration capability of spaceborne SAR. It is notable that China’s civil P-band SAR has just entered the pre-research stage, and higher spatial resolution should be a basic feature of this satellite [
44]. In addition, the TF method only increases the probability of the SAR signal seeing the pure ground surface by changing the observation angle in the azimuth direction. In fact, in the range direction, the viewing angle of the SAR relative to the forest has a strong effect on the penetration ability. In general, a smaller angle of incidence is more conducive to penetration of the forest [
9]. However, in this paper, the incident angle of the P-band SAR was about 63 degrees, which is relatively large. This may also have an impact on the extraction accuracy of the DTM. If multi-angle InSAR observations are carried out in the range direction, such as complementary observations of ascending and descending orbits, it will be possible to further increase the possibility of SAR sensors seeing the pure ground surface.
For DSM extraction based on X-band InSAR, we proposed a new compensation method based on the gap penetration scattering mechanism. Compared with the IDUV method, the advantage of our method is that there is no theoretical limit on the maximum penetration depth, so it is more likely to be applied to diverse forests. Because of the wavelength limitation, X-band cannot use subaperture decomposition technology to enhance the ability of gap penetration, like high-resolution P-band SAR. Even so, the gap penetration capability of the X-band is unquestionable. For example, Lei et al. realized the simultaneous automatic extraction of the DTM and DSM in the tropical forest area by using the gap penetration ability of the X-band InSAR [
31]. However, in Lei’s method, only a few looks can be used in order to avoid the influence of multi-look averaging. In this case, the algorithm must be highly robust to various kinds of noise. In contrast, this is not a concern for our method. Within an estimation unit (such as 20 m × 20 m), we did not focus on extracting the phase of the SLC pixel that has the highest phase center and is close to the location of the forest canopy top because the phase of a single SLC pixel has a large uncertainty. We chose to average all SLC pixels in the estimation unit to reduce the influence of noise, and then reconstruct the position of the highest SLC pixel with certain model assumptions, mainly about the distribution of the scatterers. Therefore, for our method, reasonable model assumptions about the distribution of the scatterers based on the type and structure of the specific forest are the most important factor for the extraction accuracy of the DSM. In this study, the distribution was assumed to be uniform for the purpose of model simplicity. Obviously, this does not work for all forest types. For example, for sparse forest, the scatterers are more likely to be concentrated on the ground surface than to be evenly distributed. In this case, a more complex and realistic distribution rather than a uniform distribution may improve the extraction accuracy of the DSM, thereby improving the estimation accuracy of the forest height. As a previous study [
28] shows, the MLM with a mixed truncated normal distribution achieved a superior forest height result compared to the model with a less complex distribution. Further study will focus on more complex scatterer distributions to improve the accuracy and adaptability of the MLM-based DSM compensation method.
In addition, it should be mentioned that the integration of InSAR data from different frequencies requires accurate geo-positioning and comparable spatial resolution, especially when the data are acquired from different platforms. In our case, spaceborne X-band InSAR data and airborne P-band InSAR data were used. Along with these data, accurate orbit information and image parameters were provided, which allowed us to perform accurate geocoding (within 1 pixel) for both datasets based on the Range-Doppler positioning model. The resolutions of the airborne and spaceborne InSAR SLC data are comparable. Therefore, the influence of this aspect on the results of this paper should be small.
7. Conclusions
This paper presented a forest height estimation approach utilizing the penetration difference between P-band and X-band InSAR data. The forest height was determined by subtracting the P-band DTM from the X-band DSM. To realize the unbiased DTM extraction, the improved TF analysis method was utilized to separate and obtain the pure underlying topography phase. The method enhanced the gap penetration capability of the P-band InSAR by adopting subaperture decomposition. To compensate for the height bias in the DSM, we proposed a novel method based on the MLM. As opposed to traditional scattering models (such as IDUV and RVoG), which focus on the dielectric penetration of the SAR signal, the MLM emphasizes the gap penetration capability of the radar, which is more in line with the characteristics of the scattering mechanism for X-band InSAR in a forest scenario.
The proposed method was validated using the spaceborne X-band and airborne P-band InSAR data of the Saihanba Forest Farm. The experimental results showed that the DTM extracted with the TF method achieved an RMSE of about 0.94 m, which is sufficient for forest height estimation. The proposed MLM-based DSM compensation method outperformed the IDUV-based method by 1.34 m in terms of RMSE. Furthermore, in the area where the original X-band DSM underestimated the forest surface height by more than 10 m, the MLM-based method achieved a better compensated result than the IDUV-based method. As a result, using the extracted P-band DTM, the forest height estimation based on the MLM-based compensation method achieved better accuracy (Acc. = 86.58%, RMSE = 1.81 m) than the IDUV-based method (Acc. = 78.09%, RMSE = 2.98 m). The results demonstrate the effectiveness of the proposed forest height estimation method in the study area. In the future, we plan to apply this approach to estimate the forest height in more and larger areas to further assess the adaptability of the method.