1. Introduction
Multi-temporal SAR images utilize various auxiliary data sources for cross-validation and information supplementation [1], allowing for the observation of surface changes. Multi-temporal SAR provides a more comprehensive and detailed view of surface and dynamic changes [2]. SAR not only overcomes the limitations of time and weather conditions in ground observation [3], specifically demonstrating the capability of all-weather, all-time, and the continuous observation of moving targets [4,5], but also exhibits a certain ability to penetrate vegetation, soil, and occlusions [6,7]. Given these unique advantages of SAR, its applications are extremely diverse. For example, Zhang et al. proposed a near real-time method for monitoring the progression of forest fires [8]. Fu et al. mapped mangrove species and elucidated their scattering characteristics to monitor the extent and health of mangroves [9]. However, the presence of speckle noise significantly affects the quality and resolution of SAR images [10]. Therefore, removing speckle noise has always been a key issue for the further processing and applications of SAR images. Speckle removal can be broadly categorized into two types: single-temporal SAR denoising and multi-temporal SAR denoising.
Methods for single-temporal SAR speckle removal can be broadly categorized into three main types: spatial domain filtering, transform domain filtering, and deep learning filtering [11]. Spatial domain filtering includes Lee filtering [12], Frost filtering [13], and Kuan filtering [14], among others. The performance of spatial domain filtering is highly affected by the size of the filtering window, as smaller windows may not effectively suppress noise, while larger windows may lead to the loss of image texture details during denoising [15]. Transform domain filtering commonly utilizes Fourier transform [16], wavelet transform [17], and other techniques. Deep learning filtering includes MRDDANet [18] and AGSDNet [19], among others. SAR images inherently contain speckle noise during the imaging process, making it impossible to obtain completely clean images [20]. Supervised learning requires a training dataset with clean images, but there are no clean images in SAR. Moreover, deep learning models lack interpretability [21], which restricts their further applications in certain scenarios where model explanations are needed.
With the continuous development of SAR satellite technology, SAR satellites are capable of capturing multiple images of the same target with shorter time intervals. Given the increasing demand for multi-temporal SAR speckle removal, several common methods have emerged. Lê et al. proposed a novel method for the temporal adaptive despeckling of multi-temporal SAR images [22]. Chierchia et al. proposed a despeckling algorithm for multi-temporal SAR images, utilizing the principles of block-matching and collaborative filtering [23]. The RABASAR [24], proposed by Zhao et al. in 2019, is one of the most remarkable frameworks for multi-temporal SAR image despeckling in recent years [25]. The key idea of the RABASAR lies in the utilization of ratio images, as they are more amenable to despeckling due to their better spatial stationarity. However, the RABASAR still has some limitations in terms of obtaining the “superimage” and the ratio image.
Regarding the method of obtaining the “superimage”, the RABASAR adopts a weighted averaging technique during its first step to generate the so-called “superimage”. This approach may cause significant information loss in the input of multi-temporal SAR images, contradicting the goal of enriching image information through multi-temporal SAR images. To address this issue, we propose utilizing the DSMT-NLM to obtain the “superimage”. Antoni Buades et al. introduced the non-local means (NLM) algorithm [26], which incorporates both local and global information. Its concept still represents a groundbreaking advancement. However, the algorithm has limitations, which can be analyzed from two aspects. Firstly, the NLM itself has room for improvement. Within the target window, there may be pixels with low relevance to the center pixel. As a result, the weighted similarity calculation may assign too low a weight to the center pixel of a sliding window even though it closely resembles the center pixel of the target window. This leads to significant discrepancies between the updated pixel values and the true values when updating through weighting. Secondly, the application of the NLM to multi-temporal SAR images poses challenges. Existing research methods often utilize single-temporal SAR images for NLM despeckling, which may result in insufficient information. The presence of speckle noise caused by coherence effects in SAR imaging blurs the boundary between the speckle noise and image details, making it challenging to distinguish speckle noise from details in SAR images. Multi-temporal SAR images are designed to address this information deficiency. To address the above issues, we propose the DSMT-NLM. The DSMT-NLM utilizes directional segmentation to identify the window with the highest correlation as the target window, effectively overcoming the distortion at the image edges. Additionally, we traverse all time-series windows using a sliding window to maximize the utilization of information from all multi-temporal SAR images, thus improving the despeckling performance. Lastly, by combining data from multiple information sources, we perform information fusion, thereby supplementing the content and dimensions lacking in a single information source, enhancing the completeness and accuracy of SAR image information, and obtaining the “superimage”.
Regarding the approach to obtain the ratio image, our experiments with the RABASAR showed that computing the ratio between a single selected image of interest and the “superimage” yields a final image that contains only the geographical information of the selected image. This processing approach contradicts the goal of using the rich information from multi-temporal data to generate the “superimage”. To address this issue, we improved the method of generating the ratio image. We adopted a WAMWT to fuse information from the multi-temporal SAR images, resulting in the creation of a superimposed image. Subsequently, we performed the ratio operation between the superimposed image and the “superimage” to generate the ratio image. This processing approach preserves the characteristics of the multi-temporal information, thus obtaining more accurate despeckling results.
The main contributions of this study are as follows:
We propose a DSMT-NLM to acquire a high-quality “superimage”.
We employ a WAMWT to fuse information from multi-temporal SAR images, producing a superimposed image. Subsequently, we perform a ratio operation between the superimposed image and the “superimage” to generate the ratio image.
We introduce a directional segmentation method to calculate the window with the highest correlation as the target window.
By employing a sliding window to traverse all time-series images, we maximize the utilization of information from all temporal SAR images, significantly enhancing the despeckling effect.
2. Research Methodology
The flowchart of the whole framework is illustrated in Figure 1. The algorithm comprises the following steps:
Step 1: From the input multi-temporal SAR images, sequentially select each SAR image of the time series as the reference image.
Step 2: For the pixels to be despeckled in the reference image, perform directional segmentation within their neighborhood, resulting in eight directional windows: up, down, left, right, left-up, left-down, right-up, and right-down. Calculate the weighted average of pixels within each directional window to obtain their mean values. Then, utilize the correlation distance to calculate the relevance between the pixel mean values of each directional window and the pixels to be despeckled in the reference image. Among the eight directional windows, identify the one with the maximum relevance to the pixels to be despeckled in the reference image, and select that directional window as the target window.
Step 3: For each SAR image in the time series, including the selected reference image, set a search window. Calculate the similarity between the target window and the sliding windows within the search window to determine the weights of the center pixels in the sliding windows. Multiply each center pixel value of the sliding window by its corresponding weight, and then calculate the weighted average of these products. The resulting value is used to update the pixel to be despeckled in the target window. Repeat the above process for each pixel in the reference image, thus completing the despeckling process for the reference image. Similarly, repeat the above steps for other SAR images selected as reference images to achieve the despeckling for all reference images.
Step 4: Apply wavelet transform to each filtered reference image to decompose it into different frequency components. Merge these components and reconstruct the “superimage” through wavelet inverse transform. Perform MuLoG-BM3D filtering on the “superimage”. At the same time, follow the same fusion process for the unfiltered input images to obtain the superimposed image. Then, calculate the ratio image by performing the ratio operation between the superimposed image and the filtered “superimage”.
Step 5: Apply RuLoG filtering to the ratio image, and then perform the inverse transform to obtain the final image.
2.1. Preliminary Despeckling of Multi-Temporal SAR Images Based on Directional Segmentation
The original RABASAR utilizes a weighted average method to generate the “superimage”, which achieves initial despeckling and integrates the information from multi-temporal SAR images. However, this simple approach leads to unsatisfactory despeckling results and fails to fully exploit the abundant information features in multi-temporal SAR images. Moreover, the use of weighted averaging causes the smoothing of image texture, resulting in the severe loss of feature information. This contradicts the fundamental idea of utilizing multi-temporal SAR images to compensate for the insufficient information in single-temporal SAR images. Therefore, to address these issues, we propose utilizing the DSMT-NLM method to generate the “superimage”, as illustrated in Figure 2. The specific steps are detailed in this section.
2.1.1. Selection of Target Window Based on Directional Segmentation
In the original NLM and its subsequent improvements, preserving edge details is often overlooked. This is because there are pixels in the center pixel block that have a low correlation with the center pixel, especially in the regions near the edges. This significantly interferes with the weight calculation between the target window and the sliding window, causing blurriness at the image edges and leading to the loss of edge details. To address this issue, our algorithm improves upon the NLM by introducing directional segmentation to guide the selection of the target window, thereby mitigating edge blurriness.
For each pixel to be despeckled in the reference image, together with its corresponding neighborhood region, our algorithm adopts directional segmentation to obtain eight directional windows, one for each of the eight directions: left-up, left-down, right-up, right-down, up, down, left, and right. The schematic diagram of the directional segmentation is illustrated in Figure 3.
By taking the weighted average of the pixel values within each directional window, we obtain the pixel value mean of that window. To provide a clearer explanation, let us take the left-up directional window as an example. Figure 4 shows the four pixel points within this window. We assign one weight to each of the four pixel values. After the weighted average calculation, we obtain the pixel value mean of the window, as shown in Equation (1), where the normalization term is the sum of the weights of the pixel points, as shown in Equation (2).
Next, we calculate the correlation distance between the pixel mean of each directional window and the pixel value to be despeckled using Equation (3). Then, we select the directional window with the highest correlation as the target window for the pixel to be despeckled, as shown in Equation (4). Here, the direction index ranges over left-up, left-down, right-up, right-down, up, down, left, and right.
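The selection of the target window can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the offset layout in `DIRECTIONAL_WINDOWS` (four pixels per window inside a 3×3 neighborhood), the function name `select_target_window`, and the relevance measure (the negative absolute difference between the window mean and the pixel, used here as a stand-in for the correlation distance of Equation (3)) are all hypothetical.

```python
import numpy as np

# Hypothetical layout: each directional window is a set of (row, col) offsets
# from the pixel to be despeckled, all inside its 3x3 neighbourhood.
DIRECTIONAL_WINDOWS = {
    "left-up":    [(-1, -1), (-1, 0), (0, -1), (0, 0)],
    "left-down":  [(0, -1), (0, 0), (1, -1), (1, 0)],
    "right-up":   [(-1, 0), (-1, 1), (0, 0), (0, 1)],
    "right-down": [(0, 0), (0, 1), (1, 0), (1, 1)],
    "up":         [(-1, -1), (-1, 0), (-1, 1), (0, 0)],
    "down":       [(1, -1), (1, 0), (1, 1), (0, 0)],
    "left":       [(-1, -1), (0, -1), (1, -1), (0, 0)],
    "right":      [(-1, 1), (0, 1), (1, 1), (0, 0)],
}


def select_target_window(image, row, col, weights=(1.0, 1.0, 1.0, 1.0)):
    """Pick the directional window whose weighted mean is closest to the pixel.

    (row, col) must be an interior pixel so that every offset stays in bounds.
    """
    center = float(image[row, col])
    w = np.asarray(weights, dtype=float)
    best_name, best_mean, best_relevance = None, None, -np.inf
    for name, offsets in DIRECTIONAL_WINDOWS.items():
        values = np.array([image[row + dr, col + dc] for dr, dc in offsets],
                          dtype=float)
        # Weighted mean of the window: weighted sum divided by the sum of weights.
        mean = float(np.sum(w * values) / np.sum(w))
        # Assumed relevance: larger when the window mean is closer to the pixel.
        relevance = -abs(mean - center)
        if relevance > best_relevance:
            best_name, best_mean, best_relevance = name, mean, relevance
    return best_name, best_mean


# A vertical edge: the centre pixel belongs to the dark region on the left.
edge = np.array([[0, 0, 10],
                 [0, 0, 10],
                 [0, 0, 10]], dtype=float)
name, mean = select_target_window(edge, 1, 1)
print(name, mean)
```

For a pixel next to an edge, the selected window lies entirely on the pixel's own side of the edge, which is the behavior the directional segmentation is meant to provide.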
2.1.2. Despeckling of Multi-Temporal SAR Images Based on DSMT-NLM
In previous research on SAR image despeckling, the NLM was commonly used, but it was typically applied only to single-temporal SAR images. The nature of speckle noise in SAR images blurs the boundary between the noise and image details, which makes it difficult to remove speckle while preserving those details. The NLM is primarily designed for denoising single-temporal images and may have limited effectiveness in handling speckle noise. Compared to single-temporal SAR images, multi-temporal SAR images contain richer information about the scene, and the correlation across the image sequence allows the noise to be estimated better and image details to be preserved more accurately. Therefore, to better distinguish speckle noise from image details and provide more accurate noise estimation, we extended the principles of the NLM to adapt it to multi-temporal SAR images. Specifically, we expand the sliding window selection strategy of the traditional NLM algorithm from a single image to multiple images, allowing the sliding window to traverse all temporal SAR images. This extension allows the NLM to better utilize temporal information and achieve more accurate and reliable SAR despeckling.
In the search window of the multi-temporal SAR image, we select a sliding window and traverse it across all temporal SAR images. The size of the sliding window remains consistent with that of the target window selected in Section 2.1.1. The similarity between the target window and the sliding window is calculated using Equation (5). This similarity is used to calculate the weight corresponding to the center pixel value of the sliding window, as shown in Equation (6), which involves a smoothing parameter and a normalization coefficient. By multiplying each center pixel value by its corresponding weight, summing the products, and then taking the average, we update the pixel value of the target window to the despeckled pixel value, as shown in Equation (8). We repeat this process for each pixel point of the reference image to obtain the despeckled reference image.
In Equation (5), the two compared terms are the pixel value at a given point in the target window and the corresponding pixel value at the same point in the sliding window. Because the target window and the sliding window are equal in size, the two size terms denote the number of rows and columns of either window.
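The update of a single pixel can be sketched as follows. This is only an illustration under assumptions: the function name `multitemporal_nlm_pixel` and its parameters are hypothetical, the similarity is taken to be the mean squared difference over the window, the weight is taken to be the usual exponential NLM kernel with smoothing parameter `h`, and a square patch centered on the pixel replaces the directional target window of Section 2.1.1 to keep the example short.

```python
import numpy as np


def multitemporal_nlm_pixel(stack, t_ref, row, col, patch=1, search=2, h=1.0):
    """Despeckle one pixel of stack[t_ref] using sliding windows from every date.

    stack has shape (dates, rows, cols). The pixel must lie at least
    patch + search pixels away from the image border.
    """
    target = stack[t_ref, row - patch:row + patch + 1, col - patch:col + patch + 1]
    weights, centers = [], []
    for t in range(stack.shape[0]):                      # traverse all dates
        for r in range(row - search, row + search + 1):  # search window rows
            for c in range(col - search, col + search + 1):
                sliding = stack[t, r - patch:r + patch + 1, c - patch:c + patch + 1]
                # Assumed similarity: mean squared difference over the m x n window.
                distance = np.mean((target - sliding) ** 2)
                # Assumed weight: exponential kernel with smoothing parameter h.
                weights.append(np.exp(-distance / h ** 2))
                centers.append(stack[t, r, c])
    weights = np.asarray(weights)
    centers = np.asarray(centers)
    # Dividing by the sum of the weights plays the role of the normalization coefficient.
    return float(np.sum(weights * centers) / np.sum(weights))


# Four dates of synthetic, speckle-like data (gamma-distributed intensities).
rng = np.random.default_rng(0)
stack = rng.gamma(shape=1.0, scale=1.0, size=(4, 9, 9))
value = multitemporal_nlm_pixel(stack, t_ref=0, row=4, col=4)
print(value)
```

The only change relative to a single-image NLM is the outer loop over dates, so every date contributes candidate windows to the same weighted average.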
2.1.3. Weighted Average Information Fusion Method Based on Wavelet Transform
Due to the richer information content in multi-temporal SAR images, information fusion, which combines features from different sources to obtain more comprehensive and reliable information, is often necessary to obtain a “superimage”. Conventional approaches in previous research typically involve weighted averaging of multiple SAR images, which is a relatively simple form of information fusion. However, this method directly uses fixed weights to average the pixel values in the images, leading to the loss of rich texture details in the multi-temporal SAR images and a decrease in image clarity. To address these issues, we propose using a WAMWT to replace the conventional simple weighted averaging approach used in previous studies. This method utilizes wavelet transform for multi-scale decomposition, fully considering the frequency characteristics of SAR images and more finely processing information in different frequency ranges. By decomposing, fusing, and reconstructing multiple images, a more comprehensive “superimage” is synthesized, achieving more accurate information fusion. The specific steps of this method are as follows: First, the despeckled reference images are subjected to four-scale decomposition using the Daubechies 4 wavelet, obtaining different wavelet coefficients representing information in different frequency ranges, including the low-frequency sub-band (cA), horizontal high-frequency sub-band (cH), vertical high-frequency sub-band (cV), and diagonal high-frequency sub-band (cD), as shown in Equation (9).
In Equation (9), the input is the i-th reference image, and the function applied to it is the two-dimensional discrete wavelet transform.
For each wavelet sub-band, a weighted average is performed based on the corresponding weights, as shown in Equation (10).
The weighted average of the wavelet sub-bands is then reconstructed through the inverse wavelet transform, specifically using the Daubechies 4 wavelet, resulting in the final “superimage”, as shown in Equation (11), which applies the two-dimensional discrete inverse wavelet transform.
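The decompose, fuse, and reconstruct sequence can be sketched as follows. To keep the example dependency-free, it swaps in a single-level Haar transform written directly in NumPy in place of the four-scale Daubechies 4 decomposition described above. The function names (`haar_dwt2`, `haar_idwt2`, `fuse_images`) and the weight layout are hypothetical.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2-D Haar transform of an image with even height and width."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    cA = (a + b + c + d) / 2   # low-frequency approximation
    cH = (a + b - c - d) / 2   # difference between rows (horizontal detail)
    cV = (a - b + c - d) / 2   # difference between columns (vertical detail)
    cD = (a - b - c + d) / 2   # diagonal detail
    return cA, cH, cV, cD


def haar_idwt2(cA, cH, cV, cD):
    """Inverse of haar_dwt2."""
    out = np.empty((cA.shape[0] * 2, cA.shape[1] * 2))
    out[0::2, 0::2] = (cA + cH + cV + cD) / 2
    out[0::2, 1::2] = (cA + cH - cV - cD) / 2
    out[1::2, 0::2] = (cA - cH + cV - cD) / 2
    out[1::2, 1::2] = (cA - cH - cV + cD) / 2
    return out


def fuse_images(images, weights=None):
    """Weighted average of wavelet sub-bands followed by reconstruction.

    weights may have shape (n_images,) for one weight per image, or
    (n_images, 4) for a separate weight per sub-band (cA, cH, cV, cD).
    """
    images = [np.asarray(im, dtype=float) for im in images]
    n = len(images)
    w = np.ones((n, 4)) if weights is None else np.asarray(weights, dtype=float)
    if w.ndim == 1:                       # share one weight across all sub-bands
        w = np.repeat(w[:, None], 4, axis=1)
    w = w / w.sum(axis=0, keepdims=True)  # normalise the weights per sub-band
    bands = [haar_dwt2(im) for im in images]
    fused = [sum(w[i, k] * bands[i][k] for i in range(n)) for k in range(4)]
    return haar_idwt2(*fused)


rng = np.random.default_rng(1)
x = rng.gamma(1.0, 1.0, size=(8, 8))
y = rng.gamma(1.0, 1.0, size=(8, 8))
superimage = fuse_images([x, y])
print(superimage.shape)
```

Because the wavelet transform is linear, giving every sub-band of an image the same weight reproduces the pixel-domain weighted average. The sketch therefore also accepts one weight per sub-band, which is where a wavelet-domain fusion can differ from plain averaging.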
2.2. Residual Speckle Noise Removal Based on Ratio Image
After applying the DSMT-NLM, we obtained the “superimage”. This image has undergone initial despeckling and effective information fusion, but residual speckle noise still persists. To address this issue, the RABASAR employs the concept of a ratio image as its core approach. In the first section, we discussed the RABASAR’s method of selecting one SAR image of interest and generating a ratio image with the “superimage”. However, this approach fails to fully utilize the abundant information contained in the multi-temporal SAR images and contradicts the objective of extensively exploiting the multi-temporal information to generate the “superimage”. Therefore, we propose a new method, the weighted average based on wavelet transform, to replace the RABASAR’s approach of selecting only one SAR image of interest and generating a ratio image with the “superimage”. With this method, we can better utilize the information from the multi-temporal SAR images while maintaining the consistency of the previous step.
We applied the WAMWT to process the input multi-temporal SAR images, following a procedure similar to that described in Section 2.1.3, to obtain the superimposed image. According to the RABASAR, by applying the MuLoG-BM3D filter to the “superimage”, we obtained the processed “superimage” [24]. Then, by performing a ratio operation between the superimposed image and the processed “superimage”, we obtained the ratio image, as shown in Equation (12).
Next, we denoised the ratio image using the RuLoG algorithm [24]. Finally, we performed a restoration operation on the filtered ratio image by multiplying it with the despeckled “superimage” to obtain the final image, as shown in Equation (13).
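The ratio and restoration steps can be sketched as follows. This is a hedged illustration only: the function name `ratio_and_restore`, the `ratio_filter` callback (a placeholder for the RuLoG filter, which is not implemented here), and the `eps` guard against division by zero are assumptions introduced for the example.

```python
import numpy as np


def ratio_and_restore(superimposed, superimage_filtered, ratio_filter=None, eps=1e-8):
    """Form the ratio image, optionally filter it, and restore the final image.

    ratio_filter is a hypothetical callback standing in for the RuLoG filter;
    when it is None the ratio image is left unfiltered.
    """
    superimposed = np.asarray(superimposed, dtype=float)
    # Guard against division by zero in dark areas (an assumption of this sketch).
    reference = np.maximum(np.asarray(superimage_filtered, dtype=float), eps)
    ratio = superimposed / reference          # ratio image
    if ratio_filter is not None:
        ratio = ratio_filter(ratio)           # residual speckle removal
    final = ratio * reference                 # restoration by multiplication
    return final, ratio


rng = np.random.default_rng(2)
clean = np.full((6, 6), 5.0)                              # stand-in for the filtered "superimage"
noisy = clean * rng.gamma(4.0, 1.0 / 4.0, size=(6, 6))    # unit-mean multiplicative speckle
final, ratio = ratio_and_restore(noisy, clean)
print(ratio.mean())
```

With no ratio filter the restoration returns the superimposed image unchanged, so any despeckling gain in this step comes entirely from how well the ratio image is filtered.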
4. Conclusions
In this paper, we proposed an algorithm, titled “Enhancing RABASAR for Multi-Temporal SAR Image Denoising through Directional Filtering and Wavelet Transform,” to address the challenge of speckle noise removal in multi-temporal SAR images. The proposed algorithm introduced a novel approach to obtain the “superimage”, referred to as DSMT-NLM. Additionally, we utilized a WAMWT to generate the superimposed image, which was then ratioed with the “superimage” to obtain the ratio image. Through subjective visual evaluation and objective performance metrics, we demonstrated both the feasibility of the proposed approach and its superiority over the other methods. However, in the experiments on real multi-temporal SAR image II, the image contrast of our results did not reach the level of the other comparative experiments. Despite the good performance of our algorithm in other aspects, this insufficient contrast requires further research. The problem might stem from certain aspects or parameter settings of the algorithm, which we will explore and optimize in future research to achieve better contrast performance.