1. Introduction
Climate change has significantly affected human and animal habitats. One notable example is the reduction in the width of water zones. Identifying changes in water zones is crucial for making informed decisions in environmental protection and management [1]. Identifying such changes through field observations is time-consuming and expensive, and remote sensing techniques have made this monitoring considerably easier. Remote sensing images provide rich information about the Earth's surface [2]. Unlike optical satellites, radar satellites can acquire data in all weather conditions, day and night. Radar data are sensitive to surface roughness and can provide comprehensive information about the environment. Water zones exhibit minimal surface roughness, particularly in the absence of strong winds, and therefore appear as dark areas in radar images [3]. In remote sensing, detection algorithms can be classified into two groups: classical methods and deep learning models.
Classical methods rely on backscatter information and often yield unsatisfactory results with low accuracy. Liang et al. [4] presented a local hierarchical regional thresholding method for delineating water in SAR images. Zhang et al. [5] introduced an approach to assessing flood extent using multi-temporal Sentinel-1 data. An automatic thresholding procedure generates an initial land and water classification, which a fuzzy logic-based method then refines. Experiments demonstrate that using different polarizations as image bands does not provide better results. To tackle this issue, contextual information can be incorporated to enhance the accuracy and reliability of the classification outcomes [6]. Wang et al. [7] combined the threshold segmentation method with Markov random fields (MRF) and integrated simulated annealing (SA) into the process of image noise reduction. The resulting water extraction method achieves high classification accuracy. In another study, Song et al. [8] introduced a method for selecting features from SAR images based on the correlation of sparse coefficients, with the aim of enhancing the precision of change detection (CD). However, these conventional methods still fall short in extracting spatial information properly.
Deep learning models can extract spectral information effectively without being constrained by the limitations of classical approaches. Aghdami-Nia et al. [9] developed an automatic coastline extraction framework by modifying the standard U-Net model to enhance sea-land segmentation. In another study, Lin et al. [10] proposed an approach that uses a fully convolutional neural network to detect water accurately in Sentinel-1 SAR images. Incorporating the spatial information of neighboring pixels and analyzing the corresponding pixel intensities improves the overall detection performance.
This study investigates the performance of classical methods and deep networks in CD using Sentinel-1 images to determine which approach yields superior results. The Ratio Index (RI) is employed as a basic classical method, and the MRF is used as an enhanced version of it. In addition, an improved form of convolutional neural network (CNN) called Inception CNN is introduced as a deep network to detect waterbody changes effectively. This network can account for the different scales of image objects.
The remainder of this paper is organized as follows: Section 2 introduces the research methodology, Section 3 presents the experimental results, and Section 4 summarizes the conclusions.
2. Methodology
In this section, we present the three CD methods mentioned above. An overview of the workflow is shown in Figure 1, which illustrates the stepwise CD process. In general, the research method has four steps. First, the images are preprocessed, which includes geocoding, radiometric calibration, and filtering with the Lee Sigma filter. The preprocessed images are then passed to the three methods to produce the desired difference image. The final step is to evaluate the resulting change maps. The following sections introduce these methods in detail.
2.1. Ratio Index
For two SAR intensity images acquired at times t1 and t2, the RI, which resembles a log-ratio index, can be defined as follows:
where eps is a small constant ("epsilon") added to avoid the computational issues that arise from division by zero. It makes the index more robust, especially when the values of the two intensity images tend towards zero. This study sets eps to 5, and the Otsu thresholding technique [11] is applied to generate the change map.
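One possible reading of this step is sketched below in Python with NumPy. The index is written as an absolute log-ratio stabilised by eps, which is an assumption about its exact form. The names `ratio_index`, `otsu_threshold`, and `change_map` are illustrative, and the default value of `eps` is a placeholder rather than the study's setting.

```python
import numpy as np


def ratio_index(img_t1, img_t2, eps=1e-6):
    """Epsilon-stabilised absolute log-ratio of two SAR intensity images.

    The exact RI formula is an assumption; eps only guards against
    division by zero and log(0).
    """
    img_t1 = np.asarray(img_t1, dtype=np.float64)
    img_t2 = np.asarray(img_t2, dtype=np.float64)
    return np.abs(np.log((img_t2 + eps) / (img_t1 + eps)))


def otsu_threshold(values, bins=256):
    """Return the threshold that maximises the between-class variance."""
    hist, edges = np.histogram(values.ravel(), bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    weight_lo = np.cumsum(hist).astype(np.float64)   # pixels up to each bin
    weight_hi = weight_lo[-1] - weight_lo            # pixels above each bin
    cum_sum = np.cumsum(hist * centers)
    mean_lo = cum_sum / np.maximum(weight_lo, 1)
    mean_hi = (cum_sum[-1] - cum_sum) / np.maximum(weight_hi, 1)
    between = weight_lo * weight_hi * (mean_lo - mean_hi) ** 2
    return centers[np.argmax(between)]


def change_map(img_t1, img_t2, eps=1e-6):
    """Boolean map that is True where the index exceeds the Otsu threshold."""
    ri = ratio_index(img_t1, img_t2, eps)
    return ri > otsu_threshold(ri)
```

Calling `change_map(img_t1, img_t2, eps=...)` on two co-registered intensity arrays of equal shape returns a boolean change map. Because the threshold is computed from the histogram of the index itself, no manual tuning is needed beyond the choice of eps.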
2.2. MRF
The MRF is a powerful image-processing technique for modeling and analyzing intricate structures within images. Using probability theory, the MRF estimates the likelihood of a particular state occurring at each pixel. Consider a change index image consisting of N pixel vectors, with the labels of the difference image denoted by L = {l1, l2}. The maximum a posteriori (MAP) estimate determines the label of each pixel. For a given pixel x, the formulation can be described as follows [12]:

$$\hat{c} = \arg\max_{c \in L} P(c \mid x) = \arg\max_{c \in L} P(x \mid c)\, P(c)$$

where P(x|c) represents the conditional probability distribution, modeled as Gaussian, and P(c) denotes the prior probability distribution of the label layer. Based on the Bayesian inference principle, maximizing the posterior probability is equivalent to minimizing the total energy function. Further details can be found in [12].
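The energy-minimisation view can be illustrated with a short NumPy sketch. It is not the procedure of [12]: it uses a synchronous variant of iterated conditional modes (ICM), a Gaussian data term per class, and a Potts prior over the four nearest neighbours. The function name `icm_mrf` and the values of `beta` and `n_iter` are assumptions chosen for illustration.

```python
import numpy as np


def icm_mrf(index_img, beta=1.0, n_iter=5):
    """Label each pixel unchanged (0) or changed (1) by approximate MAP.

    Energy per pixel = Gaussian negative log-likelihood of the class
    + beta * number of 4-neighbours with a different label (Potts prior).
    beta and n_iter are illustrative, not settings from the study.
    """
    x = np.asarray(index_img, dtype=np.float64)
    labels = (x > x.mean()).astype(np.int64)      # crude initial labelling
    for _ in range(n_iter):
        padded = np.pad(labels, 1, mode="edge")
        neighbours = np.stack([padded[:-2, 1:-1], padded[2:, 1:-1],
                               padded[1:-1, :-2], padded[1:-1, 2:]])
        energies = []
        for c in (0, 1):
            members = x[labels == c]
            mu = members.mean() if members.size else x.mean()
            var = members.var() + 1e-6 if members.size else x.var() + 1e-6
            # -log P(x|c) under a Gaussian model of class c
            data_term = 0.5 * np.log(2 * np.pi * var) + (x - mu) ** 2 / (2 * var)
            # -log P(c): penalise disagreement with neighbouring labels
            prior_term = beta * (neighbours != c).sum(axis=0)
            energies.append(data_term + prior_term)
        new_labels = np.argmin(np.stack(energies), axis=0)
        if np.array_equal(new_labels, labels):
            break                                  # converged
        labels = new_labels
    return labels
```

Setting `beta=0` removes the neighbourhood prior and reduces the rule to a per-pixel Gaussian classifier. Larger values favour spatially smooth change maps, at the cost of possibly erasing small genuine changes.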
2.3. Inception CNN
Deep learning models such as CNNs are applied to image recognition, classification, and CD. These networks enable accurate predictions or classifications by automatically learning and extracting relevant features from input images. The distinguishing characteristic of CNNs is their use of convolution operations. Convolution involves sliding a small kernel over the input image to extract spatial information. With deeper layers, CNNs can generate increasingly complex features. The operations carried out in a convolutional layer can be described as follows:

$$y_k^{l} = \sum_{n=1}^{m_{l-1}} w_{n,k}^{l} * x_n^{l-1} + b_k^{l}, \qquad k = 1, \dots, m_l$$

where $y_k^{l}$ denotes the output feature vector of layer $l$, $m_l$ represents the number of convolutional filters in layer $l$, and $x_n^{l-1}$ corresponds to the $n$th input vector of layer $l$. $b_k^{l}$ represents the bias vector, and $w_{n,k}^{l}$ is the filter connecting the $n$th feature map in the previous layer ($l-1$) to the $k$th feature map in layer $l$. The symbol $*$ denotes the convolution operator [13].
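The layer operation described here, in which each output feature map is the sum of the filtered input maps plus a bias, can be written out directly in NumPy. This is a minimal sketch under stated assumptions: the function name `conv_layer` and the array layouts are illustrative, borders are handled without padding, and no activation function is applied because the text does not name one.

```python
import numpy as np


def conv_layer(x_prev, w, b):
    """Compute y[k] = sum_n (w[n, k] * x_prev[n]) + b[k] without padding.

    x_prev: (n_in, H, W) feature maps of layer l-1
    w:      (n_in, n_out, kh, kw) filters linking input map n to output map k
    b:      (n_out,) biases
    """
    n_in, height, width = x_prev.shape
    _, n_out, kh, kw = w.shape
    out_h, out_w = height - kh + 1, width - kw + 1
    y = np.zeros((n_out, out_h, out_w))
    for k in range(n_out):
        for n in range(n_in):
            for i in range(kh):
                for j in range(kw):
                    # Cross-correlation, which is what deep learning
                    # frameworks compute under the name "convolution"
                    y[k] += w[n, k, i, j] * x_prev[n, i:i + out_h, j:j + out_w]
        y[k] += b[k]
    return y
```

For example, a 2-channel 25 × 25 input with 16 filters of size 3 × 3 (`w.shape == (2, 16, 3, 3)`) produces an output of shape `(16, 23, 23)`.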
Using a fixed kernel size in the initial layers of CNNs can lead to disregarding the varying scale of objects in an image. To address this, the Inception module has been applied in this study. The Inception module aims to capture features at multiple spatial scales using parallel convolutional operations of different filter sizes within the same layer. This allows the model to learn and combine diverse features simultaneously. The Inception module simultaneously applies max pooling and three convolutions to the input data. All generated feature maps are merged to serve as inputs for the next layer.
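A minimal NumPy sketch of such a module is given below. The branch kernel sizes (1 × 1, 3 × 3, and 5 × 5) and the 3 × 3 pooling window follow the classic Inception design and are assumptions, since the text only states that three convolutions and max pooling run in parallel. The function names and weight layouts are likewise illustrative.

```python
import numpy as np


def conv_same(x, w):
    """Zero-padded cross-correlation. x: (C, H, W), w: (K, C, kh, kw), odd kh/kw."""
    n_out, _, kh, kw = w.shape
    _, height, width = x.shape
    padded = np.pad(x, ((0, 0), (kh // 2, kh // 2), (kw // 2, kw // 2)))
    y = np.zeros((n_out, height, width))
    for i in range(kh):
        for j in range(kw):
            window = padded[:, i:i + height, j:j + width]       # (C, H, W)
            y += np.einsum("kc,chw->khw", w[:, :, i, j], window)
    return y


def max_pool_same(x, size=3):
    """Stride-1 max pooling that keeps the spatial size."""
    _, height, width = x.shape
    pad = size // 2
    padded = np.pad(x, ((0, 0), (pad, pad), (pad, pad)),
                    constant_values=-np.inf)
    shifted = [padded[:, i:i + height, j:j + width]
               for i in range(size) for j in range(size)]
    return np.max(shifted, axis=0)


def inception_module(x, w1, w3, w5):
    """Run three convolutions and max pooling in parallel, then merge channels."""
    branches = [
        conv_same(x, w1),    # 1x1 kernels: per-pixel mixing of channels
        conv_same(x, w3),    # 3x3 kernels: small-scale context
        conv_same(x, w5),    # 5x5 kernels: larger-scale context
        max_pool_same(x),    # pooled copy of the input
    ]
    return np.concatenate(branches, axis=0)
```

Because every branch preserves the spatial size, the outputs can be concatenated along the channel axis. The next layer then sees features from several receptive-field sizes at once.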
The proposed deep network receives the stacked bi-temporal SAR VV polarization images as input and produces the change map in the output layer. Patch-based processing is the fundamental approach to using image data in CNNs. Therefore, the input image is divided into patches of 25 × 25 × 2, which are used as input for the network. The numbers of filters are arranged in the following order: [16, 32, 64, 128, 256], and the kernel size is set to 3 × 3. The learning rate is set to 0.001, and Adam is used as the optimizer. The network architecture is shown in Figure 2.
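The patch preparation can be sketched as follows. The text does not say whether patches overlap or how image borders are treated, so the `stride` parameter and the choice to drop incomplete border tiles are assumptions, as is the function name `extract_patches`.

```python
import numpy as np


def extract_patches(img_t1, img_t2, patch=25, stride=25):
    """Stack two single-band images and cut them into (patch, patch, 2) tiles."""
    stacked = np.stack([img_t1, img_t2], axis=-1)             # (H, W, 2)
    height, width, _ = stacked.shape
    tiles = [stacked[r:r + patch, c:c + patch]
             for r in range(0, height - patch + 1, stride)
             for c in range(0, width - patch + 1, stride)]
    return np.stack(tiles)                                    # (N, patch, patch, 2)
```

With `stride=25` the tiles do not overlap. A smaller stride yields overlapping patches and more training samples from the same scene.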
4. Conclusions
The advancement of remote sensing techniques has made it easier to monitor environmental changes, such as the depletion of water zones, and has significantly enhanced our ability to understand and address ecological transformations. This study compares the performance of classical methods and deep learning approaches in identifying water zone changes from Sentinel-1 images. RI and MRF were employed as examples of classical methods, and Inception CNN was used as the deep learning alternative to enhance CD performance. The MRF algorithm improved detection results by taking pixel neighborhoods into account. However, determining a suitable number of iterations and window size is time-consuming. In contrast, Inception CNN integrates a multi-scale approach directly within its architecture, enabling the extraction of reliable spatial features. Experimental findings validate the efficacy of incorporating these features for CD. Contrary to the common belief that simple features like water can be swiftly identified using simple algorithms, this study revealed the limitations of such a perspective. The results underscore the value of deep learning networks for attaining significantly improved accuracy.
In future work, we will develop a multi-source CNN-based architecture that uses Sentinel-1 and Sentinel-2 images to detect changes.