1. Introduction
Change detection (CD) plays an important role in land-use planning, population estimation, natural disaster assessment and city management [1,2,3]. It is a technique for identifying change regions of interest from remote sensing images acquired at different times. Change detection has been studied for more than thirty years [1], and many state-of-the-art techniques [4,5,6,7,8,9,10,11,12,13,14,15,16,17] have been proposed to automatically identify changes in remote sensing images. However, most of these methods [12,13,14,15,16,17] perform binary change detection (BCD), which overlooks the pixel categories that are usually necessary for practical applications.
Semantic change detection (SCD) detects the change regions and identifies their semantic labels simultaneously. However, openly available SCD datasets are still limited [18]. To validate their methods, Daudt et al. [18] built a large-scale semantic change detection dataset (HRSCD) and proposed a sequential training framework for semantic change detection. Mou et al. [19] proposed a recurrent convolutional neural network (ReCNN) for semantic change detection and built two datasets to validate their work, but these datasets are not publicly available. Although existing methods can achieve promising semantic change detection results, they still have difficulty locating and identifying the area of change [20]. Consequently, Yang et al. [20] proposed an asymmetric Siamese network (ASN) for dual-task semantic change detection and built a large-scale semantic change detection dataset, named SECOND, to detect change regions between the same land-cover types.
Unlike binary labels, semantic change labels contain multiple categories, and the image from each period has its own semantic change label. However, existing change detection methods can usually realize only binary change detection (such as References [12,13,14,15,16,17,18]) or single-task semantic change detection [19]. The SECOND dataset contains a pair of semantic change labels corresponding to the images from the two periods. Consequently, conventional methods, such as FC-EF [15], FC-conc [15] and FC-diff [15], do not work well.
With the development of remote sensing and machine learning technology [21,22], change detection has made great progress. Change detection methods generally fall into two categories: binary change detection and semantic change detection. Binary change detection focuses mainly on change regions yet overlooks the categories of pixels. In contrast, semantic change detection not only detects the change regions but also identifies the category of each pixel within them.
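The difference between the two label types can be illustrated with a small sketch. It assumes, purely for illustration, that each period has a semantic change label map in which index 0 means "no change" and positive indices are land-cover categories; the class names and the helper `binary_change_map` are hypothetical and are not taken from any dataset specification.

```python
# Hypothetical class indices, for illustration only (0 = no change).
CLASSES = {0: "no change", 1: "low vegetation", 2: "tree", 3: "building"}


def binary_change_map(label_t1, label_t2):
    """Collapse a pair of semantic change label maps into one 0/1 change map.

    A pixel is marked 1 if either period assigns it a change category,
    and 0 otherwise. The category information is discarded in the process.
    """
    return [
        [1 if (a != 0 or b != 0) else 0 for a, b in zip(row1, row2)]
        for row1, row2 in zip(label_t1, label_t2)
    ]


# Semantic change labels for a 2 x 3 image pair (one map per period).
label_t1 = [[0, 1, 1],
            [0, 0, 2]]
label_t2 = [[0, 3, 3],
            [0, 0, 2]]

bcd = binary_change_map(label_t1, label_t2)
for row in bcd:
    print(row)
```

The binary map only records where a change occurred. The label pair additionally records what was present before and after, including the bottom-right pixel, which is marked as changed although both periods carry the same category.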
In recent years, many studies have focused on binary change detection [12,23,24,25,26,27]; most of these works detect change regions and label them with “0” and “1”, where “0” denotes no change and “1” denotes a change region. The binary change map is obtained by extracting difference features from image pairs. The main purpose of change detection in remote sensing images is to obtain change information from the bitemporal images, and the change regions are usually highly important for analyzing land-use and land-cover change. However, binary change detection cannot identify the category of each pixel in the change regions. Consequently, semantic change detection (SCD)-based methods have been developed to perform semantic segmentation of the change regions. However, we found that the existing works have two shortcomings, as follows:
(1) Most existing methods are single-task-oriented change detection frameworks and are therefore not suited to dual-task semantic change detection.
(2) Although dual-task semantic change detection methods have been developed, their generality is limited, and thus the models have room for improvement.
To address these problems, we propose a novel dual-task semantic change detection Siamese network that uses a generative change field module to help predict change regions and their segmentation. The proposed network uses a binary change detection branch to guide the two semantic segmentation networks in predicting pixel categories. Before the change region information (the generative change field) is fused with the semantic information, the semantic segmentation branches have no perception of change; only after this fusion can the two branches predict the change regions. Therefore, the proposed network is called the Generative Change Field (GCF)-based dual-task Semantic Change Detection Network (GCF-SCD-Net), as shown in Figure 1. Although Mou et al. [19] previously proposed using the binary change map to help model training, their network uses a convolutional neural network and a recurrent neural network for feature extraction and change detection, and obtains the binary change maps with fully connected layers activated by the sigmoid function. Their method only used an auxiliary loss to generate the semantic change map. For the SCD task, change features play an important role in helping the model predict semantic change labels: they guide the generative semantic change map module to focus on the regions that differ between the bitemporal images. In addition, change regions can be generated from the change feature map. Fusing the binary change feature map with the semantic change feature map is an effective way to improve the segmentation results. In this paper, the main contributions are as follows.
(1) We propose a novel dual-task semantic change detection network to identify the change regions in bitemporal images, and it achieves strong results on the SECOND dataset. The proposed SCD-based model effectively addresses the dual-task semantic change detection problem.
(2) To the best of our knowledge, we are the first to exploit the generative change field method to guide two branch networks to achieve dual-task semantic change detection.
(3) To alleviate the influence of the label imbalance between change and no-change regions, we propose a robust separable loss function that improves the performance of the network.
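The label imbalance mentioned in contribution (3) can be illustrated with a generic sketch. The function below is not the separable loss proposed in this paper, whose exact formulation is not given in this section; it is a hypothetical example (the name `region_balanced_bce` is ours) of the general idea of averaging a binary cross-entropy separately over change and no-change pixels so that the majority region does not dominate the loss.

```python
import math


def region_balanced_bce(probs, targets, eps=1e-7):
    """Binary cross-entropy averaged separately over each region.

    probs   : predicted change probabilities, one per pixel
    targets : ground-truth labels, 1 = change, 0 = no change
    The two per-region means are averaged, so each region contributes
    equally regardless of how many pixels it contains.
    """
    change_losses, no_change_losses = [], []
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if t == 1:
            change_losses.append(-math.log(p))
        else:
            no_change_losses.append(-math.log(1.0 - p))
    region_means = [
        sum(losses) / len(losses)
        for losses in (change_losses, no_change_losses)
        if losses  # skip a region that is absent from this sample
    ]
    return sum(region_means) / len(region_means)


def plain_bce(probs, targets, eps=1e-7):
    """Ordinary pixel-averaged binary cross-entropy, for comparison."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total += -math.log(p) if t == 1 else -math.log(1.0 - p)
    return total / len(probs)


# Nine no-change pixels predicted well, one change pixel predicted badly.
targets = [0] * 9 + [1]
probs = [0.1] * 9 + [0.2]

print(plain_bce(probs, targets))
print(region_balanced_bce(probs, targets))
```

In this toy case the poorly predicted change pixel weighs more heavily in the region-balanced loss than in the pixel-averaged loss, where it is diluted by the nine easy no-change pixels.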
4. Discussion
To validate the performance of GCF-SCD-Net, we list the results obtained by [20] and by our methods in Table 6. The proposed method achieves the best results on the SECOND dataset, with 16.5 in SeK and 69.1% in mIoU. In the testing process, Yang et al. [20] improved the detection results with flip methods and a multiscale strategy. Since the proposed network did not use the multiscale strategy to optimize the parameters of the models, we only used the flip method to validate the performance of the networks. As shown in Table 6, although ASN-ATL slightly outperforms GCF-SCD-Net in mIoU, the proposed network achieves the best result in SeK, an improvement of 1.1.
Overall, the proposed method consistently improves the performance of semantic change detection, which demonstrates the superiority and robustness of GCF-SCD-Net.
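The flip method used at test time can be sketched as follows. This is a minimal illustration of horizontal-flip test-time augmentation under our own assumptions: `predict` stands for any hypothetical function that maps an image pair to a per-pixel change probability map, and the specific flips used in the experiments are not stated in this section.

```python
def hflip(image):
    """Flip a 2D list left to right."""
    return [list(reversed(row)) for row in image]


def predict_with_flip(predict, img_t1, img_t2):
    """Average predictions over the original and the flipped image pair.

    The prediction for the flipped pair is flipped back before averaging
    so that both probability maps are aligned pixel by pixel.
    """
    p_orig = predict(img_t1, img_t2)
    p_flip = hflip(predict(hflip(img_t1), hflip(img_t2)))
    return [
        [(a + b) / 2.0 for a, b in zip(row_o, row_f)]
        for row_o, row_f in zip(p_orig, p_flip)
    ]


def toy_predict(img_t1, img_t2):
    """Hypothetical stand-in model: a position-dependent score of the
    absolute difference, so that flipping actually changes its output."""
    return [
        [abs(a - b) * (0.5 + 0.25 * j) for j, (a, b) in enumerate(zip(r1, r2))]
        for r1, r2 in zip(img_t1, img_t2)
    ]


img_t1 = [[1.0, 0.0, 0.0]]
img_t2 = [[0.0, 0.0, 0.0]]

averaged = predict_with_flip(toy_predict, img_t1, img_t2)
print(averaged)
```

The stand-in model scores the changed pixel 0.5 in the original orientation and 1.0 in the flipped one, so the averaged score of 0.75 shows how the two views are combined.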
To present the change detection results intuitively, we visualized the segmentation results. Figure 3 shows the change detection results generated by FC-EF, FC-Siam-conv, FC-Siam-diff and the three types of dual-task semantic change detection networks proposed in this work.
From the semantic segmentation results for Samples A and B, we note that the proposed GCF module accurately identifies both the change region and the no-change region in complex scenarios. Since “tree” and “low vegetation” have similar texture and color, most SCD networks detect them poorly, but this does not limit the segmentation results of GCF-SCD-Net, as shown in Sample B. For Sample C, conventional change detection methods cannot identify the pseudochange region well, whereas the proposed SCD-based UNet and PSPNet alleviate this problem, which indicates that existing change detection networks are ill-suited to dual-task SCD. The trees on both sides of a road generally form elongated strips, and Sample D shows that the proposed GCF module performs well in this stripe scenario. Categories with a small number of samples (such as “Playground”) are difficult to identify accurately, yet our method still performs well under these conditions.
Figure 4 depicts the visual results of the semantic prediction and binary prediction based on GCF-SCD-Net, where we can see that the GCF module extracts the change regions accurately. Consequently, the semantic change detection module can effectively classify each pixel based on the change field.
Overall, the visual results demonstrate that our method is effective and superior to the existing methods.
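How a binary change field can guide two semantic branches may be easier to see in a small sketch. The code below is an illustrative post-hoc fusion rule under our own assumptions (a threshold of 0.5, class index 0 reserved for "no change", and the hypothetical helper `fuse_change_and_semantics`); in GCF-SCD-Net the fusion takes place on feature maps inside the network, which this sketch does not reproduce.

```python
def fuse_change_and_semantics(change_prob, class_scores, threshold=0.5):
    """Mask per-pixel semantic predictions with a binary change field.

    change_prob  : 2D list of change probabilities in [0, 1]
    class_scores : 2D list whose entries are lists of per-class scores
    Returns a 2D list of labels: 0 where no change is predicted,
    otherwise 1 + index of the highest-scoring class.
    """
    fused = []
    for prob_row, score_row in zip(change_prob, class_scores):
        out_row = []
        for p, scores in zip(prob_row, score_row):
            if p > threshold:
                best = max(range(len(scores)), key=scores.__getitem__)
                out_row.append(best + 1)
            else:
                out_row.append(0)
        fused.append(out_row)
    return fused


# One change field shared by both periods, one score map per period.
change_prob = [[0.9, 0.2],
               [0.7, 0.1]]
scores_t1 = [[[0.1, 0.8], [0.6, 0.4]],
             [[0.7, 0.3], [0.5, 0.5]]]
scores_t2 = [[[0.9, 0.1], [0.2, 0.8]],
             [[0.2, 0.8], [0.5, 0.5]]]

semantic_t1 = fuse_change_and_semantics(change_prob, scores_t1)
semantic_t2 = fuse_change_and_semantics(change_prob, scores_t2)
print(semantic_t1)
print(semantic_t2)
```

Because both periods share the same change field, the two output maps are non-zero at exactly the same pixels and differ only in the categories assigned there.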
5. Conclusions
In this work, to address the problem that existing methods struggle to obtain satisfactory results for dual-task semantic change detection, we proposed a generative change field (GCF)-based dual-task semantic change detection network for remote sensing images. The proposed network consists of a Siamese convolutional neural network (Siam-Conv) module that extracts feature representations from the raw image pairs, a generative change field module that produces the binary change map, and two generative semantic change modules that generate the semantic segmentation maps of the bitemporal images. Moreover, it is an end-to-end SCD network. To alleviate the sample imbalance problem, we designed a separable loss for better training of the deep models.
Extensive experiments demonstrate that GCF-SCD-Net achieves competitive performance compared with existing methods as well as the proposed dual-task SCD networks (UNet-SCD and PSPNet-SCD). Moreover, we validated the effectiveness of the proposed separable loss function. It is worth noting that the separable loss is a general strategy for alleviating the sample imbalance problem, so it can be applied to other benchmark datasets that suffer from label imbalance.
At present, the SECOND dataset is the only public dataset for dual-task semantic change detection. We also note that the proposed network and conventional networks perform poorly at edge detection and contour extraction in the intersecting zone. Consequently, in future work, we intend to build a large-scale, very-high-resolution benchmark dataset for semantic change detection based on multi-source satellite data. To achieve better segmentation results, we also intend to use the Markov Random Field (MRF) [37] method as well as boundary loss [38] to optimize the segmentation results.