Article

Unsupervised Change Detection for VHR Remote Sensing Images Based on Temporal-Spatial-Structural Graphs

1 Northwest Institute of Nuclear Technology, Xi’an 710024, China
2 College of Electronic Science, National University of Defense Technology, Changsha 410073, China
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(7), 1770; https://doi.org/10.3390/rs15071770
Submission received: 28 February 2023 / Revised: 19 March 2023 / Accepted: 20 March 2023 / Published: 25 March 2023

Abstract

With the aim of automatically extracting fine change information of ground objects, change detection (CD) for very high resolution (VHR) remote sensing images is essential in various applications. However, the increase in spatial resolution, the more complicated interactive relationships of ground objects, the more evident spectral diversity, and the more severe speckle noise make accurately identifying relevant changes more challenging. To address these issues, an unsupervised temporal-spatial-structural graph is proposed for CD tasks. Treating each superpixel as a node of a graph, the structural information of ground objects, presented by the parent-offspring relationships between coarse and fine segmentation scales, is introduced to define the temporal-structural neighborhood, which is then incorporated with the spatial neighborhood to form the temporal-spatial-structural neighborhood. The graphs defined on such neighborhoods extend the interactive range among nodes from two dimensions to three dimensions, which can more fully exploit the structural and contextual information of bi-temporal images. Subsequently, a metric function is designed according to the spectral and structural similarity between graphs to measure the level of change; this measure is more reasonable due to its comprehensive utilization of temporal-spatial-structural information. Experimental results on both VHR optical and SAR images demonstrate the superiority and effectiveness of the proposed method.

1. Introduction

Change detection (CD), which aims at identifying change information of land cover by analyzing remote sensing images captured at different times over the same geographical area [1], has been widely applied in military and civil fields, such as disaster assessment [2], city planning [3], and ocean monitoring [4].
According to whether labeled data are used, most CD methods can be divided into two categories: supervised and unsupervised. To date, numerous supervised methods have achieved high performance where labeled data are adequate and effective [5,6,7]. However, acquiring plenty of labeled data is always time-consuming and costly, or even impossible under some circumstances. Consequently, unsupervised methods are more feasible and convenient in many CD tasks due to their independence from labeled data. This paper thus focuses on unsupervised CD methods, which can be roughly classified into sub-pixel-level, pixel-based, and object-based ones.
The sub-pixel-level methods are usually applied in situations where only one fine spatial resolution image can be acquired while the other image has coarse spatial resolution due to the influence of imaging conditions [8]. For the coarse spatial resolution image, sub-pixel mapping is implemented to obtain a land cover map with fine spatial resolution; land cover changes can then be obtained by comparing it with the fine spatial resolution image [9]. Although sub-pixel-level methods have shown tremendous promise, one main issue remains unresolved: the sub-pixel mapping accuracy inevitably and seriously affects the CD results, and designing reliable sub-pixel mapping methods for CD is still challenging.
The pixel-based methods usually include two types. The first type treats each single pixel as a processing unit and assumes independence among pixels. For instance, the iteratively reweighted multivariate alteration detection (IRMAD) [10], the change vector analysis (CVA) [11], and its improved forms (e.g., compressed CVA, C2VA) [12] are the most commonly used in optical image CD, while the ratio and log ratio [13] are widely used in synthetic aperture radar (SAR) image CD. These methods have shown success on medium-low spatial resolution images because the area covered by one pixel is relatively large, so the assumption of independence among pixels can hold. On the whole, these methods identify changes using only spectral features, ignoring the relationships among pixels, which may be essential when the spatial resolution increases.
The second type treats a rectangular image patch centered on each pixel as the processing unit instead of a single pixel. These methods can obtain more features to identify changes because the neighborhood information is exploited. In recent years, a number of such methods have emerged. For instance, the mean ratio method [14] applies the ratio operation to the means of image patches, which can effectively suppress the speckle noise common in SAR images. Chatelain et al. [15] assumed that the brightness of SAR images obeys bivariate gamma distributions and obtained the level of change by comparing the difference of distributions within an image patch. Gong et al. [16] treated each image patch as an atom and extracted the mean of the spectrum as a feature vector; a method based on coupled dictionary learning was then proposed for heterogeneous remote sensing image CD tasks. Zhuang et al. [17] characterized each pixel by the homogeneity of the image patch centered on it and proposed an adaptive generalized likelihood ratio test for CD in SAR images.
Although the aforementioned pixel-based methods have been successfully applied in many circumstances, some unavoidable limitations still exist in high spatial resolution cases. It can be observed that a ground object in VHR images is always constituted of a certain number of adjacent pixels. In other words, a single pixel can hardly express an object with practical meaning, which indicates that the independence assumption for pixels is unreasonable when the spatial resolution is high. In addition, with increasing resolution, the impacts of illumination, weather, and imaging angle are more serious, which may lead to more severe “pepper and salt” noise in CD results. Treating a rectangular image patch as the basic unit can weaken this impact to a certain extent. Nevertheless, two issues inevitably remain. First, determining the most appropriate window size is still an open problem: sizes that are too small may lead to a deficiency of neighboring information, while sizes that are too large may result in heterogeneity within the patch, which is disadvantageous for extracting representative features. Second, the shapes of ground objects vary in VHR images, so a fixed rectangular image patch can hardly represent a meaningful object.
Compared with the pixel-based methods, the object-based ones, which treat an object as the basic processing unit, can overcome these limitations to a certain extent, as more information (e.g., spectrum, texture, and shape) can be exploited. The first and indispensable step of object-based methods is superpixel segmentation. However, no matter which segmentation method is adopted, objects and superpixels are rarely in one-to-one correspondence in VHR image scenes. In fact, the superpixel is the basic unit in most existing object-based CD methods, as is the case in this paper. Thus far, numerous methods using superpixels as basic processing units have emerged. For instance, an object-based Markov random field (MRF) was proposed for CD [18], which markedly improved the performance of MRF in remote sensing image CD tasks. Chen et al. [19] extracted multiple features of superpixels as the vectors of CVA; the experimental results indicated that the object-based strategy can alleviate the “pepper and salt” noise in the results of VHR image CD. Im et al. [20] proposed an object-based change detection approach using correlation image analysis combined with a multiresolution segmentation technique and obtained satisfactory results. Lefebvre et al. [21] introduced geometrical features (e.g., shape, size, location) of superpixels and textural features of wavelets into CD tasks, achieving better performance.
Due to the various sizes and shapes of objects in VHR images, over-segmentation and under-segmentation can hardly be eliminated simultaneously. To guarantee the homogeneity of the segmented superpixels, Wu et al. [22] pointed out that slight over-segmentation may be more suitable for unsupervised CD tasks, because evident heterogeneity within a superpixel may lead to poor representation of extracted features. In the case of over-segmentation, the majority of objects are composed of a certain number of superpixels. However, most current object-based CD methods only extract features and analyze changes for each single superpixel. Although structural information can be partly represented by the contours of superpixels, it can only be sufficiently exploited by considering the interactions among adjacent superpixels under the inevitable over-segmentation. In addition, as spectral diversity usually exists within an object of relatively large size, separately considering a single superpixel of the object may result in inaccuracy.
Contextual relationships exist among ground objects in VHR remote sensing images, and they can contribute to better CD performance. As a tool for modeling relationships among objects, graph models have proved powerful in utilizing contextual information in various remote sensing tasks. Consequently, some graph-based methods have been proposed for VHR image CD in recent years. Pham et al. [23] first introduced graph theory to model relationships between key points obtained by local maximization; an energy function was then designed to measure the level of change at the positions of the key points. A stereo graph model was proposed to extract temporal-spatial information of key points [24]. Further, Wang et al. [25] introduced a hypergraph to exploit interactive information, constructing the hyperedges using similarity among pixels within a certain neighborhood; the CD between bi-temporal images was then treated as a matching problem between two hypergraphs with the same topological structure.
The existing graph-based CD methods have achieved promising performance in some applications. Nevertheless, several limitations remain. First, most of these methods only detect changes at the positions of key points, not over the global image scene, which means that the shapes and areas of changed regions are missing. Second, the positions and numbers of key points are captured by simple operators (e.g., local maximization), which are apt to be influenced by noise or spurious points when the spatial resolution is high. In addition, the exploitation of temporal-spatial information is insufficient, which results in incomplete structural information in the detected results.
In view of the aforementioned issues, an unsupervised CD method based on the graph similarity of temporal-spatial-structural neighborhoods is proposed in this paper. First, the input bi-temporal images are segmented into superpixels with one coarse and one fine scale, with the other parameters held constant. Treating each segmented superpixel as a node of a graph, a temporal-spatial-structural neighborhood is defined. Then, a graph is constructed separately for each superpixel at the fine segmentation scale. The two graphs corresponding to the two superpixels in the same position of the bi-temporal images share the same topological structure. Finally, a metric function is used to measure the level of change, which takes both spectral and structural similarity into account. The main contributions of this paper can be summarized as follows:
  • We propose a novel temporal-spatial-structural neighborhood, which combines the common spatial neighborhood with a new temporal-structural neighborhood by introducing the structural constraint of ground objects in the bi-temporal images segmented with different scale parameters. Based on the defined neighborhood, a weighted graph is constructed for each superpixel at the fine scale, which effectively avoids the loss of structural and contextual information, as the complicated interactive relationships among superpixels of both images at the coarse and fine scale parameters are taken into account.
  • A new metric function is designed to measure the similarity between graphs with the same topological structure, which integrates the spectral difference with the temporal-structural difference to better alleviate the impacts of the inevitable spectral variability and noise commonly existing in VHR images.
The rest of this paper is organized as follows. Section 2 illustrates the proposed method in detail. Section 3 exhibits the experimental results on both optical and SAR VHR images. Some discussions of the method are given in Section 4. Finally, conclusions are provided in Section 5.

2. Materials and Methods

In this section, the proposed temporal-spatial-structural graph (TSSG) is elaborated in detail, as shown in Figure 1. Assuming that the input bi-temporal images $I_1$ and $I_2$ (both of size $A \times H \times B$, where $A$, $H$, and $B$ represent the width, height, and number of bands, respectively) have been pre-processed, including co-registration and radiometric calibration, the TSSG mainly consists of the following steps:
  • The input images $I_1$ and $I_2$ are first stacked into one image $I$ (of size $A \times H \times 2B$) in sequence of bands. Then, $I$ is segmented by the fractal net evolution approach (FNEA) [26] with a coarse scale $S_C$ and a fine one $S_F$. After that, the boundaries obtained with $S_C$ and with $S_F$ are each mapped to $I_1$ and $I_2$ to obtain the coarse-scale superpixel sets $\Omega_{C1}$, $\Omega_{C2}$ and the fine-scale sets $\Omega_{F1}$, $\Omega_{F2}$.
  • After defining the temporal-spatial-structural neighborhood, we construct a graph for each superpixel of $\Omega_{F1}$ and $\Omega_{F2}$ based on the defined neighborhood. The graphs $G_{1,i}$ and $G_{2,i}$ corresponding to the superpixels $F_{1,i} \in \Omega_{F1}$ and $F_{2,i} \in \Omega_{F2}$ have the same topological structure.
  • All graph pairs ($G_{1,i}$, $G_{2,i}$), $i \in \{1, 2, \ldots, N\}$, where $N$ is the number of superpixels at scale $S_F$, are fed into the designed metric function to obtain the difference images (DIs).
  • The DI is segmented or clustered to obtain binary change maps.
The pivotal steps of the TSSG and main contributions of this paper are graph construction and the measurement of similarity between two graphs; therefore, we illustrate these two modules in detail.

2.1. Graph Construction Based on Temporal-Spatial-Structural Neighborhood

As a multiresolution segmentation approach, the FNEA can generate superpixels with relatively precise contours of ground objects in remote sensing images. Nevertheless, the influence of the scale parameter is unavoidable. Assuming that the stacked image $I$ (of size $A \times H \times 2B$) is segmented by the FNEA with scale parameters $S_C$ and $S_F$ (all other parameters held constant), the superpixel sets $\Omega_{C1} = \{C_{1,1}, \ldots, C_{1,M}\}$, $\Omega_{C2} = \{C_{2,1}, \ldots, C_{2,M}\}$ and $\Omega_{F1} = \{F_{1,1}, \ldots, F_{1,N}\}$, $\Omega_{F2} = \{F_{2,1}, \ldots, F_{2,N}\}$ can be obtained by mapping the boundaries to $I_1$ and $I_2$, where $M$ and $N$ denote the numbers of superpixels at the coarse and fine scales, respectively, and $M < N$. As only the scale parameters differ when the FNEA is applied, each superpixel $C_{1,i} \in \Omega_{C1}$ is composed of several superpixels in $\Omega_{F1}$; likewise, $C_{2,i} \in \Omega_{C2}$ is composed of the superpixels with the same indexes in $\Omega_{F2}$.
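For illustration only, the following minimal Python sketch shows how the per-superpixel mean spectra and the fine-to-coarse parent map could be computed from two label images; the function names and the 0-indexed label convention are assumptions of this sketch, not part of the original implementation.

```python
import numpy as np

def superpixel_means(image, labels):
    """Mean spectrum of each superpixel; labels is an A x H map of ids 0..N-1."""
    n_sp = int(labels.max()) + 1
    flat = image.reshape(-1, image.shape[-1]).astype(float)
    idx = labels.ravel()
    sums = np.zeros((n_sp, image.shape[-1]))
    np.add.at(sums, idx, flat)                      # accumulate per-superpixel sums
    counts = np.maximum(np.bincount(idx, minlength=n_sp), 1)
    return sums / counts[:, None]

def parent_map(fine_labels, coarse_labels):
    """Fine-to-coarse parent map. Because both scales come from one FNEA
    hierarchy, every fine superpixel lies entirely inside one coarse
    superpixel, so any one of its pixels identifies the parent."""
    parent = np.zeros(int(fine_labels.max()) + 1, dtype=int)
    parent[fine_labels.ravel()] = coarse_labels.ravel()
    return parent
```

The offspring of a coarse superpixel $k$, used below as `children[k]`, then follow as `np.where(parent == k)[0]`.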
For each $F_{1,i} \in \Omega_{F1}$, a temporal-spatial-structural graph $G_{1,i} = \{V_{1,i}, E_{1,i}, W_{1,i}\}$ is constructed, where $V$, $E$, and $W$ denote the node set, edge set, and weight set, respectively. As shown in Figure 2, $V_{1,i}$ is defined as:

$$V_{1,i} = F_{1,i} \cup N_{F1}(i) \cup N_{S2}(i) \tag{1}$$

where $F_{1,i}$ is the superpixel for which the graph is constructed, namely the central node; $N_{F1}(i)$ denotes the superpixels adjacent to $F_{1,i}$ within image $I_1$, namely the node set of the spatial neighborhood; and $N_{S2}(i)$ represents the node set of the temporal-structural neighborhood, defined by

$$N_{S2}(i) = \{F_{2,j} \;;\; F_{1,i} \in C_{1,k} \;\&\; F_{1,j} \in C_{1,k}\} \tag{2}$$

It should be noted that the nodes of the temporal-structural neighborhood lie within the other image $I_2$, as shown by the black solid circles in Figure 2. Their positions (or indexes) can be acquired from the parent-offspring relationships between the segmentations at the coarse and fine scale parameters. Through the definition of Equation (2), the structural information of ground objects between the two imaging periods is introduced into the construction of the graph model. When constructing the corresponding graph $G_{2,i} = \{V_{2,i}, E_{2,i}, W_{2,i}\}$ for $F_{2,i}$, its temporal-structural neighborhood $N_{S1}(i)$ takes the same form as $N_{S2}(i)$:

$$N_{S1}(i) = \{F_{1,j} \;;\; F_{2,i} \in C_{2,k} \;\&\; F_{2,j} \in C_{2,k}\} \tag{3}$$

According to the node sets defined above, the edge set $E_{1,i}$ of $G_{1,i}$ can be denoted by:

$$E_{1,i} = \{(F_{1,i}, F_{1,j}) \cup (F_{1,i}, F_{2,k}) \;;\; F_{1,j} \in N_{F1}(i),\; F_{2,k} \in N_{S2}(i)\} \tag{4}$$

The weights of the edges are computed using:

$$\begin{cases} W_{1,i}(F_{1,i}, F_{1,j}) = \exp(-|\mu_{1,i} - \mu_{1,j}|) \\ W_{1,i}(F_{1,i}, F_{2,k}) = \exp(-|\mu_{1,i} - \mu_{1,k}|) \end{cases} \tag{5}$$

where $\mu_{1,i}$ is the spectral mean value of the $i$th superpixel $F_{1,i}$. It can be observed from Formula (5) that, regardless of whether an edge links $F_{1,i}$ with its spatial neighborhood or with its temporal-structural neighborhood, the weights are all calculated from $I_1$. An exponential fall-off is adopted, so the weight between two superpixels is large if their spectral mean values are close.
Following the same procedure, for each superpixel $F_{2,i}$ in $\Omega_{F2}$, a graph model $G_{2,i} = \{V_{2,i}, E_{2,i}, W_{2,i}\}$ can be constructed which shares the same topological structure as $G_{1,i}$; nevertheless, the weights of the corresponding edges are different.
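As a rough illustration of Equations (1)-(5), the sketch below assembles the node set and edge weights of one graph. The helpers `adjacency`, `parent`, and `children` are assumed to be precomputed (e.g., from the label maps above), and the absolute difference in Equation (5) is realized as a Euclidean norm so that multiband means are handled; both choices are assumptions of this sketch.

```python
import numpy as np

def fine_adjacency(labels):
    """Spatial neighbors N_F(i) from 4-connected transitions in the label map."""
    pairs = set()
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        mask = a != b
        pairs |= set(zip(a[mask].tolist(), b[mask].tolist()))
    adj = {}
    for p, q in pairs:
        adj.setdefault(p, set()).add(q)
        adj.setdefault(q, set()).add(p)
    return adj

def build_graph(i, adjacency, parent, children, mu):
    """Node set (Eq. (1)) and edge weights (Eq. (5)) of the graph of superpixel i.
    mu holds the mean spectra of the image the graph is built on, so the
    weights of both edge types are computed from that image alone."""
    spatial = sorted(adjacency.get(i, ()))                     # N_F(i), same image
    structural = [j for j in children[parent[i]] if j != i]    # temporal-structural nodes
    nodes = [i] + spatial + list(structural)                   # Eq. (1)
    weights = {j: float(np.exp(-np.linalg.norm(mu[i] - mu[j])))
               for j in spatial + list(structural)}            # exponential fall-off
    return nodes, weights
```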

2.2. Measurement of Graph Similarity

Based on the construction method given above, $G_{1,i}$ and $G_{2,i}$ can be obtained for $F_{1,i}$ and $F_{2,i}$ generated with the fine scale parameter. The level of change within the area of the $i$th superpixel can be measured by the similarity between $G_{1,i}$ and $G_{2,i}$, which share the same topological structure. A metric function is designed in this section to measure this similarity. As temporal-spatial-structural information is contained in the constructed graphs, the function considers similarity from two aspects: the similarity of the spatial neighborhood and the similarity of the temporal-structural neighborhood, shown by the blue and red dotted lines in Figure 3, respectively.
The similarity of the spatial neighborhoods is measured in a one-to-one correspondence form, as shown by the blue dotted lines in Figure 3; namely, node pairs in the same positions are compared. The metric value can be calculated by:

$$D_N(i) = \left[ D(\mu_{1,i}, \mu_{2,i}) + \frac{\sum_{F_{1,j} \in N_{F1}(i)} \omega_j \times D(\mu_{1,j}, \mu_{2,j})}{\sum_{F_{1,j} \in N_{F1}(i)} \omega_j} \right] \tag{6}$$

where $D(\mu_{1,i}, \mu_{2,i})$ denotes the Euclidean distance between $\mu_{1,i}$ and $\mu_{2,i}$, and $\omega_j$ is the weight of $D(\mu_{1,j}, \mu_{2,j})$, obtained by

$$\omega_j = 0.5\,\big(W_{1,i}(F_{1,i}, F_{1,j}) + W_{2,i}(F_{2,i}, F_{2,j})\big) \tag{7}$$

where $W_{1,i}$ has been defined in Formula (5). As can be seen, information from both input images is considered.
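The spatial term of Equations (6) and (7) can be written directly from the per-superpixel means. In the hedged sketch below, the edge weights are recomputed inline and the distance $D$ is taken as a Euclidean norm, matching the conventions of the earlier sketches.

```python
import numpy as np

def spatial_distance(i, adjacency, mu1, mu2):
    """D_N(i): one-to-one comparison over the spatial neighborhood (Eqs. (6)-(7))."""
    d = float(np.linalg.norm(mu1[i] - mu2[i]))     # central node pair
    num = den = 0.0
    for j in adjacency.get(i, ()):
        # Eq. (7): average of the edge weights computed from each image
        w = 0.5 * (np.exp(-np.linalg.norm(mu1[i] - mu1[j]))
                   + np.exp(-np.linalg.norm(mu2[i] - mu2[j])))
        num += w * np.linalg.norm(mu1[j] - mu2[j])
        den += w
    return d + (num / den if den > 0.0 else 0.0)
```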
When the FNEA or another superpixel segmentation method is employed on VHR remote sensing images with a relatively fine scale parameter, over-segmentation inevitably exists, which means that much of the structural information of ground objects may be represented by assemblies of adjacent superpixels. As mentioned in Section 2.1, parent-offspring relationships exist between the segmentation results at the coarse and fine scales, and these also contain abundant structural information. To fully utilize this structural information, we adopt a crosswise comparison to measure the similarity of the temporal-structural neighborhoods.
Assuming that $F_{1,i} \in C_{1,k}$ and $C_{1,k}$ is composed of $K$ superpixels in $\Omega_{F1}$, denoted by $C_{1,k} = \{F_{1,i_1}, \ldots, F_{1,i_K}\}$, and similarly $F_{2,i} \in C_{2,k}$ with $C_{2,k} = \{F_{2,i_1}, \ldots, F_{2,i_K}\}$, the similarity of the temporal-structural neighborhoods of $G_{1,i}$ and $G_{2,i}$ can be calculated as:

$$D_S(i) = \left| \frac{\sum_{m=2}^{K} \sum_{n=1}^{m} \gamma_{mn} (\mu_{1,i_m} - \mu_{2,i_n})}{\sum_{m=2}^{K} \sum_{n=1}^{m} \gamma_{mn}} + \frac{\sum_{m=2}^{K} \sum_{n=1}^{m} \lambda_{mn} (\mu_{2,i_m} - \mu_{1,i_n})}{\sum_{m=2}^{K} \sum_{n=1}^{m} \lambda_{mn}} \right| \tag{8}$$

The weights $\gamma_{mn}$ and $\lambda_{mn}$ are obtained by:

$$\begin{cases} \gamma_{mn} = \exp(-|\mu_{1,i_m} - \mu_{1,i_n}|) \\ \lambda_{mn} = \exp(-|\mu_{2,i_m} - \mu_{2,i_n}|) \end{cases} \tag{9}$$

According to Formulas (6) and (8), the level of change (namely, the difference image, DI) within the area of the $i$th superpixel can be formulated as:

$$DI(i) = (1 - \alpha) D_N(i) + \alpha D_S(i) \tag{10}$$

where $\alpha$ is a user-defined constant within the range $[0, 1]$. As $D_N(i)$ denotes the similarity of the spatial neighborhoods, when the signal-to-noise level is low, $\alpha$ should be small so that the spatial neighborhood information is emphasized to suppress noise. The common binary change maps can subsequently be obtained by segmenting or clustering the DIs.
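A corresponding sketch of Equations (8)-(10), as reconstructed above, is given below. It treats the superpixel means as scalars (the single-band SAR case), since the signed differences inside the absolute value of Equation (8) do not generalize to vectors without a per-band extension; this restriction is an assumption of the sketch.

```python
import numpy as np

def structural_distance(i, parent, children, mu1, mu2):
    """D_S(i): crosswise comparison over the offspring of the shared parent
    (Eqs. (8)-(9)); mu1 and mu2 are scalar means here."""
    sibs = list(children[parent[i]])          # the K fine superpixels of C_k
    num1 = den1 = num2 = den2 = 0.0
    for m in range(1, len(sibs)):             # outer index m runs 2..K (1-based)
        for n in range(m + 1):                # inner index n runs 1..m, per Eq. (8)
            a, b = sibs[m], sibs[n]
            gamma = np.exp(-abs(mu1[a] - mu1[b]))     # Eq. (9)
            lam = np.exp(-abs(mu2[a] - mu2[b]))
            num1 += gamma * (mu1[a] - mu2[b])
            den1 += gamma
            num2 += lam * (mu2[a] - mu1[b])
            den2 += lam
    if den1 == 0.0 or den2 == 0.0:            # K = 1: no offspring pairs
        return 0.0
    return abs(num1 / den1 + num2 / den2)

def di_value(d_n, d_s, alpha=0.2):
    """Eq. (10): fuse the two distances; alpha = 0.2 follows Table 1."""
    return (1.0 - alpha) * d_n + alpha * d_s
```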

3. Results

To validate the superiority of the proposed method, the experimental results on both the optical and SAR data sets are shown below.

3.1. Data Sets and Experimental Settings

3.1.1. Optical Data Sets

Three optical data sets are employed in our experiments. The first is a pair of images captured by Gaofen-2 with a spatial resolution of 2 m/pixel; the covered area is Mingfeng Lake, Dongying, Shandong Province, China, as shown in the first row of Figure 4. The main changes between the two acquisitions are the construction of buildings and roads. The second is a pair of images from the openly available SZTAKI aerial data set; the spatial resolution is 1.5 m/pixel and the size is 640 × 952 pixels, as shown in the second row of Figure 4. The third is the Beijing & Tianjin (B&T) data set released by Hou et al. [24], which contains 29 pairs of images of more than 2000 × 2000 pixels for training; in addition, 21 pairs of images with a size of about 500 × 500 pixels are provided for testing. The areas are located in the Beijing and Tianjin regions of China. Some of the images are collected from Google Earth with a resolution of 0.46 m/pixel; the others are imaged by Gaofen-2 with a resolution of 1 m/pixel. The images are acquired under different seasons and illumination conditions, which ensures enough scene complexity to test the robustness of the methods. The third and fourth rows of Figure 4 show two sample image pairs and the corresponding ground truth.

3.1.2. SAR Data Sets

Two VHR data sets are used to conduct the experiments on SAR images. The first is a pair of images acquired by the TerraSAR-X sensor with HH polarization and a spatial resolution of 1 m/pixel, covering a suburban area of Wuhan, China, where the remarkable changes are the construction and demolition of buildings, as shown in the first row of Figure 5. The second corresponds to an area in Beijing, China, as shown in the second row of Figure 5; the images are acquired by Gaofen-3 with a size of 550 × 900 pixels and a spatial resolution of 1 m/pixel.

3.1.3. Experimental Settings

The user-defined constant $\alpha$ and the segmentation scale parameters need to be set when the proposed method is implemented; they are listed in Table 1.
To evaluate the performance of the proposed method, four common quantitative evaluation indices, the false alarm rate (FAR), missed alarm rate (MAR), overall accuracy (OA), and Kappa coefficient (KC), are adopted as metrics. FAR, MAR, and OA are formulated as FAR = FP/(FP + TN), MAR = FN/(FN + TP), and OA = (TP + TN)/(TP + TN + FP + FN), where TP, FP, TN, and FN denote the numbers of true positives, false positives, true negatives, and false negatives, respectively. KC is a statistical measure of the consistency between the change map and the reference map. It is calculated by

$$KC = \frac{OA - PRE}{1 - PRE}, \qquad PRE = \frac{(TP + FN)(TP + FP) + (TN + FP)(TN + FN)}{(TP + TN + FP + FN)^2} \tag{11}$$
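For reference, the four indices reduce to a few lines given a binary change map and a reference map; this sketch assumes both are 0/1 arrays of the same shape.

```python
import numpy as np

def evaluate(change_map, reference):
    """FAR, MAR, OA, and KC from the confusion counts (Eq. (11))."""
    tp = float(np.sum((change_map == 1) & (reference == 1)))
    fp = float(np.sum((change_map == 1) & (reference == 0)))
    tn = float(np.sum((change_map == 0) & (reference == 0)))
    fn = float(np.sum((change_map == 0) & (reference == 1)))
    far = fp / (fp + tn)
    mar = fn / (fn + tp)
    oa = (tp + tn) / (tp + tn + fp + fn)
    pre = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (tp + tn + fp + fn) ** 2
    kc = (oa - pre) / (1.0 - pre)
    return far, mar, oa, kc
```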
As the proposed unsupervised method aims at generating reliable DIs, the Receiver Operating Characteristic (ROC) curve is used to intuitively evaluate the reliability of the DIs regardless of the segmentation or clustering method. The closer the curve is to the upper left point (0, 1), the more reliable the DI. The horizontal and vertical coordinates of an ROC curve are the False Positive Rate (FPR) and the True Positive Rate (TPR), respectively, expressed as:

$$FPR = \frac{FP}{FP + TN}, \qquad TPR = \frac{TP}{TP + FN} \tag{12}$$

Once the ROC curve has been obtained, the threshold corresponding to the point of the curve closest to (0, 1) is selected to segment the DI. As FPR and TPR are both functions of the threshold $T$, the segmentation threshold is decided by

$$\hat{T} = \arg\min_T \left[ \big(FPR(T)\big)^2 + \big(1 - TPR(T)\big)^2 \right] \tag{13}$$
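Equation (13) amounts to a one-dimensional search over candidate thresholds, as in the sketch below. Note that, as in the evaluation protocol here, the reference map is used only to locate the ROC point when assessing DI quality; a deployed unsupervised pipeline would instead binarize the DI with, e.g., Otsu thresholding or k-means clustering.

```python
import numpy as np

def select_threshold(di, reference, n_steps=256):
    """Threshold whose ROC point is closest to (0, 1) (Eqs. (12)-(13))."""
    best_t, best_dist = float(di.min()), np.inf
    for t in np.linspace(float(di.min()), float(di.max()), n_steps):
        change = di > t
        tp = np.sum(change & (reference == 1))
        fp = np.sum(change & (reference == 0))
        tn = np.sum(~change & (reference == 0))
        fn = np.sum(~change & (reference == 1))
        fpr = fp / (fp + tn)
        tpr = tp / (tp + fn)
        dist = fpr ** 2 + (1.0 - tpr) ** 2          # squared distance to (0, 1)
        if dist < best_dist:
            best_t, best_dist = t, dist
    return best_t
```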

3.2. Experiments on Optical Images

To verify the effectiveness of the proposed TSSG for optical images, we compare our TSSG with the following six unsupervised methods: change vector analysis (CVA) [11], object-based change vector analysis (OCVA) [27], deep change vector analysis (DCVA) [28], deep slow feature analysis (DSFA) [29], adaptive spatial-context extraction algorithm (ASEA) [30], and superpixel neighborhood graph (SNG) [31].
The difference images (DIs) generated by the above unsupervised methods are shown in Figure 6, Figure 7, Figure 8 and Figure 9. All the DIs are linearly stretched to the range [0, 1]; in a reliable DI, the values of changed regions are close to 1, while those of unchanged regions are close to 0. From the viewpoint of qualitative visual analysis, DSFA fails to enhance some changed regions, as shown by the red boxes in Figure 7 and Figure 8d. A large number of white isolated points exist in the DIs generated by CVA and ASEA, which may result in severe “pepper and salt” noise in the binary change maps. Compared with the reference images, some values of unchanged regions remain too large in the DIs of OCVA and DCVA. The contrast of the DIs generated by the SNG is intuitively appropriate; nevertheless, some changed regions are too dark, as marked by the yellow boxes in Figure 6 and Figure 9f. Intuitively, in the DIs generated by the TSSG, the color of the changed regions is close to white while the unchanged ones are relatively dark, and the structure of the changed regions is represented clearly.
The ROC curves obtained by the above methods are shown in Figure 10. As can be seen, the ROC curves of CVA are unsatisfactory; the main reason is that, as a pixel-based method, its assumption of pixel independence does not hold in VHR images, so the results are prone to be affected by spectral diversity, since only the spectral information of independent pixels is used to generate the DIs. The ROC curves of OCVA show poor performance for a similar reason: the neglect of neighborhood information among superpixels leaves it vulnerable to spectral diversity. The performance of ASEA is unstable; it performs poorly on the SZTAKI image pair. DSFA cannot achieve good performance because it needs reliable training samples, which can hardly be obtained through simple unsupervised pre-classification on VHR images; therefore, the discriminability of its deep features is limited. DCVA seems better than the above methods; however, its performance on the first patch sample of the B&T data set is also unsatisfactory. The ROC curves of the SNG and TSSG are closer to the upper left point (0, 1) than those of the others, and on close inspection, those of the TSSG are better on all data sets.
The corresponding binary change maps are shown in Figure 11, Figure 12, Figure 13 and Figure 14. For the Mingfeng data set, a large number of false alarms occur in the results of CVA, OCVA, DSFA, and ASEA; namely, many unchanged pixels are detected as changed ones, as shown in the regions marked by the red boxes in Figure 11a,b,d,e. This phenomenon is most obvious in the results of CVA, as it treats each pixel as an independent unit, so structural and contextual information can hardly be utilized. False alarms in the results of DCVA are visibly fewer than those of the others, but missed alarms occur in some regions, such as the red ellipse in Figure 11c. The result of the SNG seems better; however, compared with that of the TSSG, some structural information of changed regions is missing, such as the region marked by the red ellipse in Figure 11f.
For the SZTAKI data set, obvious “pepper and salt” noise occurs in the results of DSFA. CVA and ASEA also fail to avoid the “pepper and salt” noise. DCVA, SNG, and TSSG seem to achieve better performance. Interpreting in detail, more false alarms occur in the results of DCVA and SNG, as shown in the red boxes in Figure 12c,f.
For the B&T data set, due to the influence of spectral diversity, the false alarms in the results of CVA, OCVA, DCVA, DSFA, and ASEA are more remarkable than those of the SNG and TSSG, as shown by the red boxes in Figure 13a–e. Intuitively, the SNG and TSSG achieve close performance. However, on detailed interpretation, the TSSG preserves structural information better in changed regions with complicated ground objects; for example, in the region marked by the red box in Figure 13g, more structural information is missing from the results of the SNG.
The quantitative evaluations of the change maps are listed in Table 2 and Table 3. It can be observed that the SNG and TSSG achieve stable and relatively high performance compared with the other methods. Specifically, their OA and KC are higher than those of the others, while both their MAR and FAR remain relatively low. Of the two, the TSSG achieves higher performance than the SNG because it not only utilizes features from the neighborhoods of superpixels but also fully exploits temporal-structural information when generating the DIs.

3.3. Experiments on SAR Images

To verify the effectiveness of the proposed TSSG for SAR CD tasks, the following eight unsupervised methods are employed as benchmarks: log ratio (LR) [13], mean ratio (MR) [14], differential principal component analysis (DPCA) [32], object-based mean ratio (OMR) [33], object-based contrast neighborhood (OCN) [33], principal component analysis networks (PCA-Net) [34], convolutional wavelet neural networks (CWNN) [35], and the superpixel neighborhood graph (SNG) [31]. Because PCA-Net and CWNN directly generate a binary change map without producing a DI, only the binary change maps of these two methods are compared with those of the others.
The difference images (DIs) generated by the aforementioned methods on the SAR data sets are shown in Figure 15 and Figure 16. Intuitively, LR, MR, OMR, and OCN have weak immunity to the speckle noise that commonly exists in VHR SAR images, which results in many bright spots in unchanged regions, such as the regions marked by red ellipses in Figure 15a,b,d,e. The performance of DPCA is unsatisfying, as the discrimination between changed and unchanged regions is weak. In addition, as pixel-based methods, LR, MR, and DPCA generate numerous isolated points because contextual information is not effectively utilized. The contrast of the DIs generated by the SNG is obvious; however, the brightness of some changed positions is too low, which may lead to missed alarms, such as the region of a changed building in the lower-left part of Figure 16f.
The ROC curves on the two SAR data sets are shown in Figure 17. It can be clearly observed that DPCA achieves poor performance. Intuitively, the ROC curves of the SNG and TSSG are closer to (0, 1) than those of the others, and of the two, the TSSG is better than the SNG.
The binary change maps of the SAR data sets are exhibited in Figure 18 and Figure 19. As can be seen, the results in Figure 18 and Figure 19a–c contain numerous false alarms distributed across the whole scene. The reason is that, as pixel-based methods, LR, MR, and DPCA utilize little geometric and structural information; speckle noise is always severe in VHR SAR images, and immunity to it is poor if only intensity information is considered. For the Beijing SAR data set, as can be observed in the red boxes of Figure 18e–g, some unchanged positions are detected as changed ones in the results of OCN, PCA-Net, and CWNN. In contrast, the SNG and TSSG balance false alarms and missed alarms better than the others. For the Wuhan SAR data set, evident false alarm regions occur in the results of OMR and OCN, while PCA-Net and CWNN seem to miss the integrated structural information in some changed regions, where many small holes appear within the building regions. The results of the SNG and TSSG preserve structural information better than the others; observing in detail, the contours of the buildings in Figure 19i are more precise than those in Figure 19h.
Table 4 lists the quantitative evaluations of binary change maps on SAR data sets. It can be observed that the OA and KC of the SNG and TSSG are higher than those of others. Meanwhile, the FAR and MAR of the SNG and TSSG are relatively low. Comparing in detail, the TSSG achieves higher performance than all other benchmark methods.

4. Discussion

4.1. Discussion of the Methods of Multitemporal Segmentation

To ensure that a superpixel $F_{1,i}$ in $I_1$ and its counterpart $F_{2,i}$ in $I_2$ strictly cover the same geographical area, four methods of multitemporal segmentation can be considered: (1) $I_1$ and $I_2$ are stacked into one image, the stacked image is segmented, and the resulting contours are mapped to both $I_1$ and $I_2$ (Stacked); (2) $I_1$ is segmented, and the contours are assigned to $I_2$ (Assigned-1); (3) $I_2$ is segmented, and the contours are assigned to $I_1$ (Assigned-2); (4) $I_1$ and $I_2$ are segmented separately, and the contours are overlaid (Overlaid). As an obvious discrepancy exists among superpixels acquired in these different ways, the performance of object-based methods, which treat each superpixel as a basic unit, is unavoidably influenced by the segmentation method. Hence, in this section, to study this influence on the TSSG, the four methods above are applied to the input image pairs. For fairness, the parameters of the TSSG are kept the same under the different segmentation methods, as listed in Table 1.
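A minimal sketch of the Stacked strategy is shown below. Since FNEA has no standard open-source implementation, SLIC from scikit-image (version 0.19 or later, for the channel_axis argument) stands in as the segmenter; this substitution is for illustration only and is not the segmenter used in the experiments.

```python
import numpy as np
from skimage.segmentation import slic

def stacked_segmentation(img1, img2, n_segments=2000):
    """'Stacked' multitemporal segmentation: band-stack the two co-registered
    images, segment once, and share the resulting label map between them,
    so that F_{1,i} and F_{2,i} cover exactly the same geographical area."""
    stacked = np.concatenate([img1, img2], axis=-1)     # A x H x 2B
    labels = slic(stacked, n_segments=n_segments, start_label=0, channel_axis=-1)
    return labels   # one label map, valid for both img1 and img2
```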
The ROC curves under the different segmentation methods on the Mingfeng and Wuhan SAR data sets are shown in Figure 20. As can be seen, the ROC curves of the Stacked method are the closest to the upper left point (0, 1). The reason may be that the Stacked method takes both input images into account when the FNEA is applied; thus, the generated contours fit the ground objects of both images more accurately. By contrast, the Assigned-1 and Assigned-2 methods consider only one image when the FNEA is applied; under these circumstances, the contours may fit the ground objects of one image well but fail to accurately delineate those of the other. For the Overlaid method, each image is segmented separately, and numerous misaligned contours exist between the two images; after contour overlaying, the superpixels may be too fragmented, which can result in inadequate exploitation of structural information.

4.2. Influence of Parameter α

The parameter $\alpha$ decides the weight of the information from the temporal-structural neighborhood. To clarify the influence of $\alpha$ on the performance of the TSSG, we vary $\alpha$ and compare the corresponding ROC curves on all test image pairs of the B&T data set. The ROC curves on six image pairs are shown in Figure 21.
As can be observed from Figure 21, when $\alpha = 0.1, 0.2, 0.3$, the ROC curves are relatively close to the upper left point (0, 1). Nevertheless, when $\alpha = 0.4, 0.5$, the performance of the TSSG is relatively lower on some image pairs, e.g., Figure 21a–c. The information from the temporal-structural neighborhood contributes to comprehensively measuring the level of change, as the structural changes between the two images are taken into account. However, when $\alpha$ is large, some interference may be introduced. Therefore, to achieve stable and relatively high performance, we suggest that $\alpha$ be no larger than 0.3 when employing the proposed TSSG.

5. Conclusions

In this paper, unsupervised change detection is studied from the perspective of the similarity between two graphs with the same topological structure. The basic strategy is to treat each superpixel as a node of a graph and then present the structural and contextual relationships by a pair of defined graphs. Specifically, an unsupervised method based on the temporal-spatial-structural graph (TSSG) is proposed for VHR image change detection tasks. After obtaining superpixels by segmentation, a temporal-structural neighborhood is defined using the parent-offspring relationships between the coarse and fine segmentation scales, which presents the structural information of ground objects. Incorporated with the spatial neighborhood, the TSSG extends the interactive range among nodes from two dimensions to three dimensions, which can more fully exploit the structural and contextual information of the bi-temporal images. Subsequently, to measure the similarity between two graphs, both spectral and structural similarities are taken into account to define the metric function used to generate the difference images. The proposed TSSG has been compared with several state-of-the-art unsupervised CD methods on both VHR optical and SAR data sets, and the experimental results demonstrate the effectiveness and superiority of our method.

Author Contributions

Conceptualization, J.W. and B.L.; methodology, J.W. and H.B.; software, J.W.; validation, J.W., W.N. and K.C.; formal analysis, Q.L. and X.K.; investigation, J.W.; resources, J.W. and Q.L.; data curation, K.C. and W.N.; writing—original draft preparation, J.W.; writing—review and editing, J.W. and X.K.; visualization, J.W.; supervision, J.W. and B.L.; project administration, B.L.; funding acquisition, W.N. and J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China, grant number 42101344.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Radke, R.J.; Andra, S.; Kofahi, O.A.; Roysam, B. Image change detection algorithms: A systematic survey. IEEE Trans. Image Process. 2005, 14, 294–307.
  2. Lu, P.; Qin, Y.; Li, Z.; Mondini, A.C.; Casagli, N. Landslide mapping from multi-sensor data through improved change detection-based Markov random field. Remote Sens. Environ. 2019, 231, 111235.
  3. Gao, Y.; Gao, F.; Dong, J.; Wang, S. Transferred deep learning for sea ice change detection from synthetic-aperture radar images. IEEE Geosci. Remote Sens. Lett. 2016, 13, 1666–1670.
  4. Timilsina, S.; Aryal, J.; Kirkpatrick, J.B. Mapping urban tree cover change using object-based convolution neural network (OB-CNN). Remote Sens. 2020, 12, 3017.
  5. Chen, H.; Shi, Z. A spatial-temporal attention-based method and a new dataset for remote sensing image change detection. Remote Sens. 2020, 12, 1662.
  6. Zhang, C.; Yue, P.; Tapete, D.; Jiang, L.; Shangguan, B.; Huang, L. A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images. ISPRS J. Photogramm. Remote Sens. 2020, 166, 183–200.
  7. Shi, Q.; Liu, M.; Li, S.; Liu, X.; Wang, F.; Zhang, L. A deeply supervised attention metric-based network and an open aerial image dataset for remote sensing change detection. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–16.
  8. Li, Z.; Shi, W.; Zhang, C.; Geng, J.; Huang, J.; Ye, Z. Subpixel change detection based on improved abundance values for remote sensing images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 10073–10086.
  9. Wang, P.; Wang, L.; Leung, H.; Zhang, G. Super-resolution mapping based on spatial-spectral correlation for spectral imagery. IEEE Trans. Geosci. Remote Sens. 2021, 59, 2256–2268.
  10. Nielsen, A.A. The regularized iteratively reweighted MAD method for change detection in multi- and hyperspectral data. IEEE Trans. Image Process. 2007, 16, 463–478.
  11. Lambin, E.F.; Strahlers, A.H. Change-vector analysis in multitemporal space: A tool to detect and categorize land-cover change processes using high temporal-resolution satellite data. Remote Sens. Environ. 1994, 48, 234–244.
  12. Bovolo, F.; Marchesi, S.; Bruzzone, L. A framework for automatic and unsupervised detection of multiple changes in multitemporal images. IEEE Trans. Geosci. Remote Sens. 2012, 50, 2196–2212.
  13. Dekker, R.J. Speckle filtering in satellite SAR change detection imagery. Int. J. Remote Sens. 1998, 19, 1133–1146.
  14. Inglada, J.; Mercier, G. A new statistical similarity measure for change detection in multitemporal SAR images and its extension to multiscale change analysis. IEEE Trans. Geosci. Remote Sens. 2007, 45, 1432–1445.
  15. Chatelain, F.; Tourneret, J.Y. Change detection in multisensor SAR images using bivariate Gamma distributions. IEEE Trans. Image Process. 2008, 17, 249–258.
  16. Gong, M.; Zhang, P.; Su, L.; Liu, J. Coupled dictionary learning for change detection from multisource data. IEEE Trans. Geosci. Remote Sens. 2016, 54, 7077–7091.
  17. Zhuang, H.; Tan, Z.; Deng, K.; Yao, G. Adaptive generalized likelihood ratio test for change detection in SAR images. IEEE Geosci. Remote Sens. Lett. 2020, 17, 416–420.
  18. Bruzzone, L.; Prieto, D.F. Automatic analysis of the difference image for unsupervised change detection. IEEE Trans. Geosci. Remote Sens. 2000, 38, 1171–1182.
  19. Chen, Q.; Chen, Y. Multi-feature object-based change detection using self-adaptive weight change vector analysis. Remote Sens. 2016, 8, 549.
  20. Im, J.; Jensen, J.R.; Tullis, J.A. Object-based change detection using correlation image analysis and image segmentation. Int. J. Remote Sens. 2008, 29, 399–423.
  21. Lefebvre, A.; Corpetti, T.; Hubert-Moy, L. Object-oriented approach and texture analysis for change detection in very high resolution images. In Proceedings of the IGARSS 2008—2008 IEEE International Geoscience and Remote Sensing Symposium, Boston, MA, USA, 6–11 July 2008; pp. 663–666.
  22. Wu, T.; Luo, J.; Fang, J.; Song, X. Unsupervised object-based change detection via a Weibull mixture model-based binarization for high-resolution remote sensing images. IEEE Geosci. Remote Sens. Lett. 2018, 15, 63–67.
  23. Pham, M.T.; Mercier, G.; Michel, J. Change detection between SAR images using a pointwise approach and graph theory. IEEE Trans. Geosci. Remote Sens. 2016, 54, 2020–2032.
  24. Wang, J.; Yang, X.; Jia, L.; Yang, X.; Dong, Z. Pointwise SAR image change detection using stereo-graph cuts with spatio-temporal information. Remote Sens. Lett. 2019, 10, 421–429.
  25. Wang, J.; Yang, X.; Yang, X.; Jia, L.; Fang, S. Unsupervised change detection between SAR images based on hypergraphs. ISPRS J. Photogramm. Remote Sens. 2020, 164, 61–72.
  26. Baatz, M.; Schape, A. Multiresolution segmentation: An optimization approach for high quality multiscale image segmentation. In Beitrage zum AGIT-Symposium Salzburg 1999; Herbert Wichmann Verlag: Heidelberg, Germany, 2000; pp. 12–23. Available online: https://www.researchgate.net/publication/268745811_An_optimization_approach_for_high_quality_multi-scale_image_segmentation (accessed on 27 February 2023).
  27. Li, L.; Li, X.; Zhang, Y.; Wang, L.; Ying, G. Change detection for high-resolution remote sensing imagery using object-oriented change vector analysis method. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 2873–2876.
  28. Saha, S.; Bovolo, F.; Bruzzone, L. Unsupervised deep change vector analysis for multiple-change detection in VHR images. IEEE Trans. Geosci. Remote Sens. 2019, 57, 3677–3693.
  29. Du, B.; Ru, L.; Wu, C.; Zhang, L. Unsupervised deep slow feature analysis for change detection in multi-temporal remote sensing images. IEEE Trans. Geosci. Remote Sens. 2019, 57, 9976–9992.
  30. Lv, Z.; Wang, F.; Liu, T.; Kong, X.; Benediktsson, J.A. Novel automatic approach for land cover change detection by using VHR remote sensing images. IEEE Geosci. Remote Sens. Lett. 2022, 19, 63–67.
  31. Wu, J.; Li, B.; Qin, Y.; Ni, W.; Zhang, H. An object-based graph model for unsupervised change detection in high resolution remote sensing images. Int. J. Remote Sens. 2021, 42, 6212–6230.
  32. Falco, N.; Marpu, R.; Benediktsson, J. A toolbox for unsupervised change detection analysis. Int. J. Remote Sens. 2016, 37, 1505–1526.
  33. Yousif, O.; Ban, Y. A novel approach for object-based change detection image generation using multitemporal high-resolution SAR images. Int. J. Remote Sens. 2017, 38, 1765–1787.
  34. Gao, F.; Dong, J.; Li, B.; Xu, Q. Automatic change detection in synthetic aperture radar images based on PCANet. IEEE Geosci. Remote Sens. Lett. 2016, 13, 1792–1796.
  35. Gao, F.; Dong, J.; Li, B.; Xu, Q. Sea ice change detection in SAR images based on convolutional-wavelet neural networks. IEEE Geosci. Remote Sens. Lett. 2019, 16, 1240–1244.
Figure 1. Flowchart of the proposed method.
Figure 2. Illustration of nodes of a temporal-spatial-structural graph.
Figure 3. Measurement of a temporal-spatial-structural graph pair.
Figure 4. Image patch examples and corresponding reference images of optical data set: (a) Images from T1; (b) Images from T2; (c) Ground Truth.
Figure 5. Images and corresponding reference images of SAR data set: (a) Images from T1; (b) Images from T2; (c) Ground Truth.
Figure 6. DIs of Mingfeng data set generated by different methods: (a) CVA; (b) OCVA; (c) DCVA; (d) DSFA; (e) ASEA; (f) SNG; (g) TSSG; (h) Reference image.
Figure 7. DIs of SZTAKI data set generated by different methods: (a) CVA; (b) OCVA; (c) DCVA; (d) DSFA; (e) ASEA; (f) SNG; (g) TSSG; (h) Reference image.
Figure 8. DIs of the first patch example of B&T data set generated by different methods: (a) CVA; (b) OCVA; (c) DCVA; (d) DSFA; (e) ASEA; (f) SNG; (g) TSSG; (h) Reference image.
Figure 9. DIs of the second patch example of B&T data set generated by different methods: (a) CVA; (b) OCVA; (c) DCVA; (d) DSFA; (e) ASEA; (f) SNG; (g) TSSG; (h) Reference image.
Figure 10. ROC curves of different methods on the four data sets: (a) Mingfeng; (b) SZTAKI; (c) The first patch sample of B&T data set; (d) The second patch sample of B&T data set.
Figure 11. Binary change maps of Mingfeng data set generated by different methods: (a) CVA; (b) OCVA; (c) DCVA; (d) DSFA; (e) ASEA; (f) SNG; (g) TSSG; (h) Reference image.
Figure 12. Binary change maps of SZTAKI data set generated by different methods: (a) CVA; (b) OCVA; (c) DCVA; (d) DSFA; (e) ASEA; (f) SNG; (g) TSSG; (h) Reference image.
Figure 13. Binary change maps of B&T sample 1 generated by different methods: (a) CVA; (b) OCVA; (c) DCVA; (d) DSFA; (e) ASEA; (f) SNG; (g) TSSG; (h) Reference image.
Figure 14. Binary change maps of B&T sample 2 generated by different methods: (a) CVA; (b) OCVA; (c) DCVA; (d) DSFA; (e) ASEA; (f) SNG; (g) TSSG; (h) Reference image.
Figure 15. DIs of Beijing SAR data set generated by different methods: (a) LR; (b) MR; (c) DPCA; (d) OMR; (e) OCN; (f) SNG; (g) TSSG; (h) Reference image.
Figure 16. DIs of Wuhan SAR data set generated by different methods: (a) LR; (b) MR; (c) DPCA; (d) OMR; (e) OCN; (f) SNG; (g) TSSG; (h) Reference image.
Figure 17. ROC curves of different methods on the two SAR data sets: (a) Beijing; (b) Wuhan.
Figure 18. Binary change maps of Beijing SAR images generated by different methods: (a) LR; (b) MR; (c) DPCA; (d) OMR; (e) OCN; (f) PCA-Net; (g) CWNN; (h) SNG; (i) TSSG; (j) Reference image.
Figure 19. Binary change maps of Wuhan SAR images generated by different methods: (a) LR; (b) MR; (c) DPCA; (d) OMR; (e) OCN; (f) PCA-Net; (g) CWNN; (h) SNG; (i) TSSG; (j) Reference image.
Figure 20. ROC curves of different segmentation methods on the two data sets: (a) Mingfeng; (b) Wuhan SAR.
Figure 21. ROC curves of different α on six image pairs: (a) Image pair-1; (b) Image pair-2; (c) Image pair-3; (d) Image pair-4; (e) Image pair-5; (f) Image pair-6.
Table 1. Parameter settings.

Data Set            Mingfeng    SZTAKI      B&T         Wuhan       Beijing
α                   0.2         0.2         0.2         0.2         0.2
Scale parameters    (15, 30)    (15, 30)    (10, 30)    (15, 25)    (15, 25)
Table 2. Quantitative accuracy results for different methods on the Mingfeng and SZTAKI data sets (the best results are shown in bold).

           Mingfeng                          SZTAKI
Methods    FAR     MAR     OA      KC       FAR     MAR     OA      KC
CVA        12.31   30.38   87.27   44.35    1.13    64.61   95.21   43.77
OCVA       11.82   32.31   87.57   44.20    1.22    64.01   95.12   43.87
DCVA       3.58    52.20   90.32   50.36    2.67    45.28   94.87   52.49
DSFA       10.17   55.21   86.88   34.58    7.29    69.87   89.09   18.59
ASEA       6.64    32.69   91.32   50.29    2.93    66.04   93.42   33.93
SNG        3.56    45.90   93.11   51.51    3.10    43.10   94.59   51.97
TSSG       3.54    45.27   93.18   52.10    0.90    56.28   95.90   53.24
Table 3. Quantitative accuracy results for different methods on the B&T data set (the best results are shown in bold).

           B&T Sample 1                      B&T Sample 2
Methods    FAR     MAR     OA      KC       FAR     MAR     OA      KC
CVA        13.48   48.29   81.18   34.54    5.66    45.29   91.90   43.83
OCVA       11.66   43.86   83.40   41.03    4.26    40.66   93.51   51.34
DCVA       12.02   50.08   81.95   35.91    4.71    32.93   93.72   54.25
DSFA       12.74   57.00   80.47   29.69    5.04    36.82   93.08   51.51
ASEA       10.21   44.55   84.53   43.15    3.91    39.20   94.25   55.08
SNG        9.45    40.36   85.81   47.88    3.02    42.59   94.46   55.87
TSSG       9.00    38.09   86.54   50.51    1.57    43.35   95.88   60.51
Table 4. Quantitative accuracy results for different methods on the SAR data sets (the best results are shown in bold).

           Beijing                           Wuhan
Methods    FAR     MAR     OA      KC       FAR     MAR     OA      KC
LR         8.52    62.49   86.15   27.14    17.49   41.22   78.78   34.03
MR         5.20    56.00   89.29   40.34    13.01   19.03   86.05   56.35
DPCA       7.75    76.25   85.49   21.40    15.53   54.69   78.31   26.69
OMR        9.51    28.75   88.59   49.06    14.13   18.92   85.12   54.39
OCN        6.20    46.90   89.58   44.97    12.24   34.61   84.42   47.21
PCA-Net    7.48    39.49   89.36   47.01    2.85    38.20   89.59   65.02
CWNN       10.50   37.46   86.84   41.30    4.89    27.18   89.90   67.19
SNG        7.08    40.14   89.66   47.60    8.58    15.25   90.37   67.70
TSSG       6.88    37.99   90.05   49.65    5.78    18.47   92.23   72.09