The BAs in SAR images appear heterogeneous, with alternating bright and dark areas caused by the double-bounce reflection of buildings, the shadow effect, and multiple reflections. As a result, image intensity, spatial correlation, and texture information all need to be taken into consideration to extract BAs. The procedure proposed in this work consists of three main steps: (1) preprocessing of the images, including contrast enhancement and image filtering; (2) BAs extraction based on a seed selection and region growing method; and (3) post-processing of the extracted BAs map, including mountain masking using DEM data and morphological processing.
3.2. Built-Up Areas Extraction Algorithm
The BAs extraction approach is based on the Seeds Selection and Region Growing (SSRG) procedure. The points in SAR images that show high backscattering and obvious textural patterns are chosen as seeds and treated as candidate BAs. The seeds are characterized by three features: image intensity, a local spatial indicator, and a texture feature. The region growing step is implemented independently for each of the three features, and the final BAs extraction result is obtained by merging the results of the three SSRG procedures.
(a) Seed extraction:
The pixels with very high backscattering in SAR images have a high probability of belonging to BAs and are selected as the first set of seeds. A first intensity threshold is applied to the whole image to obtain these seeds. This threshold is a value in the [0, 1] range that specifies the cutoff as a fraction of the maximum intensity (i.e., 255).
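As an illustration, the seed thresholding described above can be sketched in Python with NumPy. The names `intensity_seeds` and `seed_fraction` are hypothetical, and treating pixels exactly at the cutoff as seeds is an assumption; the example values are arbitrary.

```python
import numpy as np

def intensity_seeds(image_8bit, seed_fraction):
    """Boolean mask of pixels at or above seed_fraction * 255 (hypothetical helper)."""
    if not 0.0 <= seed_fraction <= 1.0:
        raise ValueError("seed_fraction must lie in [0, 1]")
    return np.asarray(image_8bit) >= seed_fraction * 255.0

# Toy 8-bit image; a fraction of 0.9 gives a cutoff of 229.5 gray levels.
image = np.array([[10, 200, 250],
                  [30, 240, 90],
                  [255, 20, 60]], dtype=np.uint8)
seeds = intensity_seeds(image, 0.9)
```

Only the three pixels with values 250, 240 and 255 exceed the cutoff and become seeds in this toy case.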
The second set of seeds is obtained from the local Getis-Ord $G_i$ index. $G_i$ is useful for determining clusters of similar values, where concentrations of high values result in a high $G_i$ value and concentrations of low values result in a low $G_i$ value [22]. The local Getis-Ord $G_i$ index is defined as

$$G_i = \frac{\sum_{j} w_{ij} x_j}{\sum_{j} x_j} \tag{1}$$

where $x_j$ is the intensity value of pixel $j$. As shown in Equation (1), the calculation of the local $G_i$ index requires the spatial weight matrix $W$. Each weight element $w_{ij}$ in $W$ corresponds to a pair of observations at locations $i$ and $j$. $w_{ij}$ is set to 1 if the two locations show spatial interaction, and to 0 if they do not. There are three main forms of weight matrix: Rook’s, Queen’s, and Bishop’s. In Rook’s matrix, $w_{ij}$ is set to 1 if pixel $i$ shares a common edge with $j$; otherwise, it is set to 0. In Bishop’s case, $w_{ij}$ is set to 1 if the pair shares a common vertex, and to 0 otherwise. In Queen’s case, $w_{ij}$ is set to 1 if the pair shares either a common edge or a vertex; otherwise, it is set to 0 [35]. In our work, the Queen’s case (i.e., the eight neighbors of the pivotal position $i$ are considered) is used to obtain more compact building areas. The denominator $\sum_{j} x_j$ is the same for every pixel $i$ in the target image and can be neglected, so the $G_i$ index is actually a function of the eight neighbors of position $i$. In this case, the central pixel $i$ can be detected as long as it is surrounded by pixels with high intensities. As a result, both bright concentrations and “outliers”, i.e., pixels with low intensity surrounded by pixels with high values, can be detected by $G_i$. Thus, some areas with shadow effects or low levels of reflection can be identified using the $G_i$ index, which is an alternative to using only image intensity. After computing the $G_i$ index, the $G_i$ map is stretched to a value range between 0 and 255. To obtain the second set of seeds, a threshold in the [0, 1] range is applied to the 8-bit gray-scale Getis-Ord $G_i$ map.
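The computation described above can be sketched as follows, assuming a Queen’s-case neighborhood with the constant denominator dropped. The function names, the zero padding at the image borders, and the min-max form of the stretch are assumptions made for illustration, not details given in the text.

```python
import numpy as np

def queen_neighbor_sum(image):
    """Sum of the eight Queen's-case neighbors of every pixel (zero-padded borders)."""
    img = np.asarray(image, dtype=np.float64)
    padded = np.pad(img, 1, mode="constant")
    rows, cols = img.shape
    total = np.zeros_like(img)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # the central pixel itself carries zero weight
            total += padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
    return total

def stretch_to_8bit(values):
    """Min-max stretch to the 0..255 range (assumed form of the stretch)."""
    lo, hi = values.min(), values.max()
    if hi == lo:
        return np.zeros(values.shape, dtype=np.uint8)
    return np.round(255.0 * (values - lo) / (hi - lo)).astype(np.uint8)

def getis_ord_seeds(image, seed_fraction):
    """Return the stretched neighbor-sum map and the seed mask derived from it."""
    g_map = stretch_to_8bit(queen_neighbor_sum(image))
    return g_map, g_map >= seed_fraction * 255.0

# A dark pixel enclosed by a bright ring, as in a shadowed spot inside a built-up block.
image = np.zeros((5, 5))
image[1:4, 1:4] = 200.0
image[2, 2] = 0.0
g_map, seeds = getis_ord_seeds(image, 0.9)
```

In this toy case the central pixel has zero intensity, yet it receives the highest stretched value and is the only seed, which is the “outlier” behavior the index is used for.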
To obtain more precise BAs extraction, texture features also need to be considered in areas with remarkable heterogeneity. For a single-polarized SAR image, integrating both texture measures and intensity has proven to be effective in classification [36,37]. The last set of seeds is obtained from a variogram function-based texture feature. Compared with the traditional texture features derived from the gray level co-occurrence matrix (GLCM) and other classic texture-based methods, the variogram function-based textures provide greater efficiency and better results from SAR data [38]. The traditional semivariogram is defined as follows:

$$\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z(x_i) - Z(x_i + h) \right]^2 \tag{2}$$

The variogram function-based feature used in this work is a simple variant of (2), namely the madogram, defined as follows [39]:

$$M(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left| Z(x_i) - Z(x_i + h) \right| \tag{3}$$

where $Z(x_i)$ represents the image intensity at location $x_i$; $h$, called the lag distance, is a vector possessing both magnitude and direction; and $Z(x_i + h)$ represents the pixel located at distance $h$ from location $x_i$ in the vector direction. The madogram $M(h)$ is calculated in the neighborhood window of each pixel. $N(h)$ is the number of pairs of pixels a distance $h$ apart in the vector direction within the window. Generally, the lag distance is defined as the value at which the $Z(x_i)$ are no longer correlated, i.e., where the semivariogram reaches a sill [40]; this also applies to the madogram. The latter, which takes absolute values instead of the squared differences used by the semivariogram, can also relate the variance of pixels to their spatial location and characterize the spatial variability in a neighborhood, thus providing the same performance as the semivariogram but with much lower computational complexity [41]. The optimal window size and lag distance were set to 9 × 9 and 3, respectively, based on trials seeking a tradeoff between accuracy and false alarms. Too large a window size will cause boundary effects, while too small a window size will fail to capture the texture structures. The madogram $M(h)$ is calculated by averaging over four directions: 0°, 45°, 90° and 135°.
After computing the madogram, the associated map is also stretched to a value range between 0 and 255. To obtain the last set of seeds, a threshold in the [0, 1] range is applied to the 8-bit gray-scale madogram map.
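A minimal sketch of the windowed madogram is given below, assuming the 9 × 9 window, a lag of 3 pixels, and the four directions stated above. The function names are hypothetical; measuring the diagonal lag as 3 pixels along both axes and using reflect padding at the image borders are assumptions. The plain double loop favors readability over speed.

```python
import numpy as np

def lag_vectors(lag):
    """(row, col) offsets for the 0, 45, 90 and 135 degree directions."""
    return [(0, lag), (-lag, lag), (-lag, 0), (-lag, -lag)]

def window_madogram(window, lag):
    """Madogram of one window, averaged over the four directions."""
    win = np.asarray(window, dtype=np.float64)
    rows, cols = win.shape
    if lag >= min(rows, cols):
        raise ValueError("lag must be smaller than the window size")
    per_direction = []
    for dr, dc in lag_vectors(lag):
        # Select every pixel whose partner at (r + dr, c + dc) lies inside the window.
        r0, r1 = max(0, -dr), min(rows, rows - dr)
        c0, c1 = max(0, -dc), min(cols, cols - dc)
        head = win[r0:r1, c0:c1]
        tail = win[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
        per_direction.append(np.abs(head - tail).sum() / (2.0 * head.size))
    return float(np.mean(per_direction))

def madogram_map(image, window=9, lag=3):
    """Madogram value for the window centered on every pixel."""
    img = np.asarray(image, dtype=np.float64)
    half = window // 2
    padded = np.pad(img, half, mode="reflect")
    out = np.empty_like(img)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            out[r, c] = window_madogram(padded[r:r + window, c:c + window], lag)
    return out

# A horizontal ramp: intensity equals the column index.
ramp = np.tile(np.arange(9.0), (9, 1))
ramp_value = window_madogram(ramp, 3)
flat_map = madogram_map(np.full((12, 12), 7.0))
```

For the ramp, three of the four directions see a difference of 3 gray levels (giving 1.5 each) and the vertical direction sees none, so the averaged value is 1.125; a constant image gives zero everywhere.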
(b) Region growing:
The second step is region growing from each of the three sets of seeds extracted in step (a). For the intensity seeds, the pixels in a window around the seeds that have a backscattering larger than a second intensity threshold are included in the BAs map. Then, the newly added pixels are added to the intensity seeds, and the region growing step is repeated until no additional pixel can be added to the BAs map. The same region growing process is applied separately to the Getis-Ord $G_i$ seeds and the madogram seeds, using a second $G_i$ threshold and a second madogram threshold. The window size is set to 3 × 3, since window sizes between 3 and 7 make no difference to the result [33].
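The iterative growing step can be sketched as follows. The function `grow_region` and its parameter names are hypothetical, and the growing threshold is expressed here directly in gray levels for simplicity, which is an assumption.

```python
import numpy as np

def grow_region(feature, seeds, grow_threshold, window=3):
    """Grow a seed mask until no pixel near the region exceeds grow_threshold."""
    feat = np.asarray(feature, dtype=np.float64)
    region = np.asarray(seeds, dtype=bool).copy()
    candidate = feat > grow_threshold  # pixels allowed to join the region
    half = window // 2
    rows, cols = feat.shape
    while True:
        # Mark every pixel that has a region pixel inside its window.
        padded = np.pad(region, half, mode="constant")
        near = np.zeros_like(region)
        for dr in range(-half, half + 1):
            for dc in range(-half, half + 1):
                near |= padded[half + dr:half + dr + rows,
                               half + dc:half + dc + cols]
        grown = region | (near & candidate)
        if np.array_equal(grown, region):
            return region  # converged: nothing was added in this pass
        region = grown

# One bright seed, two moderately bright neighbors, a dark gap, then a bright pixel.
feature = np.array([[250.0, 180.0, 170.0, 40.0, 190.0]])
seeds = feature >= 229.5
region = grow_region(feature, seeds, grow_threshold=150.0)
```

The region expands from the seed across the two moderately bright pixels and stops at the dark gap, so the bright pixel beyond the gap is not reached with a 3 × 3 window.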
Three BAs extraction maps are obtained through the three independent SSRG processes, and the final BAs map is obtained by merging the three maps using a logical OR operator. For the Sentinel-1 SAR data, with a resolution of approximately 5 × 20 m², the texture features are not obvious in dense urban areas, and image intensity can be used to extract most BAs. The Getis-Ord $G_i$ and madogram features are supplements that yield more precise BAs extraction results in areas with apparent textures (e.g., industrial zones with large, flat roofs). A more detailed analysis of how the parameters are determined is provided in Section 4.1.
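The merging step amounts to a pixel-wise OR of the three binary maps. A short sketch, with hypothetical names and arbitrary toy masks, is:

```python
import numpy as np

def merge_ba_maps(intensity_map, getis_map, madogram_map):
    """Pixel-wise logical OR of the three SSRG result maps."""
    return np.logical_or.reduce([np.asarray(intensity_map, dtype=bool),
                                 np.asarray(getis_map, dtype=bool),
                                 np.asarray(madogram_map, dtype=bool)])

from_intensity = np.array([[True, False, False, False]])
from_getis = np.array([[False, True, False, False]])
from_madogram = np.array([[False, False, True, False]])
merged = merge_ba_maps(from_intensity, from_getis, from_madogram)
```

A pixel is kept in the merged map if any one of the three processes detected it.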
3.3. Post-Processing
The last part of the framework (Figure 1) aims to remove false positives caused by mountains. The strong reflections due to foreshortening or layover effects in mountainous areas are highly likely to be misclassified as buildings, which significantly reduces the extraction accuracy. The most direct way to address this problem is to use digital elevation model (DEM) data to mask the mountain areas. Since buildings may also exist on high plateaus, a single elevation threshold cannot be used to automatically mask the layover areas for all regions. Therefore, the slope needs to be considered. In a previous work [33], an empirically determined slope value (30°) was used to discard mountainous areas. This was based on the assumption that, when the slope is too large, high backscattering regions are highly likely to be due to hills rather than buildings. We adopt the same assumption in this work.
In this step, a bilinear interpolation algorithm is first used to obtain a DEM map with the same pixel spacing as that of the Sentinel-1 data. Then, the slope is calculated inside a 5 × 5 pixel kernel. The slope map derived from the DEM may not be accurate at every pixel of the SAR image for reasons such as the inaccuracy or the resolution of the DEM data. Thus, some pixels belonging to building areas may have a high slope and would be filtered out by the slope map. To avoid this problem, the average slope value is computed in a 21 × 21 window around the pixel being tested [33], which is also the optimal window size for Sentinel-1 SAR data obtained through trials. This step can be done independently, before or after the BAs extraction process. The slope threshold differs with terrain, i.e., 10° in plains and 15° in mountainous cities, based on a tradeoff between BAs commission and omission errors determined through multiple experiments in plains and mountainous areas.
Specifically, two freely available global DEMs are used for mountain masking: the Advanced Spaceborne Thermal Emission and Reflection Radiometer Global DEM version 2 (ASTER GDEM2) and the DEM based on the Shuttle Radar Topography Mission, as released by the Consortium for Spatial Information (SRTM CGIAR-CSI version 4.1). The possible errors due to these two DEMs are analyzed in Section 4.4.
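The slope-based masking can be sketched as follows, assuming the DEM has already been resampled to the SAR grid. Two simplifications are made here: the slope is estimated with central differences via `numpy.gradient` instead of the 5 × 5 kernel mentioned in the text, and the image borders are padded by edge replication before averaging. All function and parameter names are hypothetical.

```python
import numpy as np

def slope_degrees(dem, pixel_size):
    """Terrain slope in degrees from finite differences of the elevation."""
    dz_dy, dz_dx = np.gradient(np.asarray(dem, dtype=np.float64), pixel_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def box_mean(values, window):
    """Mean over a window x window neighborhood, using a summed-area table."""
    half = window // 2
    padded = np.pad(values, half, mode="edge")
    sat = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    sat[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    rows, cols = values.shape
    total = (sat[window:window + rows, window:window + cols]
             - sat[:rows, window:window + cols]
             - sat[window:window + rows, :cols]
             + sat[:rows, :cols])
    return total / (window * window)

def mask_steep(ba_map, dem, pixel_size, slope_threshold_deg, window=21):
    """Keep BA pixels whose window-averaged slope does not exceed the threshold."""
    mean_slope = box_mean(slope_degrees(dem, pixel_size), window)
    return np.asarray(ba_map, dtype=bool) & (mean_slope <= slope_threshold_deg)

# A uniform 20 degree incline versus flat terrain, both on a 10 m grid.
incline = np.tile(np.arange(30) * 10.0 * np.tan(np.radians(20.0)), (30, 1))
detected = np.ones((30, 30), dtype=bool)
on_incline = mask_steep(detected, incline, 10.0, slope_threshold_deg=15.0)
on_flat = mask_steep(detected, np.zeros((30, 30)), 10.0, slope_threshold_deg=15.0)
```

With a 15° threshold, every detection on the 20° incline is discarded and every detection on flat terrain is retained.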
After all of the above steps, a morphological operator with a 3 × 3 pixel kernel is applied to the result to smooth the borders of the BAs.
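The text does not state which morphological operator is applied. Purely as an illustration, the sketch below assumes a binary closing (dilation followed by erosion) with a 3 × 3 square kernel; the function names are hypothetical, and another operator such as opening may be what was actually used.

```python
import numpy as np

def _shifted_views(mask, pad_value):
    """The nine 3 x 3 shifted copies of a mask, padded with pad_value."""
    padded = np.pad(mask, 1, mode="constant", constant_values=pad_value)
    rows, cols = mask.shape
    return [padded[r:r + rows, c:c + cols] for r in range(3) for c in range(3)]

def dilate3(mask):
    """Binary dilation: a pixel is set if any pixel in its 3 x 3 window is set."""
    return np.logical_or.reduce(_shifted_views(mask, False))

def erode3(mask):
    """Binary erosion: a pixel is kept only if its whole 3 x 3 window is set."""
    return np.logical_and.reduce(_shifted_views(mask, True))

def close3(mask):
    """Binary closing, assumed here as the smoothing operator."""
    return erode3(dilate3(np.asarray(mask, dtype=bool)))

# A 5 x 5 built-up block with a one-pixel hole in the middle.
ba_map = np.zeros((9, 9), dtype=bool)
ba_map[2:7, 2:7] = True
ba_map[4, 4] = False
smoothed = close3(ba_map)
```

The closing fills the isolated hole while leaving the outer extent of the block unchanged.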