# **Remote Sensing of Natural Hazards**

Edited by Bayes Ahmed and Akhtar Alam

Printed Edition of the Special Issue Published in *Remote Sensing*

www.mdpi.com/journal/remotesensing

Editors

**Bayes Ahmed**

**Akhtar Alam**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editors*

Bayes Ahmed, Institute for Risk and Disaster Reduction, University College London (UCL), London, United Kingdom

Akhtar Alam, Department of Geography and Regional Development, University of Kashmir, Srinagar, India

*Editorial Office*

MDPI, St. Alban-Anlage 66, 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Remote Sensing* (ISSN 2072-4292) (available at: www.mdpi.com/journal/remotesensing/special_issues/RS_NaturalHazards).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-4308-6 (Hbk)**

**ISBN 978-3-0365-4307-9 (PDF)**

© 2022 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.

# **About the Editors**

#### **Bayes Ahmed**

Dr Bayes Ahmed is a Lecturer in the Institute for Risk and Disaster Reduction (IRDR) at University College London (UCL). He obtained a PhD in Disaster Risk Reduction from UCL; a joint MSc degree in Geospatial Technologies from Spain, Germany, and Portugal; and a Bachelor of Urban and Regional Planning degree from Bangladesh University of Engineering and Technology (BUET). His background includes research in disaster risk reduction (DRR), community vulnerability assessment, GIS and remote sensing, climate change adaptation, conflict and migration, disaster displacement, and climate justice. Dr Ahmed is also a visiting lecturer in the Department of Disaster Science and Management at the University of Dhaka, Bangladesh.

#### **Akhtar Alam**

Dr Akhtar Alam is an Assistant Professor in the Department of Geography and Regional Development, University of Kashmir, Srinagar-190006, Jammu and Kashmir, India. His research and teaching interests revolve around environmental change, geomorphology, natural hazards, and disaster risk. With a core interest in spatial science, he explores earth surface processes and the interplay of the natural and social environments, making extensive use of remote sensing data and Geographic Information Systems (GIS).

# *Article* **Assessing Agricultural Vulnerability to Drought in a Heterogeneous Environment: A Remote Sensing-Based Approach**

#### **Mst Ilme Faridatul 1, \* and Bayes Ahmed 2,3**


Received: 28 August 2020; Accepted: 13 October 2020; Published: 15 October 2020

**Abstract:** Agriculture is one of the fundamental economic activities in most countries; however, this sector suffers from various natural hazards, including flood and drought. Identifying drought-prone areas is essential for selecting drought-tolerant crops in climate-sensitive, vulnerable areas. This study aims to enhance the detection of agricultural areas vulnerable to drought in a heterogeneous environment, taking Bangladesh as a case study. The normalized difference vegetation index (NDVI) and land cover products from Moderate Resolution Imaging Spectroradiometer (MODIS) satellite images are used to compute the vegetation index. A modified vegetation condition index (mVCI) is proposed to enhance the estimation of agricultural drought. NDVI values between 0.44 and 0.66 for croplands are used for the mVCI. The outcomes of the mVCI are compared with the traditional vegetation condition index (VCI), and precipitation and crop yield data are used for the evaluation. mVCI maps from multiple years (2006–2018) are produced to compute the drought hazard index (DHI) using a weighted sum overlay method. The results show that the proposed mVCI enhances the detection of agricultural drought compared to the traditional VCI in a heterogeneous environment. The "*Aus*" rice-growing season (sown in mid-March to mid-April and harvested in mid-July to early August) receives the highest average precipitation (>400 mm), and this season is therefore less vulnerable to drought. A comparison of crop yields reveals lower productivity in the drought year (2006) than in the non-drought year (2018), and the DHI map shows that the north-west region of Bangladesh is highly vulnerable to agricultural drought. This large-scale analysis is important for prioritizing agricultural zones and initiating development projects based on the associated level of vulnerability.

**Keywords:** agriculture; drought; NDVI; MODIS; remote sensing; Bangladesh

#### **1. Introduction**

Drought is one of the natural hazards characterized by a prolonged water shortage. The impacts of drought are multifaceted, ranging from the environment of a country to its economy and society [1–6]. Various components including agriculture, vegetation, ecosystem, and water resources can be affected by drought [7,8]. Agriculture is the backbone of the economy in many countries; however, this sector suffers from drought in many parts of the world [1,9]. Sound knowledge of the spatial variations in agricultural water stress is important for effective management of drought risk in agrarian countries. Moreover, the determination of areas with agricultural vulnerability to drought is important to drought dynamic planning [7].

Various approaches have been developed and used to monitor different types of drought [5,10–15]. The traditional approach to drought monitoring has been based on meteorological observations, which lack the continuous spatial data needed to monitor drought conditions in detail [16]. Over the past decades, meteorological data have been used to improve the understanding of drought. Precipitation-based drought indices, e.g., the standardized precipitation index (SPI), rainfall anomaly index (RAI), Palmer drought severity index (PDSI), standardized precipitation evapotranspiration index (SPEI), and national rainfall index (NRI), have been commonly used to monitor drought in various regions [17–22]. The meteorological drought indices have their own strengths and popularity; however, they are limited by the distribution of weather stations and provide only point data. In contrast, remote sensing (RS)-based drought indices have gained attention for drought monitoring as they provide repeatable information for broad regions [5,23,24].

The techniques used to monitor drought can be categorized as RS and empirical modelling. The use of RS-based approaches to monitor vegetation and agricultural water stress over a large area is promising compared to other approaches [25]. Over the past decades, various RS-based drought indices have been developed [9,13,14,26–29]. For example, the normalized difference vegetation index (NDVI), developed by Rouse et al. [30], has been used for vegetation classification and vegetation phenology studies. The NDVI has also been used for the assessment of agricultural and vegetative drought [4,15]. Kogan [26] developed the vegetation condition index (VCI) to improve the analysis of vegetation conditions in non-homogeneous areas. The VCI has proved effective in providing accurate drought information and has therefore been applied to monitor vegetation water stress in various regions [1,31–33]. The temperature condition index (TCI), developed by Kogan [27], provided additional information on vegetation stress and facilitated the detection of whether stress is caused by dryness or excessive wetness. Li et al. [28] developed the normalized temperature anomaly index (NTAI) and the normalized vegetation anomaly index (NVAI). These indices were applied to monitor drought and were found to be a better measure of anomalies and evolution than the VCI and TCI. Sandholt et al. [29] developed the temperature vegetation dryness index (TVDI) using an empirical parameterization of the relationship between NDVI and land surface temperature (LST). The TVDI proved to be a potential indicator of variations in soil moisture. Ghulam et al. [14] developed the perpendicular drought index (PDI) based on the spatial characteristics of moisture distribution in near infrared (NIR)–Red space. Their study concluded that the PDI has potential in RS-based drought analysis. The normalized multi-band drought index (NMDI) was proposed by Wang and Qu [13] for monitoring the moisture condition of soil and vegetation using RS data. Amri et al. [8] developed the vegetation anomaly index (VAI) and used it to assess the presence of vegetation stress. They found that the index performed satisfactorily; however, the VAI may be affected by the pattern of irrigation in agricultural areas and by changes in land use and its heterogeneity. The vegetation index has been widely used as one of the important parameters for understanding drought conditions, crop yield, and mapping of agricultural areas [34]. Gouveia et al. [35] applied correlation analysis between NDVI and SPEI to analyze drought impacts on vegetation and to determine the most sensitive vegetation types. Dutta et al. [36] used the NDVI-based VCI for monitoring agricultural drought and compared it with the SPI, RAI and the yield anomaly index (YAI). They found good agreement between the VCI and the meteorological drought indices.

Although great effort has been made to develop various drought indices [3,9,13,27,37], previous studies rarely evaluated their performance in monitoring and understanding agricultural drought in a large heterogeneous environment. Various land cover types, including cropland, wetland, waterbody, forest, urban built-up area, and tree cover, can exist when analyzing a large territory. Land cover variability might influence the accurate detection of agricultural areas vulnerable to drought. The purpose of this research is to improve the understanding of how variations in land cover types affect the estimation of agricultural drought, and to identify the agricultural areas facing high drought risk by combining multiyear RS-based drought indices. This study proposes a modified vegetation condition index (mVCI) suitable for determining agricultural drought in areas of varied landscapes. This research uses the principle of the NDVI-based VCI [27], because it has been proven to be a useful means of detecting drought conditions around the world [2,33,38].

In this paper, Section 2 includes the study area profile and the experimental data. Section 3 elaborates in detail the approach of assessing the heterogeneity of landscape and improving the separation of agricultural drought from water-stressed vegetation. Section 4 demonstrates the results and discussion and, finally, concluding remarks are provided in Section 5.

#### **2. Study Area and Experimental Data**

This study selects Bangladesh as the research area. It is a South Asian country situated between latitudes 20°34′ and 26°38′ N and longitudes 88°01′ and 92°41′ E. India borders Bangladesh along the north, west and northeast borders. It shares borders with West Bengal of India in the west, Meghalaya in the north and Tripura in the east. It also shares borders with Myanmar in the southeast, and the Bay of Bengal demarcates its southern border (Figure 1). Bangladesh consists of 64 administrative districts. Its topography is relatively flat; the great plain lies almost at sea level along the southern coast and rises gradually towards the north. Agriculture is the backbone of the country, which grows a wide variety of crops broadly classified as Kharif crops (grown in the summer and harvested in early winter) and Rabi crops (sown in winter and harvested in the spring or early summer). Rice and wheat are the major cereals of the country. The rice-growing seasons are commonly classified into three categories: Aus (sown in mid-March to mid-April and harvested in mid-July to early August), Aman (sown in early September and harvested in December to early January) and Boro (sown in mid-November to mid-January and harvested in April to May). Moreover, wheat is one of the most important winter crops; it is sown in November to December and harvested in March to mid-April [12,39,40].

**Figure 1.** Location of the study area. (**a**) Geographic location; (**b**) Bangladesh.

The country is characterized by a subtropical monsoon climate. The mean annual precipitation is nearly 2400 mm, with 70% occurring during the monsoon season. Figure S1 shows monthly and seasonal variations in precipitation over a period of 13 years (2006–2018). The highest precipitation occurs between May and September. Note that the Aus rice-growing season receives the highest precipitation (72%), and the Boro rice-growing season is the driest, receiving only 7% of total precipitation. Bangladesh has four recognized seasons: a hot, humid summer between March and May; a wet monsoon season between June and September; autumn between October and November; and a dry winter between December and February [41]. Bangladesh regularly experiences natural hazards including droughts, floods and cyclones. In the past, Bangladesh experienced severe drought in the years 1951, 1961, 1975, 1989, 1997, 2006 and 2010. Most of these droughts occurred in the pre- and post-monsoon seasons. Drought is a periodic occurrence in many regions of Bangladesh; however, the northwest region is more vulnerable to drought than other parts of the country. The mean annual precipitation in this dry zone ranges from 1250 to 1750 mm [41,42].

Over the past decades, RS data, e.g., National Oceanic and Atmospheric Administration-Advanced Very High Resolution Radiometer (NOAA-AVHRR), Landsat, SPOT VGT NDVI, and Moderate Resolution Imaging Spectroradiometer (MODIS) satellite imagery, have been commonly used to monitor drought conditions. This study uses the NDVI products of MODIS (MOD13A3). For comparative analysis, and to evaluate the performance of the drought indices, this study uses data for both drought and non-drought years. Moreover, precipitation and crop yield are used for evaluating the results. Precipitation data are collected from the Bangladesh Meteorological Department and crop yield data from the yearbook of agricultural statistics of the Bangladesh Bureau of Statistics. This study uses the principle of the VCI to assess agricultural drought, which requires long-term maximum and minimum NDVI values for each pixel; therefore, a total of 156 monthly NDVI images were collected from 2006 to 2018. This study reviewed related research on Bangladesh [41,42] and selected the typical years for assessing agricultural drought. To understand the heterogeneity of the environment, this study also uses the MODIS land cover yearly product (MCD12Q1). The MODIS land cover data consist of 17 land cover classes; however, the study area is characterized by six major land cover types: cropland, urban, tree cover, forest, wetland, and permanent waterbody. The other land cover classes are small in proportion. Note that small-scale changes in land cover would not affect drought monitoring [6]; thus, the six major land cover classes are considered in the analysis of the heterogeneity of the environment.

#### **3. Improving Agricultural Drought Assessment in Heterogeneous Areas**

In this research, first, the heterogeneity of the landscape is investigated. Second, land cover variability is considered in delineating the mVCI to separate water-stressed cropland areas from other land covers and vegetation. Third, a comparative analysis is performed between the mVCI and the traditional VCI, and the result is evaluated using precipitation and crop yield. Lastly, multiyear mVCI maps are used as input to compute a composite map indicating the levels of vulnerability to agricultural drought. The composite map helps decision-makers detect and prioritize the most vulnerable zones for initiating development projects and allocating funds to cope with future drought.

#### *3.1. Evaluation of Heterogeneity of the Landscape and Segregation of Agricultural Areas*

A heterogeneous environment consists of various land covers including vegetation, which largely encompasses agricultural/cropland, rangeland, tree cover, and forest [43]. Various land cover classification approaches have been used to detect geographic features [44,45]. The NDVI has also been used as a good indicator for the classification of vegetation [30], and has been used to detect stressed or damaged crops [15,32,34]. This study evaluates the utility of NDVI in the segregation of agricultural land from other vegetation and land cover types. In this section, first, the influence of seasonality and temporal variation on NDVI is evaluated. Second, yearly composite NDVI maps are computed, and representative sample patches of six major land cover types are collected from the MODIS land cover maps. Third, the NDVI values are extracted for the sample patches, and the basic statistics of NDVI for the six land cover types are presented graphically to understand the heterogeneity of the landscape. Fourth, to segregate agricultural land from other land cover types, maximum and minimum threshold values are defined. Faridatul and Wu [46] developed a threshold optimization approach that proved efficient in the separation of land covers. Thus, this research uses their approach to determine the NDVI threshold values for croplands. To avoid the influence of outliers or extreme values, this study uses the maximum and minimum thresholds for agricultural land rather than the maximum and minimum NDVI of agricultural land. Finally, for evaluation and comparison, an overlay intersect is performed between the NDVI-based agricultural land and the cropland defined in the MODIS land cover map.
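The patch-based summary described above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name `ndvi_stats_by_class`, the array layout, and the toy values are assumptions, and it does not reproduce the threshold optimization approach of Faridatul and Wu [46].

```python
import numpy as np

def ndvi_stats_by_class(ndvi, labels):
    """Return min/mean/max NDVI for each land cover code in `labels`.

    ndvi   : 2-D array of yearly composite NDVI (hypothetical input)
    labels : 2-D integer array of land cover codes on the same grid
    """
    stats = {}
    for code in np.unique(labels):
        values = ndvi[labels == code]          # all pixels of this class
        stats[int(code)] = {
            "min": float(np.nanmin(values)),
            "mean": float(np.nanmean(values)),
            "max": float(np.nanmax(values)),
        }
    return stats

# Toy 2 x 2 grid; the codes 1, 2, 3 are placeholders, not MODIS class codes.
ndvi = np.array([[0.1, 0.5], [0.6, 0.9]])
labels = np.array([[1, 2], [2, 3]])
stats = ndvi_stats_by_class(ndvi, labels)
```

Comparing the per-class ranges in `stats` is one simple way to see how far the NDVI of cropland overlaps with that of the other land covers before any thresholds are chosen.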

#### *3.2. Assessment of Agricultural Drought using the Vegetation Condition Index*

The NDVI-based VCI, given in Equation (1), has been used as an indicator of the status of vegetation cover. Vegetation conditions are usually measured in percent. VCI values close to 0% indicate extremely dry conditions, whereas VCI values between 50% and 100% indicate normal vegetation conditions [32]. A VCI of less than 50% indicates drought conditions, and a VCI between 0% and 35% indicates severe drought [27].

$$\text{VCI} = 100 \ast \frac{\text{NDVI}_{i} - \text{NDVI}_{\text{min}}}{\text{NDVI}_{\text{max}} - \text{NDVI}_{\text{min}}} \tag{1}$$

where *NDVI<sub>i</sub>* is the NDVI value for a specific pixel in month *i*, and *NDVI<sub>max</sub>* and *NDVI<sub>min</sub>* are the highest and lowest NDVI values of the same pixel for the period 2006–2018.
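As a minimal sketch of Equation (1), the per-pixel VCI can be computed from a monthly NDVI stack with NumPy. The function name `vci`, the `(time, rows, cols)` array layout, and the toy values are assumptions made for illustration; marking constant pixels as NaN is also a choice of this sketch, not something stated in the text.

```python
import numpy as np

def vci(ndvi_stack):
    """Per-pixel VCI (%) for every month in a (time, rows, cols) NDVI stack."""
    ndvi_min = np.nanmin(ndvi_stack, axis=0)   # long-term minimum per pixel
    ndvi_max = np.nanmax(ndvi_stack, axis=0)   # long-term maximum per pixel
    span = ndvi_max - ndvi_min
    # A pixel whose NDVI never changes has no defined VCI; mark it NaN.
    span = np.where(span > 0, span, np.nan)
    return 100.0 * (ndvi_stack - ndvi_min) / span

# Toy stack: 3 months, 1 x 2 pixels (values are made up for illustration).
stack = np.array([[[0.2, 0.5]], [[0.6, 0.5]], [[0.4, 0.5]]])
result = vci(stack)
```

In the study, the stack would hold the 156 monthly images for 2006–2018, so the minimum and maximum are taken over the full 13-year record.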

It should be noted that the VCI has been used for assessing the spatial characteristics of drought [7,33]; however, previous studies [32,47,48] rarely evaluated its performance in the detection and separation of water stressed cropland from other vegetation. This study computes the VCI for both the drought and non-drought years. The VCI values are extracted by sample patches of six major land cover types as defined earlier, and the results are evaluated to understand the utility of this index in the detection of agricultural drought in the areas of varied landscape.

#### *3.3. Enhanced Estimation of Agricultural Drought and Comparison Analysis*

For evaluating drought conditions in the heterogeneous environment, a land cover map could be used to mask out non-agricultural lands. For example, Rulinda et al. [4] used a land cover map to mask out non-vegetated areas and forest to separate vegetative areas. Note that this study uses the threshold of NDVI to segregate cropland from other land cover types and applies Equation (2) to measure and highlight agricultural drought conditions.

This study proposes an approach to make the VCI suitable for accurate assessment of agricultural drought in a heterogeneous environment. First, yearly composite NDVI maps are computed and analyzed for their spatial distribution in relation to land cover types. Second, graphical and statistical analysis is performed to compute the maximum and minimum NDVI thresholds for the cropland. Finally, this research uses the principle of the VCI and develops the modified vegetation condition index (mVCI) as Equation (2).

$$\text{mVCI} = \begin{cases} 100 \ast \text{NDVI}_{i\text{max}}, & \text{where } \text{NDVI}_{\text{Ycom}} > \text{measured maximum threshold of the cropland} \\ 100, & \text{for } \text{NDVI}_{\text{Ycom}} < \text{measured minimum threshold of the cropland} \\ 100 \ast \frac{\text{NDVI}_{i} - \text{NDVI}_{\text{min}}}{\text{NDVI}_{\text{max}} - \text{NDVI}_{\text{min}}}, & \text{otherwise} \end{cases} \tag{2}$$

where *NDVI<sub>Ycom</sub>* is the yearly composite NDVI, *NDVI<sub>imax</sub>* is the maximum NDVI in month *i*, *NDVI<sub>i</sub>* is the NDVI value for a specific pixel in month *i*, and *NDVI<sub>max</sub>* and *NDVI<sub>min</sub>* are the highest and lowest NDVI values of the same pixel for the period 2006–2018.
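The piecewise rule of Equation (2) can be sketched as follows. Several points are assumptions of this sketch rather than statements from the text: the thresholds are tested against the yearly composite NDVI, *NDVI<sub>imax</sub>* is treated as a single scene-wide maximum for month *i*, and the default thresholds 0.44 and 0.66 are the cropland NDVI range quoted in the abstract. The function name `mvci` and the toy values are hypothetical.

```python
import numpy as np

def mvci(ndvi_i, ndvi_min, ndvi_max, ndvi_ycom, ndvi_imax,
         t_min=0.44, t_max=0.66):
    """Sketch of Equation (2); all array inputs share one pixel grid.

    ndvi_imax is assumed to be a scalar (scene-wide maximum in month i).
    t_min / t_max default to the cropland NDVI range given in the abstract.
    """
    span = np.where(ndvi_max > ndvi_min, ndvi_max - ndvi_min, np.nan)
    vci = 100.0 * (ndvi_i - ndvi_min) / span   # Equation (1) for cropland
    return np.where(
        ndvi_ycom > t_max, 100.0 * ndvi_imax,  # above the cropland range
        np.where(ndvi_ycom < t_min, 100.0,     # below the cropland range
                 vci),                         # within the cropland range
    )

# Toy pixels: dense vegetation, water-like surface, cropland (made-up values).
ndvi_ycom = np.array([0.8, 0.2, 0.5])
ndvi_i = np.array([0.7, 0.1, 0.5])
ndvi_min = np.array([0.6, 0.0, 0.4])
ndvi_max = np.array([0.9, 0.3, 0.6])
out = mvci(ndvi_i, ndvi_min, ndvi_max, ndvi_ycom, ndvi_imax=0.9)
```

Under these assumptions, only pixels whose yearly composite NDVI falls inside the cropland range keep a VCI-style value, so low values on non-cropland surfaces are not counted as agricultural drought.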

After deriving the traditional VCI and the proposed mVCI, this research compares them to improve the understanding of how the existence of heterogeneous land covers affects the estimation of agricultural drought using the traditional vegetation index. To demonstrate the competence of the modified drought index, the comparison is shown in maps. Moreover, the spatial distributions of the mVCI and VCI values are derived and shown by land cover type.

#### *3.4. Detection of Agricultural Drought Vulnerable Regions*

This study computes the agricultural drought hazard index to facilitate the investigation of the spatial distribution of drought-vulnerable regions. For each year, the seasonal mVCI maps are produced and reclassified. Then, an equally weighted sum overlay analysis is performed and a new drought vulnerability map is generated. This study applies the approach to detecting drought hazard zones used by Daneshvar et al. [49] and Yu et al. [6]. Daneshvar et al. [49] used SPI drought thematic maps, and Yu et al. [6] used both SPI and VCI drought thematic maps, to produce the drought hazard index (DHI). In contrast, this study assigns weights to mVCI values (Table 1) and uses the resulting drought thematic maps as in Equation (3). Finally, the drought-vulnerable regions are defined as high, medium and low. In this study, the highest and lowest DHI values indicate, respectively, high and low vulnerability of regions to agricultural drought, and the intermediate DHI values indicate medium-vulnerable regions.

$$\text{DHI} = \sum_{ij=1}^{n} \text{mVCI}_{ij} \tag{3}$$

where *DHI* is the drought hazard index produced by the sum overlaying of the *mVCI* drought thematic map of the *i*th year and *j*th season for a time of n = 13 years (2006–2018).
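A minimal sketch of the reclassify-and-sum step behind Equation (3) is given below. The class breaks and weights are hypothetical placeholders, because the values of Table 1 are not reproduced here; the names `BREAKS`, `WEIGHTS`, `reclassify`, and `dhi` are likewise illustrative.

```python
import numpy as np

# Hypothetical class breaks (mVCI, %) and weights; not the values of Table 1.
BREAKS = np.array([35.0, 50.0])   # classes: <=35, 35-50, >50
WEIGHTS = np.array([3, 2, 1])     # a higher weight marks a drier class

def reclassify(mvci_map):
    """Map each mVCI value to the weight of its class."""
    # right=True places a value equal to a break in the lower (drier) class.
    # NaN pixels should be masked beforehand; digitize does not skip them.
    return WEIGHTS[np.digitize(mvci_map, BREAKS, right=True)]

def dhi(mvci_maps):
    """Equally weighted sum overlay of reclassified seasonal mVCI maps."""
    return sum(reclassify(m) for m in mvci_maps)

# Two toy seasonal maps on a 2 x 2 grid (values are made up).
maps = [np.array([[20.0, 45.0], [60.0, 50.0]]),
        np.array([[30.0, 70.0], [80.0, 10.0]])]
hazard = dhi(maps)
```

With these placeholder weights, pixels that fall in a dry class in many seasons accumulate the largest sums, which corresponds to the high-vulnerability end of the DHI scale described above.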

**Table 1.** Classification of the conditions of cropland based on the modified vegetation condition index (mVCI) values.


#### **4. Results and Discussion**

In this section, first, the heterogeneity of the environment is investigated. Second, the drought indices derived using the VCI and mVCI are presented for visual interpretation and comparison. Then, the DHI maps are computed to investigate agricultural vulnerability to drought conditions.

#### *4.1. Spatial Distribution of Land Cover Types and Detection of Agricultural Areas*

This study assesses the heterogeneity of the landscape using the NDVI. Temporal and seasonal variations influence the characteristics of land cover types [50]; thus, the influence of these variations on the detection of land cover types is evaluated. Figure S2 shows the NDVI values for the different land cover types, including cropland, forest, tree cover, and other geographic features. The result confirms the variations in the NDVI values. To understand the spatial distribution of land cover types and detect cropland, this study uses the yearly composite NDVI (Figure 2). The typical statistics of the NDVI show the lowest values for the non-vegetation land covers, e.g., water, wetland, and urban. Forest and tree cover show the highest NDVI values. This study computes and uses NDVI thresholds to segregate cropland from other land cover types. Table 2 shows representative threshold values of the NDVI. To evaluate performance, an overlay analysis is performed between the cropland defined in the MODIS land cover product and the threshold-based classified map. This study finds 91–95% agreement in the detection of cropland using the NDVI thresholds.
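The overlay check described above can be sketched as a comparison of two boolean masks. This is only one plausible reading: the text does not define how agreement is computed, so the metric below (the share of MODIS cropland pixels that also fall inside the NDVI thresholds) is an assumption. The cropland code 12 is the IGBP "Croplands" class of MCD12Q1, assumed here to be the layer used, and the default thresholds are the range quoted in the abstract rather than the per-year values of Table 2.

```python
import numpy as np

CROPLAND = 12  # IGBP "Croplands" code in MCD12Q1 (assumed legend)

def cropland_agreement(ndvi_ycom, land_cover, t_min=0.44, t_max=0.66):
    """Percent of MODIS cropland pixels also classed as cropland by NDVI."""
    ndvi_crop = (ndvi_ycom >= t_min) & (ndvi_ycom <= t_max)
    modis_crop = land_cover == CROPLAND
    n_modis = modis_crop.sum()
    if n_modis == 0:
        return float("nan")        # no reference cropland to compare with
    return 100.0 * (ndvi_crop & modis_crop).sum() / n_modis

# Toy 2 x 2 scene (values are made up for illustration).
ndvi_ycom = np.array([[0.50, 0.70], [0.45, 0.30]])
land_cover = np.array([[12, 12], [12, 5]])
agreement = cropland_agreement(ndvi_ycom, land_cover)
```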

**Figure 2.** Spatial distribution of land cover types, yearly composite NDVI and its typical statistics in (**a**–**c**) 2006, and (**d**–**f**) 2018.

**Table 2.** Measured maximum and minimum threshold of normalized difference vegetation index (NDVI) for the agricultural land.


#### *4.2. Evaluating Drought Conditions Using the VCI and mVCI*

This study computes the vegetation conditions and investigates the seasonal and temporal variations in agricultural drought for 13 years. To be concise, this study presents the results from a representative drought year (2006) [41] and a non-drought year (2018). The drought maps of the other years are provided as supplementary material (Figures S3 and S4). Figure 3a–f shows the maps of the VCI and mVCI for the drought year. The index values range between 0 and 100. The drought conditions are highlighted by dividing the index values into four scales. VCI and mVCI values of ≤50% indicate drought-prone areas, and values greater than 50% indicate normal conditions. Rice is one of the major cereals in Bangladesh and grows in all three seasons. Moreover, drought maps based on the broad crop-growing seasons, i.e., Kharif (May–October) and Rabi (November–April), are produced and shown in Figures S5 and S6. The highest precipitation in Bangladesh falls in the Kharif/Aus rice-growing season and supports rain-fed agriculture [12]; thus, the highest index values are observed in these seasons. In contrast, the Boro and Rabi cropping seasons show the lowest index values.

**Figure 3.** Major cropping seasons and spatial distribution of vegetation conditions based on the VCI and mVCI in: (**a**–**f**) 2006, and (**g**–**l**) 2018.

Figure 3g–l presents the maps of the VCI and mVCI for the non-drought year of 2018. The results show overall higher index values than in the drought year. Note that only small areas contain the lowest values indicating drought conditions, and the influence of seasonal variations on the vegetation conditions is not significant in the normal year. The low index values indicate vegetation developing under unfavorable weather. The vegetation phenology phases, e.g., leaf coloring and unfolding, are driven by dry weather, which reduces the greenness of vegetation and makes drought conditions detectable [32,51]. The lowest precipitation falls in the Rabi/Boro rice-growing season, which is relatively drier than the Kharif season; thus, the most severe drought conditions are observed in this season.

#### *4.3. Comparison Analysis*

This study improves the VCI to estimate accurate drought conditions of cropland in a heterogeneous environment using the mVCI. Figure 4 presents a visual comparison between the maps of the VCI and mVCI. Both the VCI and mVCI require long-term maximum and minimum NDVI values for each pixel; thus, monthly NDVI images for 13 years, a total of 156 images, are used for estimating agricultural drought. However, to be concise, the comparative analysis is presented in detail for two representative years. The results show that, without considering land cover types, the VCI yields low index values for many non-vegetative areas, which then appear to be drought-affected (Figure 4a,d), because non-vegetative areas, e.g., waterbodies, wetlands, and urban areas, have low NDVI values compared to cropland (Figure 2c,f). In contrast, considering land cover types in the mVCI minimizes the overestimation of drought areas (Figure 4c,f), thus improving the demarcation of actual agricultural drought areas. A large territory or an entire country consists of heterogeneous land covers; thus, this study suggests considering land cover types, as in the mVCI.

Figure 5a–d show the differences in basic statistics of predicted drought conditions derived from the VCI and mVCI. The comparison analysis indicates the differences between the VCI and mVCI for cropland and other land cover types. The croplands show similar statistics in both models. It is worth noting that, without consideration of land cover types, the VCI yields very similar values of many land cover types (Figure 5a,c). Thus, it is challenging to estimate accurately the drought conditions of the cropland. In contrast, the consideration of land cover types in the mVCI facilitates the distinguishing of the actual conditions of cropland from other land cover types (Figure 5b,d).

Figure 5e,f presents the local spatial difference between vegetation conditions derived from the VCI and mVCI. The areas of water body, wetland, forest, tree cover and urban show strong deviations between the results of the VCI and mVCI in the prediction of drought conditions in 2006 (Figure 5e). In contrast, cropland shows low deviations between the models. The deviations in the estimation of drought conditions in 2018 (Figure 5f) show similar findings to those for 2006. Figure 6 also presents the differences in the mean temporal variations between the VCI and mVCI. The results demonstrate that the croplands yield relatively high index values in the wet months. The mVCI performs better than the VCI in separating agricultural drought conditions in the heterogeneous environment.

Monthly NDVI and its long-term maximum and minimum values are input into the computation of the indices and thus the variations in NDVI highly influence their values. Figure 2 confirms that the waterbody, wetland and urban areas have the lowest NDVI, thus resulting in the lowest VCI for these land covers. In contrast, the forest and tree covers may also suffer from water stress and result in low VCI. In a heterogeneous environment, the accurate estimation of the agricultural drought condition can be affected if these land covers are not considered in the estimation of VCI. Table 3 shows the differences in the estimation of drought conditions using the VCI and mVCI. The VCI overestimates the areas of extreme and moderate drought conditions. In contrast, the mVCI shows a lower proportion of areas of drought conditions than the VCI. In the estimation of mVCI, land cover types are considered, thus excluding the water-stressed vegetation and non-vegetation land covers in the calculation of agricultural drought.

**Figure 4.** Spatial distribution of land cover types and vegetation conditions derived from the VCI and mVCI in (**a**–**c**) 2006, and (**d**–**f**) 2018.

[Figure 5 panel labels: (**a**) VCI; (**b**) mVCI; (**c**) VCI; (**d**) mVCI.]

**Figure 5.** Comparison of typical statistics of the VCI and mVCI (**a**–**d**), and local spatial difference in the estimation of vegetation conditions (**e**,**f**).

**Figure 6.** Mean vegetation conditions as derived from the (**a**) VCI, and (**b**) mVCI.


**Table 3.** Area (%) indicating different drought conditions.

#### *4.4. Assessing Drought Hazard and its Impact on the Yield of Major Cereals*

Figure S7 shows the cropping area of the major cereals, and Figure 7 shows the drought-vulnerable croplands. The results demonstrate that the regions located in the north-west are highly vulnerable to agricultural drought. In Bangladesh severe drought primarily occurred in the pre- and post-monsoon periods [42]. The results of this study also indicate a high drought occurrence in the pre-monsoon rice-growing season of Boro and post-monsoon rice-growing season of Aman (Figure 7b,c). In contrast, the rain-fed agriculture, Aus rice-growing season shows mild drought conditions (Figure 7a).

**Figure 7.** Spatial and temporal variations in the vulnerability to drought.

[Figure 8: "VCI and Crop Yield" for Aus, Aman, and Boro in 2006 and 2018; axes: Yield (M.Ton), Avg VCI, and Yield (M.Ton/acre).]

An evaluation of the drought impact on the yield of major cereals is shown in Figure 8. The results show a lower yield in the drought year than in the non-drought year. It is worth noting that the Boro rice-growing season shows the lowest mean mVCI; however, the yield of Boro rice is the highest (Figure 8a). This seems inconsistent, because a low index value indicates more severe drought conditions and should therefore have a strong impact on the yield of Boro rice. The sowing and harvesting times of Boro rice fall between mid-November and April, which is the driest season in Bangladesh (Figure S1). Various factors, including low precipitation, leaf unfolding, and coloring, seem to have an impact on the vegetation conditions. In this study, the yield–mVCI relationship is shown graphically in Figure 8a. Because the figure compares only two representative years, regression analysis cannot be applied to evaluate the impacts of drought on the crop yield. This study underlines the importance of using long-term crop yield and mVCI data for quantitative analysis in future work.

The NDVI-based VCI has been widely used to evaluate drought conditions [2,33,38], and this study proposes the mVCI for the accurate estimation of agricultural drought. In the proposed approach, the thresholds of NDVI are used to segregate croplands from other land cover types, and Equation (2) is developed for the estimation of agricultural drought in the heterogeneous environment. This study finds 91–95% agreement in the detection of cropland using the threshold of NDVI. It is worth noting that the heterogeneous environment consists of various land cover types, and non-vegetation land covers (e.g., waterbody, wetland, and the built environment) have low NDVI values compared to the cropland and other vegetation (Figure S2). Therefore, the use of the traditional vegetation index in the heterogeneous environment yields low VCI values in many areas of non-vegetation land covers and seems to include them as drought-affected areas (Figure 4). In contrast, the use of the NDVI threshold and the consideration of separating croplands from other land cover types reduces the inclusion of misclassified drought areas, thus improving the estimation of agricultural drought.
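The effect of separating croplands before computing the index can be sketched as follows. This is only an illustration of the masking idea. It is not Equation (2), and the NDVI bounds `low` and `high`, as well as the choice of which NDVI value is thresholded, are hypothetical placeholders rather than the thresholds derived in this study.

```python
def masked_vci(ndvi, ndvi_min, ndvi_max, low=0.2, high=0.6):
    """VCI (percent) for cropland-like pixels, None for everything else.

    low, high -- hypothetical NDVI bounds for cropland (placeholders only).
    """
    if not (low <= ndvi <= high):
        return None  # treated as non-cropland and left out of the drought map
    spread = ndvi_max - ndvi_min
    if spread <= 0:
        return None  # index undefined without long-term variation
    return 100.0 * (ndvi - ndvi_min) / spread


# Illustrative pixels: (NDVI, long-term minimum, long-term maximum)
pixels = {
    "water": (0.02, 0.00, 0.10),
    "cropland": (0.35, 0.25, 0.65),
    "forest": (0.80, 0.70, 0.90),
}
for name, (value, lowest, highest) in pixels.items():
    print(name, masked_vci(value, lowest, highest))
```

Under these assumed bounds, the water and forest pixels are excluded, so their low or stressed index values no longer count towards the agricultural drought area.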

In this study, precipitation and crop yield have been used to verify the ability to detect drought conditions [32]. This study also uses these data to evaluate the performance of the drought hazard index. Figure S8 shows the temporal variations in the average precipitation and mVCI of the croplands between two representative years. The investigation indicates that the drought year yields low mVCI values compared to the non-drought year. It should also be noted that the mean mVCI values fall with a decrease in precipitation in 2006. In contrast, the influence of precipitation on the vegetation conditions is not noticeable in 2018. Temporal variations in precipitation present the dry and wet seasons, and can be used as an important indicator of meteorological drought.

Dutta et al. [36] used a yield-based drought index for comparison with the VCI and found a moderate coefficient of determination between VCI and yield of major rainfed crops (Sorghum). In this study, a comparison is shown between the yield of major cereals and the corresponding mean mVCI (Figure 8). The results demonstrate that mVCI is lowest in the rice-growing season of Boro but the yield rate is highest. In contrast, the mVCI is largest in the rice-growing season of Aus but the yield rate is lowest. The investigation of this research indicates that the vegetation condition is one of the important indicators of drought. However, several other influencing factors should be considered to find out the correlation between crop yield and the occurrence of drought.

It should be noted that this study selects a large territory for the assessment of agricultural drought. A large-scale analysis facilitates the detection and comparison of the levels of drought vulnerability (Figure 7) on a regional scale, which is important for prioritizing vulnerable croplands when initiating development projects and allocating funds. A large-scale analysis is also important for country-level decision making to withstand drought vulnerability.

#### **5. Conclusions**

Agricultural drought is one of the natural hazards occurring in many parts of the world. Various factors including reduction in precipitation and soil moisture, climate change, and the changes in water supply and demand cause drought. It is important to understand the factors of drought conditions and detect the vulnerable areas for effective planning and minimizing of the drought risk. Various indices are available to monitor drought conditions. For example, meteorological drought indices (e.g., SPI, RAI, and SPEI) have been commonly used but are limited by the distribution of weather stations and provide only point data. In contrast, RS-based indices facilitate multi-temporal drought vulnerability mapping on a regional scale. The VCI is one of the popular RS-based indices that has been applied for drought analysis; however, existing studies rarely evaluate drought in a heterogeneous large territory. This study improves the traditional VCI and proposes the mVCI to make it suitable for investigating agricultural drought in a heterogeneous environment. The proposed mVCI uses MODIS earth observation data of NDVI and land cover. Note that the traditional VCI has been mostly used for small-scale analysis, and thereby land cover types have not been considered for evaluating drought. This study evaluates agricultural drought in an entire country and computes the mVCI considering the variations in land cover types.

In this study, the basic statistics of the NDVI for six major land cover types are enumerated. The results show the lowest NDVI values for the non-vegetation land cover types and the highest for the forest and tree cover. In contrast, the intermediate NDVI values indicate the cropland areas. This study computes a threshold of NDVI to segregate cropland from other land cover types and uses the threshold values in the algorithm of the mVCI. The proposed approach is compared with the VCI. The results reported in this study show that the use of the traditional vegetation index in the heterogeneous environment yields low VCI values in many areas of non-vegetation land covers, thus overestimating the areas of agricultural drought conditions. In contrast, the use of the NDVI threshold and the consideration of separating croplands from other land cover types reduces the inclusion of misclassified drought areas, thus improving the estimation of agricultural drought. The results of seasonal variations in the drought conditions indicate that the Aus rice-growing season is less vulnerable to drought as the highest precipitation falls in this season. This study uses mVCI maps from multiple years and seasons to develop the DHI map. The result indicates the local spatial variations in the vulnerability to agricultural drought. The highly vulnerable agricultural areas are located in the north-west of Bangladesh. In contrast, the south-east hilly region, which consists of forest, is less vulnerable to drought conditions.

It should be noted that most of the major cereals are cultivated in the north and north-west districts of Bangladesh. However, the north-west districts are highly vulnerable to drought conditions and thus care should be taken with dynamic drought planning for this region. The crops that withstand drought conditions could be selected for cultivation in the highly vulnerable regions. Note that this study assesses agricultural drought using a vegetation index. However, some other factors, including hydrogeological characteristics, soil types and moisture conditions, air and land surface temperature, and irrigation water demand and supply, should be considered while estimating agricultural drought in future work. Climate change has varying impacts on global and local weather [42]. This study suggests climate change-induced drought assessment in future work. In this study, both the VCI and mVCI are generated by inputting the NDVI. However, NDVI-based vegetation indices commonly indicate the condition of vegetation in terms of greenness. High greenness indicates healthy vegetation, and low greenness indicates poor vegetation conditions. NDVI-based vegetation indices cannot readily differentiate the inherent causes (e.g., lack of water or nitrogen) of poor vegetation conditions; thus, this research suggests investigating soil conditions along with the vegetation condition in future work.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/2072-4292/12/20/3363/s1, Figure S1: Temporal variations in precipitation, Figure S2: Temporal variations of NDVI, Figure S3: Seasonal Multiyear VCI maps, Figure S4: Seasonal Multiyear mVCI maps, Figures S5 and S6: Spatial distribution of vegetation conditions in 2006 and 2018, Figure S7: Cropping area of the major cereals, Figure S8: Temporal variations in the mean mVCI in relation to precipitation.

**Author Contributions:** Conceptualization, M.I.F.; methodology, M.I.F.; formal analysis, M.I.F.; data curation, M.I.F.; writing—original draft preparation, M.I.F.; writing—review and editing, M.I.F. and B.A.; supervision, B.A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** The authors thank the National Aeronautics and Space Administration (NASA) for making the MODIS data publicly available.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

# *Article* **How Does Peri-Urbanization Trigger Climate Change Vulnerabilities? An Investigation of the Dhaka Megacity in Bangladesh**

#### **Md. Golam Mortoja and Tan Yigitcanlar \***

School of Built Environment, Queensland University of Technology, 2 George Street, Brisbane, QLD 4000, Australia; mdgolam.mortoja@hdr.qut.edu.au

**\*** Correspondence: tan.yigitcanlar@qut.edu.au; Tel.: +61-7-3138-2418

Received: 24 October 2020; Accepted: 30 November 2020; Published: 1 December 2020

**Abstract:** This paper aims to scrutinize in what way peri-urbanization triggers climate change vulnerabilities. By using spatial analysis techniques, the study undertakes the following tasks. First, the study demarcates the peri-urban growth pattern of Dhaka, the capital of Bangladesh, over the last 24-year period (1992–2016). Afterwards, it determines the conformity of ongoing peri-urban practices with Dhaka's stipulated planning documents. Then, it identifies Dhaka's specific vulnerabilities to climate change impacts (i.e., flood and groundwater table depletion). Lastly, it maps out the socioeconomic profile of the climate change victim groups from Dhaka. The findings of the study reveal that: (a) Dhaka lacks adequate development planning, monitoring, and control mechanisms, which leads to increased and uncontrolled peri-urbanization; (b) Dhaka's explicitly undefined peri-urban growth boundary is the primary factor in misguiding the growth pockets, which are the locations most vulnerable to climate change impacts; and (c) Dhaka's most vulnerable group to the increasing climate change impacts are the climate migrants, who have been repeatedly exposed to climate change-triggered natural hazards. These study findings generate insights into peri-urbanization-triggered climate change vulnerabilities that aid urban policymakers, managers, and planners in their development policy, planning, monitoring and control practices.

**Keywords:** peri-urbanization; urban growth boundary demarcation; climate change; climate migrants; natural hazards; flooding; land use and land cover; night-time light data; Dhaka; Bangladesh

#### **1. Introduction**

Driven by global urbanization, major metropolitan cities and regions across countries are gradually expanding by continuously extending their physical growth boundaries into adjoining peri-urban areas [1–4]. Hence, urban growth at present is predominantly occurring in the form of peri-urbanization globally [5]. Thus, rapid peri-urbanization has already become an issue of increasing global concern [6]. In addition, peri-urban areas are also vulnerable to climate change impacts [7,8]. Consequently, these peri-urban growth pockets pose considerable growth challenges to policymakers [9–11].

Although rapidly occurring peri-urbanization is a global phenomenon, peri-urban growth factors vary across countries. Such variations in growth factors are deeply ingrained in a country's unique socio-economic settings, legal systems, institutional arrangements, and environmental conditions [12,13]. For instance, in-migration, which is considered a major determinant of peri-urbanization, varies across countries. Peri-urban in-migrants in developed nations are predominantly amenity-led migrants [14,15]. In contrast, such in-migration in developing countries is largely attributed to forced factors such as poverty, lack of employment, and climate change impacts (e.g., sea-level rise, river erosion, flooding, salinity intrusion, and drought) [16,17].

In addition, biased national policies (e.g., excessive capital city-oriented development tendencies) also accelerate this forced migration [18]. In this way, metropolitan cities in developing countries are overwhelmed with unanticipated population growth. Hence, policymakers and planners are greatly hindered in maintaining a balance between economic development and social change [19], resulting in a gross failure in estimating the projected population and corresponding growth demands. While the cities of developing countries frequently struggle to cope with the growing demand, the influx of added migrants makes the development control tasks extremely challenging [20]. Subsequently, unprecedented growth of informal economies occurs in their peripheries [21].

While some cities are gradually expanding in response to this growing need, cities with geographical limitations have few avenues for further expansion. For example, the expansion of Dhaka city, the capital of Bangladesh, is severely constrained by the lack of sufficient flood-free land. The elevation of Dhaka city reaches a maximum of 13 m above mean sea level [22]. Hence, Dhaka city is even more low-lying than its contemporaries, such as Mumbai (India), Kolkata (India), and Karachi (Pakistan).

Moreover, Dhaka's annual average rainfall is 2148 mm, out of which the monsoon rainfall that lasts from May to September accounts for nearly 78% [23]. Thus, Dhaka encounters frequent annual flooding in the monsoon due to heavy rainfall. While the portion of the city's annual flooding contributed by climate change impacts is as yet unknown, such frequently occurring flood events are widely claimed to be climate change-induced [24,25].

Furthermore, Bangladesh also tops the list of countries affected by climate change impacts, and such impacts are already evident throughout the country. Due to climate change impacts (e.g., sea-level rise, salinity intrusion, riverbank erosion, flooding, and drought), many people lose their livelihoods and properties, and eventually migrate to the major cities of Bangladesh [26]. Such internal migrants are globally defined as 'climate migrants' [27]. So far, Bangladesh has two million climate migrants and the capital city Dhaka alone hosts 68% of them [28]. These climate migrants predominantly represent the poorest portion of the city. With an annual poverty growth rate of 4%, about 40% of Dhaka's urban population are poor [29], and are predominantly deprived of any sort of urban facilities [30]. These poor people generally live in the peripheral areas of Dhaka city, which are highly flood-prone. A boom in the ready-made garments (RMG) industries has further made the population influx to these flood-prone areas unstoppable [31]. Consequently, Dhaka's population is increasing at a rate of 0.4 million/year [32].

Thus, more densified peri-urban growth occurs in Dhaka's peripheries, resulting in an intensified vulnerability to climate change impacts. These climate change impacts appear to be perpetual. As Bangladesh is one of the countries most vulnerable to climate change impacts, incidents such as sea-level rise, flooding, salinity intrusion, and their subsequent impacts on livelihoods and properties seem set to persist. Thus, Dhaka will continue to absorb the direct and indirect impacts of climate change. While climate change has already become an issue of increasing concern for the government, policymakers are as yet unequipped to deal with the direct and indirect impacts of climate change at the metropolitan level.

Peri-urban areas, which are neither urban nor rural, are a distinct geographic space and have clear implications for urban governance [33,34]. While the idea of independently operating peri-urban areas is gaining increasing attention globally [35,36], no study to date has been reported in the literature that investigates peri-urbanization-driven climate change vulnerabilities based on an explicitly demarcated peri-urban growth boundary.

This study provides a remote sensing approach to demarcating peri-urban growth pockets that are vulnerable to climate change impacts. In order to demarcate these peri-urban growth pockets, this paper uses Mortoja et al.'s [20] research findings on the most suitable methodological approach for demarcating peri-urban areas. The investigation comprises carrying out change analyses with Landsat data, and identifying peri-urban growth pockets with night-time light data. By adopting the Greater Dhaka Area (Bangladesh), or the Dhaka Megacity, as a testbed, the study undertakes the following four analyses. First, it maps out the changes in peri-urban growth boundaries that occurred in the peripheral areas of Dhaka over the last 24-year period (1992–2016). Second, it points out the consistencies of prevailing peri-urban growth with Dhaka's designated plan documents. Third, it spells out Dhaka's particular vulnerabilities to climate change impacts (i.e., flooding and groundwater table depletion). Lastly, it identifies the socioeconomic profile of the climate change victim groups from Dhaka. The insights generated from this study provide an evidence base for identifying the peri-urban growth pockets, and thereby enable policy makers to formulate area-specific growth policies and mitigation measures for dealing with increasing climate change vulnerabilities.

#### **2. Materials and Methods**

#### *2.1. Study Area*

In order to reveal peri-urbanization-related climate change vulnerabilities, this study chose one of the cities most vulnerable to global climate change impacts, i.e., Dhaka, the capital of Bangladesh. In terms of physical growth, Dhaka has become overwhelmingly saturated within its city corporation areas since the 2000s. Hence, further expansions of Dhaka city are protruding towards the adjoining peri-urban landscapes. The Capital Development Authority (a.k.a. Rajdhani Unnayan Kotripakhkha, RAJUK), which is the prime organization for guiding and monitoring Dhaka city's growth, projects the city's growth to nine adjoining sub-districts (a.k.a. upazilas), including Savar, Gazipur Sadar, Kaliganj, Rupganj, Sonargaon, Bandar, Narayanganj Sadar, and Keraniganj (Figure 1). These RAJUK-designated sub-districts are altogether declared as the Dhaka Metropolitan Development Plan (DMDP) boundary. The study considered this DMDP boundary, comprising an aggregate area of 1530 km<sup>2</sup>, as the testbed to demonstrate the peri-urbanization-driven climate change vulnerabilities.

**Figure 1.** Location of the study area: (**a**) the Dhaka Metropolitan Development Plan (DMDP) area within the national context; (**b**) the Dhaka Metropolitan Development Plan (DMDP) area and its corresponding sub-districts' location.

The compelling reasons for investigating the DMDP's peri-urban growth under this microscope are: (a) the capital-city-centered biased national policies of agglomerating all major economic and infrastructural facilities around Dhaka city [37]; (b) the unprecedented flow of in-migrants to Dhaka city [38]; (c) rampant unsustainable development practices around Dhaka city [39]; (d) Dhaka's third-place ranking in both the world's least livable city index and global population density [40–42]; and (e) the climate change-induced frequent flooding that Dhaka is already facing [25].

#### *2.2. Datasets*

The datasets used for this study primarily comprise two types: (a) level-one terrain-corrected (L1T) and cloud-free multispectral Landsat data for the years 1989, 1999, 2009, and 2019 collected from USGS [43] (https://earthexplorer.usgs.gov) (Table 1), and; (b) night-time light (NTL) data for the years 1992 and 2016 collected from NCEI [44] (http://ngdc.noaa.gov/eog/download.html).


**Table 1.** Characteristics of Landsat data for the DMDP area [43].

The downloaded NTL Defense Meteorological Satellite Program's Operational Linescan System (DMSP-OLS) data of 1992 and NTL Visible Infrared Imaging Radiometer Suite (VIIRS) data of 2016 were 'F101992.v4b\_web.stable\_lights.avg\_vis.tif' and 'SVDNB\_npp\_20160101-20161231\_75N060E\_v10\_c201807311200', respectively.

While classified images of Landsat data over time were taken to derive the spatio-temporal dynamics of DMDP's growth, NTL data were used to detect the shifts in peri-urban growth boundaries corresponding to those spatial changes in DMDP's growth. As peri-urban areas take around 20 to 30 years to become completely urbanized, this study selected the NTL DMSP-OLS dataset of 1992 and NTL VIIRS Day-Night Band (DNB) dataset of 2016 in order to investigate the changes in the peri-urbanization pattern over a period of 24 years. This study selected the NTL dataset of 1992 and 2016, as the NTL data are available from 1992 onwards and the latest year of NTL data which provides annual composite is 2016.

In order to reveal the socioeconomic aspects of peri-urban growth, the UN-adjusted grid-level raster data on demography of the year 2001, 2016, and 2020, the average likelihood of poverty (ALP) (i.e., the average probability of living on less than \$2.50/day), and standard deviation of the average likelihood of poverty (ALP) of the year 2013 for the context of Bangladesh were collected from WorldPop and CIESIN [45] (https://www.worldpop.org/geodata/country?iso3=BGD). In addition, a continuous surface elevation map was generated for the DMDP area by using Shuttle Radar Topography Mission (SRTM) datasets of 2014 collected from USGS [43].

Other relevant documents including administrative boundaries, data on flood plain areas, proposed land use zone data, major river boundary, and reports of the City Region Development Plan (CRDP) 2015–2035 were collected from the RAJUK. The data on daily rainfall for the DMDP context from 2000 to 2018 were collected from the Bangladesh Water Development Board (BWDB).

#### *2.3. Methods*

The collected raster datasets and shapefiles were all projected to the universal coordinate system: 'WGS\_1984\_UTM\_Zone\_46N'. Figure 2 illustrates the overall methodology of this study, while the methodological steps are spelled out in the following sub-sections.

#### 2.3.1. Pre-Processing of Satellite Images

#### Preprocessing of Landsat Data

The collected Landsat images were carefully cross-checked to ensure that all collected images were geometrically corrected. Thus, no image-to-image spatial adjustments were performed thereafter.

Two Landsat scenes are needed to cover the entire DMDP area. However, owing to differences in sun angle and in the time and date of image acquisition, the spectral signatures of each scene were notably different from those of its counterpart for each corresponding year. Thus, this study avoided image mosaicking before image classification. The radiometric calibration and atmospheric correction of the collected images were then performed in the ENVI platform.

#### Preprocessing of NTL Data

The NTL DMSP-OLS data inherently have two main problems [46]: (a) saturation and blooming effects; and (b) inter-annual inconsistency, whereas such problems with the NTL VIIRS data are assumed to be minimal [47]. Hence, in order to perform the spatio-temporal analysis using NTL data, producing temporally consistent DMSP-OLS data became necessary.

*Reducing saturation and blooming effects of NTL DMSP-OLS 1992 data:* In order to reduce the saturation and blooming effects of the NTL DMSP-OLS data of 1992, this study adapted Cao et al.'s [48] self-adjusting model (SEAM).

The SEAM model was first applied in Beijing, China. Considering the compact land use development pattern and high-density sprawl development that are prevalent in both Dhaka and Beijing [49,50], Dhaka's urban growth pattern and subsequent night-time light illumination appear to be similar to those of Beijing. Thus, the paper utilized the SEAM model for the DMDP area by using the SEAM script coded in MATLAB. In addition, this study chose the SEAM model because it can alleviate the saturation and blooming effects of the NTL DMSP-OLS images without the help of other auxiliary data.

According to the SEAM model, the saturation and blooming effects are estimated by pixel-based regression using pseudo light pixels (PLPs) and their neighboring light sources. These PLPs are selected from the urban edges. A PLP, as shown in Figure 3, is a weakly lit pixel (i.e., Digital Number (DN) > 0) that has one or more dark pixels (i.e., DN = 0) among its eight neighbors. For each DMSP pixel with DN larger than 0, PLPs are selected within a radius of 150 km.
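The PLP selection rule stated above (a lit pixel with at least one dark pixel among its eight neighbors) can be sketched as follows. This covers only the selection step, not the SEAM regression or the 150 km search radius. The function names and the toy DN grid are hypothetical, and cells outside the grid are simply ignored rather than treated as dark, which is an assumption of this sketch.

```python
def has_dark_neighbour(grid, r, c):
    """True if any of the eight neighbours of (r, c) has DN == 0."""
    rows, cols = len(grid), len(grid[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue
            rr, cc = r + dr, c + dc
            # Neighbours outside the grid are ignored (sketch assumption).
            if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc] == 0:
                return True
    return False


def find_plps(grid):
    """Return (row, col) of lit pixels (DN > 0) that touch a dark pixel."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value > 0 and has_dark_neighbour(grid, r, c)
    ]


# Toy DN grid: a bright core surrounded by weaker edge pixels.
dn = [
    [0, 0, 0, 0, 0],
    [0, 5, 9, 6, 0],
    [0, 8, 40, 9, 3],
    [0, 6, 10, 7, 2],
    [0, 0, 0, 0, 0],
]
plps = find_plps(dn)
print(len(plps), (2, 2) in plps)
```

In this toy grid the bright core pixel at row 2, column 2 is fully surrounded by lit pixels and is therefore not selected, whereas the lit pixels along the edge are.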

*Remote Sens.* **2020**,*12*, 3938

**Figure 2.** Methodological steps of this study.

**Figure 3.** The selection of pseudo light pixels (PLPs) and their adjacent pixels within the frame of a 7 × 7 moving window [48] (p. 404).

*Interannual Calibration of NTL 1992 data:* The SEAM-corrected NTL DMSP-OLS data of 1992 (Figure 4b) were further intercalibrated by using Wu et al.'s [51] study approach. The following equation was applied:

$$\mathrm{DN}_{c} = a \times (\mathrm{DN}_{m} + 1)^{b} - 1 \tag{1}$$

where DN<sub>c</sub> = intercalibrated NTL DMSP-OLS image of 1992; DN<sub>m</sub> = SEAM-corrected NTL DMSP-OLS image of 1992; a = model coefficient = 0.8959; and b = model coefficient = 1.0310. Negative values in the intercalibrated NTL data of 1992 were then set to zero. In this way, the NTL DMSP-OLS data of 1992 were made compatible with the NTL VIIRS data of 2016 for analyzing the spatio-temporal dynamics of peri-urban growth.
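Equation (1) and the subsequent zeroing of negative values can be written as a short per-pixel function. The coefficients are those reported above; the function name and the sample DN values are illustrative assumptions.

```python
A = 0.8959  # model coefficient a, as reported in the text
B = 1.0310  # model coefficient b, as reported in the text


def intercalibrate(dn_m, a=A, b=B):
    """Apply Equation (1) to one SEAM-corrected DN value.

    Negative results are set to zero, as described in the text.
    """
    dn_c = a * (dn_m + 1.0) ** b - 1.0
    return max(dn_c, 0.0)


# Illustrative DN values spanning the 6-bit DMSP-OLS range (0 to 63).
for dn in (0, 1, 10, 63):
    print(dn, round(intercalibrate(dn), 3))
```

For a dark pixel (DN = 0) the power function gives a − 1 = −0.1041, which is why the clamp to zero is needed.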

**Figure 4.** (**a**) The original DMSP-OLS image of 1992 for the DMDP area; (**b**) self-adjusting model (SEAM)-corrected image of 1992 for the DMDP area; (**c**) resampled night-time light (NTL) VIIRS image of 2016 for the DMDP area.

*Resampling and harmonizing NTL datasets:* For the NTL dataset of 2016, this study chose the VIIRS annual average radiance composite vcm-orm-ntl data of 2016, which are cloud-free, outlier-removed, stray-light-corrected, ephemeral-light-eliminated, and geometrically corrected. Hence, no pre-processing of the NTL VIIRS data was performed. These NTL data of 2016 were projected to the WGS\_1984\_UTM\_Zone\_46N reference system. The spatial resolution of the projected NTL VIIRS data was 451.52 m and radiance values (a.k.a. DN values) within the selected DMDP area were between 0 and 69.86 nanowatts/cm<sup>2</sup>/sr. In order to harmonize with the spatial resolution of the NTL DMSP-OLS data of 1992, the NTL data of 2016 were then resampled to 903.04 m spatial resolution by using the 'bilinear' resampling technique in the ArcGIS platform (Figure 4c).
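The target resolution is exactly twice the source resolution (2 × 451.52 m = 903.04 m). If the coarse grid is assumed to be exactly aligned with the fine grid, which a GIS resampling tool does not guarantee, bilinear interpolation at the centre of each 2 × 2 block weights the four source cells equally and therefore reduces to their mean. The sketch below illustrates this special case only; the function name and radiance values are hypothetical, and it is not a reproduction of the ArcGIS tool.

```python
def downsample_2x(grid):
    """Halve the resolution by averaging aligned 2 x 2 blocks.

    Equivalent to bilinear interpolation sampled at block centres when
    the coarse and fine grids are aligned (an assumption of this sketch).
    """
    rows, cols = len(grid) // 2, len(grid[0]) // 2
    return [
        [
            (
                grid[2 * r][2 * c]
                + grid[2 * r][2 * c + 1]
                + grid[2 * r + 1][2 * c]
                + grid[2 * r + 1][2 * c + 1]
            )
            / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Illustrative radiance values on a 4 x 4 fine grid.
fine = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [10, 10, 0, 0],
    [10, 10, 0, 4],
]
coarse = downsample_2x(fine)
print(coarse)
```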

#### 2.3.2. Classifying Landsat Data

This study applied the most commonly used maximum likelihood supervised classification (MLSC) technique in the ENVI platform to classify each selected Landsat image into five land cover categories: (a) bare soil; (b) built-up; (c) vegetation; (d) water body; and (e) low-land. Other classification techniques (e.g., random forest algorithms, support vector machines, and decision tree algorithms) are also popular for classifying remotely sensed data. However, for classifying Landsat data in particular, the choice of appropriate training samples affects classification accuracy more than the choice of a specific classification technique [52,53]. Given that the MLSC technique is robust and available in any remote sensing software package [54], this study selected the MLSC technique. While applying the MLSC technique, the training samples were repeatedly modified in order to ensure the selection of the most representative fraction for each land cover class and thus yield the most accurate classification outcome. Later on, post-classification mosaicking of each selected year's images was performed in order to accommodate the entire DMDP area within a single classified image frame.
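The principle behind maximum likelihood classification can be sketched in a few lines: each class is modelled as a Gaussian fitted to its training pixels, and a pixel is assigned to the class with the highest likelihood. The sketch below is a simplified variant that treats bands as independent (a diagonal covariance) and assumes equal class priors, whereas the full technique uses a complete covariance matrix per class. The two-band training values are hypothetical and are not taken from this study.

```python
import math


def fit_class(samples):
    """Per-band mean and sample variance from a list of training pixels."""
    n = len(samples)
    bands = len(samples[0])
    means = [sum(s[b] for s in samples) / n for b in range(bands)]
    variances = [
        sum((s[b] - means[b]) ** 2 for s in samples) / (n - 1)
        for b in range(bands)
    ]
    return means, variances


def log_likelihood(pixel, means, variances):
    """Gaussian log-likelihood with independent bands (simplification)."""
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for x, m, v in zip(pixel, means, variances)
    )


def classify(pixel, models):
    """Return the class whose fitted Gaussian gives the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(pixel, *models[name]))


# Hypothetical (red, near-infrared) reflectances for three classes.
training = {
    "water": [(0.04, 0.02), (0.05, 0.03), (0.03, 0.02), (0.05, 0.02)],
    "vegetation": [(0.05, 0.40), (0.06, 0.45), (0.04, 0.38), (0.05, 0.42)],
    "built-up": [(0.20, 0.25), (0.22, 0.27), (0.18, 0.24), (0.21, 0.26)],
}
models = {name: fit_class(samples) for name, samples in training.items()}
print(classify((0.05, 0.41), models))
```

Because the fitted means and variances come entirely from the training pixels, changing the training samples changes the decision boundaries directly, which is consistent with the emphasis placed on sample selection above.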

#### 2.3.3. Post-Processing of Classified Images

All classified images were found with 'salt-and-pepper' effects and some degree of localized misclassifications—e.g., 'bare soil' was misclassified as 'built-up areas', and 'built-up areas' was misclassified as 'bare soil'. In addition, in some places, low-density scattered settlements were misclassified as 'vegetation' due to the shade of vegetation coverage. Nonetheless, due to the lower resolution of Landsat images, such perceived misclassifications are common with Landsat image classification [53].

Hence, post-processing of classified images became necessary, which was subsequently carried out in the ArcGIS platform. First, in order to remove the 'salt-and-pepper' effects (i.e., isolated pixels in the classified images), this study employed the 'Majority Filter' tool with a 3 × 3 window. Second, the 'Boundary Clean' tool was used for smoothing the boundaries of land cover classes. Third, for removing small isolated regions, the study further generalized the classified images by sequentially using the 'Region Group', 'Set Null', and 'Nibble' tools. Fourth, the classified raster images were vectorized in order to manually rectify the localized misclassifications. After manual rectification, these images were rasterized again to make them fit for accuracy assessment.
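The idea behind the first step can be illustrated with a generic 3 × 3 majority filter. The sketch below replaces a cell when one class holds a strict majority among its available neighbours. This rule is an assumption made for illustration and is not the exact logic of the ArcGIS 'Majority Filter' tool, which also applies contiguity conditions. The class codes and grid are hypothetical.

```python
from collections import Counter


def majority_filter(grid):
    """Replace each cell by the strict-majority class of its neighbours.

    Cells whose neighbours have no strict majority keep their value.
    The input grid is left unchanged; a filtered copy is returned.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            label, count = Counter(neighbours).most_common(1)[0]
            if count > len(neighbours) / 2:
                out[r][c] = label
    return out


# Hypothetical class codes with two isolated 'salt-and-pepper' pixels.
classified = [
    [1, 1, 1, 3, 3, 3],
    [1, 2, 1, 3, 3, 3],
    [1, 1, 1, 3, 1, 3],
    [1, 1, 1, 3, 3, 3],
]
smoothed = majority_filter(classified)
print(smoothed[1][1], smoothed[2][4])
```

The isolated pixels at (1, 1) and (2, 4) take the class of their surroundings, while cells along the class boundary without a strict majority keep their original value.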

#### 2.3.4. Accuracy Assessment of Classified Images and Change Analysis

The accuracy assessment of classified images was performed in the Google Earth platform. In the case of selecting the number of sampling points for accuracy assessment, if the study area's coverage is less than 1 million acres and the classified land cover categories are fewer than 12 land cover classes, a minimum of 50 sampling points for each land cover category is recommended [55]. In this regard, the size of the study area was 377,842.53 acres, and the number of classified land cover categories was 5. Thus, the threshold sampling points for this accuracy assessment task were 250. However, the number of sampling points under the stratified random sampling technique is proportional to the area coverage of each land cover class. As some land cover classes (e.g., water body) were too rare, formulation of a minimum of 50 stratified random sampling points for each land cover class was not possible. Finally, a total of 351 random sampling points was generated through stratified random sampling technique in the ArcGIS spatial analyst platform. The image classification process and the corresponding post-processing tasks of classified images as mentioned above were repeatedly done until all classified images were found with a minimum classification accuracy of 85% [56]. Overall classification accuracies of the classified images for 1989, 1999, 2009, and 2019 were 90.00%, 84.90%, 84.90%, and 85.47% with Kappa coefficients of 0.849, 0.778, 0.775, and 0.769, respectively. Finally, post-classification change detection was carried out by using Land Change Modeler (LCM) in the TerrSet Geospatial Monitoring and Modelling platform.
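For readers who want to reproduce the two accuracy measures reported above, the following minimal sketch computes overall accuracy and Cohen's Kappa coefficient from a confusion matrix. The two-class matrix is hypothetical and is not the study's data; the function name is likewise an assumption of this example.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] = number of samples mapped as class i whose reference
    class is j.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Minimum sample size rule cited in the text: 50 points per class, 5 classes.
minimum_points = 50 * 5

# Hypothetical confusion matrix (not taken from the study).
example = [
    [45, 5],
    [10, 40],
]
accuracy, kappa = overall_accuracy_and_kappa(example)
```

Here the hypothetical matrix gives an overall accuracy of 0.85 and a Kappa of 0.70, showing how Kappa discounts the agreement expected by chance.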

#### 2.3.5. Peri-Urban Mapping Using NTL Data

#### Recognizing the Fuzzy Characters of Peri-Urban Areas

Peri-urban areas generally start in proximity to an urban core and continue until a predominantly rural landscape is found. Thus, peri-urban areas generally combine the characteristics of urban and rural land uses, where some portions are more urban and the remaining fractions are more rural. Hence, peri-urban areas lend themselves to fuzzy-set theory, which places them on a fuzzy membership scale between 0 and 1, where '0 = predominantly rural' and '1 = predominantly urban'. Consequently, the values between 0 and 1 imply peri-urban areas, where higher membership values (e.g., 0.75) indicate more peri-urban (i.e., inclined to more urban) and lower values (e.g., 0.15) indicate less peri-urban (i.e., inclined to more rural). Henceforth, this study applied the fuzzy membership function to reveal these perceived fuzzy characteristics of peri-urban areas.

#### Identifying the Suitable Fuzzy Membership Function

This study primarily centered around selecting a suitable fuzzy membership function to reveal the fuzzy characteristics of peri-urban areas. Initially, fuzzy 'Gaussian' and 'Linear' membership functions were run on NTL data in the ArcGIS platform. By using multiple combinations of input values, it was observed that the fuzzy Linear membership function is more interpretable than the Gaussian one when using NTL data. Thus, this paper chose the fuzzy Linear membership function to map the fuzzy characteristics of peri-urbanization and named it the 'fuzzy linear urban membership function'.

#### Selecting the Membership Value for the Fuzzy Linear Urban Membership Function

In order to determine the minimum and maximum membership values of the fuzzy linear urban membership function, this study first extracted the persistent and dynamic built-up areas. The persistent built-up areas are those which remained unchanged between 1989 and 2019, while the dynamic built-up areas are the landscapes which were converted into built-up surfaces between 1989 and 2019. As peri-urban areas become urbanized over a 20- to 30-year period, the built-up areas showing persistence between 1989 and 2019 were hypothesized to be predominantly urban. Thus, this study assumes that the characteristics of peri-urban areas lie within the dynamic built-up surfaces. Henceforth, this study extracted the areas which were converted into built-up surfaces between 1989 and 2019.
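The distinction between persistent and dynamic built-up areas can be sketched as a cell-by-cell comparison of two classified rasters. The class code `BUILT_UP = 2`, the function name, and the toy grids are assumptions made for illustration; the study itself derived these zones from its classified Landsat images.

```python
BUILT_UP = 2  # hypothetical class code for 'built-up'


def split_built_up(lc_start, lc_end, built_up=BUILT_UP):
    """Return boolean masks (persistent, dynamic) for two land cover grids.

    persistent: built-up in both the start and end year.
    dynamic:    not built-up in the start year, built-up in the end year.
    """
    persistent, dynamic = [], []
    for row_start, row_end in zip(lc_start, lc_end):
        persistent.append(
            [a == built_up and b == built_up for a, b in zip(row_start, row_end)]
        )
        dynamic.append(
            [a != built_up and b == built_up for a, b in zip(row_start, row_end)]
        )
    return persistent, dynamic


# Toy 2x2 land cover grids standing in for the 1989 and 2019 images.
lc_1989 = [[2, 1], [3, 2]]
lc_2019 = [[2, 2], [3, 1]]
persistent_mask, dynamic_mask = split_built_up(lc_1989, lc_2019)
```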

Later on, this study carried out zonal statistics on the NTL datasets of 1992 and 2016 for the areas comprising persistent and dynamic built-up areas individually (Table 2). It is observed that the mean value between these two separate built-up zones is significantly different, while the differences in the maximum values between these two separate zones are not so conspicuous. Thus, the mean value of dynamic built-up areas appears to have significant potential to interpret the peri-urbanization pattern of the DMDP area. Consequently, this study took the 'minimum value' and 'mean value' as the 'minimum value' and 'maximum value', respectively, to form the fuzzy linear urban membership function set images in the ArcGIS platform. Any values more than the mean estimate (which was subsequently considered as the 'maximum value') indicated predominantly urban areas in the fuzzy linear urban membership function set images.


**Table 2.** The radiance values of NTL data between dynamic and persistent built-up areas.

It is important to mention that this study only used the boundary of dynamic built-up areas in order to derive the membership values for the fuzzy linear urban membership function set images using NTL data. No pixel-to-pixel comparison was performed between 30 m resolution classified Landsat data and 903.04 m resolution NTL data. This way, the difficulty of handling the variations in spatial resolution of these two different image datasets was avoided.
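As a minimal sketch of the procedure described above, zonal statistics can be taken over the dynamic built-up zone and the resulting minimum and mean then used as the lower and upper bounds of a linear membership function. The function names, the toy radiance values, and the reading that both bounds come from the dynamic built-up zone are assumptions of this example; the ArcGIS fuzzy membership tool may handle details differently.

```python
def zonal_stats(values, mask):
    """Return (minimum, mean, maximum) of the values where mask is True."""
    inside = [
        v
        for value_row, mask_row in zip(values, mask)
        for v, m in zip(value_row, mask_row)
        if m
    ]
    return min(inside), sum(inside) / len(inside), max(inside)


def fuzzy_linear(value, low, high):
    """Linear membership: 0 at or below low, 1 at or above high."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


# Toy NTL radiance grid and a mask of dynamic built-up cells.
ntl = [[0.0, 4.0], [8.0, 60.0]]
dynamic_zone = [[True, True], [True, False]]

zone_min, zone_mean, zone_max = zonal_stats(ntl, dynamic_zone)

# The zone minimum and zone mean act as the membership bounds.
membership = [[fuzzy_linear(v, zone_min, zone_mean) for v in row] for row in ntl]
```

Any radiance above the zone mean saturates at a membership of 1, which matches the statement that values above the mean estimate indicate predominantly urban areas.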

#### Identification of Peri-Urban Areas within the Fuzzy Linear Urban Membership Function Set Images

The urban core areas were predominantly found within the values of 0.80 to 1.0, while the rural areas were predominantly found within the value range of 0 to 0.10 in the fuzzy linear urban membership function set images. Consequently, peri-urban areas are delimited within the value range of 0.11 to 0.79. Hence, by using these perceived value ranges, the fuzzy linear urban membership function set images of the years 1992 and 2016 are reclassified into three categories (Figure 5): (a) predominantly urban (PURBAN); (b) predominantly rural (PRURAL), and; (c) peri-urban (PU).

**Figure 5.** The level of urbanization under the microscope of fuzzy linear urban membership function set.
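The reclassification into three categories amounts to thresholding the membership value. The sketch below uses the value ranges reported in the text; the function name and the handling of values falling between 0.10 and 0.11 (assigned here to peri-urban) are assumptions.

```python
def classify_membership(mu):
    """Map a fuzzy membership value in [0, 1] to one of three categories."""
    if mu >= 0.80:
        return "PURBAN"  # predominantly urban
    if mu <= 0.10:
        return "PRURAL"  # predominantly rural
    return "PU"          # peri-urban


labels = [classify_membership(mu) for mu in (0.05, 0.10, 0.50, 0.80, 0.95)]
```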

#### Derivation of Fuzzy Set Statistics for the Study Area

This study derived the overall level of urbanization and the degree of fuzziness for the DMDP context. The level of urbanization denotes the mean gross level of membership in the fuzzy linear urban membership function set. The following equation was used to calculate the level of urbanization.

$$\text{LoU}_{\text{year}} = \frac{\sum_{i}^{n} \text{F}_{\text{year}}}{\sum \text{n}} \tag{2}$$

where LoU is the level of urbanization of each corresponding year, and n is the total number of cells in the corresponding fuzzy linear urban membership function set image.
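Read as the mean membership over all cells, which is how the text describes the level of urbanization, Equation (2) reduces to an arithmetic mean. The following sketch assumes that reading; the function name and the toy grid are hypothetical.

```python
def level_of_urbanization(membership_grid):
    """Mean fuzzy membership over all cells of a membership image."""
    cells = [value for row in membership_grid for value in row]
    return sum(cells) / len(cells)


# Toy 2x2 membership image.
toy_membership = [[0.0, 0.5], [1.0, 0.5]]
lou = level_of_urbanization(toy_membership)
```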

In order to determine the degree of fuzziness, the following equation was used:

$$\text{DoF}_{\text{year}} = \frac{\left(\frac{\text{F}_{\text{year}} - \text{F}_{\text{Predominantly Urban}}}{\text{F}_{\text{year}}}\right)}{\sum \text{n}} \tag{3}$$

where DoF<sub>year</sub> = the degree of fuzziness for each corresponding year, and F<sub>Predominantly Urban</sub> = fuzzy linear membership function set for predominantly urban areas of each corresponding year, where the minimum and maximum values for the predominantly urban set image were set to 0.80 and 1.0, respectively.

#### 2.3.6. Validation of Peri-Urban Mapping

#### Ground Truthing of Peri-Urban Mapping

In order to determine the accuracy of the peri-urban mapping approach, this study carried out ground-truthing by using 300 stratified random sampling points. Considering the availability of data for the purpose of ground-truthing using the Google Earth platform, the study chose NTL VIIRS data mapping of 2016 for this ground-truthing purpose. Nevertheless, as peri-urbanization is local context-specific, a set of criteria was primarily formulated to distinguish the areas which are predominantly urban, peri-urban, or predominantly rural for the context of the DMDP area (Table 3). Later on, a ground-truthing exercise was performed in the Google Earth platform and a confusion matrix was formed thereafter to derive the accuracy of the peri-urban mapping exercise of 2016.


**Table 3.** Criteria to ground-truth the NTL VIIRS data mapping of 2016.

#### Checking the Consistency of Peri-Urban Expansions with the Proposed Plan Documents

In order to determine the changes in peri-urban growth corresponding to the proposed changes as stipulated in the DMDP's City Region Development Project (CRDP) 2015–2035, this study compared the NTL VIIRS data mapping of 2016 with the proposed land use zones of CRDP 2015–2035.

#### 2.3.7. Identifying Factors Affecting the Spatial Distribution of Peri-Urbanization

#### Pre-Processing of Ancillary Datasets

The collected raster datasets on demography, the average likelihood of poverty (ALP) and standard deviation of the average likelihood of poverty (ALP), and the continuous surface elevation dataset generated for the DMDP context were all resampled to 903.04 m spatial resolution (by using 'bilinear' resampling technique in the ArcGIS platform) in order to obtain consistency with the NTL data-driven peri-urban mapping for comparative analysis. This study considers the standard deviation of the average likelihood of poverty (ALP) as a proxy variable of social stratification.

#### Identification of Statistically Significant Hot Spots for Interpreting Peri-Urbanization

In order to reveal whether the spatial distributions of population growth, the standard deviation of the average likelihood of poverty (ALP), elevation, and the corresponding changes in peri-urbanization are clustered, random, or dispersed, spatial autocorrelation analysis was carried out to derive the Global Moran's I statistic. A positive Moran's Index implies a tendency towards spatial clustering, while a negative value indicates a tendency towards spatial dispersion. Based on the outcome of the spatial autocorrelation analysis for each parameter mentioned above, the null hypothesis 'spatial distribution of a given parameter is normally distributed' was tested.
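The sign behaviour of the Global Moran's I statistic can be illustrated with a minimal sketch on a raster grid. It assumes binary rook-contiguity weights (cells sharing an edge are neighbours), which may differ from the weighting scheme used in the ArcGIS tool, and it computes only the index itself, not the z-score or *p*-value. The function name and the toy grids are hypothetical.

```python
def morans_i(grid):
    """Global Moran's I for a 2D grid with binary rook-contiguity weights."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(v for row in grid for v in row) / n
    dev = [[v - mean for v in row] for row in grid]

    cross = 0.0  # sum over neighbour pairs of w_ij * z_i * z_j
    s0 = 0       # sum of all weights w_ij
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    s0 += 1

    denom = sum(d * d for row in dev for d in row)
    return (n * cross) / (s0 * denom)


# Alternating values: every neighbour differs (dispersion).
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
# Left half 0, right half 1: similar values sit together (clustering).
clustered = [[0, 0, 1, 1] for _ in range(4)]

i_dispersed = morans_i(checkerboard)
i_clustered = morans_i(clustered)
```

The checkerboard yields a negative index and the two-block grid a positive one, matching the interpretation given above.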

Afterwards, hot spot analysis was carried out to compute the Getis-Ord Gi\* statistic in order to measure the intensity of such perceived clustering. While computing the Getis-Ord Gi\* statistic, a False Discovery Rate (FDR) correction was applied in order to account for spatial dependency and multiple testing. Regardless of whether the z-score is positive or negative, a larger absolute z-score implies more intense spatial clustering. A statistically significant positive z-score indicates clustering of hot spots, a statistically significant negative z-score indicates clustering of cold spots, and a z-score near zero is 'not significant', meaning that no apparent spatial clustering is evident.
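To show what an FDR correction does to a set of per-cell *p*-values, the sketch below implements the standard Benjamini-Hochberg step-up procedure. The text does not state which FDR procedure the software applied, so using Benjamini-Hochberg here is an assumption, as are the function name, the significance level of 0.05, and the example *p*-values.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test stays significant.

    Step-up rule: find the largest rank k (1-based, ascending p-values)
    with p_(k) <= (k / m) * alpha, then reject all tests ranked <= k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values for five cells.
flags = benjamini_hochberg([0.2, 0.001, 0.5, 0.029, 0.012])

# Step-up behaviour: 0.03 misses its own threshold (0.025) but is still
# rejected because 0.04 passes the threshold for rank 2 (0.05).
step_up_flags = benjamini_hochberg([0.03, 0.04])
```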

#### Performing Geographically Weighted Regression

The Geographically Weighted Regression (GWR) was carried out in order to reveal how peri-urbanization is interpreted as a response variable to the spatial distribution of population growth, the standard deviation of the average likelihood of poverty (ALP), and elevation pattern for the DMDP context.

#### 2.3.8. Identifying Peri-Urbanization Triggered Climate Change Vulnerabilities

#### Identifying Rainfall Pattern of the Study Area

The daily rainfall data from 2000 to 2018 (collected from the Bangladesh Water Development Board) were used to derive the statistics on total annual rainfall, annual monsoon rainfall, and annual non-monsoon rainfall. Each year's rainfall from May to September was considered as the 'annual monsoon rainfall', and the rainfall of the remaining months for each year was considered as the 'annual non-monsoon rainfall'.
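The split into monsoon and non-monsoon totals can be sketched as a simple aggregation of daily records by year and month, using the May to September definition given above. The record layout (a list of date and millimetre pairs), the function name, and the sample values are assumptions of this example.

```python
from datetime import date

MONSOON_MONTHS = {5, 6, 7, 8, 9}  # May to September, as defined in the text


def annual_rainfall_summary(daily_records):
    """Aggregate (date, rainfall_mm) records into per-year totals."""
    summary = {}
    for day, rainfall_mm in daily_records:
        year = summary.setdefault(
            day.year, {"total": 0.0, "monsoon": 0.0, "non_monsoon": 0.0}
        )
        year["total"] += rainfall_mm
        season = "monsoon" if day.month in MONSOON_MONTHS else "non_monsoon"
        year[season] += rainfall_mm
    return summary


# Hypothetical daily records (not the study's data).
records = [
    (date(2017, 1, 15), 10.0),
    (date(2017, 6, 1), 60.0),
    (date(2017, 7, 2), 20.0),
    (date(2017, 11, 3), 10.0),
]
rain = annual_rainfall_summary(records)
monsoon_share_2017 = 100.0 * rain[2017]["monsoon"] / rain[2017]["total"]
```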

#### Identification of Peri-Urban Growth Pockets Vulnerable to Flooding

The data on DMDP's flood plain area (collected from the RAJUK) were superimposed on NTL VIIRS data mapping of 2016 in order to find out the peri-urban growth pockets vulnerable to flooding.
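A superimposition of this kind can be summarized as the percentage of each category's cells that fall inside the flood plain. The sketch below assumes equal-area cells and two aligned grids (a category grid and a boolean flood plain mask); the function name and the toy data are hypothetical.

```python
def floodplain_share_by_class(class_grid, flood_mask):
    """Percentage of each class's cells that lie inside the flood plain."""
    totals, flooded = {}, {}
    for class_row, flood_row in zip(class_grid, flood_mask):
        for label, in_floodplain in zip(class_row, flood_row):
            totals[label] = totals.get(label, 0) + 1
            if in_floodplain:
                flooded[label] = flooded.get(label, 0) + 1
    return {
        label: 100.0 * flooded.get(label, 0) / totals[label] for label in totals
    }


# Toy category grid and flood plain mask.
classes = [
    ["PU", "PU", "PURBAN"],
    ["PU", "PU", "PRURAL"],
]
flood_plain = [
    [True, False, False],
    [True, False, True],
]
shares = floodplain_share_by_class(classes, flood_plain)
```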

#### Mapping the Socioeconomic Impacts of Peri-Urban Growth

The demographic data for the years 2001, 2016, and 2020, the average likelihood of poverty (ALP) and the standard deviation of the average likelihood of poverty (ALP) for the year 2013 for the DMDP context were all analyzed and compared with the NTL VIIRS data mapping of 2016 in order to illustrate how the changes in the spatial distribution of demography and the standard deviation of ALP correspond to the changes in peri-urban growth.

#### **3. Results**

#### *3.1. Changes in Land Cover*

The classified images are presented in Figure 6, while the areal statistics of those classified images are presented in Table 4.

**Figure 6.** (**a**) Classified DMDP image from 1989; (**b**) classified DMDP image from 1999; (**c**) classified DMDP image from 2009; (**d**) classified DMDP image from 2019.


**Table 4.** Areal statistics of classified images for the DMDP area.

The results find that except for built-up areas, the remaining land cover categories declined over time. While the highest decline of 22.18% is observed in the vegetation category, the 'bare soil' and 'low-land' categories also demonstrate a significant decline over this 30-year period. Thus, the losses in bare soil, vegetation, and lowlands contributed altogether to the development of built-up areas over time. Consequently, built-up areas quadrupled over time and constituted more than half (i.e., 52%) of the DMDP area in 2019, whereas bare soil and vegetation were more than halved within this 30-year period. Earlier studies also found a similar growth trend for bare soil, built-up, vegetation, and lowland areas (e.g., [57–59]).

In addition, all land cover categories (except water body) demonstrate an exponential growth/decline trend with strong coefficient of determination (R<sup>2</sup>) values.

In the case of sub-district-wise net changes in bare soil, Gazipur Sadar encountered the highest decline (i.e., 26%) in bare soil, followed by the Dhaka Metropolitan Area (DMA), and Narayanganj Sadar by 13% and 8%, respectively, between 1989 and 2019 (Figure 7). While sub-districts predominantly lost bare soil over time, Kaliganj presented an increase of 7% for bare soil within this 30-year period. On the contrary, all sub-districts encountered significant losses in vegetation. As for the case of built-up areas, all sub-districts except for Kaliganj demonstrated significant increases over time, while built-up areas of Kaliganj declined by 4% between 1989 and 2019.

#### *3.2. Changes in Peri-Urban Boundary*

The areal statistics of peri-urban growth are presented in Table 5, while the peri-urban mapping is depicted in Figure 8. The analysis reveals that for the DMDP context, predominantly urban areas remained nearly unchanged over time, while the major changes occurred in the transition of predominantly rural areas into peri-urban areas (Table 5). Thus, major development within the DMDP context predominantly took place in the form of peri-urbanization.


**Table 5.** Areal statistics of peri-urban growth between 1992 and 2016.

**Figure 7.** (**a**) % of net changes in bare soil; (**b**) % of net changes in built-up areas; (**c**) % of net changes in vegetation; (**d**) % of net changes in low-land areas.

**Figure 8.** (**a**) Peri-urban mapping of 1992; (**b**) peri-urban mapping of 2016.

The level of urbanization within the predominantly urban areas category was 0.94 and 0.96 in 1992 and 2016, respectively. Thus, predominantly urban areas had undergone the highest level of urbanization within these 24 years (1992–2016), followed by peri-urban areas and predominantly rural areas. Nevertheless, in terms of the level of urbanization, predominantly urban areas demonstrated an increase of 2.13% only, which resulted in an increase of 71.43% in the degree of fuzziness in 2016 from the base year estimate of 1992. Such a finding reveals that along with the pace of urbanization, a less articulated pattern of urban land use practices becomes more evident.

In addition, the level of urbanization within the predominantly rural areas category demonstrates the same thing, while the degree of fuzziness within this zone is 1.00 in 1992 and 2016, meaning the land use practices embraced an unclear pattern in distinguishing whether the areas are predominantly rural or urban.

Furthermore, predominantly rural areas and peri-urban areas both demonstrated an increase of 33.33% in the level of urbanization in 2016. However, while predominantly rural areas persistently exhibited the highest degree of fuzziness in 1992 and 2016, peri-urban areas showed a slight decline of 4.08% in 2016, implying a nominal improvement towards articulated land use practices within peri-urban areas.

In the case of the sub-district-wise level of urbanization and degree of fuzziness, with the correlation coefficients of (-)0.97 and (-)0.95, respectively, for the years 1992 and 2016, the result reveals that the degree of fuzziness is inversely correlated to the level of urbanization, with Bandar being an exception (Figure 9). The Bandar sub-district encountered the highest increase in the rate of the level of urbanization (i.e., 277%) and experienced a reduction in the degree of fuzziness, from 0.93 to 0.65 in 2016. Such a finding implies that scattered urban development in Bandar has been further consolidated through increased urbanization, resulting in a decrease in the degree of fuzziness thereafter. The Dhaka Metropolitan Area (DMA), and Narayanganj Sadar continued the higher level of urbanization both in 1992 and 2016, and hence scored the lowest in terms of the degree of fuzziness.

The remaining sub-districts underwent less urbanization and encountered more fuzziness, meaning that an intensified unarticulated peri-urbanization predominantly took place within these (remaining) sub-districts.

#### *3.3. Validation of Peri-Urban Mapping*

#### 3.3.1. Ground-Truthing of Peri-Urban Mapping

The spatial distribution of the ground-truthing points generated through the stratified random sampling technique is presented in Figure 10, while the confusion matrix of this ground-truthing exercise is given in Table 6. With a Kappa coefficient of 0.75, the ground-truthing exercise yields an overall accuracy of 86%.

**Figure 10.** Spatial distribution of ground-truthing points on NTL VIIRS data mapping of 2016 for: (**a**) predominantly rural areas; (**b**) peri-urban areas; (**c**) predominantly urban areas.



#### 3.3.2. Consistency of Peri-urban Expansions with the Proposed Plan Documents

The City Region Development Plan (CRDP) proposed land use zone of 2015–2035 was superimposed over the NTL VIIRS data mapping of 2016 in order to cross-check how the adopted peri-urban mapping approach fits with the proposed land use zoning for the DMDP area.

The CRDP's 2015–2035 proposed land use zones comprise eight broad land use categories (Figure 11). The analysis reveals that the mixed-use zone (55%) and agricultural zone (29%) are the two most dominant proposed land use zones in the DMDP area, which altogether account for 84% of the DMDP's area coverage (Table 7).

**Table 7.** Areal statistics of the proposed land use zone (derived from the CRDP 2015–2035 [60]).


**Figure 11.** Spatial distribution of CRDP's 2015–2035 proposed land use zones [60].

According to the NTL VIIRS data mapping of 2016, predominantly rural areas comprise 11% of the DMDP area, out of which the CRDP's 2015–2035 proposed agricultural zone and mixed use (i.e., a blend of residential, commercial, and general industrial areas) zone altogether comprise around 10%.

Similarly, in aggregate, peri-urban areas comprise around 58% of the DMDP's area, whereas mixed use zones (28.10%) and agricultural zones (21.87%) altogether comprise around 50% of the DMDP's peri-urban area. Thus, peri-urban areas are proposed to accommodate around three fourths of the DMDP's proposed agricultural areas and more than half of the proposed mixed-use zone. Predominantly urban areas are proposed to be dominated by mixed use zones which comprise around two fifths of the proposed mixed-use zone areas.

Among the other proposed land use zones, institutional zones comprise around 4% of the DMDP's area, and comprise 2.61% and 1.33% of the predominantly urban and peri-urban areas, respectively. In addition, more than three fourths of the DMDP's proposed heavy industrial zone is located in predominantly urban areas. Meanwhile, among the total proposed forest areas, nearly three fourths of the proposed forest zones are located in peri-urban areas and the remaining portion was proposed to be accommodated by predominantly rural areas.

The analysis reveals that the proposed forest areas have the lowest values for LoU and the highest values for DoF (Figure 12). In contrast, the proposed heavy industrial zone appeared to be the most urbanized in terms of LoU, and hence had the lowest values for DoF. Apparently, heavy industrial zones are proposed based on strict policy control, and hence spontaneous development of these areas is not possible, resulting in an articulated land use pattern in these areas. Meanwhile, forested areas are spontaneous natural growth zones, and hence were observed as having the highest DoF.

**Figure 12.** The level of urbanization (LoU) and degree of fuzziness (DoF) under the CRDP's 2015–2035 proposed land use zones.

In addition, for each sub-district, the higher the LoU, the lower the DoF in each proposed land use zone (Figure 13). The Kaliganj sub-district, with the lowest LoU and an absolute DoF value of 1.0, calls for an immediate land use intervention by the city authorities (Figure 13d). Other sub-districts with higher DoF include Gazipur, Keraniganj, Rupganj, Sonargaon, and Savar. These areas are predominantly located in peri-urban areas, meaning that peri-urban growth appears to be dominated by unarticulated land use patterns.

**Figure 13.** The level of urbanization (LoU) and degree of fuzziness (DoF) for CRDP's 2015–2035 proposed land use zones: (**a**) Bandar; (**b**) Dhaka Metropolitan Area (DMA); (**c**) Gazipur Sadar; (**d**) Kaliganj; (**e**) Keraniganj; (**f**) Narayanganj Sadar; (**g**) Rupganj; (**h**) Savar; and (**i**) Sonargaon.

Such general findings are also evident in the sub-district-wise LoU and DoF calculations that were derived from the NTL VIIRS data mapping of 2016 (Figure 9). Thus, it can be deduced that the proposed NTL data mapping exercise of 2016 yields substantive outcomes in distinguishing predominantly rural, peri-urban, and predominantly urban areas.

#### *3.4. Factors Affecting the Spatial Distribution of Peri-Urbanization*

#### 3.4.1. Identification of Statistically Significant Hot Spots for Interpreting Peri-Urbanization

#### Hotspots of Population Growth

In the DMDP area, a highly statistically significant (Moran's Index: 0.69, *p*-value = 0, and hence the null hypothesis 'spatial distribution of the population is normally distributed' was rejected) cluster pattern of population growth is evident. The hot spots of population increase are found predominantly in the Gazipur Sadar areas (Figure 14a), which bolsters the earlier findings on Gazipur Sadar's supremacy in attaining the highest rate of population growth. However, Savar contains both hot spots and cold spots. Indeed, the northern part of Savar in proximity to the Gazipur Sadar areas is observed as having hot spots, whereas the southern part of Savar in proximity to the Dhaka Metropolitan Area is found to have cold spots for population growth. As Savar has both cold and hot spots of population growth, its aggregate population growth rate does not appear to be higher than that of the other sub-districts, although the earlier findings indicate that Savar is the most peri-urbanized sub-district in this study.

#### Hotspots Mapping of the Standard Deviation of the Average Likelihood of Poverty

The hot spot mapping exercise on the standard deviation of the average likelihood of poverty (ALP) demonstrates a highly statistically significant clustering pattern (Moran's Index = 0.78 and *p*-value = 0, and hence the null hypothesis 'spatial distribution of poverty deviation is normally distributed' was rejected). However, in this regard, hot spots are predominantly confined within the Dhaka Metropolitan Area, while the cold spots are predominantly observed near the eastern periphery of the Dhaka Metropolitan Area, which adjoins the Rupganj, Sonargaon, and Kaliganj sub-districts (Figure 14b). Thus, the hot spot pattern of the standard deviation of ALP does not appear to be commensurate with the hot spot pattern of population growth.

#### Hotspots of Elevation Pattern

A statistically significant (Moran's Index: 0.56, *p*-value = 0, and hence the null hypothesis 'spatial distribution of elevation pattern is normally distributed' was rejected) clustering pattern is dominant in the spatial distribution of elevation for the DMDP area. The hot spots (i.e., areas with higher elevation) are predominantly concentrated in the Gazipur Sadar and Savar sub-districts—i.e., the areas which underwent the highest peri-urbanization (Figure 14c).

#### Hotspots of Peri-Urbanization

This study used the NTL VIIRS data mapping of 2016 in determining the spatial distribution of hotspots for peri-urbanization (Figure 14d). The Moran's Index was computed as (+)0.90 with a z-score value of 54.51 and *p*-value of 0. Thus, the null hypothesis 'spatial distribution of peri-urbanization pattern is normally distributed' was rejected, implying that the spatial distribution of the fuzzy linear urban membership function set image of 2016 (i.e., NTL VIIRS data mapping of 2016) demonstrates a highly significant clustering pattern.

The analysis reveals that hot spots are predominantly urban, whereas the cold spots are predominantly rural (Figure 14d). On the contrary, the areas which are not statistically significant in the fuzzy linear urban membership function set image of 2016 are predominantly peri-urban. Indeed, peri-urban areas, which are neither urban nor rural, seem to possess more fuzzy characteristics, which in turn lead them to be statistically insignificant.

**Figure 14.** (**a**) Hot spot analysis of the aggregate population growth rate (APGR) mapping; (**b**) hot spot analysis of the standard deviation of the average likelihood of poverty (ALP); (**c**) hot spot analysis of elevation pattern derived from the SRTM global DEM data [43]; (**d**) Hot spot analysis of the fuzzy linear urban membership function set image of 2016.

#### 3.4.2. Carrying out Geographically Weighted Regression for Interpreting Peri-Urbanization

The findings from hot spot analysis reveal that the spatial distribution of peri-urban hot spots is more closely linked to the corresponding distribution patterns of elevation and population growth. Thus, in order to illustrate how such perceived peri-urbanization of the DMDP area varies across space corresponding to the changes in elevation and population growth pattern, Geographically Weighted Regression (GWR) was carried out. The adjusted R<sup>2</sup> value of the GWR model was 0.87, indicating that peri-urbanization is predominantly linked to higher elevation and higher population growth rate. Meanwhile, the R<sup>2</sup> value of the GWR model was 0.63 when peri-urbanization was interpreted as a response variable of elevation, population growth rate, and the standard deviation of ALP. Thus, when the standard deviation of ALP is omitted, the GWR model interpreting DMDP's peri-urbanization produces a better outcome, meaning that peri-urbanization does not necessarily mean an overconcentration of poverty in the city's outskirts. Instead, such concentration is predominantly driven by land scarcity in the city's core and an unleashed influx of population to the capital city Dhaka.

#### *3.5. Identifying Peri-Urbanization Triggered Climate Change Vulnerabilities*

#### 3.5.1. Identifying Rainfall Pattern of the Study Area

The annual rainfall pattern of the DMDP area is presented in Figure 15. The analysis reveals that from 2000 to 2018, total annual rainfall in the DMDP area ranged from 818.1 mm (in 2010) to 2873 mm (in 2017), with an annual average rainfall of 1824.91 mm within this 18-year period. In addition, it was calculated that the annual average rainfall in the monsoon (which lasts from May to September) comprised 80% of the total annual rainfall within this 18-year period.

**Figure 15.** Rainfall pattern of the DMDP area.

Surprisingly, in 2012, the DMDP area had the lowest annual monsoon rainfall percentage of 67.72%, whereas in the following year (i.e., in the year 2013), the percentage of annual monsoon rainfall was the highest, at 91.30%. In addition, this rainfall pattern does not follow any significant growth trend over time. For example, although the highest annual rainfall occurred in the year 2017, annual monsoon rain within this year was 73.41%, which is below the annual average percentage of monsoon rainfall (i.e., 80%), whereas the highest annual rainfall in the non-monsoon period was observed in 2017.

However, this finding does not imply with certainty that more rainfall in the monsoon is linked with less rainfall in the non-monsoon period and vice versa, as the correlation coefficient of the rainfall pattern between the monsoon and non-monsoon periods is 0.44 (which is not very strong), meaning that abrupt rainfall in the DMDP area has gradually become common. Consequently, frequent flooding due to heavy rainfall often occurs in the DMDP area.

#### 3.5.2. Identification of Peri-Urban Growth Pockets Vulnerable to Flooding

In general, DMDP is a flood-prone area. However, the northern part of the DMDP is at a higher elevation, and is therefore comparatively less flood-prone (Figure 16). Thus, higher population concentration and subsequent peri-urbanization are observed in the areas with higher elevation, which are predominantly concentrated in the Gazipur Sadar and Savar areas. The analysis reveals that around 44% of total peri-urban areas are located within the flood plain area, while the corresponding shares for predominantly rural and urban areas are 35% and 33%, respectively, meaning that peri-urban areas are most vulnerable to frequent flooding. Such findings are consistent with the study by Dewan et al. [61], who claimed that the DMDP's peri-urban areas are moderately to highly vulnerable to annual flooding.

**Figure 16.** DMDP's flood plain area (FPA) (derived from CRDP [60]) superimposed on the VIIRS NTL data mapping of 2016.

#### 3.5.3. Mapping Socioeconomic Impacts of Peri-Urban Growth

#### Changes in Demography

The population is predominantly concentrated within the Dhaka Metropolitan Area (DMA), Narayanganj Sadar, and along the Dhaka-Mymensingh and Tangail-Joydevpur Highways of the Gazipur Sadar area (Figure 17). Apparently, the spatial changes in demography between 2001 and 2016 are quite conspicuous, whereas the distributions of 2016 and 2020 appear to be quite similar.

**Figure 17.** (**a**) Spatial distribution of 2001 population; (**b**) spatial distribution of 2016 population; (**c**) spatial distribution of 2020 population (derived from WorldPop and CIESIN [45]).

In addition, the percentage distribution of population between predominantly rural areas, peri-urban areas, and predominantly urban areas remained almost equal over time, whereas the density of population nearly doubled within this 19-year period (Table 8). Thus, this study finds a perpetual increase in growth rate of population over time for the DMDP context. Although the density of population in DMDP's predominantly urban areas is unparalleled compared to the rest of the world, the population density of even DMDP's predominantly rural areas is much higher than some of the urban areas' population density of the developed world. For example, a country like Australia, which comprises more than 50 times larger area coverage than Bangladesh, has an urban area population density of 903 persons/km<sup>2</sup> [41].

**Table 8.** Distribution of population between predominantly rural areas, peri-urban areas, and predominantly urban areas based on NTL VIIRS 2016 data mapping.


Nevertheless, the population density of the Dhaka City Corporation (DCC) area is 52,000 persons/km<sup>2</sup> [40], whereas the estimated 2020 population density for predominantly urban areas is 31,666 persons/km<sup>2</sup>. Indeed, the predominantly urban areas of 2016 cover an area of 478 km<sup>2</sup>, whereas the area coverage of the DCC is 134 km<sup>2</sup>. Thus, according to the NTL VIIRS 2016 data, predominantly urban areas are more than 3.5 times larger than the DCC's area coverage. Hence, the lower population density observed in predominantly urban areas is plausible, as DMDP's predominantly urban areas extend far beyond the jurisdiction of the DCC boundary.

#### Population Densification and Peri-Urban Growth

In general, predominantly urban areas have a relatively lower growth of population, while predominantly rural areas and peri-urban areas have higher growth rates of population.

Figure 18a illustrates the spatial changes in peri-urban mapping between 1992 and 2016, while the areal statistics of these changes are presented in Figure 18b. The analysis reveals that the highest percentage distribution lies in the peri-urban areas category, which covers more than one-third of the DMDP's area. Surprisingly, in terms of spatial changes between the areas of predominantly rural, peri-urban, and predominantly urban, nearly two thirds (i.e., 63%) of the DMDP area demonstrated persistence over time. However, the third and fourth highest categories are the transition of predominantly rural areas to peri-urban areas and the transition of peri-urban areas to predominantly urban areas, constituting 18.25% and 7.08% of the area, respectively.
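The persistence and transition shares reported here correspond to a cross-tabulation of two class maps. The following is a minimal sketch of that calculation, assuming two co-registered grids of class labels; the function names and the toy grids are hypothetical and do not reproduce the study's data.

```python
from collections import Counter

def transition_shares(before, after):
    """Cross-tabulate two class maps; return percentage of area per (from, to) pair."""
    counts = Counter()
    for row_before, row_after in zip(before, after):
        for b, a in zip(row_before, row_after):
            counts[(b, a)] += 1
    n_cells = sum(counts.values())
    return {pair: 100.0 * c / n_cells for pair, c in counts.items()}

def persistence_share(shares):
    """Sum the shares of cells whose class did not change."""
    return sum(pct for (b, a), pct in shares.items() if b == a)

# Toy 2x5 grids (hypothetical values).
before = [
    ["PRURAL", "PRURAL", "PU", "PU", "PURBAN"],
    ["PRURAL", "PRURAL", "PU", "PURBAN", "PURBAN"],
]
after = [
    ["PRURAL", "PU", "PU", "PURBAN", "PURBAN"],
    ["PRURAL", "PRURAL", "PU", "PURBAN", "PURBAN"],
]
shares = transition_shares(before, after)
persistent = persistence_share(shares)
```

Each off-diagonal pair, such as `("PRURAL", "PU")`, gives the share of one transition category; the diagonal pairs together give the persistent share.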

In terms of the aggregate population growth rate between 2001 and 2020, the analyses reveal that persistent predominantly rural areas encountered the highest pressure of population growth, by around 135% (i.e., 7.10% per year). Among the areas which remained persistent over time, predominantly urban areas exhibited the lowest population growth rate. The second highest growth rate of population, 125.91%, was observed in the 'peri-urban to predominantly urban' areas category, implying a sequential transition of 'predominantly rural areas to peri-urban areas' and the subsequent transition of 'peri-urban areas to predominantly urban' areas.
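The figure of 7.10% per year matches the aggregate rate divided by the 19-year span (135/19 ≈ 7.1). A compound annual rate for the same aggregate growth would be lower. The sketch below contrasts the two conventions; the function names are hypothetical, and treating the reported yearly figure as a simple average is an assumption based on that arithmetic.

```python
def simple_annual_rate(aggregate_pct, years):
    """Aggregate percentage growth spread evenly over the period."""
    return aggregate_pct / years

def compound_annual_rate(aggregate_pct, years):
    """Constant yearly percentage growth that compounds to the aggregate growth."""
    growth_factor = 1.0 + aggregate_pct / 100.0
    return (growth_factor ** (1.0 / years) - 1.0) * 100.0

simple = simple_annual_rate(135.0, 19)      # about 7.1% per year
compound = compound_annual_rate(135.0, 19)  # noticeably lower than the simple rate
```

The distinction matters when the yearly figure is compared with annual growth rates from other sources, which are often compound rates.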

Nevertheless, the areas demonstrating the transition of peri-urban to predominantly rural show the third highest population growth, by 113.05%, which predominantly occurred near the persistent predominantly rural areas of Kaliganj sub-district. Such a finding is not unexpected, as these persistent predominantly rural areas account for the highest percentage of population growth between 2001 and 2020.

Surprisingly, an unusual transition titled 'predominantly urban to predominantly rural' is observed in the Narayanganj Sadar area, which accounted for a negative growth rate of population. These areas of negative population growth are located in the highly flood-prone area of the Dhaleshwari River's catchment, and hence frequent flooding displaces many people from this part of the landscape.

In terms of the sub-district-wise population growth rate, the Gazipur Sadar area encountered the highest rate of population increase, by a large margin over the other sub-districts (Figure 18c). The sub-districts that are predominantly peri-urban or predominantly rural in type (i.e., Kaliganj, Keraniganj, and Rupganj) have nearly the same rate of population growth in dynamic and persistent areas. Meanwhile, the two most urbanized areas (i.e., the Dhaka Metropolitan Area and the Narayanganj Sadar area) are observed as having the lowest rate of population increase. In addition, the population growth difference between persistent and dynamic areas is also found to be relatively higher in the Narayanganj Sadar and Dhaka Metropolitan Areas. However, the population growth rate in dynamic areas is relatively lower in each sub-district compared to the corresponding population growth rate of persistent areas, with Savar being an exception. The Savar sub-district possesses the largest share of land that has been converted from predominantly rural into peri-urban areas, implying that Savar is the most peri-urbanized sub-district of this study. Such findings suggest that peri-urbanization is linked to population increase in dynamic areas. Thus, while the Dhaka Metropolitan Area and Narayanganj Sadar area have extremely limited scope for peri-urbanization, the nearby sub-districts are potential areas for such peri-urban expansions.

**Figure 18.** (**a**) Map showing the spatial distribution of predominantly rural (PRURAL), peri-urban (PU), and predominantly urban (PURBAN) areas; (**b**) table showing percentage distribution in the changes of PRURAL, PU, and PURBAN areas and corresponding changes in the aggregate population growth rate (APGR) between 2001 and 2020; (**c**) sub-district-wise aggregate population growth rate (APGR) between 2001 and 2020.

#### Poverty and Peri-Urban Growth

The average likelihood of poverty (ALP) is highest in the Kaliganj sub-district and extends to the neighboring sub-districts of Gazipur Sadar and Rupganj (Figure 19a), whereas such poverty concentration is relatively lower in the Dhaka Metropolitan Area, Narayanganj Sadar, and Bandar areas. The Dhaka Metropolitan Area, Narayanganj Sadar, and Bandar areas are also observed as having a higher population growth difference between dynamic areas and persistent areas, meaning that the influx of poor migrants is predominantly confined to the dynamic land parcels of Kaliganj and its nearby areas.

**Figure 19.** (**a**) Spatial distribution of the average likelihood of poverty (ALP) (i.e., the average probability of living on less than \$2.50/day) of 2013 [62]; (**b**) slum settlement areas (derived from Gruebner et al. [63]) superimposed on the standard deviation (Std. Dev.) of the average likelihood of poverty (ALP) mapping of 2013 [62]; (**c**) sub-district-wise spatial distribution of the average likelihood of poverty (ALP); (**d**) sub-district-wise spatial distribution of the standard deviation (Std. Dev.) of the average likelihood of poverty (ALP).

However, the standard deviation mapping of the average likelihood of poverty (ALP) shows that the DMDP area is characterized by a high variation in poverty concentration, which ranges from 2.8 to 6.5 (Figure 19b). Although a higher concentration of poverty is found in the Kaliganj, Rupganj and Gazipur Sadar areas, the Dhaka Metropolitan Area is observed with the highest deviation in the average likelihood of poverty (ALP), meaning that inequalities in social stratification within predominantly urban areas are comparatively higher.
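The distinction between a high mean ALP and a high deviation in ALP can be made concrete with a small zonal-statistics sketch. It assumes each cell carries a zone identifier and an ALP value; the zone names and values are hypothetical, and the use of the population standard deviation (rather than the sample one) is an assumption, since the source does not specify which was used.

```python
from collections import defaultdict
from statistics import mean, pstdev

def zonal_stats(zone_ids, values):
    """Return {zone: (mean, population standard deviation)} for per-cell values."""
    groups = defaultdict(list)
    for zone, value in zip(zone_ids, values):
        groups[zone].append(value)
    return {zone: (mean(vals), pstdev(vals)) for zone, vals in groups.items()}

# Hypothetical cells: zone_a is uniformly poor, zone_b is less poor on average
# but far more unequal.
zone_ids = ["zone_a", "zone_a", "zone_a", "zone_b", "zone_b", "zone_b"]
alp = [60.0, 60.0, 60.0, 10.0, 30.0, 50.0]
stats = zonal_stats(zone_ids, alp)
```

In this toy case `zone_a` has the higher mean and zero deviation, while `zone_b` has the lower mean and the larger deviation, which mirrors the contrast drawn between the peripheral sub-districts and the Dhaka Metropolitan Area.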

Most parts of the Dhaka Metropolitan Area and Narayanganj Sadar area are already predominantly urbanized, while, through increasing peri-urbanization, the Savar sub-district is in the queue to be predominantly urbanized shortly. Consequently, the Dhaka Metropolitan Area, Narayanganj Sadar area, and Savar are derived as the sub-districts of higher poverty deviation (Figure 19d). Thus, the deviation in poverty concentration appears to be correlated with urbanization. For example, the Dhaka Metropolitan Area, as the most urbanized, ended up with the highest poverty deviation. Such a deviation in poverty intensification can further be illustrated by the abundance of slum settlements which are scattered throughout the DMDP area (Figure 19b).

#### **4. Discussion**

#### *4.1. Factors Affecting Peri-Urbanization*

While the horizontal expansion of the DMDP area is remarkably constrained by the unavailability of flood-free landscapes, the northern part of the DMDP area encountered the highest growth rate of population and subsequent peri-urbanization pressure. The DMDP's northern periphery, in general, lies at a higher elevation and is therefore naturally free from seasonal flooding. Thus, the findings reveal that peri-urbanization for the DMDP context is interpreted as an interaction space of elevation and population growth rate. Consequently, in-migration of the population to the capital city Dhaka plays a significant role for this perceived peri-urbanization. Yet, such an in-migration pattern is profoundly interlinked to the unique socio-economic settings of this country.

For example, Dhaka's share of the national gross domestic product (GDP) is USD 162 billion (i.e., 40%) [64], implying a heavy concentration of economic activities in the capital city Dhaka. Consequently, Dhaka disproportionately holds nearly one third (i.e., 31.8%) of the country's total employment [21]. In general, Bangladesh's economic base is predominantly agricultural (i.e., 49%), and 45% of the country's labor force primarily relies on agriculture for employment [65]. In addition, 84% of the rural population directly or indirectly depends on agriculture [66]. The disproportionate growth of agriculture-based industries coupled with the adoption of more efficient farming technologies triggers an exodus of surplus agricultural laborers to the capital city Dhaka [60]. Consequently, 63% of Dhaka's population growth is due to rural–urban migration [67].

The majority of the migrants rely on the informal sector. As a result, the informal sector accounts for 84.30% of Dhaka's total employed population [21]. Thus, anticipating the social changes in response to economic growth becomes more difficult. Consequently, the task of addressing the actual spatial growth pattern and accommodating the informal economy becomes highly challenging.

In addition, Dhaka's per capita GDP is around 3 times higher than the national average. If Dhaka were an independent nation, its GDP alone would rank as the 50th largest economy in the world. Consequently, the excessive agglomeration driven by the capital city-centered development paradigm makes the DMDP's peri-urban growth highly unpredictable and challenging. Thus, DMDP's physical growth predominantly implies a more densified peri-urban growth which agglomerates overestimated population with inadequate infrastructural supports. The scarcity of resources further hinders the timely upgrading of supporting infrastructure and provisions (e.g., land use–transport integration [68,69], utilities, and employment). As a result, policymakers and planners are severely hindered in promoting planned development in the peripheries.

#### *4.2. Implications for Growth Management and Natural Hazards*

Bangladesh is one of the countries most vulnerable to the impacts of climate change. Such impacts are already evident, as climate change-induced flooding (e.g., river flooding and storm water flooding) often strikes the DMDP area [25]. Most parts of the predominantly urban areas (i.e., around 67%) are considered to be free from river flooding because of the proposed protective embankments that surround them [60]. Nevertheless, the eastern embankment proposed in the DMDP Urban Area Plan (1995–2005) as yet remains incomplete. In addition, the portion of predominantly urban areas that is considered to be flood-free severely lacks an efficient drainage system [70]. The analysis reveals that the DMDP's rainfall pattern from 2000 to 2018 is rather abrupt. Thus, heavy rainfall often occurs, resulting in frequent storm water flooding each year irrespective of the monsoon. Such a finding is also evident in other studies [25].

Nevertheless, due to their proximity to nearby rivers, the remaining 23% of predominantly urban areas seem to be highly vulnerable to river flooding. Low-income people predominantly live in slums, and the slums are scattered throughout predominantly urban areas. In addition, 80% of these slums are built on privately owned land and hence pose considerable institutional challenges in providing basic urban facilities (e.g., water supply, sanitation, sewerage, drainage, and electricity) in these slum areas [30]. On that very point, Baker [30] estimated that slum areas within 50 m of nearby rivers accommodate around 76,000 households that are at high risk of frequent river flooding. Such river flooding usually occurs with a return period of 10 to 40 years [71].

However, despite the geographical limitations on further physical growth, Dhaka's unleashed population growth has not been hindered at all. For example, the analysis reveals that within these 19 years (2001–2020), the DMDP's population has increased at a rate of 0.5 million/year. Although the urban population as yet comprises less than 37.41% of the nation's population [72], this share reaches nearly 80% in the DMDP context. The reported estimate of the DMDP's annual population increase is 0.10 million/year higher than the earlier estimate by the World Bank [32]. From 2001 to 2020, the DMDP's population increased at an annual growth rate of 4.53%. This growth rate is much higher than the national average. For example, between 2000 and 2019, the national annual population growth rate was derived as 1.5% [73].

In terms of population size, Dhaka has secured the 19th position globally among the top 20 megacities, but stands first with a large margin in terms of the urban population density of 33,878 persons/km<sup>2</sup> [41]. Among the top 20 megacities, only six have constantly experienced a population growth rate of more than 3% over the last 20 years [74]. Hence, the reported rate of DMDP's population growth is much higher than its contemporaries.

Consequently, due to an unabated growth in the DMDP's population, higher population concentration is also observed in the north-eastern periphery of the DMDP area (i.e., Rupganj and Kaliganj), which are predominantly flood-prone. In addition, these two areas are the hot spots of poverty with a lower standard deviation in the average likelihood of poverty index, meaning that these two areas predominantly accommodate the poorest portion of the DMDP's area. Thus, it can be inferred that poverty is forcing the poor migrants to live in these flood-prone areas. This way, the majority of the in-migrants, which also comprise climate migrants [75], are redundantly exposed to this climate change-induced frequent flooding. Such a finding poses a grave concern and brings forth the issue of climate justice for further consideration. On that very point, Ahsan [76] urged the adoption of area-specific growth policies in securing the basic rights for climate migrants of Bangladesh. Similarly, developing sound climate change mitigation policies is deemed critical to minimize climate migrant numbers across the globe [77].

Furthermore, this unleashed population influx to the capital city Dhaka worsens the prevailing water scarcity at large. For example, in 2005, the Dhaka Water Supply and Sanitation Authority (DWASA) was able to manage 1.6 mm<sup>3</sup>/day groundwater extraction at maximum against the minimum demand of 2.1 mm<sup>3</sup>/day [78]. In addition, since 1986, the DMDP's groundwater table has been depleted at a rate of 2 m/year [78]. While the DMDP area is suffering from evident water scarcity, the gradual groundwater depletion coupled with the unprecedented population growth further exacerbates this shortfall.

Nonetheless, due to the prevalence of wetlands, small water bodies, canals, and rivers, the groundwater level is relatively higher in the peripheral areas than the central part of the DMDP area. Unfortunately, peri-urbanization is predominantly taking place in the northern and north-eastern peripheries of the DMDP area, which are the hotspots of industrial clusters and agglomerate more than 2000 ready-made garments (RMG) industries [79]. This RMG sector individually has contributed to 11.17% of the total GDP and makes Bangladesh the second-largest global exporter in the RMG sector [31]. The economic growth of this country is enormously linked to the development of this RMG sector [80]. Hence, the country's RMG sector possesses great potential to grow further.

Nevertheless, this RMG sector relies heavily on groundwater extraction, because surface water is not available throughout the year. Thus, over-dependence on groundwater extraction coupled with increased peri-urbanization poses significant environmental threats to these peri-urban growth pockets, including land subsidence, increased earthquake vulnerability, and ecosystem degradation.

#### **5. Conclusions**

Although managing urban growth based on a demarcated peri-urban growth boundary as yet remains rarely practiced, explicitly defined peri-urban growth pockets appear to be a facilitating tool for promoting more rationalized land use practices in the periphery. Particularly in the context of developing countries, where rampant peri-urbanization frequently occurs, quantifying the magnitude and degree of fuzziness of peri-urban land parcels facilitates policymakers in identifying the areas with more urbanities and higher transition potentials. This way, policymakers and planners will be well equipped in dealing with the dual characteristics (i.e., urban or rural) of peri-urban areas and resultant growth directions, and thereby to identify priority areas for immediate intervention.

Heikkila et al. [81] first recognized the fuzzy characteristics of peri-urban areas, and statistically quantified the level of urbanization and degree of fuzziness in detecting the urban growth pattern of Ningbo, China by using Landsat data. However, they did not map the spatial distribution of peri-urban areas at all, whereas this paper demarcated the spatial distribution of peri-urban growth pockets by unveiling the fuzzy characteristics of peri-urban areas. In doing so, this study developed fuzzy linear urban membership function set images using NTL data.
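As an illustration of what a fuzzy linear urban membership function looks like, the sketch below maps an NTL radiance value to a membership degree between 0 and 1 and then assigns a class. The breakpoints `low` and `high` and the class cut-offs `rural_max` and `urban_min` are hypothetical placeholders; the values used in this study are not restated in this section.

```python
def linear_urban_membership(radiance, low, high):
    """Linear fuzzy membership: 0 at or below `low`, 1 at or above `high`."""
    if high <= low:
        raise ValueError("high must be greater than low")
    if radiance <= low:
        return 0.0
    if radiance >= high:
        return 1.0
    return (radiance - low) / (high - low)

def classify(membership, rural_max=0.25, urban_min=0.75):
    """Assign a class from a membership degree (cut-offs are hypothetical)."""
    if membership <= rural_max:
        return "PRURAL"
    if membership >= urban_min:
        return "PURBAN"
    return "PU"

# Hypothetical radiance values and breakpoints.
memberships = [linear_urban_membership(r, low=0.0, high=10.0) for r in (1.0, 5.0, 9.0)]
labels = [classify(m) for m in memberships]
```

Cells with intermediate membership are the ones treated as peri-urban in this sketch, which reflects the idea that peri-urban land is neither fully urban nor fully rural.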

According to the findings of Mortoja et al.'s [20] study, few studies used NTL data for peri-urban demarcation (e.g., [82–85]). While all the studies as mentioned above used the crisp values of NTL data, none of the studies, to date, investigated the potential of NTL data in unveiling the fuzzy characteristics of peri-urban areas. In addition, the conventional methodological approaches—which utilized the crisp values of NTL data for peri-urban demarcation—are claimed to be suitable to demarcate peri-urban areas at the national/global level only. Meanwhile, the ground-truthing outcome on NTL VIIRS data mapping of 2016 implies that the methodological approach reported in this study also appears to be a good fit at the regional level. Thus, quantifying the fuzzy characteristics of peri-urban areas and translating those findings into demarcating peri-urban growth pockets using NTL data provides a strong theoretical basis for this study.

Nevertheless, this study did not test other peri-urban demarcation approaches. Hence, how the methodological approach reported in this study prevails over others in terms of accuracy cannot be determined. However, the adapted methodological approach provides an easy way of demarcating peri-urban areas by using the readily available datasets at hand, which in turn facilitates prompt executions of peri-urban growth decisions.

In addition, this study explores how peri-urbanization is interpreted as an interaction space of population growth, poverty, and increased vulnerability to natural hazards, particularly flooding. In this way, the research demonstrates the problem of rampant peri-urbanization and the subsequent fragilities affecting large cities (e.g., social and poverty problems and natural hazards). Thus, the results are of interest to a wide audience and are not limited to the remote sensing community.

**Author Contributions:** Conceptualization, T.Y.; methodology, M.G.M. and T.Y.; software, M.G.M.; validation, M.G.M.; formal analysis, M.G.M.; investigation, M.G.M. and T.Y.; resources, M.G.M. and T.Y.; data curation, M.G.M.; writing—original draft preparation, M.G.M.; writing—review and editing, M.G.M. and T.Y.; visualization, M.G.M.; supervision, T.Y. Both authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors. The first author thanks Queensland University of Technology (QUT) for providing the necessary support to carry out this research. This support has been provided through the Commonwealth-funded 'Research Training Program (RTP)' and 'QUT Higher Degree by Research (HDR) Tuition Fee Waiver' scholarships. The authors also thank the Bangladesh Water Development Board, the Capital Development Authority (RAJUK), WorldPop, and the US Geological Survey (USGS) for assisting this research with datasets. Finally, the authors thank the special issue guest editors and three anonymous referees for their invaluable comments on an earlier version of the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Improving Spatial Agreement in Machine Learning-Based Landslide Susceptibility Mapping**

**Mohammed Sarfaraz Gani Adnan <sup>1,2,</sup>\*, Md Salman Rahman <sup>3</sup>, Nahian Ahmed <sup>4</sup>, Bayes Ahmed <sup>5,6</sup>, Md. Fazleh Rabbi <sup>7</sup> and Rashedur M. Rahman <sup>4</sup>**


Received: 6 September 2020; Accepted: 12 October 2020; Published: 14 October 2020

**Abstract:** Despite yielding considerable degrees of accuracy in landslide predictions, the outcomes of different landslide susceptibility models are prone to spatial disagreement and, therefore, uncertainties. Uncertainties in the results of various landslide susceptibility models create challenges in selecting the most suitable method to manage this complex natural phenomenon. This study aimed to propose an approach to reduce uncertainties in landslide prediction by diagnosing spatial agreement in machine learning-based landslide susceptibility maps. It first developed landslide susceptibility maps of Cox's Bazar district of Bangladesh, applying four machine learning algorithms: K-Nearest Neighbor (KNN), Multi-Layer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM), featuring hyperparameter optimization of 12 landslide conditioning factors. All four models yielded very high prediction accuracy, with area under the curve (AUC) values ranging from 0.93 to 0.96. The assessment of spatial agreement of landslide predictions showed that the pixel-wise correlation coefficients of landslide probability between various models range from 0.69 to 0.85, indicating uncertainty in the landslides predicted by various models, despite their considerable prediction accuracy. The uncertainty was addressed by establishing a Logistic Regression (LR) model, incorporating the binary landslide inventory data as the dependent variable and the results of the four landslide susceptibility models as independent variables. The outcomes indicated that the RF model had the highest influence in predicting the observed landslide locations, followed by the MLP, SVM, and KNN models. Finally, a combined landslide susceptibility map was developed by integrating the results of the four machine learning-based landslide predictions. The combined map resulted in better spatial agreement (correlation coefficients range between 0.88 and 0.92) and greater prediction accuracy (0.97) compared to the individual models. The modelling approach followed in this study would be useful in minimizing uncertainties of various methods and improving landslide predictions.

**Keywords:** landslides; remote sensing; uncertainty; K-Nearest Neighbor; Multi-Layer Perceptron; Random Forest; Support Vector Machine

#### **1. Introduction**

Due to their destructive potential, landslides pose a serious threat to human life, property, and the environment in the areas in which they occur [1,2]. Access to continuous and accurate information on landslide occurrence is essential for managing the risk of this unpredictable hazard [2,3]. Mapping landslide susceptibility is a widely adopted approach to estimating the likelihood of occurrence of this complex natural phenomenon [1,3–5]. The development of remote sensing technologies in the last few decades enables researchers to map landslide susceptibility more efficiently, due to the availability of high spatial and temporal resolution data [3,4,6]. For instance, high-resolution remote sensing (satellite imagery) data are used to develop various thematic layers explaining the topography, land cover, geology, and hydrology, which are essential parameters for predicting landslides [4,7]. Remote sensing techniques are also useful in developing accurate landslide inventory maps [3,6].

Along with the quality of available data, the choice of appropriate methodology is essential for developing reliable susceptibility maps [2]. During the last several decades, many landslide susceptibility models have been developed based on the geographic information system (GIS) and remote sensing technology [8]. Examples of such models include the weights-of-evidence [9,10], multivariate regression analysis [10,11], analytical hierarchy process [12], and the evidential belief function [13]. Applications of various machine learning algorithms in landslide susceptibility mapping (LSM) have evolved in recent decades. As a widely applicable method in data mining, the K-Nearest Neighbor (KNN) algorithm made early appearances in landslide prediction [14,15]. The Logistic Regression (LR) [2,12] and Support Vector Machine (SVM) [8,16] models also gained much popularity as adaptive systems for LSM [15]. Artificial neural networks in the form of a Multi-Layer Perceptron (MLP) were also used for this task [17]. More recently, evidence from various studies indicates that ensembles such as the Random Forest (RF) model can improve machine learning-based landslide prediction [1,18]. However, the outcomes of landslide susceptibility mapping could be subject to considerable uncertainties due to errors and variability in model choice, data used, system understanding, weighting factors, and human judgment [19,20].

Since access to accurate landslide prediction maps is a prerequisite for decision-makers, the results must be carefully analyzed and critically reviewed before being disseminated to end-users [9]. While developing landslide susceptibility maps, challenges may arise in (i) measuring the accuracy of a susceptibility assessment [21], and (ii) selecting an "optimal" combination of methods for susceptibility assessments [22]. Most of the validation processes of LSM consist of two steps: simulating landslide susceptibility and comparing the predicted results with the observed landslide locations [1,9]. Validation techniques must possess qualities such as reliability, robustness, degree of fitting, and prediction skill [21]. However, the performance evaluation of most LSMs was carried out based on the testing datasets only [9,23]. A similar performance of multiple models at the testing landslide locations therefore does not guarantee the same degree of agreement in terms of spatially predicted patterns [9].
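The point that equal test performance need not imply equal predictions can be illustrated with a minimal sketch. The `auc` function below uses the rank-based definition (the share of landslide and non-landslide pairs that are ordered correctly, with ties counted as half); the labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical test locations: 1 = landslide, 0 = non-landslide.
labels = [1, 1, 0, 0]
model_a = [0.90, 0.60, 0.40, 0.10]
model_b = [0.55, 0.95, 0.05, 0.45]
auc_a = auc(labels, model_a)
auc_b = auc(labels, model_b)
```

Both toy models rank every landslide location above every non-landslide location, so their AUC values are identical, yet the probabilities they assign to individual locations differ considerably.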

Whilst many recent studies applied various combinations of machine learning algorithms to map landslide susceptibility [23–26], pixel-wise agreement in landslide prediction between various methods is inadequately understood. The resultant spatial heterogeneity in landslide prediction with different techniques creates uncertainties in LSM [9,19,27]. To address this challenge, this study aimed to propose a method to reduce uncertainties in landslide prediction. Therefore, it evaluated the extent of agreement of landslide prediction maps generated by applying four different machine learning algorithms. A combined landslide prediction map was developed by integrating the results of these four models. The study was carried out in Cox's Bazar district of Bangladesh (Figure 1).
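The abstract describes the integration step as a Logistic Regression on the outputs of the four models. The sketch below shows only the prediction form of such a model, a logistic function of a weighted sum of the four probabilities. The coefficients and intercept are hypothetical placeholders, ordered RF > MLP > SVM > KNN merely to echo the reported ranking of influence; they are not the fitted values from the study.

```python
from math import exp

def combine(probabilities, coefficients, intercept):
    """Logistic combination of per-model landslide probabilities for one pixel."""
    z = intercept + sum(coefficients[name] * p for name, p in probabilities.items())
    return 1.0 / (1.0 + exp(-z))

# Hypothetical coefficients (not the fitted values from the study).
coefficients = {"RF": 3.0, "MLP": 2.0, "SVM": 1.5, "KNN": 1.0}
intercept = -3.5

low_pixel = combine({"RF": 0.1, "MLP": 0.2, "SVM": 0.1, "KNN": 0.3}, coefficients, intercept)
high_pixel = combine({"RF": 0.9, "MLP": 0.8, "SVM": 0.9, "KNN": 0.7}, coefficients, intercept)
```

In practice the coefficients would be estimated from the landslide inventory, with the binary landslide label as the dependent variable.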

**Figure 1.** (**a**) Location map of Cox's Bazar district in Bangladesh; and (**b**) the sub-districts of Cox's Bazar district. Digital Elevation Model (DEM) source: [28].

#### **2. Materials and Methods**

The study was conducted in three stages. First, landslide susceptibility maps (LSMs) of the study area were developed using four machine learning algorithms: K-Nearest Neighbor (KNN), Multi-Layer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM). Second, the extent of spatial agreement of predicted patterns in LSMs was assessed by estimating the pixel-wise correlation of landslide probabilities obtained using various methods. Finally, an LSM was developed by combining results from the four machine learning models (Figure 2). This study also estimated population exposure to landslides by overlaying gridded population layers of the year 2020, collected from WorldPop [29] and UNHCR [30], on LSMs.
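The second stage can be sketched as follows, assuming each susceptibility map has been flattened to a list of per-pixel probabilities on a common grid. The use of the Pearson coefficient is an assumption, since this section does not name the correlation measure, and the probability values are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

def pairwise_agreement(maps):
    """Correlation for every pair of model maps, keyed by (model, model)."""
    return {(a, b): pearson(maps[a], maps[b]) for a, b in combinations(sorted(maps), 2)}

# Hypothetical per-pixel landslide probabilities from the four models.
maps = {
    "KNN": [0.10, 0.40, 0.50, 0.80, 0.90],
    "MLP": [0.20, 0.30, 0.60, 0.70, 0.95],
    "RF": [0.05, 0.50, 0.40, 0.90, 0.85],
    "SVM": [0.30, 0.20, 0.70, 0.60, 0.80],
}
agreement = pairwise_agreement(maps)
```

Four models yield six pairwise coefficients; the lower the coefficients, the larger the spatial disagreement between the maps.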

#### *2.1. Study Area*

This study addressed the Cox's Bazar district, which is located in the south-eastern region of Bangladesh (Figure 1a). The study area lies between latitude 20°53′46.7″ N and 21°14′29.8″ N, and longitude 92°02′08.2″ E and 92°18′27.0″ E. It comprises seven (out of eight) sub-districts (locally termed *Upazilas*) of Cox's Bazar district (Figure 1b). The low-lying areas such as Kutubdia sub-district, part of Maheshkhali sub-district, and Saint Martin's island (Figure 1b) were not considered in this study. The study area is diverse and unique, both in terms of ecosystem services and biodiversity, and is currently an epitome of global geopolitics as it is accommodating over one million Rohingya refugees. It is characterized by relatively high elevation land (mean elevation is 18 m), compared to the rest of the country. At present, approximately 3.4 million people inhabit 1869 km<sup>2</sup> of land (estimated using data from WorldPop [29] and UNHCR [30]).
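GIS software usually expects extents in decimal degrees rather than degrees, minutes, and seconds. A minimal conversion sketch for the bounding coordinates above follows; the function name `dms_to_decimal` is hypothetical.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    sign = -1.0 if hemisphere in ("S", "W") else 1.0
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)

# Bounding coordinates of the study area as given in the text.
south = dms_to_decimal(20, 53, 46.7, "N")
north = dms_to_decimal(21, 14, 29.8, "N")
west = dms_to_decimal(92, 2, 8.2, "E")
east = dms_to_decimal(92, 18, 27.0, "E")
```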

**Figure 2.** The process of evaluating spatial agreement among various machine learning technique-based landslide susceptibility maps and optimizing the landslide prediction map. LSM = landslide susceptibility maps; KNN = K-Nearest Neighbor; MLP = Multi-Layer Perceptron; RF = Random Forest; SVM = Support Vector Machine.

The area receives an annual mean precipitation of 4288 mm [31]. The heavy rainfall triggers both flash floods and landslides in this area [11,12]. The majority of the historical landslides in Bangladesh occurred in this region [11]. For instance, a major landslide triggered by heavy rain in June 2017 killed at least 156 people in the south-eastern hilly region of Bangladesh where the study area is located [32]. Unplanned urbanization, rapid growth of population, hill cutting, and deforestation are associated with the recent increase in landslide hazards [11,31]. Notably, Rohingya refugee camps, especially the Kutupalong camp of Ukhia sub-district (Figure 3), are located in areas that are highly susceptible to landslides. The Kutupalong camp is considered the most densely populated refugee settlement area in the world, where around 75,000 people live per km<sup>2</sup> [31,33]. Any catastrophic landslide would cause significant loss of human lives and damage to assets. Hence, an accurate assessment of landslide susceptibility is paramount for developing a plan for landslide risk management.

#### *2.2. Landslide Inventory Mapping*

Landslide inventory mapping is one of the essential steps for landslide prediction and susceptibility mapping. This study utilized the landslide inventory map developed by Ahmed, Rahman, Sammonds, Islam, and Uddin [11] (Figure 3). They developed the latest landslide inventory map of the Cox's Bazar district by retrieving historical landslide information from newspapers and various organizations, and later verified it with global positioning system (GPS) and reconnaissance surveys. This study also used information on landslide movement type, distribution and style, rate of flow, damage, volume of displacement, material, and cause of movement, drawn from the landslide investigation form of Ahmed, Rahman, Sammonds, Islam, and Uddin [11]. A total of 1262 sample locations were used, of which 670 were landslide and 592 were non-landslide locations. To develop the models, it is necessary to obtain non-landslide cells (where landslides did not occur). From the existing literature, Huang et al. [34] identified three methods for obtaining non-landslide grid cells: (i) the seed cell procedure; (ii) randomly selecting non-landslide locations from landslide-free areas; and (iii) selecting non-landslide locations in areas with a slope lower than 2°. This study followed the second approach and selected random locations within the study area where landslides did not occur. These cells provided the models with the necessary data during the training stage [35,36]. The sample locations were split into two sets: (i) 60% of the locations (52% landslide and 48% non-landslide) were used to train the machine learning-based landslide prediction models, and (ii) the remaining 40% (54% landslide and 46% non-landslide) were used to evaluate the performance of the models (Figure 2).
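The 60/40 division of the sample locations can be illustrated with a minimal sketch. It assumes a simple random shuffle; the seed, the integer identifiers, and the helper name `split_samples` are hypothetical, and the authors' actual splitting procedure is not described beyond the proportions reported above.

```python
import random

def split_samples(sample_ids, train_fraction=0.6, seed=42):
    """Shuffle the sample identifiers and split them into training and testing sets."""
    shuffled = list(sample_ids)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary, assumed value
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# 1262 sample locations, as reported in the text; the integer IDs are placeholders.
train_ids, test_ids = split_samples(range(1262))
print(len(train_ids), len(test_ids))
```

A purely random shuffle does not guarantee the class proportions reported for each subset; a stratified split would be needed to control them.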

**Figure 3.** Landslide inventory map of this study. Data source: [11].

#### *2.3. Landslide Conditioning Factor*

The performance of LSMs depends on the choice of landslide conditioning factors. Numerous studies on LSM have been conducted based on machine learning techniques [1,16,18,23,26,37,38], using various combinations of landslide conditioning factors. However, the selected factors should be (i) chosen according to their degree of affinity with landslide locations, (ii) measurable, (iii) non-redundant, and (iv) consistent with knowledge of the geomorphological characteristics of the area under study [2]. Based on the literature as well as expert knowledge of the study area, a total of 12 variables were selected in this study (Table 1). Areas with an elevation of less than 5 m, as well as waterbodies and sandy sea beach areas (waterbody and restricted in Figure 4), were excluded from the LSMs [39].

**Figure 4.** Landslide conditioning factors. SPI = Stream Power Index. NDVI = Normalized Difference Vegetation Index.

Topographical and hydrological parameters including aspect, elevation, slope, curvature, and Stream Power Index (SPI) are important factors that limit the density and spatial extent of landslides [2,37,38,40]. Raster maps of aspect, elevation, curvature, slope, and SPI were derived at 30-m spatial resolution from the Advanced Land Observing Satellite (ALOS) Digital Elevation Model (DEM) [28] (Figure 4a–e). Elevation influences landslides primarily by affecting different biophysical parameters and anthropogenic activities. Only a limited number of studies, each conducted on a specific basin, found that landslides occur at certain elevations [41]. Elevation can determine the spatial variability of landslides because it is affected by geological tectonics [37]. It can also influence the occurrence of landslides through other causative factors such as slope, curvature, and SPI [42]. Aspect, which indicates the direction of slope [43], indirectly influences the distribution of landslide locations by affecting the general physiographic trend of the area and/or the main precipitation direction [37,40]. Slope angle is considered one of the most influential factors for the occurrence of landslides, as it affects the concentration of moisture and the level of pore pressure and controls regional hydraulic continuity [2,40]. All of these processes influence slope instability [8]. Curvature is also considered a landslide influencing factor that directly controls the velocity of water flow, delimiting erosion [8,40]. SPI determines the erosion potential of the surface [43] and is considered an essential predictor of landslides [37,38]. High SPI values indicate a higher erosion potential, while negative values suggest no predicted erosion [44,45]. In this study, a layer of SPI was derived using the following equation in GIS:

$$SPI = A_s \times \tan \beta \tag{1}$$

where *A<sub>s</sub>* and β indicate the specific catchment area (m<sup>2</sup>/m) and the slope gradient, respectively [43].
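As a minimal sketch of Equation (1), the following snippet evaluates SPI for a few cells. The cell values and the function name `stream_power_index` are hypothetical, and the slope gradient is assumed to be given in degrees; the study itself derived the layer in GIS.

```python
import math

def stream_power_index(specific_catchment_area, slope_degrees):
    """SPI = A_s * tan(beta), Equation (1); slope is converted from degrees to radians."""
    return specific_catchment_area * math.tan(math.radians(slope_degrees))

# Hypothetical cells: (A_s in m^2/m, slope in degrees).
cells = [(120.0, 5.0), (120.0, 30.0), (15.0, 45.0)]
spi_values = [stream_power_index(a_s, slope) for a_s, slope in cells]
print(spi_values)
```

For a fixed specific catchment area, a steeper slope yields a larger SPI, which matches the interpretation of high SPI values as higher erosion potential.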

It is widely accepted that various geological factors significantly influence the occurrence of landslides, as these factors often lead to differences in the strength and permeability of rocks and soils [2]. This study considered three geological factors: surface geology, soil type, and soil texture (Figure 4i–k). Digital geologic and geophysical data of Bangladesh were collected from the U.S. Geological Survey [46]. The surface geology map of the study area includes a total of 11 classes: water (H2O), Bhuban formation (Miocene, Tb), Dupi Tila formations undivided (QTdd), valley alluvium and colluvium (ava), Girujan clay (Pleistocene and Neogene, QTg), Tipam Sandstone (Neogene, Tt), Boka Bil formation (Neogene, Tbb), beach and dune sand (csd), marsh clay and peat (ppc), Dupi Tila formation (Pleistocene and Pliocene, QTdt), and Dihing formation (Pleistocene and Pliocene, QTdi) (Figure 4i). Primary-level parameters such as soil type and soil texture are essential predictors of landslides. These parameters determine the moisture content, which indicates the degree of stability of the soil [25,37,47,48]. Soil type and soil texture data were collected from the Bangladesh Agricultural Research Council [49].

Other anthropogenic, environmental, and locational factors considered in this study include distance to stream, land cover, normalized difference vegetation index (NDVI), and distance to road (Figure 4f–h,l). The land cover and NDVI maps for the year 2020 were prepared from Landsat satellite images on the Google Earth Engine platform. The land cover map was developed by applying a supervised classification technique with the Random Forest algorithm. For southern Bangladesh, a recent study demonstrated that this method has a higher classification accuracy than other land cover classification techniques [50]. The land cover map contains five classes: bare land, built-up area, crop land, vegetation, and waterbody (Figure 4g). Proximity to roads helps explain the locations of landslides, as the artificial and natural slopes adjacent to a road are sensitive to this hazard [51]. Road-cuts, excavation, and additional load can induce anthropogenic instability of the soil, promoting landslides [2,5]. A layer of distance to the road network was developed using the Euclidean distance algorithm. Likewise, the position of an area relative to natural drainage channels can help explain the locations of landslides [11], as streams may change the stability of an area by eroding the slopes [5,51]. In this study, the stream network was derived from the ALOS DEM, and a map of distance to stream was generated by again applying the Euclidean distance algorithm.


**Table 1.** Landslide conditioning factors used in this study.

#### *2.4. Multi-Collinearity Analysis of Landslide Conditioning Factors*

The selected landslide causative factors could be subject to multi-collinearity; hence, it is necessary to estimate the correlation of independent variables before modelling landslide susceptibility [8]. To eliminate the factors susceptible to multi-collinearity, this study determined variance inflation factors (VIF) [53] of 12 selected landslide conditioning factors using R [54]. VIF is a well-known method to determine the multi-collinearity of landslide conditioning factors [8,55]. A VIF value of a variable exceeding 5 indicates potential serious multicollinearity [53,55]. In this study, the selected landslide conditioning factors yielded VIF values < 2.8, indicating the absence of potential multi-collinearity (Table 1).
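The VIF check can be sketched as follows. The study computed VIF in R; this Python stand-in uses the standard identity that the VIF of factor *j* equals the *j*th diagonal element of the inverse correlation matrix. The three factor columns and all function names are hypothetical illustrations.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (fine for small matrices)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def variance_inflation_factors(columns):
    """VIF_j = 1 / (1 - R_j^2) = j-th diagonal entry of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical values of three conditioning factors at six sample locations.
slope = [5.0, 12.0, 20.0, 8.0, 30.0, 15.0]
elevation = [10.0, 40.0, 55.0, 18.0, 90.0, 35.0]
ndvi = [0.6, 0.2, 0.5, 0.7, 0.3, 0.4]
vifs = variance_inflation_factors([slope, elevation, ndvi])
flagged = [v > 5 for v in vifs]  # threshold of 5, as stated in the text
print(vifs, flagged)
```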

#### *2.5. Landslide Susceptibility Modelling*

#### 2.5.1. Pre-Processing

Using the binary locations (landslide and non-landslide), values of the 12 selected conditioning factors were extracted in a geographic information system (GIS) environment. As evident in Table 1, eight were continuous variables, while the remaining four were discrete. To represent a discrete (categorical) variable semantically, it must be treated as a composite feature in which the number of generated binary features equals the number of categories. The discrete variables were therefore encoded using a one-hot encoding scheme [31], in which multiple binary features represent a single discrete feature. The number of one-hot encoded features depends on the number of classes of the variable. For instance, the geology variable has 11 categories. If a sample location fell within a given geology class, that class was encoded as 1, while the other 10 classes were encoded as 0. The same pre-processing was applied to all other discrete variables. For each variable, the mean and standard deviation were calculated; the mean was then subtracted from each value of the variable, and the result was divided by the standard deviation. This standardization places all variables on a comparable scale, which reduces training time because the optimization routines have a smaller parameter space to traverse.
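The two pre-processing steps described above can be sketched in a few lines. The category codes are a subset of the geology classes listed earlier, the elevation values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
import math

def one_hot(value, categories):
    """Return one binary feature per category: 1 for the matching class, 0 otherwise."""
    return [1 if value == category else 0 for category in categories]

def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

geology_classes = ["Tb", "Tt", "Tbb", "QTg"]    # subset of the 11 classes, for brevity
encoded = one_hot("Tt", geology_classes)

elevation = [12.0, 18.0, 25.0, 40.0, 55.0]      # hypothetical values in metres
scaled = standardize(elevation)
print(encoded, scaled)
```

After standardization each continuous variable has zero mean and unit variance, so no single factor dominates distance-based or gradient-based learners merely because of its units.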

#### 2.5.2. Hyperparameter Optimization

Hyperparameter optimization can improve the accuracy of machine learning algorithm-based models. The process aims to select the optimal hyperparameter values according to an evaluation index [56]. Three approaches are frequently used for optimizing hyperparameters: grid search, random search, and Bayesian optimization [57]. This study applied the grid search technique along with 5-fold cross-validation on the training set to perform hyperparameter optimization. The hyperparameters that provided the best performance were then used for the final training and testing of the respective machine learning models. For instance, the optimal number of neighbors of five in the KNN (Table 2) indicates that the values of the landslide conditioning factors at a given location were compared against those of five other sample locations to obtain the most reliable prediction. Table 2 summarizes the hyperparameters, their search ranges, and the optimal values for the four models.
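Grid search with 5-fold cross-validation can be sketched as below, using the number of neighbors of a simple KNN classifier as the only hyperparameter. The synthetic two-cluster data, the candidate values, the accuracy criterion, and all function names are assumptions for illustration; they do not reproduce the search ranges in Table 2.

```python
import random
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted((sum((a - b) ** 2 for a, b in zip(x, query)), y)
                    for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def cross_val_accuracy(x, y, k, folds=5, seed=0):
    """Mean accuracy over `folds` held-out folds for a given k."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    scores = []
    for fold in range(folds):
        held_out = indices[fold::folds]
        held_out_set = set(held_out)
        train = [i for i in indices if i not in held_out_set]
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        correct = sum(knn_predict(train_x, train_y, x[i], k) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / folds

def grid_search(x, y, candidate_ks):
    """Evaluate every candidate and return the best one with all scores."""
    scores = {k: cross_val_accuracy(x, y, k) for k in candidate_ks}
    return max(scores, key=scores.get), scores

# Synthetic training set: two well-separated clusters standing in for the two classes.
rng = random.Random(42)
features = ([(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
            + [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(30)])
labels = [0] * 30 + [1] * 30
best_k, cv_scores = grid_search(features, labels, candidate_ks=[1, 3, 5, 7])
print(best_k, cv_scores)
```

Only the training set is used in the search, so the held-out 40% remains untouched for the final evaluation.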


**Table 2.** Hyperparameters, search range, and optimal values of the machine learning-based landslide susceptibility models.

#### 2.5.3. Machine Learning Models

(1) K-Nearest Neighbor (KNN)

The KNN algorithm assigns an instance to the class (landslide or non-landslide) that is most represented among its *k* nearest neighbors. The parameter *k* is often a small positive integer [58]. The proximity between samples is measured using a distance metric, which indicates how similar or different the profiles of conditioning factors are for any two given samples. Data points with similar conditioning factors have a small feature distance between them. Although the model is simple in terms of hyperparameters, it becomes computationally expensive as the number of samples grows. The landslide susceptibility associated with a certain set of values of conditioning factors is determined by calculating its distance to each training data point (in the high-dimensional feature space). The *k* nearest data points are then used to determine the landslide susceptibility: the dominant susceptibility class among those *k* nearest neighbors (i.e., the class with the highest number of members among the *k* neighbors) becomes the class membership of the new data point [14,59].

#### (2) Multi-Layer Perceptron (MLP)

Multi-Layer Perceptron (MLP) is a type of neural network with one or more hidden layers. Owing to the presence of hidden layers, internal representations (higher-order intermediate features) can be learned. Each layer consists of one or more neurons; the outputs (activations *a*) can be represented by Equation (2) [31]. The output of the *i*th neuron in the *j*th layer was obtained by calculating the sum of activations from the previous layer (*j* − 1), weighted by the parameters of layer *j*, and then passing the result through an activation function *f*. As there are several types of activation functions (sigmoid, hyperbolic tangent, rectified linear unit), the choice of activation function is addressed in Section 2.5.2 (Hyperparameter Optimization). In this study, since a total of 23 features were derived by one-hot encoding during the pre-processing step, the input layer of the MLP had 23 neurons. The resultant map was represented in terms of the probability of landslide occurrence.

$$a_i^j = f\left(\sum_{k=0}^{n} \omega_k^j a_k^{j-1}\right) \tag{2}$$

where *f* is the activation function, ω<sub>*k*</sub><sup>*j*</sup> is the weight of the *k*th neuron in layer *j*, *a*<sub>*k*</sub><sup>*j*−1</sup> is the activation of neuron *k* in layer *j* − 1 (the previous layer), *j* is the layer index, *i* is the neuron index, and *n* is the number of neurons in layer *j* − 1, over which the sum runs.
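Equation (2) can be illustrated with a minimal forward pass through one layer. The sketch assumes that index *k* = 0 is a bias term with a constant activation of 1 and that each neuron has its own weight row; both are assumptions, as are the weights, the sigmoid activation, and the three-feature input (the study's network had 23 inputs).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(prev_activations, weights, activation=sigmoid):
    """Apply Equation (2) to every neuron of a layer.

    weights[i][k] links neuron k of the previous layer to neuron i of this layer;
    column 0 is treated as a bias with a fixed activation of 1 (an assumption).
    """
    inputs = [1.0] + list(prev_activations)
    return [activation(sum(w * a for w, a in zip(row, inputs))) for row in weights]

# Hypothetical network: 3 standardized inputs -> 2 hidden neurons -> 1 output.
features = [0.4, -1.2, 0.7]
hidden_weights = [[0.1, 0.8, -0.5, 0.3],
                  [-0.2, 0.4, 0.9, -0.6]]
output_weights = [[0.05, 1.1, -0.7]]

hidden = layer_forward(features, hidden_weights)
probability = layer_forward(hidden, output_weights)[0]
print(hidden, probability)
```

With a sigmoid on the output neuron, the final activation lies between 0 and 1 and can be read as a landslide probability.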

#### (3) Random Forest (RF)

Random Forest (RF) is considered a powerful ensemble-learning method that can be applied to classification, regression, and unsupervised learning [18]. This method has been widely applied in landslide susceptibility mapping [18,56,60]. Ensemble models generally train several weak learners and then aggregate their outputs to obtain more reliable predictions. The RF algorithm builds weak learners in the form of decision trees and takes the mean of the outputs of the individual weak learners, as shown in Equation (3). Each weak learner (*b*) corresponds to a function *f<sub>b</sub>*(*x*). The RF uses bootstrap aggregating, in which the weak learners are trained in parallel [31].

$$\hat{F}(\mathbf{x}) = \frac{1}{B} \sum_{b=1}^{B} f_b(\mathbf{x}) \tag{3}$$

where *F̂*(*x*) is the ensembled prediction from the weak learners, *B* is the total number of weak learners, *b* is the weak learner index, and *f<sub>b</sub>*(*x*) is the function for the *b*th weak learner.
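The averaging in Equation (3) can be sketched with bootstrap-aggregated decision stumps on a single feature. This is a deliberately reduced illustration: a real random forest grows full trees and also samples features at each split. The slope values, labels, and function names are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Weak learner: pick the threshold with the fewest errors; predict 1 if x > threshold."""
    best_threshold, best_errors = None, float("inf")
    for threshold in sorted(set(xs)):
        errors = sum((x > threshold) != bool(y) for x, y in zip(xs, ys))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return lambda x, t=best_threshold: 1.0 if x > t else 0.0

def bagged_ensemble(xs, ys, n_learners=25, seed=0):
    """Train each weak learner on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    learners = []
    for _ in range(n_learners):
        sample = [rng.randrange(n) for _ in range(n)]
        learners.append(fit_stump([xs[i] for i in sample], [ys[i] for i in sample]))
    return learners

def ensemble_predict(learners, x):
    """Equation (3): the mean of the B weak-learner outputs."""
    return sum(f(x) for f in learners) / len(learners)

# Hypothetical slope angles (degrees) with landslide (1) / non-landslide (0) labels.
slopes = [5, 8, 10, 12, 25, 30, 35, 40]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = bagged_ensemble(slopes, labels)
print(ensemble_predict(forest, 20))
```

Because the output is the fraction of weak learners voting for a landslide, it behaves like a probability between 0 and 1.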

#### (4) Support Vector Machine (SVM)

Support Vector Machine (SVM) is also a widely used machine learning algorithm in landslide susceptibility mapping [8,16,23,26]. This supervised learning method separates the classes with a decision surface that maximizes the margin between class boundaries [8]. The training locations closest to the optimal hyperplane are called support vectors [23]. If each sample location has *M* features, the objective of the SVM algorithm is to find a hyperplane in the *M*-dimensional feature space that separates the samples of different classes. A hyperplane in ℝ<sup>*M*</sup> has *M* − 1 dimensions: a hyperplane in ℝ<sup>2</sup> is a line, a hyperplane in ℝ<sup>3</sup> is a plane, and so on. This hyperplane functions as a decision boundary, which determines the label of a sample (i.e., landslide or non-landslide). The margin around the hyperplane is defined such that a value of 1 or more denotes a positive sample (landslide) and a value of −1 or less denotes a negative sample (non-landslide). If *X* (*X*<sub>1</sub>, *X*<sub>2</sub>, . . . , *X<sub>n</sub>*) is the vector of landslide affecting factors and *Y<sub>j</sub>* (*Y*<sub>1</sub>, *Y*<sub>2</sub>) is the vector of landslide (1) or non-landslide (0) events, the optimal hyperplane can be found by solving Equation (4) [26].

$$f(\mathbf{x}) = \text{sign}\left[\sum_{i=1}^{n} \alpha_i Y_j \, k(X, X_i) + k\right] \tag{4}$$

where *k* is the offset of the hyperplane from the origin, *n* is the total number of factors that affect landslides, α<sub>*i*</sub> is a positive real constant, and *k*(*X*, *X<sub>i</sub>*) is the kernel function. To classify the binary events (landslide or non-landslide), the condition for solving Equation (4) was assumed as below:

$$Y_j\left[\boldsymbol{\omega}^T \boldsymbol{\varphi}(\mathbf{x}_i) + \mathbf{c}\right] \ge 1 \iff \begin{cases} \boldsymbol{\omega}^T \boldsymbol{\varphi}(\mathbf{x}_i) + \mathbf{c} \ge 1 & \text{if a landslide event occurs } \left(Y_j = 1\right) \\ \boldsymbol{\omega}^T \boldsymbol{\varphi}(\mathbf{x}_i) + \mathbf{c} \le 0 & \text{if a landslide event does not occur } \left(Y_j = 0\right) \end{cases} \tag{5}$$

where ω is the weight vector, φ(*x<sub>i</sub>*) is the feature mapping applied to the landslide affecting factors of sample *i*, and *c* is the offset.
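The kernel decision rule in Equation (4) can be sketched as follows. The radial basis function kernel, the support vectors, the multipliers, and the offset are all hypothetical, and the labels are coded as +1 and −1 (the usual SVM convention) rather than 1 and 0.

```python
import math

def rbf_kernel(x, x_i, gamma=0.5):
    """Radial basis function kernel; gamma is an assumed value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, x_i)))

def svm_decision(x, support_vectors, labels, alphas, offset, kernel=rbf_kernel):
    """Sign of the weighted kernel sum plus offset, following Equation (4)."""
    score = sum(alpha * label * kernel(x, sv)
                for alpha, label, sv in zip(alphas, labels, support_vectors)) + offset
    return 1 if score >= 0 else -1

# Hypothetical support vectors in a two-feature space (standardized values).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]           # -1 = non-landslide, +1 = landslide
alphas = [1.0, 1.0]
offset = 0.0

print(svm_decision((1.8, 2.1), support_vectors, labels, alphas, offset))
```

Only the support vectors enter the sum, which is why the remaining training points do not affect the fitted decision boundary.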

#### 2.5.4. Performance Evaluation Methods

The performance of landslide susceptibility models was evaluated using a well-known method called receiver operating characteristic (ROC) curve and subsequent area under the curve (AUC) [1,11,23,31,43]. The ROC curves were developed using the 40% sample testing data. The ROC curve indicates the performance of a binary classifier system, representing sensitivity as a function of the false positive rate (1-specificity).

The sensitivity of a model is the ratio of the number of true positives to the sum of the numbers of true positives and false negatives. The specificity is the ratio of the number of true negatives to the sum of the numbers of true negatives and false positives. The ROC curve is developed by plotting sensitivity on the y-axis against the false positive rate on the x-axis as the classification threshold varies. The estimated AUC value can be categorized as poor (0.5–0.6), average (0.6–0.7), good (0.7–0.8), very good (0.8–0.9), and excellent (0.9–1) [1,43,60]. In addition, statistical indices such as overall accuracy, precision, recall, and F1-score were estimated by developing a confusion matrix [1,11,43].
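The evaluation indices can be sketched as below. The AUC is computed through its rank interpretation (the probability that a randomly chosen landslide location receives a higher score than a randomly chosen non-landslide location), which equals the area under the ROC curve. The labels, scores, and the 0.5 threshold are hypothetical, and the helpers assume both classes occur among the labels and the predictions.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels coded as 1 and 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def classification_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # recall is the same quantity as sensitivity
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }

def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical test labels and predicted landslide probabilities.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.92, 0.81, 0.40, 0.55, 0.20, 0.07]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(classification_metrics(y_true, y_pred), auc(y_true, scores))
```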

#### *2.6. Evaluation of Spatial Agreement and Optimizing Prediction Map*

To evaluate inter-model agreement, the pixel-wise agreement between each pair of machine learning algorithms was estimated. Pearson's correlation coefficient was therefore estimated for all six possible pairs of machine learning model-based landslide susceptibility maps. Here, Pearson's correlation coefficient is the covariance of the landslide predictions obtained with two algorithms divided by the product of their standard deviations. The coefficient ranges from −1 to +1, where zero indicates no agreement, values up to ±0.29 a low degree, ±0.30 to ±0.49 a moderate degree, ±0.50 to <±1 a high degree, and ±1 perfect agreement [60].
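The pairwise comparison can be sketched as below, with each susceptibility map flattened to a list of pixel probabilities. The five-pixel maps and the function names are hypothetical; real rasters would first be masked to the common study area.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Covariance of two prediction vectors divided by the product of their standard deviations."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    std_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    std_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (std_a * std_b)

def agreement_matrix(maps):
    """Correlation for every unordered pair of models (six pairs for four models)."""
    return {(m1, m2): pearson(maps[m1], maps[m2])
            for m1, m2 in combinations(sorted(maps), 2)}

# Hypothetical landslide probabilities of four models at the same five pixels.
maps = {
    "KNN": [0.10, 0.40, 0.80, 0.60, 0.20],
    "MLP": [0.05, 0.35, 0.90, 0.70, 0.15],
    "RF":  [0.08, 0.30, 0.95, 0.65, 0.10],
    "SVM": [0.20, 0.50, 0.85, 0.75, 0.30],
}
agreement = agreement_matrix(maps)
print(agreement)
```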

Following the evaluation of spatial agreement, an optimized landslide prediction map was developed combining susceptibility maps generated by applying the four machine learning algorithms. The combined map was developed by following a methodology proposed by Rossi, Guzzetti, Reichenbach, Mondini, and Peruccacci [22], where they established a logistic regression (LR) model. The LR model included binary landslide and non-landslide locations as the dependent variable and the results of the four landslide susceptibility models as the independent variables. The obtained regression coefficients were incorporated in Equation (6) [43] in GIS to derive the probability (*P*) of landslides in the study area.

$$P = \frac{1}{1 + e^{-z}} \tag{6}$$

where *z* is the linear combination of independent variables which was estimated using the following equation:

$$z = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n \tag{7}$$

where θ<sub>0</sub> is the intercept of the model, θ<sub>*i*</sub> (*i* = 1, 2, . . . , *n*) indicates the regression coefficients of the independent variables, and *x<sub>i</sub>* (*i* = 1, 2, . . . , *n*) represents the *n* independent variables. The resultant combined model was validated by developing the ROC curve using the 40% testing data.
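Equations (6) and (7) and the fitting step can be sketched as follows. The six training rows (the four model outputs at each location), the learning rate, and the number of iterations are hypothetical, and plain gradient descent is used here only as a stand-in for whatever statistical package fitted the regression in the study.

```python
import math

def combined_probability(model_probs, coefficients, intercept):
    """Equations (6) and (7): logistic transform of the weighted model outputs."""
    z = intercept + sum(c * p for c, p in zip(coefficients, model_probs))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, learning_rate=0.5, epochs=2000):
    """Batch gradient descent on the logistic loss; theta[0] is the intercept."""
    n_features = len(rows[0])
    theta = [0.0] * (n_features + 1)
    for _ in range(epochs):
        gradient = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            error = combined_probability(x, theta[1:], theta[0]) - y
            gradient[0] += error
            for i, value in enumerate(x):
                gradient[i + 1] += error * value
        theta = [t - learning_rate * g / len(rows) for t, g in zip(theta, gradient)]
    return theta

# Hypothetical outputs of the four models (KNN, MLP, RF, SVM) at six inventory locations.
rows = [
    [0.90, 0.80, 0.95, 0.85], [0.70, 0.90, 0.80, 0.75], [0.60, 0.70, 0.90, 0.80],
    [0.20, 0.10, 0.05, 0.15], [0.30, 0.20, 0.10, 0.25], [0.10, 0.30, 0.20, 0.10],
]
labels = [1, 1, 1, 0, 0, 0]   # 1 = landslide, 0 = non-landslide
theta = fit_logistic(rows, labels)
print(combined_probability(rows[0], theta[1:], theta[0]))
```

Applying `combined_probability` to every pixel with the fitted coefficients yields the combined susceptibility surface.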

#### **3. Results**

#### *3.1. Landslide Susceptibility Modelling*

#### 3.1.1. Landslide Prediction

Figure 5 shows the landslide susceptibility maps of Cox's Bazar district developed by applying the four machine learning algorithms. Each landslide probability map was classified into five categories by applying the Jenks natural breaks classification method in GIS: (i) very low (0–0.1), (ii) low (0.11–0.3), (iii) medium (0.31–0.5), (iv) high (0.51–0.85), and (v) very high (0.86–1). As evident in Figure 6, the proportion of landslide susceptible area varied from one model to another. Among all methods, the SVM resulted in the highest proportion of area (38.7%) susceptible to landslides of 'high' and 'very high' severity, while the Random Forest (RF) algorithm yielded a relatively lower proportion (23.1%). Likewise, the share of the population exposed to 'high' and 'very high' landslide susceptible zones varied between algorithms. Across the four methods, the percentage of the population exposed to landslides ranged between 34% and 48% (Figure 6).
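Applying the reported class limits to a probability map can be sketched as below. The snippet does not implement the Jenks algorithm itself; it only reuses the break values quoted above, and the pixel probabilities and function names are hypothetical.

```python
from bisect import bisect_left

CLASS_UPPER_LIMITS = [0.1, 0.3, 0.5, 0.85]     # upper limits reported in the text
CLASS_NAMES = ["very low", "low", "medium", "high", "very high"]

def susceptibility_class(probability):
    """Map a landslide probability to one of the five susceptibility classes."""
    return CLASS_NAMES[bisect_left(CLASS_UPPER_LIMITS, probability)]

def high_share(probabilities):
    """Fraction of pixels falling in the 'high' or 'very high' classes."""
    flagged = sum(susceptibility_class(p) in ("high", "very high") for p in probabilities)
    return flagged / len(probabilities)

# Hypothetical pixel probabilities from one susceptibility map.
pixels = [0.02, 0.15, 0.33, 0.58, 0.91, 0.07, 0.49, 0.86]
print([susceptibility_class(p) for p in pixels], high_share(pixels))
```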

**Figure 5.** Landslide susceptibility maps obtained by four machine learning algorithms: (**a**) K-Nearest Neighbor (KNN), (**b**) Multi-Layer Perceptron (MLP), (**c**) Random Forest (RF), and (**d**) Support Vector Machine (SVM).

**Figure 6.** Landslide susceptible area obtained using five models: K-Nearest Neighbor (KNN), Multi-Layer Perceptron (MLP), Random Forest (RF), Support Vector Machine (SVM), and combined model. Blue dots indicate the proportion of people exposed to 'high' and 'very high' susceptible zones.

#### 3.1.2. Evaluation of Models' Performance

To evaluate the performance of the landslide susceptibility models, a performance matrix was derived using the test samples (40% of the total data) (Table 3). The performance evaluation indices indicated a very high prediction accuracy for all the models. In terms of overall accuracy, the RF classifier achieved the highest value (96.63%), followed by the MLP (95.45%), SVM (94.06%), and KNN (90.69%). However, overall accuracy is an aggregate metric and therefore does not indicate which specific classes were inaccurately classified. To obtain further insights into the agreement between the observed and modelled locations (landslide and non-landslide), precision, F1-score, and recall were estimated (Table 3). The RF classifier achieved the best result with respect to all performance indicators. The MLP followed closely and consistently across all indicators. In terms of the estimated AUC values, the RF classifier yielded the highest value (0.962), followed by the MLP (0.960), SVM (0.935), and KNN (0.927) (Figure 7). The relatively greater values of the performance indicators for RF and MLP can be attributed to their ability to learn complex relationships between the geospatial characteristics of an area and the occurrence of landslides [17,18,60].

**Table 3.** Performance evaluation indicators of the machine learning based landslide susceptibility models.

| Model | Overall Accuracy | Precision (Non-Landslide) | Precision (Landslide) | F1-score (Non-Landslide) | F1-score (Landslide) | Recall (Non-Landslide) | Recall (Landslide) |
|---|---|---|---|---|---|---|---|
| KNN | 0.9069 | 0.9227 | 0.9227 | 0.9015 | 0.9015 | 0.8811 | 0.8811 |
| MLP | 0.9545 | 0.9547 | 0.9547 | 0.9528 | 0.9528 | 0.9508 | 0.9508 |
| RF | 0.9663 | 0.9633 | 0.9633 | 0.9652 | 0.9652 | 0.9672 | 0.9672 |
| SVM | 0.9406 | 0.9385 | 0.9385 | 0.9385 | 0.9385 | 0.9385 | 0.9385 |

**Figure 7.** Receiver operating characteristic (ROC) curves of the five models: K-Nearest Neighbor (KNN), Multi-Layer Perceptron (MLP), Random Forest (RF), Support Vector Machine (SVM), and combined model.

#### *3.2. Spatial Agreement of Various Methods*

This study developed a correlation matrix by comparing pixel-wise landslide probabilities between the various methods (Figure 8) to evaluate the extent to which one landslide susceptibility model agrees with another. Although the AUC values were very similar across methods (Figure 7), a substantial difference in agreement was observed among the LSMs obtained using the different techniques. Overall, the correlation coefficient ranged from 0.69 to 0.85 (Figure 8). The SVM-RF combination resulted in the highest degree of agreement, while the KNN-SVM combination yielded the lowest.

**Figure 8.** Correlogram to show the agreement between the five landslide susceptibility models: K-Nearest Neighbor (KNN), Multi-Layer Perceptron (MLP), Random Forest (RF), Support Vector Machine (SVM), and combined model.

#### *3.3. Aggregated Landslide Susceptibility Mapping*

Since spatial heterogeneity in landslide prediction exists between the different machine learning-based approaches, an aggregated susceptibility map combining the outputs of all algorithms would minimize the uncertainty of the individual methods. In this study, a regression-based approach was adopted. A multivariate logistic regression (LR) was established with the binary landslide inventory data as the dependent variable and the results of the four landslide susceptibility models as independent variables. The outcome of the LR model is summarized in Table 4. Among the four models, the MLP, RF, and SVM were statistically significant (*p*-value < 0.05). The coefficient of determination (R<sup>2</sup>) of 0.80 indicates a very good model performance. In terms of the estimated regression coefficients, the RF model had the highest degree of agreement with the landslide inventory, followed by the MLP, SVM, and KNN. The pattern of influence of the various models in predicting landslides corresponds to their level of accuracy in terms of their respective AUC values (Figure 7).



The estimated LR coefficients of the four models and the intercept were incorporated in Equation (6) to derive the combined LSM. Again, the resultant aggregated map was categorized into five classes by applying the Jenks natural breaks algorithm (Figure 9). The combined susceptibility map yielded the highest AUC value (0.965) compared to the single susceptibility forecasts (Figure 7). About 26.8% of the total study area was within the 'high' and 'very high' landslide susceptible zones, which are inhabited by approximately 21.7% of the total population (Figure 6). With respect to spatial agreement, the combined LSM showed greater spatial agreement with all four models, with the correlation coefficient ranging between 0.85 and 0.92 (Figure 8).

**Figure 9.** (**a**) Combined landslide susceptibility map of Cox's Bazar district, (**b**) ratio of landslide susceptible zones (high and very high) in various sub-districts (Upazila), and (**c**) landslide susceptibility in the Rohingya refugee camps of Ukhia sub-district.

The extent of landslide susceptible areas varies between the sub-districts (*Upazilas*) of Cox's Bazar. Teknaf Upazila is the most susceptible, where more than 8% of the total study area was susceptible to landslides of 'high' and 'very high' severity (Figure 9b). A substantial proportion of area (7% of the study area) in Ukhia sub-district was also susceptible. The Rohingya refugee camps in this area were located within high and very high landslide susceptible zones. Various recent studies also found that changes in the geomorphological, hydrological, and anthropogenic environments due to the Rohingya influx have made their settlement areas vulnerable to landslides [11,31].

#### **4. Discussion**

The spatial disagreement in prediction among various techniques creates challenges in selecting the most suitable susceptibility map for managing landslide hazards [9,21,22]. The current study seeks to address this challenge by estimating the extent of spatial agreement and by proposing a method to combine landslide susceptibility maps, thus incorporating the valid results of various models. The study focused on the Cox's Bazar district of Bangladesh, which is well known for being vulnerable to landslide disasters [11,12,31]. First, LSMs were developed by applying four machine learning algorithms, namely K-Nearest Neighbor (KNN), Multi-Layer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM), each with hyperparameter optimization. In comparison to the existing studies on LSM of Cox's Bazar district [11,12,31], this study employed up-to-date data on landslide conditioning factors. In addition, it excluded low-lying areas (waterbodies and elevation < 5 m) from susceptibility mapping; otherwise, the resultant maps would have been prone to overestimating landslide susceptible zones, as was the case in some recent studies. When the models' performance was evaluated, all of them yielded very high prediction accuracy, with AUC values ranging between 0.927 and 0.962. The results of various recent studies have also confirmed that different machine learning-based models yield high accuracy in predicting landslides [1,15,17,26,27,61].

This study hypothesized that different susceptibility models can result in different LSMs, despite incorporating the same landslide inventory data. The assessment of spatial agreement between the models revealed spatial heterogeneity in landslide predictions, with the estimated pixel-wise correlation coefficients of landslide probability between models ranging from 0.69 to 0.85. The spatial distribution of landslide susceptibility obtained in this study was also different from that of a recent study conducted in the Cox's Bazar district of Bangladesh [11]. This highlights the uncertainty in the landslide predictions of various models, despite their considerable prediction accuracy in terms of AUC values. Most of the existing studies on machine learning-based LSM focused mainly on identifying the most suitable method for predicting this natural phenomenon [1,8,18,23,60], while little attention has been given to analyzing the uncertainties resulting from spatial disagreement in landslide prediction [9]. The current study is the first case study-based contribution to investigate this major gap in the existing literature.

This study further developed a combined LSM by integrating the results of the four machine learning-based landslide predictions, adopting a method proposed by Rossi, Guzzetti, Reichenbach, Mondini, and Peruccacci [22]. The result indicates an improvement in landslide prediction accuracy. Existing studies, which applied multiple machine learning algorithms to map landslide susceptibility, mainly evaluated different methods based on quantitative measures [8,16,18]. Whilst quantitative measures of model fit are useful, they are not conclusive in determining the efficacy and reliability of susceptibility assessment [22]. A combined landslide susceptibility map that this study developed would help to minimize the uncertainties of individual methods.

#### **5. Conclusions**

Predicting a complex natural phenomenon such as a landslide is a challenging task and subject to considerable uncertainties. An accurate prediction of landslides is the prerequisite for managing this hazard. In this study, the spatial association in landslide prediction between various machine learning-based models was analyzed to quantify the spatial agreement of predicted landslide susceptibility. By addressing uncertainties in various models, this study also developed a landslide susceptibility map combining the outcomes from various models. The results indicate an improvement in landslide prediction compared to the individual models.

Despite achieving an improved result in landslide prediction, this study has some limitations that could be addressed in future research. The landslide inventory data used in this study were developed from various secondary sources and validated through fieldwork [11]. The scarcity of data, including a detailed landslide inventory of the study area, made it difficult to model landslides more accurately. The accuracy of the LSM results depended on the input parameters used, particularly the DEM. The ALOS DEM of 30-m resolution used in this study had a low root-mean-square error (1.78 m) in vertical accuracy and was considered to be the most accurate freely available DEM [43,62]. However, in future research, a high-resolution DEM could be employed to improve the existing landslide susceptibility modelling frameworks.

This study is an attempt to integrate the results of multiple machine learning-based landslide susceptibility models to minimize uncertainties and improve landslide predictions. The modelling framework used in this study could be transferred to other landslide-susceptible regions. Landslide susceptibility maps can help urban planners identify suitable areas for urban development [63]. The combined landslide susceptibility map of Cox's Bazar district could be useful to policymakers and practitioners in sequencing and prioritizing interventions for managing landslides. The proposed model is an advancement on the existing landslide susceptibility models and is intended to predict landslides more accurately. The results of this model could be used to improve the existing landslide early warning system [11] and to strengthen landslide disaster risk mitigation strategies, supporting a resilient future for the inhabitants of the study area.

**Author Contributions:** Conceptualization, M.S.G.A. and B.A.; methodology, M.S.G.A., M.S.R., and N.A.; software, M.S.G.A., M.S.R., N.A., and R.M.R.; validation, M.S.G.A., M.S.R., and N.A.; formal analysis, M.S.G.A., M.S.R., and N.A.; investigation, M.S.G.A., B.A., R.M.R., and M.F.R.; resources, M.S.G.A., M.F.R., and B.A.; data curation, M.S.G.A. and B.A.; writing—original draft preparation, M.S.G.A.; writing—review and editing, M.S.G.A. and B.A.; visualization, M.S.G.A. and B.A.; supervision, B.A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

# *Article* **Evaluating the Effects of Digital Elevation Models in Landslide Susceptibility Mapping in Rangamati District, Bangladesh**

#### **Yasin Wahid Rabby <sup>1,\*</sup>, Asif Ishtiaque <sup>2</sup> and Md. Shahinoor Rahman <sup>3</sup>**


Received: 16 July 2020; Accepted: 20 August 2020; Published: 22 August 2020

**Abstract:** Digital elevation models (DEMs) are the most obvious data sources in landslide susceptibility assessment. Many landslide causal factors are generated from DEMs. Most studies on landslide susceptibility assessment rely on freely available DEMs. However, very little is known about how DEMs with varying spatial resolutions perform in the accurate assessment of landslide susceptibility. This study compared the performance of four different DEMs, namely the 30 m Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model (GDEM), the 30–90 m Shuttle Radar Topography Mission (SRTM), the 12.5 m Advanced Land Observing Satellite (ALOS) Phased Array Type L-band Synthetic Aperture Radar (PALSAR), and the 25 m Survey of Bangladesh (SOB) DEM, in landslide susceptibility assessment in the Rangamati district of Bangladesh. This study used three different landslide susceptibility assessment techniques: modified frequency ratio (bivariate model), logistic regression (multivariate model), and random forest (machine-learning model). This study explored two scenarios of landslide susceptibility assessment: using only DEM-derived causal factors, and using both DEM-derived factors and other common factors. The success and prediction rate curves indicate that the SRTM DEM provides the highest accuracies for the bivariate model in both scenarios. Results also reveal that the ALOS PALSAR DEM shows the best performance in landslide susceptibility mapping using the logistic regression and the random forest models. A relatively finer resolution DEM, the SOB DEM, shows the lowest accuracies compared to the other DEMs for all models and scenarios. It can also be noted that the performance of all DEMs except the SOB DEM is close (72%–84%) in terms of the success and prediction accuracies. Therefore, any one of the three global DEMs (ASTER, SRTM, and ALOS PALSAR) can be used for landslide susceptibility mapping in the study area.

**Keywords:** landslide susceptibility; Bangladesh; digital elevation model; random forest; modified frequency ratio; logistic regression

#### **1. Introduction**

Local terrain conditions, including terrain relief, hydrology, geology, and land use, are crucial for the assessment of landslide susceptibility [1]. These local features are often termed "causal factors" [2]. Identifying these causal factors is considered the stepping stone of landslide susceptibility assessment. Landslide susceptibility represents the likelihood of landslide occurrence in an area. It assumes that future landslides may occur in previous landslide locations where the causal factors have already created conducive environments for triggering landslides [2–4]. Several natural and anthropogenic factors can trigger landslides and are thus called "triggering factors," such as volcanic activity, groundwater excavation, prolonged rainfall, rapid snow melting, hill cutting, deforestation, and land-use change [5–8]. Moreover, landslide susceptibility assessment relies on the characteristics of the landslide inventory, a detailed register of the distribution and characteristics of past landslides [9,10]. Therefore, the success of landslide susceptibility assessment largely depends on the selection of causal factors and the quality of landslide inventories.

Landslide susceptibility can be assessed qualitatively or quantitatively [11]. Qualitative approaches to landslide susceptibility assessment are based on experts' judgments on causal factors, whereas quantitative approaches utilize the mathematical relationships between landslide locations and causal factors [6]. Frequently, mixed methods and semi-quantitative methods are adopted to process experts' opinions in qualitative assessments. The widely used methods in this domain are the analytical hierarchy process (AHP) [12], fuzzy logic [13], and GIS-based AHP [14]. In bivariate techniques, each causal factor is divided into a set of classes, and landslide locations are compared with each class. Thus, the bivariate relationship is established between landslide occurrence and one factor-class at a time [4]. The commonly used bivariate methods are frequency ratio, the weight of evidence, fuzzy logic, evidential belief function, and statistical index [3,15]. In contrast, the multivariate statistical methods determine the relationship between landslide occurrence and multiple causal factors. Examples of multivariate methods are logistic regression, adaptive regression spline, general additive models, and simple decision trees [16].

One of the limitations of bivariate and multivariate models is that they are constrained by normality and collinearity assumptions. Compared to these models, machine learning-based models are relatively less limited by these assumptions [17] and, therefore, can consider the nonlinear nature of landslides [18]. Some argue that machine learning-based models such as the random forest, gradient boosting, and support vector machines often outperform both bivariate and multivariate statistical models [19,20]. While the selection of methods is essential, landslide susceptibility assessment also depends on the types and quality of causal factors, the mapping unit, and the scale of investigation [21]. Many causal factors, including land cover, elevation, slope, aspect, and hill cut, are derived from satellite imagery and topographic models [7]. Thus, these derived causal factors are often affected by the spatial resolution of the sources, geometric error, and the instrument or sensor type. In landslide susceptibility mapping, digital elevation models (DEMs) often replace topographic maps as the source of the most important causal factors (e.g., slope, topography, aspect). Moreover, many developing countries may not have topographic maps. Thus, landslide studies from these countries usually rely on freely available DEMs derived from remote sensors.

A DEM is a digital representation of the earth's surface. It is widely used in research areas in which topography plays an important role, such as hydrological modeling, geomorphological analysis, feature extraction, landslide susceptibility and hazard assessment, erosion susceptibility, and glacier monitoring [22]. DEMs are often generated using data obtained from different remote sensors, including optical imaging sensors, light detection and ranging (LiDAR), and synthetic aperture radar (SAR) [23]. The quality of DEM-derived factors often depends on the spatial resolution of the DEM. Therefore, the choice of DEM is important for the assessment of landslide susceptibility [1]. To date, only a few attempts have been made to compare the performance of DEMs with different spatial resolutions in landslide susceptibility assessment. For instance, Dietrich et al. [24] compared different DEMs and found a similarity in performance irrespective of spatial resolution in identifying the moderate landslide class (see also [25]). They argued that the resolution of the DEM might not be very important for representing slope failures. Similarly, Tian et al. [26] contended that finer resolution does not necessarily lead to higher accuracy in landslide susceptibility assessment (see also [1,27]).

These past studies further indicated that the performance of DEMs is context-dependent, meaning that the performance of a DEM in one region may not be assumed to be similar in another region [28–30]. They also argued that DEMs with fine spatial resolution may not necessarily perform better than coarse resolution DEMs. Therefore, comparative assessments of DEMs in various contexts are much needed. We found that even though some parts of Bangladesh are vulnerable to landslides, no study has investigated the relative performance of different DEMs in landslide susceptibility assessments. Against this backdrop, this study contextualized landslide susceptibility in Bangladesh and compared the performance of different DEMs and modeling techniques. The study area is selected from Bangladesh because the hilly southeastern parts of the country encounter landslides almost every year that often claim tens of lives [31–34]. Because of the unavailability of LiDAR, the majority of the landslide susceptibility-related studies in Bangladesh used the 30 m Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model (GDEM), the 30–90 m Shuttle Radar Topography Mission (SRTM), and the 12.5 m Advanced Land Observing Satellite (ALOS) Phased Array type L-band Synthetic Aperture Radar (PALSAR) to obtain various topographic factors [7,31,32,35]. The Survey of Bangladesh (SOB) has developed a DEM (25 m) for the whole of Bangladesh, and no study has used this DEM in landslide susceptibility assessment. Therefore, the landslide susceptibility maps in Bangladesh are influenced by the usage of different DEMs, and which DEM provides the most accurate susceptibility maps is largely unexplored [36]. As many causal factors are derived from DEM datasets, the selection of an appropriate DEM is crucial for landslide susceptibility assessment. This study aims to compare the relative performance of four DEMs: ASTER GDEM (30 m), ALOS PALSAR (12.5 m), SRTM (30 m), and Survey of Bangladesh (SOB) DEM (25 m). This study further aims to compare three quantitative landslide susceptibility assessment techniques: modified frequency ratio (bivariate method), logistic regression (multivariate method), and random forest (machine learning method) [4,8,29,37].

#### **2. Study Area**

Rangamati, a hilly southeastern district of Bangladesh, is selected as the study area because of the regular landslide occurrences in this area. More than 100 people died, and 12,000 families suffered losses due to landslides in this district [38]. Most landslides in Rangamati take place in three upazilas (sub-districts): Rangamati Sadar, Kaptai, and Kawkhali [35]. As such, the study area is narrowed down to these three upazilas (Figure 1). The combined geographical area of these three sub-districts is 1145 km<sup>2</sup>, and more than 40% of it is forested [39]. The geology of this area comprises Dihing, Dupi tila, Girujan clay, Bhuban, Bokabil, and Tipam sandstone (Figure A3f of Appendix C). The bedrock and soil structure of the area are not stable, which makes the hills highly prone to landslides [40]. Climatologically, this area falls under a tropical monsoon climate; the annual average temperature varies from a maximum of 36.5 degrees to a minimum of 12.5 degrees Celsius, and the annual rainfall is 2673 mm, with a mean humidity level of 71.6% [39].

**Figure 1.** Study area and landslide and non-landslide (pseudo-absence points) locations used in modeling.

#### **3. Synopsis on Data Utilization**

#### *3.1. Landslide Inventory*

A landslide inventory records different information, including the exact location, size, type, time of occurrence, casualties, trigger, and causes [2]. Rabby and Li [33] prepared and published [34] a landslide inventory of the Chittagong hilly area, Bangladesh. They used participatory field mapping to prepare this inventory [41]. In our study, we selected 168 landslides from their inventory. As it has been advised to use more than one method in landslide inventory preparation [42], we also analyzed the images available on the Google Earth platform to map more landslides in the study area. We used the method proposed by Rabby and Li [33] for Google Earth mapping and mapped 93 landslides that occurred from January 2001 to January 2019. Therefore, in our study, we used a total of 261 (168 + 93) landslide locations. The mean size of the landslides was 274.2 m<sup>2</sup>. The smallest and the largest landslides covered about 14.6 m<sup>2</sup> and 3422.4 m<sup>2</sup>, respectively.

#### *3.2. Landslide Causal Factors*

In our study, we used a total of 15 causal factors (Table 1) to produce landslide susceptibility maps (see Appendices A–C). Of these, seven factors were derived from DEMs: elevation, slope, plan curvature, profile curvature, topographic wetness index (TWI), stream power index (SPI), and aspect. As we are comparing the performance of different DEMs, we derived each of these factors from four different DEMs: ASTER, SRTM, ALOS PALSAR, and SOB. The remaining eight factors, namely land use/land cover, land use/land cover change, geology, distance from the road networks, distance from the fault lines, distance from the drainage networks, rainfall, and normalized difference vegetation index (NDVI), were collected from different datasets. Maps of different causal factors had different resolutions, but for the convenience of comparison, we kept the 30 m resolution as the standard for landslide susceptibility maps. In the following sub-sections, we provide a brief overview of the causal factors that we used in this study. Unless otherwise mentioned, we classified these factors into several classes using the Jenks natural break method in ArcGIS 10.7.



#### 3.2.1. Elevation

A change in elevation can bring changes in geomorphology, vegetation, and rate of erosion in an area and thus alters the landslide susceptibility [8]. We derived elevation from ASTER, SRTM, ALOS PALSAR, and SOB DEMs (Figure A1a–d of Appendix A) and divided them into five classes (see Table A1 of Appendix D).

#### 3.2.2. Slope

Slope is one of the most critical factors of landslides. Generally, as the slope increases, shear stress increases, and therefore landslide susceptibility increases [43,44]. As with elevation, we derived slopes from the four different DEMs using the slope tool in ArcGIS 10.7 (Figure A1e–h of Appendix A) and divided them into five classes using the Jenks natural break method (see Table A1 of Appendix D). We found different maximum slope values for the different DEMs: ASTER (51.89°), SRTM (61.24°), ALOS PALSAR (65.36°), and SOB (46.4°). As these values were different for the same study area, the five classes of slopes (see Table A1 of Appendix D) were different.

#### 3.2.3. Aspect

The direction of precipitation, sunlight, and wind depend on aspect, and therefore it has effects on the growth of vegetation, rate of erosion, and thickness of soil [45]. From the DEMs, four aspect maps (Figure A1i–l of Appendix A) were prepared and were divided into ten classes (see Table A1 of Appendix D).

#### 3.2.4. Plan Curvature and Profile Curvature

Curvature is the rate of change of slope across the terrain surface in an area. We used the four different DEMs to produce four plan and four profile curvature maps (Plan: Figure A2a–d; Profile: Figure A2e–h of Appendix B). Profile and plan curvatures were divided into three classes: concave, convex, and flat. Among these classes, concave slopes are more prone to landslides because water cannot disperse equally on these slopes [5].

#### 3.2.5. Topographic Wetness Index and Stream Power Index

The topographic wetness index (TWI) increases as the slope decreases; therefore, it is inversely related to landslide susceptibility [4,43]. The stream power index (SPI) represents the erosion power of streams. SPI is directly related to slope: on a steeper slope, SPI is higher, representing more erosion power, while on a flat alluvial plain, SPI is low [45]. TWI and SPI maps were derived from the four DEMs using Equations (1) and (2) (TWI: Figure A2i–l of Appendix B; SPI: Figure A2m–p of Appendix B):

$$\text{TWI} = \ln\left(\frac{\text{A}}{\tan \alpha}\right) \tag{1}$$

$$\text{SPI} = \text{A} \times \tan{\alpha} \tag{2}$$

where A = area of a specific catchment and α = slope gradient of the specific area. We divided TWI and SPI into five classes (Table A1 of Appendix D).
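As a minimal sketch of Equations (1) and (2), the two indices can be computed for a single cell as shown below. The function names and the three example cells are hypothetical and are not taken from the study; the slope is assumed to be given in degrees, and a perfectly flat cell (α = 0) is not handled because tan α would be zero in Equation (1).

```python
import math


def twi(catchment_area, slope_deg):
    """Topographic wetness index, Equation (1): ln(A / tan(alpha))."""
    return math.log(catchment_area / math.tan(math.radians(slope_deg)))


def spi(catchment_area, slope_deg):
    """Stream power index, Equation (2): A * tan(alpha)."""
    return catchment_area * math.tan(math.radians(slope_deg))


# Hypothetical cells: (catchment area, slope in degrees).
cells = [(900.0, 5.0), (900.0, 25.0), (900.0, 45.0)]
for area, slope in cells:
    print(f"slope={slope:4.1f}  TWI={twi(area, slope):.2f}  SPI={spi(area, slope):.1f}")
```

For a fixed catchment area, the sketch reproduces the behaviour described in the text: TWI falls and SPI rises as the slope becomes steeper.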

#### *3.3. Rainfall*

The intensity and duration of rainfall control the initiation of landslides [44]. We used the mean annual rainfall of five weather stations of the Bangladesh Meteorological Department (BMD) to prepare the rainfall map using the Kriging interpolation method (Figure A3a of Appendix C). We later divided it into five classes (Table A1 of Appendix D).

#### *3.4. Distance-Based Causal Factors*

Distance from the road networks, drainage networks, and fault lines were the three distance-based causal factors in this study. We used the Euclidean distance tool in ArcGIS 10.7 to derive the distance of landslides from the targeted features (roads, drainage, and fault lines; Figure A3b–d of Appendix C) and divided the distances into five classes (Table A1 of Appendix D). Distance from the road networks is one of the most critical factors. The undercutting of slopes during road construction and the vibration created by vehicles damage slope stability [43]. The drainage network indicates the zone of erosion in an area, and erosion is indirectly linked with landslide susceptibility [4]. Fault lines indicate the geomorphological discontinuity in an area. Near the fault lines, the shear strength of rock is at its minimum. Therefore, areas near the fault lines are prone to landslides [44].

#### *3.5. Normalized Difference Vegetation Index (NDVI)*

NDVI indicates the growth of vegetation and the biomass of an area [46]. Generally, the probability of landslide occurrence on a naturally vegetated surface is lower than on bare land [8]. We used Landsat 8 level 2 imagery of 11/10/2017 to prepare the NDVI map (Figure A3e) and divided it into five classes (Table A1 of Appendix D).
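The text does not spell out the NDVI formula, so the sketch below uses the standard definition, NDVI = (NIR − Red) / (NIR + Red). The function name, the reflectance values, and the choice to return 0 for a pixel with no signal are assumptions made for illustration only.

```python
def ndvi(nir, red):
    """Standard NDVI from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        # Assumption: a pixel with no signal in either band is reported as 0.
        return 0.0
    return (nir - red) / total


# Hypothetical surface reflectance pairs (NIR, red).
samples = {"dense vegetation": (0.50, 0.05), "bare land": (0.30, 0.25)}
for label, (nir, red) in samples.items():
    print(f"{label}: NDVI = {ndvi(nir, red):.2f}")
```

With these made-up values, the vegetated pixel receives a much higher NDVI than the bare pixel, which is the contrast the causal factor relies on.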

#### *3.6. Geology*

The strength of rock and the permeability of soil depend on the geology of an area. Therefore, geology has an impact on landslide susceptibility [43]. In this study, we used the geological map (Figure A3f) (1:100,000) of the Bangladesh Geological Survey (BGS). Eight types of geologic formation are found here: Dihing and Dupi tila formation; Bokabil formation; Bhuban formation; Tipam Sandstone; Valley alluvium and colluvium; Dihing formation; Girujan clay; and waterbodies.

#### *3.7. Land Use/Land Cover*

Land use/land cover and land use/land cover change are the two most crucial landslide causal factors in the study area [35,47]. Ahmed [31] and Rabby and Li [33] found that the rate of change of land use/land cover in our study area is high compared to adjacent areas. In this study, we used two Landsat images to analyze the land use/land cover change (Landsat 5, acquisition date: 24/12/1998; Landsat 8, acquisition date: 29/11/2018). We used a supervised maximum likelihood classifier to classify the 1998 and 2018 images into four land use classes: bare land, vegetation, built-up, and water bodies (Figure A3g of Appendix C). We later employed post-classification change detection techniques to analyze the land use/land cover changes between 1998 and 2018 (Figure A3h of Appendix C).

#### **4. Methodology**

To compare the effects of four different DEMs (ASTER, SRTM, ALOS PALSAR, and SOB) on landslide susceptibility maps, we used a bivariate method, the modified frequency ratio (MFR); a multivariate method, logistic regression (LR); and a machine learning method, random forest (RF). We assessed two scenarios: (a) considering only the seven DEM-based causal factors in the models, and (b) considering all 15 causal factors, including the DEM-based factors, in the models. As we used three methods (MFR, LR, and RF) on four different DEMs under two scenarios, the outcome was twenty-four (3 × 4 × 2) landslide susceptibility maps. For legibility, we used different acronyms in later sections (see Table 2).


**Table 2.** Acronyms used for different models.

#### *4.1. Training and Validation Dataset*

We divided the 261 landslide locations randomly into training (75%) and validation (25%) datasets. Bivariate MFR is a one-class classification method in which non-landslide locations (absences of landslides) are not required [48]. On the other hand, for multivariate LR and machine learning-based RF, the selection of non-landslide locations (pseudo-absence points) is essential [4]. Any place that does not have a landslide can be considered a non-landslide location. We randomly selected 261 non-landslide locations (pseudo-absence points) from the study area (Figure 1) [49]. We split these non-landslide locations into training (196) and validation (65) datasets. In total, we had 392 data points (196 landslide locations and 196 non-landslide locations) for training and 130 data points (65 landslide locations and 65 non-landslide locations) for testing the LR and RF models.
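The split described above can be sketched as follows. This is only an illustration of a random 75%/25% partition; the function name, the random seeds, and the placeholder point labels are hypothetical, and the study does not state how its random split was implemented.

```python
import random


def split_points(points, train_fraction=0.75, seed=42):
    """Shuffle a copy of the points and split it into training and validation parts."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers standing in for the 261 landslide and
# 261 non-landslide (pseudo-absence) locations.
landslides = [("landslide", i) for i in range(261)]
non_landslides = [("non-landslide", i) for i in range(261)]

ls_train, ls_valid = split_points(landslides, seed=1)
nl_train, nl_valid = split_points(non_landslides, seed=2)

training = ls_train + nl_train
validation = ls_valid + nl_valid
print(len(ls_train), len(ls_valid), len(training), len(validation))
```

Rounding 75% of 261 gives 196 training points per class, which is consistent with the counts of 392 training and 130 validation points reported in the text.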

#### *4.2. Modified Frequency Ratio (MFR)*

MFR is an improved version of the widely used frequency ratio (FR) method [35,50]. Lee and Talib [51] proposed FR, which assesses the spatial relationship between the landslide locations and the landslide causal factors [52]. In the FR method, each of the causal factors must be divided into subclasses or categories; for example, in this study, we split slope into five categories using the Jenks natural break method (Table A1 of Appendix D). We calculated the FR values using Equation (3), and later these FR values of each of the subclasses of causal factors were used in the MFR model.

$$\text{FR}\_{\text{ij}} = \frac{\text{N}\_{\text{ij}}/\text{N}}{\text{M}\_{\text{ij}}/\text{M}} \tag{3}$$

where FR<sub>ij</sub> = Frequency ratio of the jth subclass of factor i

N<sub>ij</sub> = Total area of the landslide pixels within the jth subclass of factor i

N = Total area of landslide pixels in the study area

M<sub>ij</sub> = Total area of the pixels in the jth subclass of factor i

M = Total area of the study area

FR > 1 means association of landslides with that subclass. In other words, there is a probability of occurrences of landslides in that subclass. FR < 1 means no association [53].
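A small worked example of Equation (3) is sketched below. The function name and the class areas are invented for illustration and do not come from Table A1.

```python
def frequency_ratio(landslide_area_in_class, landslide_area_total,
                    class_area, study_area):
    """Equation (3): share of landslide area in a class divided by
    the share of the study area occupied by that class."""
    return (landslide_area_in_class / landslide_area_total) / (class_area / study_area)


# Hypothetical slope classes: (landslide area in class, total area of class),
# with 100 units of landslide area in a study area of 1000 units.
slope_classes = {"gentle": (10.0, 400.0), "moderate": (30.0, 200.0), "steep": (60.0, 400.0)}
for name, (n_ij, m_ij) in slope_classes.items():
    fr = frequency_ratio(n_ij, 100.0, m_ij, 1000.0)
    label = "associated" if fr > 1 else "not associated"
    print(f"{name}: FR = {fr:.2f} ({label})")
```

In this made-up case, the gentle class holds 10% of the landslide area but 40% of the study area, so its FR is below 1, while the other two classes hold a larger share of landslides than of area and have FR above 1.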

To calculate the MFR, we then normalized the FRs using Equation (4).

$$\text{Rf}\_{\text{ij}} = \frac{\text{FR}\_{\text{ij}}}{\sum \text{FR}\_{\text{i}}} \tag{4}$$

where Rf<sub>ij</sub> = Relative frequency of the jth subclass of factor i

FR<sub>ij</sub> = Frequency ratio of the jth subclass of factor i

∑FR<sub>i</sub> = Sum of the frequency ratios of factor i

Later, we calculated the prediction rate (PR) using Equation (5). In the FR model, the overall contribution of a causal factor to the occurrence of landslides is not measured; only the subclass-wise contribution is measured [6,50]. In MFR, we can measure the overall contribution because PR indicates the overall association of a causal factor. The lowest value of PR is 1, and the higher the PR value, the stronger the association of the causal factor with landslides [6].

$$\text{PR}\_{\text{i}} = \frac{(\text{MaxRf}\_{\text{i}} - \text{MinRf}\_{\text{i}})}{(\text{MaxRf}\_{\text{i}} - \text{MinRf}\_{\text{i}})\text{min}} \tag{5}$$

where PR<sub>i</sub> = Prediction rate of factor i

MaxRf<sub>i</sub> = Maximum relative frequency of factor i

MinRf<sub>i</sub> = Minimum relative frequency of factor i

(MaxRf<sub>i</sub> − MinRf<sub>i</sub>)min = Lowest difference between the maximum and minimum relative frequency among all the factors

To calculate the landslide susceptibility index (LSI) and to produce the landslide susceptibility maps, Equation (6) was used:

$$\text{LSI} = \sum\_{i=1}^{n} \text{Rf}\_{\text{ij}} \times \text{PR}\_{\text{i}} \tag{6}$$

where LSI = Landslide susceptibility index

Rf<sub>ij</sub> = Relative frequency of the jth subclass of factor i

PR<sub>i</sub> = Prediction rate of factor i

For landslide susceptibility mapping, we used the reclassify tool in ArcGIS 10.7 to reclassify the categories of causal factors according to the Rf values. Later, we multiplied each of the reclassified raster layers with the prediction rates and summed up to produce the final landslide susceptibility maps.
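Equations (4) to (6) can be chained as in the sketch below, which works on one cell rather than on raster layers. The factor names, FR values, and function names are hypothetical; the sketch also assumes that every factor has a non-zero spread of relative frequencies, because the smallest spread is the denominator in Equation (5).

```python
def relative_frequencies(fr_values):
    """Equation (4): normalize the FR values of one factor so they sum to 1."""
    total = sum(fr_values)
    return [fr / total for fr in fr_values]


def prediction_rates(rf_by_factor):
    """Equation (5): spread of Rf for each factor divided by the smallest spread."""
    spreads = {name: max(rf) - min(rf) for name, rf in rf_by_factor.items()}
    smallest = min(spreads.values())
    return {name: spread / smallest for name, spread in spreads.items()}


def landslide_susceptibility_index(cell_classes, rf_by_factor, pr_by_factor):
    """Equation (6): sum over factors of Rf (for the cell's class) times PR."""
    return sum(rf_by_factor[f][j] * pr_by_factor[f] for f, j in cell_classes.items())


# Hypothetical FR values for three subclasses of two factors.
fr_by_factor = {"slope": [0.4, 0.9, 1.7], "aspect": [0.9, 1.0, 1.1]}
rf_by_factor = {name: relative_frequencies(fr) for name, fr in fr_by_factor.items()}
pr_by_factor = prediction_rates(rf_by_factor)

# A cell that falls in the steepest slope class and the first aspect class.
cell = {"slope": 2, "aspect": 0}
print(pr_by_factor)
print(landslide_susceptibility_index(cell, rf_by_factor, pr_by_factor))
```

The factor with the smallest spread (aspect in this invented case) receives a PR of exactly 1, which matches the statement that the lowest value of PR is 1.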

#### *4.3. Logistic Regression (LR)*

Logistic regression (LR) is one of the most widely used multivariate statistical methods in landslide susceptibility mapping [4,6,54–56]. An LR model predicts the presence of landslides using binary landslide data (presence and absence of landslides, or landslides and non-landslides) and their relationship with the landslide causal factors [56,57]. Here, landslide and non-landslide locations form the dependent variable, and the causal factors are the independent variables. These independent variables can be numerical or categorical [4]. Equation (7) is used in the LR model:

$$\text{Logit}(\text{Y}) = \beta\_1 \text{X}\_1 + \beta\_2 \text{X}\_2 + \beta\_3 \text{X}\_3 + \dots + \beta\_i \text{X}\_i + \text{e} \tag{7}$$

where Y = The presence of landslides

X<sub>i</sub> = ith causal factor

β<sub>i</sub> = Regression coefficient of the ith causal factor

e = Error

We used Equation (8) to determine the probability.

$$\mathbf{P} = \frac{\exp \mathbf{Y}}{1 + \exp \mathbf{Y}} \tag{8}$$

We used the R software environment for the forward stepwise LR method. We multiplied the raster layers of the statistically significant causal factors with the coefficients and summed up using Equation (7) in the R software environment. Finally, we used Equation (8) to produce the landslide susceptibility maps.
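The study carried out these steps in R on raster layers. Purely as an illustration of Equations (7) and (8), the sketch below computes the linear predictor and the resulting probability for single cells in Python. The coefficients and factor values are made up, the function name is hypothetical, and the error term of Equation (7) is left out because it is not part of a prediction.

```python
import math


def landslide_probability(coefficients, factor_values):
    """Apply Equation (7) without the error term, then Equation (8)."""
    y = sum(b * x for b, x in zip(coefficients, factor_values))
    return math.exp(y) / (1.0 + math.exp(y))


# Hypothetical coefficients for two significant causal factors.
coefficients = [0.08, -0.15]
gentle_cell = [5.0, 9.0]   # e.g. low slope, high TWI
steep_cell = [35.0, 4.0]   # e.g. high slope, low TWI

print(f"{landslide_probability(coefficients, gentle_cell):.3f}")
print(f"{landslide_probability(coefficients, steep_cell):.3f}")
```

A linear predictor of zero maps to a probability of 0.5, and larger predictors map to probabilities closer to 1, so the output always lies between 0 and 1.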

#### *4.4. Random Forest Classification (RF)*

The random forest method was developed by Breiman [58] and is an ensemble learning method [19]. Lately, the use of the RF method for landslide susceptibility mapping has increased due to its high performance in predicting landslide locations [37,59].

This method uses bootstrapping to generate a large number of classification trees, each based on a subset of the observations [27]. There is high variance among the individual trees, and therefore classification based on a single tree is unstable and prone to overfitting [37]. Random forest improves on commonly used tree-based methods, such as a single decision tree or bagged trees, because it decorrelates the trees. RF uses an ensemble of trees and lets each tree vote on class membership; the class with the most votes is then assigned [27,37]. Because bootstrapping is used, part of the data is not used when a tree is trained, and this part is known as the out-of-bag (OOB) data [27]. These OOB data are used to calculate the mean decrease of accuracy and the Gini coefficient [37], which are used in variable selection and ranking [19]. They also provide the statistical weights, or variable importance, of each of the predictors used in the model [27].
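To make the out-of-bag idea concrete, the sketch below draws one bootstrap sample (sampling with replacement) from 392 observation indices, the size of the training set in this study, and lists the indices that were never drawn. It illustrates bootstrapping only; it is not the randomForest implementation, and the function name and seed are arbitrary.

```python
import random


def bootstrap_indices(n, rng):
    """Draw n indices with replacement from range(n)."""
    return [rng.randrange(n) for _ in range(n)]


rng = random.Random(0)
n_observations = 392

in_bag = bootstrap_indices(n_observations, rng)
# Observations never drawn for this tree form its out-of-bag (OOB) set.
out_of_bag = sorted(set(range(n_observations)) - set(in_bag))

print(f"unique in-bag observations: {len(set(in_bag))}")
print(f"out-of-bag observations:    {len(out_of_bag)}")
```

Each tree in a forest gets its own bootstrap sample, so each tree also has its own OOB set on which it can be evaluated without touching its training data.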

The RF method has several advantages: rescaling and transformation of the data are not essential, and missing data and outliers can be ignored [60]. Moreover, it can deal with both numerical and categorical data, and dummy variables are not required [37]. In this study, we developed the random forest model in the R software environment using the "randomForest" package [61].

#### *4.5. Multicollinearity Diagnostics*

In the LR model, multicollinearity can lead to inaccurate variance estimates and unstable coefficient estimates. In the RF model, on the other hand, it can affect the variable importance [62]. Therefore, two multicollinearity diagnostics, the variance inflation factor (VIF) and tolerance, were computed before the causal factors were used in the LR and RF models. Since the VIF values were <10 and the tolerance values were <0.3, the causal factors were considered independent and were used in these two models [63].
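For readers unfamiliar with the two diagnostics, the sketch below uses their standard definitions, which are not given in the text: for a causal factor regressed on all the other factors with coefficient of determination R², tolerance = 1 − R² and VIF = 1 / tolerance. The function names and the R² values are hypothetical.

```python
def tolerance(r_squared):
    """Tolerance of a factor, given the R^2 of regressing it on the other factors."""
    return 1.0 - r_squared


def vif(r_squared):
    """Variance inflation factor, the reciprocal of tolerance."""
    return 1.0 / tolerance(r_squared)


# Hypothetical R^2 values for three causal factors.
for name, r2 in {"slope": 0.75, "elevation": 0.80, "TWI": 0.72}.items():
    print(f"{name}: tolerance = {tolerance(r2):.2f}, VIF = {vif(r2):.2f}")
```

Because the two measures are reciprocals, either one can be derived from the other.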

#### *4.6. Model Validation and Comparison*

Success and prediction rate curves were used to test the performance of the susceptibility models. The training dataset was used for the success rate curve, and the validation dataset was used for the prediction rate curve [6]. The area under the curve (AUC) of the success rate indicates how well the model fits the training data, while the AUC of the prediction rate suggests how well the model will predict future landslides [15,64]. The AUC value ranges from 0 to 1 (or 0% to 100%), and it can be grouped into the following categories: 0.50–0.60 (fail), 0.60–0.70 (poor), 0.70–0.80 (fair), 0.80–0.90 (good), and 0.90–1.00 (excellent) [53]. We also used two non-parametric tests, the Friedman test and the Wilcoxon signed-rank test, to assess whether there are any significant differences in performance between the susceptibility models [65–68]. Friedman's test is used for multiple comparisons: it determines whether there is any significant difference in performance among multiple models [29]. The Wilcoxon signed-rank test is used for pairwise comparison of susceptibility models and, therefore, can indicate which models are significantly different [68].
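One common way to build a success or prediction rate curve is to rank cells from the highest to the lowest susceptibility and plot the cumulative share of landslide cells against the cumulative share of area. The sketch below follows that construction under the assumption of equal-area cells and integrates the curve with the trapezoidal rule. It is an illustration with invented data and a hypothetical function name, not necessarily the exact procedure used in the study, and tied susceptibility values are simply taken in sort order.

```python
def rate_curve_auc(susceptibility, is_landslide):
    """Area under the cumulative-landslide versus cumulative-area curve."""
    order = sorted(range(len(susceptibility)),
                   key=lambda k: susceptibility[k], reverse=True)
    total_landslides = sum(is_landslide)
    cell_share = 1.0 / len(order)  # assumption: all cells have equal area
    auc = 0.0
    captured = 0
    previous_share = 0.0
    for k in order:
        captured += is_landslide[k]
        current_share = captured / total_landslides
        # Trapezoid between two consecutive points of the curve.
        auc += (previous_share + current_share) / 2.0 * cell_share
        previous_share = current_share
    return auc


# Invented example: four cells, two of which contain landslides.
landslide_flags = [1, 1, 0, 0]
good_model = [0.9, 0.8, 0.2, 0.1]   # ranks the landslide cells first
poor_model = [0.1, 0.2, 0.8, 0.9]   # ranks the landslide cells last

print(rate_curve_auc(good_model, landslide_flags))
print(rate_curve_auc(poor_model, landslide_flags))
```

Feeding the function training landslides gives a success rate AUC, and feeding it validation landslides gives a prediction rate AUC.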

However, statistical performance assessments such as success and prediction rate curves cannot show the level of agreement among the models. Therefore, we used convergent validation through coverage-based cross-comparison [69,70]. The MFR gives a landslide susceptibility index, while LR and RF provide the probability of landslides. We reclassified them into five susceptibility zones (very low, low, moderate, high, and very high) using the Jenks natural break method. Later, we used the raster calculator in the ArcGIS platform to subtract the reclassified maps from one another. The outcome can be any integer, but only "zero" indicates the areas that were classified into the same susceptibility zone by the two compared models. We calculated the percentage of area under this "zero" class; this percentage indicates the spatial convergence, or agreement, between the two compared models [69].
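The subtraction step can be sketched on flat lists of class codes instead of rasters. The class codes (1 = very low to 5 = very high), the two example maps, and the function name are hypothetical, and equal-area cells are assumed so that a share of cells equals a share of area.

```python
def spatial_agreement(classes_a, classes_b):
    """Percentage of cells whose class difference between two maps is zero."""
    differences = [a - b for a, b in zip(classes_a, classes_b)]
    return 100.0 * differences.count(0) / len(differences)


# Hypothetical reclassified maps from two models, one value per cell.
map_model_1 = [1, 2, 3, 4, 5, 5, 3, 2, 1, 4]
map_model_2 = [1, 2, 4, 4, 5, 3, 3, 2, 2, 4]

print(f"{spatial_agreement(map_model_1, map_model_2):.1f}% of cells agree")
```

Non-zero differences show how many zones apart the two models are in a cell, but only the zero class enters the agreement percentage described in the text.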

#### **5. Results**

#### *5.1. Susceptibility Assessment Using Modified Frequency Ratio (MFR)*

We found that the prediction rates (PRs) of the slope, aspect, and elevation derived from three DEMs (ASTER, SRTM, and ALOS PALSAR) are similar. The PRs for these three factors are around 2.25, 1.0, and 3.0, respectively (see Table A1 of Appendix D). Compared to these three DEMs, the SOB DEM showed different PRs. Our analysis revealed that for slope, aspect, and elevation, the SOB DEM-based PRs are 1.79, 2.37, and 2.41, respectively. We further found that with the increase of slope, the probability of landslide increases. The relatively safer zones are in areas below 8° slope where frequency ratio (FR) < 1. For ASTER, SRTM, and ALOS PALSAR DEMs, we found that the slope class 14–23° had the highest probability of landslides. However, for SOB DEM, the highest probability of landslide is in the 8–14° slope class. In the case of TWI, we found that ALOS PALSAR DEM has the highest PR (4.30) followed by ASTER DEM (PR = 3.19) and SOB DEM (PR = 1.61). For all four DEMs, the probability of landslides was higher in areas where TWI is less than 6. The SPIs derived from the four DEMs have lower PRs compared to other causal factors, and the class-wise weights (FR values) showed the same sort of pattern. We further found that for plan curvature, ALOS PALSAR has the highest PR (4.30), while for profile curvature, SRTM has the highest PR (4.68).

In general, we found no specific pattern of FR and PR values for these seven factors. We observed that causal factors derived from either ALOS PALSAR or SRTM have higher PR values, and SOB has the lowest PR among the four DEMs. The causal factors derived from different DEMs did not have a significant impact on the FR and PR values of MFR. This finding is similar to the findings of Chang et al. [29]. Most of the topographic factors, other than the TWI and SPI, are first derivatives of the DEMs. Thus, DEMs have more substantial impacts on TWI than on other factors. This is because small differences among the DEMs become pronounced when the second derivative is used [29].

The class-wise FR values of the eight factors that are not derived from the DEMs are generally similar; however, the PR values are different (see Table A1 of Appendix D). The causal factors derived from the DEMs played a crucial role in determining the PR values for these eight factors. For ASTER (0.141), SRTM (0.134), and ALOS PALSAR (0.139), the smallest difference between the maximum and minimum relative frequency was for aspect, while for SOB (0.160) it was for SPI. Because these values differ slightly, they affected the PR values.

#### Landslide Susceptibility Maps (MFR)

We produced Landslide Susceptibility Indices (LSIs) of the four MFR models using Equation (6) for DEM-based causal factors. The LSI of MFR\_ASTER\_DEM ranged from 994.6 to 8388.1. The LSIs for MFR\_SRTM\_DEM, MFR\_ALOS\_DEM, and MFR\_SOB\_DEM ranged from 591.4 to 9458.8, 527.8 to 10056.5, and 638.8 to 6130.0, respectively. LSI does not have a unit. It is the product of relative frequency and prediction rate (Equation (6)), and neither of these has units. The greater the LSI, the greater the landslide susceptibility, and the smaller the LSI value, the lower the susceptibility [35,44]. The ranges of LSIs indicate that the MFR\_SRTM\_DEM and MFR\_ALOS\_DEM models had a comparatively broader range than the other two models. This happened because of the variable FR and PR values. For ASTER DEM, the highest PR value was for plan curvature, while for other DEMs it was for profile curvature (see Table A1 of Appendix D). For all DEM-based factors, SOB had the lowest PRs among the four models, and this affected the LSIs. Later, we used the same Equation (6) to produce four MFR models based on 15 causal factors. The LSI of MFR\_ASTER DEM ranged from 1613.00 to 20370.10. The LSIs for MFR\_SRTM, MFR\_ALOS, and SOB DEMs ranged from 1314.40 to 22300.34, 1234.95 to 22180.24, and 1995.7 to 17316.9, respectively. The LSIs of MFR\_SRTM and MFR\_ALOS DEMs have a comparatively broader range than the other two models. The highest PR value was for distance to the road network: 6.25 for ASTER and 5.64 for SOB (Table A1 of Appendix D), and 6.61 and 6.38 for SRTM and ALOS PALSAR, respectively. For other causal factors (Table A1 of Appendix D), SOB had the lowest PRs among the four, and therefore its LSI was the lowest. As mentioned earlier, the FR values varied for the seven topographic factors derived from the four different DEMs, and these factors had impacts on the PRs of the eight common factors. Ultimately, these variations defined the LSI of the susceptibility maps.
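To make the structure of the index concrete, the sketch below computes an LSI for a single map cell. Equation (6) is not reproduced in this part of the paper, so the sketch assumes the common form in which the LSI is the sum, over causal factors, of the factor's prediction rate multiplied by the relative frequency of the class the cell falls into. The PR values echo the approximate figures quoted in Section 5.1; the class labels and relative frequencies are invented for illustration only.

```python
def landslide_susceptibility_index(cell_classes, relative_frequency, prediction_rate):
    """LSI of one cell, assumed here to be sum over factors of PR * RF(class of the cell)."""
    return sum(
        prediction_rate[factor] * relative_frequency[factor][cls]
        for factor, cls in cell_classes.items()
    )


# Approximate PRs quoted in the text for slope, aspect, and elevation.
prediction_rate = {"slope": 2.25, "aspect": 1.0, "elevation": 3.0}

# Hypothetical relative frequencies per class (not values from Table A1).
relative_frequency = {
    "slope": {"<8": 10.0, "14-23": 40.0},
    "aspect": {"N": 12.0, "S": 20.0},
    "elevation": {"0-50": 15.0, "50-100": 30.0},
}

# Hypothetical cell: slope class 14-23 degrees, north-facing, elevation 50-100 m.
cell = {"slope": "14-23", "aspect": "N", "elevation": "50-100"}
lsi = landslide_susceptibility_index(cell, relative_frequency, prediction_rate)
```

Because a factor with a high PR scales every class weight of that factor, a single dominant PR (such as distance to the road network in the 15-factor models) can widen the LSI range considerably.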

Since different models had different LSIs, the Rescale by Function tool in ArcGIS was used to normalize the LSIs into a 0.0–1.0 scale. Later, we used the Jenks natural break method to classify the normalized LSIs into five susceptibility zones: very low, low, moderate, high, and very high. Generally, the spatial appearances of the landslide susceptibility maps have similarities with the maps of causal factors that have a higher contribution to landslides. In this study, this contribution is shown by PR and FR values (see Appendices A–C). We found that the spatial appearance of the seven causal factors derived from SOB was different from the spatial appearance of the seven causal factors derived from ASTER, SRTM, and ALOS PALSAR. ASTER, SRTM, and ALOS PALSAR based susceptibility maps (Figure 2a–c) show a comparatively smaller percentage of area as very low or low susceptibility zones than the SOB-based landslide susceptibility map (Figure 2d). However, the SRTM based map shows comparatively more areas as high and very high susceptibility zones than the other maps. We found that when all factors were considered, areas near the road network are highly susceptible to landslides (Figure 3a–d), because the PR values of the distance from the road networks were the highest.
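The normalization step can be illustrated with a short Python sketch. The ArcGIS Rescale by Function tool offers several transformation functions and the paper does not state which one was used, so a linear (min-max) transformation is assumed here. The function name is hypothetical, and the example uses the LSI range reported for MFR\_ASTER\_DEM.

```python
def rescale_linear(values, lower=None, upper=None):
    """Linearly rescale values to the 0.0-1.0 range (min-max normalization).

    Assumption: a linear transformation; other rescaling functions exist.
    """
    lower = min(values) if lower is None else lower
    upper = max(values) if upper is None else upper
    if upper == lower:
        raise ValueError("cannot rescale a constant range")
    return [(v - lower) / (upper - lower) for v in values]


# Minimum, midpoint, and maximum of the LSI range reported for MFR_ASTER_DEM.
normalized = rescale_linear([994.6, 4691.35, 8388.1])
```

After this step every model shares the same 0.0–1.0 scale, so the class breaks found for one map can be compared with those of another.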

**Figure 2.** Landslide susceptibility maps produced using Modified Frequency Ratio (MFR), Logistic Regression (LR) and Random Forest (RF) models (seven DEM-based factors).

**Figure 3.** Landslide susceptibility maps produced using MFR, LR, and RF models (all 15 factors).

#### *5.2. Susceptibility Assessment using Logistic Regression (LR)*

For the DEM-based factors, the LR model detected two to five statistically significant factors (see Table A2 of Appendix E). Elevation and slope were the two common statistically significant causal factors for ASTER, SRTM, and ALOS PALSAR based models. Since the highest number of causal factors was selected in the ALOS PALSAR based model, the DEM had the highest impact on that landslide susceptibility map. The odds ratios (Table A2 of Appendix E) show that slope was the most important factor of landslides for ASTER and SRTM based models, while aspect came out as the most crucial factor for ALOS PALSAR and SOB based models.

When all (15) causal factors were used, we found a total of four to eight statistically significant causal factors (see Table A2 of Appendix E). Slope, elevation, SPI, and aspect were the significant DEM-based causal factors for LR\_SRTM and LR\_ALOS based models. For LR\_SOB, aspect was the statistically significant DEM-based causal factor. When 15 causal factors were used, the model assessed the interaction of DEM-based factors with the common eight factors. Therefore, when only the DEM-based causal factors were used, some factors came out statistically significant (e.g., SPI for LR\_ASTER\_DEM). When all the factors were used in the LR\_ASTER based model, SPI came out insignificant. For LR\_SRTM and LR\_ALOS, three DEM-based causal factors were detected as statistically significant; therefore, for these two landslide susceptibility maps, DEM would have more impact than the LR\_ASTER and LR\_SOB susceptibility maps.

#### Landslide Susceptibility Maps (LR)

We used the Jenks natural break method to classify the probability of landslides into five zones: very low, low, moderate, high, and very high. The LR\_SOB\_DEM map (Figure 2h) has a different spatial appearance than the other three maps. LR\_SRTM\_DEM (Figure 2f) and LR\_ALOS\_DEM (Figure 2g) have an almost identical spatial appearance. In the LR\_ASTER\_DEM model, the slope had a higher coefficient (β = 0.26) (Table A2 of Appendix E) than in the SRTM (β = 0.14) and ALOS PALSAR (β = 0.05) models. Therefore, in the LR\_ASTER\_DEM map, the areas with steeper slopes, mainly in the mid-north and mid-south of the study area, were classified as high to very high susceptibility zones. In the LR\_SRTM\_DEM and LR\_ALOS\_DEM maps, these areas were classified as either moderate or high susceptibility zones. In LR\_SOB\_DEM, elevation and aspect were the only two significant factors. Therefore, the susceptibility map took the shape of the maps of these two factors (see Appendix A).

When all (15) causal factors were used, the spatial appearance of LR\_SOB (Figure 3h) was different from the other three maps (Figure 3e–g). The LR\_SOB map was influenced by the distance from the fault lines, distance from the road networks, and land use/land cover. Although aspect was a significant factor, the coefficient value of aspect was similar to other significant factors, and distance from the fault lines (β = 1.07) (Table A2 of Appendix E) had a higher coefficient value than aspect. Therefore, most of the study area was classified as low or very low susceptibility zones. In the LR\_ASTER model (Figure 3e), the slope had the highest coefficient (β = 0.31) (Table A2 of Appendix E), and therefore, areas with steeper slopes were classified as high or very high susceptibility zones. In the LR\_SRTM and LR\_ALOS models, however, the slope had comparatively lower coefficient values than in the ASTER model. As a result, some areas were classified as moderate susceptibility zones in these two maps (Figure 3f–g). In the LR\_SRTM and LR\_ALOS susceptibility maps, common factors such as distance from the road networks and fault lines, land use/land cover, and land use/land cover change did not have higher coefficient values than the DEM-based causal factors. That is why, unlike LR\_SOB, the spatial appearance of these susceptibility maps did not follow the appearance of the maps of common factors.

#### *5.3. Susceptibility Assessment using Random Forest (RF)*

The RF\_ASTER\_DEM, RF\_SRTM\_DEM, and RF\_ALOS\_DEM models identified slope as the most critical factor, while the RF\_SOB\_DEM model identified aspect (Figure 4a). When all 15 factors were used in the models, RF\_ASTER and RF\_SRTM (Figure 4b) detected slope, and the other two models detected distance from the road network, as the most important causal factor. For RF\_ASTER, RF\_SRTM, and RF\_ALOS, DEM-based factors such as elevation, TWI, and aspect had higher importance in the models than the common factors. In the RF\_SOB model, however, DEM-based factors had less importance than the common factors. There is no similarity among the models in detecting the importance of DEM-based causal factors. For example, in RF\_SOB, the slope was ranked as one of the least important factors, but for other models, it was ranked as the most important factor (Figure 4b).

**Figure 4.** Variable importance plots of the random forest models: (**a**) the seven Digital Elevation Model (DEM)-based causal factors used in the models; (**b**) all 15 factors used in the models. Pl = plan curvature; Pr = profile curvature; LC = land use/land cover; LLC = land use/land cover change; DF = distance from the fault lines; DD = distance from the drainage networks; DR = distance from the road networks.

#### Landslide Susceptibility Maps (RF)

As with MFR and LR, we used the same method to classify the probability of landslides into five susceptibility zones. The spatial appearance of the RF\_SOB\_DEM susceptibility map (Figure 2l) was different from the susceptibility maps of the other three models (Figure 2i–k). For RF\_ASTER\_DEM (Figure 2i), areas in the mid-north to mid-south were classified as high or very high susceptibility zones, while in RF\_SOB\_DEM and RF\_ALOS\_DEM the same areas were classified as moderate susceptibility zones. In RF\_ASTER\_DEM, slope was the most critical factor. Similarly, in RF\_SRTM\_DEM and RF\_ALOS\_DEM slope was the most crucial factor, but in these two models, the contribution of the slope (Figure 4a) to the model is smaller than in the RF\_ASTER\_DEM model. Because the difference in variable importance between slope and the other factors was comparatively higher in RF\_ASTER\_DEM than in the other models, the effect of slope on its susceptibility map was visible.

When all 15 factors were used, the spatial appearances of the RF\_ASTER, RF\_SRTM, RF\_ALOS, and RF\_SOB models (Figure 3i–l) were different from each other. Since distance from the road networks was the most crucial factor in the RF\_SOB model, areas near roads were classified as high or very high susceptibility zones. Distance from the road network was not ranked as the most critical factor in the other three models.

#### *5.4. Validation and Comparison of Landslide Susceptibility Maps*

#### 5.4.1. Success and Prediction Rate Curves

#### DEM-Based Causal Factors

When only the seven DEM-based causal factors were used for the MFR models, the MFR\_SRTM\_DEM model gave the best performance among all DEMs for both the success (AUC = 80.73%) and prediction (AUC = 77.37%) rate curves (Figure 5a,b). The MFR\_SOB\_DEM model performed the weakest for both success and prediction. The AUCs of the success rate curves (Figure 5a) showed that MFR\_SRTM\_DEM falls under the good category, MFR\_ASTER\_DEM and MFR\_ALOS\_DEM fall under the fair category, and MFR\_SOB\_DEM falls under the poor category. The AUCs of the prediction rate curves (Figure 5b) show that all models other than MFR\_SOB\_DEM gave fair performances.

**Figure 5.** Success and prediction rate curves (seven DEM-based factors): (**a**) MFR (success); (**b**) MFR (prediction); (**c**) LR (success); (**d**) LR (prediction); (**e**) RF (success); (**f**) RF (prediction).

For LR, LR\_ALOS\_DEM outperformed the other three models (Figure 5c,d). LR\_SOB\_DEM presented the weakest performance among the four models and thus fell under the fail category. The AUCs of success and prediction rates (Figure 5c,d) show that the other three models are under the fair category. For the RF models, we obtained results similar to those of the LR models. RF\_ALOS\_DEM outperformed the other models, and RF\_SOB\_DEM was the least accurate model. RF\_ASTER\_DEM and RF\_SRTM\_DEM gave almost similar performances.

We found that when all 15 causal factors are used, different models showed variable performances. For the MFR model, the use of 15 causal factors decreases the predictive performance on an average by 5% of landslide susceptibility maps based on ASTER, SRTM, and ALOS PALSAR (see Figure 6a,b).

models, DEM-based causal factors can give better prediction performance and use of non-DEM-based factors can reduce the accuracy. Inclusion of more DEM-based causal factors and more landslide locations may increase the accuracy of the models in the study area.

**Figure 6.** Success and prediction rate curves (15 factors): (**a**) MFR (success); (**b**) MFR (prediction); (**c**) LR (success); (**d**) LR (prediction); (**e**) RF (success); (**f**) RF (prediction).

For the LR model, the use of 15 factors increased the accuracy of the model. For the three global DEMs, success rates increased by around 5.0%, but for SOB the increase was 21.7%. Prediction rates showed the same trend: for the three global DEMs the increase in performance was around 3.5%, but for SOB it was 17.0%. This indicates that the use of common factors increased the accuracy substantially for the SOB DEM.

As with the LR model, the use of 15 causal factors increased the accuracy of the RF models. For ASTER and SRTM, the increase in the success rate was around 7%. For ALOS PALSAR, the success rate increased by 12.4%. Here again, SOB had the highest increase (22.6%) in success rate. The increase in prediction rate was not as high as the increase in success rate. For the three global DEMs, prediction rates increased by around 2%–3%, and for SOB the prediction rate increased by 10.1%. Machine learning algorithms such as random forest learn the behavior of the training data. Therefore, the increase in success rate due to the inclusion of new variables was high. Because such algorithms learn the behavior of the training data closely, they may predict the validation or unknown data less well [60]. Therefore, in our study, the increase in prediction rate for the RF models is around 50% lower than the increase in success rate.

#### 5.4.2. Spatial Comparison of Landslide Susceptibility Maps

Spatial convergence indicates how much area is classified into the same susceptibility zones. When seven DEM-based factors were used in the MFR model, MFR\_SOB\_DEM had 30% spatial convergence while the remaining DEMs had around 40% (Table 3). As discussed before, the landslide susceptibility map of MFR\_SOB\_DEM has a different spatial appearance (Figure 2d) than the other three susceptibility maps, and these results (Table 3) support that discussion. For the LR models, Table 3 shows a similar trend. Spatial convergences of LR\_ASTER\_DEM, LR\_SRTM\_DEM, and LR\_ALOS\_DEM were around 44%, while LR\_SOB\_DEM showed approximately 19% spatial convergence. For the RF models, Table 3 shows similarities with the findings for MFR and LR.


**Table 3.** Spatial comparison and convergence analysis of landslide susceptibility maps.

When all factors were considered for modeling, spatial convergence between the DEMs (Table 3) increased by around 40% for the MFR models, while for the LR and RF models it was approximately 25% and 12%, respectively. The MFR model uses all causal factors, whereas the LR model uses only the significant causal factors, and in the RF model two to three causal factors (for example, profile and plan curvature) had comparatively low or no variable importance.

The results of the Friedman tests (Table 4) show that in both scenarios (a. seven DEM-based causal factors used, and b. 15 causal factors used) *P* < 0.05 for the MFR and RF models. This means that at least one of the landslide susceptibility models had a significantly different performance than the rest of the models. For the LR models, when seven DEM-based causal factors were used, at least one of the models was statistically different in performance from the rest of the models. When all factors were used in the LR models, there was no statistically significant difference in performance between the models. The Wilcoxon signed-rank test was used for the pairwise comparison. The results (Table 4) show that in scenario two, the performances of the landslide susceptibility maps produced using SRTM and ALOS PALSAR did not have a statistically significant difference. Other than that, all the performances of the MFR based landslide susceptibility maps were statistically different from each other (α = 0.008 after Bonferroni correction). When seven causal factors were used for the LR models, the performance of the SOB based model was significantly different (Table 5) from all other models. But when all factors were used, these differences became insignificant. This indicates that the eight common factors overshadow the effect of the SOB based causal factors. This did not happen in the MFR model because it does not consider the interaction of causal factors.
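The corrected significance level quoted above can be reproduced with a few lines of Python. The sketch assumes a family-wise significance level of 0.05 divided across all pairwise comparisons of the four DEMs; this is an inference that matches the reported threshold of 0.008, not a procedure stated explicitly in the text. The helper `is_significant` is a hypothetical name.

```python
from itertools import combinations

dems = ["ASTER", "SRTM", "ALOS PALSAR", "SOB"]

# Every unordered pair of DEM-based models that a pairwise test would compare.
pairs = list(combinations(dems, 2))

# Bonferroni correction: divide the assumed family-wise alpha by the number of tests.
family_alpha = 0.05
alpha_corrected = family_alpha / len(pairs)


def is_significant(p_value, alpha=alpha_corrected):
    """True if a pairwise p-value passes the Bonferroni-corrected threshold."""
    return p_value < alpha
```

Four DEMs give six pairs, so the corrected level is 0.05 / 6 ≈ 0.0083, which rounds to the 0.008 used in Table 5. A pairwise p-value of 0.02, for example, would be significant at 0.05 but not after the correction.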


**Table 4.** Result of the Friedman Test for landslide susceptibility maps.

In the case of the RF models, RF\_ALOS\_DEM (Table 5) was statistically different from the other models. But when all causal factors were used, the difference in performance became insignificant for the RF\_SRTM model. RF uses a more complex algorithm than the LR and MFR models. Therefore, the Wilcoxon signed-rank test gave different results for the RF models than for the MFR and LR models.


**Table 5.** Comparison of landslide susceptibility maps based on three different DEMs using Wilcoxon signed-rank test.

\* = Significant after Bonferroni correction (*P* < 0.008).

#### **6. Discussion**

This paper evaluates the suitability of three available global DEMs: ASTER, SRTM, and ALOS PALSAR and a local DEM: SOB for landslide susceptibility mapping in Rangamati district, Bangladesh. Causal factors derived from ASTER and ALOS PALSAR DEM have been used in landslide susceptibility mapping in different parts of the Chittagong hilly areas, Bangladesh [7,9,31–35]. Since the study areas of these studies were different, we could not compare them to find out which DEM gives the best accuracy in the prediction of landslide susceptibility [36]. Our study showed that three global DEMs outperformed the local DEM in all three landslide modeling scenarios.

In the first scenario, only DEM-based causal factors were used in modeling, and among the MFR models, MFR\_SRTM\_DEM outperformed the other three. The difference in AUCs of both the success and prediction rate curves between MFR\_SRTM\_DEM and MFR\_ALOS\_DEM was around 3%, indicating a similarity in predictions. For the processing of ALOS PALSAR DEM, SRTM GL1 data is used for radiometric correction [71]. Therefore, the quality of ALOS PALSAR DEM depends on the quality of SRTM DEM. The SOB DEM-based MFR did not show a good performance for the study area, and the prediction performance might be improved if a more representative landslide inventory with more landslide locations were used. We found that the ASTER DEM-based MFR model showed a weaker performance than the other two open-source global DEMs. Our result is consistent with other studies that utilized ASTER DEM [29,72,73]. This may be because ASTER DEM contains many artifacts, such as the presence of peaks in flat terrain, which ultimately affect the landslide susceptibility map [29]. The poor performance of the SOB DEM can be attributed to the interpolation methods that were used to estimate elevations from the available spot heights in the hilly parts of Bangladesh [74]. This affected the accuracy and quality of the DEM in the hilly parts of Bangladesh. On the other hand, for LR and RF, ALOS PALSAR based models outperformed the rest of the models. Here again, the difference between LR\_ALOS\_DEM and LR\_SRTM\_DEM was low for success and prediction. In all cases, SOB based models gave the worst performance, and causal factors derived from SOB DEM cannot explain the landslide susceptibility of the study area. For example, in LR\_SOB\_DEM, the LR model used two significant causal factors: elevation and aspect (Table A2 of Appendix E). The low coefficient (β) values of these two factors indicate that they cannot adequately explain the landslide susceptibility of the study area.

In the second scenario, for the MFR models, the use of 15 causal factors increased the prediction accuracy for MFR\_SOB, but for the other three models it reduced accuracy. This indicates that causal factors derived from the three global DEMs were capable enough to explain the landslide susceptibility of the study area. MFR is a bivariate model, and it does not consider the interaction of the causal factors. Moreover, unlike the LR and RF models, it does not require non-landslide (pseudo-absence) points in modeling [44]. When all causal factors were used, PRs of some of the common causal factors were comparatively higher than the PRs of the DEM-based causal factors. For example, PRs of distance from the road networks were around 6.00 (Table A1 of Appendix D), while PRs of the DEM-based causal factors ranged from 1.00 to 4.43. Therefore, the higher PRs of the common causal factors overshadowed the DEM-based factors [6,35,44]. On the other hand, the quality of the SOB DEM was poor, and it was not capable of explaining the landslide susceptibility of the study area. That is why, in the MFR\_SOB model, the prediction accuracy increased, but the increase was very low. For the LR and RF models, in the second scenario, the prediction accuracy increased by 2%–3% for global DEM-based models, while for SOB based models the increase exceeded 10%. Both LR and RF models consider the interaction of causal factors, and therefore, in these two models, the effects of common factors on prediction performance were revealed better than in the MFR models.

The findings of this study have similarities with other research where the suitability of different global and local DEMs was evaluated for landslide susceptibility mapping [26–28]. In most of the studies, ASTER DEM-based landslide susceptibility maps were outperformed by the other global DEM-based landslide susceptibility maps. On the other hand, local DEM-based landslide susceptibility maps had better prediction accuracy than the global DEMs [1,29]. In these studies, local DEMs were mainly light detection and ranging (LiDAR)-based DEMs, which generally have very high spatial resolution compared to global DEMs [29]. The SOB DEM was prepared under the project titled "Improvement of Digital Mapping System of Survey of Bangladesh", where the main aim was to prepare a 1:5000 scale DEM of major cities in Bangladesh. This SOB project prepared a DEM for the hilly areas of Bangladesh, including our study area, using the local spot heights. Different interpolation methods were used to prepare a 25 m DEM from these local spot heights [73]. Therefore, in hilly areas of Bangladesh, the quality of the SOB DEM is not good enough, and the application of this DEM in geomorphological studies such as landslide susceptibility mapping can give questionable results similar to what we obtained in our study. Our study also points out the importance of preparing very high-resolution DEMs using LiDAR for the hilly areas of Bangladesh. This will help in various geomorphological studies, including landslide susceptibility mapping. As we did not find a substantial difference among global DEMs, any global DEM can be utilized for landslide susceptibility mapping in Bangladesh in the absence of very high-resolution DEMs.

For an in-depth study, we utilized non-parametric tests. Chang et al. [30] used non-parametric tests to detect a significant difference in performance in landslide susceptibility maps prepared using different DEM-derived causal factors. Their study did not find any significant difference in performance for two machine learning methods: RF and support vector machines. In our research, we did not see any significant difference in performance for RF and LR based models. But for the MFR model, we found a significant difference in performance. It indicates that the effect of DEM-based causal factors on the performance of bivariate models is more than on the multivariate and machine learning models. Thus, we suggest using multivariate and machine learning methods and any one of the global DEMs for landslide susceptibility mapping in Bangladesh.

#### **7. Conclusions**

This paper assesses the effects of the DEM-derived causal factors on the landslide susceptibility maps produced using bivariate (e.g., MFR), multivariate (e.g., LR), and machine learning (e.g., RF) models. In this study, we tested two scenarios: (a) susceptibility assessment with only seven DEM-based causal factors; and (b) inclusion of eight other causal factors along with the DEM-derived factors. The success and prediction rate curves showed that the SRTM DEM performed best under the MFR model in both scenarios. Our analysis revealed that the ALOS PALSAR DEM provided the best prediction accuracy when using both the LR and RF models in landslide susceptibility mapping. For all models and scenarios, the SOB DEM does not perform well compared to the other DEMs.

The prediction accuracy of landslide susceptibility mapping using only DEM-derived factors is low compared to the utilization of all causal factors. Although the SOB DEM has a poor performance in susceptibility assessment with only DEM-derived factors, the accuracy is significantly improved when other non-DEM-based factors are added. Therefore, we argue that causal factors derived from the SOB DEM have limited influence in landslide susceptibility assessment for the study area. Besides, for the LR and RF models, the use of all the causal factors increased the prediction performance. Spatial convergence analysis showed that the three global DEM-based models have similar accuracies and performed far better than the SOB DEM-based models. Therefore, we recommend that the SOB DEM should not be used for landslide susceptibility assessment in Bangladesh. Although ASTER DEM-based models showed the weakest performance among the three global DEMs, the difference in performance was negligible. Therefore, we recommend that any one of these three global DEMs can be utilized for landslide susceptibility assessment in Bangladesh.

Our study also highlights that, for the MFR model, DEM had the highest impact on the accuracy of landslide susceptibility assessment, as Wilcoxon signed-rank tests showed that the performance of the susceptibility maps was significantly different. For the LR and RF models, the effect of DEM was less significant. We contend that while using the bivariate model, we must be careful about the quality of DEM. Two scenarios in this study helped to understand the impact of DEMs in landslide susceptibility assessment. Moreover, we used multiple metrics to evaluate the accuracies of susceptibility assessment, including AUC, spatial convergence analysis, and non-parametric tests. Although the use of one performance measure is a common practice, the utilization of various measures helped this research to understand the impacts of DEM and makes the conclusion robust.

The main limitation of our study is that we cannot ascertain if adding more DEM-derived factors such as terrain roughness could improve the result. The impacts of DEM on landslide susceptibility maps are not universal and may vary from place to place. Therefore, we cannot conclude that a specific DEM can be better than others. Advanced machine learning and deep learning methods can be used to check whether these algorithms can reduce the differences in prediction performance of landslide susceptibility maps prepared using different DEM-derived causal factors.

**Author Contributions:** Conceptualization, Y.W.R.; methodology, Y.W.R. and M.S.R.; analysis, Y.W.R.; writing, Y.W.R., A.I., and M.S.R.; review and editing, A.I. and M.S.R. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** The authors thank Bayes Ahmed (Lecturer, Institute for Risk and Disaster Reduction, University College London) for his assistance in obtaining some datasets.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

**Figure A1.** Landslide Causal Factors: (**a**) Elevation (ASTER); (**b**) Elevation (SRTM); (**c**) Elevation (ALOS PALSAR); (**d**) Elevation (SOB); (**e**) Slope (ASTER); (**f**) Slope (SRTM); (**g**) Slope (ALOS PALSAR); (**h**) Slope (SOB); (**i**) Aspect (ASTER); (**j**) Aspect (SRTM); (**k**) Aspect (ALOS PALSAR); (**l**) Aspect (SOB).

#### **Appendix B**

**Figure A2.** Landslide Causal Factors: (**a**) Plan Curvature (ASTER); (**b**) Plan Curvature (SRTM); (**c**) Plan Curvature (ALOS PALSAR); (**d**) Plan Curvature (SOB); (**e**) Profile Curvature (ASTER); (**f**) Profile Curvature (SRTM); (**g**) Profile Curvature (ALOS PALSAR); (**h**) Profile Curvature (SOB); (**i**) TWI (ASTER); (**j**) TWI (SRTM); (**k**) TWI (ALOS PALSAR); (**l**) TWI (SOB); (**m**) SPI (ASTER); (**n**) SPI (SRTM); (**o**) SPI (ALOS PALSAR); (**p**) SPI (SOB).

#### **Appendix C**

**Figure A3.** Landslide Causal Factors: (**a**) Rainfall; (**b**) Distance from the Road Networks; (**c**) Distance from the Drainage Networks; (**d**) Distance from the Fault Lines; (**e**) NDVI; (**f**) Geology; (**g**) Land Use / Land Cover; (**h**) Land Use/ Land Cover Change.

#### **Appendix D**

**Table A1.** Spatial relationship between Causal Factors and Landslides. A = ASTER DEM; B = SRTM DEM; C = ALOS PALSAR DEM; D = SOB DEM.




#### **Appendix E**



1 = land use/land cover change; 2 = distance from the drainage networks; 3 = distance from the road networks; 4 = land use/land cover; 5 = distance from the fault lines.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

# **Landslide Characterization Applying Sentinel-1 Images and InSAR Technique: The Muyubao Landslide in the Three Gorges Reservoir Area, China**

**Chao Zhou 1,2 , Ying Cao 3 , Kunlong Yin 3, \*, Yang Wang 3 , Xuguo Shi 1 , Filippo Catani 4 and Bayes Ahmed 5**


Received: 21 August 2020; Accepted: 13 October 2020; Published: 16 October 2020

**Abstract:** Landslides are a common natural hazard that causes casualties and unprecedented economic losses every year, especially in vulnerable developing countries. Considering the high cost of in-situ monitoring equipment and the sparse coverage of monitoring points, the Sentinel-1 images and Interferometric Synthetic Aperture Radar (InSAR) technique were used to conduct landslide monitoring and analysis. The Muyubao landslide in the Three Gorges Reservoir area in China was taken as a case study. A total of 37 images from March 2016 to September 2017 were collected, and the displacement time series were extracted using the Stanford Method for Persistent Scatterer (StaMPS) small baselines subset method. The comparison to global positioning system monitoring results indicated that the InSAR processing of the Muyubao landslide was accurate and reliable. Combined with the field investigation, the deformation evolution and its response to triggering factors were analyzed. During this monitoring period, the creeping process of the Muyubao landslide showed obvious spatiotemporal deformation differences. The changes in the reservoir water level were the trigger of the Muyubao landslide, and its deformation mainly occurred during the fluctuation period and high-water level period of the reservoir.

**Keywords:** landslide deformation; InSAR; reservoir water level; Sentinel-1; Three Gorges Reservoir area (China)

#### **1. Introduction**

Landslides are one of the most common types of natural hazards and cause serious economic losses, casualties, and damage to buildings, critical infrastructure and industrial settlements [1]. The impact of landslides is particularly high in rural mountainous areas, where land management is difficult to achieve and risk mitigation is scarce due to the large distances, difficult logistics and harsh climatic conditions. For example, on 2 May 2014, a large-scale landslide occurred in Badakhshan Province in Northeastern Afghanistan due to continuous heavy rainfall. It resulted in nearly 2700 deaths and buried more than 300 houses [2]. Similarly, a giant landslide in the Himachal Pradesh region of Northern India on 13 August 2017 killed 47 people and completely destroyed nearly 300 m of highway.

Monitoring and early warning systems are effective methods to reduce the risk of landslides [3], but applying them to every potentially unstable slope is often impossible in mountain areas because of the high spatial frequency of affected slopes. In such cases, remote sensing may help to reduce the cost and time for the application of mitigation measures, because it can provide a preliminary low-cost assessment of the severity of the slope instability and allow for the prioritization of critical cases. The use of remote-sensing data is currently one of the hotspots of landslide research. Among all the physical manifestations of a mass movement, surface deformation is the most intuitive and comprehensive to measure and use for hazard assessment. It is a critical indicator for developing landslide early warning systems. Therefore, the implementation of deformation monitoring is of great significance for landslide prevention and mitigation [4,5]. The majority of fatal landslide events occur in less developed countries, such as China, India, Nepal and Bangladesh [1]. Traditional deformation monitoring equipment, such as inclinometers and global positioning system (GPS) receivers, is well suited to this task, but its high cost limits its application in underdeveloped areas.

Synthetic Aperture Radar Interferometry (InSAR) is an effective surface deformation monitoring method with wide coverage and high accuracy. Commercial radar imagery can be expensive and has mainly been used for landslide hazard prevention in developed areas, including landslide identification [6,7], monitoring and early warning [8–10] and risk assessment [11]. With the free distribution of Sentinel-1 satellite images, InSAR has become a low-cost landslide monitoring technique. Small Baselines Subset (SBAS) InSAR is a time series InSAR analysis method that can achieve effective monitoring in both rural and urban areas. It has been gradually applied in landslide-prone areas, providing a new technical means for landslide monitoring and early warning [12–16].

The Three Gorges Reservoir area (TGRA) has been impounded since 2003, and the reservoir water level has been periodically scheduled between 145 m and 175 m since 2009. Affected by heavy rainfall and reservoir water level scheduling, a large number of new landslides have occurred and old landslides have been reactivated [17]. To date, more than 500 landslides have undergone varying degrees of deformation. However, due to the high cost of dedicated landslide monitoring, only 254 landslides in the TGRA are monitored [18], which cannot meet the needs of disaster prevention and mitigation.

The Muyubao landslide located in the TGRA was taken as a case study in this paper. Based on the Sentinel-1 radar images, the SBAS InSAR analysis method was applied to extract the information of landslide movement. The characterization of its temporal and spatial deformation was conducted as well. Moreover, the evolution mechanism of the Muyubao landslide was analyzed with the combined consideration of its engineering geological conditions and triggering factors, including rainfall and fluctuation of reservoir water level. This study aims to explore the feasibility and effectiveness of Sentinel-1 images and InSAR technique in landslide monitoring and analysis in the TGRA as a case example from an underdeveloped landslide-prone region.

#### **2. Case Study: Muyubao Landslide**

#### *2.1. Geological Conditions*

The Muyubao landslide occurred in Shazhenxi Town, Zigui County of Hubei Province on the bank of the Yangtze River and 56 km away from the Three Gorges Dam. The Muyubao landslide is delimited laterally by N-S-oriented trenches, while the upper portion is straight and smooth. It is of a chair-like shape, with smooth topography in the middle and lower parts and steeper terrain in the upper part (Figure 1). The elevation of the Muyubao landslide ranges from 120 m a.s.l. to 425 m a.s.l., with an average slope angle of about 20°. It has an average thickness of 50 m and an estimated volume of 9 Hm<sup>3</sup>. The entire landslide involves an area of 180 × 10<sup>4</sup> m<sup>2</sup>, with a maximum longitudinal dimension of 1200 m and an average width of 1500 m (Figure 2).

**Figure 1.** (**a**) The location of the Three Gorges Reservoir area (TGRA), (**b**) location of the Muyubao landslide, and (**c**) geomorphological delimitation of the Muyubao landslide.

**Figure 2.** Topographical map of the Muyubao landslide.

The Muyubao landslide developed within a flysch formation dipping in the same direction as, and roughly parallel to, the slope. The stratigraphy consists of the typical succession of quartz sandstone and sand-mudstone layers of the Jurassic Xiangxi formation (Figure 3). The sliding body is mainly composed of two parts. The surface layer is made up of loose deposits composed of alluvial sub-clay, gravel layers with clay, colluvial and eluvial sub-clay and collapsed block stones. The lower part of the sliding body is composed of highly disrupted layered quartz sandstone, which is relatively intact in the western upper portion and weathered and fractured in the upper and lower portions.

**Figure 3.** Schematic geological cross-section I-I′ of the Muyubao landslide. Please see the location of I-I′ in Figure 2.

The upper portion of the sliding body has a linear detachment surface in the profile map, with an inclination of about 25° and a thickness of 60~90 m. The front part is an uplift platform formed by shear sliding, with a thickness of 80~120 m, and the inclination angle of the rock layer is about 27° (Figure 3). The thickness values here refer to the central part of the landslide, as illustrated in the profile of Figure 3, and may decrease towards the boundaries. The landslide is still active, as it has experienced several reactivation events in the recent past [19,20]. This makes it a priority case for monitoring because, should the movement develop towards sudden failure, it would endanger the lives and property of 140 households (500 people) in the landslide area and threaten the safety of the road and of shipping on the Yangtze River.

#### *2.2. Field Investigation of Landslide Deformation*

After the reservoir was impounded to 175 m in the TGRA in September 2009, the deformation of the Muyubao landslide began to accelerate, and macro-deformations appeared in the landslide mass. In 2011, the road surface across both boundaries of the landslide was damaged, and the roadbed on the west boundary was cracked, with crack widths of 20 cm (Figure 4a). The cracks in the walls of the village buildings on the eastern side of the landslide continued to widen (Figure 4b). In the investigation of 2014, a local collapse was found on the west side of the landslide (Figure 4c). The road across the boundaries was severely damaged, showing tensile cracks (Figure 4d). There was also a collapse under the road in the middle of the west side of the Muyubao landslide, with a length of 50~70 m, a width of 10~15 m and a thickness of 5~10 m (Figure 4e). A tensile trench about 60 m long and 15 m wide was formed by a partial collapse on the east side of the landslide (Figure 4f).

**Figure 4.** Deformations on the Muyubao landslide: (**a**) road damage on the west boundary, (**b**) deformation of buildings, (**c**) a local collapse on the west side, (**d**) road damage on the east side, (**e**) a collapse in the middle of the west side and (**f**) a tensile trench on the east side.

#### **3. Methods**

#### *3.1. Small Baselines Subset InSAR Analysis*

With the development of the InSAR technique and the increasing number of synthetic aperture radar (SAR) images, Ferretti et al. [21] proposed the Persistent Scatterer InSAR (PSInSAR) method, which is based on the differential InSAR technique. The main idea of PSInSAR is to use multiple SAR images covering the same area to analyze the stability of the amplitude and phase and, then, to identify the pixels that are less affected by spatiotemporal decorrelation. To obtain accurate deformation information, the components of the phase need to be jointly analyzed and modeled to remove errors. In the PSInSAR method, a single master image is selected from all images to generate the interferograms. This strategy results in interferometric pairs with long spatial baselines and produces a low density of target reflectors in nonurban areas. Hence, in rural areas with few artificial buildings, the permanent scattering pixels may be very sparse [22].
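As an illustration of the amplitude-stability screening mentioned above, the sketch below computes a per-pixel amplitude dispersion (the ratio of the temporal standard deviation to the temporal mean of the amplitude), a measure commonly used in the persistent scatterer literature. The function names and the threshold of 0.25 are illustrative assumptions and are not taken from this study.

```python
import numpy as np


def amplitude_dispersion(amplitudes):
    """Per-pixel amplitude dispersion: temporal std / temporal mean.

    amplitudes: array of shape (n_images, n_pixels) with the amplitude
    of the same pixels in every co-registered SAR image.
    """
    amplitudes = np.asarray(amplitudes, dtype=float)
    mean_a = amplitudes.mean(axis=0)
    std_a = amplitudes.std(axis=0)  # population standard deviation
    return std_a / mean_a


def select_candidates(amplitudes, threshold=0.25):
    """Boolean mask of pixels whose dispersion is below the threshold.

    The default threshold is a hypothetical value for illustration only.
    """
    return amplitude_dispersion(amplitudes) < threshold


# Two pixels observed in two images: the first is stable, the second is not.
stack = np.array([[1.0, 1.0],
                  [1.0, 3.0]])
dispersion = amplitude_dispersion(stack)
mask = select_candidates(stack)
```

A low dispersion suggests a stable scatterer, which is why such pixels become sparse where few man-made structures exist.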

The small baseline subset method was initially proposed to overcome the problem of decorrelation by making full use of interferograms with both short temporal baselines and short perpendicular baselines [23,24]. In rural areas, many pixels in the SAR image have no dominant scatterer. Pixels showing a slowly decorrelating filtered phase (SDFP) are widely distributed in natural terrain and maintain good coherence over short time intervals. The Stanford Method for Persistent Scatterer (StaMPS) was proposed in 2004 [25]. In the StaMPS SBAS method, these pixels are identified and analyzed to monitor surface displacement. The SDFP pixels are selected through their phase characteristics. To reduce the computational burden, an initial subset of pixels containing almost all SDFP pixels is first selected through amplitude analysis, in which the amplitude dispersion value is calculated as an indicator of phase stability. The wrapped phase is composed of a spatially correlated phase and a spatially uncorrelated look angle error. The spatially correlated phase, including ground deformation, elevation error and orbit error, is estimated by bandpass filtering in the frequency domain. The spatially uncorrelated look angle error mainly consists of the spatially uncorrelated elevation error and is estimated through its correlation with the perpendicular baseline. The residual, obtained by removing these two terms from the wrapped phase, gives an estimation of the decorrelation noise through $\gamma_{x}$ (Equation (1)), which indicates the stability of the pixel phase:

$$\gamma_{x} = \frac{1}{N} \left| \sum_{i=1}^{N} \exp \left\{ \sqrt{-1} \left( \psi_{x,i} - \widetilde{\psi}_{x,i} - \Delta \widehat{\phi}^{u}_{\theta,x,i} \right) \right\} \right| \tag{1}$$

where $N$ is the number of interferograms, $\psi_{x,i}$ is the wrapped phase, $\Delta\widehat{\phi}^{u}_{\theta,x,i}$ is the spatially uncorrelated look angle error and $\widetilde{\psi}_{x,i}$ is the spatially correlated term. The final SDFP pixels are selected through the threshold analysis of $\gamma_{x}$. The phase of these SDFP pixels can then be unwrapped with the three-dimensional phase unwrapping method [26]. The wrapped phase of a given pixel can be expressed as:

$$\psi_{x,i} = \mathcal{W} \left\{ \phi_{D,x,i} + \phi_{A,x,i} + \phi_{O,x,i} + \phi_{T,x,i} + \phi_{N,x,i} \right\} \tag{2}$$

where $\psi_{x,i}$ is the wrapped phase, $\mathcal{W}\{\cdot\}$ is the wrapping operator, and $\phi_{D,x,i}$, $\phi_{T,x,i}$, $\phi_{A,x,i}$, $\phi_{O,x,i}$ and $\phi_{N,x,i}$ are the phase components due, respectively, to ground deformation, topographic error, atmospheric disturbance, inaccurate orbit information and noise. The StaMPS SBAS analysis provides a theoretical framework for three-dimensional phase unwrapping. After unwrapping, the different phase components are estimated through iterative filtering that takes their characteristics in the spatial and temporal domains into account. The deformation phase can be obtained by removing the other components from the unwrapped phase. Afterwards, a time series of deformation can be obtained from the phase using singular value decomposition.
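Equation (1) can be evaluated directly once the two phase estimates are available. The following minimal sketch assumes that the wrapped phase, the spatially correlated estimate and the look angle error estimate are already stored as arrays; the function and argument names are hypothetical and do not correspond to the StaMPS code.

```python
import numpy as np


def temporal_coherence(psi, psi_tilde, dphi_u):
    """Evaluate Equation (1) for every pixel.

    All inputs have shape (n_interferograms, n_pixels), in radians:
      psi       wrapped phase
      psi_tilde estimate of the spatially correlated phase
      dphi_u    estimate of the spatially uncorrelated look angle error
    Returns gamma_x with shape (n_pixels,), between 0 and 1.
    """
    residual = np.asarray(psi) - np.asarray(psi_tilde) - np.asarray(dphi_u)
    # Mean of unit phasors over interferograms, then its magnitude.
    return np.abs(np.exp(1j * residual).mean(axis=0))


# Synthetic example with 4 interferograms and 2 pixels.
# Pixel 0 has a constant residual; pixel 1 alternates between 0 and pi.
residuals = np.array([[0.3, 0.0],
                      [0.3, np.pi],
                      [0.3, 0.0],
                      [0.3, np.pi]])
zeros = np.zeros_like(residuals)
gamma = temporal_coherence(residuals, zeros, zeros)
```

Pixels whose residual phase is consistent across interferograms yield values near 1, whereas noisy pixels yield values near 0, which is the basis of the threshold analysis.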

#### *3.2. Time Series InSAR Processing of Muyubao Landslide*

Sentinel-1 is a two-satellite constellation whose prime objective is land and ocean monitoring. It is the first satellite mission developed by the European Commission and the European Space Agency (ESA) for the Copernicus Programme. The first satellite, Sentinel-1A, was launched in April 2014. After half a year of trial operation, its data came into gradual use after October 2014. One goal of the mission is to provide C-Band SAR data continuity following the retirement of ERS-2 and the end of the Envisat mission. In this study, 37 images acquired by Sentinel-1A in the period from 10 March 2016 to 13 September 2017 (Figure 5) were collected from the ESA [27]. We also collected the precise orbit parameters of Sentinel-1 from the ESA [28]. The angle of the orbit with the north is 12.16°, and the average incidence angles of the three sub-swaths are 34.01°, 39.39° and 40.02°, respectively.

**Figure 5.** The time distribution of Sentinel-1 images.

Interferograms were generated by applying the two-track differential method [29]. The SRTM (Shuttle Radar Topography Mission) DEM (digital elevation model) covering the study area was used to generate single-look differential interferograms. The temporal and perpendicular baseline thresholds were set to 90 days and 1000 m, respectively. Because these thresholds alone would generate many interferograms and result in a heavy computational burden, each image was generally combined with only its three subsequent images for interferometric processing. Data processing was carried out in COMET-LiCSAR [30]. As a result, 97 interferograms were produced as the input for the SBAS analysis (Figure 6). The amplitude difference dispersion was used to select coherent targets, with a threshold of 0.6. Subsequently, the time series displacement was obtained by applying the StaMPS SBAS method. Moreover, to reduce the effects of residual atmospheric artefacts, the time series displacement within the Muyubao landslide was referenced to a stable neighboring point (31°1′52.62″N, 110°29′57.48″E) [15,31].
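The pairing rule and the subsequent inversion can be sketched as follows. This is a simplified illustration under stated assumptions: the acquisition dates are hypothetical (a regular 12-day sequence), the perpendicular baseline threshold is omitted because the baselines are not listed here, and the least-squares inversion through a pseudo-inverse (computed by singular value decomposition) stands in for the full StaMPS SBAS processing.

```python
from datetime import date, timedelta

import numpy as np


def build_pairs(dates, max_days=90, max_links=3):
    """Pair each image with up to `max_links` later images within `max_days`."""
    pairs = []
    for i in range(len(dates)):
        for j in range(i + 1, min(i + 1 + max_links, len(dates))):
            if (dates[j] - dates[i]).days <= max_days:
                pairs.append((i, j))
    return pairs


def sbas_invert(pairs, n_dates, ifg_displacement):
    """Recover cumulative displacement per date from interferogram values.

    The first date is the zero reference, so the unknowns are the
    cumulative displacements at dates 1..n_dates-1.
    """
    design = np.zeros((len(pairs), n_dates - 1))
    for row, (i, j) in enumerate(pairs):
        design[row, j - 1] = 1.0
        if i > 0:
            design[row, i - 1] = -1.0
    # np.linalg.pinv computes the pseudo-inverse through an SVD.
    solution = np.linalg.pinv(design) @ np.asarray(ifg_displacement, dtype=float)
    return np.concatenate(([0.0], solution))


# Hypothetical acquisition dates and a synthetic displacement history (mm).
dates = [date(2016, 3, 10) + timedelta(days=12 * k) for k in range(6)]
pairs = build_pairs(dates)
truth = np.array([0.0, 2.0, 5.0, 5.0, 9.0, 14.0])
ifgs = [truth[j] - truth[i] for i, j in pairs]
recovered = sbas_invert(pairs, len(dates), ifgs)
```

Limiting each image to a few subsequent partners keeps the network connected while bounding the number of interferograms, which is the trade-off described above.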

**Figure 6.** Baseline network of the interferograms.

The displacement acquired by the SBAS method is measured along the line-of-sight (LOS) of the radar satellite. Before applying the measurements to the landslide analysis, the LOS displacements should be projected onto the real slope direction to obtain the actual displacement vectors. Hilley et al. [32,33] proposed a widely used projection method in which the LOS displacement is projected onto the direction of the steepest descent of the slope. In this case study, a field investigation showed that the slip direction was 16°. Hence, the LOS displacement was projected onto the downslope sliding direction.
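One common way to express such a projection is to divide the LOS displacement by the dot product between the LOS unit vector and the downslope unit vector. The sketch below is a minimal illustration, not the exact procedure used in this study. It assumes a right-looking sensor, a ground-to-satellite LOS sign convention, angles measured clockwise from north, and an ascending pass (heading of −12.16°) for the example call; the guard value `min_sensitivity` is a hypothetical parameter.

```python
import numpy as np


def los_unit_vector(heading_deg, incidence_deg):
    """Ground-to-satellite unit vector (east, north, up), right-looking SAR."""
    a = np.radians(heading_deg)
    t = np.radians(incidence_deg)
    return np.array([-np.sin(t) * np.cos(a), np.sin(t) * np.sin(a), np.cos(t)])


def downslope_unit_vector(azimuth_deg, slope_deg):
    """Unit vector (east, north, up) pointing down the slope."""
    az = np.radians(azimuth_deg)
    s = np.radians(slope_deg)
    return np.array([np.sin(az) * np.cos(s), np.cos(az) * np.cos(s), -np.sin(s)])


def project_los_to_slope(d_los, heading_deg, incidence_deg,
                         azimuth_deg, slope_deg, min_sensitivity=0.2):
    """Convert LOS displacement to displacement along the sliding direction."""
    sensitivity = (los_unit_vector(heading_deg, incidence_deg)
                   @ downslope_unit_vector(azimuth_deg, slope_deg))
    if abs(sensitivity) < min_sensitivity:
        # Near-perpendicular geometry would amplify noise excessively.
        raise ValueError("LOS is nearly perpendicular to the sliding direction")
    return d_los / sensitivity


# Example geometry: slip azimuth 16 deg and slope 20 deg from the text,
# incidence 39.39 deg; the ascending heading is an assumption.
sensitivity = (los_unit_vector(-12.16, 39.39)
               @ downslope_unit_vector(16.0, 20.0))
d_slope = project_los_to_slope(100.0 * sensitivity, -12.16, 39.39, 16.0, 20.0)
```

The projection assumes that all motion occurs along the chosen sliding direction, so it becomes unreliable where the LOS is close to perpendicular to that direction.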

#### **4. Results**

#### *4.1. InSAR Results of Muyubao Landslide*

As stated in Section 3.2, the displacement time series of the Muyubao landslide was extracted by applying the time series InSAR technique and projected onto the downslope sliding direction. Figure 7 shows the average deformation speed of the Muyubao landslide. The strongest deformation mainly occurred at the eastern upper portion of the landslide, with deformation rates up to 100 mm/year. In the direction perpendicular to the sliding direction, the deformation of the eastern side was slightly stronger than that of the western side of the Muyubao landslide.

**Figure 7.** The spatial distribution of the Interferometric Synthetic Aperture Radar (InSAR) monitoring points, with deformation speed and the spatial distribution of the feature points. Note: Global Positioning System is abbreviated to GPS, and Persistent Scatterer InSAR is abbreviated to PSI.

Some points in different parts of the Muyubao landslide were selected to further analyze the spatial deformation characteristics. The locations and cumulative displacement time series curves are shown in Figures 7 and 8, respectively. PSI-04 and PSI-05 lie in the most active part of the landslide (Figure 7). The cumulative displacement of these two points from March 2016 to September 2017 exceeded 250 mm (Figure 8d,e). The deformations of the monitoring points in the middle part were smaller, with a cumulative displacement of about 150 mm (Figure 8f,h). The deformation rates were the smallest (about 50 mm/year) at the western front part of the landslide (Figures 7 and 8i–k). Compared to the linear monitoring curves at the upper portion of the landslide, the monitoring curves near the lower portion showed slight step-like characteristics (PSI-01, PSI-02, PSI-06, PSI-09 and PSI-10), which were directly affected by periodic rainfall and reservoir scheduling [34,35]. In summary, the deformation of the Muyubao landslide shows obvious spatial differences. The deformation on the eastern upper portion of the landslide was the strongest, followed by the middle part, and the western part showed the smallest deformation.

**Figure 8.** Time series displacement of InSAR and triggering factors.

#### *4.2. The Deformation Analysis of Muyubao Landslide*

The water level of the Three Gorges Reservoir oscillates regularly between 145 m and 175 m in annual cycles (Figure 9). It can be seen from Figure 9 that, from January to February 2017, the reservoir remained at the normal water storage level of 175 m. Under the uplift pressure of the reservoir water, the Muyubao landslide was in a state of continuous deformation, and its southeast edge showed a larger displacement rate (Figures 10f and 11f). From March to April 2017, with the slow decline of the reservoir level, the landslide gradually stabilized as a whole, and the displacement rates of the monitoring points were less than 10 mm/month (Figures 10g and 11g). From May to June 2017, the TGRA was experiencing its rainy season, while the reservoir level began to decline rapidly. The displacement rates of all monitoring points of the landslide increased gradually, some of which reached 30 mm/month (Figure 11h). Due to the hysteresis characteristics of rainfall infiltration, the displacement rates of the upper portion were greater than those of the middle part and grew gradually with the increase of rainfall infiltration capacity and infiltration depth (Figure 10h). The landslide showed slow deformation when the reservoir water was kept at the level of 145 m from July to August 2017. The infiltration lag effect of the previous rainfall, combined with the large continuous rainfall in the reservoir area, triggered the strongest deformation period of 2017. Some of the monitored displacement rates reached 40 mm/month (Figure 11c,i).

The tensile cracks at the upper portion of the landslide provide a passage for the infiltration of continuous heavy rainfall. Therefore, the monitoring points of the eastern upper portion showed the largest cumulative displacements and displacement rates, which gradually decreased towards the front part (Figure 10i).

**Figure 9.** The operation curve of the Three Gorges Reservoir area (2010–2018).

**Figure 10.** Deformation velocity of the Muyubao landslide (time-yymmdd). Subfigures (**a**–**i**) refer to the different time ranges of InSAR monitoring.

**Figure 11.** Deformation velocity value distribution of the Muyubao landslide (time-yymmdd). Subfigures (**a**–**i**) refer to the different time ranges of InSAR monitoring.

#### **5. Discussion**

#### *5.1. The Reliability Analysis of InSAR Monitoring in the Muyubao Landslide*

Interferometry techniques have a huge potential to monitor landslides, as demonstrated by worldwide examples [12,15,32]. The monitoring accuracy of InSAR is affected by various factors, such as the atmosphere and terrain. Therefore, it is necessary to evaluate the reliability of the time series displacement before it is applied for landslide analysis. In the InSAR monitoring of the Muyubao landslide, the reliability analysis of the results can be carried out from two aspects.

(a) Comparison of InSAR monitoring results at different locations. The monitoring point PSI-12, located outside the Muyubao landslide, was selected for comparative analysis. According to the InSAR monitoring results (Figure 12), PSI-12 did not deform during the monitoring period, which is consistent with the results of the field investigation, whereas the monitoring points within the landslide (PSI-01~PSI-11) were deforming during the same period (Figure 8). This comparison of monitoring points within and outside the landslide demonstrates that the InSAR monitoring in this study can distinguish deforming from non-deforming zones.

**Figure 12.** The InSAR monitoring results of PSI-12.

(b) Comparison of monitoring results with different techniques. GPS is a reliable technique for monitoring ground deformation and has been widely used in various fields [36,37]. To verify the accuracy of the InSAR monitoring, the monitoring results of GPS and PSI points at similar locations of the Muyubao landslide were compared (the locations are shown in Figure 7). Because the automatic GPS monitoring equipment was installed in June 2016, the monitoring results from June 2016 to September 2017 were used for the comparison. The results for the pairs GPS-01/PSI-08 and GPS-02/PSI-07 are shown in Figures 13 and 14, respectively. The cumulative displacements of InSAR and GPS at similar locations are highly consistent in magnitude. This indicates that the InSAR monitoring of this study is reliable and can be utilized for landslide analysis.
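A comparison of this kind can be quantified with simple statistics once both series are expressed in the same direction and share a common starting epoch. The sketch below is only an illustration with synthetic numbers: it interpolates the GPS series onto the InSAR epochs within their overlap, re-zeroes both series at the first common epoch, and returns the mean difference and the root-mean-square error. The function name and the input layout (time in days, displacement in mm) are assumptions.

```python
import numpy as np


def compare_series(t_insar, d_insar, t_gps, d_gps):
    """Return (mean difference, RMSE) between InSAR and GPS displacements.

    Times are days since an arbitrary common origin; t_gps must be increasing.
    Only InSAR epochs inside the GPS observation window are used.
    """
    t_insar = np.asarray(t_insar, dtype=float)
    d_insar = np.asarray(d_insar, dtype=float)
    t_gps = np.asarray(t_gps, dtype=float)
    d_gps = np.asarray(d_gps, dtype=float)

    overlap = (t_insar >= t_gps.min()) & (t_insar <= t_gps.max())
    insar = d_insar[overlap]
    gps = np.interp(t_insar[overlap], t_gps, d_gps)

    # Re-zero both series at the first common epoch.
    insar = insar - insar[0]
    gps = gps - gps[0]

    diff = insar - gps
    return diff.mean(), np.sqrt(np.mean(diff ** 2))


# Synthetic series: both move at 1 mm/day, InSAR started earlier.
t_gps = [0, 10, 20, 30]
d_gps = [0.0, 10.0, 20.0, 30.0]
t_insar = [-12, 0, 12, 24, 36]
d_insar = [5.0, 17.0, 29.0, 41.0, 53.0]
bias, rmse = compare_series(t_insar, d_insar, t_gps, d_gps)
```

Restricting the comparison to the overlap mirrors the use of the June 2016 to September 2017 window described above.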

**Figure 13.** Results comparison of GPS-01 and PSI-08.

**Figure 14.** Results comparison of GPS-02 and PSI-07.

#### *5.2. The Formation Mechanism of the Muyubao Landslide*

Based on the analysis of geological conditions and monitoring data (Figure 8), the deformation of the Muyubao landslide showed a large spatial difference. Large deformation mainly occurred in the middle and upper portions, while the deformation of the uplift at the front part was smaller (Figures 7 and 10). The profile with the largest deformation is shown in Figure 15. The deformation of the entire section was relatively small in the period of July–September 2016 and became larger in the period of January–March 2017. At the front edge of the profile near the Yangtze River, the Muyubao landslide deformed only slightly in both periods. A field investigation was conducted for further analysis of the landslide deformation mechanism.

By analyzing the results of the field investigation, it can be seen that the deformation and failure mode is controlled by the lithology condition and the structure of the slope. Under the long-term gravitational process, the slope was creeping down along the weak mudstone layer. The front stress accumulated gradually, which resulted in the bending and uplifting in the middle and front stratum of the slope. Once the potential slip surface of the bending part connects, the accumulated energy may suddenly release, which may cause collapse and high-speed sliding. In conclusion, the evolution process can be divided into four stages: slipping stage, bending and uplifting stage, local shear stage and completely connecting stage of the shearing surface (Figure 16a) [19].

The bending and uplifting in the front stratum, which occurred when the slope was creeping down along the lines of weakness, limited the deformation of the front part. Due to the topographic and geomorphological conditions, the uplift in the front increases the sliding resistance of the landslide. When rainfall occurs, the gullies distributed on the surface of the landslide provide favorable channels for the rapid discharge of surface water, thus reducing the threat of water seepage to the landslide stability. Meanwhile, in the front part, the sliding surface has a gentle slope, and the rock mass is bent and dips inwards, which also provides sliding resistance for the landslide. In short, under natural conditions, the geological characteristics are beneficial to the overall stability of the landslide. However, due to the combined influence of factors in the TGRA, the Muyubao landslide has been undergoing continuous creep deformation.

**Figure 15.** Spatiotemporal characteristics of the deformation of the Muyubao landslide: (**a**) the velocity map of the Muyubao landslide, and (**b**) the velocity of geological cross-section I-I′.

**Figure 16.** (**a**) Conceptual model of the landslide evolution process (revised from Deng et al. [19]), and (**b**) a photo of the front part of the Muyubao landslide.


#### *5.3. The Relationship between Landslide and Influencing Factors*

For landslides in the TGRA, movement is mostly influenced by the fluctuation of the reservoir water level [38]. As the InSAR results of the Muyubao landslide show (Figure 17, March 2016–September 2017), the deformation mainly occurred during the fluctuation period and the high-water level period of the reservoir (>170 m). From March to June 2016, when the reservoir water level dropped from 167 m to 145 m, the landslide deformed by about 70 mm. The relationship between the stability of the Muyubao landslide and the reservoir water level during this monitoring period was similar to that of many other reservoir landslides [15,39,40]. For example, for the Lorenzo-1 and Rules Viaduct landslides of the Rules Reservoir in Southern Spain [15], the InSAR data show that the three acceleration periods of both landslides are related to drawdown periods of the water level. Many similar cases have also been studied in the TGRA [17,41]. When the reservoir level drops rapidly, the pore-water pressure within the landslide begins to dissipate, but the dissipation speed lags greatly behind the reservoir drawdown speed. This process induces a hydrodynamic pressure that is no longer balanced by the lateral confinement of the reservoir water, which increases the sliding force of the landslide and thus reduces its stability. From November 2016 to February 2017, when the reservoir water level was higher than 170 m, the Muyubao landslide deformed by 25 mm; it was therefore still deforming during the high-water level period. The high water level raised the groundwater level inside the reservoir landslide, which changed the saturation state of the soil and reduced its shear strength. At the same time, the uplift force exerted by the reservoir water on the submerged sliding mass reduced the sliding resistance of the landslide, thereby reducing its stability.

We also analyzed the relationship between rainfall and the acceleration of movement within the Muyubao landslide. The movement showed a deceleration related to the rainfall peaks. Because of the large thickness and scale of the Muyubao landslide, it is difficult for rainfall to seep down to the slip surface and promote landslide movement. A lack of correlation between reservoir landslide movement and rainfall has also been shown for the Lorenzo-1 and Rules Viaduct landslides and for other landslides in the TGRA [42]. Based on the long-term GPS monitoring data of the Muyubao landslide, Deng et al. [19] and Wan et al. [20] analyzed the relationship between the landslide deformation and its inducing factors (reservoir water level and rainfall). These papers showed how water level variation and rainfall influence landslide displacement patterns, which is consistent with the conclusion of this paper. Moreover, this agreement indicates that InSAR data are reliable for analyzing the deformation mechanism of landslides.

**Figure 17.** Displacement, rainfall and reservoir levels of the Muyubao landslide.


#### *5.4. The Future Development of InSAR in Landslide Application*

The InSAR technique is suitable for observing slow-moving landslides at a large scale. There are always spatial differences in the deformation of large-scale landslides [43,44]. The landslide deformation characteristics (direction, intensity, etc.) differ between positions and periods. The displacement obtained by the InSAR technique is one-dimensional, measured along the LOS direction (East-West). Nowadays, the LOS displacement is often projected onto the steepest slope direction to make it consistent with the main sliding direction, which is convenient for deformation analyses. However, this method cannot meet the demands of accurate landslide monitoring. Therefore, it is necessary to develop a method for extracting the three-dimensional deformation of landslides based on ascending and descending orbit images and the landslide deformation evolution.
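To indicate what combining the two orbit geometries could look like, the sketch below solves for two displacement components from one ascending and one descending LOS measurement. Two viewing geometries constrain only two unknowns, so the horizontal motion is assumed here to follow a known sliding azimuth; this constraint, the headings and incidence angles in the example, and all function names are hypothetical assumptions rather than a method proposed in this paper.

```python
import numpy as np


def los_unit_vector(heading_deg, incidence_deg):
    """Ground-to-satellite unit vector (east, north, up), right-looking SAR."""
    a = np.radians(heading_deg)
    t = np.radians(incidence_deg)
    return np.array([-np.sin(t) * np.cos(a), np.sin(t) * np.sin(a), np.cos(t)])


def decompose(d_asc, d_desc, geom_asc, geom_desc, slide_azimuth_deg):
    """Solve for (horizontal displacement along the azimuth, vertical displacement).

    geom_asc and geom_desc are (heading_deg, incidence_deg) tuples.
    """
    az = np.radians(slide_azimuth_deg)
    horizontal = np.array([np.sin(az), np.cos(az), 0.0])
    l_asc = los_unit_vector(*geom_asc)
    l_desc = los_unit_vector(*geom_desc)
    # Each row maps (horizontal, vertical) motion to one LOS measurement.
    G = np.array([[l_asc @ horizontal, l_asc[2]],
                  [l_desc @ horizontal, l_desc[2]]])
    h, u = np.linalg.solve(G, np.array([d_asc, d_desc], dtype=float))
    return h, u


# Synthetic check: 80 mm along azimuth 16 deg and 30 mm of subsidence.
geom_asc, geom_desc = (-12.0, 39.0), (192.0, 39.0)
true_vec = 80.0 * np.array([np.sin(np.radians(16.0)),
                            np.cos(np.radians(16.0)), 0.0]) + np.array([0.0, 0.0, -30.0])
d_asc = los_unit_vector(*geom_asc) @ true_vec
d_desc = los_unit_vector(*geom_desc) @ true_vec
h, u = decompose(d_asc, d_desc, geom_asc, geom_desc, 16.0)
```

Because near-polar orbits are only weakly sensitive to north-south motion, a landslide that moves mainly northwards still depends on such an external constraint, which is why knowledge of the deformation evolution matters.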

In current early warning systems, surface deformation is mostly monitored with global navigation satellite systems (GPS, Beidou, etc.). Remote-sensing techniques such as InSAR will be widely applied in landslide monitoring and early warning in the future. The monitoring results of these techniques are not point data but areal deformation metrics that can provide much more deformation information. Therefore, we need to develop data mining algorithms to extract more information from the massive monitoring data of these novel techniques, which will significantly improve the effectiveness of landslide early warning systems.

#### **6. Conclusions**

In this study, Sentinel-1 images and the StaMPS SBAS method were utilized to extract the time series displacement of the Muyubao landslide. During the monitoring period, changes in the reservoir water level were the main triggering factor of the Muyubao landslide, and its deformation mainly occurred during the fluctuation period and the high-water level period of the reservoir. The deformation also showed a significant spatial difference: it was strongest on the eastern upper portion of the landslide, followed by the middle part, while the western front portion showed the slightest deformation.

The Muyubao landslide experienced a translational mode during its evolution process. The natural geological characteristics (formation mechanism, slope structure, topography and geomorphology) are beneficial to the overall stability of the landslide under natural conditions. However, when the external environment changed, the combined effects of reservoir water and rainfall contributed to the deformation of the Muyubao landslide.

The monitoring and characterization of the Muyubao landslide was carried out by applying Sentinel-1 images and the time series InSAR technique. Through field investigation and verification with GPS, we found that this method can effectively monitor slow-moving landslides and presents the advantage of high-density coverage of monitoring points. The InSAR monitoring results can provide us more information for comprehensive understanding of the target landslides. In short, the application of Sentinel-1 images and InSAR techniques in carrying out landslide monitoring is an effective and economical method.

**Author Contributions:** C.Z. processed the dataset, performed the analysis and wrote the original draft of this paper. Y.C. contributed to processing the dataset and performed part of the analysis. K.Y. and Y.W. supervised the interpretation of the results from the geomorphological point of view. F.C. supervised the analysis. X.S., F.C. and B.A. edited the manuscript. All authors contributed to paper writing and revision. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China, grant numbers 41907253 and 41702330.

**Acknowledgments:** We are grateful for the assistance of the Research Center of Geohazard Monitoring and Warning in the Three Gorges Reservoir, China. We thank the reviewers for their suggestions that improved the quality of this paper. We thank Liang Xin for helping us in data collection.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

# **Flash Flood Susceptibility Modeling and Magnitude Index Using Machine Learning and Geohydrological Models: A Modified Hybrid Approach**

#### **Samy Elmahdy <sup>1,</sup>\*, Tarig Ali <sup>1</sup> and Mohamed Mohamed <sup>2,3</sup>**


Received: 8 July 2020; Accepted: 14 August 2020; Published: 20 August 2020

**Abstract:** In arid regions, flash floods (FF), as a response to climate change, are the most hazardous events, causing massive destruction and losses to farms, human lives and infrastructure. A first step towards securing lives and infrastructure is susceptibility mapping and the prediction of FF occurrence sites. Several studies have applied ensemble machine learning models (EMLM), but measuring FF magnitude using a hybrid approach that integrates machine learning (MCL) and geohydrological models has not been widely applied. This study aims to modify a hybrid approach by testing three machine learning models, namely boosted regression tree (BRT), classification and regression trees (CART), and naive Bayes tree (NBT), for FF susceptibility mapping in the northern part of the United Arab Emirates (NUAE). This is followed by applying a group of accuracy metrics (precision, recall and F1 score) and the receiver operating characteristic (ROC) curve. The results demonstrated that the BRT has the highest performance for FF susceptibility mapping, followed by the CART and NBT. The FF map produced using the BRT was then modified by dividing it into seven basins, and a set of new FF conditioning parameters, namely alluvial plain width, basin gradient and mean slope, was calculated for each basin to measure FF magnitude. The results showed that the mountainous and narrower basins (e.g., RAK, Masafi, Fujairah, and Rol Dadnah) have the highest probability of FF occurrence and the highest FF magnitude, while the wider alluvial plains (e.g., Al Dhaid) have the lowest probability of FF occurrence and the lowest FF magnitude. The proposed approach is an effective way to improve the susceptibility mapping of FF, landslides, land subsidence, and groundwater potentiality obtained using ensemble machine learning, which is used widely in the literature.

**Keywords:** NUAE; flash flood; BRT; CART; naive Bayes tree; geohydrological model

#### **1. Introduction**

Flash floods are a temporary overflow of rivers or valley plains as a natural response to unusually heavy rains. They can cause damage to infrastructure and loss of human life [1,2]. FF occur frequently in narrow mountainous valleys (wadis), on alluvial fans at the foot of mountains and in narrow coastal areas, as a response to climate change and intensive rainfall over impermeable and impervious surfaces [3,4]. Globally, about one-third of the Earth's surface (where more than 70% of the world population resides) frequently experiences flash flooding [5].

The UAE, including the study area, has not escaped this natural hazard, since it experiences several flash floods on a regional scale. The northern part of the UAE recorded huge amounts of rain between 9 January and 12 January 2020. The heaviest rainfall was 24 years ago in Khor Fakkan with 144 mm (5.66 inches) of accumulated rainfall (https://www.ncm.ae). In Ras Al Khaimah (RAK), one woman was crushed to death after a wall collapsed during a violent storm.

In the Ghalilah and Al Fahlain villages of the RAK, flash floods destroyed roads and farms and flooded the village graveyard (Figure 1). Away from the mountainous areas, the cities of Sharjah and Dubai have experienced monstrous floods that inundated roads and vital areas such as Terminal 1 of Dubai International Airport, shopping malls and Jabal Ali (https://www.ncm.ae). Flash flooding events depend on several terrain and geohydrological parameters such as alluvial plain width, mountainous valley width, altitude, topographic slope, topographic curvature, stream density, topographic relief, the angle of repose and, of course, the intensity of rainfall. The angle of repose or talus slope ranges between 25° and 40°, depends upon the nature and type of the rocks, and is directly proportional to the flash flood magnitude [6].

**Figure 1.** Raster map of precipitation showing the heaviest rainfall was 24 years ago in Khor Fakkan with 144 mm (5.66 inches) of accumulated rainfall (https://www.ncm.ae/). Photographs of flash flood damage during January 2020 in the RAK area, NUAE. Yellow points highlight flash flood locations across the study area.

These consequences can be controlled or, at least, reduced by constructing regional and precise susceptibility maps and analyses [7] and by calculating the angle of repose or talus for each hydrological basin. Thus, building an accurate geohazard model and measuring flash flood magnitude over a regional scale is one of the most important tasks for researchers and decision-makers [8]. Susceptibility can be defined as a prediction of where a future hazardous event is likely to occur [9,10]. The wide availability of free-of-charge remote sensing data and machine learning algorithms has allowed researchers to map susceptibility and predict flash floods over a regional scale efficiently and economically [11–14].

Several hydrological models have been developed using hydrological parameters such as rainfall and runoff [15–19]. However, these techniques have been built on a single dimension and on parameters that change with climate change and soil erosion. Additionally, these models lack sensitivity analysis and field observation. Other studies have addressed FF susceptibility mapping using data-driven methods and K-nearest neighbors (K-NN) [20–23], the analytic hierarchy process (AHP) [24], frequency ratio (FR) [25], the firefly algorithm (FA) [26,27], the feature selection method (FSM) [26], support vector machine (SVM) [27], artificial neural network (ANN) [28], weight of evidence (WoE) [29], and decision tree (DT) [30,31].

A novel approach has been employed for flood susceptibility mapping [29–33]. Recently, a comparative assessment of decision tree algorithms for susceptibility modeling has been performed [34–36]. Most of these studies have focused on susceptibility mapping of FF using ensemble machine learning or a comparative assessment of machine learning algorithms. However, these studies have not focused on FF conditioning parameters such as alluvial plain width, valley width and basin slope. Additionally, the magnitude of FF has not been taken into consideration. This study aims to modify a hybrid integration approach for flash flood susceptibility mapping in an arid region. Here, for the first time, we performed a comparison between the BRT, CART, and NBT models for FF susceptibility mapping. The best FF susceptibility map was chosen and then modified by dividing it into seven basins. Each basin has its own FF magnitude. The FF magnitude was calculated using four new FFCPs, namely alluvial plain width, valley width, basin gradient and mean slope. The proposed approach represents a step forward in modifying predicted maps of FF, landslides, land subsidence and groundwater potential produced using machine learning models. The modified approach can be of great help to risk management specialists and geohazard prevention scientists.

#### **2. Study Area**

The study area stretches from longitude 54°58′21″E to 56°29′42″E and latitude 24°33′45″N to 26°5′24″N and has an area of about 11,871 km<sup>2</sup>. It includes the Emirates of Dubai, Sharjah, Ajman, Umm Al Quwain, Ras al Khaimah and Fujairah (Figure 2). Most of the built-up area is concentrated on coastal strips and waterfronts such as creeks and artificial lakes, while the agricultural area is limited to the alluvial plains, wherever rainfall and paleochannels (wadis) are found.

The area is characterized by narrow alluvial coastal plains in the north-western and eastern parts of the study area, with a width ranging from 2 to 5 km and reaching its maximum at Falahyeen and Al Dhaid villages (No. 9 and 19 in Figure 2). Lithologically, the upper streams (mountainous areas) are dominated by igneous and metamorphic rocks in the east and carbonate rocks in the north, with alluvial deposits at the foot of the mountainous areas [13]. The weather varies from hot and humid during the summer to warm during the winter (Figure 3a). The annual rainfall varies from 30 mm in the south-eastern desert near the city of Dubai to 180 mm in the mountainous areas in the north and east [37,38]. The maximum number of rainfall days over the study area is four to six days per month during the period from December to March (Figure 3b). The maximum daily precipitation value is 1.2 mm during March (Figure 3c) (Giri and Singh 2015). The estimated annual rainfall over the mountainous and coastal areas was about 97% of total rainfall over the NUAE [38].

**Figure 2.** Elevation map generated from a DEM showing the location of the study area (white polygon), and main cities and towns of the study area (green stars).

Hydrologically, the area comprises carbonate, ophiolite, coastal, and alluvial aquifers. The aquifers are drained by several surface wadi courses, which commonly trend in the NW-SE, NNW-SSE, NE-SW and NNE-SSW directions [39,40]. These features play an important role in flash floods by accumulating rainwater from upstream, which then damages houses and farms downstream [39].

**Figure 3.** Monthly temperature and precipitation (**a**), number of days of rainfall (**b**), and daily precipitation (**c**) over the NUAE including the study area.

#### **3. Datasets and Methodology**

The proposed approach can briefly be described by the following steps: (i) constructing a flash flood inventory map (dependent variable), (ii) constructing flash flood conditioning parameters (independent variables), (iii) spatially analyzing the relationship between each conditioning parameter and flash flood events, (iv) optimal parameterization and flash flood susceptibility mapping, (v) evaluating the performance and assessing the accuracy of the machine learning models, and (vi) dividing the area into seven basins and calculating the flash flood magnitude for each basin. A flowchart of the methodology adopted in the current study is shown in Figure 4.

*Remote Sens.* **2020**, *12*, 2695

**Figure 4.** Flowchart of the methodology applied to the study area.

#### *3.1. Construction of Flash Floods Inventory Map (FFIM)*

FFIM is an excellent indicator for FF susceptibility mapping. Here we used several sources, including Google searches, the Google Earth application, and local newspaper and weather reports. These reports were collected and downloaded from the webpage of the National Centre of Meteorology (https://www.ncm.ae/Radar\_UAE\_Merge). Since 1990, 61 flash flood events have been reported across the study area, and the most severe event happened between 9 and 12 January 2020 with 144 mm (5.66 inches) of accumulated rainfall.

Most of the FF locations were reported in the mountainous valleys, narrow alluvial coastal plains and alluvial fans at the foot of the mountainous areas (Figure 2). These FF locations were used as datasets to investigate the spatial relationship between flash flood conditioning parameters and flash flood occurrence, to train the machine learning models, and to evaluate the performance and assess the accuracy of the three machine learning models.

#### *3.2. Spatial Analysis and Construction of Flash Flood Conditioning Parameters*

#### 3.2.1. Construction of FFCPs

This study aims to map the susceptibility of flash floods and measure their magnitudes in an arid mountainous region with a minimum number of essential FFCPs, in order to reduce errors and computational time and enhance the performance of the BRT, CART and NBT models [41,42]. Two types of FFCPs, namely terrain and geohydrology, were chosen based on their degree of influence on FF occurrence. The terrain parameters include altitude, topographic slope, relief and topographic minimum curvature, while the geohydrology parameters include lithology, stream network (wadi courses), stream density, and distance from stream courses (Figures 5 and 6). Thematic maps of FFCPs such as altitude, topographic slope, topographic relief, topographic curvature, and stream networks (wadi courses) were generated from the ALOS DEM with a spatial resolution of 30 m using the raster surface tools of 3D Analyst and the hydrology tools of Spatial Analyst implemented in the ArcGIS v.10.2 software. First, maps of altitude, slope, relief and topographic curvature were calculated by importing the 30 m DEM, converting it into a raster grid and applying the raster surface tools to the grid. Altitude and relief range from 100 m to 1800 m (m.s.l.), the slope map was classified into five classes, (i) 0°–5°, (ii) 5°–15°, (iii) 15°–30°, (iv) 30°–60°, and (v) >60°, and curvature ranges from −200 to 50. Second, the stream network was derived from the DEM using the D8 algorithm implemented in the hydrology tools. The algorithm starts by filling gaps (central pixels with no data) and determines into which neighboring pixel any water in a central pixel will flow. After that, the flow direction and downhill slope from a central pixel to one of its eight neighbors are calculated. Then, flow accumulation was calculated, followed by deriving the major stream networks using a threshold value of 45 [14]. This value was optimal for revealing the major stream networks in the study area. After that, drainage basins were delineated using the calculated flow direction theme.
Third, distance from stream networks and the density of the stream network were constructed using the distance and density tools of Spatial Analyst implemented in the ArcGIS v. 10.2 software. Fourth, the lithological map was constructed from an Operational Land Imager (OLI) Landsat 8 image acquired on 9 December 2019 (Path 160, rows 42 and 43) using the maximum likelihood classifier (MLC) implemented in the ENVI v. 4.5 software. The MLC was trained using 200 training datasets collected from scanned geological maps at a scale of 1:50,000 from the UAE ATLAS. The ALOS DEM and Landsat 8 images were downloaded from the USGS Global Visualization Viewer (GloVis) (www.glovis.usgs.gov) portal.
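The D8 steps described above (steepest downhill neighbor, flow accumulation, stream extraction by threshold) can be sketched in plain Python. This is a simplified illustration, not the ArcGIS implementation: it omits gap filling, and the toy DEM, the `cell_size` default and the threshold used in the example are hypothetical values chosen for readability (the study used a threshold of 45 on a 30 m DEM).

```python
import math

# Row and column offsets to the eight neighbouring cells
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow_direction(dem, cell_size=30.0):
    """For each cell, return the (row, col) of its steepest downhill neighbour, or None."""
    rows, cols = len(dem), len(dem[0])
    target = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbours are farther away by a factor of sqrt(2)
                    dist = cell_size * math.hypot(dr, dc)
                    slope = (dem[r][c] - dem[nr][nc]) / dist
                    if slope > best_slope:
                        best_slope = slope
                        target[r][c] = (nr, nc)
    return target


def flow_accumulation(dem, target):
    """Count the upstream cells draining through each cell (the cell itself excluded)."""
    rows, cols = len(dem), len(dem[0])
    acc = [[0] * cols for _ in range(rows)]
    # Visit cells from highest to lowest so upstream counts are complete when passed on
    cells = sorted(((dem[r][c], r, c) for r in range(rows) for c in range(cols)), reverse=True)
    for _, r, c in cells:
        if target[r][c] is not None:
            nr, nc = target[r][c]
            acc[nr][nc] += acc[r][c] + 1
    return acc


def extract_streams(acc, threshold):
    """Return the set of cells whose accumulation reaches the threshold."""
    return {(r, c) for r, row in enumerate(acc) for c, a in enumerate(row) if a >= threshold}


# Hypothetical 3 x 3 DEM (elevations in metres) draining towards the bottom centre
dem = [[9, 8, 9],
       [8, 5, 8],
       [7, 3, 7]]
target = d8_flow_direction(dem)
acc = flow_accumulation(dem, target)
print(acc, extract_streams(acc, threshold=3))
```

Raising the threshold keeps only the cells with the largest contributing area, which is how a single value such as 45 isolates the major wadi courses.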

**Figure 5.** Maps of flash flood conditioning parameters used in flash flood susceptibility mapping: (**a**) altitude, (**b**) slope, (**c**) topographic minimum curvature, and (**d**) topographic relief.

**Figure 6.** Maps of flash flood conditioning parameters used in flash flood susceptibility mapping: (**a**) lithology, (**b**) distance from streams, and (**c**) stream density.

#### 3.2.2. Spatial Analysis

Altitude and topographic slope are the most important conditioning parameters for FF occurrence, as they control water flow, flow direction, surface runoff and infiltration rate [25,42]. Sites at a lower altitude have a higher probability of FF because water flows down from the upper streams [43]. The topographic slope has a crucial influence on surface water flow, flow direction, runoff, infiltration rate and FF occurrence. As topographic slope increases, runoff potential increases, resulting in FF [44]. Topographic curvature has a similar influence on FF occurrence. Sites with negative curvature values are zones of water accumulation and thus have a higher probability of FF occurrence, while sites with positive curvature values are zones of water dispersion and thus have a lower probability of FF occurrence [25]. Lithology and its physical characteristics (e.g., porosity and permeability) strongly influence infiltration rate, runoff potential, stream network distribution, and thus FF occurrence [29]. Other FF conditioning parameters such as stream density and distance from streams also play a significant role in FF occurrence. As the distance from streams decreases, the probability of FF occurrence increases [45]. Factors such as aspect, land use/land cover (LULC), NDVI, topographic wetness index and the erosion power index are secondary parameters that introduce bias and error during the modeling process and can be ignored [12,46,47]. The FFCPs used here were chosen based on the geoenvironmental characteristics of the study area and are used widely in the literature. These parameters can help distinguish flash flood-affected areas from the surrounding areas, since flash flood occurrence varies greatly with the intensity of rainfall, altitude, slope and stream network [48,49].

#### *3.3. Background and Theories of Models*

#### 3.3.1. Boosted Regression Tree (BRT)

The BRT is an ensemble technique and differs statistically from traditional methods. It combines machine learning and statistical techniques designed to improve the accuracy and performance of a single model by fitting a group of models and then combining them for classification and prediction [50]. The BRT model merges regression trees, from classification and regression trees (CART), with boosting techniques to produce a combined model. Boosting is a technique, similar to model averaging, designed to enhance the performance of regression trees [51]. However, the BRT implements a stepwise process, where the models are fitted to a subset of the training dataset. The subset used at every iteration of the model fit is chosen stochastically without replacement.

The shrinkage parameter or learning rate determines the level of contribution for each tree to the growing model, while the number of nodes in a tree (tree complexity) decides whether interactions are fitted [52]. Then, these parameters determine the total number of trees required for prediction [53].

Elith et al. (2008) [53] described the model as the following steps:

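1. Initialize the weights to be equal: *w<sub>i</sub>* = 1/*n*.
2. For *m* = 1 to iter, fit the classifier *C<sub>m</sub>* to the weighted data.
3. Compute the weighted misclassification rate *r<sub>m</sub>*.
4. Calculate the classifier weight:

$$
\alpha_m = \log((1 - r_m)/r_m) \tag{1}
$$

5. Recalculate the weights:

$$w_i = w_i \exp(\alpha_m I(y_i \neq C_m)) \tag{2}$$

6. Majority vote classification: sign[Σ<sub>*m*=1</sub><sup>*M*</sup> α<sub>*m*</sub>*C<sub>m</sub>*(*x*)]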
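The boosting steps above can be illustrated with a short Python sketch. It is a toy version under stated assumptions, not the BRT software used in this study: the base learner is a one-dimensional threshold "stump" rather than a regression tree, labels are coded as +1/−1, and the helper names (`fit_stump`, `adaboost`, `predict`) and the example data are hypothetical.

```python
import math


def fit_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best 1-D threshold stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x <= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, n_iter=3):
    n = len(xs)
    weights = [1.0 / n] * n                      # step 1: equal weights
    ensemble = []
    for _ in range(n_iter):                      # step 2: fit classifier C_m
        err, threshold, polarity = fit_stump(xs, ys, weights)
        r_m = err / sum(weights)                 # step 3: weighted misclassification rate
        r_m = min(max(r_m, 1e-10), 1 - 1e-10)    # guard against log(0) (added assumption)
        alpha = math.log((1 - r_m) / r_m)        # step 4: Equation (1)
        ensemble.append((alpha, threshold, polarity))
        for i, x in enumerate(xs):               # step 5: Equation (2)
            pred = polarity if x <= threshold else -polarity
            if pred != ys[i]:
                weights[i] *= math.exp(alpha)
    return ensemble


def predict(ensemble, x):
    """Step 6: sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if x <= thr else -pol) for a, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical data that no single stump can separate
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, n_iter=3)
print([predict(model, x) for x in xs])
```

Each round up-weights the samples the previous stump misclassified, so later stumps concentrate on the hard cases; the weighted vote then combines weak rules into a stronger classifier.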

#### 3.3.2. Classification and Regression Trees (CART)

The CART is one of the most common algorithms for the classification of data. It is resistant to missing data, and its variables do not need to have a normal distribution [51,54]. It is a binary recursive partitioning procedure capable of processing continuous and nominal attributes as targets and predictors and was developed by Friedman (1975) [55], Breiman (1984) [56], and Breiman and Stone (1978) [57].

The algorithm has been successfully applied in medical applications to predict the value of a dependent variable based on the different values of independent variables [58], in economics [59], photogrammetry [60], environmental protection [61], food science and chemistry [62,63], landslide susceptibility mapping [64], and groundwater potential mapping [65]. Classification trees are used when the dependent (target) variable is categorical, while regression trees are used when it is continuous and its value is to be predicted (Figures 5 and 6). The CART algorithm is designed as a sequence of trees whose ends are terminal nodes. It consists of three elements: (i) rules for splitting data at a node based on the value of one variable, (ii) stopping rules for deciding when a branch is terminal and can be split no more, and (iii) a prediction for the target variable in each terminal node (Figure 7). The major problem in building a valuable tree is finding the proper guidelines to prune the tree.

**Figure 7.** Diagrams represent Classification and Regression Trees (CART).

In the first stage, the classification is created, producing a tree with several branches. The number of branches of any tree depends on the degree of dispersion of the data. The size of the tree depends on specific parameters such as the minimum population in the successive nodes, the minimum population of children, the maximum number of levels and the maximum number of nodes [51]. It is worth noting that there is no relationship between the size of the tree and the accuracy of classification. A correct classification can be obtained by decreasing the overfitting of the training set.

The pruning phase starts from the largest possible tree and consists of reducing the total number of leaves, which tends to increase the accuracy of classification. The final phase is the selection of the tree with the lowest number of misclassifications and the highest accuracy. This higher accuracy can be achieved by applying cross-validation using Equation (3):

$$\text{RE}(d) = \frac{1}{N} \sum_{i=1}^{N} (y_i - d(x_i))^2 \tag{3}$$

where *y<sub>i</sub>* is the observed value of the *i*-th point in the testing set (real variable), *d*(*x<sub>i</sub>*) is the value predicted for that point by the model *d*, and *N* is the number of cases in the testing set. The results of the predicted model were evaluated using a set of testing samples. The cross-validation measure *R*<sub>α</sub>(*T*) depends linearly on the complexity of the tree and the cost of misclassifications (Equation (4)) [51].

$$R_{\alpha}(T) = R(T) + \alpha |T| \Longleftrightarrow \alpha = \frac{R_{\alpha}(T) - R(T)}{|T|} \tag{4}$$

where *R*<sub>α</sub>(*T*) is the cost-complexity measure, *R*(*T*) is the cost of misclassifications, |*T*| is the complexity of the tree, measured as the number of terminal nodes in the tree, and α is a parameter of tree complexity (it assumes values from 0 for a maximal tree to 1 for a minimal tree).
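To make the trade-off in Equation (4) concrete, the sketch below evaluates the cost-complexity measure for a few candidate subtrees and selects the one with the smallest value. The candidate names, misclassification costs and node counts are hypothetical numbers chosen for illustration, not results from this study.

```python
def cost_complexity(misclassification_cost, n_terminal_nodes, alpha):
    """Equation (4): R_alpha(T) = R(T) + alpha * |T|."""
    return misclassification_cost + alpha * n_terminal_nodes


def select_subtree(candidates, alpha):
    """Return the (name, R(T), |T|) candidate with the smallest R_alpha(T)."""
    return min(candidates, key=lambda t: cost_complexity(t[1], t[2], alpha))


# Hypothetical subtrees: (name, cost of misclassifications, number of terminal nodes)
candidates = [("full", 0.10, 12), ("pruned", 0.14, 5), ("stump", 0.30, 2)]

for alpha in (0.0, 0.01, 0.1):
    name, cost, size = select_subtree(candidates, alpha)
    print(alpha, name, round(cost_complexity(cost, size, alpha), 3))
```

With α = 0 the largest tree wins because only the misclassification cost counts; as α grows, each extra terminal node is penalized more heavily and progressively smaller trees are preferred.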

The produced regression rule set was then applied to all FFCPs to map flash flood susceptibility. It is worth noting that the relationship between the complexity of the tree and the accuracy of classification should be taken into consideration. Low tree complexity usually leads to low classification accuracy.

The output of CART is a hierarchical binary tree which subdivides the prediction space into several regions (*R<sub>m</sub>*) where the response factors have similar values (≅ *a<sub>m</sub>*) based on Equation (5):

$$f(x) \cong a_m, \quad \forall x \in R_m \tag{5}$$

#### 3.3.3. Naive Bayes Tree (NBT)

Naive Bayes (NB) is a machine learning classifier that creates a probability-based model based on Bayes' theorem. The NBT uses a decision tree (DT) for its structure and fits an NB model on every leaf node of the constructed DT [66]. The NBT exhibits significant classification performance and accuracy [67,68].

During the NB process, the impact of an attribute value on a specific class is assumed to be independent of the values of the other attributes; this is known as class conditional independence. This assumption makes training quicker: the NB considers all the vectors as independent and applies Bayes' rule [69], which can be written as follows (Equation (6)):

$$\mathbf{P(A|B)} = \mathbf{P(B|A)} \mathbf{P(A)} / \mathbf{P(B)}\tag{6}$$

where:

P(A|B) = conditional probability of A given B

P(B|A) = conditional probability of B given A

P(A) = probability of event A

P(B) = probability of event B

The model starts by estimating the probability of each class in the model, calculating the covariance and variance matrix, and building the discriminant function for each class [70–72].
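As a small worked illustration of Equation (6) and of class conditional independence, the Python sketch below computes a posterior probability first for one attribute and then for several attributes whose likelihoods are multiplied. All probabilities are hypothetical numbers, and the function names are illustrative; this is not the NBT implementation used in this study.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Bayes' rule, Equation (6): P(A|B) = P(B|A) P(A) / P(B)."""
    # P(B) expanded with the law of total probability
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    return p_b_given_a * prior_a / p_b


def naive_bayes_posterior(prior_a, likelihoods_a, likelihoods_not_a):
    """Posterior of class A when attribute likelihoods are treated as independent."""
    score_a = prior_a
    for p in likelihoods_a:
        score_a *= p
    score_not_a = 1 - prior_a
    for p in likelihoods_not_a:
        score_not_a *= p
    return score_a / (score_a + score_not_a)


# Hypothetical example: A = "flash flood site", B = "site lies close to a stream"
print(round(posterior(0.2, 0.9, 0.3), 4))
# Two attributes (e.g., close to a stream, low altitude) combined under independence
print(round(naive_bayes_posterior(0.2, [0.9, 0.8], [0.3, 0.4]), 4))
```

Because the attribute likelihoods are simply multiplied, adding further conditioning parameters only extends the two lists, which is what keeps NB training fast.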

#### *3.4. Optimal Model Parameterisation and Flash Flood Susceptibility Mapping*

As a first step, the CART, BRT and NB models were fitted in STATISTICA v. 7 [73], Salford Systems [74,75], and R v.3.0.2 (R Development Core Team 2006) [76], using the gbm, dismo, rpart, and randomForest packages [77]. These tools provide stochastic gradient boosting trees, which are widely used for regression problems related to predicting and mapping continuous dependent variables [73]. After that, all parameters were set and optimized. These parameters were the learning rate, the number of additive trees, the proportion of sub-sampling, and so forth.

Here, the optimal value for the learning rate was set to 0.1, the number of additive trees to 185, and the maximum tree size to five. These values may lead to accurate results [74]. In this study, values at random points were extracted from each FFCP for the presence and absence conditions of FF. After that, all three machine learning models were run using the open-source tools. Using these tools, the FFSM was calculated for each pixel in the thematic maps of FFCPs and then converted into text files. Finally, these text and dbase files were imported into SPSS v.25 to evaluate the models' performance, and the FFSM was generated in the GIS environment of the ArcGIS v.10.2 software.

During prediction, the models used the FFCPs, and the regression tree separated the FFCPs into two groups [78,79]. One group, such as distance from streams, altitude, and slope in the upper part of the regression tree, indicates an approximate area with a higher probability of FF occurrence. Another group, such as altitude, slope, and topographic curvature in the lower part of the regression tree, allowed the recognition of areas with a higher probability of FF occurrence. Among several interval methods, the quantile method, which is used widely in the literature, was chosen to classify the FFSM [12,14,36]. The produced FFSM was then classified into four classes, namely low, moderate, high, and very high.
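The quantile classification into four classes can be sketched as follows. This is a minimal illustration assuming a simple rank-based quantile convention (GIS packages may place class breaks slightly differently), and the susceptibility values are hypothetical.

```python
def quantile_breaks(values, n_classes=4):
    """Return the n_classes - 1 break values that split the sorted data into equal-count bins."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_classes] for k in range(1, n_classes)]


def classify(value, breaks, labels=("low", "moderate", "high", "very high")):
    """Assign the label of the first class whose upper break exceeds the value."""
    for upper, label in zip(breaks, labels):
        if value < upper:
            return label
    return labels[-1]


# Hypothetical susceptibility values for eight pixels
susceptibility = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.80, 0.95]
breaks = quantile_breaks(susceptibility)
print(breaks)
print([classify(v, breaks) for v in susceptibility])
```

By construction each class receives roughly the same number of pixels, so the class limits follow the data distribution rather than fixed probability values.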

#### *3.5. Evaluation of the Models Performance*

To evaluate the models' performance, we used 61 FF locations. The dataset was divided into 43 locations (70%) for model training and 18 (30%) for model validation. These subsets were selected randomly using the Hawth's Tool implemented in the ArcGIS v. 10.2 software. We calculated accuracy metrics for each model, namely accuracy, kappa, precision, recall and F1 score. The F1 score was found to be the best technique and is used widely in the literature [13,14,80]. The metrics were calculated from four quantities, namely true positives (*TP*), true negatives (*TN*), false positives (*FP*), and false negatives (*FN*), using Equations (7)–(11):

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{7}$$

$$\text{Kappa} = \frac{p_o - p_e}{1 - p_e} \tag{8}$$

where *p<sub>o</sub>* is the observed agreement ratio, and *p<sub>e</sub>* is the expected agreement

$$\text{Precision} = \frac{TP}{TP + FP} \tag{9}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{10}$$

$$\text{F1} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{11}$$

where *TP* is the true-positive; *FP* is the false-positive and *FN* is the false-negative.
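The metrics in Equations (7)–(11) can be computed directly from the four confusion-matrix counts, as in the sketch below. The counts in the example are hypothetical, the expected agreement for kappa is computed from the row and column totals (the usual Cohen's kappa convention, assumed here), and no guard against zero denominators is included.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, kappa, precision, recall and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total                          # Equation (7)
    # Expected chance agreement from the marginal totals (Cohen's kappa)
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - p_e) / (1 - p_e)                  # Equation (8), with p_o = accuracy
    precision = tp / (tp + fp)                            # Equation (9)
    recall = tp / (tp + fn)                               # Equation (10)
    f1 = 2 * precision * recall / (precision + recall)    # Equation (11)
    return {"accuracy": accuracy, "kappa": kappa,
            "precision": precision, "recall": recall, "f1": f1}


# Hypothetical validation counts
metrics = classification_metrics(tp=8, tn=7, fp=2, fn=1)
for name, value in metrics.items():
    print(name, round(value, 3))
```

Because F1 is the harmonic mean of precision and recall, it stays low unless both are high, which makes it less sensitive than accuracy to an imbalance between flood and non-flood samples.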

The performance of the models was evaluated using the open-source R 4.0.0 software. Further validation was performed using the receiver operating characteristics (ROC) curve, which is used widely in the literature due to its simplicity, ease of use and high accuracy [81]. The curve has been successfully used by several researchers in applications such as groundwater potential mapping [82] and land subsidence susceptibility mapping [12]. The predicted FF maps sometimes contain errors, which can stem from deficiencies in the quality of the FFCPs and from the structure of the models [46,83].

The accuracy of the produced prediction maps was measured using the area under the curve (AUC) [84]. The AUC ranges from 0 to 1: a value of 1 indicates a good prediction, and a value of 0 indicates that the model is not efficient and cannot predict FF occurrence. Both the success and prediction rates were computed to assess the accuracy of the FFSM [85]. The value of AUC can be estimated via the following equation [86]:

$$\text{AUC} = \frac{\sum TP + \sum TN}{P + N} \tag{12}$$

where TP (true positive) and TN (true negative) are the numbers of pixels that are correctly classified. P is the total number of pixels with torrential phenomena, and N is the total number of pixels of no flash floods.
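For comparison, the sketch below computes the area under the ROC curve using its pairwise-ranking interpretation: the fraction of (flood, non-flood) pairs in which the flood pixel receives the higher predicted score. This is the standard ROC AUC definition and is swapped in here as an assumption; it is not a transcription of Equation (12), and the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted susceptibilities and observed flood (1) / non-flood (0) labels
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
print(round(roc_auc(scores, labels), 4))
```

Under this definition a value of 0.5 corresponds to random ranking, which is a useful baseline when reading reported AUC values.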

#### *3.6. Geohydrological Model for FFMI and Filling the Gaps in MLC Maps*

Although ensemble-based machine learning models have been used widely in FFS mapping due to their greater accuracy, these models still have some limitations regarding FFCPs. They do not account for parameters such as the length of the basin, basin area, the gradient of each basin, alluvial plain width, and mean slope. These new parameters are very important in measuring the FF magnitude. Here, we first delineated drainage basins from a DEM using a hydrological tool implemented in the ArcGIS v. 10.2 software. After that, each basin was treated as a separate FF zone and its magnitude was measured by calculating the following parameters (Figure 8 and Table 1):

**Figure 8.** 3D perspective view from Google Earth illustrating the geometry and new parameters used for estimating flash flood magnitudes (**a**), and the influence of repose angle, alluvial plain width, gradient and relief on flash flood occurrence (**b**).


**Table 1.** Flash flood index parameters used for calculating flash flood magnitude for each zone (basin).

(i) Calculating the length of each basin (Lb)

(ii) Calculating the relief of each basin (Bh)

relief = Bh = hmax − hmin (the difference between the maximum and minimum heights)

(iii) Calculating the gradient of each basin (G°) using the following equation

$$\text{Gradient} = (\text{Bh/Lb}) \times 60\tag{13}$$

(iv) Calculating the area of each basin (A)

$$\text{A} = \text{basin area (km}^2\text{)}\tag{14}$$

(v) Calculating the alluvial plain width (Aw) for each basin manually in a GIS

(vi) Calculating the mean slope (Ms) for each basin using a moment statistic

(vii) Calculating the FF magnitude for each basin with the following equation:

$$\text{Flash Flood magnitude} = \text{Ms/ln (A/G}^{\circ}) \tag{15}$$
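The chain from Equation (13) to Equation (15) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the input values are invented rather than taken from Table 1, and the units of Bh and Lb (assumed here to be identical) are not specified in the text.

```python
import math


def basin_gradient(relief, length):
    """Equation (13): gradient = (Bh / Lb) * 60."""
    if length <= 0:
        raise ValueError("basin length must be positive")
    return (relief / length) * 60.0


def flash_flood_magnitude(mean_slope, area_km2, gradient):
    """Equation (15): magnitude = Ms / ln(A / G)."""
    if gradient <= 0:
        raise ValueError("gradient must be positive")
    ratio = area_km2 / gradient
    if ratio <= 1.0:
        # ln(ratio) would be zero or negative, so the index is not usable.
        raise ValueError("A / G must exceed 1 for a positive, finite index")
    return mean_slope / math.log(ratio)


# Hypothetical basin, not taken from the study.
relief, length = 1.0, 20.0           # Bh and Lb in the same length unit (assumed)
area_km2, mean_slope = 300.0, 30.0   # A in km^2, Ms in degrees (assumed)

g = basin_gradient(relief, length)
ffmi = flash_flood_magnitude(mean_slope, area_km2, g)
print(f"gradient = {g:.2f}, FF magnitude = {ffmi:.2f}")
```

The guard on A/G reflects a property of Equation (15) itself: the logarithm is zero when A equals G and negative when A is smaller, so the index is only meaningful for basins whose area exceeds the gradient value.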

#### **4. Results and Discussion**

#### *4.1. Evaluation of the Models Performance and Validation*

Visual inspection shows that there are some differences among the FFSM maps produced using machine learning models. Thus, it is important to evaluate model performance and assess the prediction accuracy. The results from the evaluation of the model performance show that the BRT model had the highest accuracy, followed by the CART and the NB models. The BRT yields an F1 score value of more than 0.91 for all FFS classes, followed by the CART with an F1 score value of more than 0.90 for high and very high classes (Figure 9).

The NB had the lowest F1 score for all FF classes. Thus, the validation results confirmed a positive agreement between the observed and predicted values for the BRT and CART models. Additionally, the slight difference between the F1 scores of the BRT and CART models is not statistically significant [87]. The BRT model offers reliable information regarding the FF to be predicted [42]. It combines an existing learning method with boosting and therefore has the dual advantage of boosting and decision trees [87]. Further quantitative validation using the ROC curve was performed to examine the reliability of the obtained FFSM [88]. Consistent with the F1 scores, the BRT model has the highest AUC value (0.92), followed by the CART model (0.90) and the NB model (0.79). The high performance of the BRT arises because it combines the CART with a boosting algorithm (Figure 10).
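For readers unfamiliar with the metrics compared above, the following Python sketch shows how per-class precision, recall and F1 score can be derived from observed and predicted class labels. The function name and the six sample labels are hypothetical; the study itself computed these metrics in R.

```python
from collections import Counter


def per_class_scores(observed, predicted):
    """Return {class: (precision, recall, f1)} for two equal-length label lists."""
    if len(observed) != len(predicted):
        raise ValueError("label lists must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for obs, pred in zip(observed, predicted):
        if obs == pred:
            tp[obs] += 1
        else:
            fp[pred] += 1  # the predicted class gains a false positive
            fn[obs] += 1   # the observed class gains a false negative
    scores = {}
    for cls in sorted(set(observed) | set(predicted)):
        precision = tp[cls] / (tp[cls] + fp[cls]) if tp[cls] + fp[cls] else 0.0
        recall = tp[cls] / (tp[cls] + fn[cls]) if tp[cls] + fn[cls] else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[cls] = (precision, recall, f1)
    return scores


# Hypothetical susceptibility classes for six pixels.
observed = ["high", "high", "low", "low", "moderate", "high"]
predicted = ["high", "low", "low", "low", "moderate", "high"]

for cls, (p, r, f1) in per_class_scores(observed, predicted).items():
    print(f"{cls}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a class scores highly only when both are high, which is why it is reported per susceptibility class.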

**Figure 9.** A comparison of precision (**a**), recall (**b**) and F1 score (**c**) for the FF susceptibility class using BRT, CART and NBT models.

**Figure 10.** ROC curves for the FFS maps produced by BRT, CART and NBT.

#### *4.2. Spatial Analysis and Flash Floods Susceptibility Mapping*

The results of the spatial analysis show that the extreme FF events occurred in the narrow alluvial plains of the mountainous and coastal areas. These areas are characterized by steep slopes, high relief, surface run-off, and a high density of streams. A higher density of streams reflects rocks with lower permeability, which increases the probability of FF occurrence. The most important FFCPs affecting FF occurrence were altitude and slope (Figure 5a,b). Both parameters strongly influence relief, topographic curvature (Figure 5c,d), soil moisture, and surface run-off. For topographic curvature, convex classes (>0) have a very low influence on FF occurrence, whereas concave slopes (<0) had the strongest impact (Figure 5c). About 90% (40 FF events) of the past FF events occurred at elevations from 300 m to 1400 m and on slopes between 10° and 15° (Figure 5a). Another important FFCP was lithology: the upper streams are dominated by igneous and metamorphic rocks, while the lower streams are dominated by alluvial deposits. Most of the past FF events occurred in the alluvial plains and fans (flooded plains) at the foot of the mountainous areas (igneous and metamorphic rocks) (Figure 6a). For distance from streams and stream density, the highest number of past FF events occurred in areas within 1000 m of the major stream networks (wadi courses) that are characterized by a low density of streams (Figure 6b,c).

Parameters such as LULC, aspect, and plan curvature make no significant contribution to the modeling process and could reduce the accuracy of the model's predictions [13,44,89]. These parameters should be excluded from the modeling process, since the aspect is already calculated during the extraction of stream networks and the area is characterized by low urban development [13,42].

The FFSMs were constructed by dividing the study area into separate pixels. Each pixel was categorized as a flood or non-flood class. The FFS index was then calculated for all pixels of each map, so that each pixel was assigned a unique susceptibility index [12,13,36]. Testing of several classification methods, such as equal interval, geometrical interval, natural break and quantile, showed that the quantile and interval methods were the most appropriate for classifying flooded and non-flooded areas, respectively. This finding agrees well with Khosravi et al. (2016) [36], who tested several classification methods for different susceptibility mapping applications. The FF susceptibility maps produced using the BRT, CART and NBT models are shown in Figure 11. The susceptibility indices were categorized into four class intervals using the quantile technique, which is widely used in the literature [12,36,90]. The resulting susceptibility classes, namely very high, high, moderate and low, were used to construct the FFSMs (Figure 11).
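The quantile technique described above can be illustrated with a short Python sketch that splits susceptibility indices into four classes at the quartiles. The class names follow the text, while the function name, the choice of the standard-library `statistics.quantiles` routine, and the eight sample indices are assumptions made for illustration; GIS packages may place the class breaks slightly differently.

```python
from bisect import bisect_right
from statistics import quantiles

CLASS_NAMES = ["low", "moderate", "high", "very high"]


def quantile_classes(indices):
    """Assign each susceptibility index to one of four quantile classes."""
    cuts = quantiles(indices, n=4)  # three cut points (the quartiles)
    # bisect_right counts how many cut points are <= the value.
    return [CLASS_NAMES[bisect_right(cuts, value)] for value in indices]


# Hypothetical susceptibility indices for eight pixels.
indices = [0.05, 0.12, 0.30, 0.41, 0.58, 0.66, 0.83, 0.97]
labels = quantile_classes(indices)

for value, label in zip(indices, labels):
    print(f"{value:.2f} -> {label}")
```

By construction, a quantile classification places roughly the same number of pixels in each class, which distinguishes it from equal-interval schemes that split the index range evenly.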

The maps demonstrate that the high and very high susceptibility classes are commonly located in wadi courses and alluvial plains of the mountainous areas in the east and north. Some portions of the very high and high classes are located at the foot of mountainous areas. About 54% (3196.4 km<sup>2</sup>) of the total area was classified as high and very high FF susceptibility, 19.3% (1136 km<sup>2</sup>) as moderate FF susceptibility, and 26.5% (1561 km<sup>2</sup>) as low FF susceptibility. The effectiveness of the proposed MCL models was confirmed by F1 and AUC values higher than those of the individual MCL model.

**Figure 11.** Flash flood susceptibility maps: (**a**) BRT, (**b**) CART, (**c**) NBT.

#### *4.3. Geohydrological Model for FFMI and Filling the Gaps in MCL Maps*

Although the BRT model yields the highest performance, the geographical and spatial variability of the valley depth and alluvial plain width parameters has not been taken into consideration. In this study, the FF magnitude index (FFMI) was calculated using a set of new terrain parameters for each derived basin (Table 1). These parameters include basin area (A) (Figure 12a), the length of the basin (Lb) (Figure 12b), relief (Bh) (Figure 12c), alluvial plain width (Aw) (Figure 13a), gradient (G°) (Figure 13b), and mean slope (Ms) (Figure 13c).

Figure 12a shows that the area is divided into seven FF basins (zones) of two types. The first type comprises narrow coastal zones such as RAK in the northwest and Masafi, Rul Dadanh-Dibba and Fujairah-Kalba in the east. The second type comprises wide inland basins (zones) such as Falahyeen and Al Dhaid in the west and Hatta-Houylate in the south (Figures 1, 2 and 12a). Except for the Al Dhaid and Falahyeen basins, all basins are small in area, short in length, drained by dendritic streams, and have narrow alluvial plains. These zones and their adjoining areas have high gradient angles ranging from 10° to 33°, high relief values of more than 900 m, mean slopes of more than 30°, and an alluvial plain width of less than 5 km (Figures 12 and 13). Lithologically, all upper streams are dominated by igneous, metamorphic, and carbonate rocks, while the lower streams are dominated by alluvial deposits. These parameters directly influence the destructive magnitude of the FF and have a greater impact on the occurrence of FF in an arid region. For example, higher relief and steeper slopes indicate rocks with lower permeability and high runoff potential, which, in a basin with a narrow alluvial plain, increases susceptibility to floods [91].

Figure 14a shows the modified FF map produced using the proposed hybrid approach. The map shows different FF zones, each with its own FF magnitude. The estimated FF magnitude values for the basins of RAK and Masafi were 3.24 and 3, respectively (Table 1 and Figure 14a). Villages, roads and farms in these basins were severely affected. These basins cover an area of about 1379 km<sup>2</sup> (23.4%). The Rol Dadnah and Fujairah-Kalba basins cover an area of 1055.6 km<sup>2</sup> (17.9%) and have high FF magnitude values of 2.96 and 2.71, respectively. Hatta-Houylate has a moderate FF magnitude of 1.11, while Falahyeen and Al Dhaid have FF magnitude values of 0.57 and 0.16, respectively.

To validate the produced FFMI, the past FF events were draped over the FFMI and a spatial analysis was performed. The results showed that most of the past FF events (40 FF events) occurred in high and very high FF susceptibility zones. Further analysis, performed by draping the existing infrastructure and agricultural areas over the FFMI, shows that most of the villages and farms in the mountainous areas and the RAK are located in areas at higher risk. This is expected, since all settlements, farms and roads have been constructed in the high and very high susceptibility zones.

The proposed approach allows FFCPs to be updated at any time as new parameters become available.

**Figure 12.** Maps of flash flood conditioning factors used for measuring FF magnitude: (**a**) basin area, (**b**) basin length, (**c**) relief.

**Figure 13.** Maps of flash flood conditioning factors used for measuring FF magnitude: (**a**) alluvial plain width, (**b**) gradient, (**c**) mean slope.

**Figure 14.** Maps of FF susceptibility obtained using a hybrid approach and new FFCPs (**a**), and its related infrastructures risk map (**b**).

#### **5. Discussion**

#### *5.1. Evaluation of the Models Performance and Validation*

In this study, a hybrid approach that integrates machine learning and geohydrological models was modified to map FF susceptible areas and measure their FF magnitude in an arid mountainous region. We first used three machine learning models, which can map the susceptibility of natural phenomena with nonlinear relationships without requiring prior statistical assumptions or data transformation [12,92,93]. These types of models can fit complex nonlinear relationships between FF locations and conditioning parameters, and their efficiency was compared based on accuracy metrics (precision, recall and F1 score) and AUC-ROC [14].

The results demonstrated that the BRT model had the highest performance, while CART had a higher accuracy compared with NBT [53]. This finding is consistent with Rahmati et al. (2020) [94], who used a machine learning approach for spatial modeling of agricultural droughts. They concluded that the BRT and CART models showed the best performance and prediction accuracy compared with NBT and linear supervised classifiers. Our findings also agree well with Naghibi et al. (2016) [65], who concluded that the BRT model produced the best prediction results, followed by the CART and RF models. These machine learning models, used widely in the literature, were applied due to their simplicity in description, their accuracy, and straightforwardness of interpretation [7,8,13,14,22,23,29–31,33,53,94,95]. However, few studies have applied a hybrid approach that integrates machine learning models with morphological and geohydrological parameters to map FF susceptibility and measure its magnitude for each basin in the FFSM.

#### *5.2. Spatial Analysis and Flash Floods Susceptibility Mapping*

FF is one of the main destructive phenomena that occur in mountainous areas and narrow alluvial coastal areas, especially in the NUAE. FF susceptibility mapping using remote sensing and MCL algorithms is considered a crucial step in reducing the destructive impact of any future FF event [36,80,96]. Spatial analysis showed that most of the built-up and agricultural areas of the Emirates of RAK in the northwest and Fujairah in the east (95%), and some parts of the Emirates of Ajman and Sharjah (20%), are located in high and very high susceptibility zones. Thus, most roads, dams, farms, and the human population are highly susceptible to FF because they are located in the wadi courses and at the foot of the mountainous areas. These areas receive intensive rainfall due to the impact of climate change [38]. In these zones, a proper urban planning scheme is very important to reduce the risk of any future FF event (Bathrellos et al., 2017).

Numerous previous studies have proposed combinations of MCL models for FFS mapping. They built susceptibility maps using several conditioning factors that are relatively complex [28,36,38,86,96]. Other studies have shown that intensive precipitation, LULC and geohydrology parameters are important factors controlling FF occurrence [28,36,96]. Further studies have shown that factors such as human activities are significant in FF occurrence [25,94]. Factors such as LULC and human activities could not be considered significant in the study area because of its low population and low intensity of human activity. Additionally, the FFSMs obtained using MCL are, in reality, altitude and/or slope maps. Thus, it is important to complement them with a geohydrological model in a hybrid approach.

#### *5.3. Geohydrological Model for FFM Indexing and Filling the Gaps in MLC Maps*

To measure FF magnitude and fill the gaps in the MCL maps, it is important to apply a hydrological model. Until now, there has been no standard rule for choosing FFCPs or flood and non-flood locations. Here, the result obtained using the proposed approach and new FFCPs is consistent with the constructed FF inventory map. It demonstrates that the proposed approach was able to map FF susceptible areas and measure their magnitudes in an arid region much more accurately and reliably than the ensemble machine learning approaches that are widely used to map the susceptibility of groundwater potential [82], land subsidence [12], landslides [3,42,85], and flash floods [3,23,26,29–31]. The susceptibility maps obtained using MCL can be upgraded and re-categorized using the proposed approach, which was able to create a satisfactory FFM. The results show that the highest number of past FF events in the study area occurred in the major mountainous streams (wadi courses) and the narrow coastal strip in the east and northwest. These areas are lowlands covered by alluvial deposits, located at the foot of the Oman mountains and characterized by gentle slopes.

Based on the new FFMI map and its related infrastructure map (Figure 14b), about 153.34 km of roads in the mountainous areas and at their foot are dangerous and potentially deadly. Roads in residential areas are also dangerous and have a higher probability of being destroyed (Figure 14b). In Ras Al Khaimah (RAK), one woman was crushed to death after a wall collapsed during a violent storm (NCM, 2020). In the Ghalilah and Al Fahlain villages of the RAK, flash floods destroyed roads and farms and flooded the village graveyard (Figure 1). The risk of damage can be reduced by constructing valley dams and a real-time alert system in the mountainous areas. The existing human settlements in the valley mouths should be shifted to terrain at a lower elevation with a very gentle slope. Here, the produced FFSM and FFMI can be used as a reference by decision-makers and urban planners.

The results of the proposed approach permit a better understanding of the natural hazard setting of the study area for the first time. The results also facilitate the detection of sites with a higher probability of FF occurrence and help identify infrastructure located in high-risk areas. The geohydrological approach can be used to fill the gaps in the FFSMs obtained using MCL models and represents an effective approach for FFSM and for measuring FF magnitude, particularly in the NUAE, which has not been investigated previously. This finding agrees well with Chen et al. (2019) [97], who concluded that hybrid models are superior. However, some limitations were encountered during the modeling process. These limitations include the spatial resolution and number of FF conditioning parameters as well as the optimal parameterization of the machine learning algorithms [12,13,95]. Therefore, future work will focus on FF susceptibility mapping using new FFC parameters such as alluvial plain width, the depth of the mountainous valley, and the gradient of the basin. Future work will also focus on constructing a real-time meteorological system, which is needed to predict areas with a higher FF occurrence. Plantation of *Prosopis cineraria* forests and the installation of steel wedges and screens on the wadi slopes are also needed to reduce runoff potential.

#### **6. Conclusions**

In this study, a hybrid approach that integrates machine learning (the BRT, CART and NBT) and geohydrological models was applied for FF susceptibility mapping and constructing the FFMI. The proposed approach was applied, for the first time, to the NUAE. Eight FFCPs, namely altitude, topographic slope, topographic curvature, relief, stream density, lithology, and distance from streams, were chosen for FFSM. The parameters were selected based on their level of influence on FF occurrence, the geo-environmental characteristics of the study area, the geological background of the authors, and their wide use in the literature. Parameters such as LULC, aspect, plan curvature, and NDVI were ignored since the aspect (flow direction) is already calculated during stream network extraction, and the study area is characterized by low population, human activity, and large vegetation cover.

The performance of the machine learning models was evaluated by calculating accuracy metrics using the F1 score for each model and the ROC curve. The results showed that the BRT had the highest performance, followed by the CART and NBT models. The FFSM produced using the BRT was modified by applying a geohydrological approach, and the results showed that the area consists of seven FF zones. Each FF zone has its own geohydrological characteristics and FF magnitude. The highest FF magnitude was found in the zones of the RAK, Masafi, Rul Dadna, and Fujairah-Kalba, while the lowest FF magnitude was found in the zones of Al Dhaid and Falahyeen in the west. These magnitudes can be further refined by applying the proposed approach to sub-basins using remote sensing data with a higher spatial resolution. New FFCPs such as alluvial plain width, stream depth, basin gradient and mean slope can be considered in any future study, especially in an arid region. In conclusion, the proposed approach and new FFCPs from this study demonstrated the superiority of hybrid models, and the obtained FFSMs can assist urban planners, geohazard specialists and decision-makers in reducing the risk of FF in an arid region.

**Author Contributions:** Data providing, M.M.; supervision and project administration T.A.; writing—original draft and data analysis, S.E. All authors have read and agreed to the published version of the manuscript.

**Funding:** The research has received funding under financial grant SCRI 18 Grant EN0- 284.

**Acknowledgments:** The authors would like to thank the American University of Sharjah for supporting this research.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

# *Article* **On the Importance of Train–Test Split Ratio of Datasets in Automatic Landslide Detection by Supervised Classification**

#### **Kamila Pawluszek-Filipiak \* and Andrzej Borkowski**

Institute of Geodesy and Geoinformatics, Wroclaw University of Environmental and Life Sciences,

50-375 Wroclaw, Poland; andrzej.borkowski@upwr.edu.pl

**\*** Correspondence: kamila.pawluszek-filipiak@upwr.edu.pl

Received: 31 July 2020; Accepted: 15 September 2020; Published: 18 September 2020

**Abstract:** Many automatic landslide detection algorithms are based on supervised classification of various remote sensing (RS) data, particularly satellite images and digital elevation models (DEMs) delivered by Light Detection and Ranging (LiDAR). Machine learning methods require the collection of both training and testing data to produce and evaluate the classification results. The collection of good quality landslide ground truths to train classifiers and detect landslides in other regions is a challenge, with a significant impact on classification accuracy. Taking this into account, the following research question arises: What is the appropriate training–testing dataset split ratio in supervised classification to effectively detect landslides in a testing area based on DEMs? We investigated this issue for both the pixel-based approach (PBA) and object-based image analysis (OBIA). In both approaches, the random forest (RF) classification was implemented. The experiments were performed in the most landslide-affected area in Poland, in the Outer Carpathians-Rożnów Lake vicinity. Based on the accuracy assessment, we found that the training area should be of a similar size to the testing area. We also found that the OBIA approach performs slightly better than PBA when the quantity of training samples is significantly lower than the testing samples. To increase detection performance, the intersection of the OBIA and PBA results together with median filtering and the removal of small elongated objects were performed. This allowed an overall accuracy (OA) = 80% and F1 Score = 0.50 to be achieved. The achieved results are compared and discussed with other landslide detection-related studies.

**Keywords:** automatic landslide detection; OBIA; PBA; random forests; supervised classification

#### **1. Introduction**

The limitations of landslide field mapping are widely reported in the literature [1–8]. In certain conditions, such as densely vegetated terrain, field-based investigation is ineffective or even impossible [9]. Benefiting from an abundant collection of remote sensing (RS) data, automatic approaches have been introduced to landslide studies by various scientists [1,3–8,10–28]. Among the automatic methods, pixel-based (PBA) [4,5,14,16,19] and object-based (OBIA) [7,8,10–13,15,20] classification methods can be distinguished. Different studies have compared the performance of OBIA and PBA in various RS applications [29–31], including landslide detection [20,32,33].

Various supervised classification methods can be applied in PBA and OBIA to detect landslides. Supervised classification requires the collection of both training and testing data to produce classification results and assess the classification accuracy. The collection of good quality landslide ground truth data to train the classifier is a challenge due to the time, access, and interpretability constraints, in addition to the need for expert knowledge [34]. Many scientists emphasize that training samples have a significant impact on classification accuracy [35–37]. In particular, the number/quantity of training samples, which can also be interpreted as the training–testing area ratio, has a crucial impact on classification results.

In the RS literature, many works present the problem of training samples in classification and report that a reduction of the training sample quantity results in a decrease in accuracy [35–37]. However, there is a lack of studies that investigate the influence of the training–testing split ratio on the accuracy of landslide detection.

Considering this research gap, the objective of this study was to assess the influence of the training–testing split ratio of the study area on the accuracy of automatic landslide detection using supervised classification and based on DEM-derived features.

Reading the literature related to automatic landslide detection can lead to confusion because there is ambiguity in terms of basic concepts. For this reason, we adopted the predominant terminology used in the machine learning (ML) community [38,39]. According to this, the initial dataset is split into training and testing datasets. A small portion separated from the training samples is called the validation dataset. Training and validation samples are used to construct and fine-tune the classifier. We used the so-called cross-validation approach for this purpose. The performance of the trained classifier was then verified and assessed using the testing dataset.
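The terminology above can be made concrete with a minimal Python sketch that first separates training from testing samples at a given ratio and then rotates a validation fold inside the training set. This is an illustration only: it splits sample indices at random, whereas the study divides the area spatially, and the function names, the 50:50 ratio, and the five folds are assumptions.

```python
import random


def train_test_split_indices(n_samples, train_ratio, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must lie strictly between 0 and 1")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_ratio)
    return indices[:n_train], indices[n_train:]


def k_fold(train_indices, k):
    """Yield (fit, validation) index lists for k-fold cross-validation."""
    folds = [train_indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validation


# Hypothetical dataset of 100 samples with an assumed 50:50 split.
train, test = train_test_split_indices(n_samples=100, train_ratio=0.5)

for fit, validation in k_fold(train, k=5):
    # A classifier would be fitted on `fit` and tuned on `validation` here.
    assert not set(fit) & set(validation)

# The testing samples are never touched during fitting or tuning.
print(len(train), len(test))
```

Keeping the testing indices out of the cross-validation loop is the point of the three-way terminology: the testing dataset is used once, after the classifier has been constructed and tuned.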

We divided our study area into training and testing sites according to two strategies. In the first strategy, testing areas were divided using the so-called region growing approach. In the second strategy, we divided our study area into training and testing areas based on the boundary determined by the water reservoirs of Rożnów Lake and the Dunajec River. These various classification schemes were implemented for PBA and OBIA. The classification was performed using the random forest (RF) classification algorithm. Numerical investigations were carried out in an area highly prone to landsliding located close to Rożnów Lake in Poland.

#### **2. Related Studies**

Automatic methods for landslide mapping include analyses of RS data, such as optical images [8,10,40,41], synthetic aperture radar (SAR) data [42,43], and Light Detection and Ranging (LiDAR) delivered digital elevation models (DEMs) [14,15,17,18,44,45]. The diversity of data and their resolution provide opportunities for various types of investigations. Since SAR data processing allows for estimating ground deformation, these data are usually applied for monitoring purposes and indirectly for landslide detection [43,46–49]. Optical RS data and LiDAR data allow landslides to be directly detected. Some researchers have attempted to utilize low-resolution optical images, such as Landsat [50–53]. However, these data appear to be not detailed enough for the detection of some small landslides [53]. The launch of SPOT as the first medium-resolution satellite captured significant attention of the scientific community. The first applications of SPOT data for landslide detection were presented by [54] and [55]. Subsequently, numerous other scientists have applied medium-resolution optical images for landslide detection [56–58] also integrated with SAR data [59]. A completely new research dimension has been provided by very-high-resolution optical images [41,60,61]. The application of optical images is effective for the detection of recent landslide catastrophic events that generate explicit and visible land cover changes (before and after the event), for example, the loss of vegetation, presence of fresh soil, and exposure of debris [12]. For old and/or slow-moving landslides where the changes in land cover/land use cannot be clearly observed, it is often impossible to distinguish landslide-affected areas from the image background; however, this issue also depends on the image resolution [8,12,15]. Thus, LiDAR is used due to its multi-return laser pulse, which has the ability to penetrate through plant cover. 
This allows for the filtering of vegetation and other non-ground objects and provides a very detailed bare-earth terrain model [62]. Therefore, LiDAR-delivered DEMs are commonly used alone [4,5,14,15,17,18,63] or integrated with other data [12,64] for landslide detection in such areas.

Among the automatic approaches related to this study that utilized DEM data, McKean and Roering [4] were probably the first researchers who attempted automatic extraction of landslide features from a 1-m LiDAR-DEM in a 0.5-km<sup>2</sup> landslide complex near Christchurch, New Zealand. Surface roughness allowed for separating the landslide complex into four kinematic units. Subsequently, Glenn et al. [63] carried out a numerical analysis of LiDAR elevation data collected for two canyon-rim landslides covering an area of 17 km<sup>2</sup> in southern Idaho, USA. They separated landslides into various morphological domains based on morphometric data, topographic measurements, and field observations. One year later, Sato et al. [65] captured topographic information from an airborne LiDAR survey, such as the terrain gradient, topographic texture, and local convexity, and classified landform types into 17 domains over a 3.8-km<sup>2</sup> landslide area in the Shirakami Mountains, Japan. Noteworthy research was presented by Booth et al. [5], who applied two-dimensional discrete Fourier transform and continuous wavelet transform for two LiDAR-DEMs to characterize the spatial frequencies of morphological features characteristic of deep-seated landslides in the Puget Sound lowlands, Washington, and the Tualatin Mountains, Oregon, USA. In the same year, a similar work was presented by Kasai et al. [66]. They applied a 1-m LiDAR DEM to identify geomorphic features within deep-seated landslides in a 5-km<sup>2</sup> mountainous terrain area in the Kii mountain range, Japan. Chen et al. [19] used DEM-derived features and the RF algorithm for landslide mapping. Aspect, DEM, and slope images and their texture and window moving standard deviation filtering were applied for landslide detection in the region of Three Gorges, China. Pawluszek et al. 
[14] applied an extended set of DEM derivatives to assess the sensitivity of automatic landslide mapping using various supervised classification methods in the area of Carpathians in Poland. For semi-automatic extraction of landslide features, Passalacqua et al. [67] and Tarolli et al. [44] proposed two different approaches. However, both found a problem related to PBA, which does not consider or only marginally considers the local geomorphological setting and "context", such as the size, shape, and position in the landscape of the extracted features. Therefore, new needs appeared for the exploration of contextual information.

Around the year 2000, the Geographic Information System (GIS) and image processing community began to pay special attention to OBIA [68]. OBIA, in contrast to PBA, utilizes a full range of spectral, spatial, textural, and contextual parameters to delineate regions of interest [7,10,11,68]. In OBIA, individual landslides are considered an ensemble of pixels, rather than individual pixels that are spatially unrelated [13,68,69]. In our study area, because landslides did not generate explicit and visible land cover changes, the application of optical data solely would be ineffective; thus, we integrated these data with a DEM. Nevertheless, previous research based on optical RS presented leading developments in OBIA methodologies. For instance, Lahousse et al. [70] developed a multi-scale OBIA to map shallow landslides in the Baichi watershed in Taiwan after the 2004 Typhoon Aere event. Furthermore, ML classification methods have also been applied for landslide detection. Stumpf and Kerle [10] took advantage of OBIA and ML and proposed a supervised workflow for landslide detection to reduce manual labor and objectify the choice of significant object features and classification thresholds. They utilized very-high-resolution RS images (Quickbird, IKONOS, Geoeye-1, and aerial photographs). In addition, Stumpf et al. [8] introduced a semi-automatic approach based on object-oriented change detection for rapid landslide mapping using very-high-resolution optical images. The algorithm was first developed in a training area of Altolia and subsequently tested without modifications in an independent area of Italy.

Due to the limitations of optical RS, OBIA has also captured the interest of scientists utilizing DEMs for landslide detection [71]. The first example of an OBIA and DEM application for landslide detection is the study of Van Den Eeckhaut et al. [7]. The authors utilized support vector machine classification and DEM derivatives, such as the slope gradient, roughness, and curvature, in the Flemish Ardennes in Belgium for mapping slow-moving landslides in densely vegetated terrain, in which optical and spectral data could not be applied. Then, Li et al. [20] identified forested landslides using OBIA, DEM, and RF algorithms in the area of Three Gorges, China. Pawluszek et al. [15] performed a multi-aspect analysis of OBIA for landslide detection in the Polish Flysch Carpathians by utilizing only DEM data. They found that OBIA is very sensitive to scale and DEM resolution, and that texture-related variables (grey level co-occurrence measures) were not helpful in landslide detection. Moreover, at present, geomorphological mapping is also integrated with OBIA. Knevels et al. [13] implemented OBIA combined with geomorphological mapping to identify landslides in Oberpullendorf, Austria [7,13]. Prakash et al. [12] integrated DEM and Sentinel-2 images with ML and deep learning methods for landslide detection in Douglas County, Oregon, USA.

Most of the aforementioned landslide approaches utilized supervised classification for landslide detection, but none have investigated the effect of the train–test split ratio of the study area on the accuracy of landslide detection. This problem is widely recognized and discussed in RS, for instance, in the literature related to land cover mapping [35–37]. Thus, this motivated us to investigate this research issue in applications for landslide detection.

In addition to supervised methods, other types of automatic algorithms exist that are based on DEM analysis and are worth mentioning. For instance, Leshchinsky et al. [18] presented a new approach for the automatic and consistent mapping of landslide deposits called the contour connection method (CCM) based on DEM. In CCM, contours and nodes are applied to mapping and vectors are used to connect the nodes to evaluate gradients and associated landslide features based on criteria defined by the users. Another study that continued the application of this method was presented by Gaidzik et al. [72]. The authors mapped landslides based on two approaches: (1) manual mapping using satellite images and (2) automatic landslide morphology detection by employing the CCM. The automated inventory provided by the CCM with LiDAR DEMs effectively minimizes the time and subjectivity required. A continuation of this method was presented by Bunn et al. [17], who utilized a semi-automated method called the scarp identification and contour connection method (SICCM), which utilizes various geologic conditions automatically or semi-automatically introduced by simple inputs and interpretation from an expert. The application of the presented approach was demonstrated for three study areas: the Oso landslide in Snohomish County, Washington, and Dixie and Pittsburg in the Oregon Coast Ranges.

#### **3. Study Area and Data**

#### *3.1. Study Area and Geological Conditions*

The study area is located in the vicinity of Rożnów Lake, in the central part of the Outer Carpathians, in the Małopolskie municipality, Poland (Figure 1). The study area covers from 49°40′N to 49°46′N latitude and from 20°38′E to 20°48′E longitude. Within the study area of 157 km<sup>2</sup>, around 21 km<sup>2</sup> is affected by landslides. This means that landslides occupy 13% of the entire study area. Within the study area, there are translational, rotational, or combined rock-debris slides and typical debris slides [73–76]. Based on Varnes' classification, updated by Hungr et al. [74], landslides within the study area are slow- to very slow-moving landslides. The landslide activity is significantly connected with hydro-geological factors, such as rock stratification and precipitation. Activation of deep rockslides requires long continuous precipitation of 100 to 500 mm per month, while cumulative rainfall of 50–400 mm over the course of 2–5 days can induce mudslides and debris slides [75]. Usually, north-facing landslides are found to be complex, while south-facing landslides tend to be insequent or subsequent [75]. Figure 1c,d presents the various landslide morphologies within the study area. Unfortunately, there are many landslides with smoothed morphology (Figure 1d), which makes them difficult to detect.

**Figure 1.** Location of the study area (**a**) with a false color image (spectral bands: 4-3-2) of a Sentinel-2A image (**b**) acquired 3/10/2015. Examples of landslide shapefile from the national landslide database (SOPO) for (**c**) landslide with visible terrain roughness and (**d**) landslide with smoothed terrain.

Appendix A Figure A1 presents the Corine Land Cover (CLC) (Figure A1a) and the normalized difference vegetation index (NDVI) (Figure A1b) for the study area. According to CLC, the study area is mostly covered by non-irrigated arable land (26%), mixed forest (20%), and lands principally occupied by agriculture with significant natural vegetation (18%). The remaining parts are covered by various types of forest (coniferous forest, broad-leaved forest), plantations, pastures, and water bodies (8%). Appendix A Figure A1b presents the NDVI calculated for Sentinel-2A data acquired on 25/03/2020. As can be observed, 53.7% and 31.8% of the whole area have values greater than 0.6 and 0.3, respectively. This indicates that most of the study area is covered by vegetation (forest and agricultural areas), which is in agreement with CLC.

The terrain of the Beskid Mountain Range area mainly has features of low- and medium-high mountains and medium-high foothills [77]. The slope length ranges from 0.6 to 1 km [75]. Predominant slope gradients are in the range of 0–68° and the relative elevations range from 266 to 613 m in the montane area. In sub-montane areas, slope gradients are in the range of 0–72° and 0–82° for Wielickie and Ciężkowickie Foothills, respectively. Correspondingly, the relative elevation within Wielickie and Ciężkowickie Foothills is from 232 to 486 m and 234 to 581 m, respectively. In Ciężkowickie Foothills, landslides range in size from 537 m<sup>2</sup> to 92 ha. In Wielickie Foothills, landslides range from 584 m<sup>2</sup> to 26 ha. In Beskid Mountains, landslides range from 925 m<sup>2</sup> to 37 ha. The mean size of the landslides within the study area is 3 ha. Large inactive landslides can be generally observed in Beskid Wyspowy in woodland areas, on upper slope segments, and in cones of depression [73]. The area most susceptible to landsliding is that directly adjacent to Rożnów Lake.

In Appendix B, the geological map for the study area, with explanation, is presented. The study site mainly comprises Eocene–Oligocene sandstones and shales and Upper Cretaceous sandstone and conglomerate–Lower Stebna layers. Additionally, many different geological subunits are interconnected with each other (see Appendix B). Based on Appendix B, it can be observed that the landslide bodies are mainly located in the boundaries/contacts of the units and steep slope areas along Rożnów Lake, where the slope stability is poor. These areas are mainly covered by sandstones and shales. For example, in the boundaries of the Eocene sandstone and shale in the Śląska Series, Oligocene-aged shale of the Krosno layers is found in high slope areas along the lake and Paleocene–Eocene-aged spotted shale is found in the Magura Series. In contrast, landslides are less observed in medium-thick Oligocene sandstones and shales of the Śląska Series. Other geological units have a low susceptibility to landslides [73,75].

#### *3.2. Data*

Various data were utilized for this analysis. LiDAR data were acquired using a Riegl LiteMapper 6800i system based on the Q680i laser scanner. The point cloud planimetric density is equal to 4–6 points/m<sup>2</sup>, and the estimated root mean square error for the height component is about 0.15 m [78]. The ability of LiDAR to capture topographic information is highly advantageous in forested areas [7,19,20]. The landslide inventory database (SOPO) from the Polish National Geological Institute was utilized to capture the training and testing datasets. The SOPO database consists of geological data, in addition to information on landslide locations and their type, and on areas prone to mass movements.

The location of existing landslides was collected in the SOPO database by the method approved by Polish National Geological Institute [79]. This method included conventional techniques, mostly comprising field reconnaissance, the visual interpretation of aerial photographs, the analysis of historical data, and detailed geomorphological/geological analysis [75]. Landslides within the study area stored in the SOPO database were mapped during field work in the years 2010, 2011, 2012, 2013, 2014, and 2015 [76,80,81]. Additional mapping work was also performed on the basis of topographic maps at a 1:10,000 scale supported by stereoscopic analyses of aerial photographs and LiDAR data [82]. In addition to LiDAR and landslide inventory maps, geological maps over the study area were acquired in raster format from the Polish National Geological Institute. Furthermore, Sentinel-2A images acquired on 25/03/2020 and road network maps from Open Street Map were utilized. Table 1 summarizes the data used, and their types and sources.


#### **4. Methods**

An overview of the methodology applied is presented in Figure 2. The 2-m DEM generated from the LiDAR point cloud was used for the extraction of topographic variables. Other data, such as Open Street Map (OSM) or Sentinel-2A data, were used for additional non-topographical layer extraction. Moreover, we utilized the DEM to extract the stream network and Sentinel-2A to extract the extent of Rożnów Lake. A detailed description of the extracted variables is presented in Section 4.1. The extracted variables were used for supervised classification using pixel- (PBA) and object-based (OBIA) approaches. For each classification approach, we classified the study area using the two training and testing strategies presented in Section 4.4. The accuracy parameters were computed for each classification approach and for the various training and testing strategies.

**Figure 2.** Overview of the entire methodology carried out in the present study.

Additionally, taking advantage of various processing approaches (pixels vs. object), the final detection map was generated for overlapped results from PBA and OBIA. Then, additional refinement was carried out (see Section 4.6). Furthermore, by utilizing the RF algorithm, we were able to indirectly assess the feature relevance in automatic landslide mapping.

#### *4.1. DEM Generation and Feature Extraction*

The classified LiDAR point cloud was acquired within the IT System of the Country Protection (ISOK) project [78,83]. LiDAR point cloud filtration within this project was performed using different software, mainly TerraScan based on Axelsson's filtering method [84–86]. The filtered point cloud was corrected manually based on a visual inspection of the point cloud and aerial photographs. A classified LiDAR point cloud with a mean density of 4–6 points/m<sup>2</sup> was used to generate the base DEM. The natural neighbor interpolation method was used to avoid smoothing of possible terrain breaklines represented in the original point cloud. Following the recommendations provided in [14], we then resampled the base 0.5-m DEM to a 2-m resolution. This allowed us to preserve all landslide surface features while significantly decreasing the data volume and removing the artifacts present in the data at the original resolution. The issues connected with the suitable DEM resolution have been investigated by various scientists, many of whom reported that the finest DEM resolution is not the best choice [12,14,15,26,87]. Then, DEM derivatives (also called topographic variables, land-surface variables, or DEM variables) were calculated from the DEM. Based on a literature review, the DEM variables presented in Table 2 were utilized. For OBIA, the mean pixel value of each DEM variable inside the object was used. We applied 14 DEM variables, which are widely used and recommended in the literature [7,10,13–15,19,20,87]. Roughness, curvature, mean slope, topographic position index (TPI), and openness were calculated using various kernel sizes according to the recommendations provided by Pawluszek et al. [14]. To take advantage of the hillshade, which is calculated by illuminating the DEM with sunlight coming from a specific direction, we calculated the hillshade layers using eight sun angles (Figure 3) and then summed these layers into one hillshade layer. This allowed us to simulate the sunlight coming from various directions [34].
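The summation of hillshade layers can be sketched as follows. This is a minimal illustration, not the software listed in Table 2: the function names, the Lambertian shading model, the 45° sun altitude, and the eight equally spaced azimuths are all assumptions made for the example, since the text only states that eight sun angles were used and the layers were summed.

```python
import math

def hillshade(dem, cell, azimuth_deg, altitude_deg=45.0):
    """Lambertian hillshade for interior cells; border cells are left at 0.

    dem is a list of rows (row 0 = north), cell is the pixel size in metres.
    The 45-degree sun altitude is an assumed default, not taken from the paper.
    """
    az, alt = math.radians(azimuth_deg), math.radians(altitude_deg)
    # Unit vector pointing towards the sun (x = east, y = north, z = up).
    lx = math.sin(az) * math.cos(alt)
    ly = math.cos(az) * math.cos(alt)
    lz = math.sin(alt)
    rows, cols = len(dem), len(dem[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dzdx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell)  # eastward gradient
            dzdy = (dem[r - 1][c] - dem[r + 1][c]) / (2 * cell)  # northward gradient
            norm = math.sqrt(dzdx ** 2 + dzdy ** 2 + 1.0)
            # Dot product of the unit surface normal with the sun vector.
            out[r][c] = max(0.0, (-dzdx * lx - dzdy * ly + lz) / norm)
    return out

def multidirectional_hillshade(dem, cell, n_directions=8):
    """Sum hillshade layers computed for equally spaced sun azimuths (assumed)."""
    azimuths = [i * 360.0 / n_directions for i in range(n_directions)]
    layers = [hillshade(dem, cell, az) for az in azimuths]
    rows, cols = len(dem), len(dem[0])
    return [[sum(layer[r][c] for layer in layers) for c in range(cols)]
            for r in range(rows)]
```

On a slope that descends towards the east, the layer lit from azimuth 90° is brighter than the layer lit from 270°; summing the layers removes this dependence on a single illumination direction.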

**Figure 3.** Interrelationships of the used variables. The subscripts of the variable "hillshade" indicate layers illuminated from particular sun directions.

In contrast to PBA, OBIA takes advantage of various geometrical variables (compactness, rectangularity, etc.). These variables were applied in the OBIA approach. In addition to the topographical variables calculated from the DEM, the geology, normalized difference vegetation index (NDVI), and proximity to roads, lake, and streams were also implemented in the classification. A large amount of previous research [11,12] has utilized the NDVI for landslide identification. NDVI application is only effective in the detection of recent catastrophic events that generate explicit and visible land cover changes (before and after the event), including loss of vegetation, the presence of fresh soil, and the exposure of debris. Nevertheless, we also utilized NDVI as an additional layer for better segmentation and possible landslide boundary extraction. Landslide boundaries usually appear along various land use classes (rivers, streams, forest boundaries, etc.). Additionally, the NDVI layer was used for the Rożnów Lake extraction (NDVI < 0), which afterwards allowed for the extraction of the lake proximity variable.
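The NDVI thresholding used for the lake extraction can be illustrated with a short sketch. The function names and the reflectance values in the usage note are hypothetical; the choice of the near-infrared and red bands (B8 and B4 for Sentinel-2) is an assumption, because the text does not name the bands used.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    # Guard against division by zero for pixels with no signal.
    return 0.0 if total == 0 else (nir - red) / total

def water_mask(nir_band, red_band, threshold=0.0):
    """Flag pixels whose NDVI falls below the threshold (NDVI < 0 for the lake)."""
    return [[ndvi(n, r) < threshold for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]
```

For example, a vegetated pixel with a near-infrared reflectance of 0.5 and a red reflectance of 0.1 gives an NDVI of about 0.67 and is not flagged, while a water pixel with values of 0.02 and 0.05 gives about -0.43 and is flagged. A distance raster computed from the resulting mask would then provide the lake proximity variable.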

Geology is one of the most important factors influencing landslide occurrence and has been applied in many studies [30,88–96]. The proximity to roads and water reservoirs (lake/streams) is also reported as a critical factor that influences landslide occurrence [89–92,94–96]. Information on the settings, methods, and software used is specified in Table 2, while the interrelationships of the data used and the extraction of various landslide classification variables are shown in Figure 3. Table 2 also provides the literature sources where particular variables are explained in detail.


**Table 2.** Variables used for landslide detection with the setting, software, and methods utilized to calculate them.

#### *4.2. Pixel-Based and Object-Based Classification*

An overview of the implemented PBA and OBIA classification is depicted in Figure 4. In the pixel-based approach, all used features are treated as a raster that is co-registered and resampled into the common resolution of 2 m. This makes a per-pixel analysis computationally effective [12]. As can be seen in Table 2, PBA, unlike OBIA, does not consider geometrical and contextual information [12,19,33,68]. In object-based classification, also known as geographic object-based image analysis (GEOBIA) or an object-oriented approach (OOA), the study area is segmented into groups of meaningful homogeneous objects [12,68]. This approach assumes that the neighboring pixels likely belong to the same class or object. A segmented object in OBIA can then be classified using spectral, geometric, textural, or spatial variables and relationships. Based on [7,8,10–13,15,68], landslides are better represented by heterogeneous objects (collection of pixels) rather than single pixels. The first and the most important step in OBIA is the segmentation of the study area into objects that are candidates for landslides. Methods like multiresolution image segmentation and simple linear iterative clustering are predominantly used for the segmentation of objects. Some segmentation algorithms require scale parameters that influence the shapes and sizes of the resulting objects. These scale parameters vary depending on the applied segmentation algorithm. Moreover, the selection of appropriate scale parameters is not a straightforward task. In addition, landslides have a multiscale character. This means that in the real world, landslides come in a wide range of shapes and sizes; thus, tuning the segmentation scale is challenging.
For this reason, various algorithms for scale tuning have been proposed in the literature (e.g., plateau objective function), which combined with expert knowledge allows for more effective landslide detection; however, the problem remains when landslides with significantly different sizes exist [11,28]. Because our research goal was to investigate the influence of the ratio between the training area and testing area, we utilized a trial-and-error procedure, which is also applied in the literature, to estimate the scale value [100]. We set each of the shape and compactness parameters equal to 0.5. This segmentation was performed for all extracted variables using multiresolution segmentation in eCognition.

**Figure 4.** Steps performed in the pixel-based approach (PBA) and object-based image analysis (OBIA) classification.

#### *4.3. Random Forest Classifier and Variable Importance*

For the numerical investigations within this research, a mature ML classifier, random forest, was used in both approaches (PBA and OBIA). This is a nonparametric classifier developed by Breiman [101]. The RF classifier allows reliable classification results to be achieved using predictions derived from an ensemble of decision trees [101]. This is a crucial advantage that allows for a dimensionality reduction of the RS data. In the literature, this classifier is widely applied to various RS applications, such as land cover mapping [102,103] and landslide detection [12,19,20]. Detailed information on RF classification can be found in [101,104]. Moreover, this classifier can be effectively applied to select and rank variables with the greatest ability to discriminate between the target classes based on the impurity function of the Gini index, which is known as the mean decrease in impurity or Gini importance [101,104,105]. In Section 5.2, we provide an evaluation of the variable importance within PBA and OBIA classification. Additionally, a large feature set can cause problems, such as: (1) a long time needed to train the algorithm, (2) a long time and many resources needed to generate the variables, and (3) overfitting when too many irrelevant features are utilized [19]. Hence, the feature relevance assessment is a highly important aspect of the classification task. Nevertheless, we did not reduce our input variables because doing so did not decrease the classification time. However, if a larger study area is analyzed and more input layers are used, this variable reduction would be beneficial. During the training process, cross-validation was performed (Figure 4). This means that 10% of the training samples were removed from the training dataset and used to perform cross-validation. This allowed us to evaluate the accuracy of the predictive model and to fine-tune it. After the training process, formal classification was performed for the whole investigated area using various training and testing strategies, which are presented in the next subsection. The RF classification for both approaches (PBA/OBIA) was performed for 500 trees, with the tree depth equal to 30.
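The quantity behind the Gini importance can be illustrated for a single split. This is a simplified sketch with hypothetical function names and toy values, not the RF implementation used in the study: a forest accumulates such decreases, weighted by the share of samples reaching each node, over every split that uses a given variable and averages them across trees.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting one variable at a threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children
```

With labels `[1, 1, 0, 0]` (landslide = 1) and a toy roughness variable `[0.9, 0.8, 0.2, 0.1]`, a split at 0.5 separates the classes perfectly and yields a decrease of 0.5, whereas an uninformative variable such as `[0.9, 0.1, 0.8, 0.2]` yields a decrease of 0 at the same threshold.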

#### *4.4. Training and Testing Strategies*

Selection of the training samples is a critical step in supervised classification and the focal point of our study. According to [36,106], the training sample size has a larger impact on the classification accuracy than the algorithm itself. This conclusion was made based on an evaluation of the impact of training data size on various classifiers in land cover mapping. This issue is especially important in deep learning methods, where a large amount of well-labelled training samples is needed to prevent the classifier from overfitting [107]. Using various training sample sizes, Huang et al. [24] achieved an OA between 69% and 75%. However, OA is not the best estimator of the classification results, particularly for imbalanced classes. Nevertheless, this result shows that the training sample quantity influences the accuracy of the classification. Therefore, the acquisition of ground truth samples is a key factor when planning feature detection based on supervised classification methods. In addition to the quantity of training samples, the strategy for training sample selection is also important. Based on the literature, there are generally two sample selection strategies: manual and random sampling design [102,103,108]. Random sampling design is based on the identification and labeling of small random patches of homogeneous pixels/objects in an image [108]. Chen et al. [19] reported that random sampling design introduces the effect of spatial autocorrelation, which affects the classification accuracy. In manual sampling design, the study area is split into two datasets (training and testing) based on administrative or environmental boundaries. Training samples are spatially compact with no autocorrelation effect, unlike random sampling design. Manual sampling design is thus more reasonable from a practical point of view. Collecting landslide ground truth data is time-consuming. 
Consequently, these ground truths come from landslide inventory maps generated for specific regions. Landslide inventories are usually performed systematically on a part-by-part basis. Therefore, areas that have already been investigated and mapped can be used for training the algorithm and predicting landslide locations in areas where landslide inventory maps have not yet been generated (especially in poorly accessible areas).

We used a manual sampling design in our study and utilized landslide polygons and corresponding landslide pixels delivered from the SOPO inventory to train the OBIA and PBA variants of the classifier, respectively. However, we implemented the training–testing split ratio (TTR) (compare Figure 4) according to two different strategies. In the first strategy, training samples were selected in the center of the investigated area and covered 13% of the entire study area. The remaining portion of the study area consisted of six testing areas (TAs), which were split using the region growing approach (Figure 5a). These six TAs (Table 3) were used for region growing testing. For instance, this means that the training quantity for TA 1 covers 50% of the total investigated area and for TA 6 covers 13% of the total investigated area (Figure 5a). This corresponds to a training–testing split ratio of 1 and 0.15 for TA 1 and TA 6, respectively. In the second strategy (natural boundary splitting design), the study area was split into a training and a testing area along the boundary designated by Rożnów Lake and the Dunajec River (Figure 5b). In this variant, the training area covers 54% of the entire study area. The quantitative values of the training samples in various testing strategies are presented in Tables 3 and 4 for the region growing and the natural boundary splitting designs.
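The relation between the training share and the TTR can be checked with a short calculation. The sketch below assumes that the TTR is the training area divided by the testing area and that the two together make up the classified area; this definition is inferred from the values quoted in the text (a 50% share giving a ratio of 1 and a 13% share giving 0.15), and the function name is hypothetical.

```python
def train_test_ratio(training_share):
    """TTR for a training share expressed as a fraction of the classified area."""
    if not 0.0 < training_share < 1.0:
        raise ValueError("training_share must lie strictly between 0 and 1")
    # Assumed definition: training area divided by the remaining (testing) area.
    return training_share / (1.0 - training_share)
```

Under this assumption, shares of 0.50, 0.13, and 0.54 give ratios of 1.0, about 0.15, and about 1.2, respectively.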

**Figure 5.** Various training and testing strategies used: (**a**) region growing testing design with various testing areas abbreviated as TA 1–6 and (**b**) natural boundary splitting design (the right green area used for training and red left area used for testing).

**Table 3.** Overview of the applied training sample size in the region growing testing design. TSQ—training samples quantity of the entire study area; TTR—training-testing split ratio; LTSQ—landslide training samples quantity of the entire classified area; and NLTSQ—non-landslide training samples quantity of the entire classified area.



**Table 4.** Overview of the used training sample size in the natural boundary splitting design. For the explanation of the abbreviations, see Table 3.

#### *4.5. Classification Accuracy Parameters*

To directly compare the accuracy of OBIA and PBA for various testing areas, we carried out an accuracy assessment at the pixel level. For this process, landslide shapefiles from the landslide inventory database were rasterized with a 2-m resolution and overlaid with the achieved PBA classification results. For OBIA, additional rasterization of the classification results was needed to overlay the OBIA results with the reference data acquired in the framework of the SOPO database. To compare the classification accuracy and evaluate the landslide detection skills of the tested variants, the confusion matrix for a particular variant was calculated. This matrix includes the true positive (TP), true negative (TN), false positive (FP), and false negative (FN) values. Based on these values, the overall accuracy (OA) of the model is calculated as follows:

$$\text{OA} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{FP} + \text{TN} + \text{FN}}. \tag{1}$$

When comparing the classification results, we followed the recommendation in [103] to investigate more than just the OA. The F1 score, probability of detection (POD), and probability of false detection (POFD) are additional measures for the accuracy parameter:

$$\text{F1 Score} = \frac{2 \times \text{recall} \times \text{precision}}{\text{recall} + \text{precision}} = \frac{2 \times \text{TP}}{2 \times \text{TP} + \text{FP} + \text{FN}}, \tag{2}$$

$$\text{POD (recall)} = \frac{\text{TP}}{\text{TP} + \text{FN}}, \tag{3}$$

$$\text{POFD} \,\,(\text{fallout}) = \frac{\text{FP}}{\text{FP} + \text{TN}}.\tag{4}$$

These parameters are especially important when imbalanced data are the subject of classification. POD provides a view of correctly classified landslide data, while POFD portrays how many non-landslide areas have been classified as landslides. The preferred value of the POD is 1, while that of POFD is zero. The F1 score defines the harmonic average of precision and recall. These additional accuracy assessment parameters are specifically important if the mapping focuses on small classes (i.e., classes of a limited extent in the image data). Small classes will have little influence on the OA, although they may be key in determining the usefulness of the classification [103]. This situation appears when dealing with landslide mapping. It is relatively rare for landslides to cover 50% of the whole study area; therefore, their influence on the OA is lower than that of the non-landslide class. For our study area, the landslide density is 14.7% of the whole study area (23.1 km<sup>2</sup> of landslide areas and 157 km<sup>2</sup> of non-landslide areas). Thus, the influence of the landslide class on the OA is small. It thus makes sense to also investigate the F1 score, POD, and POFD indexes to obtain a better overview of the classification accuracy and landslide detection skills.
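Equations (1)–(4) can be evaluated directly from the confusion matrix counts, as in the following sketch. The function name and the pixel counts used in the usage note are hypothetical and chosen only to mimic an imbalanced scene with roughly 15% landslide pixels; they are not results from this study.

```python
def accuracy_metrics(tp, tn, fp, fn):
    """Return OA, F1 score, POD, and POFD from confusion matrix counts."""
    def ratio(numerator, denominator):
        # Report 0.0 when a denominator is empty instead of raising an error.
        return numerator / denominator if denominator else 0.0

    return {
        "OA": ratio(tp + tn, tp + tn + fp + fn),   # Equation (1)
        "F1": ratio(2 * tp, 2 * tp + fp + fn),     # Equation (2)
        "POD": ratio(tp, tp + fn),                 # Equation (3)
        "POFD": ratio(fp, fp + tn),                # Equation (4)
    }
```

For hypothetical counts of TP = 130, TN = 590, FP = 260, and FN = 20, the function gives OA = 0.72, F1 ≈ 0.48, POD ≈ 0.87, and POFD ≈ 0.31. A classifier that labels every pixel as non-landslide on the same scene (TP = 0, TN = 850, FP = 0, FN = 150) reaches a higher OA of 0.85 while its POD and F1 score are both 0, which is why the OA alone is a weak indicator for small classes.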

#### *4.6. Post-Processing and Final Landslide Map Generation*

Because landslide extents detected by the OBIA approach are represented by multiple objects or multiple pixels, we performed a post-processing refinement of the results. We determined the most probable landslide extent by extracting the regions detected by PBA, as well as by OBIA. The intersection of both results provided the most likely landslide extent. Additionally, when observing the final results, we noticed many small and elongated objects that do not represent landslide areas. Since the minimum landslide size within our study area is equal to 537 m<sup>2</sup> (estimated based on the SOPO inventory), we removed objects smaller than 500 m<sup>2</sup> from the intersected results. A shape/perimeter index lower than 5 was another threshold parameter. Finally, filtering of the result using a median filter with a 6 × 6 window size was performed to fill the small holes within the landslide bodies.
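The first two refinement steps, the intersection of the PBA and OBIA results and the removal of small objects, can be sketched as follows. This is an illustration under stated assumptions rather than the processing chain actually used: the function names are hypothetical, objects are defined by 4-connectivity, and the shape index threshold and the median filter are omitted.

```python
from collections import deque

def intersect(mask_a, mask_b):
    """Keep only pixels detected as landslide in both binary masks."""
    return [[bool(a and b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]

def remove_small_objects(mask, min_area_m2, cell_size_m):
    """Drop connected objects (4-connectivity, assumed) smaller than min_area_m2."""
    rows, cols = len(mask), len(mask[0])
    min_pixels = min_area_m2 / (cell_size_m ** 2)
    seen = [[False] * cols for _ in range(rows)]
    out = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first search collects one connected object.
            component, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(component) >= min_pixels:
                for r, c in component:
                    out[r][c] = True
    return out
```

With 2-m pixels, the 500 m<sup>2</sup> threshold corresponds to 125 pixels, so a call such as `remove_small_objects(intersect(pba, obia), 500, 2)` keeps only objects of at least that size, where `pba` and `obia` stand for the two binary detection masks.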

#### **5. Results**

#### *5.1. Accuracy Assessment of Various Training and Testing Strategies*

The cross-validation rate achieved during RF training was higher than 0.98, which indicates that the models were correctly trained and that the data could be classified using RF. As previously mentioned, in the region growing sample design, we performed classification of the total study area using training samples located in the center of the study area. Then, an evaluation of the results was performed for the growing testing areas. Table 5 presents the accuracy assessment parameters (F1 score, POD, POFD, and OA) for particular testing areas. These results could be refined using post-processing to increase their accuracy. However, to directly compare PBA and OBIA and the influence of the TTR on the detection results, refinement was not performed at this stage. Analyzing the achieved results, we can notice that the F1 score of both approaches (PBA vs. OBIA) is comparable; however, OBIA performed slightly better (completeness and precision) (Figure 6). This can be especially observed for TA 3–6. Notably, the same algorithm and parameters were utilized in both classification approaches. The classification accuracy changes only under various training sample sizes. This important finding is shown in Table 5. The landslide detection skill decreases progressively as the contribution of training samples to the total study area decreases. Notably, the OA does not change significantly, whereas the other accuracy parameters, such as the F1 score and POD, decrease consistently. This is presented more clearly in Figure 6 for the F1 score. The region growing testing shows that the landslide detection skill decreases as the TTR decreases. The most substantial decrease is observed when the training sample quantity decreases from 50% to 26%; beyond this point, only a small decrease in accuracy is observed. This illustrates the previously discussed issue that arises when the OA index is used alone to evaluate landslide detection accuracy.


**Table 5.** Accuracy assessment for training and testing strategy 1 (region growing testing). Testing areas 1–6 abbreviated as TA 1–6.

**Figure 6.** Decrease of the F1 score value for the region growing testing areas (compare with Figure 5).

Based on the achieved results, it can be concluded that the best result was achieved with 50% of the training samples (F1 Score = 0.58). These results are limited, likely due to the imbalance in landslide and non-landslides classes. In the tested scenario, only 10% were landslide areas, while 40% were non-landslide areas.

To verify if accuracy above 70% and an F1 score at the level of 0.5 can truly be achieved when the training and testing areas are similarly large, we performed training and testing according to the natural boundary splitting design. The study area was split along the natural Rożnów Lake and Dunajec River boundary. Although, in this variant, the testing and training areas were similarly large (TTR = 1.2), we achieved slightly worse results. For instance, the OBIA approach provided an OA, F1 score, POD, and POFD equal to 72%, 0.48, 0.87, and 0.30, respectively (for comparison, for TTR = 1, the OA, F1 score, POD, and POFD were equal to 73%, 0.58, 0.88, and 0.31, respectively, in the region growing strategy). Table 6 presents the accuracy parameters for both classification approaches, while Figure 7 shows a graphical representation of the classification results.

**Figure 7.** Training and testing areas superimposed on the random forest (RF) classification results for OBIA (**a**) and PBA (**b**) according to strategy 2.



#### *5.2. Feature Relevance*

The feature relevance for classification can be assessed as an output from the RF algorithm training according to the Gini impurity reduction [101]. Figure 8 presents the variable importance for both classification approaches. As can be observed in both the PBA and OBIA classification approaches, the geology, DEM, and roughness are the most important variables. The geometric variables were found to be relatively unimportant in landslide detection using OBIA. The reason for this could be the segmentation settings, which lead to objects that did not result in significant geometric values. In this scenario, over-segmentation results in many small objects that represent the extent of one landslide body. Thus, the geometric parameters of segmented objects do not correspond directly to the landslide extent and therefore landslide object geometric parameters. Thus, to extract the boundary of one landslide body, several segmented objects of this landslide should be merged together.

**Figure 8.** Variable importance assessed based on the random forest algorithm for pixel-based (**a**) and object-based approaches (**b**).
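As an illustration of the quantity behind this ranking, the sketch below computes the Gini impurity decrease produced by a single threshold split. In common RF implementations, the importance of a variable is obtained by accumulating such decreases over all splits on that variable across all trees; this sketch shows one split only. The sample values, the variable names `roughness` and `compactness`, and the threshold are hypothetical and serve only to contrast an informative variable with an uninformative one.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting samples at values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted

# Hypothetical toy samples: 1 = landslide, 0 = non-landslide.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
roughness = [0.9, 0.8, 0.7, 0.2, 0.3, 0.1, 0.2, 0.4]    # separates the classes
compactness = [0.5, 0.1, 0.9, 0.5, 0.2, 0.8, 0.1, 0.9]  # unrelated to the classes

gain_roughness = gini_decrease(roughness, labels, threshold=0.5)
gain_compactness = gini_decrease(compactness, labels, threshold=0.5)
```

In this toy case, the split on the hypothetical roughness values yields two pure groups and therefore removes all of the initial impurity, while the split on the geometric variable removes almost none. This is the kind of contrast a low importance score expresses.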

#### *5.3. Final Landslide Map Generation*

The common elements of the OBIA and PBA results from the second strategy (natural boundary splitting design) were then refined. This refinement was performed via the post-processing step described in Section 4.6. The final map of the detected landslides with TP, TN, and FN is presented in Figure 9. The accuracy indexes after the subsequent post-processing steps are presented in Table 7. Based on these, we can observe that the intersection of the PBA and OBIA approaches, in addition to the removal of small and elongated objects, helped to decrease the POFD index, which means that over-classification or over-mapping (a high false positive rate) was reduced. Median filtering slightly increased the POFD but also increased the POD, which is desirable. On the other hand, these post-processing steps also decreased the POD, which reflects the probability of correctly detected landslides. However, the OA and F1 score increased with each successive post-processing step. The F1 score reflects the classification performance by taking into account true positives (correctly classified landslides), as well as false positives (wrongly detected landslides) and false negatives (missed landslides).
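Two of these refinement steps, the intersection of the PBA and OBIA results and median filtering, can be sketched on small binary grids as follows. This is a minimal illustration and not the implementation used in this study: the 3 × 3 window, the treatment of border cells, and the toy masks are assumptions, and the removal of small and elongated objects is omitted.

```python
from statistics import median

def intersect(mask_a, mask_b):
    """Keep only cells flagged as landslide (1) in both binary masks."""
    return [[a & b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]

def median_filter3(mask):
    """3x3 median filter; border cells use only neighbours inside the grid."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [mask[i][j]
                      for i in range(max(0, r - 1), min(rows, r + 2))
                      for j in range(max(0, c - 1), min(cols, c + 2))]
            # A tie in an even-sized border window (median 0.5) resolves to 0.
            out[r][c] = int(median(window) > 0.5)
    return out

# Hypothetical toy masks: 1 = landslide, 0 = non-landslide.
pba = [[1, 1, 1, 0],
       [1, 1, 1, 0],
       [1, 1, 1, 1],
       [0, 0, 0, 1]]
obia = [[1, 1, 1, 0],
        [1, 0, 1, 0],
        [1, 1, 1, 0],
        [0, 0, 0, 1]]

common = intersect(pba, obia)
refined = median_filter3(common)
```

In this toy example, the intersection drops the cell detected by PBA only, and the median filter fills the one-cell hole inside the larger object while removing the isolated cell in the corner. Filling holes and removing isolated cells is one way in which such filtering can change both the POD and the POFD.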

**Figure 9.** Training and testing area superimposed on the results of the PBA and OBIA joint approach for landslide detection.


**Table 7.** Accuracy assessment for training and testing strategy 2 with the accuracies acquired after post-processing steps (Rożnów Lake splitting).

#### **6. Discussion**

#### *6.1. Landslide Classification Accuracy with Respect to the Training Samples*

Based on the results presented in Section 4.1, it can be observed that the F1 score and OA with respect to the TTR decrease proportionally to the decreasing contribution of training samples in the whole investigated study area. However, when comparing the OBIA and PBA results, OBIA performed better for testing areas 3–6 when the TTR decreased. Thus, it can be concluded that OBIA performs better than PBA when the quantity of training samples is smaller. Additionally, based on the region growing testing design, it can be assessed that to achieve an F1 score at the level of 0.5, the training area should be as large as the testing area. Therefore, this should be considered when performing landslide detection using supervised classification.

Additionally, when comparing the results from the region growing design and the natural boundary splitting design, it can be seen that the landslide detection skills are lower in the second strategy. The term "landslide detection skills" refers to how well the algorithm detected both classes: landslide and non-landslide areas. This can be represented by the F1 score. Comparing results from the first and second strategy where a TTR around 1 (training and testing area are similarly large) was applied, higher landslide detection skills of the second strategy should be expected. The explanation for this could be a smaller landslide class or landslide sample contribution in the training samples. In the region growing design for similarly large areas for training and testing, landslide samples covered an area of 10.7 km<sup>2</sup>, while in the natural splitting design, they covered 8.3 km<sup>2</sup>. Another issue could be related to the landslide morphology, because in both strategies, various landslides were used for classification. The terrain roughness, value of curvature, and other variables can differ between landslides.

Furthermore, study area conditions could be the reason for the slightly lower landslide detection skills in the second strategy. These could be geological changes, various elements of landslide training samples due to the many types of land use (agricultural vs. forests), etc. Specifically, this relates to how the classification accuracy changes under various geological and environmental conditions, also taking into account the local morphometry of a particular landslide (e.g., training the algorithm on a study area in a hilly or mountainous terrain covered by forest and evaluating it using a study area that is extensively cultivated, and vice versa). In addition to this aspect, the training sample number and training sample size are also noteworthy aspects to investigate. In this research, we investigated the training–testing split ratio. However, the training sample number can also influence the classification results. Thus, the topic of selecting training samples is not exhausted, and various aspects that were not covered in this paper should be investigated in future works.

It is worth mentioning that the achieved accuracy of landslide detection from the natural splitting design is affected by the different characteristics of the training and testing areas. More specifically, landslides located on one side of Rożnów Lake (training area) can have different characteristics than those located on the other side of the lake (testing area). In a perfect scenario, landslides used for training that are randomly and evenly distributed across the investigated study area can better capture the variety of characteristics and support more effective landslide detection. However, as mentioned before, the collection of ground truth data across the study area is very challenging and time-consuming from a practical point of view. Usually, such ground truth data (a landslide inventory) are generated on a part-by-part basis. In Poland, landslide inventories are compiled commune by commune. In the case of the study area, Rożnów Lake and the Dunajec River also form the border between two communes, namely Łososina Dolna (testing area) and Gródek nad Dunajcem (training area). The landslide inventory for Łososina Dolna was created in 2011, while that for Gródek nad Dunajcem was created around 2015. Thus, from a practical point of view, it is desirable to utilize an existing landslide inventory for training and then detect landslides in an area where such an inventory is not available.

#### *6.2. Comparison with Other Related Studies*

Our final results for automatic landslide detection were achieved via integration of the PBA and OBIA results and the post-processing refinement described in Section 4.6. Considering the classification accuracy measures (Section 4.5), we achieved moderate agreement with the ground truth data (F1 score = 0.5). Thus, there is still room for improvement in automatic landslide mapping. Despite the differences in data, approaches, and classification accuracy measures, we attempted to compare our results to those of other studies related to landslide detection based on DEM. However, it should be mentioned that direct comparison is not possible because the studies use different study areas and, in some cases, different accuracy measures. Nevertheless, we made this comparison to relate our study to existing work in the literature and to summarize our achievements and limitations. To compare these results with those in [13] and those in our previous studies [14,15], we additionally calculated the Kappa index, which is also a frequently used classification accuracy measure in the RS community. Some scientists have discussed the limitations of the Kappa index in accuracy evaluation [109,110]. However, since some papers present only Kappa and OA and/or recall [13–15], we decided not to omit the Kappa index, given the limited number of reported accuracy measures that can be used for comparison.
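For reference, the Kappa index (Cohen's kappa) for a binary classification can be computed from the same confusion-matrix counts as the other measures. The following minimal sketch uses hypothetical counts that are not taken from this study; the function name is likewise an assumption.

```python
def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement between map and reference beyond chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predicted and reference classes.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    if expected == 1.0:
        return 0.0  # degenerate case: only one class present in both maps
    return (observed - expected) / (1.0 - expected)

# Hypothetical counts: 1000 cells, 200 landslide and 800 non-landslide.
kappa = cohen_kappa(tp=176, fp=248, fn=24, tn=552)
```

For these illustrative counts, the observed agreement is 0.728 and the chance agreement is about 0.546, which gives a Kappa of about 0.40. Because the chance agreement depends on the class proportions, Kappa values from study areas with different landslide densities should be compared with caution.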

Comparing the accuracy measures of the other studies presented in Table 8, the results achieved in this study are consistent with those of previous studies using similar methods [12,14,15,19,20], especially ML-based or deep learning classification methods [12] (compare Table 8). Nevertheless, these accuracy indexes still show only moderate landslide detection skills. The authors in [12] achieved a smaller POFD, which indicates a smaller number of false positives when using similar OBIA and ML classification approaches. This is probably due to the specificity of the study area (Oregon, USA). From Google Earth satellite images, it is apparent that the majority of Oregon is covered by dense forest. There is only one city (Elkton), one main road (No. 38), and a lack of agricultural areas. Thus, the explanation for the higher false positive rate could be the forest coverage, which maintains the characteristic landslide topography. Additionally, when comparing our results with the work in [20] conducted on the Three Gorges in China, similar results were obtained. When comparing OA with the work of [17], it can be stated that a similar accuracy was achieved; however, the POD (recall) presented in [17] was significantly smaller, especially for the study area of Dixie Mountain. Based on the results in Table 8, landslide detection in Dixie Mountain was the worst. This proved again that accuracy measures other than OA are needed (e.g., F1 score, POD, POFD, K) to reliably compare classification results between various areas with different landslide densities and different conditions. Additionally, the authors in [17] observed that the scarp identification and contour connection method (SICCM) mapped various study areas differently, either under-mapping or over-mapping in various study areas. Thus, there is no clear indication that SICCM mapped one study area better than another.

Based on the Kappa and POD, slightly better results were provided by Knevels [13]. The reason for the higher Kappa value could be the methodology applied in this study. Specifically, landslide detection in the area of Oberpullendorf in Austria was performed by OBIA and support vector machine (SVM) classification, but the authors in [13] integrated geomorphological mapping with the OBIA approach. Specifically, the authors focused first on landslide scarp detection and then on the detection of neighboring landslide bodies. The relationships between these features likely increase the algorithm's detection skills. This strategy would be beneficial for landslide detection but is computationally more demanding and needs additional parameters to be tuned.


**Table 8.** Comparison with other works and accuracy assessment indexes.

Additionally, by comparing the accuracy measures with the works of Pawluszek et al. [14,15], it can be concluded that previous approaches offer better detection skills. However, Pawluszek et al. [14,15] utilized a significantly smaller study area (26.3 km<sup>2</sup> compared to the 157 km<sup>2</sup> analyzed in this paper). An additional issue is the training sample designs that they utilized in their previous investigations. In [14], they utilized stratified random sampling to train the algorithm, while in [15], the authors manually selected random samples across the image. The authors of [19,111] also reported that random samples taken across an area affected by landslides are more beneficial for landslide detection rather than small coherent clusters used as training samples. However, this is the effect of a spatial auto-correlation, which contributed to the final accuracy of landslide detection. There is no clear indication whether this is an advantage or drawback of these specific training sample designs. Nevertheless, from a practical point of view, when the landslide ground truth data need to be collected during field investigations, a manual sample design is more pragmatic than a random sampling design. This is mostly due to the time needed to collect samples across the investigated area during field work. Therefore, based on the observations of our current and previous studies, the selection of the training samples is a significant aspect that influences the final results and should be undeniably considered when planning ground truth data collection during field work.

#### *6.3. Opportunities and Limitations of the Presented Approach*

A detailed analysis of the final landslide detection map reveals some opportunities and limitations of the proposed approach. A section of the landslide detection map is presented in Figure 10a,b as an example of these limitations and opportunities. For a better understanding of the classification performance, the classification results were superimposed on the hillshade map (Figure 10a,b). We selected these specific parts of the map to discuss various issues affecting landslide detection. In Figure 10a, the presence of many false positive results can be observed. This means that the landslides were over-mapped. In Figure 10b, we can observe very appropriate landslide mapping in the middle part of the landslide in the areas with clear and fresh topographical characteristics (rough surface, etc.). In the upper part of the landslide (Figure 10b), we can observe false negatives, which means that this area was not properly mapped by the algorithm as a landslide but rather as a non-landslide area. This is due to the smoothed morphology, which is changed by agricultural treatments. However, in the lower part of the landslide, where the topography is again smoother, we can observe significantly more false positives, similar to the situation in Figure 10a. To explain this issue, we investigated our training samples used for training the RF algorithm.

**Figure 10.** Landslide prediction in the testing area (**a**) with a large false positive rate, (**b**) with small false positive rate and (**c**–**f**) examples of landslides used for training the RF algorithm superimposed with the hillshade map.

Figure 10c–f presents the landslides located in the training area in the second classification strategy. These landslides were mapped by geologists in the field and are included in the official national landslide database. The morphology of the landslides used for training (Figure 10c–f) clearly suggests that the characteristic landslide features were smoothed and altered. The reasons for this are probably denudation and/or agricultural treatments. In such cases, it is highly challenging to evaluate if a false positive is truly a false positive or if it is also a landslide body where the typical landslide morphology has been smoothed. Based on visual interpretation, some areas with rough terrain have been correctly classified by the algorithm and clearly show the landslide extent (green color). Additionally, here we can observe the problem of landslide feature visibility, which makes OBIA integration with geomorphological mapping (division into some characteristic landslide parts) more challenging or impossible. The problems are mostly connected with an appropriate landslide scarp definition, because in Figure 10c–f, these characteristic landslide features are invisible. Having considered these aspects, it is our opinion that to minimize landslide over-mapping (reflected by a high POFD index), altered and smoothed landslides should be removed from the training process. This will probably help in more effective landslide boundary extraction and will minimize the false positive rate. Additionally, another aspect is related to the quality of the reference data because the delineation of landslide polygons can be too sparse and generalized. This would influence the accuracy of landslide detection [106,112]. Therefore, the quality of landslide shapefiles located within the training site should be investigated and discussed in future works.

#### **7. Conclusions**

Training samples are essential for supervised classification. In the case of automatic landslide mapping, it is especially important to determine how much representative data is required to achieve the specified level of accuracy in landslide detection based on supervised classification methods.

The region growing testing performed in this study shows that landslide detection skills decrease proportionally to the decrease in the training–testing area ratio. However, a substantial decrease is observed when the training sample quantity decreases from 50% to 26%. The application of region growing testing led us to assume that the training area should be as large as the testing area. To verify this assumption, training sample selection according to the natural splitting design, in which the training area covers almost half of the entire study area, was used as the second strategy. In this strategy, the OA and F1 score were 72% and 0.42, respectively, which supported our assumption that the appropriate ratio of the training–testing area would be around 1. The slightly lower landslide detection skills compared to the region growing design (an OA and F1 score of 73% and 0.58, respectively) can be related to other aspects of training sample selection (training sample number, quality of landslide inventory, etc.) or to the environmental conditions of the study area, which should also be investigated in future works.

In addition to the training–testing ratio, which was the main focus of this study, the final landslide detection map was also generated by the intersection of the OBIA and PBA approaches and refinement of the results. Refinement included median filtering and the removal of small elongated objects, which allowed us to remove false positives from the final results. However, we inferred that the smoothed and vanished morphology of the landslides used for training and/or the quality of the landslide inventory have a direct influence on the rate of false positives. Nevertheless, the achieved results (OA = 80% and F1 score = 0.5) are consistent with those presented in the literature.

The RF algorithm also allowed us to identify the most relevant variables for landslide detection. In both cases (PBA and OBIA), the geology and terrain roughness were the most important variables and should undeniably be used in landslide detection. Furthermore, geometry-related variables were insignificant in the OBIA approach, probably due to the over-segmentation strategy used for the OBIA classification in this study (see Section 5.2).

In summary, this study, supported by the comprehensive literature review, allows us to draw a few conclusions for further research on landslide detection approaches. Firstly, from a practical point of view, manual sampling design should be selected to evaluate the landslide detection skills of algorithms based on supervised classifications. Secondly, the OA measure alone should not be used to evaluate the classification results, especially for imbalanced classes. Further, the train–test ratio should be around 1 (the training area should be as large as the testing area). The quality of the landslide ground truth sample is also an important issue. Additionally, old and denuded landslides whose characteristic topography is not visible in the terrain's morphology should be removed from the training samples. Moreover, the environmental conditions of various study areas and their influence on landslide detection skills should be tested in the future to assess the transferability of the algorithms. Finally, the landslide phenomenon, due to its complexity, is highly challenging to detect; thus, the integration of the OBIA approach with geomorphological mapping, also taking into account morphometry, would be preferable.

**Author Contributions:** Conceptualization, K.P.-F.; experiments, K.P.-F.; data capture, K.P.-F. and A.B.; formal analysis, K.P.-F.; validation, K.P.-F.; writing—original draft preparation, K.P.-F.; writing—review and editing, K.P.-F. and A.B.; supervision, A.B.; funding acquisition, K.P.-F. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the National Science Centre of Poland (Grant No. 2018/28/T/ST10/00528).

**Acknowledgments:** This work was carried out by Kamila Pawłuszek-Filipiak during a research internship at the Technical University in Dresden (No. 2018/28/T/ST10/00528) financed by National Science Centre of Poland. The authors are very grateful to Tomasz Wojciechowski, Zbigniew Perski, and Piotr Niescieruk for providing geological information and documentation on the study area. The authors are also grateful to Manfred Buchroithner for his valuable discussions connected with the remote sensing data classification issues and three anonymous reviewers for helping improve this manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

**Figure A1.** Corine Land Cover map of the study area superimposed on the slope image (**a**) and the Normalized Difference Vegetation Index (NDVI) (**b**).

*Remote Sens.* **2020**, *12*, 3054

#### **Appendix B**

**Figure A2.** Geological map of the study area.




#### **Table A1.** *Cont*.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

# *Letter* **Mapping, Monitoring, and Prediction of Floods Due to Ice Jam and Snowmelt with Operational Weather Satellites**

**Mitchell D. Goldberg 1 , Sanmei Li 2 , Daniel T. Lindsey 3 , William Sjoberg 1 , Lihang Zhou 1 and Donglian Sun 2, \***


Received: 7 May 2020; Accepted: 3 June 2020; Published: 9 June 2020

**Abstract:** Among all the natural hazards throughout the world, floods occur most frequently. They occur in high latitude regions, such as: 82% of the area of North America; most of Russia; Norway, Finland, and Sweden in North Europe; China and Japan in Asia. River flooding due to ice jams may happen during the spring breakup season. The Northeast and North Central region, and some areas of the western United States, are especially harmed by floods due to ice jams and snowmelt. In this study, observations from operational satellites are used to map and monitor floods due to ice jams and snowmelt. For a coarse-to-moderate resolution sensor on board the operational satellites, like the Visible Infrared Imaging Radiometer Suite (VIIRS) on board the National Polar-orbiting Partnership (NPP) and the Joint Polar Satellite System (JPSS) series, and the Advanced Baseline Imager (ABI) on board the GOES-R series, a pixel is usually composed of a mix of water and land. Water fraction can provide more information and can be estimated through mixed-pixel decomposition. The flood map can be derived from the water fraction difference after and before flooding. In high latitude areas, while conventional observations are usually sparse, multiple observations can be available from polar-orbiting satellites during a single day, and river forecasters can observe ice movement, snowmelt status and flood water evolution from satellite-based flood maps, which is very helpful in ice jam determination and flood prediction. The high temporal resolution of geostationary satellite imagery, like that of the ABI, can provide the greatest extent of flood signals, and multi-day composite flood products from higher spatial resolution imagery, such as VIIRS, can pinpoint areas of interest to uncover more details. 
One unique feature of our JPSS and GOES-R flood products is that they include not only the normal flood type but also a special flood type, the supra-snow/ice flood, as well as snow and ice masks. Following the demonstrations in this study, it is expected that the JPSS and GOES-R flood products, with ice and snow information, can allow dynamic monitoring and prediction of floods due to ice jams and snowmelt for a wide range of end users.

**Keywords:** ice jam; snowmelt; flood mapping; monitoring and prediction; VIIRS; ABI

#### **1. Introduction**

Floods are the most frequent natural hazard throughout the world. The regions where river flooding due to ice jams may happen during the spring breakup season include 82% of the area of North America, including the whole of Canada and 52% of the United States; most of Russia; Norway, Finland, and Sweden in Northern Europe; China and Japan in Asia [1]; and other morphological areas, such as alpine valleys [2]. In the United States, floods cause the greatest loss of life and economic damage among all severe weather events [3]. Floods caused by snow/ice melting occur almost every year in the United States; for example, severe floods occurred along the Red River in April 2020, spring 2014, April 2013, March 2010 and April 2006. The Northeast and North Central United States are especially affected by floods due to ice jams and snowmelt. Although most flood events caused by ice jams and snowmelt are relatively minor and only affect local areas, a considerable number of significant floods related to ice jams and snowmelt have caused severe property damage and deaths.

The National Weather Service can issue routine river flood outlooks and warnings in the United States, but there is currently no widespread way to determine flood extent over land resulting from snowmelt and ice jams. Due to the complexity of river ice processes and thermal rises, and the relative predisposition to melts or instability effects in mountain areas, modeling floods due to ice jams and snowmelt is more complicated than modeling open-water floods [4–7]. Although numerical models have been developed for ice floods in several rivers all over the world [8–11], these were mainly designed for simulation, and rarely for prediction [4,11]. Yu et al. [12] indicate that the uncertainties in ice thermal and flow conditions inhibit the predictive capability of hydraulic/river ice models.

Satellite remote sensing provides a useful approach to detecting, determining and estimating the flood extent, as well as damage and impact over rivers and land bodies [13–15]. Operational weather satellites can provide ideal tools for flood detection, because of their large spatial coverage, frequent observations, low cost, and ease in distinguishing between water and land. During the daytime, flood maps can be derived from optical sensors onboard the operational weather satellites, such as visible (VIS), near-infrared (NIR) [13–15] and shortwave-infrared (SWIR) [16] observations under clear sky conditions. Due to their capacity to penetrate non-rainy clouds, microwave remote sensing instruments, including active airborne synthetic aperture radar (SAR) imagery [17] and passive microwave (MW) instruments [18–22], and especially SAR with high spatial resolution (10–30 m), if available, can provide invaluable flood information under almost all weather conditions. Nevertheless, SAR usually has a narrow swath and long revisit time (6–12 days) [17], while a flood is often a short-term event. Meanwhile, passive MW sensors usually have very coarse spatial resolutions (10–25 km) [18–22]. Optical sensors can reveal surface flood information more directly, with simpler preprocessing.

Remote sensing demonstrates great potential for monitoring river ice conditions [23]. Moderate Resolution Imaging Spectroradiometer (MODIS) and RADARSAT-2 were used to detect unbroken ice cover, monitor ice cover conditions and estimate ice volume [24,25]. The large spatial coverage and frequent observations of operational weather satellites, like the Suomi National Polar-Orbiting Partnership (S-NPP) and the Joint Polar Satellite System (JPSS) series, have unique advantages for flood monitoring. The S-NPP and JPSS constellation allows the Alaskan region to receive low latency data from 28 daily overpasses [26]. For high latitude regions, the revisit time is about 50–90 min, depending on latitudes and locations. Since in high latitude regions conventional observation is sparse, the capacity for multiple observations from operational polar-orbiting satellites during the daytime makes the Visible Infrared Imaging Radiometer Suite (VIIRS) onboard the S-NPP and JPSS series very attractive for flood monitoring. These floods can be tracked dynamically by VIIRS in near-real time, which can thus be used for early warning and loss assessment by users from river forecast centers.

Floods caused by snowmelt and ice jams occur almost every year in the United States, and notable scenarios include the floods that happened along the Yukon River and Koyukuk River in Alaska in May–June 2013 due to ice jams, and the significant flooding that occurred along the Red River, recently, in April 2020. In order to meet the needs from end users, in this study, the S-NPP VIIRS 375-m and GOES-R imager data are used to detect floods caused by ice jams and snowmelt. Here we demonstrate an application of our flood algorithm in ice jam flood monitoring, and an application for snowmelt flood detection.

#### **2. Data and Methods**

#### *2.1. Study Sites*

The Yukon River is the third longest river in North America and the longest river in Yukon and Alaska. The river originates in British Columbia, Canada, and flows west into Alaska in the United States. Ice jams and flooding are very common on the Yukon River when warming temperatures in spring melt the ice. In May 2013, a persistent ice jam on the Yukon River caused the river to overtop its banks and carried floodwater to the town of Galena in Alaska. No major ice jam flood has occurred since then. The location of the Yukon River is marked in Figure 1.

**Figure 1.** The location map of the study sites in the United States. The thick blue lines mark the major rivers in North America.

The Red River flows northward along the border of North Dakota and Minnesota in the United States and through Manitoba, Canada. The Red River passes through several cities, including Fargo and Grand Forks in the United States, and Manitoba's capital, Winnipeg, in Canada. Water draining northeast on a gentle slope was dammed by the south edge of the continental ice sheet. In spring, the Red River thaws first from the south in North Dakota, while still frozen farther north, causing widespread flooding. The location of the Red River is shown in Figure 1.

#### *2.2. Data Used*

To estimate the flooding caused by ice jam and snowmelt, S-NPP VIIRS, GOES-R Advanced Baseline Imager (ABI) and other types of ancillary data were used:


1. Calibrated VIIRS level 1b data at imagery channel 1 (red: 600–680 nm), channel 2 (near-infrared: 850–880 nm), channel 3 (shortwave infrared: 1610 nm), and thermal infrared channel 5 (10.5–12.4 µm) with 375-m spatial resolution.


#### *2.3. Methods*

#### 2.3.1. Flooding Water Detection

Because of different underlying surface conditions, there are two primary types of floods: the most common flood occurs over vegetation or bare land, referred to as supra-vegetation/bare land flood; another flood type mainly occurs on top of snow/ice surfaces, referred to as supra-snow/ice flood. These two types of floods show different spectral characteristics in optical sensor observations, in visible, near infrared, shortwave infrared and thermal infrared channels, and thus require different methodologies for flood detection using optical sensor data, like the VIIRS imagery.

The supra-snow/ice flood is a special flood type because the underlying layer is still covered with snow/ice. Because the reflectance of snow and ice is high, floodwater over a snow/ice surface reflects much more in the visible (VIS) and near infrared (NIR) channels than floodwater in normal supra-vegetation/bare land floods, while the reflectance in the visible channel (*R*<sub>Vis</sub>) is still higher than in the NIR channel (*R*<sub>NIR</sub>) [30–32]. The detection of supra-snow/ice floods also uses similar variables: *R*<sub>Vis</sub>, *R*<sub>NIR</sub> and the Normalized Difference Vegetation Index (NDVI). However, melting snow/ice surfaces and shadows cast on snow/ice surfaces share similar spectral features in these three variables, and thus may be confused with supra-snow/ice floodwater. We therefore introduce a new variable, DNDVI, defined as the difference in NDVI between a pixel and its snow/ice neighbors. As demonstrated in Li et al. [33], melting snow surfaces and shadows on snow have *R*<sub>Vis</sub>, *R*<sub>NIR</sub> and NDVI values similar to those of supra-snow/ice floodwater, but they can be separated from it using the DNDVI value.

For supra-snow/ice floods, the VIIRS snow/ice mask is applied before flood detection to determine snow/ice cover. A decision-tree technique is then used to distinguish supra-snow/ice floodwater from snow/ice cover and shadows based on three variables: the reflectance in the visible channel (*R*<sub>Vis</sub>), NDVI and DNDVI [28].
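The DNDVI variable described above can be illustrated with a minimal sketch. It assumes the standard definition NDVI = (R_NIR − R_Vis)/(R_NIR + R_Vis) and a square neighborhood window. The function names and the `half_window` parameter are hypothetical, since the window size and the decision-tree thresholds are not given in this section (see Li et al. [33] for the actual algorithm).

```python
def ndvi(r_vis, r_nir):
    """Normalized Difference Vegetation Index from VIS and NIR reflectance."""
    total = r_nir + r_vis
    return 0.0 if total == 0 else (r_nir - r_vis) / total


def dndvi(r_vis, r_nir, snow_ice_mask, row, col, half_window=2):
    """NDVI of pixel (row, col) minus the mean NDVI of its snow/ice neighbors.

    r_vis, r_nir: 2-D lists of reflectance; snow_ice_mask: 2-D list of bools.
    half_window is a hypothetical parameter (not specified in the text).
    Returns None when the window contains no snow/ice neighbor.
    """
    n_rows, n_cols = len(r_vis), len(r_vis[0])
    neighbor_ndvi = []
    for i in range(max(0, row - half_window), min(n_rows, row + half_window + 1)):
        for j in range(max(0, col - half_window), min(n_cols, col + half_window + 1)):
            if (i, j) != (row, col) and snow_ice_mask[i][j]:
                neighbor_ndvi.append(ndvi(r_vis[i][j], r_nir[i][j]))
    if not neighbor_ndvi:
        return None
    center = ndvi(r_vis[row][col], r_nir[row][col])
    return center - sum(neighbor_ndvi) / len(neighbor_ndvi)
```

A pixel whose NDVI is clearly lower than that of the surrounding snow/ice yields a negative DNDVI. A decision tree could use this contrast alongside *R*<sub>Vis</sub> and NDVI.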

#### 2.3.2. Cloud Shadow Removal

For flood detection, cloud shadow is the biggest challenge because cloud shadows share very similar spectral characteristics with flooding water in the visible, near infrared, shortwave infrared and thermal infrared channels, so the two cannot be separated spectrally. Thus, during water detection based on the decision-tree approach, most cloud shadows are classified as water. To remove these cloud shadows from the flooding water pixels, we first evaluated the cloud shadow results in the cloud masks and applied them in cloud shadow removal. We then adapted the geometric cloud shadow removal algorithm [34] to VIIRS imagery. This method assumes that one cloud pixel casts, at most, one cloud shadow pixel. It uses a spherical geometry model relating cloud shadows to clouds, which requires cloud height. To avoid possible errors in cloud height products, the geometry model is applied iteratively to build the cloud-to-shadow relationship. Further adjustments to the geometric cloud shadow removal algorithm can address the remaining cirrus-cloud shadows [34].
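The idea of iterating over cloud height to link a cloud pixel to its shadow can be sketched as follows. This is a simplified flat-surface illustration and not the spherical geometry model of [34]. It ignores sensor viewing geometry, assumes a north-up grid with 375-m pixels, and uses hypothetical function names and a hypothetical list of candidate heights.

```python
import math


def shadow_offset_m(cloud_height_m, solar_zenith_deg, solar_azimuth_deg):
    """East/north offset (m) of the shadow from the point below the cloud."""
    distance = cloud_height_m * math.tan(math.radians(solar_zenith_deg))
    # The shadow falls on the side opposite to the sun.
    shadow_azimuth = math.radians(solar_azimuth_deg + 180.0)
    return distance * math.sin(shadow_azimuth), distance * math.cos(shadow_azimuth)


def find_shadow_pixel(cloud_rc, water_candidates, solar_zenith_deg,
                      solar_azimuth_deg, pixel_size_m=375.0,
                      heights_m=range(250, 12001, 250)):
    """Try candidate cloud heights until the projected shadow hits a pixel
    that was classified as water; return (height, (row, col)) or None.

    water_candidates: set of (row, col) tuples flagged as water.
    Only the first match is returned, mirroring the assumption that one
    cloud pixel casts at most one shadow pixel.
    """
    row, col = cloud_rc
    for height in heights_m:
        east, north = shadow_offset_m(height, solar_zenith_deg, solar_azimuth_deg)
        # Rows increase southward on a north-up grid.
        shadow_rc = (row - round(north / pixel_size_m),
                     col + round(east / pixel_size_m))
        if shadow_rc != cloud_rc and shadow_rc in water_candidates:
            return height, shadow_rc
    return None
```

Scanning heights, rather than trusting a single cloud-height value, reflects the motivation stated above: errors in cloud height products would otherwise displace the predicted shadow.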

#### 2.3.3. Terrain Shadow Removal

Terrain shadow is another big challenge in flood detection, because terrain shadows also show similar reflectance properties to water, and may be misclassified as flooding water. To remove terrain shadows, an object-based method is developed using the digital elevation model (DEM) data, resampled to VIIRS or GOES-R resolution from the Shuttle Radar Topography Mission (SRTM)-2 and Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) [35]. Since terrain shadows usually form in mountainous areas while flooding water mainly accumulates in low-lying areas, the surface roughness of terrain shadows is much greater than that of floodwater [36]. Instead of working on single pixels, this object-based method treats a group of adjacent pixels as one object to calculate surface roughness.

The method was applied to identify terrain shadows in the VIIRS-derived flood maps. The validation results show that more than 95% of the terrain shadows can be separated from the flooding water, and some of the remaining cloud shadows can also be removed [35].
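The object-based idea can be sketched as follows: adjacent water-classified pixels are grouped into objects, and objects with rough underlying terrain are treated as terrain shadow. The roughness measure (standard deviation of DEM elevation within an object) and the 30-m threshold are assumptions made for this illustration. The actual definition and thresholds are given in [35,36].

```python
from collections import deque
from statistics import pstdev


def label_objects(mask):
    """Group 4-connected True pixels into objects (lists of (row, col))."""
    n_rows, n_cols = len(mask), len(mask[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    objects = []
    for r in range(n_rows):
        for c in range(n_cols):
            if not mask[r][c] or seen[r][c]:
                continue
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                i, j = queue.popleft()
                pixels.append((i, j))
                for a, b in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    if 0 <= a < n_rows and 0 <= b < n_cols and mask[a][b] and not seen[a][b]:
                        seen[a][b] = True
                        queue.append((a, b))
            objects.append(pixels)
    return objects


def split_terrain_shadows(water_mask, dem, roughness_threshold_m=30.0):
    """Return (flood_objects, shadow_objects) using per-object roughness.

    Roughness here is the population standard deviation of elevation,
    a hypothetical stand-in for the measure used in the paper.
    """
    flood, shadow = [], []
    for pixels in label_objects(water_mask):
        roughness = pstdev([dem[i][j] for i, j in pixels])
        (shadow if roughness > roughness_threshold_m else flood).append(pixels)
    return flood, shadow
```

Working on objects rather than single pixels makes roughness meaningful, because a single pixel has no elevation spread of its own.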

#### 2.3.4. Flooding Water Fraction Derivation

Since a flood is the overflowing of water onto normally dry land, pixels flagged as flooded by a coarse-to-moderate resolution sensor such as VIIRS or ABI may contain a mixture of water and land. The flooding water fraction represents this mixed-pixel information and contains more information than the simple "yes/no" flood water mask [9] used in most satellite-based flood mapping. Therefore, after water classification, if a pixel is classified as "Water", we further calculate its water fraction based on the linear mixture model [16]:

$$f_{\rm w} = \frac{R_{\rm ch\_land} - R_{\rm ch\_mix}}{R_{\rm ch\_land} - R_{\rm ch\_water}}\tag{1}$$

where *f*<sub>w</sub> is the water fraction, *R*ch\_mix is the reflectance for mixed pixels, *R*ch\_land is the reflectance for pure land pixels and *R*ch\_water is the reflectance for pure water pixels. The reflectance in the visible (VIS) channel (e.g., VIIRS Imagery Band 1 or I1: 0.64 µm), near IR (NIR) channel (e.g., VIIRS Imagery Band 2 or I2: 0.865 µm) and shortwave IR (SWIR) channel (e.g., VIIRS Imagery Band 3 or I3: 1.61 µm) are used. As a land pixel may be any surface type (such as vegetation, grass or bare land), *R*ch\_land values vary for different surface types. In order to find the exact threshold values, especially the *R*ch\_land for land end members, a dynamic nearest neighbor searching (DNNS) method was developed to dynamically search the nearby land and water end members [16]:

$$\begin{aligned} \frac{R_{\text{vis\_mix}}}{R_{\text{SWIR\_mix}}} &< \frac{R_{\text{vis\_water}}}{R_{\text{SWIR\_water}}} \\ \frac{R_{\text{vis\_land}}}{R_{\text{SWIR\_land}}} &< \frac{R_{\text{vis\_mix}}}{R_{\text{SWIR\_mix}}} \end{aligned}\tag{2}$$

Equations (2) provide the basis for finding the nearby pure land and water pixels, which are searched in a dynamic window (100 × 100 pixels) around each mixed pixel. The nearest pure land and water pixels that satisfy the relationships in Equations (2) are located in a search loop. The average reflectance of all the identified land pixels is taken as *R*ch\_land, and the average reflectance of all the identified water pixels is taken as the reflectance of pure water (*R*ch\_water). The water fraction can then be calculated from Equation (1). Based on the difference in the water fraction after and before flooding, a flood map can be derived. The algorithm flowchart can be found in Li et al. [33].
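The water-fraction step can be illustrated with a minimal sketch of Equation (1), together with a simplified end-member average. It is not the DNNS algorithm of [16]: it omits the screening by Equations (2) and simply averages pixels already labelled as pure land or pure water within a window. The function names, the label strings and the clipping of the result to [0, 1] are assumptions made for this illustration.

```python
def water_fraction(r_mix, r_land, r_water):
    """Equation (1): linear mixture of land and water reflectance.

    The result is clipped to [0, 1] (an assumption, not stated in the text).
    """
    if r_land == r_water:
        raise ValueError("land and water end members must differ")
    fraction = (r_land - r_mix) / (r_land - r_water)
    return min(1.0, max(0.0, fraction))


def mean_end_members(reflectance, labels, row, col, half_window=50):
    """Mean reflectance of 'land' and 'water' pixels around (row, col).

    labels: 2-D list of 'land', 'water' or 'mix' (hypothetical labels).
    half_window=50 approximates the 100 x 100 pixel search window.
    Returns (r_land, r_water), or None if either end member is missing.
    """
    n_rows, n_cols = len(labels), len(labels[0])
    sums = {"land": 0.0, "water": 0.0}
    counts = {"land": 0, "water": 0}
    for i in range(max(0, row - half_window), min(n_rows, row + half_window + 1)):
        for j in range(max(0, col - half_window), min(n_cols, col + half_window + 1)):
            label = labels[i][j]
            if label in sums:
                sums[label] += reflectance[i][j]
                counts[label] += 1
    if 0 in counts.values():
        return None
    return sums["land"] / counts["land"], sums["water"] / counts["water"]
```

For example, a mixed pixel with reflectance 0.10 between a land end member of 0.20 and a water end member of 0.04 gives a water fraction of (0.20 − 0.10)/(0.20 − 0.04) = 0.625.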

#### **3. Results**

High latitude areas in North America may suffer from floods due to ice jams, especially during the spring break-up season. Here we show an example of a disastrous flood caused by an ice jam along the Yukon River in Alaska from 27 May to early June 2013.

The algorithms described above were applied to the Suomi-NPP/VIIRS data to map and monitor the flooding process dynamically. The flood detection is performed with the water detection and fraction products at the original 375-m resolution.

Figure 2 shows the VIIRS false color image and the corresponding flood detection map at 20:27 Coordinated Universal Time (UTC) on 27 May 2013. A long segment of the Yukon River near Galena was still covered with ice, while ice in the eastern section had mostly melted. Water flowed out of the riverbed to the east of Galena due to the ice jam. The flood could be identified from the VIIRS false color images. With the flood detection algorithms developed in this study, flooding water was detected at water fractions from 60% to 100%. At this time, the flooding water was confined to a small area, and the city of Galena was still safe.

**Figure 2.** VIIRS false color image (**upper**) and the corresponding flood detection map (**lower**) along the Yukon River in Alaska on 27 May 2013.

The largest flooding occurred at 21:29 UTC on May 29 (Figure 3). Most of the flooding water fractions near Galena were close to 100%. Figure 3 shows VIIRS data with large areas of flooding water near Galena. The largest area of the flood was estimated to be approximately 18 miles long. In addition to the flooding along the Yukon River, flooding also occurred along the Koyukuk River because of an ice jam. Afterwards, the downstream ice melted gradually, and the flood water began to retreat. Comparisons with visual analyses of the VIIRS false color images show good consistency in the flood detection results (Figures 2 and 3). VIIRS flood maps can be generated automatically in near-real time, and are quantitative and more objective than visual analysis for flood detection.

**Figure 3.** VIIRS false color image (**upper**) and the corresponding flood detection map (**lower**) along the Yukon River near Galena in Alaska on 29 May 2013.

One advantage of polar-orbiting satellites in high latitude regions, like Alaska, is that multiple observations are available during a single day, which helps to dynamically monitor and predict floods due to ice jams and is especially helpful for tracking ice movement. Figure 4 shows the formation of the ice jam flood near Galena, Alaska, and how ice jams can be identified by observing ice movement and flooding water evolution. In this figure, green arrows show the current ice location, yellow arrows mark the latest ice location, and red arrows identify the ice jam locations. Within two hours, from 20:45 UTC to 22:27 UTC on May 26, ice moved 8.9 km downstream along the Yukon River. From 22:27 UTC on May 26 to 20:27 UTC on May 27, ice moved 62.4 km further downstream within one day. Meanwhile, ice melted and became flooding water, and the flooding progressed rapidly. In less than two hours, from 20:27 UTC to 22:04 UTC on May 27, ice moved a further 11.2 km downstream toward the city of Galena. Late in the night of May 27, the flooding waters increased and the melting ice flowed downstream along the Yukon River, and most of Galena was under water by the morning of May 28 (Figure 4). The residents of Galena were forced to evacuate.

**Figure 4.** The flood detection maps at 20:45 UTC (**a**) and 22:27 UTC (**b**) on May 26; and at 20:27 UTC (**c**) and 22:04 UTC (**d**) on May 27, 20:10 UTC on May 28 (**e**) and 21:29 UTC on May 29 (**f**), 2013. Yellow arrow displays the latest ice location, green arrow indicates current ice location, and red arrow marks the ice jam location.

Figure 4e,f further shows the formation and retreat of the ice jam flood near Galena, AK, from 28 to 29 May 2013. With the multiple available observations, ice movement and flooding water evolution can be tracked. From 22:04 UTC on May 27 to 20:10 UTC on May 28, ice moved downstream only another 1.9 km along the Yukon River, and a 36.5-km-long ice jam section is visible. At 21:29 UTC on May 29, the ice cover had changed to mixed water and ice, or to overflow status, which means that most of the river channel was open and the floodwater had reached its maximum extent. Later, the flooding waters decreased substantially, and had retreated by 1 June 2013 (not shown).
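The ice displacements reported above can be converted into mean speeds, which makes the stalling of the ice at the jam explicit. The sketch below uses the observation times of the Figure 4 panels and the distances quoted in the text. The variable and function names are illustrative only.

```python
from datetime import datetime

# (start UTC, end UTC, downstream ice displacement in km), May 2013
SEGMENTS = [
    ("2013-05-26 20:45", "2013-05-26 22:27", 8.9),
    ("2013-05-26 22:27", "2013-05-27 20:27", 62.4),
    ("2013-05-27 20:27", "2013-05-27 22:04", 11.2),
    ("2013-05-27 22:04", "2013-05-28 20:10", 1.9),
]


def mean_speed_kmh(start, end, distance_km):
    """Mean ice speed (km/h) between two observation times."""
    fmt = "%Y-%m-%d %H:%M"
    elapsed = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return distance_km / (elapsed.total_seconds() / 3600.0)


for start, end, km in SEGMENTS:
    print(f"{start} -> {end}: {mean_speed_kmh(start, end, km):.2f} km/h")
```

The first three intervals give mean speeds of several kilometers per hour, whereas the last interval drops below 0.1 km/h, consistent with the ice being held in place at the jam.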

The VIIRS and GOES-R flood products with ice and snow information can also be used to detect and monitor floods due to snowmelt, as shown in Figures 5 and 6. Figure 5 demonstrates how snow gradually melted and became flooding water. The Red River flows from south to north toward colder latitudes, where ice jams tend to block the flow during the spring thaw season. Flooding along the Red River is a yearly signal of the end of winter and the coming of summer. The spring of 2020 was no different: with major flooding occurring over much of the Red River and its tributaries due to seasonal snowmelt, the flood-prone river overtopped its banks again.

**Figure 5.** GOES-R ABI Flood Extent Product on April 5 (**a**), 6 (**b**), 7 (**c**) and 8 (**d**), 2020.

To find out whether any areas were experiencing impactful flooding outside of the current National Oceanic and Atmospheric Administration (NOAA) flood warnings, forecasters turned to the GOES-R and VIIRS flood products for help in highlighting areas of observed floodwater coverage. Although GOES-R ABI is also an optical sensor, its high temporal resolution (5 min) enables it to capture some clear sky observations, allowing the possibility of observing floods during the day [37]. Figure 5 shows GOES-R ABI flood products from 5 to 8 April 2020, when moderate-to-major flooding was occurring along the Red River and its tributaries within the central and northern basin, due to gradual snowmelt. In this figure, snow is marked in white and ice is shown in cyan. Values higher than 60% (orange and red coloring) were of particular interest, and lower values in the 30–50% range were believed to be non-impactful standing meltwater. On April 5, there was still some snow along the Red River region. On April 6, the snow started melting. On April 7, the snow melted further and resulted in flooding, and on April 8, the snow continued to melt and the flood extent increased even more.

**Figure 6.** VIIRS 5-day (April 7–11) composite flood product in April 2020 (the color scale is the same as Figures 4 and 5. Courtesy of Dave Jones at the Storm Center).

The Red River passes through some of North Dakota's most populated areas. While the ABI flood products are updated hourly, the ABI's spatial resolution of 1 km may smooth out the spatial extent of potentially impactful floodwaters. VIIRS offers the same imagery at the finer resolution of 375 m, but at the expense of producing only one image during the daytime, which requires a clear sky to provide useful information. The VIIRS 5-day composite flood map can remove cloud contamination, and is shown in Figure 6. It confirmed higher percentage values of flooding water in the same areas of interest. Although our algorithms have been intensively validated and evaluated [33], ground observations of the I-29 road closure due to flooding in North Dakota can also validate our flood product.
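A multi-day composite of the kind mentioned above can be sketched as follows. The compositing rule is not stated in this section, so the sketch assumes a per-pixel maximum of the clear-sky water fraction over the compositing period, with cloudy observations marked as `None`. Both the rule and the function name are assumptions.

```python
def composite_max(daily_maps):
    """Per-pixel maximum water fraction over several days.

    daily_maps: list of 2-D lists (one per day) of water fraction in percent,
    with None where the pixel was cloud-covered on that day.
    A pixel stays None only if it was cloudy on every day.
    """
    n_rows, n_cols = len(daily_maps[0]), len(daily_maps[0][0])
    composite = [[None] * n_cols for _ in range(n_rows)]
    for day in daily_maps:
        for r in range(n_rows):
            for c in range(n_cols):
                value = day[r][c]
                if value is not None and (composite[r][c] is None or value > composite[r][c]):
                    composite[r][c] = value
    return composite
```

Because a pixel only needs one clear observation within the period, the composite fills most cloud gaps that affect single-day VIIRS maps.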

High resolution satellite imagery, down to 10 m from Sentinel-2, can be obtained from the Sentinel-Hub EO Browser [38]. A timely, cloud-free pass from the Sentinel-2 satellite over the area of interest is available for comparison and evaluation (Figure 7). Comparing Figure 7 with Figure 6 shows that, in the VIIRS flood map, a high floodwater fraction (>90%, in red) corresponds to deep water, shown in dark blue in the Sentinel-2 imagery, while lower fractions (60–80%, in yellow) correspond to shallow water, shown in lighter blue in the Sentinel-2 image. The Sentinel-2 imagery indicated that the spatial distribution of these floodwaters was close to the ABI (Figure 5c,d) and VIIRS (Figure 6) flood extent areas, confirming that impactful flooding would be possible there.

**Figure 7.** Sentinel-2 Short wave infrared composite (SWIR) imagery on 10 April 2020.

#### **4. Discussion**

The large spatial coverage and frequent revisits of coarse-to-moderate resolution operational satellite imagery, such as the VIIRS onboard the SNPP and the current and future JPSS series, and the ABI onboard the GOES-R series, have advantages in flood detection and monitoring over large areas. In high latitude regions, multiple observations are available from polar-orbiting satellites during the day, which can help dynamically monitor and predict floods due to ice jams. In the cases of this study, floods were due to ice jams and snowmelt, but the procedures and algorithms can be applied to warm season floods due to heavy rainfall as well. The VIIRS and GOES-R flood products are routinely generated at the Space Science and Engineering Center (SSEC), University of Wisconsin, Madison, and the Geographic Information Network of Alaska (GINA) at the University of Alaska, which have access to direct-broadcast VIIRS and GOES-R data. The VIIRS and GOES-R near-real time (NRT) flood products can be accessed in Real Earth and the Advanced Weather Interactive Processing System (AWIPS)-II. The latest flood products are available in NRT from Real Earth [39]. The archived global flood products are available from the JPSS Proving Ground Global Flood Products Archive [40]. Relatively few studies have applied models to river ice forecasting. Numerical models have been developed to simulate ice jam floods for several specific rivers, but they have seldom been used to predict ice jam locations and floods [8–11,41]. One big advantage of the VIIRS and GOES-R flood products, including snow and ice information, is that they can be generated automatically in near-real time, are not limited to specific rivers, and can be used for the dynamic monitoring and prediction of floods due to ice jams and snowmelt all over the globe. Thick ice can break up rapidly under warm temperatures. Morales-Marín et al. [42] found that ice breakup occurred when the simulated water temperature (Tw) was above 5 °C. In most high latitude regions, if there is significant snow cover, warm temperatures will melt not only ice but also snow, and can cause ice jam flooding. When combined with temperature data, satellite-based flood products are expected to allow more quantitative predictions of the breakup timing and locations of floods due to ice jams and snowmelt.

#### **5. Conclusions**

In this study, the VIIRS and ABI flood products provided excellent details of river and overland flooding. Even though the spatial resolution of the GOES-R ABI is relatively coarse (1 km), its high temporal resolution imagery was a good starting point in searching for floodwaters over a large area. By providing the highest flood extent signals, as well as multi-day composite flood products from higher spatial resolution imagery, VIIRS proved to be a good tool for pinpointing areas of interest to target in more detail. In high latitude regions, conventional observations are usually sparse, while polar-orbiting satellite observations are available multiple times a day, which is an advantage for the dynamic monitoring and prediction of floods due to ice jams. Comparisons via visual inspection with the false color images, high resolution satellite imagery and ground observations showed good agreement. As demonstrated in this study, the VIIRS and GOES-R flood products can provide dynamic monitoring and prediction of floods due to ice jams and snowmelt for a wide range of end users.

**Author Contributions:** Methodology, S.L. and D.S.; software, S.L.; validation, S.L. and D.S.; formal analysis, S.L. and D.S.; investigation, S.L. and D.S.; data curation, S.L.; writing—original draft preparation, D.S.; writing—review and editing, D.S., S.L., M.D.G., D.T.L., W.S. and L.Z.; visualization, S.L. and D.S.; supervision, M.D.G., D.T.L., W.S., and L.Z.; project administration, W.S. and L.Z.; funding acquisition, M.D.G., D.T.L., and L.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work is supported by NOAA JPSS and GOES-R Programs under grant #NA12NES4400008. The contents are solely the opinions of the authors and do not constitute a statement of policy, decision, or position on behalf of NOAA or the U. S. Government. We thank the reviewers and the editors for their helpful and constructive comments!

**Acknowledgments:** This work is supported by NOAA JPSS and GOES-R Programs. The constructive and helpful comments from the reviewers and editors are greatly appreciated!

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

# *Article* **Analysis of Ice Storm Impact on and Post-Disaster Recovery of Typical Subtropical Forests in Southeast China**

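**Wutao Yao <sup>1,2</sup>, Yong Ma <sup>1,3,</sup>\*, Fu Chen <sup>1</sup>, Zhishu Xiao <sup>4</sup>, Zufei Shu <sup>5</sup>, Lijun Chen <sup>4</sup>, Wenhong Xiao <sup>4</sup>, Jianbo Liu <sup>1</sup>, Liyuan Jiang <sup>1,2</sup> and Shuyan Zhang <sup>1,2</sup>**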


Received: 30 October 2019; Accepted: 24 December 2019; Published: 2 January 2020

**Abstract:** Ice storms greatly affect the structure, dynamics, and functioning of forest ecosystems. Studies on the impact of such disasters, as well as the post-disaster recovery of forests, are important topics in forest biology, ecology, and geography. Remote-sensing technology provides data and methods that can support the study of disasters at the large-to-medium scale and over long time periods. This study took Chebaling National Nature Reserve in Guangdong Province, China, as the study area. First, field-survey data and remote-sensing data were comprehensively analyzed to demonstrate the feasibility of replacing the forest stock volume with the mean annual value of the Enhanced Vegetation Index (EVI), to study forest growth and change. We then used the EVI from 2007 to 2017, together with a variety of other remote-sensing and forest sub-compartment data, to analyze the impact of the 2008 ice storm and the subsequent post-disaster recovery of the forest. Finally, we drew the following conclusions: (1) Topography had a considerable effect on disaster impact and forest recovery in Chebaling. The forest at high altitudes (700–1000 m) and on steep slopes (25–40°) was seriously affected by this disaster but had a stronger post-disaster recovery ability. Meanwhile, the hardest-hit area for coniferous forest was higher and steeper than that for broad-leaved forest. (2) In the same terrain conditions, coniferous forests were less affected by the disaster than broad-leaved forests and showed less variation during the post-disaster recovery process. Nevertheless, broad-leaved forests had faster recovery rates and higher recovery degrees. (3) Under the influence of human activities, the recovery and fluctuation degree for planted forest in the post-disaster recovery process was significantly higher than that for natural forest. The study suggests that forest has high disaster resistance and self-recovery ability after the ice storm, and that this ability is strongly correlated with the type of forest and with topographic factors such as elevation and slope. At the same time, human intervention can speed up the recovery of forests after disasters.

**Keywords:** ice storm; forest ecosystems; disaster impact; post-disaster recovery; remote sensing

#### **1. Introduction**

Natural disasters, such as snowstorms, ice storms, earthquakes, landslides, tornadoes, volcanic eruptions, and hurricanes, affect natural ecosystems in complex and profound ways [1–4]. Forest ecosystems are particularly disturbed by such disasters, with the effects including a decline in tree density, loss of forest cover, and changes in biodiversity [5,6]. However, forests demonstrate a remarkable capacity to naturally recover from such disturbances over time [7–9]. Evaluating the impact of disasters on forest ecosystems and assessing post-disaster recovery have been important areas of research in forestry and ecology [10–12].

Most areas of Southern China were severely affected by an ice storm between 11 January and 5 February 2008. In total, 19 provinces, autonomous regions, or municipalities with a population of over 100 million were affected [13]. The snow, ice, and sleet not only caused extensive social disruption and economic losses but also severe environmental damage, destroying 1.98 × 10<sup>7</sup> ha, or nearly 13%, of China's forests [14]. Guangdong, Jiangxi, Hunan, Hubei, and Guizhou were particularly badly affected. The freezing weather and sleet, which lasted more than 20 days, caused the greatest disaster in one hundred years in Southern China. In most of the affected areas, parts of the tree trunks and branches were broken, and this created gaps in the canopy. A few trees were completely destroyed in some of the hardest-hit areas. After the disaster, many studies on ice-storm assessment were published. Some of these studies used MODIS remote-sensing data, DEM data, and forest-resource-distribution maps to analyze the impact of the disaster on different types of forests on a large scale [14–16]. Comparative analysis of the degree of damage done to different kinds of forests by using forest-resource-investigation data has also been a common research topic [17,18]. At present, most relevant studies have focused on the destruction of forests caused by this ice storm; few have looked at forest recovery.

Studies on disaster disturbance and recovery heterogeneity, spatial distribution, and causes can be differentiated into two main types [19]: site-specific studies and regional remote-sensing approaches. Site-specific studies use field assessments of either a limited number of sites or plots within an affected area or of a random selection of trees covering the entire study area [20]. Sample-plot configurations have included transects [21], as well as square [6,8] or circular plots [22,23]. These contain a variety of forest species and complex terrain [17,24]. For example, Ge et al. [17] took advantage of the pre- and post-ice storm surveys of a permanent plot in the Shennongjia region to make an assessment of the recovery from the 2008 ice storm based on forest dynamics. Wang et al. [24] established four plots in the Shierdushui Nature Reserve, to examine the degree of damage to dominant species and the measured diameters at breast height (DBHs), as well as to examine the sprout response (indicated by the number of sprouts per stem) of the evergreen broad-leaved forest to the severe winter storm.

Remote-sensing satellite images are used to examine impact and recovery on a regional scale. Compared with site-specific field surveys, remote sensing is a more economical tool for monitoring large-scale forest recovery after disasters [25]. Jiao et al. [10] used multitemporal Landsat images focused on a mountainous region that had the most severe forest destruction caused by the Wenchuan earthquake and selected the NDVI-SMA method (which couples the NDVI with spectral mixture analysis) to extract forest cover information. They then quantitatively estimated spatiotemporal variations in forest recovery for the entire mountainous disaster area after the earthquake. Hislop et al. [26] examined the utility of eight spectral indices for characterizing fire disturbance to sclerophyll forests and subsequent recovery in the eastern half of Victoria, Australia, in order to determine their relative merits in the context of Landsat time-series. Wilson and Norman [27] analyzed spatial and temporal trends in vegetation greenness and soil moisture by applying the normalized difference vegetation index (NDVI) and normalized difference infrared index (NDII) to one Landsat path/row for the dry summer season from 1984 to 2016 in the Cienega San Bernardino wetland.

Both site-specific studies and remote-sensing regional approaches have their advantages and disadvantages. Site-specific studies can obtain accurate and detailed data, which is conducive to targeted research. However, it is difficult to obtain large-scale and spatiotemporally continuous data using this method. Remote-sensing regional approaches can solve this problem; however, due to the lack of long time-series of field survey data, the accuracy of most studies needs to be verified. In addition, the inversion accuracy of remote-sensing parameters still needs to be improved. Therefore, this study intends to verify the reliability of remote-sensing forest-assessment parameters, using field-survey data, and to use field-survey data to supplement remote-sensing data for disaster research.

In this study, we sought to evaluate the impact of the ice storm, as well as the naturally occurring post-disaster forest recovery. We focused on the Chebaling National Nature Reserve, an important area of protected subtropical forest in China which supports numerous rare wild animals and plants. Our main objective was to investigate spatial and temporal variations in forest damage and recovery after the ice storm. First, a spatial correlation analysis between the forest stock volume given by the sub-compartment data and the remotely sensed Enhanced Vegetation Index (EVI) was carried out to verify the feasibility of replacing the forest stock volume with remotely sensed EVI data. Then, in terms of disaster impact and post-disaster recovery, we analyzed the impact of elevation, slope, and forest types on EVI change from 2007 to 2017. Finally, we summarized the characteristics of the impact of the disaster on the forest in Chebaling, as well as the characteristics of the post-disaster recovery, and preliminarily discussed the causes of these phenomena. This study has important implications for the evaluation of disaster impacts and for medium-scale studies of long-term natural recovery processes following natural disasters.

#### **2. Materials and Methods**

#### *2.1. Study Area*

Our study area was located in the Chebaling National Nature Reserve (24°40′–24°46′N, 114°07′–114°16′E), Guangdong Province, China (Figure 1). Chebaling is considered important for protecting typical subtropical evergreen broadleaf forests and rare flora and fauna [28,29]. It was established in 1981 and upgraded to a national nature reserve in 1989. Chebaling encompasses an area of 7545 ha, and there are 1928 plant species and 1558 animal species present within the reserve [28]. The climate of Chebaling is classed as moist, moderate subtropical monsoon; the topography in the region is complex, with an elevation range of 318–1219 m above sea level. The landform is characterized by mountainous areas that are typical of the South China fold system. The average annual temperature is 19.6 °C, and annual precipitation is 1467 mm. Chebaling is located in the transition zone from the southern subtropical area to the middle subtropical area and is dominated by primary evergreen broad-leaved forest. Planted forest, cultivated land, and villages are limited to the flat central area. The ice storm in 2008 had a serious impact on the forest in Chebaling. In the past ten years, the forest has gradually been restored, and the natural forest has been largely unaffected by human disturbance during the restoration period. The modest area and complex topography of Chebaling enable us to analyze the characteristics of natural forest recovery and how these vary according to the vertical zone.

**Figure 1.** The map above represents the location and high-resolution remote-sensing image of Chebaling (the red star represents the central point of Chebaling; the GF-1 remote-sensing image had a resolution of 2 m and was acquired on 15 February 2017); the map below shows the DEM of Chebaling.

#### *2.2. Data*

#### 2.2.1. Remote-Sensing Data

The most important remote-sensing parameter used in this study was the Enhanced Vegetation Index (EVI). Vegetation indices are often used to assess vegetation status and forest recovery. Among the various vegetation indices, the EVI and the Normalized Difference Vegetation Index (NDVI) are the most commonly used in forest ecology studies. The study area was located in the subtropical zone, so, in order to avoid saturation of the vegetation index in this area of lush vegetation [30], we chose the EVI as the main vegetation index. The EVI was developed to improve sensitivity in high biomass regions and to improve vegetation monitoring through a de-coupling of the canopy background signal and a reduction in atmospheric influences [31,32]. In this study, we took one calendar year as the basic time unit, and annual composites of the EVI data were made. We intended to focus on the average and best state of the forest for each year, and so we calculated the annual mean EVI and annual maximum EVI for the forest in Chebaling. Annual EVI data were derived from Landsat TM, ETM+, and OLI composites (path 122, row 43; UTM zone 49 N). We obtained the data by using the Climate Engine (https://clim-engine.appspot.com/). These remote-sensing data had already been processed in the Climate Engine, including radiometric and atmospheric correction. The annual EVI data were then produced from all the available cloud-free Landsat data for the selected calendar years. The years from 2007 to 2011 relied on Landsat 5 TM data, whereas 2012 relied on Landsat 7 ETM+ data; more recent observations used Landsat 8 OLI data. A few pixels (<2% each year) were of poor quality and were excluded from the analysis.
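For reference, the EVI and the two annual composites can be sketched as below. The formula uses the standard EVI coefficients (G = 2.5, C1 = 6, C2 = 7.5, L = 1) applied to surface reflectance. The compositing function is a simplified illustration and not the Climate Engine implementation, and its name and the use of `None` for excluded observations are assumptions.

```python
def evi(nir, red, blue):
    """Enhanced Vegetation Index from NIR, red and blue surface reflectance."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def annual_composites(evi_series):
    """Annual mean and maximum EVI for one pixel in one calendar year.

    evi_series: per-scene EVI values; None marks cloudy or poor-quality
    observations, which are excluded. Returns (mean, maximum), or
    (None, None) if no valid observation exists.
    """
    valid = [value for value in evi_series if value is not None]
    if not valid:
        return None, None
    return sum(valid) / len(valid), max(valid)
```

The annual mean corresponds to the "average state" of the forest and the annual maximum to its "best state" in the sense used above.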

In addition, digital elevation model (DEM) data, GF-1 satellite data, and 9 cloud-free Landsat scenes acquired at specific times were also used in this study. DEM data were used to extract elevation and slope factors, while GF-1 and cloud-free Landsat images were used for classification. The DEM data were derived from ASTER GDEM data provided by NASA and had a spatial resolution of 30 m. The GF-1 multispectral satellite images had a resolution of 2 m and were acquired on 15 February 2017. The cloud-free Landsat data were acquired from the United States Geological Survey (https://espa.cr.usgs.gov) and the Landsat satellite program [33,34]. The data acquisition times were concentrated in the dry season, i.e., from October to December of each year, from 2008 to 2017, with the exception of 2010 (26 March), 2011 (20 August), and 2012 (no data). Further information about the cloud-free Landsat data is shown in Table 1.

**Table 1.** Cloud-free Landsat data details.


#### 2.2.2. Forestry Sub-Compartment Data

The sub-compartment is the basic unit of forest resource statistics and management. The forestry and biological characteristics of forests in the same sub-compartment are basically the same. After the 2008 ice storm, the management department of Chebaling conducted annual field surveys of the forest in the reserve, using forestry sub-compartments as the basic unit. We acquired these data (covering 2009 to 2016) and used them to verify and supplement the satellite data. Chebaling is divided into 451 sub-compartments, of which 423 are covered by different tree species. The forest stock volume and main forest types in each sub-compartment were the main parameters used in this study: these are the most important factors for the investigation of forest stands [35] and the main indicators used to evaluate forests. Figure 2a shows the spatial distribution of forest stock volume for 2016. Forest stock volume is one of the best predictors of biomass at the stand level [36]. Moreover, as an important tool in the understanding of forest dynamics, it can be used to predict whether a forest will act as a CO<sub>2</sub> emission source or sink [37]. Therefore, the state of regional forests can be reliably described by using the forest stock volume, and the forest stock volume can also be used as ground verification data for remote-sensing parameters. In the sub-compartment data, the forest in Chebaling was divided into 10 main types (Figure 2b): tea (T), moso bamboo (MB), woody fruit crops (WFC), *Pinus massoniana* (PM), Chinese fir (CF), coniferous and broad-leaved mixed forest (CABM), coniferous mixed forest (CM), broad-leaved mixed forest (BM), other softwood broadleaved forests (OSB), and other hardwood broadleaved forests (OHB). In addition, there was a non-forest (NF) class. The forest sub-compartment data thus included a large number of forest types and were highly accurate; the data were used to supplement the remote-sensing data in the research.

**Figure 2.** (**a**) Spatial distribution of forest stock volume; (**b**) spatial distribution of forest types. These maps were derived from forestry sub-compartment data in 2016.

#### *2.3. Method*

This section describes the data-analysis and data-processing methods used to verify that the remote-sensing vegetation index can stand in for the actual state of the forest and to extract three important factors (forest types, elevation zones, and slope zones) from multisource data for subsequent study.

#### 2.3.1. Correlation Analysis

By studying the correlation between various remote-sensing vegetation indices and measured forestry data, scholars have evaluated the feasibility of using remote-sensing data to study forest changes, as well as the applicability of various remote-sensing indices [38–40]. Macedo et al. [41] used forest inventory data (24 plots) and forest indices (NDVI, EVI, SR, and SAVI) derived from high-spatial-resolution satellite images to estimate and map the aboveground biomass of Mediterranean *Quercus rotundifolia* in Southern Portugal. Correlation analysis, variance analysis, and linear regression were used in their study; the simple ratio (SR) median value was considered to be the best predictor (R<sup>2</sup> = 75.3) of the aboveground biomass. Bolton et al. [42] used samples of ALS data and Landsat time-series metrics to produce estimates of the top height, basal area, and net stem volume for two timber-supply areas near Kamloops, British Columbia, Canada, using an imputation approach. Their results showed that Landsat-imputed attributes correlated strongly with ALS-based estimates in these blocks (R<sup>2</sup> = 0.62 and relative RMSE = 13.1% for top height, R<sup>2</sup> = 0.75 and relative RMSE = 17.8% for basal area, and R<sup>2</sup> = 0.67 and relative RMSE = 26.5% for net stem volume) and that remote-sensing data could be used to produce wall-to-wall estimates of key inventory attributes. On the basis of the results of previous studies, we analyzed the correlation between EVI (annual mean EVI and annual maximum EVI) and forest stock volume in Chebaling in terms of both spatial correlation and temporal correlation.

A linear regression analysis was conducted to identify the relationship between the forest stock volume at the sub-compartment scale and the maximum and mean values of EVI within each sub-compartment. The linear regression model used was as follows:

$$\mathrm{SV} = a_1 + a_2 \cdot \mathrm{EVI} \tag{1}$$

where SV is the forest stock volume, EVI is the maximum or mean value of the EVI within each sub-compartment, and a<sub>1</sub> and a<sub>2</sub> are regression coefficients.
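As a rough illustration of Equation (1), the sketch below fits the intercept and slope by ordinary least squares and reports the coefficient of determination. It assumes one EVI value and one stock-volume value per sub-compartment; the function name `fit_stock_volume` is hypothetical, and the study does not say which software was used for the regression.

```python
import numpy as np

def fit_stock_volume(evi, sv):
    """Least-squares fit of SV = a1 + a2 * EVI.

    Returns (a1, a2, r2), where r2 is the coefficient of determination.
    """
    evi = np.asarray(evi, dtype=float)
    sv = np.asarray(sv, dtype=float)
    # polyfit returns the highest-order coefficient first: slope, then intercept.
    a2, a1 = np.polyfit(evi, sv, 1)
    predicted = a1 + a2 * evi
    ss_res = np.sum((sv - predicted) ** 2)
    ss_tot = np.sum((sv - sv.mean()) ** 2)
    return a1, a2, 1.0 - ss_res / ss_tot
```

The same function can be applied once with the maximum EVI and once with the mean EVI of each sub-compartment to compare the two fits.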

#### 1. Spatial Correlation

Figure 3a,b show the relationship between the forest stock volume and, respectively, the maximum and mean values of the EVI at the sub-compartment scale in 2016. R<sup>2</sup>, which represents the goodness of fit between the maximum EVI value and the forest stock volume, varies from 0.62 to 0.71 over the 8 years (2009–2016), with a mean value of 0.6721. R<sup>2</sup> for the correlation between the mean value of the EVI and the forest stock volume varies from 0.63 to 0.70, with a mean value of 0.6737. These significant correlations indicate that the spatial relationship between the forest stock volume and the mean value of the EVI is similar to that between the stock volume and the maximum value of the EVI.

**Figure 3.** Spatial correlation analysis. (**a**) Spatial correlation between maximum value of EVI and forest stock volume in 2016; (**b**) spatial correlation between mean value of EVI and forest stock volume in 2016.

#### 2. Temporal Correlation

The mean and maximum values of the EVI in Chebaling for the years 2009–2016, combined with the forest stock volume for each year, were used to analyze the temporal correlation between the variables. Because of the low quality of the Landsat-7 data from 2012 and the fluctuations in EVI caused by insufficient data after cloud removal in 2016, the correlation was recalculated after removing these two years (Figure 4). R<sup>2</sup> between the mean value of the EVI and the forest stock volume is 0.7613, indicating a strong correlation between them. However, there is essentially no correlation between the maximum value of the EVI and the forest stock volume (R<sup>2</sup> = 0.036).

**Figure 4.** Temporal correlation analysis (2009–2015, excluding 2012). (**a**) Temporal correlation between maximum value of EVI and forest stock volume; (**b**) temporal correlation between mean value of EVI and forest stock volume.

Based on the above correlation analysis results, it can be seen that the mean EVI value has a strong correlation with the forest stock volume, both temporally and spatially, and can, therefore, be used to represent the forest stock volume. The mean value of the EVI was thus used for subsequent disaster impact and post-disaster recovery analysis.

#### 2.3.2. Classification

The classification of forest types based on remote-sensing data is both the key step and the most difficult one in the application of remote-sensing technology to forestry. At present, there are many studies on identifying vegetation and forest types by using medium- and low-resolution remote-sensing data [43–45]; in contrast, few studies classify forest types from high-resolution remote-sensing data [46,47], and the related theories and methods are still at an early stage because classification results have been unsatisfactory. However, remote-sensing classification is able to distinguish well between forest and non-forest (NF) categories.

Information about the main forest types in each sub-compartment can be obtained through field surveys, and this information is more accurate and detailed than that obtained via remote sensing. However, the accuracy of forest boundary information derived from forestry sub-compartment data is poor, and the update frequency cannot meet practical and research needs. By combining remote-sensing classification results with sub-compartment forest-type information, it is possible to refine the boundaries of different forest types; this benefits studies of how recovery after the ice storm differs between forest types.

Using the remote-sensing image-processing software ENVI5.3, we preprocessed (atmospherically and radiometrically corrected) 9 cloud-free Landsat scenes, one from each year from 2008 to 2017 with the exception of 2012, and used the maximum likelihood classification method [48] to classify the processed data. As these are medium-resolution remote-sensing data, only four classification categories were used: forest, water, cultivated land and buildings, and bare land. The proportions of each category are shown in Table 2. According to the classification results, the distribution and proportion of land-use types in Chebaling varied little from 2009 to 2017. Therefore, it was possible to use the 2017 forest boundary to represent the Chebaling forest boundary for the ten-year period studied.


**Table 2.** Landsat land-use classification results (2008–2017).

The image used for the 2017 classification was acquired by the GF-1 remote-sensing satellite. Geometric registration, radiometric correction, orthophoto correction, and band fusion were carried out in ENVI5.3 to obtain the standard image. The classification method we used was the object-based random forest method, which is one of the most accurate and widely used algorithms [49–53] for remote-sensing image classification. The classification software used was eCognition, a professional remote-sensing image-classification package. The final classification parameter settings were determined on the basis of several experiments. The parameters for the segmentation process were (1) scale parameter: 50; (2) composition of homogeneity criterion: shape, 0.2; compactness, 0.5. The parameters for the classification characteristics included spectral characteristics (mean value and standard deviation of each band, NDVI), geometric characteristics (border index, shape index), and texture characteristics (GLCM Entropy, GLCM Mean, GLCM Standard Deviation, and GLCM Correlation), giving a total of 15 parameters. The overall accuracy of the classification results was 96.5478%, and the kappa coefficient was 0.9544. The classification map is shown in Figure 5a. Based on the classification results, and in combination with the forest types and boundary information derived from the forestry sub-compartment data, the final forest-types distribution map for Chebaling was generated (Figure 5b); the areas of the different forest types are shown in Table 3.
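The overall accuracy and kappa coefficient quoted above are standard summaries of a confusion matrix. The confusion matrix of the study is not given, so the sketch below only shows how the two figures are conventionally computed; the function name is illustrative.

```python
import numpy as np

def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of validation samples of reference class i
    that were assigned to class j.
    """
    cm = np.asarray(confusion, dtype=float)
    n = cm.sum()
    observed = np.trace(cm) / n  # share of correctly classified samples
    # Agreement expected by chance, from the row and column marginals.
    expected = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / n ** 2
    return observed, (observed - expected) / (1.0 - expected)
```

Kappa discounts the agreement that would be expected by chance, which is why it is lower than the overall accuracy for the same matrix.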

**Figure 5.** (**a**) Land-use classification map based on GF-1 data; (**b**) forest-types distribution map derived from forestry sub-compartment data and the land-use classification map.


#### 2.3.3. Grading Methods for Elevation and Slope

Based on the DEM data and the actual terrain, Chebaling was divided into 9 elevation zones with 100 m intervals and 9 slope zones with 5-degree intervals. The forest area and percentage coverage in each elevation and slope zone are shown in Table 4. Because only changes in the forest were considered in this study, the non-forest area was not included in the subsequent analysis. Figure 6 shows the spatial distribution of the elevation and slope zones.
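A fixed-interval grading of this kind can be sketched as a simple binning step. The exact zone boundaries appear only in Table 4, so the lower bounds used in the usage note (0 degrees for slope, 300 m for elevation) and the treatment of the first and last zones as open-ended are assumptions, and `zone_index` is a hypothetical helper.

```python
import numpy as np

def zone_index(values, lower, width, n_zones):
    """Assign each value to a zone of fixed width, numbered from 0.

    Values below `lower` fall into zone 0 and values beyond the last
    boundary fall into zone n_zones - 1 (both treated as open-ended).
    NaN (no-data) cells must be masked out before calling.
    """
    values = np.asarray(values, dtype=float)
    idx = np.floor((values - lower) / width).astype(int)
    return np.clip(idx, 0, n_zones - 1)
```

For example, `zone_index(slope_deg, 0, 5, 9)` would grade a slope raster into nine 5-degree zones, and `zone_index(elevation_m, 300, 100, 9)` would grade a DEM into nine 100 m zones under the assumed lower bound.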

**Table 4.** Details of the elevation and slope zones.


**Figure 6.** (**a**) Spatial distribution of different elevation zones in Chebaling; (**b**) spatial distribution of different slope zones in Chebaling.

#### **3. Results**

In the previous section, the feasibility of using the annual mean value of the EVI (referred to simply as the EVI from now on) to represent the forest stock volume was demonstrated. Therefore, the EVI was used to represent the status of the forest for the study of disaster impact and post-disaster recovery from 2007 to 2017. The EVI values for 2012 and 2016 were not used, for the reasons discussed in Section 2.3.1; we therefore used the EVI of the remaining nine years of the period 2007–2017. In this section, the differences in disaster impact and post-disaster recovery between elevation zones, slope zones, and forest types are analyzed, first factor by factor and then with multiple factors combined.

#### *3.1. Single-Factor Analysis*

#### 3.1.1. Disaster Analysis in Different Elevation Zones

The broken-line graph (Figure 7) shows the change in EVI in different elevation zones from 2007 to 2017. First, in terms of the impact of the disaster, the EVI of forests in all elevation zones was greatly reduced by the ice storm between 2007 and 2008. However, the EVI decreased more in middle- and high-elevation zones (green, blue, and purple) than in low-elevation zones (red, orange, and yellow). Second, in terms of post-disaster recovery, although the EVI fluctuated slightly in some years (2008–2011), the overall trend was a rise in EVI in all zones, with the lowest value in 2008 and the highest in 2017. From 2008 to 2011, the EVI in the 400–600 m zones was the highest in Chebaling. However, after 2013, the EVI in the higher-altitude areas, particularly in the 600–1000 m zones, gradually exceeded that in the 400–600 m zones. During this period, there was a continuous gentle rise in EVI at elevations above 600 m. In contrast, below 600 m, the EVI fluctuated greatly.

**Figure 7.** Broken-line graph showing EVI changes from 2007 to 2017 in different elevation zones.

Table 5 gives the results of a comprehensive analysis of the disaster impact and the recovery after the disaster. We first explain how the fluctuation degree in Table 5 and the following tables was calculated, taking the elevation zones in Table 5 as an example. First, the standard deviation of the annual EVI growth value from 2008 to 2017 was calculated for each elevation zone. Second, an initial classification of the fluctuation degree was assigned. If there was no significant difference between the standard deviations of the elevation zones, the fluctuation degree of all zones was set to L. If there was a significant difference, the two or three highest values were set to H, the two or three lowest values were set to L, and the others were set to M. Third, the final classification was assigned. According to the broken-line graph of EVI changes from 2007 to 2017 in the different elevation zones, if the EVI trend of one zone differed from that of most other zones, or if its fluctuation amplitude in several years was significantly higher than that of the other zones, the fluctuation degree of that zone was raised by one level (L to M, M to H; H remained unchanged). Otherwise, the fluctuation degree remained unchanged.
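The three steps above can be sketched as follows. The text does not define what counts as a "significant difference" or whether two or three zones are taken at each end, so the `min_spread` threshold and the parameter `k` are hypothetical placeholders, as is the use of the population standard deviation. The third step depends on visual inspection of the graph, so it is represented only by a helper that raises a degree by one level.

```python
import numpy as np

LEVELS = ["L", "M", "H"]

def growth_std(evi_series):
    """Standard deviation of the EVI growth between consecutive available
    years for one zone (population form; an assumption)."""
    return float(np.std(np.diff(np.asarray(evi_series, dtype=float))))

def initial_degrees(stds, k=2, min_spread=0.005):
    """Step 2: all zones 'L' if the standard deviations barely differ;
    otherwise the k lowest get 'L', the k highest get 'H', the rest 'M'.
    k and min_spread are illustrative stand-ins, not values from the study."""
    stds = np.asarray(stds, dtype=float)
    if stds.max() - stds.min() < min_spread:
        return ["L"] * len(stds)
    order = np.argsort(stds)  # zone indices from smallest to largest std
    degrees = ["M"] * len(stds)
    for i in order[:k]:
        degrees[i] = "L"
    for i in order[-k:]:
        degrees[i] = "H"
    return degrees

def raise_one_level(degree):
    """Step 3: L -> M, M -> H, H stays H; applied to zones judged
    anomalous from the broken-line graph."""
    return LEVELS[min(LEVELS.index(degree) + 1, len(LEVELS) - 1)]
```

Because 2012 and 2016 are excluded from the series, the "annual" growth values in this sketch are differences between consecutive available years rather than strictly yearly steps.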

**Table 5.** Details of the disaster impact and post-disaster recovery in different elevation zones. In the property row of the table, disaster impact refers to the absolute value of EVI difference between 2008 and 2007; post-disaster recovery refers to the absolute value of EVI difference between 2017 and 2008; fluctuation degree represents the fluctuation of EVI from 2008 to 2017; H, M, and L in fluctuation-degree column represent high, medium, and low, respectively.


As shown in Table 5, the forest in the 700–1000 m elevation zones (the three red rows) had a high EVI value before the disaster. Although the impact of the disaster was relatively severe in these zones, the post-disaster recovery rate and the increase in EVI were also the highest, and the recovery process was quite smooth, without any large fluctuations. Conversely, the EVI in the 300–600 m elevation zones (the three blue rows) was relatively low before the disaster and decreased little after the disaster; however, the recovery rate was slow and the EVI fluctuated greatly. This indicates that the forest in the high-altitude area of Chebaling was seriously affected by the disaster but also showed a stronger post-disaster recovery ability.
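The two quantities defined in the caption of Table 5 reduce to simple differences of the zone-mean EVI. A minimal sketch, assuming the EVI of a zone is stored in a dictionary keyed by year (a hypothetical layout), is:

```python
def impact_and_recovery(evi_by_year):
    """Disaster impact and post-disaster recovery for one zone.

    evi_by_year: dict mapping a year to the zone-mean EVI of that year;
    it must contain 2007, 2008, and 2017.
    """
    impact = abs(evi_by_year[2008] - evi_by_year[2007])    # drop caused by the ice storm
    recovery = abs(evi_by_year[2017] - evi_by_year[2008])  # rise over the following years
    return impact, recovery
```

For instance, a zone with made-up EVI values of 0.50 in 2007, 0.40 in 2008, and 0.55 in 2017 would have an impact of about 0.10 and a recovery of about 0.15.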

#### 3.1.2. Disaster Analysis in Different Slope Zones

The trend of the EVI in each slope zone from 2007 to 2017 is shown in Figure 8. From 2007 to 2008, EVI in all slope zones decreased significantly, and the decrease in high-slope zones was slightly larger than that in low-slope zones. From 2008 to 2017, the trends in EVI in the different slope zones were similar. Before 2011, the EVI increased or decreased by about the same amount in each slope zone every year, and so the differences between the absolute values of the EVI remained constant. However, the differences between the absolute values of the EVI in different slope zones decreased significantly after 2011, especially from 2013 to 2015, as the degree of recovery in the different slope zones started to vary. Overall, the forest recovery in the areas with steeper slopes was better than that in the less-steep areas between 2008 and 2017.

**Figure 8.** Broken-line graph showing EVI changes from 2007 to 2017 in different slope zones.

The results of the analysis of the disaster impact and disaster recovery for all of the slope zones are summarized in Table 6. Before the ice storm, the EVI values in the zones with slopes between 5° and 25° were the highest. However, the impact of the disaster on the forest in Chebaling increased gradually as the slope increased. As a result, in 2008, areas of forest on steeper slopes had lower EVI values. During the post-disaster recovery process from 2008 to 2017, the trends in EVI in all of the slope zones were basically the same, and the amount of fluctuation was small. There was a positive correlation between the degree of disaster recovery and the degree of disaster impact. Therefore, the ranking of slope zones by EVI value in 2017 was the same as that in 2007.


**Table 6.** Details of the disaster impact and post-disaster recovery in different slope zones.

#### 3.1.3. Disaster Analysis for Different Forest Types

Figure 9 shows the trends in EVI for different forest types from 2007 to 2017. Following the disaster, EVI for BM and OHB decreased the most, whereas EVI for WFC decreased the least. In terms of post-disaster recovery, the EVI trends for the planted forest types (T and WFC) were significantly different from those for the other eight forest types. The T and WFC EVI values fluctuated a lot, with the EVI for T always being lower than that for WFC. The EVI for T and WFC reached its peak in 2015, showing that planted forest can recover to a high EVI level faster. The EVI trends for PM, CABM, and CF were basically the same, with the EVI rising steadily and showing little fluctuation. Moreover, the EVI for these three forest types was higher than that for the other types (except CM and BM) most of the time. The EVI trends for CM and BM were similar, and CM had the highest EVI value in most years. Although the EVI for BM was lower than that for CM, CF, PM, and CABM in 2008, it increased rapidly after that and was one of the highest values in 2017. However, the EVI for CM and BM fluctuated slightly more than that for CF, PM, and CABM. The MB EVI was moderately high in 2008, but its growth rate was low from 2008 to 2017, which led to this value being low in 2017. The EVI for OHB was low in 2008 and changed in a similar way to that for BM and CM; however, it was always 0.02 to 0.03 lower than the BM EVI. The EVI for OSB did not rise as quickly as the EVI for the other forest types. This EVI was the lowest in most years; in addition, its value fluctuated more than all the other EVI values, except those for the two planted forest types (T and WFC).

**Figure 9.** Broken-line graph showing EVI changes from 2007 to 2017 for different forest types.

The results of the analysis of the disaster impact and post-disaster recovery for the different forest types are shown in Table 7. The damage caused by the disaster was less in the areas covered by planted forest (T and WFC) than in the areas of natural forest. In addition, the EVI for the planted forest areas fluctuated greatly during the post-disaster recovery process. Before the disaster, the EVI for the coniferous forest (CF, PM, and CM areas) and CABM was higher than that for most of the other forest types. These four forest types were less affected by the disaster than the broad-leaved forest, and the EVI in these areas showed a relatively steady rise during the post-disaster recovery. BM and OHB were seriously affected by the disaster but recovered quickly, and the EVI here fluctuated slightly more than that for the coniferous forests. OSB had a low degree of disaster impact and post-disaster recovery, which can be attributed to its low EVI value before the disaster. MB was moderately affected by the disaster, and its EVI value increased the least after the disaster.


**Table 7.** Details of the disaster impact and post-disaster recovery for different forest types.

#### *3.2. Multifactor Comprehensive Analysis*

We have analyzed the relationship between EVI change and each single factor (elevation, slope, and forest type) from 2007 to 2017. However, the 10 forest types differed in their spatial distribution; for example, the forest types found in the 400–500 m zone were different from those in the 900–1000 m zone, and WFC and PM grew in regions with different elevations and slopes. Therefore, the disaster analysis for different forest types needs to be taken further, and a multifactor analysis was carried out in this section. We combined forest types with elevation zones and slope zones, respectively, for a comprehensive analysis and used the control-variable method to improve the accuracy of the results.

We calculated the proportional distribution of the 10 forest types across the elevation and slope zones and found that more forest types were present in four elevation zones (400–800 m) and four slope zones (10–30°) than in the other elevation and slope zones. Moreover, the area of these elevation and slope zones was larger than that of the other zones (Table 4). Therefore, we selected these elevation and slope zones for multifactor analysis. Similarly, the areas of six forest types (PM, CF, CM, CABM, BM, and OHB) were larger than those of the other forest types (Table 3), and these six forest types covered wider ranges of elevation and slope. As a result, these six forest types were selected for multifactor analysis.
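One way to picture the control-variable approach is to average the EVI over every combination of forest type and zone and then compare forest types while holding the zone fixed (or zones while holding the forest type fixed). The sketch below assumes pixel records of the form (forest type, zone label, EVI); this layout and the function names are illustrative and are not taken from the study.

```python
from collections import defaultdict

def mean_evi_by_group(pixels):
    """Mean EVI for every (forest_type, zone) combination.

    pixels: iterable of (forest_type, zone, evi) tuples for one year.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for forest_type, zone, evi in pixels:
        acc = sums[(forest_type, zone)]
        acc[0] += evi
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def compare_within_zone(group_means, zone):
    """Hold the zone fixed and return the mean EVI of each forest type in it."""
    return {ftype: mean for (ftype, z), mean in group_means.items() if z == zone}
```

Repeating the grouping for each year gives one EVI series per forest type and zone, which is the kind of input the multifactor comparisons in the following subsections rely on.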

#### 3.2.1. Disaster Analysis for Different Forest Types in Four Elevation Zones

According to the EVI trends for the different forest types in the four elevation zones (Figure 10), combined with the statistical tables of the disaster analysis (Tables 8 and 9), we analyzed the relationship between EVI change and elevation for the 10 forest types in terms of three aspects: disaster impact, post-disaster recovery, and fluctuation degree in the recovery process. First, as shown in Figure 10 and Table 8, EVI for all forest types decreased more in higher elevation zones than in lower elevation zones from 2007 to 2008. EVI for OSB decreased the least, while EVI for BM and OHB decreased the most among all forest types. Second, the EVI for most forest types fluctuated more from 2008 to 2017 in the 400–500 m elevation zone than in the elevation zones above 500 m (Figure 10). Moreover, as the elevation increased, the values in the post-disaster recovery columns (Table 9) for most forest types increased, and the absolute value of the EVI difference between coniferous forests (PM, CF, and CM) and broad-leaved forests (BM, OHB, and OSB) increased significantly. Third, T and WFC were only distributed in the 400–500 m elevation zone. In this zone, EVI for T decreased more than it did for the eight other forest types (except OSB), which indicated that T was highly affected by the disaster. In the post-disaster recovery process, the fluctuation degree and the increase in EVI for T and WFC were the highest among the 10 forest types. Finally, among the eight forest types (not including WFC and T) distributed in the four elevation zones, the disaster impact and post-disaster recovery degree for BM and OHB were both higher than those for other forest types in the same elevation zone.

**Figure 10.** Broken-line graph showing EVI changes for 10 forest types, from 2007 to 2017, in four elevation zones: (**a**) 400–500 m; (**b**) 500–600 m; (**c**) 600–700 m; and (**d**) 700–800 m.





**Table 9.** Post-disaster recovery and fluctuation degree analysis results for different forest types in four elevation zones.

#### 3.2.2. Disaster Analysis for Different Forest Types in Four Slope Zones

Statistics and an analysis were also produced for four typical slope zones. Based on the information in Figure 11 and Tables 10 and 11, we can draw the following conclusions. First, in the slope range of 10–30°, the disaster-impact degree for most forest types gradually increased as the slope increased. However, the increase in the disaster-impact degree with rising slope was clearly smaller than the increase with rising elevation. This indicates that elevation is more decisive than slope in disaster impact. Second, in the post-disaster recovery process, the fluctuation degree for each forest type was similar in different slope zones. However, the absolute value of the EVI difference between forest types gradually increased with slope. The EVI of coniferous forests (PM, CF, and CM) was significantly higher than that of broad-leaved forests (BM, OHB, and OSB) in high-slope zones. Third, since T and WFC were only distributed in the 400–500 m elevation zone, they were less affected by the disaster than other forest types in the same slope zone and had the highest fluctuation degree. Finally, comparing the different forest types in each slope zone, BM was most vulnerable to the disaster but also had the highest post-disaster recovery and fluctuation degree, followed by OHB.

**Figure 11.** Broken-line graph showing EVI changes for 10 forest types, from 2007 to 2017, in typical slope zones: (**a**) 10–15°; (**b**) 15–20°; (**c**) 20–25°; and (**d**) 25–30°.


**Table 10.** Disaster-impact-analysis results for different forest types in four slope zones.

**Table 11.** Post-disaster recovery and fluctuation-degree analysis results for different forest types in four slope zones.


#### 3.2.3. Disaster Analysis for Six Forest Types

After the disaster analysis in typical elevation and slope zones, we analyzed the disaster impact and post-disaster recovery for six typical forest types. First, we studied the influence of elevation. In terms of disaster impact, according to the statistics in Table 12, for five forest types (not including CF) the areas least affected by the disaster were in the lowest elevation zones; moreover, as the elevation increased, the degree of disaster impact either increased gradually or first rose and then fell. In contrast, for CF the degree of disaster impact first fell and then rose as the elevation increased. The hardest-hit areas of CABM and the coniferous forests (CM and PM) were in higher elevation zones than those of the broad-leaved forests (BM and OHB).

**Table 12.** Disaster-impact-analysis results for six forest types in different elevation zones.


During the 10 years after the disaster, as shown in Figure 12, EVI for coniferous forests (PM, CM, and CF) in middle- and high-elevation zones was mostly higher than that in low-elevation zones. In contrast, EVI for CABM and broad-leaved forests (BM and OHB) in middle- and low-elevation zones was higher than that in high-elevation zones. According to Table 13, the highest values in the post-disaster recovery columns for coniferous forests (PM and CM) were in higher elevation zones than those for broad-leaved forests (BM and OHB), but they were all in the 600–1000 m elevation zones. The fluctuation degree for the six forest types was highest in the low-elevation zones. The fluctuation degree for broad-leaved forests (BM and OHB) decreased gradually as the elevation increased, whereas the fluctuation degree for coniferous forests (PM and CM) was higher in high-elevation zones than in middle-elevation zones. Overall, CABM fluctuated little in all elevation zones, and its recovery process was more stable than that of the other five forest types.

**Figure 12.** Broken-line graph showing EVI changes from 2007 to 2017, in all elevation zones, for six forest types: (**a**) CF; (**b**) PM; (**c**) CM; (**d**) CABM; (**e**) BM; and (**f**) OHB.



**Table 13.** Post-disaster recovery and fluctuation degree analysis results for six forest types in different elevation zones.

For slope, we carried out a comprehensive analysis based on Figure 13 and Tables 14 and 15 and compared it with the analysis of elevation. As shown in Figure 13, from 2007 to 2008, the decrease in EVI differed significantly between slope zones for CF and CM but differed little for BM, OHB, PM, and CABM. Among the six forest types (Table 14), the values in the disaster impact columns for CM first decreased and then increased as the slope increased, with the lowest value in the 15–20° slope zone. The values for CF and CABM kept increasing as the slope increased. However, the values for the other three forest types (BM, OHB, and PM) first increased and then decreased as the slope increased. In general, the slope zones above 20° were the hardest-hit areas for all forest types. By comparing the differences in disaster-impact degree across elevation and slope zones for the six forest types, we can see that the influence of slope was greater than that of elevation on CF; the influence of elevation was greater than that of slope on PM, CABM, BM, and OHB; and both elevation and slope had a strong influence on CM.


**Table 14.** Disaster-impact-analysis results for six forest types in different slope zones.

**Table 15.** Post-disaster recovery and fluctuation-degree-analysis results for six forest types in different slope zones.



**Figure 13.** Broken-line graph showing EVI changes from 2007 to 2017, in all slope zones, for six forest types: (**a**) CF; (**b**) PM; (**c**) CM; (**d**) CABM; (**e**) BM; and (**f**) OHB.

In the post-disaster recovery process from 2008 to 2017, as shown in Figure 13, for the three coniferous forests (PM, CF, and CM), EVI in middle- and high-slope zones was higher than that in low-slope zones; for the two broad-leaved forests (BM and OHB), EVI in middle- and low-slope zones was always higher than that in high-slope zones. In addition, it can be seen from Tables 14 and 15 that the increase in EVI during the post-disaster recovery process for the six forest types was positively correlated with the decrease in EVI after the disaster. The slope zones that were seriously affected by the disaster also had a higher recovery degree. Moreover, the EVI for the six forest types rose steadily in all slope zones, without any significant fluctuation.

#### **4. Discussion**

The quantitative remote-sensing analysis revealed differences in forest EVI trends under a variety of topographic conditions after the ice storm. First, the single-factor analysis showed obvious differences in disaster impact and post-disaster recovery among elevation and slope zones. Areas at an altitude of 700–1000 m and with a slope of 25–40 degrees were most affected by the disaster but also had the highest post-disaster recovery degree. Next most affected were the highest-altitude areas, above 1000 m, and the steepest slopes, greater than 40 degrees. Areas below 700 m and with slopes of 25 degrees or less were least affected by the disaster and had the lowest post-disaster recovery degree, but their fluctuation degree was high during the recovery process. Except for areas below 500 m, forest EVI in the other elevation and slope zones increased rapidly in the first three years (2009–2011) following the disaster, and the growth rate gradually slowed in the later period. In addition, the multifactor analysis showed that the areas most affected by the disaster and with the highest recovery degree were at higher altitudes and on steeper slopes for coniferous forests than for broad-leaved forests.

Based on the theoretical analysis and field investigation, we believe that the following were the most important factors behind this result. (1) Freezing rain, strong winds, and ice have a greater impact on regions at higher elevations and on steeper slopes, which resulted in greater losses in these regions, as has similarly been demonstrated by other studies [6,54,55]. (2) Villages, farmland, and planted forests were distributed in the areas at a lower altitude and with a gentler slope. Therefore, these areas were greatly affected by human activities, leading to the highest level of EVI fluctuation during the disaster-recovery process. Areas with elevations of 700–1000 m and slopes of 25–40 degrees were mainly covered by natural forest. The EVI in these areas was higher than in other areas before the disaster. Although these areas were seriously impacted by the disaster, and so the EVI here decreased more, the biological characteristics of natural forests enable them to recover quickly after disasters. Therefore, in the decade after the disaster, the EVI of the forest in this region increased the most. Forest density, average height, and diameter at breast height (DBH) are all affected by topography, climate, and other factors, and so were lower in areas above 1000 m and on slopes above 40 degrees in Chebaling. As a result, the EVI in this region was lower than in other regions. Although the disaster had a great impact on the forest in these areas, the EVI here decreased less in 2008, and its increase in the following 10 years was also smaller than in the middle-altitude and moderate-slope areas. (3) In the areas above 700 m and above 25 degrees, as the elevation and slope increased, EVI for broad-leaved forest decreased significantly, while EVI for coniferous forest changed little or increased slightly. Therefore, the hardest-hit areas for coniferous forest were higher and steeper than for broad-leaved forest.

The study of the disaster impact and the post-disaster recovery for different forest types revealed two interesting phenomena. (1) There was a great difference between natural and planted forests in terms of the change in EVI from 2007 to 2017. Natural forests had a rich variety of species and a high level of biodiversity, while planted forests were more homogeneous; planted forests therefore had a lower ability to withstand the disaster than natural forests [56–58]. As a consequence, in the same elevation zones, planted forests were more severely affected by the ice storm than natural forests. However, human activities changed the natural recovery process. As a result, the planted forests recovered quickly but also showed large fluctuations in EVI during post-disaster recovery. (2) In the comparative analysis of different forest types in the same elevation zone and slope zone, we found that coniferous forests suffered less EVI decline than broad-leaved forests. This suggests that coniferous forests are more resilient to ice and snow than broad-leaved forests, which might result from broad-leaved forests having broad, flat crowns that expose a large surface area of branches and, therefore, make them more susceptible to extensive damage. In contrast, coniferous trees expose a smaller proportion of their lateral branches to ice accumulation [59,60], resulting in less physical damage than in broad-leaved forests. However, due to their characteristics and the hot, humid climate in Chebaling, broad-leaved forests can photosynthesize faster and thus have a higher rate of recovery.

#### **5. Conclusions**

In this study, we used multisource, long-time-series remote-sensing data and field-survey data to evaluate the spatial and temporal variations in forest damage and recovery after the 2008 ice storm in Chebaling National Nature Reserve. First, by analyzing the relationship between remotely sensed EVI data and field-survey data in Chebaling, we concluded that, because the two are strongly correlated, the annual mean EVI can be used to represent the forest stock volume and can reflect both the status of regional forests and changes in that status. Second, the effects of topography and forest type on disaster impact and post-disaster recovery were analyzed from both single-factor and multifactor perspectives. Our results indicate that topography had a considerable effect on disaster impact and forest recovery, and elevation was more decisive than slope in disaster impact. The disaster impact and recovery degree for all forest types were higher in high-altitude and steep-slope areas than in low-altitude and gentle-slope areas, especially in the 700–1000 m elevation zones and the 25–40° slope zones. However, coniferous forests in the high-elevation and steep-slope zones grew better than broad-leaved forests, so the hardest-hit areas for coniferous forests were higher and steeper than those for broad-leaved forests. The analysis of different forest types showed that broad-leaved forests were more affected by the ice storm than coniferous forests but had a faster recovery rate and a higher degree of recovery. Although planted forests were more severely affected by the ice storm than natural forests in areas with similar topographical conditions, planted forests recovered faster than natural forests because of human intervention, although their recovery process fluctuated greatly. Compared with areas of monospecific tree species, the recovery process of coniferous and broad-leaved mixed forest was more stable.
This study focuses on analyzing the characteristics and laws of disaster impact and post-disaster recovery. In the future, the driving forces of forest recovery will be studied further, and more experiments will be conducted at larger scales and in more regions to test how universal the characteristics and laws summarized in this study are.

**Author Contributions:** Data curation, L.J. and S.Z.; formal analysis, W.Y.; funding acquisition, Y.M.; investigation, L.C. and W.X.; methodology, W.Y. and Y.M.; resources, F.C. and Z.S.; writing—original draft, W.Y.; writing—review and editing, Y.M., Z.X., and J.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by Hainan Provincial Natural Science Foundation of China, grant number 419QN263; the National Key Research and Development Program of China, grant number 2017YFC0503802; and the 2018 Central Fund for Forestry Reform and Development.

**Acknowledgments:** We appreciate the support and assistance given by the Guangdong Chebaling National Nature Reserve staff during the fieldwork.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article*

# **A Novel Ensemble Approach for Landslide Susceptibility Mapping (LSM) in Darjeeling and Kalimpong Districts, West Bengal, India**

**Jagabandhu Roy <sup>1</sup>, Sunil Saha <sup>1</sup>, Alireza Arabameri <sup>2,</sup>\*, Thomas Blaschke <sup>3</sup> and Dieu Tien Bui <sup>4,</sup>\***


Received: 12 October 2019; Accepted: 15 November 2019; Published: 2 December 2019

**Abstract:** Landslides are among the most harmful natural hazards for human beings. This study aims to delineate landslide hazard zones in the Darjeeling and Kalimpong districts of West Bengal, India, using a novel ensemble approach that combines the weight-of-evidence (WofE) and support vector machine (SVM) techniques with remote sensing (RS) datasets and geographic information systems (GIS). The study area currently faces severe landslide problems, causing fatalities and losses of property. In the present study, the landslide inventory database was prepared using Google Earth imagery and a field investigation carried out with a global positioning system (GPS). Of the 326 landslides in the inventory, 98 landslides (30%) were used for validation, and 228 landslides (70%) were used for modeling purposes. The landslide conditioning factors of elevation, rainfall, slope, aspect, geomorphology, geology, soil texture, land use/land cover (LULC), normalized difference vegetation index (NDVI), topographic wetness index (TWI), sediment transportation index (STI), stream power index (SPI), and seismic zone maps were used as independent variables in the modeling process. The WofE and SVM techniques were ensembled and used to prepare landslide susceptibility maps (LSMs) with the help of RS data and GIS. The LSMs were then classified into four classes, namely low, medium, high, and very high susceptibility to landslide occurrence, using the natural breaks classification method in the GIS environment. The very high susceptibility zones produced by these ensemble models cover areas of 630 km<sup>2</sup> (WofE & RBF-SVM), 474 km<sup>2</sup> (WofE & Linear-SVM), 501 km<sup>2</sup> (WofE & Polynomial-SVM), and 498 km<sup>2</sup> (WofE & Sigmoid-SVM), respectively, of a total area of 3914 km<sup>2</sup>.
The results of our study were validated using the receiver operating characteristic (ROC) curve and quality sum (Qs) methods. The area under the curve (AUC) values of the ensemble WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM models are 87%, 90%, 88%, and 85%, respectively, which indicates they are very good models for identifying landslide hazard zones. As per the results of both validation methods, the WofE & Linear-SVM model is more accurate than the other ensemble models. The results obtained from this study using our new ensemble methods can provide useful information to decision-makers and policy planners in the landslide-prone areas of these districts.

**Keywords:** landslide; machine learning models; remote sensing; ensemble models; validation

#### **1. Introduction**

Landslides are a common natural disaster that threatens mountainous regions. Like hurricanes, floods, droughts, earthquakes, soil erosion, and tsunamis, landslides are important environmental disasters that cause damage and destruction to residential areas, roads, agricultural fields, gardens, and grasslands. Spatially predicting landslide-prone areas may play an important role in disaster management, and it can be considered the standard tool for decision-making in different areas [1]. In geological engineering, landslides are defined as the downward movement of material mass on a slope [2]. Worldwide, mountainous areas are profoundly affected by landslides due to the instability of slopes and masses [3]. For example, the Indian Himalayan mountain regions such as Jammu & Kashmir, Himachal Himalaya, Kumayun, Darjeeling, Sikkim, and the north-eastern hilly regions are severely affected by landslides [4]. In the Darjeeling Himalayan region, landslides have a severe environmental impact on socio-economic development. Every year the Darjeeling and Kalimpong districts are hit by landslides due to heavy monsoon rainfall and seismic activity [5]. During July–August 1993, May 2009, and September 2011, the Darjeeling and Kalimpong districts were severely affected by extreme landslides [6]. Furthermore, some major towns in these districts, such as Darjeeling, Mirik, Kurseong, and Kalimpong, were hit by landslides during June–July 2015 due to heavy rainfall, causing fatalities and damage to properties. Therefore, it is necessary to address the landslide risk faced by this particular region to reduce the impact of this environmental disaster. Information regarding the magnitude, character, and probability of landslides can be used to reduce the impact of landslide hazards and for sustainable environmental development and future planning [7].
Therefore, landslide potentiality zoning is an important step for sustainable land management, not only for this particular region but also for other mountainous regions all over the world. The chance of landslide occurrence depends on various conditioning factors rather than a single factor. Two things are important for preparing a landslide susceptibility map: first, the landslide inventory data, which are treated as the dependent variable, and second, the landslide conditioning factors, which are treated as independent variables. In this study, the landslide inventory map was prepared using data collected from Google Earth imagery, a global positioning system (GPS), and extensive field investigations. The landslide conditioning factors or environmental factors, such as slope, aspect, altitude, curvature, geology, soil, land use, normalized difference vegetation index (NDVI), distance from drainage, distance from fault, distance from road, topographic wetness index (TWI), and stream power index (SPI), were selected based on the findings of previous literature (Yilmaz [8], Abedini et al. [9], Regmi et al. [10], Chawla et al. [11], Shahabi and Hashim [12], Roy and Saha [13], Pradhan [14], Pourghasemi et al. [15], Pham et al. [16], and Goetz et al. [17]). The landslide inventory data and the aforementioned landslide conditioning factors were used to prepare the LSMs with the help of remote sensing (RS) data and geographical information systems (GIS). Nowadays, most researchers argue that machine learning algorithms using remote sensing and geographical information systems are reliable and appropriate methods for assessing landslide hazards. During recent decades, many studies on landslide susceptibility mapping have been conducted in various parts of the world.
Researchers have applied different approaches within geographical information systems and remote sensing environments to produce landslide susceptibility maps, including statistical, probabilistic, knowledge-driven, and machine learning models. Examples include the analytical hierarchy process (AHP) and bivariate statistics [9,18], logistic regression (LR), artificial neural networks (ANN), frequency ratio (FR), naive Bayes classifier, auto logistic modeling, static methods, multivariate adaptive regression, two-class kernel logistic regression, SVM, artificial neural network kernel, logistic regression and logistic tree, random forest, and decision tree methods [19]. Ensemble techniques have been shown to achieve better results than a single method. In this article, WofE was ensembled with four kernels of SVM (radial basis function (RBF), linear, polynomial, and sigmoid) to predict probable landslide hazard areas and to compare the results. The Darjeeling and Kalimpong districts are parts of the eastern Himalayan region in India. Both districts are mostly covered in hilly terrain. Every year, these districts are affected by landslides, which cause destruction to the roads, residential areas, tea gardens, and forests, leading to numerous fatalities. Therefore, these regions were selected as the study area to raise awareness among the public and government so that the necessary steps can be taken to mitigate the landslide hazard.

#### **2. Materials and Methods**

#### *2.1. Study Area*

The Darjeeling and Kalimpong districts are situated in the eastern Himalaya region of India and are mainly covered in hilly and rugged mountainous terrain. Together, these districts cover an area of 3914 km<sup>2</sup>. The research site lies between latitudes 26°27′N and 27°13′N and longitudes 87°59′E and 88°53′E (Figure 1). The altitude of the study area ranges from 15 m to 3602 m above mean sea level. Climatically, the region is influenced by the south-west and north-east Indian monsoons. The summer season is very wet, and the winter season is dry and cold. The temperature of this region can drop close to zero degrees. According to the India Meteorological Department, the rainfall of this region ranges from 1877 mm to 2333 mm. Geologically, the region is composed of Precambrian (Darjeeling gneiss, Daling series), Permian (Damuda series), Miocene (Siwaliks), and recent Pleistocene (Alluvium) lithologies, as shown in Table 1 [20]. The Gorubathan and Rangamati surfaces are tectonic landscapes of these regions. The majority of the study area is covered in Triassic rock. Regarding its geomorphology, the research site is composed of active flood plain, alluvial plain, folded ridge, highly dissected hill slope, intermontane valley, and piedmont fan plain [11]. Pedologically, the region is characterized by various soil texture classes, namely gravelly-loamy, fine-loamy to coarse-loamy, gravelly-loamy to loamy skeletal, and gravelly-loamy to coarse-loamy [21]. Several rivers, namely the Mahananda, Tista, Mechi, Balason, Jaldhaka, Rammam, and Rangit, originate in the mountainous areas and flow across these districts. The Darjeeling and Kalimpong districts are famous for national and international tourism. Some attractive tourist places in these regions are Tiger Hill, Rock Garden, Mahakal Temple, Dhirdham Temple, Batasia Loop, Ghoom Monastery, and Happy Valley Tea Garden. The major economic activities of these regions are tea plantation, horticulture, and tourism.
The tea of this region is famous worldwide. Siliguri, Darjeeling, and Kalimpong are the major towns and headquarters within our study area. The total population is 1,846,823, of which 50.75% are male and 49.25% are female. The population density of the research site is 586 people/km<sup>2</sup>, which is higher than the mean Indian population density [22]. From 2001 to 2011, the lengths of the national highways, state highways, and other main district highways increased from 100 to 111 km, 80 to 191 km, and 37 to 79 km, respectively. Different cultural communities are present in the study area, such as Nepali, Lepcha, Bhutia, and Rai.



Source: Mallet (50); Gansser (51); Pawde and Saha (52).

**Figure 1.** Study area and landslide location map.

#### *2.2. Methodology*

The methodology of the present study is depicted in Figure 2. The flowchart is divided into four main steps, as follows. Step 1: The landslide inventory map (LIM) and landslide conditioning factor (LCF) data layers were prepared. Step 2: A multicollinearity analysis of the landslide conditioning factors was carried out. Step 3: New ensembles of the weight-of-evidence (WofE) and SVM models were applied to prepare the landslide susceptibility maps (LSMs). Step 4: The LSMs were validated using the receiver operating characteristic (ROC) and quality sum (Qs) methods to measure the capability of the models and identify the most suitable model.
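To make the ROC-based validation in Step 4 concrete, the following minimal sketch computes the area under the ROC curve (AUC) as the probability that a randomly chosen landslide cell receives a higher susceptibility score than a randomly chosen non-landslide cell. The function name `auc` and the example scores are hypothetical and are not taken from the study.

```python
def auc(scores_pos, scores_neg):
    """AUC as the share of (landslide, non-landslide) pairs ranked correctly.

    Ties count as half a correct ranking. The inputs are hypothetical
    susceptibility scores, not values from the study.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Illustrative scores for three landslide and three non-landslide cells.
example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
print(round(example_auc, 3))
```

The pairwise form is quadratic in the number of cells, so for full rasters a rank-based computation would probably be preferable; the sketch favors readability.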

**Figure 2.** Methodological flowchart of the present work.

#### *2.3. Data Preparation*

#### 2.3.1. Landslide Inventory Dataset

It is vital to analyze the landslide distribution and landslide conditioning factors to determine which areas are most at risk of landslide occurrence. The landslide inventory map (LIM) is an important part of the evaluation and assessment of landslide hazards and risks. Several researchers have used landslide inventory datasets for landslide susceptibility mapping [8–17,23]. In the present study, a total of 326 landslides were identified through extensive field investigations using a global positioning system (GPS) and Google Earth imagery. Of these 326 landslides, 228 (70%) were chosen randomly for landslide modeling purposes, and 98 (30%) were used to validate the prepared landslide susceptibility maps. The landslide inventory map (LIM) was prepared in a GIS environment and is shown in Figure 1. Field photographs of some landslides in the study area are shown in Figure 3.
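The random 70/30 partition described above can be sketched as follows. This is only an illustration: the function name `split_inventory`, the integer landslide IDs, and the random seed are assumptions, and the study does not report how its random selection was implemented.

```python
import random

def split_inventory(landslide_ids, train_fraction=0.7, seed=42):
    """Randomly split landslide IDs into modeling and validation subsets."""
    ids = list(landslide_ids)
    rng = random.Random(seed)   # fixed seed (an assumption) keeps the split reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# 326 inventory landslides, identified here by hypothetical integer IDs.
train_ids, valid_ids = split_inventory(range(326))
print(len(train_ids), len(valid_ids))  # 228 98
```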

**Figure 3.** Field photographs of some landslides in the study area. (**a**) Sikkim-Kalimpong road (27°03′20″N, 88°26′03″E). (**b**) Sevokekalimandir (26°54′01″N, 88°28′18″E). (**c**) Lish catchment (26°57′N, 88°30′17″E). (**d**) Darjeeling road (26°54′33″N, 88°17′10″E). (**e**) Pagla Jhora (26°52′47.70″N, 88°18′11.24″E). (**f**) Sevoke Road (26°54′33″N, 88°28′04″E).

#### 2.3.2. Preparing Effective Factors

Landslides are processes of mass movement under the influence of different effective factors. Accordingly, it is essential to analyze the conditions of the selected factors to assess landslide susceptibility. Topographic (altitude, slope, aspect), climatic (rainfall), lithological (geology, distance from lineament), and hydro-morphological (geomorphology, distance from river, sediment transportation index, stream power index, topographic wetness index) factors, together with land use, vegetation index, soil texture, and earthquake intensity, are the major effective factors responsible for landslides in general. Previous studies, including Yilmaz [8], Abedini et al. [9], Regmi et al. [10], Chawla et al. [11], Shahabi et al. [12], Roy and Saha [13], Pradhan [14], Pourghasemi et al. [15], Pham et al. [16], and Goetz et al. [17], used these effective factors for landslide susceptibility mapping. In the present study, rainfall (Figure 4d), elevation (Figure 4a), slope (Figure 4b), aspect (Figure 4c), geology (Figure 4e), soil texture (Figure 4f), distance from river (Figure 4g), distance from lineament (Figure 4h), distance from road (Figure 4k), geomorphology (Figure 4o), land use/land cover (Figure 4i), normalized difference vegetation index (Figure 4j), topographic wetness index (Figure 4l), sediment transportation index (Figure 4n), stream power index (Figure 4m), and seismic zone (Figure 4p) maps were used to delineate the landslide susceptibility area. Different techniques, which are listed in Table 2, were used to prepare the thematic layers of these effective factors. A DEM with a spatial resolution of 30 m × 30 m was selected to prepare the landslide susceptibility maps, and all layers with coarser or finer resolutions than the DEM were resampled to 30 m × 30 m.

Slope is one of the main landslide conditioning factors. The spatial distribution of slope ranges from 0 to 89 degrees (Figure 4b). The aspect (Figure 4c) was classified into nine categories: flat (−1), north (0–22.5; 337.5–360), northeast (22.5–67.5), east (67.5–112.5), southeast (112.5–157.5), south (157.5–202.5), southwest (202.5–247.5), west (247.5–292.5), and northwest (292.5–337.5). The altitude of the study area ranges from 15 m to 3602 m above mean sea level (Figure 4a). The spatial distribution of average rainfall ranges from 1877 mm to 2333 mm (Figure 4d). The geological map was obtained from the Geological Survey of India. The river buffer map was classified into five classes, based on the distance from the river, using the natural breaks classification method. The maximum distance from the river in this study area is 4.33 km (Figure 4g). Similarly, the maximum distances from the road and lineament are 16.4 km (Figure 4k) and 10 km (Figure 4h), respectively. The land use of this study area was classified into six categories, namely water bodies, settlement, vegetation, tea gardens, fallow land, and agricultural land (Figure 4i). The NDVI values range from −0.072 to 0.432 (Figure 4j). The topographic wetness index (TWI) value of the study area ranges from 1.95 to 18.41 (Figure 4l). Geomorphologically, the research area consists of active flood plain, alluvial plain, folded ridge, highly dissected hill slope, intermontane valley, and piedmont fan plain (Figure 4o). The seismic map of the study area was classified into two categories, namely moderate and high seismic zones. The values of the moderate risk zone range from 3 to 5 on the Richter scale, while values above 5 on the Richter scale characterize the high seismic prone areas (Figure 4p). The spatial value of STI ranges from 0 to 203 (Figure 4n). The value of SPI in the study area ranges from −11.0 to 7.81 (Figure 4m).
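The compass classes used for the aspect layer can be expressed as a small lookup. The sketch below is illustrative only: the function name `aspect_class` is hypothetical, and assigning a boundary value such as 22.5 to the upper class is an assumption, since the text does not say how boundaries were handled.

```python
def aspect_class(aspect_deg):
    """Map an aspect in degrees (or -1 for flat terrain) to a compass class."""
    if aspect_deg < 0:
        return "flat"
    names = ["north", "northeast", "east", "southeast",
             "south", "southwest", "west", "northwest"]
    # Shifting by 22.5 degrees puts both halves of north (337.5-360 and
    # 0-22.5) into the same 45-degree bin; boundary values go to the upper class.
    index = int(((aspect_deg % 360) + 22.5) // 45) % 8
    return names[index]

for value in (-1, 10, 350, 90, 300):
    print(value, aspect_class(value))
```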

The elevation, slope, aspect, rainfall, normalized difference vegetation index (NDVI), topographic wetness index (TWI), stream power index (SPI), and sediment transportation index (STI) factors were classified into five sub-layers using the natural breaks classification method in a GIS environment (Figure 4). The land use/land cover (LULC) map was produced using the maximum likelihood classification method (Figure 4). The geology, soil texture, geomorphology, and seismic zone maps were categorized into different sub-layers using a general classification technique in a GIS environment (Figure 4).

**Figure 4.** Landslide conditioning factors: (**a**) elevation; (**b**) slope; (**c**) aspect; (**d**) rainfall; (**e**) geology; (**f**) soil texture; (**g**) distance from river; (**h**) distance from lineament; (**i**) land use/land cover (LULC); (**j**) normalized difference vegetation index (NDVI); (**k**) distance from road; (**l**) topographic wetness index (TWI); (**m**) stream power index (SPI); (**n**) sediment transportation index (STI); (**o**) geomorphology; and (**p**) seismic map.




**Table 2.** *Cont*.

#### *2.4. Multicollinearity Analysis*

Multicollinearity analysis is a vital way of identifying and selecting appropriate landslide conditioning factors [13]. In this study, multicollinearity was evaluated through the tolerance value and variance inflation factor (VIF). In normal conditions, tolerance values under 0.10 or VIF values of 10 and above indicate multicollinearity [31–33]. In the present study, the multicollinearity test of landslide conditioning factors was done using SPSS software.
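The tolerance and VIF screening can be illustrated with a short sketch. The study ran this test in SPSS; the pure-Python version below is only an assumption-laden illustration that uses the standard identity that the VIF of factor j equals the j-th diagonal entry of the inverse correlation matrix. All function names and the synthetic data are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def tolerance_and_vif(columns):
    """Return (tolerance, VIF) lists for a list of factor samples.

    VIF_j is the j-th diagonal entry of the inverse correlation matrix,
    and tolerance_j = 1 / VIF_j = 1 - R_j^2.
    """
    k = len(columns)
    corr = [[1.0 if i == j else pearson(columns[i], columns[j])
             for j in range(k)] for i in range(k)]
    inverse = invert(corr)
    vif = [inverse[j][j] for j in range(k)]
    tolerance = [1.0 / v for v in vif]
    return tolerance, vif

# Synthetic factors (not the study's data): c is almost a + b, so all three are collinear.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(200)]
b = [rng.gauss(0, 1) for _ in range(200)]
c = [x + y + rng.gauss(0, 0.05) for x, y in zip(a, b)]
tolerance, vif = tolerance_and_vif([a, b, c])
# Apply the thresholds quoted in the text: tolerance < 0.10 or VIF >= 10.
flagged = [t < 0.10 or v >= 10 for t, v in zip(tolerance, vif)]
print(flagged)
```

A factor flagged in this way would normally be dropped or merged before modeling; which factors the authors retained is reported in their results, not derived here.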

#### *2.5. Models*

#### 2.5.1. Weight-of-Evidence (WofE) Model

The present study applies an ensemble of the WofE model (a Bayesian probability model) and the SVM model to assess landslide susceptibility in a GIS environment. Two types of data were incorporated in the weight-of-evidence model, namely the landslide inventory data and the landslide conditioning factors. The WofE model assigned weights to each landslide conditioning factor. WofE is a data-driven statistical method based on the Bayesian probability model [29,34–40]. Mohammady et al. [38] and Regmi et al. [10] emphasized the value of using the weight-of-evidence model for the evaluation of landslide hazard zones.

The positive weight (W<sup>+</sup>) and negative weight (W<sup>−</sup>) were calculated to complete the weight-of-evidence function. This calculation was the basis for assigning the weights to the landslide conditioning factors (B) based on the presence and absence of landslides within the area [35], using Equations (1) and (2).

$$W_i^{+} = \ln \frac{P\{B \mid A\}}{P\{B \mid \overline{A}\}} \tag{1}$$

$$W_i^{-} = \ln \frac{P\{\overline{B} \mid A\}}{P\{\overline{B} \mid \overline{A}\}} \tag{2}$$

Here, P is the probability and ln is the natural logarithm. B and B̄ indicate the presence and absence of a landslide predictive factor, and A and Ā indicate the presence and absence of landslides. A positive weight (W<sup>+</sup>) indicates the presence of landslides in a sub-category of a landslide conditioning factor, and the magnitude of this weight indicates the strength of the positive correlation between the conditioning factor and landslide occurrence. A negative weight (W<sup>−</sup>) indicates the absence of landslides in a sub-category of a landslide conditioning factor and a negative correlation between the conditioning factor and the occurrence of landslides [36]. For modeling purposes, the weight contrast C (C = W<sup>+</sup> − W<sup>−</sup>) measures the spatial association between landslide conditioning factors and landslide occurrences. A positive C value indicates a positive spatial association and a negative C value indicates a negative spatial association [37].

The standard deviation of C is calculated using Equation (3):

$$S(C) = \sqrt{S^2(W^{+}) + S^2(W^{-})} \tag{3}$$

where S<sup>2</sup>(W<sup>+</sup>) is the variance of the positive weight and S<sup>2</sup>(W<sup>−</sup>) is the variance of the negative weight. The variances of the weights were calculated using Equations (4) and (5):

$$S^2(W^{+}) = \frac{1}{N\{B \cap A\}} + \frac{1}{N\{B \cap \overline{A}\}} \tag{4}$$

$$S^2(W^{-}) = \frac{1}{N\{\overline{B} \cap A\}} + \frac{1}{N\{\overline{B} \cap \overline{A}\}} \tag{5}$$

where N{·} denotes the number of unit cells (pixels) in the indicated overlap.

The studentized contrast is the final weight. It is a measure of confidence and is defined as the ratio of the contrast divided by its standard deviation. The studentized contrast serves as an informal test of whether C is significantly different from zero or if the contrast is likely to be "real" [35]. After applying the WofE model, the factor weights calculated by this model were ensembled with the SVM model.
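The calculations in Equations (1)–(5) and the studentized contrast can be traced with a short sketch. It assumes the probabilities are estimated from pixel counts, which is the usual practice but is not spelled out in the text; the function name `weight_of_evidence` and the counts in the example are hypothetical.

```python
import math

def weight_of_evidence(n_b_a, n_b_na, n_nb_a, n_nb_na):
    """WofE statistics for one class of a conditioning factor.

    n_b_a:   pixels with the class present and a landslide
    n_b_na:  pixels with the class present and no landslide
    n_nb_a:  pixels with the class absent and a landslide
    n_nb_na: pixels with the class absent and no landslide
    All four counts must be positive, otherwise the logs and variances are undefined.
    """
    n_a = n_b_a + n_nb_a        # all landslide pixels
    n_na = n_b_na + n_nb_na     # all non-landslide pixels
    w_plus = math.log((n_b_a / n_a) / (n_b_na / n_na))      # Equation (1)
    w_minus = math.log((n_nb_a / n_a) / (n_nb_na / n_na))   # Equation (2)
    contrast = w_plus - w_minus                             # C = W+ - W-
    var_plus = 1 / n_b_a + 1 / n_b_na                       # Equation (4)
    var_minus = 1 / n_nb_a + 1 / n_nb_na                    # Equation (5)
    s_c = math.sqrt(var_plus + var_minus)                   # Equation (3)
    return w_plus, w_minus, contrast, s_c, contrast / s_c   # last item: studentized contrast

# Hypothetical counts for one slope class (not values from the study).
w_plus, w_minus, contrast, s_c, studentized = weight_of_evidence(60, 9940, 168, 89832)
print(round(w_plus, 3), round(w_minus, 3), round(studentized, 2))
```

In this illustrative case the class holds a larger share of landslide pixels than of non-landslide pixels, so W<sup>+</sup> is positive and W<sup>−</sup> is negative, giving a positive contrast.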

#### 2.5.2. Support Vector Machine (SVM) Model

Among the different machine learning algorithms, SVM is an important supervised binary classifier that is based on the structural risk minimization principle [41–44]. The method constructs a separating hyperplane from the training dataset. The separating hyperplane is built in the original space of n coordinates (the x<sub>i</sub> parameters in vector x) between the points of two distinct classes [43]. SVM finds the maximum margin of separation between the classes and builds a classification hyperplane in the center of that margin [14,44]. If a point is located above the hyperplane, it is classified as +1; otherwise, it is classified as −1. The training points closest to the optimal hyperplane are called support vectors. Once the decision surface is acquired, new data can be classified [45]. Consider a training dataset of instance-label pairs (*X<sub>i</sub>*, *Y<sub>i</sub>*) with *X<sub>i</sub>* ∈ *R<sup>n</sup>*, *Y<sub>i</sub>* ∈ {+1, −1}, and *i* = 1, ..., *m*. To delineate the landslide susceptibility zones, X represents the vector space that includes rainfall, slope, aspect, elevation, geology, soil texture, land use/land cover, normalized difference vegetation index, distance from river, distance from lineament, distance from road, topographic wetness index, sediment transportation index, stream power index, geomorphology, and the seismic zone map. The +1 class indicates landslide pixels, whereas the −1 class indicates non-landslide pixels.

The aim of SVM is to find the optimal separating hyperplane that divides the training dataset into the two classes of landslides and non-landslides {+1, −1}. The separating hyperplane satisfies the following constraint:

$$Y\_{i}(W \cdot X\_{i} + b) \ge 1 - \xi\_{i} \tag{6}$$

where *W* is a coefficient vector that defines the orientation of the hyperplane in the feature space, *b* is the offset of the hyperplane from the origin, and ξ<sub>*i*</sub> represents the positive slack variables [46]. The optimization problem of determining the optimal hyperplane is solved using Lagrangian multipliers [47].

$$\text{Maximize}\sum\_{i=1}^{n}\alpha\_{i} - \frac{1}{2}\sum\_{i=1}^{n}\sum\_{j=1}^{n}\alpha\_{i}\alpha\_{j}Y\_{i}Y\_{j}(X\_{i} \cdot X\_{j})\tag{7}$$

$$\text{Subject to } \sum\_{i=1}^{n} \alpha\_i Y\_i = 0, \quad 0 \le \alpha\_i \le C \tag{8}$$

where α<sub>*i*</sub> represents the Lagrange multipliers, *C* is the penalty value, and the slack variables ξ<sub>*i*</sub> allow for penalized constraint violation. The decision function, which will be used for the classification of new data, can then be written as:

$$g(X) = \text{sign}\left(\sum\_{i=1}^{n} Y\_i \alpha\_i (X\_i \cdot X) + b\right) \tag{9}$$

If the two classes cannot be separated by a linear hyperplane, the original input data may be mapped into a high-dimensional feature space through a nonlinear kernel function. The classification decision function is then given by Equation (10):

$$g(X) = \text{sign}\left(\sum\_{i=1}^{n} Y\_i \alpha\_i K(X\_i, X) + b\right) \tag{10}$$

where *K*(*X<sub>i</sub>*, *X*) is the kernel function.

The linear kernel (LN), polynomial kernel (PL), radial basis function kernel (RBF), and sigmoid kernel (SIG) are the most popular kernel types for SVM analysis [14]. PL and RBF (the latter is also known as the Gaussian kernel) are the most commonly used kernels in the literature [43]. To prepare the landslide susceptibility map using SVM, we used the remote sensing (RS) software ENVI 4.7 (Environment for Visualizing Images), whose SVM classifier provides all four kernel types. The kernel equations are listed in Table 3.


**Table 3.** SVM kernel types and their equations.

(Source: Tien Bui et al. [46], Yao et al. [43]).
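To make the role of the kernel in the decision function concrete, the following sketch writes the four kernel families in their common textbook forms and evaluates a kernelized decision function of the kind given in Equation (10). The parameter names (`gamma`, `r`, `d`), their default values, and the two support vectors are illustrative assumptions; they are not the settings used in ENVI or in this study.

```python
import numpy as np

def linear_kernel(x, y):
    return float(x @ y)

def polynomial_kernel(x, y, gamma=1.0, r=1.0, d=2):
    return float((gamma * (x @ y) + r) ** d)

def rbf_kernel(x, y, gamma=1.0):
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))

def sigmoid_kernel(x, y, gamma=1.0, r=0.0):
    return float(np.tanh(gamma * (x @ y) + r))

def decision(x, support_vectors, labels, alphas, b, kernel):
    """sign(sum_i Y_i * alpha_i * K(X_i, x) + b), returned as +1 or -1."""
    score = sum(y_i * a_i * kernel(sv, x)
                for sv, y_i, a_i in zip(support_vectors, labels, alphas)) + b
    return 1 if score >= 0 else -1

# Hypothetical trained model with two support vectors (one per class).
support_vectors = [np.array([1.0, 1.0]), np.array([-1.0, -1.0])]
labels = [+1, -1]
alphas = [0.5, 0.5]
b = 0.0

for name, kernel in [("linear", linear_kernel), ("rbf", rbf_kernel)]:
    print(name, decision(np.array([2.0, 0.0]), support_vectors, labels, alphas, b, kernel))
```

Swapping the `kernel` argument is the only change needed to move between the four ensemble variants, which is why the same training samples can yield four different susceptibility maps.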

#### **3. Results**

#### *3.1. Considering the Multicollinearity Analysis of the Effective Factors*

The landslide conditioning factors were tested for multicollinearity. The results show that the lowest tolerance value of the landslide conditioning factors is 0.446 for rainfall and the highest tolerance value is 0.824 for slope (Table 4). The highest variance inflation factor (VIF) value is 2.241, and the lowest VIF value is 1.213 (Table 4). Because all tolerance values are greater than 0.1 and all VIF values are less than 10, there are no collinearity problems among these factors. Therefore, the selected 16 landslide conditioning factors are suitable for modeling landslide susceptibility.
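The two diagnostics are linked: the tolerance of a factor is 1 − R² from regressing that factor on all the others, and the VIF is its reciprocal (for example, 1/0.446 ≈ 2.24). The sketch below shows one way to compute both with NumPy. The function name and the synthetic data are assumptions for illustration only; the study's factor values are not reproduced here.

```python
import numpy as np

def tolerance_and_vif(X):
    """Tolerance (1 - R^2) and VIF (1 / tolerance) for each column of X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    tol = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        tol[j] = 1.0 - r2
    return tol, 1.0 / tol

rng = np.random.default_rng(0)
independent = rng.standard_normal((500, 3))             # unrelated synthetic factors
a, b = independent[:, 0], independent[:, 1]
collinear = np.column_stack([a, b, a + b + 0.05 * rng.standard_normal(500)])

tol_ind, vif_ind = tolerance_and_vif(independent)
tol_col, vif_col = tolerance_and_vif(collinear)
print("independent factors: VIF =", np.round(vif_ind, 3))
print("collinear factors:   VIF =", np.round(vif_col, 1))
```

Factors that pass the thresholds quoted above (tolerance > 0.1, VIF < 10) behave like the first synthetic set; the second set shows what a violation looks like.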

**Table 4.** Multicollinearity analysis of landslide conditioning factors.


#### *3.2. Relationship Between Landslide Location and Effective Factors*

The WofE values of each class of the explanatory variables indicate the degree of landslide occurrence (Table 5). The topographic factors of elevation, slope, and aspect are vital factors that determine the landslide susceptibility of an area. Areas at high elevations are more susceptible to landslides than lower-altitude areas. In the present study, the altitude range of 422 m to 985 m has the highest WofE value, which indicates a high susceptibility to landslides. The other sub-layers of elevation are comparatively less susceptible to landslides. Slope plays a vital role in landslide hazard assessment. When slope stability becomes weak, the tendency of landslide occurrence is very high. Therefore, high slope values indicate a high probability of landslide occurrence. In our study area, the slope sub-class of 36 to 79 degrees is more prone to landslides than the other sub-layers of slope because this sub-class attained the maximum WofE value. Aspect, the direction that a slope faces, is also correlated with the probability of landslide occurrence. In this study, south-facing slopes obtained the maximum WofE value, indicating a high susceptibility to landslides.

Heavy rainfall detaches soil and rock easily, leading to an increased probability of landslide occurrence. The study area is highly influenced by the monsoon rainfall from June to November, during which the tendency of landslide occurrence is very high. The rainfall sub-layer of 2167 mm to 2239 mm attained the highest WofE values and, therefore, has a higher risk of landslides than the other sub-layers of rainfall. Regarding the geology, the Darjeeling gneiss, Daling series, and Siwaliks geological segments attained the highest WofE values, suggesting the highest risk of landslides. The soil texture is strongly associated with the probability of landslide occurrence. The gravelly-loamy, gravelly-loamy to loamy-skeletal, and coarse-loamy soil texture classes, with WofE values of 23.44, 21.01 and 19.05, respectively, indicate a higher risk of landslide occurrence than the other soil texture classes.

River proximity also increases the chances of landslide occurrence. Areas nearest to rivers have a higher landslide risk than areas in more distant classes. Here, areas within 0 to 1.66 km of rivers have a high probability of landslide occurrence, with a WofE value of 14.78. Similarly, areas closest to roads and lineaments have a high probability of landslide occurrence based on the WofE values. In recent times, land use has had a strong influence on the occurrence of landslides. Our study area is categorized into five land use types; namely, water bodies, settlement, vegetation, fallow land, and agricultural land. The outcome of the WofE model indicates that fallow land has a higher risk of landslides than vegetation and other land uses. Areas with a high normalized difference vegetation index (NDVI) are less prone to landslide occurrence, and vice versa. Here, the −0.07 to 0.12 NDVI sub-class, with a WofE value of 33.27, is the most critical zone for landslide occurrence. The other sub-layers of NDVI indicate lower probabilities of landslide occurrence.

For the factors of TWI, STI, and SPI, the maximum values have the highest probability of landslide occurrence. Geomorphologically, the folded ridge and highly dissected mountain regions have the highest potential for landslide occurrence, with WofE values of 15 and 33, respectively. Comparatively, the hilly and mountainous regions are more prone to landslides than the plain and plateau regions. Seismologically, the high seismic zone is more susceptible to landslide occurrence than the low seismic zone.

All sub-layers of the different landslide conditioning factors were assigned a weight by the WofE model in the GIS environment. The weighted layers were then converted to a raster layer to prepare the landslide susceptibility map. Before the landslide susceptibility mapping, the weighted (by WofE) layers were reclassified as the input data layers of the support vector machine (SVM) for ensembling with WofE.


**Table 5.** Spatial relationship between landslide conditioning factors and landslide occurrence extracted by the weight-of-evidence (WofE) model.



#### *3.3. Landslide Susceptibility Models*

The support vector machine is an important machine learning algorithm that is used to assess an area's susceptibility to landslides and other natural hazards. In the present study, the SVM classification was used and ensembled with WofE. The landslide conditioning factors; namely, elevation, slope, aspect, rainfall, geology, soil texture, land use/land cover, normalized difference vegetation index (NDVI), distance from river, distance from road, distance from lineament, topographic wetness index (TWI), stream power index (SPI), sediment transportation index (STI), geomorphology, and the seismic zone map, were used as the input of the SVM classification. The probability values of the SVM classification range from 0 to 1. Each pixel therefore carries a landslide susceptibility index between 0 and 1, where 0 represents stable conditions and 1 indicates a high chance of landslide occurrence. The SVM classifier has four kernel types; namely, radial basis function, linear, polynomial, and sigmoid, and each of these was applied in the SVM classification. The output images created by the SVM classification were integrated and used to prepare the landslide susceptibility maps (LSMs) in the GIS environment.

The four landslide susceptibility maps (LSMs) shown in Figure 5a–d were prepared using the four ensemble models of WofE and SVM; namely, WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM. These LSMs were classified into four categories; namely, low, medium, high, and very high susceptibility to landslides, using the natural breaks classification method in the GIS environment. In the WofE & RBF-SVM ensemble map, the four landslide susceptibility classes of low, medium, high, and very high covered 1071 km<sup>2</sup> (34%), 813 km<sup>2</sup> (25.8%), 635 km<sup>2</sup> (20.2%), and 630 km<sup>2</sup> (20%) of the districts, respectively (Table 6 and Figure 6). In the WofE & Linear-SVM model, the low, medium, high, and very high landslide susceptibility classes covered an area of 1128 km<sup>2</sup> (35.8%), 918 km<sup>2</sup> (29.1%), 630 km<sup>2</sup> (20%), and 474 km<sup>2</sup> (15%), respectively (Table 6). In the WofE & Polynomial-SVM model, the low, medium, high, and very high susceptibility classes covered an area of 1095 km<sup>2</sup> (34.8%), 944 km<sup>2</sup> (30%), 608 km<sup>2</sup> (19.3%), and 501 km<sup>2</sup> (15.9%), respectively (Table 6). Meanwhile, in the WofE & Sigmoid-SVM ensemble landslide map, the classes of low, medium, high, and very high landslide susceptibility covered 1153 km<sup>2</sup> (36.6%), 893 km<sup>2</sup> (28.3%), 605 km<sup>2</sup> (19.2%), and 497 km<sup>2</sup> (15.8%) of the area, respectively (Table 6).


**Table 6.** Areal distribution of ensemble model landslide susceptibility maps (LSMs).

**Figure 5.** Landslide susceptibility maps (LSMs) produced by different ensemble models: (**a**) WofE & RBF-SVM, (**b**) WofE & Linear-SVM, (**c**) WofE & Polynomial-SVM, (**d**) WofE & Sigmoid-SVM.

**Figure 6.** Areal distributions of LSMs: (**a**) area distribution of LSMs, (**b**) percentage of area distribution of LSMs.

#### *3.4. Validation and Comparison of Models*

The landslide susceptibility maps of Darjeeling and Kalimpong districts were prepared by the ensembles of WofE and SVM. These LSMs were then validated using the receiver operating characteristic (ROC) curve, which evaluates the accuracy of the models [48–56]. The ROC curve plots the false positive rate (1 − specificity) on the X-axis against the true positive rate (sensitivity) on the Y-axis [57]. ROC curves have been extensively used for the assessment of susceptibility maps [8,12,15,58–66]. In the present study, 98 (30%) of the 326 landslides were used to validate the landslide susceptibility maps. The area under the curve (AUC) values of the ensemble models WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM are 87%, 90%, 88%, and 85%, respectively, indicating that they are very good models for the identification of landslide hazard zones (Figure 7a–d). Based on the results of the ROC curves, the WofE & Linear-SVM model is considered more accurate (AUC = 90%) than the other three ensemble models.
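The construction of the ROC curve and its AUC can be illustrated with a short sketch. It sorts validation pixels by susceptibility score, accumulates true and false positive rates, and integrates with the trapezoidal rule. The scores and labels below are made-up values, and the simple sort assumes no tied scores; this is not the validation dataset of the study.

```python
import numpy as np

def roc_curve(scores, labels):
    """Return (fpr, tpr) for binary labels (1 = landslide, 0 = non-landslide)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")   # highest susceptibility first
    sorted_labels = labels[order]
    tp = np.cumsum(sorted_labels)                # true positives at each threshold
    fp = np.cumsum(1 - sorted_labels)            # false positives at each threshold
    tpr = np.concatenate([[0.0], tp / tp[-1]])   # sensitivity
    fpr = np.concatenate([[0.0], fp / fp[-1]])   # 1 - specificity
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Hypothetical susceptibility scores for eight validation pixels.
scores = [0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

fpr, tpr = roc_curve(scores, labels)
print("AUC =", auc(fpr, tpr))
```

An AUC of 0.5 corresponds to a map that ranks landslide and non-landslide pixels no better than chance, and 1.0 to a perfect ranking, which is the scale on which the 85% to 90% values above should be read.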

**Figure 7.** Validation of LSMs using the ROC curve showing the area under the curve (AUC): (**a**) WofE & RBF-SVM, (**b**) WofE & Linear-SVM, (**c**) WofE & Polynomial-SVM, (**d**) WofE & Sigmoid-SVM.

It is not sufficient to validate the susceptibility models with only one validation method because this can lead to erroneous results if the samples are randomly distributed across the basin. Therefore, it is essential to cross check the validation result using another suitable validation method. In the present study, the quality sum (Qs) index was used as a second method to assess the accuracy and compare the landslide susceptibility models. Abedini and Tulabi [67] used the Qs method for landslide hazard assessment. In the Qs method, greater values indicate a higher accuracy and correctness of the landslide susceptibility map, whereas lower values indicate lower accuracy [67]. To evaluate this index, the density ratio (Dr) was first calculated using Equation (11).

$$\mathbf{D}\_{\mathbf{r}} = \frac{\mathbf{S}\_{i} / \mathbf{A}\_{i}}{\sum\_{i=1}^{n} \mathbf{S}\_{i} / \sum\_{i=1}^{n} \mathbf{A}\_{i}} \tag{11}$$

where *S<sub>i</sub>* is the sum of the area of the landslides in each risk class, *A<sub>i</sub>* is the area of the risk class, and *n* is the number of risk classes in a zonation map. The Qs index is shown in Equation (12).

$$\mathbf{Q}\_{\mathbf{s}} = \sum\_{\mathbf{i}=1}^{n} (\mathbf{D}\_{\mathbf{r}} - \mathbf{1})^{2} \times \mathbf{S} \tag{12}$$

where *Q<sub>s</sub>* is the quality sum index, *D<sub>r</sub>* is the density ratio, and *S* is the areal ratio of each risk class to the total area. The Qs method is a reliable validation technique that is calculated from the landslide distribution and the landslide hazard map using Equation (12). The four ensemble models in this study obtained the following Qs values: the WofE & RBF-SVM ensemble model scored 2.10; the WofE & Linear-SVM ensemble model scored 2.24; the WofE & Polynomial-SVM ensemble model scored 2.10; and the WofE & Sigmoid-SVM ensemble model scored 2.18 (Table 7). In line with the ROC results, the Qs validation results also indicate that the WofE & Linear-SVM model is more accurate than the other ensemble models.
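A short sketch may help to show how Equations (11) and (12) combine. The class areas below are those reported for the WofE & RBF-SVM map in Section 3.3, whereas the landslide areas per class are hypothetical placeholders, so the printed Qs value is illustrative and does not reproduce Table 7. The function name is likewise an assumption.

```python
import numpy as np

def quality_sum(landslide_area, class_area):
    """Density ratio per class (Eq. 11) and quality sum index Qs (Eq. 12)."""
    s_i = np.asarray(landslide_area, dtype=float)   # landslide area in each class
    a_i = np.asarray(class_area, dtype=float)       # total area of each class
    density_ratio = (s_i / a_i) / (s_i.sum() / a_i.sum())
    areal_ratio = a_i / a_i.sum()                   # S in Eq. (12)
    qs = float(np.sum((density_ratio - 1.0) ** 2 * areal_ratio))
    return density_ratio, qs

class_area = [1071, 813, 635, 630]       # km^2: low, medium, high, very high
landslide_area = [0.5, 1.5, 4.0, 9.0]    # km^2: hypothetical values

dr, qs = quality_sum(landslide_area, class_area)
print("Dr per class:", np.round(dr, 2))
print("Qs:", round(qs, 2))
```

If landslides were spread evenly over all classes, every density ratio would equal 1 and Qs would be 0; the index grows as landslides concentrate in the classes the map labels as more susceptible.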


**Table 7.** Mathematical Calculation of Qs Method of Ensemble LSMs.

#### **4. Discussion**

Landslide susceptibility maps play a vital role in helping stakeholders make suitable decisions in landslide-prone areas. Landslide events not only cost human lives, but also destroy residential areas, roads, and agricultural fields. The assessment of landslide hazards using LSMs performed in this study is an important tool to mitigate landslide hazards, sustain the environment, and help the residents of high-risk landslide susceptibility zones. In this study, ensemble models of weight-of-evidence (WofE) and SVM were used to prepare the landslide susceptibility maps (LSMs). Different statistical, knowledge-driven, probabilistic, and machine learning models have been used to recognize which areas are at severe risk of landslide occurrence. Several past studies have produced landslide susceptibility maps using different methods and models, such as landslide numerical risk factor (LNRF), frequency ratio (FR), analytical hierarchical process (AHP), SVM, artificial neural network (ANN), logistic regression (LR), conditional probability (CP), multi-criteria decision approach (MCDA), and bivariate and multivariate statistical models [8,9,60–65]. These studies determined the critical zones of landslide risk in their respective study regions. However, in the present study, a new ensemble technique was used, which has shown better results than those of previous studies. An ensemble of two or three models may provide better results than any single model. In the present study, landslide susceptibility maps were prepared using ensemble models of WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM. These models are reliable and accurate in this field. The landslide susceptibility maps were created using landslide inventory data (326 landslides) and landslide conditioning factors (16 environmental factors). The LSMs produced by the ensemble models were classified into four susceptibility classes; namely, low, medium, high, and very high susceptibility to landslide occurrence. The very high susceptibility zones of the WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM models cover areas of 630 km<sup>2</sup> (20%), 474 km<sup>2</sup> (15%), 501 km<sup>2</sup> (15.9%), and 497 km<sup>2</sup> (15.8%), respectively.

The landslide susceptibility maps (LSMs) were validated and compared using the receiver operating characteristic (ROC) and quality sum (Qs) validation methods. Based on these validation methods, all models are considered very good to excellent. A high-resolution DEM for this area is not freely available, which was the main challenge in this study. If high-resolution images were used for the extraction of landslide conditioning factors instead of a 30 m DEM, these methods could be used to model landslide susceptibility at a micro level and achieve better results [68,69]. Of the four ensemble models, the landslide map produced by the WofE & Linear-SVM model is more suitable and accurate than those produced by the other models. The areal distribution of the landslide susceptibility maps is shown in Figure 6. In the present study, the very high susceptibility zones are found in the middle portion of the study area. The areas in these districts closer to roads, such as the NH-31 road, Rohini road, Rishi road, Darjeeling road, Sevoke road, and Sikkim-Kalimpong roads, are highly affected by landslides. The Teesta River is the major river in these districts, and the areas closer to it form the most critical zone of landslide susceptibility. The Lish, Mahananda, and Torsha catchments are major catchments that are highly affected by landslides. The other critical landslide areas are Sukhia-Pokhari, Kurseong, Sevoke, Majua tea garden, and Kalimpong. The main factors determining landslide risk in these regions are heavy rainfall, steep slope, elevation, soil texture, geology, distance from road, and LULC. During the monsoon season, these areas are strongly affected by landslides due to heavy rainfall. These regions are also affected by high seismic intensity, which is an important cause of landslides. Overall, the study delineates the landslide risk zones of Darjeeling and Kalimpong districts. This study will help the government to mitigate the effects of landslides and strengthen public awareness for sustainable development.

#### **5. Conclusions**

Landslides are very harmful natural hazards that cost human lives and cause widespread damage to roads, residences, gardens, and agricultural land. In this study, the weight-of-evidence (WofE) and SVM models were ensembled to produce landslide susceptibility maps (LSMs) for the Darjeeling and Kalimpong districts. The ensemble approach is an appropriate method for landslide susceptibility mapping that provides better results than using a single model. The four LSMs produced in this study were classified into four categories; namely, low, medium, high, and very high susceptibility to landslide occurrence. In the various models, the very high susceptibility class covered 20% (WofE & RBF-SVM model), 15% (WofE & Linear-SVM model), 15.9% (WofE & Polynomial-SVM model), and 15.8% (WofE & Sigmoid-SVM model) of the study area, respectively. The very high landslide-prone areas are mainly located in the southern and middle parts of Darjeeling and Kalimpong districts. In particular, the Lish catchment area, Teesta catchment area, Sevoke road, and Majua tea garden areas are highly susceptible to landslide occurrences. The results of the ensemble models were validated using the Qs index and ROC methods. Both validation methods confirmed the landslide susceptibility maps produced by the WofE & RBF-SVM, WofE & Linear-SVM, WofE & Polynomial-SVM, and WofE & Sigmoid-SVM ensemble methods as being excellent and appropriate. Of the four ensemble models, the WofE & Linear-SVM model was found to be more accurate than the other ensemble models. This work helps to increase the awareness of the public and the government and aims to reduce the impact of landslides by providing steps and suitable strategies for hazard mitigation. Some necessary steps and techniques are essential in the very high landslide risk zones of the study area. Identification of faults and weak geological regions, proper drainage management, and afforestation programs in landslide-prone areas may reduce the landslide risks. The results obtained from this study can provide significant information to the decision-makers and policy planners in the landslide-prone areas of these districts.

**Author Contributions:** Methodology, J.R., S.S., and A.A.; formal analysis, J.R., and S.S.; investigation, J.R., S.S., and A.A.; writing—original draft preparation, J.R., S.S., and A.A.; writing—review and editing, J.R., S.S., A.A., T.B., and D.T.B.

**Funding:** This research was partly funded by the Austrian Science Fund (FWF) through the Doctoral College GIScience (DK W 1237-N23) at the University of Salzburg.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

# *Article* **Sequential InSAR Time Series Deformation Monitoring of Land Subsidence and Rebound in Xi'an, China**

#### **Baohang Wang, Chaoying Zhao \*, Qin Zhang and Mimi Peng**

School of Geology Engineering and Geomatics, Chang'an University, Xi'an 710054, China; 2018026011@chd.edu.cn (B.W.); dczhangq@chd.edu.cn (Q.Z.); 2018026013@chd.edu.cn (M.P.) **\*** Correspondence: cyzhao@chd.edu.cn; Tel.: +86-29-8233-9251

Received: 7 October 2019; Accepted: 29 November 2019; Published: 1 December 2019

**Abstract:** Interferometric synthetic aperture radar (InSAR) time series deformation monitoring plays an important role in revealing historical displacement of the Earth's surface. Xi'an, China, has suffered from severe land subsidence along with ground fissure development since the 1960s, which has threatened and will continue to threaten the stability of urban artificial constructions. In addition, some local areas in Xi'an suffered from uplifting for some specific period. Time series deformation derived from multi-temporal InSAR techniques makes it possible to obtain the temporal evolution of land subsidence and rebound in Xi'an. In this paper, we used the sequential InSAR time series estimation method to map the ground subsidence and rebound in Xi'an with Sentinel-1A data during 2015 to 2019, allowing estimation of surface deformation dynamically and quickly. From 20 June 2015 to 17 July 2019, two areas subsided continuously (Sanyaocun-Fengqiyuan and Qujiang New District), while Xi'an City Wall area uplifted with a maximum deformation rate of 12 mm/year. Furthermore, Yuhuazhai subsided from 20 June 2015 to 14 October 2018, and rebound occurred from 14 October 2018 to 17 July 2019, which can be explained as the response to artificial water injection. In the process of artificial water injection, the rebound pattern can be further divided into immediate elastic recovery deformation and time-dependent visco-elastic recovery deformation.

**Keywords:** sequential estimation; InSAR time series; groundwater; land subsidence and rebound

#### **1. Introduction**

Interferometric synthetic aperture radar (InSAR) is a remote sensing technique that has been commonly used in the investigation of large-scale ground deformation. Land subsidence in urban areas has been investigated with InSAR in Las Vegas, USA [1], Houston–Galveston, USA [2], Mexico City, Mexico [3], northeast Iran [4], West Thessaly Basin, Greece [5], Pisa urban area, Italy [6], Rome metropolitan area, Italy [7], Beijing [8], Tianjin [9], Taiyuan [10] and Datong, China [11].

Xi'an, China, has suffered from severe land subsidence and ground fissure hazards since the 1960s [12–14]. During the progress of economic development and urbanization, groundwater was over-exploited for more than 50 years [15]. Consequently, it caused the formation of fourteen ground fissures accompanying land subsidence throughout the city [14,15]. The maximum land subsidence rate reached 300 mm/year in 1996, and the maximum cumulative subsidence reached approximately 3 m over the past 60 years [16].

In order to alleviate the land subsidence and ground fissures caused by over-extraction of groundwater, artificial water injection and limitation of pumpage are two effective measures. In 1996, a policy of limiting the over-pumping of groundwater was issued, and the deformation rate began to decrease [17]. When an aquifer water level rises during artificial water injection, the rebound can be divided into short-term elastic recovery and time-dependent visco-elastic recovery [18]. A previous study revealed the uplift phenomena at Jixiangcun (point D in Figure 4) in Xi'an from 2012 to 2018 [19]. In this study, we used the sequential estimation method to map the surface deformation between 2015 and 2019 in Xi'an with 83 Sentinel-1A SAR images in terrain observation with progressive scans (TOPS) mode. Results show that, during the analyzed period, some areas, such as Sanyaocun-Fengqiyuan and Qujiang New District, subsided continuously, areas such as Xi'an City Wall uplifted slowly, and areas such as Yuhuazhai rebounded after long-term subsidence.

This paper is organized as follows: Section 2 describes the sequential InSAR time series estimation method. Section 3 shows the study area and data. Section 4 shows three different surface deformation phenomena, including continuous land subsidence, uplift and rebound after long-term subsidence. Finally, a discussion on rebound deformation and conclusions are given in Sections 5 and 6, respectively.

#### **2. Methodology**

The flow chart of data processing is shown in Figure 1, which includes three core steps: selection of coherent pixels, three-dimensional (3D) phase unwrapping and sequential estimation of deformation parameters.

**Figure 1.** Flow chart of sequential InSAR time series estimation.

#### *2.1. Selection of Coherent Pixels*

To mitigate the effects of decorrelation and retrieve large-gradient deformation, the small baseline subset (SBAS) InSAR method was proposed based on the interferograms with short spatial and temporal baselines [20]. In this paper, we use the temporal coherence to select coherent pixels [21,22], which is defined in Equation (1) for one generic pixel *x*:

$$\gamma\_x = \frac{1}{N} \left| \sum\_{i=1}^{N} \exp \left\{ \sqrt{-1} \left(\psi\_{x,i} - \widetilde{\psi}\_{x,i} - \Delta \widehat{\phi}\_{\theta,x,i}^u\right) \right\} \right| \tag{1}$$

where *N* is the number of interferograms, ψ represents the flattened and topographically corrected interferogram, ψ̃ represents the spatially correlated phase component, and Δφ̂<sup>*u*</sup><sub>θ</sub> represents the spatially uncorrelated phase component (look-angle error phase). A more detailed introduction is provided in [21,22].
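The following NumPy sketch evaluates the temporal coherence of Equation (1) for a stack of interferograms and applies a selection threshold. The array layout (interferograms along the first axis), the synthetic phases, and the threshold of 0.7 are assumptions made for illustration; the threshold used in this study is not stated in this section.

```python
import numpy as np

def temporal_coherence(psi, psi_sc, dphi_u):
    """Temporal coherence per pixel, following Equation (1).

    All inputs are phase arrays in radians with shape (n_ifg, n_pixels):
    psi    : flattened, topographically corrected interferogram phase
    psi_sc : spatially correlated phase component
    dphi_u : spatially uncorrelated look-angle error phase
    """
    residual = np.asarray(psi) - np.asarray(psi_sc) - np.asarray(dphi_u)
    n_ifg = residual.shape[0]
    return np.abs(np.exp(1j * residual).sum(axis=0)) / n_ifg

n_ifg = 8
stable_pixel = np.zeros(n_ifg)                           # residual phase always zero
noisy_pixel = 2.0 * np.pi * np.arange(n_ifg) / n_ifg     # residuals spread around the circle
psi = np.column_stack([stable_pixel, noisy_pixel])
no_correction = np.zeros_like(psi)

gamma = temporal_coherence(psi, no_correction, no_correction)
selected = gamma > 0.7       # hypothetical coherence threshold
print(gamma, selected)
```

A pixel whose residual phase is constant over time reaches a coherence of 1, whereas residuals scattered uniformly around the unit circle cancel and drive the coherence towards 0.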

For phase unwrapping, the 3D phase unwrapping method was employed to mitigate the closed-loop discontinuity errors that occur in two-dimensional (2D) phase unwrapping [23]. It exploits the spatial and temporal relationships among the multiple interferograms by computing two Delaunay triangulations, which are usually referred to as the "temporal" and "spatial" triangulations, respectively [24,25].

#### *2.2. Sequential InSAR Time Series Estimation*

After the atmospheric phase, orbital and digital elevation model (DEM) errors were removed from the interferograms, we estimated the time series deformation phases by using the following function model:

$$
\begin{bmatrix}
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \dots & 0 & -1 & 1 \\
\end{bmatrix}
\begin{bmatrix}
\varphi\_1 \\
\varphi\_2 \\
\vdots \\
\varphi\_N \\
\end{bmatrix} =
\begin{bmatrix}
\mathrm{unw}\_1 \\
\mathrm{unw}\_2 \\
\vdots \\
\mathrm{unw}\_M \\
\end{bmatrix}
\tag{2}
$$

where ϕ<sub>*i*</sub> (*i* = 1, · · · , *N*) denotes the deformation phase at each synthetic aperture radar (SAR) acquisition date, *N* is the number of SAR images, and unw<sub>*j*</sub> (*j* = 1, · · · , *M*) denotes the unwrapped phase of the *j*-th interferogram. Note that the deformation at the first SAR acquisition date is set to zero, i.e., ϕ<sub>0</sub> = 0. To estimate the deformation time series, the archived SAR data are modeled as:

$$\begin{aligned} \mathbf{V}\_{1} &= \mathbf{A}\_{1}\mathbf{X}^{(1)} - \mathbf{L}\_{1}, \quad \mathbf{P}\_{1} \\ \mathbf{X}^{(1)} &= \left(\mathbf{A}\_{1}^{\mathsf{T}}\mathbf{P}\_{1}\mathbf{A}\_{1}\right)^{-1}\mathbf{A}\_{1}^{\mathsf{T}}\mathbf{P}\_{1}\mathbf{L}\_{1} \\ \mathbf{Q}\_{X^{(1)}} &= \left(\mathbf{A}\_{1}^{\mathsf{T}}\mathbf{P}\_{1}\mathbf{A}\_{1}\right)^{-1} \end{aligned} \tag{3}$$

where **L**<sub>1</sub> is the archived SAR data with design matrix **A**<sub>1</sub> and weight matrix **P**<sub>1</sub>, **X**<sup>(1)</sup> indicates the first estimation of parameter **X**, and **Q**<sub>*X*<sup>(1)</sup></sub> is its cofactor matrix. The superscript T stands for the transposition of a matrix.

When we acquire a new SAR image, unlike conventional SBAS InSAR, to estimate deformation time series for all SAR images again we use the sequential estimation to update dynamically the deformation time series by only considering the unwrapped interferograms related to the new SAR image. Assuming the new measurements *L*<sup>2</sup> are the unwrapped interferograms related to the (*N* + 2)−th new SAR acquisition, the weight matrix is **P**2, the design matrixes are **A**<sup>2</sup> and *B*, and parameters are *X* and *Y*, we can write its observational equation as follows:

$$\mathbf{V}\_2 = \begin{bmatrix} \mathbf{A}\_2 & \mathbf{B} \end{bmatrix} \begin{bmatrix} \mathbf{X}^{(2)} \\ \mathbf{Y} \end{bmatrix} - \mathbf{L}\_2 \quad , \quad \mathbf{P}\_2 \tag{4}$$

According to the principle of least squares (LS) Bayesian estimation [26], it holds that:

$$\mathbf{V}\_2^\mathsf{T}\mathbf{P}\_2\mathbf{V}\_2 + \left(\mathbf{X}^{(2)} - \mathbf{X}^{(1)}\right)^\mathsf{T}\mathbf{Q}\_{X^{(1)}}^{-1}\left(\mathbf{X}^{(2)} - \mathbf{X}^{(1)}\right) = \min \tag{5}$$

Then, we can deduce Equation (6) through Equations (3), (4) and (5) [26,27] as follows:

$$\begin{aligned} \mathbf{X}^{(2)} &= \mathbf{X}^{(1)} + \mathbf{J}\_{x}\left(\overline{\mathbf{L}}\_{2} - \mathbf{B}\mathbf{Y}\right) \\ \mathbf{Y} &= \left(\mathbf{B}^{\mathsf{T}}\mathbf{Q}\_{J}^{-1}\mathbf{B}\right)^{-1}\mathbf{B}^{\mathsf{T}}\mathbf{Q}\_{J}^{-1}\left(\mathbf{L}\_{2} - \mathbf{A}\_{2}\mathbf{X}^{(1)}\right) \\ \mathbf{Q}\_{X^{(2)}} &= \mathbf{Q}\_{X^{(1)}} - \mathbf{J}\_{x}\mathbf{A}\_{2}\mathbf{Q}\_{X^{(1)}} + \mathbf{J}\_{x}\mathbf{B}\mathbf{Q}\_{Y}\mathbf{B}^{\mathsf{T}}\mathbf{J}\_{x}^{\mathsf{T}} \\ \mathbf{Q}\_{X^{(2)},Y} &= -\mathbf{J}\_{x}\mathbf{B}\mathbf{Q}\_{Y} \\ \mathbf{Q}\_{Y} &= \left(\mathbf{B}^{\mathsf{T}}\mathbf{Q}\_{J}^{-1}\mathbf{B}\right)^{-1} \\ \mathbf{J}\_{x} &= \mathbf{Q}\_{X^{(1)}}\mathbf{A}\_{2}^{\mathsf{T}}\mathbf{Q}\_{J}^{-1} \\ \mathbf{Q}\_{J} &= \mathbf{P}\_{2}^{-1} + \mathbf{A}\_{2}\mathbf{Q}\_{X^{(1)}}\mathbf{A}\_{2}^{\mathsf{T}} \end{aligned} \tag{6}$$

where [*X*<sup>(2)</sup>; *Y*] is the updated deformation time series, in which *Y* is the cumulative deformation at the new SAR acquisition date, and **Q**<sub>[*X*(2);*Y*]</sub> represents the corresponding cofactor matrices. **J**<sub>*x*</sub> is the gain matrix, and **Q**<sub>*J*</sub> is the updated cofactor matrix with the new SAR image. The deformation parameters can therefore be updated as soon as a new SAR image becomes available. For a more detailed discussion of the sequential estimation of dynamic deformation parameters in SBAS-InSAR, readers can refer to [28].
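The update in Equations (3)–(6) can be sketched numerically as follows. This is an illustrative NumPy example on made-up data for a single pixel, not the authors' implementation. It assumes that the overlined term in Equation (6) equals **L**<sub>2</sub> − **A**<sub>2</sub>*X*<sup>(1)</sup>, which is what minimizing Equation (5) implies, and all function and variable names are hypothetical. The sequential result is compared with a batch solution that uses all interferograms at once.

```python
import numpy as np


def initial_estimate(A1, P1, L1):
    """Weighted least-squares estimate from archived data, Equation (3)."""
    Q_x1 = np.linalg.inv(A1.T @ P1 @ A1)   # cofactor matrix of X^(1)
    X1 = Q_x1 @ A1.T @ P1 @ L1
    return X1, Q_x1


def sequential_update(X1, Q_x1, A2, B, P2, L2):
    """Update X and estimate the new parameter Y, Equation (6)."""
    Q_j = np.linalg.inv(P2) + A2 @ Q_x1 @ A2.T
    Q_j_inv = np.linalg.inv(Q_j)
    J_x = Q_x1 @ A2.T @ Q_j_inv            # gain matrix
    Q_y = np.linalg.inv(B.T @ Q_j_inv @ B)
    L2_bar = L2 - A2 @ X1                  # new data minus prediction from X^(1)
    Y = Q_y @ B.T @ Q_j_inv @ L2_bar
    X2 = X1 + J_x @ (L2_bar - B @ Y)
    Q_x2 = Q_x1 - J_x @ A2 @ Q_x1 + J_x @ B @ Q_y @ B.T @ J_x.T
    Q_x2y = -J_x @ B @ Q_y
    return X2, Y, Q_x2, Q_x2y, Q_y


# Synthetic example: 4 archived acquisitions (phi_0 = 0) plus 1 new one.
A1 = np.array([[1, 0, 0], [0, 1, 0], [-1, 1, 0], [-1, 0, 1], [0, -1, 1]], float)
A2 = np.array([[0, -1, 0], [0, 0, -1]], float)   # new pairs start at phi_2, phi_3
B = np.array([[1.0], [1.0]])                     # and end at the new date
P1 = np.diag([1.0, 2.0, 1.0, 0.5, 1.0])          # illustrative weights
P2 = np.diag([1.0, 2.0])
L1 = np.array([1.1, 2.4, 1.6, 1.9, 0.4])         # made-up unwrapped phases
L2 = np.array([1.8, 1.1])

X1, Q_x1 = initial_estimate(A1, P1, L1)
X2, Y, Q_x2, Q_x2y, Q_y = sequential_update(X1, Q_x1, A2, B, P2, L2)

# Batch solution using all interferograms at once, for comparison.
A_all = np.block([[A1, np.zeros((5, 1))], [A2, B]])
P_all = np.block([[P1, np.zeros((5, 2))], [np.zeros((2, 5)), P2]])
Q_all = np.linalg.inv(A_all.T @ P_all @ A_all)
X_all = Q_all @ A_all.T @ P_all @ np.concatenate([L1, L2])

print(np.allclose(np.concatenate([X2, Y]), X_all))
print(np.allclose(Q_x2, Q_all[:3, :3]))
```

The only matrices inverted in the update have the size of the new observations and of the new parameter, so the cost of adding an acquisition does not grow with the number of archived interferograms, provided **Q**<sub>*X*(1)</sub> is stored from the previous step.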

#### **3. Study Area and Data**

#### *3.1. Study Area*

Xi'an, the capital of Shaanxi province, China, is bounded by the Chan River and Ba River to the east, the Feng River to the west, the Wei River to the north, and the Qinling mountains to the south. The elevation of the study area varies from 360 to 750 m. Figure 2 shows the Quaternary geology map of the study area, where the Chang'an-Lintong fault (CAF hereinafter) and 14 ground fissures are superimposed, and loess ridge areas are labeled with white blocks. The terrain of Xi'an gradually inclines from the north-west to the south-east, and the landform gradually transforms from a flood plain to loess tableland terraces. Loess ridges and depressions alternate in the central urban areas, where land subsidence and the 14 ground fissures have occurred [19]. The CAF (mainly in the ENE direction) is the most active fault to the south of Xi'an city and controls the activity of the 14 ground fissures, which occur on its hanging wall. The ground fissures, in turn, affect the land subsidence, constraining the subsidence areas to develop into elliptical shapes with their long axes parallel to the fissure direction, i.e., in the north-east (NE) direction [14,16,17].

As to the hydrogeological conditions, three main aquifers are present in the Xi'an stratum: the phreatic aquifer, the first artesian aquifer, and the second artesian aquifer. The bottom of the phreatic aquifer varies from 30 to 80 m below the ground surface. The primary constituents of the phreatic aquifer are fine sand and clay, and its water quality is poor. The bottom of the first artesian aquifer ranges from 140 to 180 m below the ground surface. This aquifer consists of sand, loess, gravel, and mudstone, and its water quality is good. The bottom of the second artesian aquifer varies from 170 to 300 m below the surface. This aquifer has good water quality, and mainly includes fine-medium sand and medium-coarse sand [14,17]. Xi'an has a temperate, semi-humid continental monsoon climate, with an annual precipitation of about 585 mm, and, with an urban population of 7 million, it suffers from a shortage of water resources. The volume of groundwater withdrawal increased annually from 1980 to 1994, amounting to 1388 million m<sup>3</sup>/year in 1994 [17]. Owing to the restriction of groundwater exploitation in Xi'an since 1996, Heihe water has become the main water supply, leading to a decrease in groundwater withdrawal from 1996 to 2010 [17,29]. Moreover, a cumulative volume of 1,552,800 m<sup>3</sup> was recharged in Xi'an from 2009 to 2014.

**Figure 2.** Quaternary geology map of Xi'an, where Chang'an-Lintong fault (CAF) and 14 ground fissures are superimposed, and loess ridge areas are labeled with white blocks.

Yuhuazhai has been one of the areas of most severe deformation in Xi'an since 1996. The local hydrogeological conditions are those of a loose multi-layer porous aquifer system, in which the groundwater comprises porous phreatic water, a shallow confined aquifer and a deep confined aquifer. The deep confined aquifer was mainly pumped at a depth of 100 to 300 m below the surface until 2018 [30]. With the decline of the water level in the pumping wells, a pressure difference arose between the pumping wells and the surrounding aquifer, which drove the water in the surrounding soil toward the pumping wells [31]. Under this situation, land subsidence accelerated and a subsidence funnel formed in the aquifer system. Over-exploitation of groundwater leads to a decrease of the groundwater level. Although the confined aquifer is elastic, continuous over-exploitation of groundwater leads to irrecoverable deformation of the confined aquifer, which further leads to land subsidence [18].

In order to study the spatiotemporal characteristics of subsidence and ground fissures, both global positioning system (GPS) and InSAR observations between 2005 and 2007 were employed [32,33]. Then, Envisat, advanced land observation satellite (ALOS) and TerraSAR SAR datasets were used to investigate the two-dimensional deformation in Xi'an from 2005 to 2012 [34]. Furthermore, European remote sensing satellite (ERS), Envisat and Sentinel SAR datasets were used to investigate the long-term deformation evolution and causative factors of land subsidence and ground fissures in Xi'an from 2003 to 2017 [35]. Recently, multi-sensor SAR datasets (ALOS, TerraSAR and Sentinel) were used to investigate spatiotemporal land deformation over Xi'an from 2012 to 2018, where the surface deformation along three Xi'an subway lines was analyzed for the first time [19]. Previous studies show that there has been a close spatiotemporal relationship between land subsidence and the formation of earth fissures. The degradation of the aquifer system led to these typical deformations and threatened urban infrastructure.

#### *3.2. Data*

In total, 83 Sentinel-1A images acquired over Xi'an, China, from 20 June 2015 to 17 July 2019 were used to generate 365 differential interferograms by setting spatial and temporal baseline thresholds. For the sequential InSAR time series deformation processing, we divided the SAR data into two groups. As shown in Figure 3, the first group is the archived SAR data, used to generate interferograms (blue lines) with SBAS technology for parameter initialization, while the second group is the newly received SAR images (i.e., new SAR acquisitions), which are connected to older archived SAR images to generate new interferograms (green lines) with SBAS technology and update the deformation parameters. Specifically, we used the estimated deformation parameters and their cofactor matrices to update the deformation parameters dynamically, one by one, including the time series, deformation rate and DEM error, by considering only the newly generated interferograms. Because the results estimated by conventional SBAS and sequential SBAS inversion are exactly consistent in terms of deformation rate, DEM error and deformation time series [17], without loss of generality, we took the first 70 SAR images, from 20 June 2015 to 18 January 2019, as the first group to initialize the parameters. For the first group, we used the conventional SBAS-InSAR method, including the selection of coherent pixels, phase unwrapping, DEM error correction, and atmospheric and orbital error correction, followed by the inversion of deformation parameters and their cofactor matrices.

**Figure 3.** The illustration of interferogram configuration between the first group of SAR data (i.e., archived SAR data) and the newly received SAR images (i.e., new observation data from SAR satellites). (**A**) Single-link interferogram configuration; (**B**) network-link interferogram configuration. The blue lines indicate interferograms generated between archived SAR images in the first group by SBAS technology and the green lines show the new interferograms generated between newly received SAR images and older archived SAR images by SBAS technology.

To update the time series deformation on the new SAR acquisition date, there are usually two ways to generate interferograms among newly received SAR images and the archived SAR images: single-link configuration to unwrap interferograms in the spatial domain, as shown in Figure 3A, and network-link configuration, shown in Figure 3B, where interferograms can be unwrapped in both spatial and time domains, i.e., 3D phase unwrapping [23–25]. We used the latter method to update the deformation time series.

The selected coherent pixels in the first group of SAR data were used to extract the phase for the interferograms and connect to the newly received SAR images. After 3D phase unwrapping, followed by the correction of DEM, atmospheric and orbital errors, the deformation rate and time series were updated by sequential estimation.
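The difference between the two configurations can be illustrated with a short sketch. It assumes, as an interpretation of Figure 3, that a single-link configuration forms one interferogram between the new image and the most recent archived image, while a network-link configuration connects the new image to every archived image within a temporal baseline threshold. The `max_days` value, the function name and the dates are illustrative assumptions, and the spatial baseline threshold is omitted for brevity.

```python
from datetime import date


def new_pairs(archived, new_date, mode="network", max_days=36):
    """Return (earlier, later) date pairs linking a new acquisition to the archive.

    mode="single"  -- one link to the most recent archived acquisition
    mode="network" -- links to every archived acquisition within max_days
    max_days is an illustrative temporal-baseline threshold only.
    """
    earlier = sorted(d for d in archived if d < new_date)
    if not earlier:
        return []
    if mode == "single":
        return [(earlier[-1], new_date)]
    if mode == "network":
        return [(d, new_date) for d in earlier if (new_date - d).days <= max_days]
    raise ValueError(f"unknown mode: {mode}")


# Illustrative 12-day repeat-cycle dates, not the actual acquisition list.
archived = [
    date(2018, 12, 1),
    date(2018, 12, 13),
    date(2018, 12, 25),
    date(2019, 1, 6),
    date(2019, 1, 18),
]
new_date = date(2019, 1, 30)

single = new_pairs(archived, new_date, mode="single")
network = new_pairs(archived, new_date, mode="network")
print(len(single), len(network))
```

With several interferograms per new acquisition, the network-link case provides the redundancy in time that 3D phase unwrapping relies on, at the cost of generating more interferograms per update.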

#### **4. Results**

#### *4.1. Deformation Rate Map*

Figure 4 presents the annual deformation rate map in the vertical direction over the main Xi'an region from 20 June 2015 to 17 July 2019. We chose the stable area indicated with a black pentagram as the reference point, as has been verified by previous studies [19,34,35].

**Figure 4.** Annual deformation rate map in the vertical direction over the study area from 20 June 2015 to 17 July 2019. The deformation time series for six points indicated by A–F are shown in Figure 5. Rectangular boxes L1 and L2 are enlarged and shown in Figures 6 and 8, respectively. Red dotted line indicates ground fissures, and the red line indicates CAF faults. The black pentagram indicates the location of the reference point.

Figure 5 shows the cumulative deformation time series for six typical points from 20 June 2015 to 17 July 2019, where the points A to C show continuous land subsidence with approximately linear deformation, and points D and E show continuous uplift with linear deformation. However, point F shows a fast rebound after long-term subsidence.

#### *4.2. Land Subsidence*

The Xi'an area has suffered from severe land subsidence and ground fissure hazards since the 1960s [12–14]. Historical leveling measurements indicate that the area of land subsidence reached about 300 km<sup>2</sup> in Xi'an from 1959 to 2018 [29]. Multi-temporal SAR observations and GPS measurements were used to study the spatiotemporal evolution and mechanism of land subsidence and ground fissure activities from 1992 to 2006 [32,33]. InSAR results from 2005 to 2018 revealed three land subsidence areas, namely, Yuhuazhai, Sanyaocun-Fengqiyuan and Qujiang New District [19,31–35]. As shown in Figures 4 and 5, two of these areas, namely, Sanyaocun-Fengqiyuan and Qujiang New District, subsided continuously from 20 June 2015 to 17 July 2019.

**Figure 5.** Deformation time series at six typical points (**A**–**F**), whose locations are shown in Figure 4. The six points show different deformation magnitudes.

#### *4.3. Uplift of Xi'an City Wall*

Xi'an City Wall (Figure 6), 13.74 kilometers long, is famous globally for its complete preservation over its long history since the Ming Dynasty. As shown in Figure 6, the deformation rate from 20 June 2015 to 17 July 2019 indicates that the north-west section of Xi'an City Wall was basically stable, with a deformation rate as small as 1 mm/year. However, the uplift rate in the south-east section reached 12 mm/year. The average uplift rate was 7 mm/year. Further, as shown in Figure 7, the cumulative deformation time series for four points, i.e., A–D in the south-east section of downtown Xi'an, showed continuous uplift at different rates from 20 June 2015 to 17 July 2019.

**Figure 6.** The deformation and optical image of Xi'an City Wall; (**A**) deformation rate map from 20 June 2015 to 17 July 2019, which is an enlargement of L1 in Figure 4; (**B**) an optical image of Xi'an City Wall; (**C**) a photo of Xi'an City Wall.

**Figure 7.** Deformation time series at points (**A**–**D**); their locations are indicated in Figure 6A.

#### *4.4. Rebound of Yuhuazhai*

Yuhuazhai is one of the areas of most severe land subsidence in Xi'an. The cumulative deformation time series over L2 (Yuhuazhai) in Figure 4 are shown in Figure 8 for 20 June 2015 to 17 July 2019, and in Figure 9 for 5 April 2018 to 17 July 2019.

In Figure 9, the onset of the rebound on 14 October 2018 can be visually detected, and the center of the rebound can be determined. The maximum rebound was located at the settlement center, and the rebound area spread around the previous center of maximum land subsidence, in particular towards the south. The maximum rebound magnitude was 130 mm from 14 October 2018 to 17 July 2019. Figure 10 shows the enlarged deformation map on 13 December 2018, where a ground fissure (F4) is superimposed. We can see that the deformation, including subsidence and rebound, is restricted to the hanging wall (i.e., the south side) of ground fissure F4. The four points A–D show different surface deformation processes (Figure 11). The deformation evolution can be divided into three stages, i.e., (i) a sustained land subsidence from 20 June 2015 to 14 October 2018; (ii) a quick rebound from 14 October 2018 to 1 December 2018; and (iii) a slower rebound from 1 December 2018 to 17 July 2019.

**Figure 8.** Cumulative deformation time series of Yuhuazhai from 20 June 2015 to 17 July 2019.

**Figure 9.** Cumulative rebound deformation time series of Yuhuazhai from 5 April 2018 to 17 July 2019. The black rectangular box is enlarged in Figure 10.

**Figure 10.** Enlarged deformation map of the area in the rectangle in Figure 9, with indication of the ground fissure F4. The time series deformation of four points localized at A–D are shown in Figure 11. The Yuhuazhai area indicated in the rectangle is shown in Figure 12.

**Figure 11.** Deformation time series at four points A–D in Figure 10. Red lines divide time series deformation into three stages.

**Figure 12.** Optical image of Yuhuazhai (rectangular box in Figure 10). Seven pumping wells are identified. This area experienced large rebound deformation after artificial water injection.

#### **5. Discussion**

The land subsidence in Xi'an can be divided into three stages: (i) a preliminary stage (1959 to 1971), (ii) a rapid development stage (1972 to 1990), and (iii) a slow development stage (1991 to present) [17]. Previous investigations show that localized groundwater withdrawal was the main factor in the long-term land subsidence [15,33–35]. The degradation of the aquifer system led to these typical deformations and threatened infrastructure assets. The land subsidence occurring in the Sanyaocun-Fengqiyuan and Qujiang New District areas shows the same deformation characteristics as in previous studies [19,34,35].

Artificial water injection and restricted exploitation of groundwater are effective measures to alleviate the imbalance of groundwater and land subsidence. The geo-mechanisms of subsurface water withdrawal and injection were discussed in [36,37]. Artificial recharge after long-term groundwater extraction causes land uplift, which consists of elastic, plastic, visco-elastic, and visco-plastic components. In reality, the recoverable elastic and visco-elastic deformation accounts for a very small proportion of the total deformation [18]. When the water level of a confined aquifer rises during artificial water injection, the elastic deformation can recover quickly, while the recovery of visco-elastic deformation is time-dependent. In contrast, the plastic and visco-plastic deformation is irrecoverable [18].

Owing to the restriction of groundwater exploitation in Xi'an since 1996, the groundwater level began to recover [29]. The Heihe water supply facility (a large-scale water conservancy project of water diversion and comprehensive utilization across river basins) has become the main source of domestic water, instead of groundwater, which has largely alleviated the imbalance of the water table in Xi'an. The exploitation of deep groundwater was significantly reduced, and the confined water level in many areas increased from 2002 to 2017 [34], which led to the surface uplift shown in Figure 4, including at the Xi'an City Wall. Therefore, the uplift of Xi'an City Wall is an example of the recovery of visco-elastic deformation, which is time-dependent.

Yuhuazhai has experienced severe surface deformation since 1996. The cumulative land subsidence was up to 1.8–2 m from 1996 to 2016 [30]. In order to alleviate the land subsidence caused by over-extraction of groundwater, the cessation of groundwater withdrawal and a water injection strategy have been adopted by the Xi'an Municipal Government since October 2018 [38]. For the water injection strategy, conventionally daily water was injected into the pumping well to avoid pollution of groundwater. The results show that Yuhuazhai subsided continuously from 20 June 2015 to 14 October 2018, then rebounded from 14 October 2018 to 17 July 2019, owing to the artificial water injection operation.

Owing to the high water quality of the deep confined aquifer, it was used as the main source of residential water supply. However, long-term over-exploitation of groundwater led to the decline

1.7 km<sup>2</sup>

identify seven pumping wells. Figure 13 shows three images and photos corresponding to pumping wells 1, 2, and 3 of Figure 12.

**Figure 13.** Recognition of pumping wells 1, 2, and 3 in Figure 12 from optical images (**A**–**C**) and from photos of the scene (**D**–**F**), respectively.

#### **6. Conclusions**

In the process of urbanization, over-exploitation of groundwater in the Xi'an area since the 1960s has led to land subsidence. Artificial water injection has alleviated the land subsidence effectively. We employed the sequential estimation method to update the surface deformation time series dynamically; it is an efficient InSAR tool for monitoring surface deformation as quickly as possible when SAR images are acquired one by one.

Our contribution is to reveal the surface subsidence, uplift and rebound in the Xi'an area with 83 Sentinel SAR images from 20 June 2015 to 17 July 2019. The land subsidence occurred in Sanyaocun-Fengqiyuan and Qujiang New District areas. Meanwhile, the uplift and sudden rebound deformation were revealed in Xi'an City Wall and Yuhuazhai, respectively.

Owing to the restriction of groundwater exploitation in the Xi'an area since 1996, the declined groundwater level began to recover, and the confined water level increased accordingly. In addition, we find that the sudden rebound deformation located in Yuhuazhai was mainly due to artificial water injection, which was carried out in the pumping well of Yuhuazhai around October 2018. The rebound pattern comprises two stages: the elastic deformation and visco-elastic deformation. The former can recover immediately, while the latter is time-dependent. The complex surface deformation in Xi'an reflects the changes in the aquifer system. Therefore, the control of groundwater balance can alleviate surface deformation.

**Author Contributions:** B.W. and C.Z. performed the experiments and produced the results. B.W. drafted the manuscript and C.Z. finalized the manuscript. Q.Z. and M.P. contributed to the discussion of the results. All authors conceived the study, and reviewed and approved the manuscript.

**Funding:** This research was funded by the Natural Science Foundation of China (Grant No. 41874005), and the Fundamental Research Funds for the Central Universities (Grant Nos. 300102269303 and 300102269719).

**Acknowledgments:** We are grateful to the European Space Agency for providing Sentinel-1A data.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

# *Article* **Assessment of the Degree of Building Damage Caused by Disaster Using Convolutional Neural Networks in Combination with Ordinal Regression**

#### **Tianyu Ci 1,2 , Zhen Liu 3, \* and Ying Wang 1**


Received: 14 October 2019; Accepted: 26 November 2019; Published: 1 December 2019

**Abstract:** We propose a new method that combines convolutional neural networks with ordinal regression to assess the degree of building damage caused by earthquakes from aerial imagery. The ordinal regression model and a deep learning algorithm are incorporated to make full use of the available information and improve the accuracy of the assessment. A new loss function is introduced to combine convolutional neural networks and ordinal regression. Assessing the level of damage to buildings is equivalent to predicting ordered labels for the buildings to be assessed. Existing research has usually simplified the problem to pure classification, which ignores the ordinal relationship between different levels of damage and wastes information. Historical data are used to build network models for assessing the level of damage; the deep learning models are described in detail, including model construction, implementation methods, and the selection of hyperparameters, and are verified by experiments. When categorizing the damage to buildings into four types, the proposed method applied to aerial images acquired after the 2014 Ludian earthquake achieves an overall accuracy of 77.39%; when categorizing damage into two types, the overall accuracy is 93.95%, exceeding the values reported for similar methods.

**Keywords:** earthquake; rapid mapping; damage assessment; deep learning; convolutional neural networks; ordinal regression; aerial image

#### **1. Introduction**

The rapid and accurate acquisition of disaster losses can provide great help for disaster emergency response and decision-making. Remote sensing (RS) and Geographic Information System (GIS) can help assess earthquake damage within a short period of time after the event.

Many studies have presented assessment techniques for earthquake building damage by using aerial or satellite images [1–5]. Booth et al. [6] used vertical aerial images, Pictometry images, and ground observations to assess building damage in the 2010 Haiti earthquake. Building-by-building visual damage interpretation [7] based on the European Macroseismic Scale (EMS-98) [8] was carried out in a case study of the Bam earthquake. Huyck et al. [9] used multisensor optical satellite imagery to map citywide damage with neighborhood edge dissimilarities. Many different features have been introduced to determine building damage from remote sensing images [10]. Anniballe et al. [11] investigated the capability of earthquake damage mapping at the scale of individual buildings with a set of 13 change detection features and a support vector machine (SVM). Plank [12] reviewed the methods of rapid damage assessment using multitemporal Synthetic Aperture Radar (SAR) data. Gupta et al. [13] presented a satellite imagery dataset for building damage assessment with over 700,000 labeled building instances covering over 5000 km<sup>2</sup> of imagery.

Recent studies show that machine learning algorithms perform well in earthquake damage assessment. Li [14] assessed building damage with a one-class SVM using pre- and post-earthquake QuickBird imagery and assessed the discrimination power of features at different levels (pixel-level, texture, and object-based). Haiyang et al. [15] combined SVM and an image segmentation method to detect building damage. Cooner et al. [16] evaluated the effectiveness of machine learning algorithms in detecting earthquake damage, using a series of textural and structural features. An SVM and feature selection approach was applied for damage mapping with post-event very high spatial resolution (VHR) imagery and obtained an overall accuracy (OA) of 96.8% and a Kappa of 0.5240 [11]. A convolutional neural network (CNN) was used to identify collapsed buildings from post-event satellite imagery and obtained an OA of 80.1% and a Kappa of 0.46 [17]. Multiresolution feature maps were derived and fused with a CNN for the image classification of building damage in [18], and an OA of 88.7% was obtained.

Most of the above-mentioned damage information extraction studies classified damaged buildings into two classes: damaged and intact. However, these two classes are not enough to meet actual needs.

Recently, deep learning (DL) methods have provided new ideas for remote sensing image recognition technology. An end-to-end framework with CNN for satellite image classification was proposed in [19]. Scott et al. [20] used transfer learning and data augmentation to demonstrate the performance of CNNs for remote sensing land-cover classification. Zou et al. [21] proposed a DL method for remote sensing scene classification. A DL-based image classification framework was introduced in [22]. Xie et al. [23] designed a deep CNN model that can achieve a multilevel detection of clouds. Chen et al. [24] combined a pretrained CNN feature extractor and the k-Nearest Neighbor (KNN) method to improve the performance of ship classification from remote sensing images.

In this paper, we propose a new approach based on CNNs and ordinal regression (OR) for assessing the degree of building damage caused by earthquakes from aerial imagery. CNNs hierarchically extract useful high-level features from input building images, and OR is then used to classify the features into four different damage grades, which gives the degree of damage of each building. The manually labeled damaged building dataset in this paper was obtained from aerial images after several historical earthquakes. The proposed method was evaluated with different network architectures and classifiers. We also compared the method with several state-of-the-art methods, including methods based on hand-engineered features (edge, texture, spectral, and morphological features) and machine learning methods.

This is the first attempt to apply OR to assess the degree of building damage from aerial imagery. OR (also called "ordinal classification") is used to predict an ordinal variable. In this paper, the building damage degree, on a scale from "no observable damage" to "collapse", is exactly such an ordinal variable. However, typical multiclass classification ignores the order between damage degrees, even though damage degrees have a strong ordinal correlation. Thus, we cast the assessment of the degree of building damage as an OR problem and develop an ordinal classifier and a corresponding loss function to learn our network parameters. OR improves information utilization, so we can achieve better accuracy with the same or a smaller amount of data. When categorizing the damage to buildings into four types, we apply the method proposed in this paper to aerial images acquired after the 2014 Ludian earthquake and achieve an overall accuracy of 77.39%; when categorizing the damage to buildings into two types, the overall accuracy of the model is 93.95%, exceeding the values reported for similar methods.

Another contribution of this work is a dataset of labeled building damage including 13,780 individual buildings from aerial data by visual interpretation that is classified into four damage degrees building by building.

The main contributions of this paper are summarized as follows:

(1) A deep ordinal regression network for assessing the degree of building damage caused by an earthquake. The proposed network uses a CNN for extracting features and an OR loss for optimizing classification results. Different CNN architectures have also been evaluated.

(2) A dataset with more than 13,000 optical aerial images of labeled damaged buildings that can be downloaded freely.

The rest of the paper is organized as follows: Section 2 presents an introduction to the dataset used in this research. Section 3 has a brief introduction to CNN and OR. Section 4 describes the proposed method and the different CNN architectures that we evaluated. We present the results of the experiments in Section 5. Finally, conclusions are drawn in Section 6.

#### **2. Data**

#### *2.1. Remote Sensing Data*

Two datasets from different seismic events were used in this study, including the Yushu earthquake in 2010 and Ludian earthquake in 2014, which are respectively described in the following text.

#### 2.1.1. Images From Yushu Earthquake

On 14 April 2010, Yushu County in Qinghai Province, China, was hit by a 7.1-magnitude earthquake [25]. In this study, aerial images with 0.1-m resolution acquired on 16 April 2010 over Jiegu Town, the worst-hit area in the earthquake, were obtained. The data overview is shown in Figure 1, and the relevant parameters of the data are shown in Table 1.

From the partial enlarged view corresponding to the red frame in Figure 1, a high building-collapse rate could be seen in the image-covered area, which was left in ruins. The details are clear, as the imaging quality is good.

**Figure 1.** Post-event aerial image of the Yushu earthquake, Qinghai Province, China.

**Table 1.** Remote sensing imagery specifications.


#### 2.1.2. Images From Ludian Earthquake

The 2014 Ludian earthquake, an Ms 6.5 event, occurred on 3 August 2014 [26,27] and caused major damage in Zhaotong City, Yunnan Province. Aerial images were acquired to map the damage caused by the earthquake. Images acquired on 4 August 2014 were used as the post-event airborne images for the remainder of the study. The aerial images have three spectral bands (R, G, and B) and a spatial resolution of 0.2 m. The images were georeferenced and mapped to a cartographic projection. On 7 and 14 August, after the earthquake, aerial remote sensing image data of the affected area were acquired. Figure 2 shows the range of the main aerial remote sensing image data acquired after the Ludian earthquake.

The data obtained in this paper mainly come from the area with level VIII seismic intensity, Longtoushan Town and the northern bank of the Niulan River. The aerial remote sensing data of the Ludian earthquake (Figure 2) obtained in this paper were shot 4–10 days after the earthquake and have a spatial resolution of 0.2 m. With enough volume and good quality, they are suitable for damage degree assessment and the relevant study of single buildings.

Dominated by mountains, the Ludian region has a wide distribution of low-rise masonry–timber and soil–timber structures in villages. The spacing between buildings is large. The earthquake occurred in the summer; green trees can be seen and parts of the roofs of some houses are blocked by vegetation.

**Figure 2.** Post-event aerial image of Ludian earthquake, Yunnan Province, China.

#### *2.2. Dataset of Labeled Damage Building*

In the research of DL image classification, a well-labeled dataset is very important, as it is used for training and evaluation benchmarks. Images of buildings at all levels of damage from the Ludian earthquake were used to construct the dataset. Each image was downsampled to 88 × 88. The size of the images is based on resolution and the length and width of local buildings.

The standard that we used to classify the damage degree is similar to EMS-98 [8], but with fewer levels. The damage degree D0 in this paper corresponds to G0-2 in EMS-98, D1 corresponds to G3 in EMS-98, and so on for the remaining degrees. The standard can be found in Table 2, and some samples of each damage level can be found in Figure 3. We obtained about 13,780 individual buildings from the remote sensing data of Ludian and 3501 buildings from Yushu by visual interpretation and classified them into four damage degrees building by building. When we labeled these samples, a few ground photos were used as a reference. These photos helped us better understand the actual damage to the buildings and the damage grade.


**Table 2.** Classification of damage to buildings in the Ludian earthquake.

Before training the model, we needed to build a building dataset of different damage degrees. Thousands of building types were drawn by manual vectorization from the airborne images mentioned in Section 2.1.

Then, we cropped each building into an image with a width and height of 88 pixels, with the building placed in the center. Some samples can be found in Figure 3. In this paper, the damage terms "level", "grade", and "class" are used interchangeably. Building damage was classified into four classes.


**Figure 3.** Examples of building damage in the datasets.

Samples in the two datasets, Ludian and Yushu, have different characteristics. The datasets are named after the location where the data were obtained. Table 3 shows the sample distribution of each damage grade.


**Table 3.** Distribution of the samples in the two datasets.

#### *2.3. Data Augmentation*

In this paper, we applied data augmentation [28] to artificially enlarge the dataset by applying label-preserving transformations to the input data in order to generate new samples. Data augmentation can effectively avoid overfitting during the training of complex models and can significantly improve data quality. Several data augmentation techniques were used, such as vertical and horizontal flipping, rotating by a small angle (less than 15°), and increasing or reducing brightness. Examples can be found in Table 4.
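The flips and brightness changes listed above can be sketched in plain Python as follows. This is only an illustration, not the authors' pipeline: the function names, the nested-list image representation, and the brightness factors 1.2 and 0.8 are assumptions, and the small-angle rotation is omitted because it requires interpolation.

```python
def flip_horizontal(img):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Reverse the row order (top-bottom flip)."""
    return [row[:] for row in img[::-1]]


def adjust_brightness(img, factor):
    """Scale pixel values by `factor` and clip the result to 0..255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def augment(img, label):
    """Return (image, label) pairs; every transform keeps the label unchanged."""
    variants = [
        img,
        flip_horizontal(img),
        flip_vertical(img),
        adjust_brightness(img, 1.2),  # assumed brightening factor
        adjust_brightness(img, 0.8),  # assumed darkening factor
    ]
    return [(v, label) for v in variants]


# Tiny single-band "image" standing in for an 88 x 88 building tile.
tile = [[10, 20, 30],
        [40, 50, 60]]
samples = augment(tile, label=2)
print(len(samples))
```

Because each transformation preserves the damage grade, one labeled tile yields several training samples with the same label.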

**Table 4.** Examples of data augmentation results.


#### **3. Background Knowledge**

#### *3.1. Introduction to CNN*

The convolution layer convolves the input image with a set of learnable filters, each producing one feature map in the output. After a nonlinear activation layer, the result forms the feature maps passed to the next layer. The pooling layer compresses the input feature map: on the one hand, it shrinks the feature map and reduces the computational complexity of the network; on the other hand, it extracts the main features. Generally, there are two kinds of operations in the pooling layer: max pooling and average pooling. In this paper, max pooling is adopted. The fully connected layer connects all the features and conveys the results to the classifier.
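As a small illustration of how max pooling shrinks a feature map while keeping the strongest responses, consider the following sketch. The 2 × 2 window with stride 2 is an assumption made for the example; the pooling size is not stated in the text above.

```python
def max_pool_2x2(fmap):
    """2 x 2 max pooling with stride 2 on a 2D list with even height and width."""
    pooled = []
    for r in range(0, len(fmap), 2):
        row = []
        for c in range(0, len(fmap[0]), 2):
            # Keep the largest value of each non-overlapping 2 x 2 window.
            row.append(max(fmap[r][c], fmap[r][c + 1],
                           fmap[r + 1][c], fmap[r + 1][c + 1]))
        pooled.append(row)
    return pooled


feature_map = [[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 1, 5, 6],
               [2, 2, 7, 3]]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 2], [2, 7]]
```

The 4 × 4 map is reduced to 2 × 2, which is the compression effect described above.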

Parameters of CNN can be obtained by training. The training includes two processes: forward and back propagation [29]. Forward propagation calculates the classification results of samples by current network weights. Back propagation compares the calculated classification results with true values, and then updates the network weights backward, layer by layer.

#### *3.2. Ordinal Regression*

In studies on machine learning and statistical models, classification is used to predict the categories to which targets belong based on input data. In classification, the categories are equal and independent, and the output is usually discrete. In typical classification, such as in the study of remote sensing land use and cover, the land surface is usually classified into vegetation, bare soil, water, buildings, and roads according to the spectrum, texture, and context of the surface features in the images [30,31]. In the recognition of handwritten digits [32], the given target images are classified into the classes 0–9. Although digits are used as class tags, there is no other relationship between any two classes. There are many commonly used methods to solve classification problems [33], including SVM [34], decision tree classifiers [35], the nearest neighbor algorithm [36], and CNN-based classification algorithms. The accuracy rate is the most commonly used index to describe the classification quality.

Regression analysis is used to predict the value of some property of the target based on input data, and the output values are continuous within a value range. Guo et al. (2009) [37] used a support vector regression (SVR) algorithm to predict people's actual ages from images of their faces. Human age is a continuous value with a limited value range, and is suitable for prediction by a regression algorithm. In studies related to image depth estimation, the distance (depth) between an object and a camera, as a continuous value, is usually estimated by a linear regression method in a machine learning algorithm, as shown in [38]. The commonly used methods to solve regression problems include the support vector regression algorithm and linear regression analysis. Variance, mean squared error, and other indices are often used to describe the regression quality.

Ordinal regression (OR) [39] is a statistical analysis model to predict ordinal tag variables corresponding to targets. OR is a statistical model between a classification and regression model. In other words, the original regression model prediction results are transformed into ordered discrete variables. For example, people's ages are often expressed as positive integers, and they can also be predicted by an OR-based statistical learning model. For instance, Niu and Zhou et al. (2016) [40] used an OR model and CNNs to estimate age. In machine learning, OR can also be called ranked learning [41]. Table 5 lists the differences between regression, classification, and OR.


**Table 5.** Differences between regression, classification, and ordinal regression.

For OR problems, several original ordinal tag variables can be transformed into a set of binary classification subproblems [42]. By integrating the prediction results of all binary classification subproblems, the estimated results of an original OR problem can be obtained. Binary classifiers for ordinal regression can be solved by mature machine learning algorithms. In this study, the method to predict building damage degree is designed as a set of binary classification subproblems. For instance, OR was combined with CNNs for monocular depth estimation [43].

For ordinal tags that include n classes, expressed by the n natural numbers from 1 to n, predicting the tag of each target x can be transformed into n – 1 mapping relationships *f<sub>i</sub>*(*x*), each of which gives the probability that the tag number y corresponding to the input x is less than or equal to i.

The feature extraction model is applied to the input image to obtain a high-level feature vector, which is then passed to the classification model for classification.

#### **4. Proposed Method**

This section presents the details of the proposed "CNN in combination with OR" method. The proposed network is composed of two basic parts: a CNN feature extractor and classifier. These parts are discussed separately.

The CNN feature extractor includes several convolution layers followed by max-pooling and an activation function. The output of the CNN feature extractor is used as the feature vector of the classifier. The classifier usually consists of fully connected layers. An illustration of the proposed network is shown in Figure 4.

**Figure 4.** Illustration of the proposed network. The network consists of a convolutional neural network (CNN) feature extractor and a classifier. Solid arrows represent data flow. We adopt VGG-16, ResNet-50, and a baseline network as our CNN feature extractors. The Softmax classifier and ordinal regression (OR) classifier offer the choice of two classifiers. The OR classifier that is shown in this figure branches out into three layers, where each layer contains two neurons. The prediction damage degree is decoded from these layers. The supervised information of the network is the damage grade of buildings.

#### *4.1. CNN Feature Extractor*

CNN models are excellent in terms of representation learning. This feature makes them suitable for transfer learning, which consists of applying a model trained for a particular task to a different task. The transfer can be done by fine-tuning the existing weights of the network using the new dataset in order to adjust the model for a new target problem or by using the network as a feature extractor, which does not require retraining. In the latter case, an input sample is forwarded in order to obtain an intermediate representation, a vector; the vectors can be fed into other classifiers such as a Softmax classifier [44].

Two successful CNN models pretrained on ImageNet were evaluated as feature extractors in our work: the Visual Geometry Group Network (VGG) [45] and residual learning network (ResNet-50) [46]. Their parameters were initialized via the pretrained classification model on ImageNet Large Scale Visual Recognition Competition (ILSVRC) [47]. Fully connected layers in VGG-16 or ResNet-50 were removed and replaced with a new custom one that had 128 neurons. When the model was trained, all the convolutional layers were locked.

We design a baseline network to compare the performance. Every convolutional layer in this network is followed by Batch Normalization (BN) [48], Rectified Linear Unit (ReLU) [49] activation, and a max-pooling layer. The baseline is simple enough for us to perform initial configurations before using more complex topologies. A detailed description can be found in Table 6.


**Table 6.** Description of baseline network. Conv-BN-ReLU is a block consisting of a convolutional layer, BN layer, and ReLU activation.

#### *4.2. Classifier*

As described in Section 2, in this study, buildings damaged by earthquakes can be classified into four damage degrees: D0, D1, D2, and D3. Based on this ordinal relationship, an OR-based building damage degree classifier model is proposed. For verification and comparison, the building assessment problem can be turned into a multiclass classification problem that adopts a Softmax classifier in a straightforward manner.

The Softmax classifier is a common softmax function that is used to divide the input data into four classes and give the probability of each class. The maximum probability is the prediction category of the current sample. The loss function of the Softmax classifier is the cross-entropy loss function.

The architecture of the OR classifier is shown in Figure 4. The OR classifier branches out into three binary classification layers, which correspond to the probabilities of *D* > 0, *D* > 1, and *D* > 2, respectively. After that, we concatenate the three outputs into a single vector *D*(*d*<sub>0</sub>, *d*<sub>1</sub>, · · · , *d*<sub>5</sub>). The predicted damage degree is decoded from this vector.

Let *D* = ϕ(χ, Θ) denote the result vector *D*(*d*<sub>0</sub>, *d*<sub>1</sub>, · · · , *d*<sub>5</sub>) calculated from the data input χ and the model parameters Θ, and let *Y*(*y*<sub>0</sub>, *y*<sub>1</sub>, · · · , *y*<sub>5</sub>) denote the actual vector encoded from the damage degree corresponding to the data input χ.

It is known from softmax function characteristics that

$$d\_{2i} + d\_{2i+1} = 1\tag{1}$$

where *i* ∈ {0, 1, 2}.

Based on the definition, the following characteristics exist:

$$y\_k \in \{0, 1\} \tag{2}$$

$$y\_{2i} + y\_{2i+1} = 1\tag{3}$$

where 0 ≤ k ≤ 5, *i* ∈ {0, 1, 2}.
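To make the target vector concrete, the following sketch encodes each damage degree as three binary pairs. It assumes that the even-indexed entry of each pair answers "is the degree greater than i?" and that the odd-indexed entry is its complement; this reading is inferred from the three binary layers for *D* > 0, *D* > 1, and *D* > 2, and the function name is hypothetical.

```python
def encode_degree(degree, num_degrees=4):
    """Encode an ordinal degree (0..num_degrees-1) as pairs (y_2i, y_2i+1)."""
    y = []
    for i in range(num_degrees - 1):
        exceeds = 1 if degree > i else 0  # assumed meaning of y_2i: "D > i"
        y.extend([exceeds, 1 - exceeds])  # each pair sums to 1
    return y


for d in range(4):
    print(f"D{d} -> {encode_degree(d)}")
# D0 -> [0, 1, 0, 1, 0, 1]
# D1 -> [1, 0, 0, 1, 0, 1]
# D2 -> [1, 0, 1, 0, 0, 1]
# D3 -> [1, 0, 1, 0, 1, 0]
```

Under this encoding every entry is 0 or 1 and each pair sums to 1, and neighboring degrees differ in only one pair, which is how the ordering enters the training target.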

The loss function L(*Y*, *D*) of the OR-based damage assessment model can be expressed as:

$$\mathcal{L}(Y, D) = -\frac{1}{3} \sum\_{i=0}^{2} \left[ y\_{2i} \log d\_{2i} + (1 - y\_{2i}) \log(1 - d\_{2i}) \right]. \tag{4}$$

The loss function is differentiable. Therefore, based on the back propagation algorithm, the loss function is minimized iteratively to obtain the weights of the optimized model.
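The loss can be evaluated numerically with a few lines of Python. This is a minimal sketch, assuming the index runs over the three output pairs (i = 0, 1, 2, so that the even-indexed entries stay within the six-element vectors); the clipping constant `eps` and the example vectors are illustrative assumptions.

```python
import math


def or_loss(y, d, eps=1e-12):
    """Mean binary cross-entropy over the three even-indexed outputs."""
    total = 0.0
    for i in range(3):
        y_i = y[2 * i]
        d_i = min(max(d[2 * i], eps), 1.0 - eps)  # guard against log(0)
        total += y_i * math.log(d_i) + (1 - y_i) * math.log(1.0 - d_i)
    return -total / 3.0


y_true = [1, 0, 1, 0, 0, 1]               # assumed encoding of degree D2
good = [0.9, 0.1, 0.8, 0.2, 0.1, 0.9]     # confident and correct in all pairs
bad = [0.9, 0.1, 0.2, 0.8, 0.1, 0.9]      # wrong in the second pair

print(or_loss(y_true, good))
print(or_loss(y_true, bad))
```

The second prediction gets a larger loss because one of its three binary decisions contradicts the target.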

During prediction, for any sample input χ, its dichotomous decomposition code *D*(*d*<sub>0</sub>, *d*<sub>1</sub>, · · · , *d*<sub>5</sub>) can be decoded to the corresponding damage degree $\hat{d}$ by the following method:

$$\hat{d} = \sum\_{i=0}^{2} \psi(d\_{2i} \ge 0.5) \tag{5}$$

where the indicator function ψ can be expressed as

$$\begin{cases} \psi(true) = 1\\ \psi(false) = 0 \end{cases} \tag{6}$$
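The decoding rule amounts to counting how many of the three binary outputs pass the 0.5 threshold. The sketch below illustrates this, assuming the even-indexed entries of the six-element output are the ones tested; the function name and the example vectors are hypothetical.

```python
def decode_degree(d, threshold=0.5):
    """Count the even-indexed outputs that reach the threshold."""
    return sum(1 for i in range(len(d) // 2) if d[2 * i] >= threshold)


print(decode_degree([0.1, 0.9, 0.2, 0.8, 0.1, 0.9]))  # 0
print(decode_degree([0.9, 0.1, 0.8, 0.2, 0.1, 0.9]))  # 2
print(decode_degree([0.9, 0.1, 0.7, 0.3, 0.6, 0.4]))  # 3

# An inconsistent output (second test passes, first fails) still decodes
# to a valid degree, because only the number of passed tests matters.
print(decode_degree([0.3, 0.7, 0.8, 0.2, 0.1, 0.9]))  # 1
```

Summing the indicators therefore always yields a degree between 0 and 3, even when the three binary classifiers disagree about the ordering.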

#### *4.3. Evaluated Networks*

In this work, we evaluated six CNN topologies with different feature extractors and classifiers. The name and composition of each network can be found in Table 7. The two classification methods are the Softmax classifier (SC) and the ordinal regression classifier. All of these network topologies will be evaluated.

**Table 7.** Network topologies to be evaluated. Each network consists of a feature extractor and a classifier. ResNet: residual learning network.


#### *4.4. Model Realization*

In this section, the DL models were implemented with Keras [50] and the TensorFlow [51] open-source DL framework in the Python 3.6 programming language [52]. All experimental and test code was run on the same computer platform. The hardware configuration of the computer consisted of an Intel i7 3.4 GHz CPU, 16.0 GB of memory, and a GeForce RTX 2080 Ti graphics card with 8 GB of display memory. The operating system was Ubuntu 18.04. CUDA version 9.0 [53] was used for accelerated computing. The GDAL 2.2.2 geographic data processing software package [54] was used to read and write image data, conduct vector operations, and transform geographic projections.

Pretrained weights based on the ImageNet dataset are widely used in transfer learning because features such as edges, textures, and structures learned from the ImageNet dataset are universal in computer vision tasks [55]. The VGG and ResNet feature extraction modules are initialized with the pretrained weights based on the ImageNet dataset. In the baseline feature extractor, the weights are initialized by Glorot uniform initialization [56]. All models use the same training dataset for training.

The stochastic gradient descent (SGD) method [57] is a common optimization algorithm in DL model training [58]. In this paper, the SGD algorithm with momentum [59] is used for model training.
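For readers unfamiliar with momentum, the update rule can be sketched on a toy problem. The snippet assumes the classical momentum form and uses illustrative values for the learning rate and momentum coefficient, since the hyperparameters are not given here; it also uses an exact gradient of a one-dimensional objective rather than mini-batch gradients of a network.

```python
def sgd_momentum(grad_fn, w0, lr=0.1, momentum=0.9, steps=300):
    """Classical momentum: v <- momentum * v - lr * grad; w <- w + v."""
    w, v = w0, 0.0
    for _ in range(steps):
        v = momentum * v - lr * grad_fn(w)  # accumulate a velocity term
        w = w + v                           # move along the velocity
    return w


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3); minimum at w = 3.
w_final = sgd_momentum(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_final)
```

The velocity term averages recent gradients, which damps oscillations and usually speeds up convergence compared with plain gradient steps.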

#### *4.5. Model Evaluation Methods and Indicators*

#### 4.5.1. Confusion Matrix

A confusion matrix is used to judge the consistency between the classification results of models or classifiers and the true category information, and is one of the basic evaluation methods for remote sensing image classification. The specific procedure is to compare the classification result tags with the true category information one by one. Let *C* denote the confusion matrix. Assuming that there are *K* classes of samples, *C* is a *K* × *K* matrix, and each entry *C*(*i*, *j*) is the number of samples whose true category is *i* and whose predicted category is *j*.
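The one-by-one comparison can be sketched as follows, with rows indexed by the true class and columns by the predicted class. The label lists are made-up values for illustration, and classes are assumed to be coded as integers 0 to K − 1.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """C[i][j] counts samples with true class i and predicted class j."""
    C = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        C[t][p] += 1
    return C


# Hypothetical labels for eight buildings, coded 0..3 for D0..D3.
y_true = [0, 0, 1, 2, 3, 3, 2, 1]
y_pred = [0, 1, 1, 2, 3, 2, 2, 0]

C = confusion_matrix(y_true, y_pred, num_classes=4)
for row in C:
    print(row)
# [1, 1, 0, 0]
# [1, 1, 0, 0]
# [0, 0, 2, 0]
# [0, 0, 1, 1]
```

Diagonal entries are correct predictions; the distance of an off-diagonal entry from the diagonal shows by how many grades the prediction is off.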

#### 4.5.2. Overall Accuracy and Kappa Coefficient

Overall accuracy (OA) refers to the consistency probability between classification results and true classes. Its calculation formula is

$$OA = \frac{\sum\_{i=1}^{K} C(i, i)}{\sum\_{i=1}^{K} \sum\_{j=1}^{K} C(i, j)}. \tag{7}$$

The kappa coefficient [60] is calculated from the confusion matrix and measures classification accuracy while accounting for agreement expected by chance. In theory, the kappa coefficient falls within [–1, 1], but in practice its value is usually within [0, 1]. Its calculation formula is

$$p\_{e} = \frac{\sum\_{i=1}^{K} \left( \sum\_{j=1}^{K} C(i, j) \cdot \sum\_{j=1}^{K} C(j, i) \right)}{N^2} \tag{8}$$

$$Kappa = \frac{OA - p\_{e}}{1 - p\_{e}} \tag{9}$$

where *N* is the total number of samples.
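As a worked check of the two indicators, the sketch below computes OA and kappa for Confusion matrix 1 of Table 8. The function names are illustrative; the chance-agreement term is the sum over classes of row total times column total, divided by the squared sample count.

```python
def overall_accuracy(C):
    """Share of samples on the diagonal of the confusion matrix."""
    n = sum(sum(row) for row in C)
    return sum(C[i][i] for i in range(len(C))) / n


def kappa(C):
    """Kappa from OA and the chance agreement p_e."""
    k = len(C)
    n = sum(sum(row) for row in C)
    row_tot = [sum(C[i]) for i in range(k)]
    col_tot = [sum(C[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_tot, col_tot)) / n ** 2
    oa = overall_accuracy(C)
    return (oa - p_e) / (1 - p_e)


# Confusion matrix 1 from Table 8 (rows: true class, columns: predicted).
matrix_1 = [[10, 8, 6, 4],
            [8, 10, 8, 6],
            [6, 8, 10, 8],
            [4, 6, 8, 10]]

print(round(overall_accuracy(matrix_1), 4))  # 0.3333
print(round(kappa(matrix_1), 4))             # 0.1098
```

Both values agree with the OA and Kappa entries listed for that matrix in Table 8.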

#### 4.5.3. Mean Squared Error

In statistics, the mean squared error (MSE), or mean squared deviation (MSD), of an estimator measures the average of the squares of the errors, that is, the average squared difference between the predicted values and the true values. Its calculation formula is

$$MSE = \frac{1}{N} \sum\_{i=1}^{N} (y\_i - \hat{y}\_i)^2 \tag{10}$$

where *y*<sub>*i*</sub> is the true value, *ŷ*<sub>*i*</sub> is the predicted value, and *N* is the total number of samples.

In this study, MSE may be a more important indicator than overall accuracy. For example, Table 8 shows two confusion matrices that have the same overall accuracy but different mean squared errors. In this study, Confusion matrix 1 is better than Confusion matrix 2, but *OA* does not reflect this. We therefore need MSE to evaluate our model.

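**Table 8.** Two confusion matrices with the same overall accuracy (OA) and different mean squared errors (MSEs).

| Confusion Matrix 1 | A | B | C | D | Confusion Matrix 2 | A | B | C | D |
|---|---|---|---|---|---|---|---|---|---|
| A | 10 | 8 | 6 | 4 | A | 10 | 6 | 8 | 6 |
| B | 8 | 10 | 8 | 6 | B | 6 | 10 | 6 | 8 |
| C | 6 | 8 | 10 | 8 | C | 8 | 6 | 10 | 6 |
| D | 4 | 6 | 8 | 10 | D | 6 | 8 | 6 | 10 |
| OA | 0.3333 | | | | OA | 0.3333 | | | |
| Kappa | 0.1098 | | | | Kappa | 0.1111 | | | |
| MSE | 1.8 | | | | MSE | 2.2667 | | | |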
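The MSE values in Table 8 can be reproduced directly from the two matrices. The sketch below assumes that the classes A to D are ordinal grades coded 0 to 3, so that an entry in row i and column j contributes a squared error of (i − j)²; the function name is illustrative.

```python
def mse_from_confusion(C):
    """MSE of ordinal predictions, weighting each cell by (true - predicted)^2."""
    n = sum(sum(row) for row in C)
    k = len(C)
    return sum(C[i][j] * (i - j) ** 2 for i in range(k) for j in range(k)) / n


matrix_1 = [[10, 8, 6, 4],
            [8, 10, 8, 6],
            [6, 8, 10, 8],
            [4, 6, 8, 10]]
matrix_2 = [[10, 6, 8, 6],
            [6, 10, 6, 8],
            [8, 6, 10, 6],
            [6, 8, 6, 10]]

print(round(mse_from_confusion(matrix_1), 4))  # 1.8
print(round(mse_from_confusion(matrix_2), 4))  # 2.2667
```

Both matrices have 40 of 120 samples on the diagonal, but the second places more of its errors two or three grades away from the truth, which only the MSE penalizes.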

#### **5. Results**

#### *5.1. Dataset Configuration*

During the model training, each dataset prepared in Section 2 was divided into three parts at a proportion of 8:1:1: 80% of the sample data was randomly selected for the training set, 10% for the validation set, and 10% for the testing set. The proportion of each building damage class is kept balanced in each set. Fivefold cross-validation was applied to evaluate the model.
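One way to obtain such a split is to shuffle and divide the samples of each damage class separately. The sketch below assumes that "balanced" means class proportions are preserved in every subset; the function name, the seed, and the synthetic labels are illustrative and not taken from the paper.

```python
import random
from collections import defaultdict


def stratified_split(labels, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split sample indices into train/val/test, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)

    train, val, test = [], [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_train = int(len(idxs) * ratios[0])
        n_val = int(len(idxs) * ratios[1])
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the testing set
    return train, val, test


# Synthetic labels: 50 samples for each of the four damage degrees.
labels = [d for d in range(4) for _ in range(50)]
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 160 20 20
```

Splitting per class keeps rare damage grades represented in the validation and testing sets.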

The two datasets mentioned in this paper, the Ludian dataset and Yushu dataset, have different uses. Among them, the Ludian dataset with more data is used to train the model, while the Yushu dataset with less data is used to verify the adaptability of the model.

During the training, the data of the training sets included four damage degrees, as shown in Section 2: complete damage, severe damage, common, and nearly intact. The validation set and testing set also included the four classes above. The training set was fed to the models so that they automatically adjusted the weight parameters based on the back propagation algorithm. The validation set was used for model selection. The testing set was used to verify the actual model accuracy. The following accuracy and kappa coefficient calculation results were obtained from the data of the testing set.

Because most current studies classify buildings simply as damaged or not damaged, three sets were created in this paper by grouping samples of different damage degrees (Table 9). In Set 1, D0, D1, and D2 were merged into an intact class and D3 formed a damaged class, to allow comparison with other methods. In Set 2, D0 and D1 were merged into a nearly intact class, D2 formed a severe damage class, and D3 formed a complete collapse class. Set 3 retains the four original damage grades. The evaluation indicators were recalculated from the models' predictions for each set.


**Table 9.** Distribution of three damage grade sets.
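Because Set 1 and Set 2 only merge grades, their indicators can be derived from a four-class confusion matrix without retraining. The following is a minimal sketch of that regrouping; the mappings follow the grouping described in the text, while the helper names and the example matrix `cm4` are hypothetical values chosen only for illustration.

```python
# Index in each list = original grade (D0, D1, D2, D3); value = merged class.
SET1 = [0, 0, 0, 1]  # intact vs. damaged
SET2 = [0, 0, 1, 2]  # nearly intact, severe damage, complete collapse


def merge_confusion(cm, mapping):
    """Collapse a four-class confusion matrix into the classes of `mapping`."""
    k = max(mapping) + 1
    merged = [[0] * k for _ in range(k)]
    for i, row in enumerate(cm):
        for j, count in enumerate(row):
            merged[mapping[i]][mapping[j]] += count
    return merged


def overall_accuracy(cm):
    """Share of samples on the diagonal of a confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


# Hypothetical matrix (rows: true D0..D3, columns: predicted D0..D3).
cm4 = [[50, 8, 2, 0],
       [6, 40, 10, 4],
       [1, 9, 45, 5],
       [0, 2, 6, 52]]

for name, mapping in (("Set 1", SET1), ("Set 2", SET2)):
    merged = merge_confusion(cm4, mapping)
    print(name, merged, round(overall_accuracy(merged), 4))
```

Confusions between grades that end up in the same merged class move onto the diagonal, which is why accuracy can only stay equal or rise as classes are merged.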

#### *5.2. Accuracy Results on Ludian Dataset*

As shown in Table 10, for Set 3, the minimum overall accuracy of the six network models is 72.86% for Baseline-SC, and the average is 74.09%. The maximum accuracy is 77.39% for VGG-OR, with a kappa coefficient of 0.69, which represents good model consistency. For Set 1, the accuracy of all the models is about 92%–94%, with an average of 93% and only a small fluctuation. The best accuracy is 93.95% for VGG-OR. The kappa coefficient ranges between 0.78 and 0.83, representing very good model consistency.

In statistical modeling, the MSE represents the difference between the actual observations and the values predicted by the model. So, when the overall accuracy is equal, the lower the MSE, the better the model performance. Minimizing the MSE is therefore meaningful when assessing the degree of damage to buildings. Table 10 shows that the MSE results of the OR approach are always better than those of the direct classification methods, which may be explained by the additional ordinal information helping to avoid bias.


**Table 10.** Accuracy indicators of deep learning (DL) models. The last rows show the average values. The best result for each classifier and set is shown in bold type.

According to the results of the comparison shown in Table 10, it is possible to affirm that our OR approach (VGG-OR) outperforms the direct classification methods.

The learning rate was set to 0.001 and the batch size to 32. Models with the same CNN feature extractor take the same amount of time, because the OR classifier does not consume more computing resources. The baseline, VGG, and ResNet models require 6, 10, and 33 min, respectively, for 100 epochs using the same training dataset. Models usually converge within 100 epochs. It can be concluded that VGG-OR improves on Baseline-OR by 3.66% at the cost of only 4 additional minutes of training time.

To check whether the results are stable, the standard deviation (SD) of OA, Kappa, and MSE is shown in Table 11. Since the metrics of Set 1 and Set 2 are calculated from Set 3, only the SD of the metrics of Set 3 is shown in Table 11. All the SD values are quite small, which indicates that the results of the models are relatively stable.


**Table 11.** The standard deviation (SD) of OA, Kappa, and MSE of deep learning (DL) models on Set 3.

#### *5.3. Accuracy Results on Yushu Dataset*

Given that the Yushu dataset contains much less data than the Ludian dataset, it is less effective for training the model. Therefore, we attempted to transfer the model trained on the Ludian dataset to the Yushu dataset. First, the model trained on the Ludian dataset was applied directly to the Yushu dataset, and the results are shown in Table 12.

**Table 12.** Accuracy indicators of the model trained with the Ludian dataset applied directly to the Yushu dataset.


All the indicators decline significantly, and the accuracy is only 64%, which suggests that the model is not valid for this dataset. This indicates that the data distributions of the two datasets differ, so the model trained on one dataset is not directly applicable to the other.

Then, we transferred the model trained on the Ludian dataset to the Yushu dataset. For parameter fine-tuning, a learning rate of 0.0001 was adopted, and all the layers except for the fully connected layer were frozen. For comparison, the model was also trained directly on the Yushu dataset. The number of training set samples was varied to analyze the impact of the input data on the model performance.

Figure 5 shows the impact of the number of training set samples on the overall accuracy, and the error bar represents the SD value. The model that was transferred from the Ludian dataset is more accurate and more stable.

**Figure 5.** The impact of the number of training set samples on overall accuracy.

#### **6. Discussion**

The proposed method is an "end-to-end" solution: the input is the sample image data, and the output is the damage level label. The method obtains usable results directly, without intermediate products. Treating the damage level of a building as an OR problem with ordered labels makes more effective use of the model input information, which improves the accuracy of the model and reduces the MSE of the prediction results. The deep learning-based model applied in this paper can also be regarded as a data-driven method, which means that the larger the dataset, the better the model performance.
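To illustrate how ordered labels can carry more information than plain class indices, the sketch below uses one common OR formulation in which a grade among K classes is encoded as K − 1 binary targets of the form "is the grade greater than t?". This is an assumption made for illustration and not necessarily the exact OR classifier used in this paper; the function names and the example outputs are hypothetical.

```python
def encode_ordinal(label, num_classes=4):
    """Encode grade k as K-1 binary targets: target t is 1 if label > t."""
    return [1 if label > t else 0 for t in range(num_classes - 1)]


def decode_ordinal(probs, threshold=0.5):
    """Predicted grade = number of binary outputs above the threshold."""
    return sum(1 for p in probs if p > threshold)


for grade in range(4):
    print(f"D{grade} ->", encode_ordinal(grade))
# D0 -> [0, 0, 0]
# D1 -> [1, 0, 0]
# D2 -> [1, 1, 0]
# D3 -> [1, 1, 1]

# Hypothetical network outputs for one image: two outputs exceed 0.5.
print(decode_ordinal([0.93, 0.71, 0.12]))  # 2
```

With this encoding, confusing D0 with D3 gets three binary targets wrong, whereas confusing D0 with D1 gets only one wrong, so distant errors are penalized more heavily during training.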

In this study, we transferred the model between datasets of labeled damaged buildings acquired from different earthquake locations. The datasets share the same damage levels but have different data characteristics. They are similar but not identical, so a model trained on one cannot be applied directly to the other. The transfer learning experiment not only verified a way to address the lack of data, but also demonstrated the stability of the model in different regions.

In machine learning, it is commonly accepted that more training samples are better, but increasing the data for a given model does not necessarily lead to an obvious performance improvement. When there are few samples, a DL-based algorithm may perform poorly because its many parameters require a large amount of training data. Correspondingly, with less data, a machine learning algorithm based on manually selected features may perform better, thanks to customized rules and the help of professionals. With a huge amount of data, the performance of a DL algorithm increases with the data scale.

CNN models are developed by training the network to represent the relationships and processes that are inherent within the datasets. They perform an input–output mapping using a set of interconnected simple processing features. We should realize that such models typically do not really represent the physics of a modeled process; they are just devices used to capture relationships between the relevant input and output variables [61]. These models can also be considered as data-driven models. So, the amount and quality of an input dataset may influence the upper limit of the model performance.

A critical factor for the use of the proposed model is data availability: a sufficient number of well-labeled samples is required. In the case study of the Yushu dataset, 1500 or more images are needed in the training set, and the validation and testing sets also need some data. This number can be reduced significantly if a pretrained model is used.

In this study, four damage grades were adopted. However, the visual interpretation of aerial images involves uncertainty or misclassification, especially for light and heavy damage levels [6]. The damage degree can be underestimated from aerial images (Figure 6). A Bayesian updating process is discussed in [6] to reduce uncertainties with ground truth data.

**Figure 6.** Example of underestimated building damage by visual interpretation of an aerial image. **Left**: ground photo; **Right**: aerial image. The collapse of the building is not visible on the aerial image.

#### **7. Conclusions**

This study addressed the high-precision, automated assessment of damage to buildings. The entire process was conducted systematically, including experimental data preparation, dataset construction, detailed model implementation, and experimental verification. The performance of the model in practical applications was then assessed using independent and disparate datasets, which validated the strengths and potential of the proposed assessment method.

We propose a new approach based on CNNs and OR that aims to assess the degree of building damage caused by earthquakes from aerial imagery. The network consists of a CNN feature extractor and an OR classifier. This is the first attempt to apply OR to assess the degree of building damage from aerial imagery. OR improves information utilization, so better accuracy can be achieved with the same or a smaller amount of data. Because most current studies classify buildings as damaged or not damaged, we recalculated the evaluation indicators for the two-class and three-class cases. The proposed method significantly outperforms previous approaches.

In this study, we produced a new dataset that consisted of labeled images of damaged buildings. More than 13,000 optical aerial images were classified into four damage degrees based on the damage scale in Table 3. The dataset and code are freely available online and can be found at [62].

In the future, we will attempt to expand the training data to more sensors and building types. A transfer learning algorithm will also be considered when training data are lacking. By combining the existing classification model with an object detection algorithm such as RetinaNet [63], the end-to-end automatic extraction of damaged building locations and their damage levels within an image can be achieved, further reducing the intermediate processing. We will also apply our method to more extensive and diverse types of remote sensing data. The OR method has great potential to be widely used for other ordinal-scale signals, such as sea ice concentration.

**Author Contributions:** Methodology, T.C. and Z.L.; Resources, Z.L.; Supervision, Y.W.; Writing—original draft, T.C.; Writing—review and editing, Z.L. and Y.W.

**Funding:** This research was funded by the National Key Research and Development Program (2017YFC1502505) and the China National Science and Technology Major Project entitled "The application demonstration system of emergency monitoring and evaluation of major natural disasters" (03-Y30B06-9001-13/15).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Remote Sensing* Editorial Office E-mail: remotesensing@mdpi.com www.mdpi.com/journal/remotesensing

www.mdpi.com ISBN 978-3-0365-4307-9